Spring 2008 Events Schedule | Database Research Group | UW
Spring 2008
Note: Events of interest to the Database Research Group are posted to the
uw.cs.database newsgroup and mailed to the db-group@lists.uwaterloo.ca
mailing list. Three mailing lists are aggregated into the db-group list:
db-faculty (for DB group faculty), db-grads (for DB group graduate students),
and db-friends (for DB group alumni, visitors, and friends). To subscribe to
or unsubscribe from one of these three lists, please visit
https://lists.uwaterloo.ca/mailman/listinfo/<listname>, where <listname> is
the name of the list.
                                  
-  DB group meetings
-  The DB group meets most Friday afternoons at 2pm, usually in DC1331.
See the list of current events for times and locations of upcoming meetings.
Each meeting lasts an hour and features an informal presentation by one of
the members of the group. Everyone is welcome to attend. These talks are
intended to raise questions and stimulate discussion rather than to be
polished presentations of research results. Speakers are determined using a
rotating speaker list, which can be found on the DB group meeting page.
-  DB seminar series
-  The DB seminar series features visiting speakers. These seminars are held
roughly monthly and are usually scheduled on Monday mornings at 11am. See the
list of current events for times and locations of upcoming seminars. The full
schedule can be found on the DB seminar series page.
 Recent and Upcoming Events
| DB Meeting: | Friday May 9, 2:00pm,
DC 1304 (Please note change of room) | 
| Speaker: | Xuhui Li | 
| Title: | Delayed Synchronization of I/O Writes | 
| Abstract: | Modern computers usually have multiple tiers of cache, such as file system
cache and storage system cache, lying between application user spaces and
storage devices. Although the I/O interfaces currently implemented by
operating systems let applications use these cache tiers, they are not
flexible enough to meet applications' various requirements for I/O
synchronization. As a result, some applications trade off I/O efficiency for
data safety and ignore the underlying caches entirely. In this paper we propose
a novel I/O interface to address this problem. Our approach lets applications
use the underlying caches while still preserving data safety. We implemented
our approach on the Linux Native AIO interface and modified the MySQL InnoDB
storage engine to use it. Running synthetic workloads against the new LAIO
interface produced promising results. | 
| DB Meeting: | Friday May 16, 2:00pm, DC 1331 | 
| Speaker: | Ani Nica, Sybase iAnywhere | 
| Title: | Spatial Indexes and spatial support in relational database systems | 
| Abstract: | This talk will cover current database research on spatial indexes such
as R-trees, TV-trees, SS-trees, and Quadtrees. A summary of the current
support for spatial data and spatial queries in commercial relational
database systems will also be part of the presentation. | 
| DB Meeting: | Friday May 23, 2:00pm, DC 1331 | 
| Speaker: | Gulay Unel | 
| Title: | Data Exchange | 
| Abstract: | This talk will be on data exchange, which is the problem of taking data structured under a source 
schema and creating an instance of a target schema that reflects the source data as accurately as 
possible. It will cover two main papers by Fagin et al.: "Data exchange: semantics and 
query answering" and "Data exchange: getting to the core." | 
| DB Meeting: | Friday June 6, 2:00pm, DC 1331 | 
| Speaker: | Peter Bumbulis, Sybase iAnywhere | 
| Title: | Enforcing Database Recoverability on Disks that Lack Write-Through | 
| Abstract: | Talk based on Robin Dhamankar, Hanuma Kodavalla, and Vishal Kathuria. "Enforcing Database Recoverability on Disks that Lack Write-Through," MSR-TR-2008-36, March 2008. | 
| Seminar: | Thursday June 19, 10:30am, DC 1331
(Please note unusual day and time) | 
| Speaker: | Zhenjie Zhang, National University of Singapore | 
| Title: | On Uncertain Data Clustering and Domination Game Analysis | 
| Abstract: | I will present two pieces of work: uncertain data
clustering and domination game analysis.
Uncertain Data Clustering: Applications, Models and Algorithms
Uncertain data is now ubiquitous in many database systems and
applications, such as scientific databases, sensor networks, moving
objects, and data streams, due to inaccurate measurement or
infrequent data updates. In this talk, we will present our new
studies on unsupervised learning over uncertain data sets. In our
study, every uncertain object is modelled as a sphere in the
corresponding space that bounds the exact position, without any
underlying distribution assumption. Based on the definition of
uncertainty, different computation models are proposed for unsupervised
learning tasks, including the Zero Uncertain Model, Static Uncertain Model,
Dissolvable Uncertain Model and Reversed Uncertain Model. Each
of the models can be applied to different environments with different
requirements. We will further present some preliminary solutions to the
models using popular learning algorithms such as the k-means
and EM algorithms. Some of the work presented here will
appear in ICML'08.
Domination Game Analysis: When Game Theory Meet Data Mining
Game theory is a powerful tool for modelling competitions among
manufacturers in a market. In this paper, we present a study on
combining game theory and data mining by introducing the concept
of domination game analysis. We present a multidimensional market
model, where every dimension represents one attribute of a
commodity. Every product or customer is represented by a point in
the multidimensional space, and a product is said to "dominate" a
customer if all of its attributes can satisfy the requirements of
the customer. The expected market share of a product is measured
by the expected number of buyers among the customers, each of
whom is equally likely to buy any product that dominates them. A Nash
Equilibrium is a configuration of the products achieving stable
expected market shares for all products. We prove that a Nash
Equilibrium in such a model can be computed in polynomial time if
every manufacturer tries to modify its product in a round-robin
manner. To further improve the efficiency of the computation, we
also design two algorithms for the manufacturers to efficiently
find their best response to other products in the market. This is
joint work with Laks V.S. Lakshmanan and Anthony K. H. Tung. | 
| Bio: | Zhenjie Zhang is a PhD candidate in the Database Group at the
National University of Singapore. He received his B.Sc. in
Computer Science from Fudan University, China. His research
interests include general skyline queries, unsupervised learning,
and game-theoretic analysis over large data. Zhenjie presently
has 10 research papers to his name, including papers in major
venues such as SIGMOD, ICML and TKDE. He was a recipient of the
prestigious NUS President Fellowship in 2007. More about Zhenjie's
research can be found at www.comp.nus.edu.sg/~zhangzh2/ | 
| DB Meeting: | Friday July 4, 2:00pm, DC 1331 | 
| Speaker: | Xin Liu | 
| Title: | Application Hints for Multi-tier Cache Management | 
| Abstract: | In today's computer systems, a data access usually goes through several
tiers of cache, such as the application buffer, the file system cache, and the
storage server cache. Typically each tier of cache is managed by its
own sub-system independently. The information about the data access  
shared between these sub-systems is either limited or under-utilized.  
Although sub-systems try their best to improve their individual cache  
performance, their lack of I/O information from other tiers, or their  
ignorance of it, makes their efforts sub-optimal. In fact, when an I/O  
request is passed from one tier to another, some attributes associated  
with it are useful in predicting future access of the same data page.  
These attributes can be passed from an upper cache tier, such as a  
storage client, to a lower tier cache, such as a storage server, as  
hints. The latter can use them in its cache management. In this
talk I will present our study of how I/O attributes in a
storage client are related to its future data accesses to the storage
server, and how we use these attributes to manage the lower-tier
cache. | 
| DB Meeting: | Friday July 11, 2:00pm, DC 1331 | 
| Speaker: | Luiz Celso Gomez Jr. | 
| Title: | Web user databases | 
| Abstract: | The demand for richer interactive web applications has
turned browsers into powerful programming platforms. The only piece
still missing is an integrated user data management system to replace
the limited flexibility provided by cookies. In this talk I will
outline the requirements for such a system, and present what has been done
towards this end along with possible new alternatives. At the end I will
give an overview of a few techniques for meeting selected requirements. | 
| DB Meeting: | Friday July 18, 2:00pm, DC 1331 | 
| Speaker: | Raymond Wong, Hong Kong University of Science and Technology (HKUST) | 
| Title: | Minimality Attack in Privacy Preserving Data Publishing | 
| Abstract: | Data publishing generates much concern over the protection
of individual privacy. Recent studies consider cases where the
adversary may possess different kinds of knowledge about the
data. In this paper, we show that knowledge of the mechanism
or algorithm of anonymization for data publication can
also lead to extra information that assists the adversary and
jeopardizes individual privacy. In particular, all known mechanisms
try to minimize information loss and such an attempt
provides a loophole for attacks. We call such an attack a minimality
attack. In this paper, we introduce a model called
m-confidentiality which deals with minimality attacks, and
propose a feasible solution. Our experiments show that minimality
attacks are practical concerns on real datasets and
that our algorithm can prevent such attacks with very little
overhead and information loss. | 
| DB Meeting: | Friday July 25, 2:00pm, DC 1331 | 
| Speaker: | Patrick Kling | 
| Title: | Distributed XML Query Processing | 
| Abstract: | XML is commonly used to exchange data among a variety of systems.
Therefore, XML data can be viewed as inherently distributed according
to the origin of individual fragments. Large XML collections and heavy
workloads also force us to distribute XML data. In this talk, I will
describe a number of ways in which XML data can be distributed. I will
then discuss techniques that allow us to query distributed XML and how
querying can take advantage of known distribution characteristics. | 
| DB Meeting: | Friday August 1, 2:00pm, DC 1331 | 
| Speaker: | Wei Jiang | 
| Title: | Ordered Conjunctive Queries | 
| Abstract: | Order properties of data have been playing an important role in relational 
query processing and optimization. However, conventional database systems 
consider order support only as an add-on feature to the core query 
optimization. They fail to provide a systematic and consistent treatment 
of order throughout query processing. We propose a novel approach to 
represent ordered data models, and trace and refer order properties 
throughout query processing and optimization. By considering order from
the very beginning of query processing, we expect to gain the greatest
benefit from query optimization. An ordered data model and ordered
algebra will be presented for ordered conjunctive queries. | 
This page is maintained by 
Ashraf Aboulnaga.