Database Research Group Events

Spring 2004

Note: Events of interest to the Database Research Group are posted to the uw.cs.database newsgroup and are mailed to the dbgroup mailing lists: db-faculty (for DB group faculty), db-grads (for DB group graduate students), and db-friends (for DB group alumni, visitors, and friends). If you wish to subscribe to one of these lists, send mail to majordomo@db with "subscribe <list>" in the message body, where <list> is the list you wish to subscribe to.  For example, use "subscribe db-friends" to subscribe to the db-friends list. To unsubscribe, send "unsubscribe <list>" to the same address.
DB group meetings
The DB group meets most Friday afternoons at 2pm, usually in DC1331. See the list of current events for times and locations of upcoming meetings. Each meeting lasts for an hour and features an informal presentation by one of the members of the group. Everyone is welcome to attend. These talks are intended to raise questions and stimulate discussion rather than to be polished presentations of research results. Speakers are determined using a rotating speaker list, which can be found on the DB group meeting page.
DB seminar series
The DB seminar series features visiting speakers. These seminars are more-or-less monthly, and are usually scheduled on Monday mornings at 11am. See the list of current events for times and locations of upcoming seminars. The full schedule can be found on the DB seminar series page.

Recent and Upcoming Events

DB meeting: Friday, May 7th, 2:00pm, DC1331
Speaker: Anil Goel
Topic: I will discuss two scalable solutions for increasing the throughput of a database server transparently in the presence of heavy query loads. The two approaches I will discuss are from Microsoft and Sybase. The Microsoft approach was presented by Paul Larson at ICDE and the Sybase approach uses ASA as part of the solution.

DB meeting: Friday, May 14th, 2:00pm, DC1331
Speaker: Serge Bourbonnais
Topic: Federated Archives: transparently archiving relational data
Abstract: We describe a system where applications using a relational database get the benefits of data archiving, such as smaller active data sets and therefore better response time, while being able to access the archived data as if it were still in the database, e.g., using standard SQL. Users are able to combine both archived and non-archived data in a single query. The retrieval of the data from the archive(s) is done transparently, using wrappers that can federate data between the database and the remote storage systems that are used for data archiving. By design, the data appears as if in only one place at any given time, and the data is moved between storage systems while the database is online.

DB meeting: Friday, May 21st, 2:00pm, DC1331
Speaker: Qiang Wang
Topic: Decentralized routing of XML query
Abstract: Much XML data is inherently distributed, e.g., XML data stored geographically close to the sensors in a sensor database, or distributed XML data referenced by ActiveXML documents. A scalable and efficient approach to locating the relevant data for distributed queries is therefore urgently needed. This talk will be about my ongoing work on a decentralized routing strategy for XML queries, which is intended to address this problem. The basic idea is to distribute the XML fragments across a multi-dimensional overlay network space and to map each XML fragment and query to a coordinate in that space; a node can then route a query step by step toward the target query point using information about its neighbours.

DB meeting: Friday, May 28th, 2:00pm, DC1331
Speaker: Robert Warren
Topic: Correctness, performance and testing in database integration

This talk explores the issues of correctness and performance when designing database integration methods. The integration must be correct in that it must not integrate incompatible information, and must perform in that it must integrate information which is represented in a dissimilar fashion.

Ad-hoc integration relies on human oversight and application-driven tests which do not generalise well. Another testing approach is reviewed where a known database is integrated with a modified copy as a means of benchmarking an integration system.

DB meeting: Friday, June 4th, 2:00pm, DC1331
Speaker: Khuzaima Daudjee
Topic: Database Server Scale-up
Abstract: There has been recent interest in scaling up a database server. Existing techniques do not provide both scalability and transaction ordering guarantees. I'll talk about techniques for providing both.

DB Seminar: Monday, June 7th, 11:00am, DC1304
Speaker: Ken Ross, Columbia University
Title: Architecture Sensitive Design of Database Engines

DB meeting: Friday, June 11th, 2:00pm, DC1331
Speaker: Glenn Paulley

I'm going to speak about approaches and tradeoffs with respect to the implementation of semantic query optimization in a relational database system. In particular, I will briefly review two ideas from the literature:

  • Gail Mitchell, Umeshwar Dayal, and Stanley B. Zdonik (1993). Control of an Extensible Query Optimizer: A Planning-Based Approach. In Proceedings, VLDB19, Dublin, Ireland, pp. 517-528.
  • Mitch Cherniack and Stan Zdonik (1998). Changing the Rules: Transformations for Rule-Based Optimizers. In Proceedings, SIGMOD 1998, Seattle, Washington, pp. 61-72.

DB meeting: Friday, July 9th, 2:00pm, DC1331
Speaker: Lubomir Stanchev
Abstract: In the first part of the talk I will outline current research in the area of applying database technology to sensor networks. In particular, I will cover two approaches:
  • the Cougar Project developed at Cornell University
  • the TinyDB Project developed at UC Berkeley
I will try to make the second part of the talk an open-ended discussion. First, I will describe one possible area for future research that relates to real-time sensor networks, and then I will solicit the audience's opinion.

DB meeting: Friday, July 16th, 2:00pm, DC1331
Speaker: Lei Chen
Topic: Robust and Fast Similarity Search for Moving Object Trajectories
Abstract: Similarity-based retrieval of moving object trajectories is useful to many applications. A number of distance functions have been proposed, but they are sensitive to noise, shifts and scaling of data that commonly occur due to sensor failures, errors in detection techniques, disturbance signals, and different sampling rates. Cleaning data is not always possible. In this work, I introduce a distance function, Edit Distance for Real sequence (EDR), which is robust against these data imperfections. Analysis and comparison of EDR with other popular distance functions, such as Euclidean distance, Dynamic Time Warping (DTW), Edit distance with Real Penalty (ERP), and Longest Common Subsequences (LCSS), indicate that EDR is more robust than Euclidean distance, DTW and ERP, and that it is more accurate than LCSS. We also develop three pruning techniques to improve the retrieval efficiency and show that these three techniques can be combined in a single search, increasing the pruning power significantly. The experimental results confirm the superior efficiency of the combined methods.
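
The abstract does not give the definition of EDR, so the following is only a minimal sketch under stated assumptions: trajectories are sequences of 2-D points, two points "match" when both coordinates differ by at most a threshold eps, and the distance is the minimum number of insert, delete, and replace operations needed to make the two sequences match. The names edr, matches, and eps are illustrative and not taken from the talk.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def matches(a: Point, b: Point, eps: float) -> bool:
    """Assumed matching rule: both coordinates agree within eps."""
    return abs(a[0] - b[0]) <= eps and abs(a[1] - b[1]) <= eps


def edr(r: Sequence[Point], s: Sequence[Point], eps: float) -> int:
    """Edit-distance-style trajectory distance (sketch, assumed definition).

    Standard dynamic program over prefixes of r and s, keeping only the
    previous row. Matching points cost 0; a replace, insert, or delete
    costs 1.
    """
    n, m = len(r), len(s)
    prev = list(range(m + 1))          # distance from empty prefix of r
    for i in range(1, n + 1):
        curr = [i] + [0] * m           # distance to empty prefix of s is i
        for j in range(1, m + 1):
            subcost = 0 if matches(r[i - 1], s[j - 1], eps) else 1
            curr[j] = min(
                prev[j - 1] + subcost,  # match or replace
                prev[j] + 1,            # skip a point of r
                curr[j - 1] + 1,        # skip a point of s
            )
        prev = curr
    return prev[m]


if __name__ == "__main__":
    clean = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    # Same trajectory with one spurious outlier, e.g. from a sensor failure.
    noisy = [(0.0, 0.0), (1.0, 1.0), (50.0, 50.0), (2.0, 2.0), (3.0, 3.0)]
    print(edr(clean, noisy, eps=0.5))
```

Under this assumed definition, a single outlier adds at most one edit operation regardless of how far it lies from the true trajectory, which is one way to read the robustness claim in the abstract; a Euclidean-style distance would instead grow with the size of the outlier.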

DB meeting: Friday, July 23rd, 2:00pm, DC1331
Speaker: Peter Bumbulis
Abstract: I'll talk about the lock manager described in the paper "High-Performance, Space-efficient, Automated Object Locking" by Laurent Daynes and Grzegorz Czajkowski (ICDE 2001). While their target implementation is a persistent Java virtual machine, the lock manager is also well suited for use in a relational database. It's a nice demonstration of the benefits that adding a level of indirection can bring (and is definitely not your typical Gray & Reuter-style implementation!)

DB meeting: Friday, August 6th, 2:00pm, DC1331
Speaker: Thomas Heimrich, Technical University of Ilmenau
Topic: Modelling and Checking Output Constraints in Multimedia Database Management Systems
Abstract: Constraints are used in traditional database systems to define consistent database states. For multimedia data it is also important to define constraints for correct data output. The output of multimedia data can be distorted if its output parameters (e.g. the resolution of an image) are changed arbitrarily. The producer of multimedia data should therefore specify constraints for correct data output. For this purpose we introduce 'output constraints'. Output constraints can restrict output parameters (e.g. resolution). They can also define relationships (e.g. synchronization) during the output of multimedia data. We show how output constraints can be modelled. The database management system must check output constraints when data is modified and/or during data output. We show how existing database features can be used to do this.

MMath presentation: Monday, August 9th, 11:00am (note day and time), DC1331
Speaker: Yuhui (George) Wen
Topic: Similarity Search in Metric Spaces
Abstract: Similarity search is sometimes called nearest neighbour search. It is used widely in pattern recognition, graphics, text data, bioinformatics, and other areas. A new categorization based on pivot selection techniques is used; it divides all of the current algorithms into divide-and-conquer and all-at-once. A new algorithm using all-at-once is presented, and it outperforms the fastest algorithm using all-at-once, MVP-tree. Furthermore, it is proven probabilistically that all-at-once is preferable for scattered data while divide-and-conquer is better for clustered data. Other topics, including how to reduce disk I/O and a new algorithm, NGH-tree, are addressed in the thesis but will not be covered in this presentation.
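
For readers unfamiliar with pivots, the general idea behind pivot-based metric search can be sketched as follows. This is generic triangle-inequality filtering, not the thesis's algorithm, MVP-tree, or NGH-tree: distances from every object to a few pivots are precomputed, and an object o is skipped whenever |d(q, p) - d(o, p)| > r for some pivot p, since the triangle inequality then guarantees d(q, o) > r. The function names, the Euclidean metric, and the sample data are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Example metric; any function obeying the triangle inequality works."""
    return math.dist(a, b)


def build_pivot_table(objects: Sequence[Point],
                      pivots: Sequence[Point]) -> List[List[float]]:
    """Precompute the distance from every object to every pivot."""
    return [[euclidean(o, p) for p in pivots] for o in objects]


def range_search(query: Point, radius: float,
                 objects: Sequence[Point], pivots: Sequence[Point],
                 table: List[List[float]]) -> Tuple[List[Point], int]:
    """Return objects within radius of query, plus how many object
    distances were actually computed after pivot filtering."""
    q_to_p = [euclidean(query, p) for p in pivots]
    results: List[Point] = []
    computed = 0
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q,p) - d(o,p)| is a lower bound on d(q,o).
        if any(abs(qd - od) > radius for qd, od in zip(q_to_p, row)):
            continue  # safely pruned without computing d(q, o)
        computed += 1
        if euclidean(query, obj) <= radius:
            results.append(obj)
    return results, computed


if __name__ == "__main__":
    objects = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
    pivots = [(0.0, 0.0)]
    table = build_pivot_table(objects, pivots)
    print(range_search((0.5, 0.0), 1.0, objects, pivots, table))
```

How the pivots are chosen, and whether they are applied one level at a time or all together, is what distinguishes the divide-and-conquer and all-at-once families mentioned in the abstract; this sketch makes no attempt to model that distinction.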

DB meeting: Friday, August 13th, 2:00pm, DC1331
Speaker: Matthew Young-Lai
Topic: Order optimization

This page is maintained by Ken Salem.