Database Research Group Events

Fall 2005

Note: Events of interest to the Database Research Group are posted to the uw.cs.database newsgroup and are mailed to the dbgroup mailing lists: db-faculty (for DB group faculty), db-grads (for DB group graduate students), and db-friends (for DB group alumni, visitors, and friends). If you wish to subscribe to one of these lists, send mail to majordomo@db with "subscribe <list>" in the message body, where <list> is the list you wish to subscribe to.  For example, use "subscribe db-friends" to subscribe to the db-friends list. To unsubscribe, send "unsubscribe <list>" to the same address.
DB group meetings
The DB group meets most Friday afternoons at 2pm, usually in DC1331. See the list of current events for times and locations of upcoming meetings. Each meeting lasts for an hour and features an informal presentation by one of the members of the group. Everyone is welcome to attend. These talks are intended to raise questions and stimulate discussion rather than to be polished presentations of research results. Speakers are determined using a rotating speaker list, which can be found on the DB group meeting page.
DB seminar series
The DB seminar series features visiting speakers. These seminars are more-or-less monthly, and are usually scheduled on Monday mornings at 11am. See the list of current events for times and locations of upcoming seminars. The full schedule can be found on the DB seminar series page.

Recent and Upcoming Events

DB Meeting: Friday September 16, 2:00pm, DC1331
Topic: Kickoff meeting

DB Seminar: Monday September 19, 11:00am, DC1304
Speaker: Kevin Chang, University of Illinois, Urbana-Champaign
Title: Building a MetaQuerier and Beyond: A Trilogy of Search, Integration, and Mining for Web Information Access

DB Meeting: Friday September 23, 2:00pm, DC1331
Speaker: David DeHaan
Topic: View Matching for Outer-Joins, OQL, and Conjunctive XQuery
Abstract: This talk will be an informal attempt to draw connections between three papers:
1. "View Matching for Outer-Join Views", Larson & Zhou, VLDB 2005
2. "Deciding Containment for Queries with Complex Objects", Levy & Suciu, PODS 1997
3. "Containment of Nested XML Queries", Dong, Halevy & Tatarinov, VLDB 2004

The first paper takes previous work by Cesar Galindo-Legaria on normal forms (hence equivalence) for select-project-outer-join expressions and uses it as the basis for a view matching algorithm. The next two papers consider the containment/equivalence problems for conjunctive OQL and XQuery, respectively. I obviously won't be able to describe all three papers in detail, so I'll focus primarily on the Larson paper, and then address the other two papers at a fairly high level, pointing out the similarities and differences in the problems being considered.

DB Seminar: Tuesday October 4, 11:00am, MC5136 (Please note the unusual day and place)
Speaker: Kaladhar Voruganti, IBM Almaden Research Center
Title: SMaestro: Second Generation Storage Infrastructure Management

Master's Thesis Presentation: Friday October 7, 10:00am, DC1304
Speaker: Tony Young
Title: Communication Cost Modeling for Federated Database Systems

DB Meeting: Friday October 7, 2:00pm, DC1331
Speaker: Jeremy Barbay
Title: From the Conjunctive Queries to XPATH, between Theory and Practice, some realisations and perspectives
Abstract: I worked in the past on adaptive algorithms for conjunctive queries (aka Google-like queries), and adapted some of those algorithms in a search engine on file systems. In the last few years I have tried to generalize the theoretical analysis to some queries on XML documents, and in the last few months I have been implementing some of my algorithms and testing them on data provided by Google.

At this point I would like to get some feedback from the database group before going forward in any direction (theoretical vs practical analysis, exact vs approximate queries, conjunctive vs structured queries), and possibly discuss potential collaborations.

DB Meeting: Friday October 14, 2:00pm, DC1331
Speaker: Frank Tompa
Title: How does records management fit into things?
Abstract: "Records management is the application of systematic and scientific controls to recorded information required in the operation of an organization's business." I will start with an overview of various components of records management, concentrating on its application to electronic records. Thereafter I'll look briefly at some implications of an information retention and disposition program on historical databases, backup and recovery, and access control.

DB Seminar: Monday October 17, 11:00am, DC1304
Speaker: Volker Markl, IBM Almaden Research Center
Title: Learning in Query Optimization

DB Meeting: Friday October 21, 2:00pm, DC1331
Speaker: Grant Weddell
Title: Capturing more meaning for the semantic web
Abstract: I'll review a sequence of progressively richer ontology languages that have been proposed for the semantic web, beginning with RDF and ending with (full) OWL. It is hoped that this will have a side effect of creating a lively discussion on the idea of a semantic web.

Seminar: Monday October 24, 11:00am, DC1304
Speaker: Divesh Srivastava, AT&T Labs Research
Title: Approximate Joins: Concepts and Techniques
Abstract: The quality of the data residing in information repositories and databases gets degraded due to a multitude of reasons. In the presence of data quality errors, a central problem is to identify all pairs of entities (tuples) in two sets of entities that are approximately the same. This operation has been studied through the years and it is known under various names, including record linkage, entity identification, entity reconciliation and approximate join, to name a few. The objective of this talk is to provide an overview of key research results and techniques used for approximate joins.

This is joint work with Nick Koudas.
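
The approximate join problem described in the abstract can be illustrated with a minimal sketch. It is not taken from the talk: it assumes Jaccard similarity over character q-grams as the similarity measure and a naive nested-loop comparison, and the names `qgrams`, `jaccard`, `approximate_join`, and the default threshold are hypothetical choices made for illustration only.

```python
def qgrams(s, q=3):
    """Return the set of character q-grams of a lowercased, '#'-padded string."""
    padded = "#" * (q - 1) + s.lower() + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def approximate_join(left, right, threshold=0.5, q=3):
    """Return all (l, r) pairs whose q-gram Jaccard similarity meets the threshold.

    Assumption: a quadratic nested loop is acceptable for small inputs; practical
    systems would typically add indexing or filtering to avoid comparing every pair.
    """
    right_grams = [(r, qgrams(r, q)) for r in right]
    pairs = []
    for l in left:
        l_grams = qgrams(l, q)
        for r, r_grams in right_grams:
            if jaccard(l_grams, r_grams) >= threshold:
                pairs.append((l, r))
    return pairs


if __name__ == "__main__":
    print(approximate_join(["John Smith"], ["Jon Smith", "Mary Jones"]))
```

With these assumed settings, "John Smith" and "Jon Smith" share 9 of 14 distinct 3-grams, so they are paired, while "Mary Jones" shares none and is not.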

DB Meeting: Friday October 28, 2:00pm, DC1331
Speaker: Dan Farrar, Sybase iAnywhere
Title: Optimizing the Other Half of Database Applications
Abstract: In practice, the factor most limiting the performance and scalability of many database applications is not the DBMS itself, but is the expertise of the application developers. Application architecture and interfacing issues can impose significant penalties on system performance. However, for non-expert designers and programmers, identifying these issues can be difficult. Making this determination easier is an important part of reducing the total cost of database ownership.

In this talk, I will discuss common kinds of problems that can hurt database application performance. I will talk about the type of support that a DBMS must provide to help application developers identify and resolve these problems, and how this need is addressed by the major vendors. I will also suggest areas in which this functionality can be leveraged to automatically improve database performance.

DB Meeting: Friday November 4, 2:00pm, MC 5158 (Not our regular meeting room)
Speaker: Ning Zhang
Title: XSEED: Accurate and Fast Cardinality Estimation for XPath Queries
Abstract: In this talk, I am going to present XSEED, a synopsis of path queries for cardinality estimation. The synopsis is constructed by starting from a very small kernel, which is then incrementally updated. With such an incremental construction, a synopsis structure can be dynamically configured to accommodate different memory and construction time budgets. Cardinality estimation based on XSEED can be performed very efficiently and accurately. Extensive experiments on both synthetic and real data sets were conducted, and our results show that, even with less memory, XSEED can be an order of magnitude more accurate than other synopsis structures. The estimation time is under 2% of the actual querying time for a wide range of queries in all test cases.

This is joint work with M. Tamer Ozsu, Ashraf Aboulnaga, and Ihab F. Ilyas.

DB Seminar: Thursday November 10, 11:00am, DC 1304 (Please note the unusual day)
Speaker: Sihem Amer-Yahia, AT&T Labs-Research
Title: The Role of Document Structure in Querying, Scoring and Evaluating XML Full-Text Search

DB Meeting: Friday November 11, 2:00pm, DC1331
Speaker: Gordon Cormack
Title: TREC 2005 Spam Track Overview
Abstract: TREC's Spam Track introduces a standard testing framework that presents a chronological sequence of email messages, one at a time, to a spam filter for classification. The filter yields a binary judgement (spam or ham [i.e. non-spam]) which is compared to a human-adjudicated gold standard. The filter also yields a spamminess score, intended to reflect the likelihood that the classified message is spam, which is the subject of post-hoc ROC (Receiver Operating Characteristic) analysis. The gold standard for each message is communicated to the filter immediately following classification. Eight test corpora -- email messages plus gold standard judgements -- were used to evaluate 53 subject filters. Five of the corpora (the public corpora) were distributed to participants, who ran their filters on the corpora using a track-supplied toolkit implementing the framework. Three of the corpora (the private corpora) were not distributed to participants; rather, participants submitted filter implementations that were run, using the toolkit, on the private data. Twelve groups participated in the track, submitting 44 filters for evaluation. The other nine subject filters were variants of popular open-source implementations adapted for use in the toolkit in consultation with their authors.
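
The classify-then-reveal loop described in the abstract can be sketched roughly as follows. This is not the track-supplied toolkit: the `TokenCountFilter` class, its scoring formula, and the function names are hypothetical stand-ins, assumed here only to show the order of operations (classify each message, record the judgement and score, then communicate the gold standard to the filter).

```python
from collections import Counter


class TokenCountFilter:
    """Hypothetical toy filter; not one of the track's subject filters."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()

    def classify(self, message):
        """Return (judgement, spamminess score in (0, 1)) for one message."""
        tokens = message.lower().split()
        spam_hits = sum(self.spam_tokens[t] for t in tokens)
        ham_hits = sum(self.ham_tokens[t] for t in tokens)
        # Assumed scoring rule: smoothed fraction of token evidence seen in spam.
        score = (spam_hits + 1) / (spam_hits + ham_hits + 2)
        return ("spam" if score > 0.5 else "ham"), score

    def train(self, message, gold):
        """Receive the gold standard judgement after classification."""
        target = self.spam_tokens if gold == "spam" else self.ham_tokens
        target.update(message.lower().split())


def run_online_evaluation(filter_, corpus):
    """Present messages in order; reveal each gold label only after classifying."""
    results = []
    for message, gold in corpus:
        judgement, score = filter_.classify(message)
        results.append((gold, judgement, score))
        filter_.train(message, gold)
    return results


def misclassification_rates(results):
    """Return (ham misclassification rate, spam misclassification rate)."""
    ham = [j for g, j, _ in results if g == "ham"]
    spam = [j for g, j, _ in results if g == "spam"]
    ham_rate = sum(j == "spam" for j in ham) / len(ham) if ham else 0.0
    spam_rate = sum(j == "ham" for j in spam) / len(spam) if spam else 0.0
    return ham_rate, spam_rate


if __name__ == "__main__":
    corpus = [
        ("cheap pills now", "spam"),
        ("meeting at noon", "ham"),
        ("cheap pills today", "spam"),
        ("lunch meeting tomorrow", "ham"),
    ]
    results = run_online_evaluation(TokenCountFilter(), corpus)
    print(misclassification_rates(results))
```

The recorded scores are what a post-hoc ROC analysis would consume; the sketch stops at the two misclassification rates to stay short.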

DB Seminar: Monday November 14, 11:00am, DC1304
Speaker: Ling Liu, Georgia Institute of Technology
Title: MobiEyes: Distributed Processing of Moving Queries over Moving Objects

DB Meeting: Friday December 2, 2:00pm, DC1331
Speaker: Anil Goel, Sybase iAnywhere
Topic: Supporting Multiple View Maintenance Policies in DBMS
Abstract: In this talk I will present the issues related to multiple maintenance policies for materialized views in a database management system and how they affect the optimizer's decisions on when to use materialized views to answer a query. I will first review the pioneering ideas presented in paper [1] on the subject, and then I will discuss how the work on which paper [2] is based relates to this issue. A quick review of how commercial database management systems implement multiple maintenance policies will follow.

[1] Latha S. Colby, Akira Kawaguchi, Daniel F. Lieuwen, Inderpal Singh Mumick, Kenneth A. Ross: "Supporting Multiple View Maintenance Policies", SIGMOD 1997
[2] Hongfei Guo, Per-Åke Larson, Raghu Ramakrishnan: "Caching with 'Good Enough' Currency, Consistency, and Completeness", VLDB 2005

DB Seminar: Monday December 5, 11:00am, DC1304
Speaker: Mary Fernandez, AT&T Labs - Research
Title: Implementing XQuery 1.0: The Story of Galax

This page is maintained by Ashraf Aboulnaga.

Data Systems Group
David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1
Tel: 519-888-4567
Fax: 519-885-1208

Last modified: Friday, 01-Jun-2012 11:01:02 EDT