Database Research Group Events

Spring 2003

Note: Events of interest to the Database Research Group are posted to the uw.cs.database newsgroup and are mailed to the dbgroup mailing lists: db-faculty (for DB group faculty), db-grads (for DB group graduate students), and db-friends (for DB group alumni, visitors, and friends). If you wish to subscribe to one of these lists, send mail to majordomo@db with "subscribe <list>" in the message body, where <list> is the list you wish to subscribe to.  For example, use "subscribe db-friends" to subscribe to the db-friends list. To unsubscribe, send "unsubscribe <list>" to the same address.
DB group meetings
The DB group meets most Friday afternoons at 2pm, usually in DC1331. See the list of current events for times and locations of upcoming meetings. Each meeting lasts for an hour and features an informal presentation by one of the members of the group. Everyone is welcome to attend. These talks are intended to raise questions and stimulate discussion rather than to be polished presentations of research results. Speakers are determined using a rotating speaker list, which can be found on the DB group meeting page.
DB seminar series
The DB seminar series features visiting speakers. These seminars are more-or-less monthly, and are usually scheduled on Monday mornings at 11am. See the list of current events for times and locations of upcoming seminars. The full schedule can be found on the DB seminar series page.

Recent and Upcoming Events


DB meeting : Friday, May 2nd, 2:00 pm, DC1331
Speaker : Lei Chen
Topic: Overview of techniques for similarity-based retrieval on time series data
Abstract: In this talk, I will give a brief overview of the techniques that have been applied to similarity-based retrieval of time series data. Two main aspects will be discussed: similarity metrics and dimensionality reduction techniques. At the end, I will talk about some open problems in similarity-based retrieval of multi-dimensional time series data.

DB Seminar : Monday, May 12th, 11:00 am, DC1304
Speaker : Aidong Zhang, University at Buffalo
Title: Bioinformatics: Gene Expression Data Analysis

DB meeting : Friday, May 16th, 2:00 pm, DC1331
Speaker : Fei Ku
Topic: Towards Automatic Initial Buffer Configuration
Abstract: Buffer pools are blocks of memory used in commercial database systems to retain frequently referenced pages.  Configuring the buffer pools is a difficult and manual task.  A good buffer configuration improves performance by reducing the number of disk accesses. Empirical studies have shown that optimizing the initial buffer configuration (determined at database design time) can improve system performance.  A good initial configuration can also provide a faster convergence towards an optimal dynamic buffer allocation. Previous studies have not considered automating the buffer pool configuration process.

In this talk, I will present two techniques that facilitate initial buffer configuration. First, a workload characterization scheme is proposed. Second, an analytic model of the GCLOCK buffer replacement policy is developed that can be used to evaluate the effectiveness of a particular buffer configuration for a given workload. Finally, I will present results from our experimental study which demonstrate the effectiveness of our proposed techniques.

DB meeting : Friday, May 23rd, 2:00 pm, DC1331
Speaker : Hai Chen
Topic: Supporting set-at-a-time extensions for XML through DOM
Abstract: With the rapid growth of the web and e-commerce, the W3C produced a new standard, XML, as a universal data representation format to facilitate information interchange and integration across heterogeneous systems. To process XML, documents must be parsed so that users can easily access, manipulate, and retrieve the XML data. DOM is one application programming interface for this purpose; it provides an abstract, logical tree structure for an XML document. Although DOM is a very promising initiative for defining a standard interface through which applications can access all the data they require in XML documents, significant benefits can result if some of its functions are extended. In this thesis, we aim to support set-at-a-time extensions for XML through DOM. By adding powerful functions that get summary information from different groups of related document elements, and that filter, extract, and transform this information as a sequence of nodes, the extended DOM can reduce the communication overhead and response time between the client and the server, and therefore provide applications with more convenience. Our work explores ideas for set-at-a-time processing, defines and implements some suitable methods, and codes some example query applications using both the DOM and our extensions in order to compare them on a test system. Finally, future work will be discussed.

DB meeting : Friday, May 30th, 2:00 pm, DC1331
Speaker : Grant Weddell
Topic: On the use of database technology in embedded control programs
Abstract: From the point of view of high-level architecture, it is almost always useful to see a control program as having a repository style wherein a collection of subsystems interact with a common database. The talk will focus on the idea of using database technology to support an SQL interface to this database. Included is a short review of how current relational technology supports this idea.

DB Seminar : Monday, June 2nd  POSTPONED 
Speaker : Divesh Srivastava, AT&T Research
Title: Phrase Matching in XML

DB meeting : Friday, June 6th, 2:00 pm, DC1331
Speaker: Xuerong Tang
Topic: Paired SynTrees: specifying structured document transformation
Abstract: In this talk, I will present a specification language, called Paired SynTrees, for structured document transformation.

We consider instances taken from a collection of documents that share a common structure in the sense that they can all be characterized by grammar rules such as those found in a context-free grammar (CFG) or forest-regular grammar (FRG). Thus a single XML (or SGML) document with an accompanying DTD (document type definition) is structured. As long as documents do not all conform to a single universal standard, data transformation remains a problem. Thus, in the absence of a universal tag set and schema, structured document transformation is important for XML to serve as the data interchange format for the Web.

Many rule-based languages (XSLT, TXL, XDuce, YATL) and XML query languages (XQuery, XML-QL, XML-GL) can be used to carry out such structural transformations, but they are not ideal candidates as a specification language. We therefore propose a new high-level specification language by extending existing grammar-based specification techniques such as SDT, TT grammar and filters.

During the talk, I will first describe the language and show, via examples, how it is used to specify transformations; I will then present a prototype implementation. The system parses the specification in Paired SynTrees, then generates XSLT templates, and finally executes the XSLT program to accomplish the transformation.

DB meeting : Friday, June 27th, 2:00 pm, DC1331
Speaker: Peter Bumbulis
Topic: My talk will be on "ARC: A Self-Tuning, Low Overhead Replacement Cache" by Megiddo and Modha. The ARC policy "works uniformly well across varied workloads and cache sizes without any need for workload specific a priori knowledge or tuning". It's online, simple to implement (constant complexity per request), and offers performance that is usually considerably better than LRU (comparable to a tuned 2Q).

DB meeting : Friday, July 4th, 2:00 pm, DC1331
Speaker : Anil Goel
Topic: I'll discuss some work done in the area of selectivity estimation of LIKE predicates. In particular, I will describe, in detail, a technique proposed by Krishnan, Vitter, and Iyer and, time permitting, will talk about a simpler approach implemented by us for a reduced instance of the same problem.


DB Seminar : Monday, July 7th, 11:00am, DC1304
Speaker : Wolfgang Lehner, Technische Universität Dresden
Title: Database Support for Data Mining Applications

DB meeting : Friday, July 11th, 2:00 pm, DC1331
Speaker: Lubomir Stanchev
Topic: Index Selection for Embedded Control Applications using Description Logics
Abstract: In the paper we examine the problem of automated index selection for Embedded Control Programs (ECPs). Such systems are usually characterized by the property that the transaction types, which can consist of queries and updates, are predefined and can be classified as either critical or non-critical. In the paper we concern ourselves exclusively with the critical part of an ECP transaction workload. More precisely, our problem input consists of a set of critical transaction types and a semantic database schema. The goal is to find the minimal number of extended indices that will allow every critical operation to be performed efficiently. The proposed solution uses reasoning over the Description Logics (DL) dialect DLFD in order to transform the original problem into an NP-complete problem that can be solved in exponential time using dynamic programming.

DB meeting : Friday, July 18th, 2:00 pm, DC1331
Speaker : Huizhu Liu
Topic: TBA
Abstract: TBA

DB meeting : Friday, July 25th, 2:00 pm, DC1331
Speaker : Matthew Young-Lai
Topic: I'll give an overview of the XML features that have been added to ASA, and talk about some details of the XPath implementation.

This page is maintained by  Ken Salem.