Database Research Group Events

Winter 2002

DB meeting: Friday, January 11th, 2:00 pm, DC1331
Speaker: Ivan Bowman
Topic: Reducing Client/Server Latency by Detecting Stylized Usage
Snacks: Hui Zhang
Abstract: Current applications in client-server database systems may suffer a performance penalty due to the latency associated with networked communication. By recognizing stylized patterns in the requests submitted by applications, we can reduce this latency through prefetching of results that are likely to be needed in the near future. We can do even better if we can recognize patterns that can be implemented using relational primitives such as joins and unions; these can be processed more efficiently if expressed as queries to the database engine. This talk will discuss the types of patterns that commonly occur in database client applications, and will show how these patterns can be effectively recognized and exploited to improve performance.
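As a rough illustration of the kind of pattern the abstract describes, the sketch below contrasts a nested sequence of client requests with a single join that returns the same rows. It is not taken from the talk: the `customer` and `orders` tables, their contents, and both function names are hypothetical, and an in-memory SQLite database stands in for a networked server.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
""")


def nested_requests(conn):
    """Stylized client pattern: one outer query, then one inner query per row."""
    rows = []
    outer = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
    for cid, name in outer:
        inner = conn.execute(
            "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
            (cid,),
        )
        for oid, total in inner:
            rows.append((name, oid, total))
    return rows


def single_join(conn):
    """Same result expressed as one join, so only one request is issued."""
    return conn.execute(
        "SELECT c.name, o.id, o.total "
        "FROM customer c JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.id, o.id"
    ).fetchall()
```

With a real server, the nested version pays one round trip per customer, while the join pays one in total; both return the same rows here.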

DB meeting: Friday, January 18th, 2:00 pm, DC1304
Speaker: Ningyan Zhong
Topic: Constraint Databases
Snacks: Ivan Bowman
Abstract: Research on constraint databases (CDB) began in 1990 and grew out of work on Datalog and constraint logic programming. The key idea is that a tuple in a relational database can be replaced by a conjunction of constraints from an appropriate language, and that many features of the relational model can be extended accordingly. In a CDB, infinite data can be stored in a finite, compact form, and the query complexity doesn't depend on the size of the data. In this talk, we will give a brief introduction to CDB (definition, expressive power, and applications). Then some issues in constraint logic programming will be discussed (constraint solving techniques in LP). The main purpose of this talk is to illustrate a framework for extending Datalog with constraints, a technique based on program transformation and the definition of constraint operations. Together with examples implemented on several different constraint classes, we will also introduce two optimization techniques (partial evaluation and memoing evaluation on constraints). Finally, the talk will conclude with some future work.

DB meeting: Friday, January 25th, 2:00 pm, DC3301 (the DB Lab)
Speaker: come for a coffee break - no speaker today

DB seminar: Friday February 1st, 10:00AM, DC1304
Speaker: Paul Cotton, Microsoft Canada
Title: W3C XML Query WG: A Status Report

DB meeting: Friday, February 8th, 2:00 pm, DC1331
Speaker: Yasser Ebrahim
Topic: Mental Models and Comprehension of Diagrammatic Representations
Snacks: Ningyan Zhong

DB seminar: Monday February 11th, 11:30AM, DC1304
Speaker: Pat Martin, Queen's University
Title: Self-Managing Database Management Systems

DB meeting: Friday, February 15th, 2:00 pm, DC1331
Speaker: Glenn Paulley
Topic: The Index Selection Problem
Snacks: Yasser Ebrahim
Abstract: A long-standing research problem in physical database design is the index selection problem (ISP). Essentially the ISP is: given a database instance and a workload comprised of both queries and update DML statements, determine the optimal set of indexes that results in the lowest overall elapsed time. Usually the problem is constrained in some way: two typical examples are
  1. at most k indexes can be created, or
  2. the sum of the sizes of the recommended indexes may not exceed n disk pages.
In this talk I will give a critical overview of some of the ISP research literature that has been published since 1973, and describe recent attempts to solve this problem in commercial database systems, namely DB2 and Microsoft SQL Server.
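To make the problem statement concrete, here is a minimal brute-force sketch of the first constrained variant (at most k indexes). It is not one of the methods surveyed in the talk: the candidate index names, the per-statement costs, and the simple cost model (each query uses its cheapest available index, each update pays a maintenance charge per index) are all invented for illustration.

```python
from itertools import combinations

# Hypothetical costs: each query's cost with no index (None) or a given index.
QUERY_COST = {
    "q1": {None: 100, "idx_a": 20, "idx_b": 90},
    "q2": {None: 80, "idx_b": 30, "idx_c": 25},
}
# Hypothetical update statements and the extra cost each index adds to them.
UPDATE_BASE = {"u1": 10}
MAINTENANCE = {"idx_a": 15, "idx_b": 5, "idx_c": 40}
CANDIDATES = sorted(MAINTENANCE)


def workload_cost(indexes):
    """Total cost of the workload under a chosen set of indexes."""
    chosen = set(indexes)
    total = 0
    for costs in QUERY_COST.values():
        # A query runs with the cheapest access path that is available.
        total += min(c for idx, c in costs.items() if idx is None or idx in chosen)
    for base in UPDATE_BASE.values():
        # Every chosen index must be maintained on each update.
        total += base + sum(MAINTENANCE[i] for i in chosen)
    return total


def best_indexes(k):
    """Exhaustively search all index sets of size at most k."""
    best = ((), workload_cost(()))
    for size in range(1, k + 1):
        for combo in combinations(CANDIDATES, size):
            cost = workload_cost(combo)
            if cost < best[1]:
                best = (combo, cost)
    return best
```

Exhaustive search is only feasible for a handful of candidates, since the number of subsets grows exponentially. Note also that in this toy model allowing a third index does not help, because the added maintenance cost outweighs the query savings.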

DB meeting: Friday, March 1st, 2:00 pm, DC1331
Speaker: Charlie Clarke
Topic: A Domain-Specific Language for Web Data Gathering
Snacks: Glenn Paulley

DB meeting: Friday, March 8th, 2:00 pm, MC5136
Speaker: Ning Zhang
Topic: Towards Optimization of XML Query - A First Step
Snacks: Charlie Clarke

DB seminar: Monday March 11th, 11:00AM, DC1304
Speaker: H.V. Jagadish, University of Michigan
Title: Timber: A Native XML Database Management System

DB meeting: Friday, March 15th, 2:00 pm, DC1331
Speaker: Ani Nica
Topic: Optimization techniques for queries containing subqueries and aggregations
Snacks: Ning Zhang
Abstract: In this talk I will present an overview of subquery evaluation techniques. The overview will cover optimizations such as correlation removal, subquery flattening, and using outer joins for subquery execution. The presentation will include examples of how these techniques are used for queries from the TPC-H benchmark.

DB meeting: Friday, March 22nd, 2:00 pm, DC1331
Speaker: Khuzaima Daudjee
Topic: Data Staging for On-Demand Broadcast, by Aksoy, Franklin & Zdonik (VLDB'01)
Snacks: Ani Nica

DB meeting: Friday, April 5th, 2:00 pm, DC1331
Speaker: Xuerong Tang
Topic: Three perspectives on XSLT
Snacks: Khuzaima Daudjee
Abstract: XSLT (Extensible Stylesheet Language Transformations) is a language for transforming one XML document into another. It defines the transformation part of the W3C XSL specification, which also contains a second piece: a formatting vocabulary. Since XSLT became a W3C recommendation in 1999, it has generated a lot of interest among programmers, database theoreticians, and curious ordinary users, which brings three different perspectives on XSLT. I will informally talk about these perspectives one by one.

DB seminar: Monday April 8th, 11:00AM, DC1304
Speaker: Daniel Barbará, George Mason University
Title: Clustering by Impact: Scalable, Incremental Clustering of Data Streams

DB meeting: Friday, April 12th, 2:00 pm, DC1331
Speaker: Grant Weddell
Topic: Fine Grained Information Integration with Description Logic
Snacks: Xuerong Tang

DB meeting: Friday, April 19th, 2:00 pm, DC1331
Speaker: Kong Ching Ma
Topic: Approximate Query Answering for Aggregate Queries
Snacks: Grant Weddell
Abstract: Over the last decade, we have seen a trend toward using large-scale databases to support decisions in applications known as Online Analytical Processing (OLAP) applications. The databases used in these systems are so large that traditional query answering techniques take a very long time to finish. Because users of these applications do not require a very accurate answer, there are different approaches that minimize the response time at the expense of accuracy. This talk is a survey of the current approaches to approximating such queries. (PowerPoint slides)
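The abstract does not say which approaches the survey covers. As one simple example of trading accuracy for response time, the sketch below estimates a SUM aggregate from a uniform random sample and scales it up to the full table. The function name and parameters are hypothetical, and a plain Python list stands in for a database column.

```python
import random


def approximate_sum(values, sample_size, seed=0):
    """Estimate sum(values) from a uniform random sample without replacement.

    The sample total is scaled by len(values) / sample_size, so only
    sample_size rows are read instead of the whole column.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rng.sample(values, sample_size)
    scale = len(values) / sample_size
    return scale * sum(sample)
```

For example, `approximate_sum(column, 1000)` reads 1000 rows regardless of the column's size. The estimate is exact when the sample covers the whole column, and its error generally shrinks as the sample grows.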

DB meeting: Friday, April 26th, 2:00 pm, DC1331
Speaker: Sunny Lam
Topic: WebQA: A Web Querying System That Uses A QA Approach
Snacks: Kong Ching Ma
Abstract: As the size of the World Wide Web (WWW or Web) has grown, access to the data on the Web has become a significant problem. Data on the Web are managed by many individuals, organizations, and companies; they are stored in many different locations and adhere to very different formats. These factors contribute to the difficulty of retrieving Web data. The common paradigm for searching and retrieving information on the Web is keyword-based search using one or more search engines, followed by browsing through the large number of returned URLs. This thesis investigates a declarative query-based approach to Web data retrieval that uses question-answering technology to extract information from Web sites that are retrieved by search engines. The approach consists of first using meta-search techniques in an open environment to gather candidate responses from search engines and other on-line databases, and then using information extraction techniques to find the answer to the specific question from these candidates. A prototype system, called WebQA, has been developed to test this approach. Testing includes evaluation of its performance as a question-answering system using a well-known evaluation system called TREC-9. Its accuracy using TREC-9 data for simple questions is high and its retrieval performance is good. The system employs an open system architecture allowing for on-going improvements in various aspects.

DB seminar: Monday April 29th, 11:00AM, DC1304
Speaker: Philip Bernstein, Microsoft Research
Title: Generic Model Management - A Database Infrastructure for Schema Manipulation

This page is maintained by Frank Tompa and Ken Salem.