Database Research Group Events

Winter 2006

Note: Events of interest to the Database Research Group are posted to the uw.cs.database newsgroup and are mailed to the dbgroup mailing lists: db-faculty (for DB group faculty), db-grads (for DB group graduate students), and db-friends (for DB group alumni, visitors, and friends). If you wish to subscribe to one of these lists, send mail to majordomo@db with "subscribe <list>" in the message body, where <list> is the list you wish to subscribe to.  For example, use "subscribe db-friends" to subscribe to the db-friends list. To unsubscribe, send "unsubscribe <list>" to the same address.
DB group meetings
The DB group meets most Friday afternoons at 2pm, usually in DC1331. See the list of current events for times and locations of upcoming meetings. Each meeting lasts for an hour and features an informal presentation by one of the members of the group. Everyone is welcome to attend. These talks are intended to raise questions and to stimulate discussion rather than being polished presentations of research results. Speakers are determined using a rotating speaker list, which can be found on the DB group meeting page.
DB seminar series
The DB seminar series features visiting speakers. These seminars are more-or-less monthly, and are usually scheduled on Monday mornings at 11am. See the list of current events for times and locations of upcoming seminars. The full schedule can be found on the DB seminar series page.

Recent and Upcoming Events

DB Meeting: Friday January 6, 2:00pm, DC1331
Speaker: David Toman

DB Meeting: Friday January 13, 2:00pm, DC1331
Speaker: Oguzhan Ozmen
Title: End-to-End Database Physical Design in a Virtualized World
Abstract: A database is "physically" composed of tables, materialized views, indexes, and other data(base) structures, together with the meta-information that glues these components (i.e., database objects) together. The organization of these components is called "physical database design". Physical database design has two major steps: selecting the objects, and laying them out on (mapping them to) the storage. Currently available database design tools try to automate or facilitate the first step. For example, the IBM DB2 Advisor Tool employs the optimizer to select which indexes to create for the sake of performance. Database-oriented research is mostly blind to the second step, because a DBMS is not aware of the details of the complex storage system underlying it. This abstraction, which hides the details of the storage system from applications, is called storage virtualization: any application, even the operating system, sees a single storage unit (or a set of storage units), each of which is actually composed of a pool of storage devices. Thus, laying out the physical database objects on the actual storage system is outside the control of the DBMS and is generally handled by human administrators (storage system administrators), most probably not the DBAs. There is also research on the design of physical storage systems; however, it is mostly independent of the database-oriented approach and is driven by different concerns, e.g., lowering cost. Thus, the second step of physical database design, automating the layout of database objects, remains an open problem that is largely untouched compared to the first step. In this talk, the current state of layout automation will be presented, and a possible way of combining database-oriented and storage-system-oriented approaches to achieve an "end-to-end database physical design" will be discussed.

DB Seminar: Monday January 16, 11:00am, MC5136 (Please note room change)
Speaker: Raghu Ramakrishnan, University of Wisconsin
Title: Discovering Interesting Subsets of Data in Cube Space

DB Meeting: Friday January 20, 2:00pm, DC1331
Speaker: Amir Chinaei
Title: Access Control Management without Administrators
Abstract: Access control is a major component of data sharing systems. Access control management should be decentralized in order to improve the efficiency of the system, and benefit users by providing self-governance. In this talk, I present a User-Managed Access Control (UMAC) model in which administrators are not distinguished from end users. In our model, users manage the data and delegate their responsibilities to others as far as they want, as long as they comply with the corporate policy. The model is policy neutral: the corporate policy is defined when the system is initialized. I will mainly focus on our delegation mechanism through examples and pictures. If time allows, I will compare our contributions with major existing models.

DB Meeting: Friday January 27, 2:00pm, DC1331
Speaker: Huaxin Zhang
Title: Efficient Query Optimization for Complex Query Under Multiple User Customization (Access Controls)
Abstract: Many applications have complex queries used by a large number of users, with each user having his or her own customization of the queries. This customization may come from the need for access controls or from user preferences. Each customized query may differ from all previously seen queries and may need to be re-compiled from scratch, resulting in a large amount of overhead. Traditional parametric query optimization and query clustering approaches are not enough to solve this problem, since the queries can be structurally and semantically different from each other. In this talk, I will demonstrate the distinctness of the problem, and show in detail why previous approaches fail. A possible solution to this problem will be presented at the end.

DB Meeting: Friday February 3, 2:00pm, DC1304 (Please note room change)
Speaker: Andrei Voronkov, Microsoft Research and the University of Manchester
Title: Combatting SUMO and Terrorism with Vampire
Abstract: We identify a number of problems hampering the use of first-order theorem provers for reasoning with large ontologies based on first-order logic and its extensions.

Then we describe a modification of the theorem prover Vampire able to reason with ontologies containing over 20,000 first-order formulas with equality.

We also give a brief analysis of inconsistencies found by Vampire in the SUMO and terrorism ontologies.

DB Meeting: Friday February 10, 2:00pm, DC1331
Speaker: Lukasz Golab
Title: Sliding Window Query Processing over Data Streams
Abstract: Database management systems (DBMSs) have been used successfully in business applications. Typically, it is assumed that the data are relatively static, with database updates occurring less frequently than queries. However, many emerging applications, such as sensor networks, real-time Internet traffic analysis and on-line financial trading, require support for fast processing of possibly infinite streams of data. The fundamental assumption of a data stream management system (DSMS) is that new data are generated continually, making it infeasible to store a stream in its entirety. At best, we may be able to maintain a sliding window of recently arrived data. Since the contents of a sliding window evolve over time, it makes sense for users to ask a query once and receive updated answers as time goes on. In this talk, I will summarize my research on query processing over sliding windows that focuses on two fundamental differences between a DBMS and a DSMS: the time-evolving nature of the data and the long-running nature of the queries.
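
As an illustration only (not taken from the talk), the idea of asking a query once and receiving updated answers over a sliding window can be sketched as follows. This is a minimal sketch assuming a time-based window and a running-average query; the class name `SlidingWindowAverage`, its methods, and the 10-second window used in the usage note are all hypothetical.

```python
from collections import deque


class SlidingWindowAverage:
    """Continuous average over a time-based sliding window (hypothetical example)."""

    def __init__(self, window_seconds):
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self.items = deque()  # (timestamp, value) pairs in arrival order
        self.total = 0.0      # running sum of the values currently in the window

    def insert(self, timestamp, value):
        """Add a new stream item and return the updated answer.

        Assumes timestamps arrive in non-decreasing order.
        """
        self.items.append((timestamp, value))
        self.total += value
        self._expire(timestamp)
        # The item just inserted is always inside the window, so len >= 1.
        return self.total / len(self.items)

    def _expire(self, now):
        """Drop items that have fallen out of the window ending at `now`."""
        cutoff = now - self.window_seconds
        while self.items and self.items[0][0] <= cutoff:
            _, old_value = self.items.popleft()
            self.total -= old_value
```

For example, with `w = SlidingWindowAverage(10)`, each call such as `w.insert(12, 30.0)` returns the average of the values that arrived in the last 10 time units. Only the window contents are stored, never the whole stream.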

DB Seminar: Monday February 13, 11:00am, DC1304
Speaker: Volker Haarslev, Concordia University
Title: Racer - Optimizing in ExpTime and Beyond: Lessons Learnt and Challenges Ahead

CS848 Guest Lecture: Wednesday February 15, 2:00pm, DC 3314
Speaker: Danny Zilio, IBM
Title: Autonomic Features in IBM DB2

DB Meeting: Friday February 17, 2:00pm, DC1331
Speaker: Ning Zhang
Title: Query Processing and Optimization in Native XML Databases
Abstract: XML has evolved from a markup language for web pages into the de facto language for data exchange over the World Wide Web. Declarative query languages, such as XPath and XQuery, have been proposed for querying large volumes of XML data in the same fashion as SQL for relational databases. Over the past few years, many techniques have been proposed for evaluating XML queries more efficiently. Many of these techniques are not only appropriate for XML data per se, but are also applicable to other data sources that can be explicitly or implicitly translated into an XML/hierarchical data model. In this talk, I will first give an overview of the database management issues related to storing and querying large volumes of XML data. Then I will focus on some query processing and optimization techniques that we have worked on in the XDB project at Waterloo: a succinct native XML storage system, a physical operator based on the storage system, and a synopsis structure for estimating the cardinality of a path expression. Finally, I will give an outline of our ongoing research and possible applications of these techniques to other fields of computer science.

ICR Seminar: Monday February 20, 2:30pm, DC 1302
Speaker: Peter Corbett, Network Appliance
Title: Topics in Storage Research; New Advances in RAID Algorithms
Abstract: Network Appliance is actively involved in research and advanced development in many aspects of storage, data management and information retrieval. This talk will present a brief overview of these activities, including parallel file systems, indexing, continuous data protection, and virtualization. One area where we have published results is in erasure coding algorithms for RAID systems. The Row-Diagonal parity algorithm is a two-erasure correcting algorithm that is provably optimal in disk space overhead, disk I/O, and computation complexity. It forms the basis of Network Appliance's RAID-DP (RAID-6) implementation, which protects against two disk failures with a very small performance penalty compared to RAID-4 or RAID-5. The talk will present this algorithm, along with some other new RAID ideas.
Bio: Peter Corbett is a Technical Director at Network Appliance, where he has been working on file systems, scalable storage systems, NFS, pNFS, and RAID. Prior to joining Network Appliance, Peter spent 10 years at IBM Research T. J. Watson Lab, where he worked on parallel processing and parallel file systems. He was the architect of the Vesta and PIOFS parallel file systems, and participated in a number of I/O efforts, including co-authoring the original MPI-IO draft specification and the Scalable I/O Initiative specification. He previously worked at General Electric Corporate Research and Development Center in Schenectady, NY, where he co-developed a digit serial silicon compiler for digital signal processing integrated circuits. Peter received a PhD in Electrical Engineering from Princeton University in 1990, where his thesis topic was interconnection networks, sorting and routing algorithms for parallel computers. He received a BASc and MASc in 1983 and 1985 respectively, also in Electrical Engineering, from the University of Waterloo, where his master's thesis topic was a multi-core processor design with rapid context switching among many threads. Peter holds 19 U.S. patents, and has over 30 technical publications.

CS848 Guest Lecture: Wednesday March 1, 2:00pm, DC 3314
Speaker: Glenn Paulley, Sybase iAnywhere
Title: Self-management Features in Sybase SQL Anywhere

DB Meeting: Friday March 3, 2:00pm, DC1331
Speaker: Xuhui Li
Title: Multi-tier Buffer I/O and Management for Database Systems
Abstract: Buffer management is one of the important tasks of a DBMS. With the trend toward centralized storage management, multi-tier caching architectures are becoming more and more popular. How to manage the multi-tier buffers and improve overall system performance has drawn attention from researchers. In this talk, I'll introduce our study of the I/O patterns of some database buffers running an OLTP workload. I'll talk about how database configurations and buffer replacement algorithms can affect these I/O patterns and, consequently, how these first-tier buffer I/O patterns can be recognized and utilized in second-tier buffer management.

DB Meeting: Friday March 10, 2:00pm, DC1331
Speaker: Gulay Unel
Title: Deciding Second-order Logics using Complex-value Datalog
Abstract: Our work is on determining satisfiability of second-order logic formulas using techniques developed for query evaluation of Complex-value Datalog queries. We show that the use of these techniques---in particular the Magic Set rewriting of Datalog queries and the top-down resolution-based evaluation with memoing---can, in many cases, considerably improve the performance of decision procedures based on the connection between logics and automata, such as the MONA system. In this talk, I will focus on WS1S and WS2S logics, and briefly describe how our method can be extended to other logics whose decidability can be shown using automata-theoretic techniques.

DB Meeting: Friday March 17, 2:00pm, DC1304 (Please note room change)
Speaker: Paul Larson, Microsoft Research
Title: Efficient Maintenance of Materialized Outer-Join Views
Abstract: Prior work on incremental maintenance of materialized views has focused primarily on SPJG views, that is, views limited to projection, selection and inner joins, possibly with a single group-by on top. The talk will describe our recent results on efficient incremental maintenance for outer-join views, that is, views where some or all of the joins are outer joins. We construct maintenance expressions based on a normal form for outer-join expressions. Constraints, in particular foreign-key constraints, are exploited to reduce maintenance overhead. Experimental results show that maintaining an outer-join view need not be more expensive than maintaining an inner-join view. For aggregation views it may actually be cheaper.

This is joint work with Jingren Zhou.

Bio: Paul Larson received his Ph.D. in 1976 from Åbo Akademi University, Finland, where he also served as an Assistant Professor. He joined the Department of Computer Science at the University of Waterloo, Canada in 1981, where he was promoted to Full Professor in 1987. He served as chairman of the department from 1989 to 1992. Dr. Larson moved to Microsoft Research in 1996 where he is a Senior Researcher. He is an ACM Fellow. His primary research interests are query optimization and query processing in database systems.

CS848 Guest Lecture: Wednesday March 22, 4:00pm, DC 1331
Speaker: Khaled Yagoub, Oracle (by teleconference)
Title: Self-management Features in Oracle 10g

DB Seminar: Monday April 17, 11:00am, DC1304
Speaker: Lise Getoor, University of Maryland
Title: Entity Resolution in Relational Data

This page is maintained by Ashraf Aboulnaga.

Data Systems Group
David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1
Tel: 519-888-4567
Fax: 519-885-1208

Last modified: Friday, 01-Jun-2012 11:01:03 EDT