Spring 2009 Events Schedule | Database Research Group | UW

Spring 2009

Note: Events of interest to the Database Research Group are posted to the uw.cs.database newsgroup and are mailed to the db-group@lists.uwaterloo.ca mailing list. There are actually three mailing lists aggregated into the db-group list: db-faculty (for DB group faculty), db-grads (for DB group graduate students), and db-friends (for DB group alumni, visitors, and friends). If you wish to subscribe to one of these three lists (or to unsubscribe), please visit https://lists.uwaterloo.ca/mailman/listinfo/<listname>, where <listname> is the list you wish to subscribe to.
DB group meetings
The DB group meets most Friday afternoons at 2pm, usually in DC1331. See the list of current events for times and locations of upcoming meetings. Each meeting lasts for an hour and features an informal presentation by one of the members of the group. Everyone is welcome to attend. These talks are intended to raise questions and to stimulate discussion rather than to be polished presentations of research results. Speakers are determined using a rotating speaker list, which can be found on the DB group meeting page.
DB seminar series
The DB seminar series features visiting speakers. These seminars are more-or-less monthly, and are usually scheduled on Monday mornings at 11am. See the list of current events for times and locations of upcoming seminars. The full schedule can be found on the DB seminar series page.

Recent and Upcoming Events


DB Meeting: Friday May 8, 2:00pm, DC 1331
Speaker: Vincent Oria, NJIT
Title: Mining Lecture Videos for slides: An Approach to Semantic Querying of lecture videos
Abstract: Pattern matching in video is challenging, especially if the patterns have to reflect some semantics. The aim of the work I will be presenting is to automatically synchronize a sequence of slides with the video of the lecture that used the same slides. Knowing where a particular slide appears in a lecture video can help index the video based on the slides and find where a particular topic is covered. Another application for this work is live video conferences, where matching a slide in a live video with the actual slide can help replace the poor-quality slide in the video with a slide previously downloaded.

Although the problem of pattern matching in video is not new and has already been tackled, existing methods are designed for general-purpose pattern matching and turn out to be very expensive and time consuming. We defined a similarity measure for quickly matching video frames containing slides with the original slides.


DB Seminar: Monday May 11, 10:30am, DC 1304
Speaker: Chris Olston, Yahoo! Research
Title: Pig: High-Level Dataflow on top of Map-Reduce

DB Meeting: Friday May 22, 2:00pm, DC 1331
Speaker: Ahmed Soror
Title: A Virtualization Design Advisor for DBMS Workloads
Abstract: In this talk we address the problem of automatically configuring multiple virtual machines that are all running database systems and sharing a pool of physical resources. Our approach to solving this problem is implemented as a virtualization design advisor (VDA) that takes information about the different database workloads and uses this information to determine how to split the available physical computing resources among the virtual machines prior to runtime. We will give an overview of how our VDA works pre- and post-runtime. We will focus on how the VDA makes use of actual performance measurements to refine the cost models used for the recommendations. We will also present a dynamic resource re-allocation scheme where the VDA uses runtime information to react to dynamic changes in the workloads.

DB Meeting: Friday June 5, 2:00pm, DC 1331
Speaker: Ashraf Aboulnaga
Title: Overview of Some Recent Papers on Ad-hoc Data Integration
Abstract:

DB Seminar: Monday June 8, 10:30am, DC 1304
Speaker: Boon Thau Loo, University of Pennsylvania
Title: Declarative Secure Distributed Systems

MMath Thesis Presentation: Wednesday June 17, 9:30am, DC 2314
Speaker: Tim Benke
Title: Flexible Monitoring of Storage I/O

DB Meeting: Friday June 19, 2:00pm, DC 1331
Speaker: Lei Zou, Huazhong University of Science and Technology
Title: DistanceJoin: Pattern Match Query In a Large Graph Database
Abstract: The growing popularity of graph databases has generated interesting data management problems, such as subgraph search, shortest-path query, reachability verification, and pattern matching. Among all these interesting queries, a pattern matching query is more flexible than a subgraph search and more informative than a shortest-path or reachability query. In this talk, I will present our recent work, which addresses pattern matching problems over a large data graph G. Specifically, given a pattern graph (i.e., query Q), we want to find all matches (in G) that have connections similar to those specified in Q. In order to reduce the search space significantly, we first transform the vertices into points in a vector space via graph embedding techniques, converting a pattern matching query into a distance-based multi-way join problem over the transformed vector space. We also propose several pruning strategies and a join order selection method to perform join processing efficiently.

DB Meeting: Friday June 26, 2:00pm, DC 1331
Speaker: Peter Bumbulis, Sybase iAnywhere
Title: Stratified Indexes - A Method for Creating Balanced Search Structures
Abstract: I'd like to discuss the paper "Stratified Indexes - A Method for Creating Balanced Search Structures" by Alon Itai and Moshe Shadmon. It presents a disk-based PATRICIA trie as an instance of a more general approach for constructing access methods.

DB Meeting: Friday July 10, 2:00pm, DC 1331
Speaker: Xuhui Li
Title: Delayed Synchronization of Writes
Abstract: Some applications, e.g. DBMS, prefer to use synchronized writes to ensure data integrity and durability. Although this approach is safe, it ignores the underlying cache tiers and is not I/O efficient. We propose a new I/O operation and a revised I/O interface which provides applications the flexibility to synchronize their write I/Os only when necessary. Our evaluation shows significant performance gains by using the proposed approach.

DB Meeting: Friday July 17, 2:00pm, DC 1331
Speaker: Ani Nica, Sybase iAnywhere
Title: SQL Anywhere Optimizer
Abstract: In this talk I will present an overview of the features of the SQL Anywhere Optimizer, including its search space generation algorithm and the support for materialized views in the SQL Anywhere server. The talk will also showcase the SearchSpaceAnalyzer system [1], which was demonstrated this year at the SIGMOD conference held in Providence, Rhode Island, USA. The SearchSpaceAnalyzer system is a research prototype used to analyze the search spaces generated by the SQL Anywhere Optimizer during the optimization process of a SQL statement. Specifically, the system visualizes and analyzes (1) a single search space and (2) the differences between two search spaces generated for the same query by two different optimization processes.

[1] Anisoara Nica, Daniel Scott Brotherston, David William Hillis: Extreme visualisation of query optimizer search spaces. SIGMOD Conference 2009: 1067-1070


DB Meeting: Friday July 24, 2:00pm, DC 1331
Speaker: Luiz Celso Gomez Jr.
Title: Database augmentation through Information Extraction
Abstract: Information Extraction has proven to be a valuable tool in harvesting the data hidden in text documents. In this talk I will focus on the integration of Information Extraction and Relational Databases, presenting techniques aimed at expanding a preexisting database with data extracted from text corpora.

DB Meeting: Friday July 31, 2:00pm, DC 1331
Speaker: Patrick Kling
Title: Optimizing distributed XML queries through localization and pruning
Abstract: Distributing data collections by fragmenting them is an effective way of improving the scalability of a database system. While the distribution of relational data is well understood, the unique characteristics of the XML data and query model present challenges that require different distribution techniques. In this talk, I will present solutions to two of the problems encountered in distributed query processing and optimization on XML data, namely localization and pruning. Localization takes a fragmentation-unaware query plan and converts it to a distributed query plan that can be executed at the sites that hold XML data fragments in a distributed system. I will show how the resulting distributed query plan can be pruned so that only those sites are accessed that can contribute to the query result. I will demonstrate that these techniques significantly improve the performance of distributed query execution when they are integrated into an XML database system.

This page is maintained by Ashraf Aboulnaga.