Fall 2009 Events Schedule | Database Research Group | UW

Fall 2009

Events of interest to the Database Research Group are posted here, and are also mailed to the uw.cs.database newsgroup and the db-faculty, db-grads, db-friends mailing lists. Subscribe to one of these mailing lists to receive e-mail notification of upcoming events.

The DB group meets most Friday afternoons at 2pm. The list below gives the times and locations of upcoming meetings. Each meeting lasts for an hour and features either a local speaker or, on Seminar days, an invited outside speaker. Everyone is welcome to attend.

Recent and Upcoming Events


DB Seminar: Wednesday September 16, 2:30pm, DC 1302
Speaker: Randal Burns, Johns Hopkins University
Title: Engineering the Billion-Object Authenticated Directory

DB Meeting: Wednesday September 23, 2:30pm, DC 1331
Speaker: Amod Gupta
Title: Audio Processing on Constrained Devices
(Masters thesis presentation)
Abstract: This thesis discusses the future of smart business applications on mobile phones and the integration of a voice interface across several business applications. It proposes a framework that provides speech processing support for business applications on mobile phones. The framework uses Gaussian Mixture Models (GMM) for low-enrollment speaker recognition and limited-vocabulary speech recognition. Algorithms are presented for pre-processing of audio signals into different categories and for start and end point detection. A method is proposed for speech processing that uses Mel Frequency Cepstral Coefficients (MFCC) as the primary extracted features. In addition, optimization schemes are developed to improve performance and overcome the constraints of a mobile phone. Experimental results are presented for some prototype applications that evaluate the performance of computationally expensive algorithms on constrained hardware. The thesis concludes by discussing the scope for improving this work and the directions in which it could be extended.

DB Meeting: Wednesday September 30, 2:30pm, DC 1331
Speaker: Ihab Ilyas
Title: Creating Competitive Products
Abstract: The importance of dominance and skyline analysis has been well recognized in multi-criteria decision making applications. Most previous work studies how to help customers find a set of "best" possible products from a pool of given products. In this work, we identify an interesting problem, creating competitive products: Given a set of products in the existing market, we want to study how to create a set of "best" possible products such that the newly created products are not dominated by the products in the existing market. We refer to such products as competitive products. A straightforward solution is to generate the set of all possible products and check for dominance relationships. However, the whole set is quite large. We propose a solution to generate a subset of this set efficiently.

Papers: "Creating Competitive Products". Qian Wan, Raymond Chi-Wing Wong, Ihab F. Ilyas, M. Tamer Ozsu and Yu Peng. In VLDB 2009, Lyon, France - PVLDB 2(1): 898-909 (2009).
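
Illustration (not from the talk or the paper): the "straightforward solution" mentioned in the abstract can be sketched as below. The sketch assumes that smaller attribute values are better and that candidate products are formed by picking one value per attribute from a fixed list of choices; the function names and the sample (price, weight) data are hypothetical, and the paper's own candidate-generation model may differ.

```python
from itertools import product


def dominates(p, q):
    """Return True if p dominates q: p is at least as good as q on every
    attribute and strictly better on at least one (smaller is better)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def competitive_products(existing, attribute_choices):
    """Naive baseline: enumerate every candidate product and keep those
    that no existing product dominates."""
    candidates = product(*attribute_choices)
    return [c for c in candidates
            if not any(dominates(e, c) for e in existing)]


# Hypothetical market of (price, weight) pairs and hypothetical choices.
existing = [(300, 5), (500, 2)]
attribute_choices = [(250, 400, 600), (1, 3, 6)]
result = competitive_products(existing, attribute_choices)
print(result)
```

The number of candidates grows as the product of the number of choices per attribute, which is why enumerating the whole set quickly becomes impractical.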


DB Meeting: Wednesday October 7, 3:30pm, DC 1331 (Note special time.)
Speaker: Chen Zhang, Dept. of Applied Mathematics
Title: CloudWF: A Computational Workflow System for Clouds Based on Hadoop
Abstract: In this talk we introduce the design and implementation of CloudWF, a scalable and lightweight computational workflow system for clouds on top of Hadoop. CloudWF can run workflow jobs composed of multiple Hadoop MapReduce or legacy programs. The system features the following: a simple workflow description language that encodes workflow blocks and block-to-block dependencies separately as standalone executable components; a workflow storage method that uses Hadoop HBase sparse tables to store workflow information internally and reconstruct workflow block dependencies implicitly for efficient workflow execution; transparent file staging with Hadoop DFS; and decentralized workflow execution management relying on the MapReduce framework for task scheduling and fault tolerance.

DB Meeting: Wednesday October 14, 2:30pm, DC 1331
Speaker: Wei Jiang
Title: Incompleteness of Ordered Relational Algebras
Abstract: The expressive power of various query languages is one of the most deeply explored topics in database theory. As in conventional relational databases, first-order calculus is also a yardstick for measuring the expressive power of query languages in ordered relational databases.

In this talk, I will discuss two representations of ordered relational queries: the two-sorted first-order calculus FO≤, and the ordered relational algebra. I will show that the ordered relational algebra is less powerful than the two-sorted first-order calculus FO≤, and therefore cannot express all first-order expressible ordered relational queries. This causes serious problems when implementing ordered query processing, as typical implementations in conventional databases are essentially based on the equivalence of relational algebra and first-order calculus.


DB Seminar: Wednesday October 21, 2:30pm, DC 1302
Speaker: Christopher Ré, University of Wisconsin
Title: Managing Large-scale Probabilistic Databases

DB Meeting: Wednesday October 28, 2:30pm, DC 1331
Speaker: Andrew Kane
Title: Unusual Disk Optimization Techniques
Abstract: I will first present a brief overview of the I/O stack, including traditional, log-structured and journaling file systems. I will then present high-level descriptions of several optimization techniques used to improve read/write performance on a single hard disk drive. These optimization techniques will include freeblock scheduling, track-based logging, eager writing, dual-actuator disks, and virtual logs.

DB Meeting: Wednesday November 11, 2:30pm, MC 5158
Speaker: Kevin Haas, Yahoo! Search
Title: Using Semantic Web Data Within Web Search
Abstract: This talk will present the current state of the semantic web and ongoing research challenges, common applications of semantic web data, and two years of lessons learned while using the semantic web to improve Yahoo's web search results.

Kevin's slides are available on slideshare.


DB Seminar: Wednesday November 18, 2:30pm, DC 1302
Speaker: Dirk Van Gucht, Indiana University
Title: Towards a Theory of Dataspace Queries

CS 741 class: Thursday November 19, 9:30, DC 1304 (Note: non-standard time and room)
Speaker: Dirk Van Gucht, Indiana University
Title: The Duality Between Query Languages and Index Structures
Abstract: With some of my colleagues, I discovered a new methodology for index design. The principal idea is to start from a query language, study its finite model theoretic properties, and derive from those an index that can perfectly answer queries in that language (i.e., by just streaming out the data). Conversely, given an index, we can derive the query language that the index supports perfectly. Thus we have a duality between query languages and index structures that can be and has been used in practice, in particular in the context of query processing on XML databases. We are currently extending this work so that it can also be used to query graphs (such as RDF).

DB Meeting: Wednesday November 25, 2:30pm, DC 1331
Speaker: George Beskales
Title: Modeling the Possible Repairs of Functional Dependencies Violations
Abstract: Violations of integrity constraints are found in real-world databases. Such violations arise in several scenarios such as data integration and Web data extraction. Repairing violations of a particular class of integrity constraints, namely Functional Dependencies (FDs), has gained increasing attention in the past few years.

Previous work concentrated on producing one clean database instance that satisfies all FDs by performing the minimum number of changes to the dirty database. Unfortunately, generating one clean instance while ignoring other possible (potential) repairs results in information loss. Other approaches overcome such a problem by providing consistent query answers (i.e., the answers that are true in every possible repair).

Our goal is to generalize consistent query answers by viewing the possible repairs as the outcome of a random process (i.e., the repairing process). This allows for associating query answers with the probability of being a true answer. In this talk, I will review some of the previous works and provide an overview of our approach.
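
Illustration (not from the talk): a functional dependency X → Y is violated when two tuples agree on X but disagree on Y. The minimal sketch below only detects such violations; it does not attempt any repair. The function name and the sample zip/city rows are hypothetical.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Group rows by their lhs values and report every group that maps
    to more than one distinct rhs value, i.e. violates lhs -> rhs."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key].add(tuple(row[a] for a in rhs))
    return {k: v for k, v in groups.items() if len(v) > 1}


# Hypothetical dirty data: the same zip is recorded with two cities.
rows = [
    {"zip": "N2L", "city": "Waterloo"},
    {"zip": "N2L", "city": "Kitchener"},
    {"zip": "M5S", "city": "Toronto"},
]
violations = fd_violations(rows, ["zip"], ["city"])
print(violations)
```

Each reported group admits more than one possible repair (here, keeping either city for zip N2L), which is the situation the abstract describes.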


DB Meeting: Wednesday December 2, 2:30pm, DC 1331
Speaker: Dan Farrar
Title: Mining the Metadata: Schema as Software Specification
Abstract: Relational databases contain relatively large quantities of metadata, describing not only the shape of the data but also how it behaves (e.g., constraints, triggers, etc.). While XML can be used without an associated schema, many XML schema languages exist (DTD, XSD, RELAX NG, etc.). In addition to ensuring the integrity of the data (the syntactic level), it is possible to infer relationships and intended use cases (the semantic level) from schema information accompanying the data. In this talk, I will explore techniques for automatically generating applications from this metadata, using examples from RDBMS and XML datastores.


DB Meeting: Wednesday December 9, 2:30pm, DC 1331
Speaker: Ian McKillop
Title: Using Ontario Health Databases in Database Research
Abstract: The University of Waterloo is exploring establishing a linkage with the Institute for Clinical Evaluative Sciences (ICES). ICES is a provincially funded prescribed entity that holds a wide variety of health databases about people who live in Ontario and receive healthcare in Ontario. This talk will describe the nature of the ICES data holdings and how UW may be able to access those holdings, and will discuss the types of database research questions that ICES would be excited to see UW scientists pursue.

DB Meeting: Wednesday December 16, 2:30pm, DC 1331
Speakers: William Leung and Hongsen Liu
Title: The Eucalyptus Cloud Computing Infrastructure
Abstract: Eucalyptus is a system that supports cloud computing on clusters. We have set up a small Eucalyptus cloud on the muscat cluster. This talk will provide a short overview of Eucalyptus: what it does and how to use it. It is intended for those who'd like to learn more about cloud computing and those who might want to use a Eucalyptus cloud in their research.

This page is maintained by Ken Salem.