Fall 2009
Events of interest to the
Database Research Group are posted here, and are also
mailed to the uw.cs.database newsgroup and the
db-faculty,
db-grads,
db-friends
mailing lists.
Subscribe to one of these mailing lists to receive e-mail notification
of upcoming events.
The DB group meets most Friday afternoons at 2pm.
The list below gives the
times and locations of upcoming meetings.
Each meeting lasts for an hour and features either
a local speaker or, on
Seminar days,
an invited outside speaker.
Everyone is welcome to attend.
Recent and Upcoming Events
DB Meeting:
|
Wednesday September 23, 2:30pm, DC 1331
|
Speaker:
|
Amod Gupta
|
Title:
|
Audio Processing on Constrained Devices
(Masters thesis presentation)
|
Abstract:
|
This thesis discusses the future of smart business applications on
mobile phones and the integration of voice interface across several
business applications. It proposes a framework that provides speech
processing support for business applications on mobile phones. The
framework uses Gaussian Mixture Models (GMM) for low-enrollment
speaker recognition and limited vocabulary speech
recognition. Algorithms are presented for pre-processing of audio
signals into different categories and for start and end point
detection. A method is proposed for speech processing that uses Mel
Frequency Cepstral Coefficients (MFCC) as the primary features
extracted. In addition, optimization schemes are developed to improve
performance, and overcome constraints of a mobile phone. Experimental
results are presented for some prototype applications that evaluate
the performance of computationally expensive algorithms on constrained
hardware. The thesis concludes by discussing the scope for improvement
for the work done in this thesis and future directions in which this
work could possibly be extended.
|
DB Meeting:
|
Wednesday September 30, 2:30pm, DC 1331
|
Speaker:
|
Ihab Ilyas
|
Title:
|
Creating Competitive Products
|
Abstract:
|
The importance of dominance and skyline analysis has been well
recognized in multi-criteria decision making applications. Most
previous work studies how to help customers find a set of "best"
possible products from a pool of given products. In this work, we
identify an interesting problem, creating competitive products:
Given a set of products in the existing market, we want to study how
to create a set of "best" possible products such that the newly
created products are not dominated by the products in the existing
market. We refer to such products as competitive products. A
straightforward solution is to generate a set of all possible products
and check for dominance relationships. However, the whole set is quite
large. We propose a solution to generate a subset of this set
efficiently.
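As a rough illustration of the straightforward solution mentioned in the abstract (not the method proposed in the paper), the sketch below generates every candidate product and keeps those that no existing product dominates. It assumes that smaller values are better on every attribute and that candidates are formed as the Cartesian product of per-attribute options; the data and the names `dominates` and `competitive_products` are hypothetical.

```python
from itertools import product


def dominates(p, q):
    """Return True if p dominates q: p is no worse than q on every
    attribute and strictly better on at least one (smaller is better)."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def competitive_products(market, candidates):
    """Keep the candidates that no product in the existing market dominates."""
    return [c for c in candidates
            if not any(dominates(m, c) for m in market)]


# Hypothetical data: each product is (price, weight); smaller is better.
market = [(300, 2.0), (500, 1.0)]

# Assumption for illustration: candidates are all combinations of the
# available price and weight options.
candidates = list(product([250, 400, 600], [0.8, 1.5, 2.5]))

result = competitive_products(market, candidates)
print(result)
```

The candidate set grows multiplicatively with the number of options per attribute, which is the blow-up that motivates generating only a subset.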
Papers:
"Creating Competitive Products".
Qian Wan, Raymond Chi-Wing Wong, Ihab F. Ilyas, M. Tamer Ozsu and Yu Peng.
In VLDB 2009, Lyon, France - PVLDB 2(1): 898-909 (2009).
|
DB Meeting:
|
Wednesday October 7, 3:30pm, DC 1331 (Note special time.)
|
Speaker:
|
Chen Zhang, Dept. of Applied Mathematics
|
Title:
|
CloudWF: A Computational Workflow System for Clouds Based on Hadoop
|
Abstract:
|
In this talk we introduce the design and implementation of CloudWF, a
scalable and lightweight computational workflow system for clouds on
top of Hadoop. CloudWF can run workflow jobs composed of multiple
Hadoop MapReduce or legacy programs. The system features the
following: a simple workflow description language that encodes
workflow blocks and block-to-block dependencies separately as
standalone
executable components; a workflow storage method that uses Hadoop
HBase sparse tables to store workflow information internally and
reconstruct workflow block dependencies implicitly for efficient
workflow execution; transparent file staging with Hadoop DFS; and
decentralized workflow execution management relying on the MapReduce
framework for task scheduling and fault tolerance.
|
DB Meeting:
|
Wednesday October 14, 2:30pm, DC 1331
|
Speaker:
|
Wei Jiang
|
Title:
|
Incompleteness of Ordered Relational Algebras
|
Abstract:
|
The expressive power of query languages is one of the most deeply
explored topics in database theory. As in conventional relational
databases, the first-order calculus is also a yardstick for measuring
the expressive power of query languages in ordered relational
databases.
In this talk, I will discuss two representations of ordered relational
queries: the two-sorted first-order calculus FO≤, and the ordered
relational algebra. I will show that the ordered relational algebra is
less powerful than the two-sorted first-order calculus FO≤, and
therefore cannot express all first-order expressible ordered
relational queries. This causes serious problems when implementing
ordered query processing, because typical implementations in
conventional databases are essentially based on the equivalence of the
relational algebra and first-order calculus.
|
DB Meeting:
|
Wednesday October 28, 2:30pm, DC 1331
|
Speaker:
|
Andrew Kane
|
Title:
|
Unusual Disk Optimization Techniques
|
Abstract:
|
I will first present a brief overview of the I/O stack including
traditional, log-structured and journaling file systems. I will then
present high-level descriptions of several optimization techniques
used to improve read/write performance on a single hard disk drive.
These optimization techniques will include freeblock scheduling,
track-based logging, eager writing, dual-actuator disks, and virtual
logs.
|
DB Meeting:
|
Wednesday November 11, 2:30pm, MC 5158
|
Speaker:
|
Kevin Haas, Yahoo! Search
|
Title:
|
Using Semantic Web Data Within Web Search
|
Abstract:
|
This talk will present the current state
of the semantic web and ongoing research challenges, common
applications of semantic web data, and two years of lessons learned
while using the semantic web to improve Yahoo's web search results.
Kevin's
slides are available on slideshare.
|
CS 741 class:
|
Thursday November 19, 9:30, DC 1304 (Note: non-standard time and room)
|
Speaker:
|
Dirk Van Gucht, Indiana University
|
|
Title:
|
The Duality Between Query Languages and Index Structures
|
Abstract:
|
With some of my colleagues, I discovered a new methodology for index
design. The principal idea is to start from a query language, study its
finite model theoretic properties, and derive from those an
index that can perfectly answer queries in that language (i.e., by
just streaming out the data). Conversely, given an index, we can
derive the query language that the index supports perfectly. Thus we
have a duality between query languages and index structures that can
be, and has been, used in practice, in particular in the context of query
processing on XML databases. We are currently extending this work so
that it can also apply to querying graphs (such as RDF).
|
DB Meeting:
|
Wednesday November 25, 2:30pm, DC 1331
|
Speaker:
|
George Beskales
|
Title:
|
Modeling the Possible Repairs of Functional Dependencies Violations
|
Abstract:
|
Violations of integrity constraints are found in real-world
databases. Such violations arise in several scenarios such as data
integration and Web data extraction. Repairing violations of a
particular class of integrity constraints, namely Functional
Dependencies (FDs), has gained increasing attention in the past few
years.
Previous work concentrated on producing one clean database instance
that satisfies all FDs by performing the minimum number of changes to
the dirty database. Unfortunately, generating one clean instance while
ignoring other possible (potential) repairs results in information
loss. Other approaches overcome such a problem by providing consistent
query answers (i.e., the answers that are true in every possible
repair).
Our goal is to generalize consistent query answers by viewing the
possible repairs as the outcome of a random process (i.e., the
repairing process). This allows for associating query answers with
the probability of being a true answer. In this talk, I will review
some of the previous work and provide an overview of our approach.
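To make the distinction between consistent answers and probabilistic answers concrete, here is a minimal sketch (not the speaker's approach). It assumes the possible repairs have already been enumerated and assigned probabilities that sum to 1; the relation, the query, and the names `answer_probabilities` and `people_in_toronto` are hypothetical.

```python
import math
from collections import defaultdict


def answer_probabilities(repairs, query):
    """Sum, for each answer, the probability of the repairs that return it.

    repairs: list of (probability, instance) pairs, where an instance is a
             set of tuples and the probabilities are assumed to sum to 1.
    query:   function mapping an instance to a set of answers.
    """
    probs = defaultdict(float)
    for probability, instance in repairs:
        for answer in query(instance):
            probs[answer] += probability
    return dict(probs)


# Hypothetical dirty relation (name, city) violating the FD name -> city:
# "ann" appears with two different cities, so there are two repairs.
repairs = [
    (0.5, {("ann", "Waterloo"), ("bob", "Toronto")}),
    (0.5, {("ann", "Toronto"), ("bob", "Toronto")}),
]


def people_in_toronto(instance):
    """Hypothetical query: names of people whose city is Toronto."""
    return {name for name, city in instance if city == "Toronto"}


probs = answer_probabilities(repairs, people_in_toronto)

# Consistent answers are those returned in every repair, i.e. probability 1.
consistent = {a for a, p in probs.items() if math.isclose(p, 1.0)}
print(probs, consistent)
```

Under these assumptions, consistent query answering reports only the answers with probability 1, while the probabilistic view also retains answers that hold in some repairs but not all.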
|
DB Meeting:
|
Wednesday December 2, 2:30pm, DC 1331
|
Speaker:
|
Dan Farrar
|
Title:
|
Mining the Metadata: Schema as Software Specification
|
Abstract:
|
Relational databases contain relatively large quantities of metadata,
describing not only the shape of the data but also how it behaves (e.g.,
constraints and triggers). While XML can be used without an associated
schema, many XML schema languages exist (DTD, XSD, RELAX NG, etc.). In
addition to ensuring the integrity of the data (the syntactic level), it is
possible to infer relationships and intended use cases (the semantic level)
from schema information accompanying the data. In this talk, I will
explore techniques for automatically generating applications from this
metadata, using examples from RDBMS and XML datastores.
|
DB Meeting:
|
Wednesday December 9, 2:30pm, DC 1331
|
Speaker:
|
Ian McKillop
|
Title:
|
Using Ontario Health Databases in Database Research
|
Abstract:
|
The University of Waterloo is exploring establishing a linkage with the Institute for Clinical Evaluative Sciences (ICES). ICES is a provincially funded prescribed entity that holds a wide variety of health databases about people who live in Ontario and receive healthcare in Ontario. This talk will describe the nature of the ICES data holdings, explain how UW may be able to access those holdings, and discuss the types of database research questions that ICES would be excited to see UW scientists pursue.
|
DB Meeting:
|
Wednesday December 16, 2:30pm, DC 1331
|
Speakers:
|
William Leung and Hongsen Liu
|
Title:
|
The Eucalyptus Cloud Computing Infrastructure
|
Abstract:
|
Eucalyptus is a system that supports cloud computing on clusters.
We have set up a small Eucalyptus cloud on the muscat cluster.
This talk will provide a short overview of Eucalyptus: what it
does and how to use it. It is intended for those who'd like to
learn more about cloud computing and those who might
want to use a Eucalyptus cloud in their research.
|
This page is maintained
by
Ken Salem.