|DB meeting :||Friday, May 2nd, 2:00 pm, DC1331|
|Speaker :||Lei Chen
|Topic:||Overview of techniques for similarity-based
retrieval on time series data
|Abstract:||In this talk, I will give a
brief overview of the techniques that have been applied to similarity-based
time series data retrieval. Two main aspects will be discussed:
similarity metrics and dimensionality reduction techniques. At the end,
I will talk about some existing problems in similarity-based multi-dimensional
time series data retrieval.
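The abstract does not name specific techniques. As one illustration of how a similarity metric and a dimensionality reduction technique can work together, the sketch below assumes Euclidean distance as the metric and piecewise aggregate approximation (PAA) as the reduction; both choices, and all function names, are assumptions made for this example rather than content from the talk. The scaled distance between the reduced series never exceeds the true distance, which is what allows a reduced index to discard candidates without missing real matches.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def paa(series, segments):
    """Reduce a series to `segments` values by averaging equal-width windows."""
    n = len(series)
    if segments <= 0 or n % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = n // segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


def paa_distance(x, y, segments):
    """Distance computed on the reduced series, scaled by sqrt(window width).

    This value lower-bounds euclidean(x, y).
    """
    width = len(x) // segments
    return math.sqrt(width) * euclidean(paa(x, segments), paa(y, segments))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    y = [2, 2, 2, 2, 6, 6, 6, 6]
    print("reduced x:", paa(x, 2))
    print("lower bound:", paa_distance(x, y, 2))
    print("true distance:", euclidean(x, y))
```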
|DB Seminar :||Monday, May 12th, 11:00 am, DC1304|
|Speaker :||Aidong Zhang, University at Buffalo
|Title:||Bioinformatics: Gene Expression Data Analysis|
|DB meeting :||Friday, May 16th, 2:00 pm, DC1331|
|Speaker :||Fei Ku
|Topic:||Towards Automatic Initial Buffer Configuration
|Abstract:||Buffer pools are blocks of memory
used in commercial database systems to retain frequently referenced
pages. Configuring the buffer pools is a difficult and manual
task. A good buffer configuration improves performance by reducing
the number of disk accesses. Empirical studies have shown that optimizing
the initial buffer configuration (determined at database design time)
can improve system performance. A good initial configuration can
also provide a faster convergence towards an optimal dynamic buffer allocation.
Previous studies have not considered automating the buffer pool configuration.
In this talk, I will present two techniques that facilitate initial buffer configuration. First, a workload characterization scheme is proposed. Second, an analytic model of the GCLOCK buffer replacement policy is developed that can be used to evaluate the effectiveness of a particular buffer configuration for a given workload. Finally, I will present results from our experimental study, which demonstrate the effectiveness of our proposed techniques.
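For readers unfamiliar with GCLOCK, the following is a minimal simulation sketch of one common variant: each frame carries a reference counter, a hit increments the counter, and on a miss the clock hand sweeps the frames, decrementing counters until it finds one at zero to evict. The class name `GClockBuffer`, the parameters `initial_count` and `hit_increment`, and the sample trace are illustrative assumptions. This is a simulator for measuring hit ratios on a trace, not the analytic model described in the abstract.

```python
class GClockBuffer:
    """Simulate a buffer pool managed by a generalized clock (GCLOCK) policy."""

    def __init__(self, capacity, initial_count=1, hit_increment=1):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.initial_count = initial_count  # counter given to a newly loaded page
        self.hit_increment = hit_increment  # amount added to the counter on a hit
        self.frames = []                    # each frame is [page_id, counter]
        self.index = {}                     # page_id -> position in self.frames
        self.hand = 0                       # current clock hand position
        self.hits = 0
        self.misses = 0

    def access(self, page_id):
        """Reference a page; return True on a buffer hit, False on a miss."""
        pos = self.index.get(page_id)
        if pos is not None:
            self.frames[pos][1] += self.hit_increment
            self.hits += 1
            return True

        self.misses += 1
        if len(self.frames) < self.capacity:
            # A free frame is still available; no eviction needed.
            self.index[page_id] = len(self.frames)
            self.frames.append([page_id, self.initial_count])
            return False

        # Sweep: decrement counters until a frame with counter 0 is found.
        while True:
            frame = self.frames[self.hand]
            if frame[1] == 0:
                del self.index[frame[0]]
                frame[0] = page_id
                frame[1] = self.initial_count
                self.index[page_id] = self.hand
                self.hand = (self.hand + 1) % self.capacity
                return False
            frame[1] -= 1
            self.hand = (self.hand + 1) % self.capacity


def hit_ratio(trace, capacity):
    """Replay a page reference trace and return the fraction of hits."""
    buf = GClockBuffer(capacity)
    for page in trace:
        buf.access(page)
    return buf.hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    trace = [1, 2, 1, 3, 1]
    print("hit ratio with 2 frames:", hit_ratio(trace, 2))
```

Replaying the same trace against several candidate buffer sizes is one simple way to compare configurations empirically.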
|DB meeting :||Friday, May 23rd, 2:00 pm, DC1331|
|Speaker :||Hai Chen
|Topic:||Supporting set-at-a-time extensions for XML through DOM|
|Abstract:||With the rapid growth of the
web and e-commerce, W3C produced a new standard, XML, as a universal data
representation format to facilitate information interchange and integration
from heterogeneous systems. To process XML, documents need to
be parsed so that users can then access, manipulate, and retrieve the XML
data easily. DOM is one of the application programming interfaces for this purpose; it provides
an abstract, logical tree structure of an XML document. Though it is a very
promising initiative in defining a standard interface through which applications
can access all the data they require in XML documents, significant benefit can result
if some functions are extended. In this thesis, we aim to support set-at-a-time
extensions for XML through DOM. By adding functions that
gather summary information from different groups of related document elements,
and that filter, extract, and transform this information as a sequence of nodes,
the extended DOM can reduce the communication overhead and response time
between the client and the server and therefore provide applications with
more convenience. Our work explores ideas for set-at-a-time processing,
defines and implements some suitable methods, and codes some query application
examples using the DOM and our extensions in order to compare them through
a test system. Finally, future work will be discussed.
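To make the idea of a set-at-a-time extension concrete, here is a small sketch built on Python's standard `xml.dom.minidom`. The helper `sum_by_group`, the sample document, and its tag and attribute names are hypothetical and are not the methods defined in the thesis. The point is only that a single call returns a summary over a whole group of elements, where the plain DOM would require the application to walk the nodes one at a time.

```python
from xml.dom import minidom

# Hypothetical sample document used only for this illustration.
XML = """<orders>
  <order customer="a"><total>10</total></order>
  <order customer="b"><total>5</total></order>
  <order customer="a"><total>7</total></order>
</orders>"""


def text_of(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(node.data for node in element.childNodes
                   if node.nodeType == node.TEXT_NODE)


def sum_by_group(doc, element_tag, group_attr, value_tag):
    """Set-at-a-time helper: sum `value_tag` contents per `group_attr` value.

    The caller gets one summary dictionary back instead of issuing one
    DOM call per element.
    """
    totals = {}
    for element in doc.getElementsByTagName(element_tag):
        key = element.getAttribute(group_attr)
        for value in element.getElementsByTagName(value_tag):
            totals[key] = totals.get(key, 0.0) + float(text_of(value))
    return totals


if __name__ == "__main__":
    document = minidom.parseString(XML)
    print(sum_by_group(document, "order", "customer", "total"))
```

In a client/server setting, a helper of this kind would run next to the document, so only the summary crosses the connection.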
|DB meeting :||Friday, May 30th, 2:00 pm, DC1331|
|Speaker :||Grant Weddell
|Topic:||On the use of database technology in embedded
|Abstract:||From the point of view of high
level architecture, it is almost always useful to see a control program
as having a repository style wherein a collection of subsystems interact
with a common database. The talk will focus on the idea of using database
technology to support an SQL interface to this database. Included is a
short review of how current relational technology supports this idea.
|DB Seminar :|
|Speaker :||Divesh Srivastava, AT&T Research
|Title:||Phrase Matching in XML|
|DB meeting :||Friday, June 6th, 2:00 pm, DC1331|
|Topic:||Paired SynTrees: specifying structured
|Abstract:||In this talk, I will present
a specification language, called Paired SynTrees, for structured document
We consider instances taken from a collection of documents that share a common structure in the sense that they can all be characterized by grammar rules such as found in a context-free grammar (CFG) or forest-regular grammar (FRG). Thus a single XML (or SGML) document with accompanying DTD (document type definition) is structured. As long as documents do not all conform to a single universal standard, data transformation remains a problem. Thus in the absence of a universal tag set and schema, structured document transformation is important for XML to serve as the data interchange format for the Web.
Many rule-based languages (XSLT, TXL, XDuce, YATL) and XML query languages (XQuery, XML-QL, XML-GL) can be used to carry out such structural transformations, but they are not ideal candidates as a specification language. We therefore propose a new high-level specification language by extending existing grammar-based specification techniques such as SDT, TT grammar and filters.
During the talk, I will first describe the language and show, via examples, how it is used to specify transformations; I will then present a prototype implementation. The system parses the specification in Paired SynTrees, generates XSLT templates, and finally executes the XSLT program to accomplish the transformation.
|DB meeting :||Friday, June 27th, 2:00 pm, DC1331|
|Topic:||My talk will be on "ARC: A Self-Tuning,
Low Overhead Replacement Cache" by Megiddo and Modha. The ARC
policy "works uniformly well across varied workloads and cache sizes without
any need for workload specific a priori knowledge or tuning".
It's online, simple to implement (constant complexity per request), and
offers performance that is usually considerably better than LRU (comparable
to a tuned 2Q).
|DB meeting :||Friday, July 4th, 2:00 pm, DC1331|
|Speaker :||Anil Goel
|Topic:||I'll discuss some work done in the area
of selectivity estimation of LIKE predicates. In particular, I will
describe, in detail, a technique proposed by Krishnan, Vitter, and Iyer
and, time permitting, will talk about a simpler approach implemented
by us for a reduced instance of the same problem.
|DB Seminar :||Monday, July 7th, 11:00am, DC1304|
|Speaker :||Wolfgang Lehner, Technische
|Title:||Database Support for Data Mining Applications|
|DB meeting :||Friday, July 11th, 2:00 pm, DC1331|
|Topic:||Index Selection for Embedded Control Applications
using Description Logics
|Abstract:||In the paper we examine the
problem of automated index selection for Embedded Control Programs (ECPs).
Such systems are usually characterized by the property that the transaction
types, which can consist of queries and updates, are predefined and can be
classified as either critical or non-critical. In the paper we concern ourselves
exclusively with the critical part of an ECP transaction workload. More precisely,
our problem input consists of a set of critical transaction types and a semantic
database schema. The goal is to find the minimal number of extended indices
that will allow for every critical operation to be performed efficiently.
The proposed solution uses reasoning over the Description Logics (DL) dialect
DLFD in order to transform the original problem into an NP-complete problem
that can be solved in exponential time using dynamic programming.
|DB meeting :||Friday, July 18th, 2:00 pm, DC1331|
|Speaker :||Huizhu Liu
|DB meeting :||Friday, July 25th, 2:00 pm, DC1331|
|Speaker :||Matthew Young-Lai
|Topic:||I'll give an overview of the XML features
that have been added to ASA, and talk about some details of the XPath implementation.