Winter 1999
- Friday 8 January - research group meeting
2pm, DC1331
- Speaker: Koji Ueda
- Snacks: Anthony Cox
- Topic:
- Friday 15 January - research group meeting
2pm, DC1331
- Speaker: Vitaliy Khizder
- Snacks: Koji Ueda
- Title: Vitaliy will be talking about functional dependency constraints
in description logics.
- Friday 22 January - research group meeting
2pm, DC1331
- Speaker: Curtis Cartmill
- Snacks: Vitaliy Khizder
- Title: What is Information Integration?
- Friday 29 January - research group meeting
2pm, DC1331
- Speaker: Tim Snider
- Snacks: Curtis Cartmill
- Topic: I've heard "rumblings" that Tim plans a lively discussion on the topic of semi-structured data! He particularly requests that Frank, Marianno and Paul show up, and has also asked me to be there to ensure an open, non-defensive, calm environment.
- Friday 5 February - research group meeting
2pm, DC1331
- Speaker: Gord Cormack
- Snacks: Tim Snider
- Topic: I am going to talk about one or more of the following things:
1. How can we estimate information retrieval performance on an infinite
collection, based on a sample? [This is related to, but not the same problem
as, the one I spoke about before; that formulation was limited to large finite collections.]
2. How can we formulate 1-, 2-, or 3-term queries that outperform the 60-term
queries that are the state of the art for probabilistic information
retrieval?
3. How can we evaluate queries efficiently for very, very large
corpora (say, 10^12 documents)? I'll talk about probabilistic methods
as well as our own.
All of these are research in progress, for which I
have some ideas but not a complete solution.
- Friday 12 February - research group meeting
2pm, DC1331
- Speaker: Frank Tompa
- Snacks: Gord Cormack
- Topic: On querying XML documents! Frank will be presenting some ideas
he is developing for an XML query language.
- Friday 19 February - research group meeting
2pm, DC1331
- Speaker: Peter Bumbulis
- Snacks: Frank Tompa
- Topic: TBA
- Friday 26 February - research group meeting
2pm, DC1331
- Speaker: Mike Van Biesbrouck
- Snacks: Peter Bumbulis
- Topic: This talk will be an overview of the work that I am doing for
my thesis: compiling GCL queries as if they were functional programs. I
will give a short overview of the MultiText system so that I can explain
what I am trying to do and why it is worthwhile. Some of the optimizations
and code generation details will be discussed. If there is time, I will
hypothesize about the benefits of using lazy evaluation instead of strict
evaluation for the functional programs, something that I am not doing for
my thesis.
- Friday 5 March - research group meeting
2pm, DC1331
- Speaker: Ani Nica
- Snacks: Mike Van Biesbrouck
- Topic: I will present some of the issues from the TODS (vol. 22, no. 1,
March 1997, pages 43-74) paper: "Outerjoin Simplification and Reordering
for Query Optimization" by Cesar Galindo-Legaria and Arnon Rosenthal.
- Friday 12 March - research group meeting
2pm, DC1331
- Speaker: Huizhu Liu
- Snacks: Ani Nica
- Topic: This Friday, I will try to give a survey of information integration,
or multidatabase, projects. Businesses today need to access and combine
data stored in diverse sources with differing capabilities, so information
integration technologies are attracting more and more interest, and many
groups are working on them. I will present the topic
by discussing the various methods used in: 1) overall architecture; 2) data
modelling; 3) source description; and 4) most importantly, query optimization.
In particular, I will talk about the cost-based query optimization of the mediator
in the Garlic project developed at the IBM Almaden lab and, if I have time, the
two-phase query optimization in the DISCO project developed at INRIA.
- Friday 19 March - research group meeting
2pm, DC1331
- Speaker: Glenn Paulley
- Snacks: Huizhu Liu
- Speaker: Christian Combaa
- Snacks: Glenn Paulley
- Topic: For the DB group talk this week, I will discuss the algorithm described in the paper
- "Combinatorial pattern discovery in biological sequences: the TEIRESIAS algorithm" by Isidore Rigoutsos (IBM Thomas J. Watson) and Aris Floratos (Courant Institute of Mathematical Sciences, NYU), published in _Bioinformatics_, vol. 14, no. 1 (1998).
- TEIRESIAS finds all maximally specific, rigid patterns occurring at least a minimum number of times in a set of (biological) sequences. The authors argue that the algorithm runs in time quasi-linear in the size of the generated output.
- I will compare TEIRESIAS to the data-mining algorithm Apriori (another IBM product), in terms of efficiency and applicability to biosequence data.
- Speaker: Paul Ward
- Snacks: Christian Combaa
- Topic: Darrell Raymond's PhD thesis on Partial Order Databases
- Speaker: Arun Marathe
- Snacks: Paul Ward
- Topic: a practice presentation for a paper by Arun and Ken that will be presented at SIGMOD 1999
- Speaker: Connie Zhang
- Snacks: Arun Marathe
- Topic: Transaction programs consist of read and write operations issued against the database. In a shared database system, one transaction program conflicts with another if it reads or writes data that the other transaction program has written. This thesis presents a semi-automatic technique for pairwise static conflict analysis of embedded transaction programs. The analysis predicts whether a given pair of programs will conflict when executed against the database.
- There are several potential applications of this technique, the most obvious being transaction concurrency control in systems where it is not necessary to support arbitrary, dynamic queries and updates. By analyzing transactions in such systems before a transaction runs, it is possible to reduce or eliminate the need for locking or other dynamic concurrency control schemes.
- Speaker: Ian Davis
- Snacks: Connie Zhang
- Topic: This talk will explore the implications of implementing a language (derived from the TRDBMS project) that allows semi-structured text to be converted into relations, for use within conventional database technology. The file structures used to support this implementation, and the data structures used within these file structures, will be explained. The algorithmic process needed to perform selection given a particular request in this text language will be explored, and if time permits I will move on to discuss the unforeseen significance/curse of the generalised birthday paradox when applied to retrieval of small volumes of information from huge indices.
- Topic: TBA
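
Connie Zhang's abstract above defines when one transaction program conflicts with another. As a minimal sketch of that definition only (not of the thesis's static analysis technique, which the abstract does not describe), the pairwise check can be written over read and write sets. The class `TxnProgram`, the function names, the example programs, and the table-level granularity are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TxnProgram:
    """Hypothetical summary of a transaction program: the data items it may
    read and the data items it may write (here, just table names)."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a: TxnProgram, b: TxnProgram) -> bool:
    """True if `a` reads or writes data that `b` writes (the abstract's definition)."""
    return bool((a.reads | a.writes) & b.writes)


def may_conflict(a: TxnProgram, b: TxnProgram) -> bool:
    """Pairwise check in both directions."""
    return conflicts(a, b) or conflicts(b, a)


# Illustrative programs; the names and tables are invented for this sketch.
transfer = TxnProgram("transfer",
                      reads=frozenset({"accounts"}),
                      writes=frozenset({"accounts"}))
report = TxnProgram("report",
                    reads=frozenset({"accounts", "branches"}))
audit = TxnProgram("audit",
                   reads=frozenset({"branches"}),
                   writes=frozenset({"audit_log"}))

if __name__ == "__main__":
    for a, b in [(transfer, report), (transfer, audit), (report, audit)]:
        print(a.name, b.name, may_conflict(a, b))
```

A pair for which `may_conflict` is false touches disjoint written data, which is the situation in which the abstract suggests locking could be reduced or eliminated. Table-level sets are deliberately coarse and will report conflicts that a finer-grained analysis could rule out.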
Friday 26 March - research group meeting 2pm, DC1331
Friday 9 April - research group meeting 2pm, DC1331
Friday 16 April - research group meeting 2pm, DC1331
Friday 23 April - research group meeting 2pm, DC1331
Friday 30 April - research group meeting 2pm, DC1331