Database Research Group Events -- Spring 1999
[Please remove <h1>]
Spring 1999
Time: |
Research group meeting, Friday May 7,
2:00 pm, DC1331 |
Speaker: |
Khuzaima Daudjee |
Snacks: |
Ian Davis |
Topic: |
I'll talk about the Mariposa distributed database system this Friday.
Mariposa was designed and developed at Berkeley by Stonebraker, Aoki, Sah,
Staelin, Sidell and a few others. |
Time: |
Research group meeting, Friday May 14,
2:00 pm, DC1331 |
Speaker: |
Reem Al-Halimi |
Snacks: |
Khuzaima Daudjee |
Topic: |
I will give an overview of data and text mining and present a text
mining effort that uses data mining techniques to discover phrase usage
trends in documents. The main paper is: B. Lent, R. Agrawal, R. Srikant:
"Discovering Trends in Text Databases", Proc. of the 3rd Int'l Conference
on Knowledge Discovery in Databases and Data Mining, Newport Beach, California,
August 1997. and can be found along with other papers this work is
based on at: http://www.almaden.ibm.com/cs/people/ragrawal/pubs.html |
Time: |
Research group meeting, Friday, May 28,
2:00 pm, DC1331 |
Speaker: |
Rudolf Fleischer |
Snacks: |
Reem Al-Halimi |
Topic: |
In the page replication problem, we are given a weighted graph and
a start node $s$ which initially contains a page. Other nodes which
want to access this page can do so by sending a request to any other node
holding the page. The cost of this access is then the distance between
the two nodes. A node can also choose to copy the whole page at a
cost of $d$ times the distance, where $d$ is usually a large constant (the
page size). If the sequence of requests is not known in advance,
we have an online problem so we use competitive analysis to measure the
performance of algorithms, i.e., we compare the cost of our algorithm to
the cost of the best possible offline algorithm. Various graph topologies
(trees, rings, arbitrary graphs) have been studied in this model.
In this talk, we present several variants of the classical page replication
problem and give some new upper and lower bounds. In the {\em continuous
page replication problem} we allow requests at and replication to arbitrary
points on any edge of the graph. In this model randomization does
not help, and the deterministic algorithms are equivalent to randomized
algorithms in the classical discrete model. In the {\em unequal cost
model} we assume that the online algorithm has a replication factor different
from $d$, the replication factor of the offline algorithm. We give
optimal deterministic and randomized algorithms for the discrete and continuous
variant of this model on trees. From this we can derive much simpler proofs
for known algorithms on rings. |
Time: |
Research group meeting, Friday, June 4,
2:00 pm, DC1331 |
Speaker: |
Julia Johnson |
Snacks: |
Ian Munro |
Topic: |
Rough Sets for Informative Question Answering
When it is not possible to give a precise answer, it may be possible
to give an imprecise answer which is nevertheless informative. The rough
set model will be demonstrated for distinguishing between precise answers
that say "There are no objects X" or "The objects X are ..." from uncertain
answers of the form "Objects X are included in set Y" or "There may be
additional objects X other than those in set Y". These uncertain answers
offer an improvement over the more traditional precise answers because
the system is better able to report on its lack of knowledge. |
Time: |
Research group meeting, Friday June 11,
2:00 pm, DC1331 |
Speaker: |
Forbes Burkowski |
Snacks: |
Ian Munro |
Topic: |
Forbes will be talking about evolutionary
computing, and its use in data mining applications.
|