Fall 2000
| DB meeting: | Friday, September 8, 2:00 pm, DC1331 | 
| Speaker: | N/A | 
| Topic: | 1. General introductions around the table, including brief descriptions
of research interests. 2. Lab update. 3. Speaker rotation. | 
| DB meeting: | Friday, September 15, 2:00 pm, Niagara Room, Sybase iAnywhere Solutions, 415 Phillip St. | 
| Speaker: | Anil Goel | 
| Topic: | I'll talk about the issue of creating and maintaining low-cost histograms and describe the algorithms presented in the following paper: "Self-tuning Histograms: Building Histograms Without Looking at Data", Ashraf Aboulnaga and Surajit Chaudhuri, SIGMOD 99. | 
| Note: | This talk will take place at Sybase and will be followed by a brief tour for those who are interested. Sybase is located at 415 Phillip Street, just north of the corner of Phillip and Columbia, and a short walk from the Davis Centre. Enter through the front door and sign in with the receptionist. A group will leave the Davis Centre on foot for Sybase at 1:45 pm sharp. We'll depart from the Davis Centre north door, near ICR. Join us if you'd like some company for the walk. | 
| DB meeting: | Friday, September 22, 2:00 pm, DC1331 | 
| Speaker: | Heng Yu | 
| Topic: | A look at "What's not in a name: Some Properties of a Purely Structural Approach to Integrating Description Logic Terminologies", by Alex Borgida and Ralf Kuesters, presented at the 2000 International Workshop on Description Logics. One approach to integrating knowledge bases is based on finding assertions that relate the expressions in the constituent terminologies. For knowledge bases with many terms, this task requires computer support. The authors set up a formal framework for merging Description Logic TBoxes, and then explore the limits of a purely structural approach to the problem of finding inter-relationships between knowledge bases. Some theoretical notions are examined empirically in a real medical ontology (GALEN). | 
| References: | For Description Logic background: Description Logics in Data Management, A. Borgida, IEEE TKDE, 1995. | 
| Snacks: | Anil Goel | 
| DB meeting: | Friday, September 29, 2:00 pm, DC1304 (Not the usual room!) | 
| Speaker: | Vlado Keselj | 
| Topic: | A Unification-based Approach to Question Answering. The problem of Question Answering (QA) as used in TREC can be formulated
as follows: I will: 
 | 
| Snacks: | Heng Yu | 
| DB meeting: | Friday, October 6, 2:00 pm, DC1331 | 
| Speaker: | Gord Cormack | 
| Topic: | Question Answering at TREC 9. For TREC 9, we completed some of the experiments that we didn't get done in time for TREC 8. We parsed the questions, and used the parse to enhance the set of search terms and to locate the answer within candidate passages. I will present some results and ideas for further development. | 
| Snacks: | Vlado Keselj | 
| DB meeting: | Friday, October 13th, 2:00 pm, DC1331 | 
| Speaker: | Bill O'Connell, IBM Toronto | 
| Topic: | DB2 Universal Database in the Astronomical Expansion of the e-business Universe. This talk will explain how DB2 is evolving to meet the extensive demands of e-business. Starting with an assessment of the role databases play in the application development process in the world of e-business, I will show you how users leverage the core e-business standards optimized in DB2 application development. Next, I'll address the expansion of the new functions in DB2 V7 for UNIX, Windows and OS/2 that address business intelligence environments in the e-business world. This includes features to support rolling OLAP functions, enhancements to OLAP cube support, advanced statistical analysis, complex query cache management, and how DB2 helps OLAP and mining tools, and ERP applications. The session will also include an analysis of XML in the handling of business objects and business-to-business and a demonstration of DB2's capabilities in this domain as it relates to business intelligence. | 
| SPECIAL: | Friday, October 20, 2:30-4:00 PM, Humanities Theatre | 
| Speaker: | Donald E. Knuth, Professor Emeritus, Stanford Univ. | 
| Title: | The Joy of Asymptotics | 
| DB seminar: | Monday, October 23, 11:00 AM, MC 5158 | 
| Speaker: | Vincent Oria, New Jersey Institute of Technology | 
| Title: | Courseware-On-Demand: Generating New Course Material From Existing Courses | 
| DB meeting: | Friday, October 27th, 2:00 pm, DC1331 | 
| Speaker: | Airi Salminen | 
| Topic: | Grammars++ revisited. I will briefly describe the Grammars++ model for text, jointly developed with Frank Tompa. The model is based on the notion of a constraining grammar. Given a context-free grammar G as a base grammar, a constraining grammar is derived from productions of G by attaching predicates to selected non-terminals. Sequences of constraining grammars are then used as filters to specify a subset of information in structured text. I will also give a quick overview of some successful applications of the model for data retrieval, transformation, and hypertext creation. I will then discuss the potential for applying the model to XML data and introduce some interesting research problems. Reference: A. Salminen and F. W. Tompa, "Grammars++ for Modelling Information in Text," Information Systems, Vol. 24, No. 1 (1999) 1-24. | 
| Snacks: | Gord Cormack | 
| DB meeting: | Friday, November 3rd, 2:00 pm, DC1331 | 
| Speaker: | Gary Promhouse, Open Text | 
| Topic: | Vertical Relational Database Representations. We start with a brief review of the literature on vertical database representations. We then describe a representation that achieves significant compaction, while simultaneously supporting the basic operations of selection and join without auxiliary index structures. Experiments on some real-life tables have realized several orders of magnitude reduction in representation size, while simultaneously greatly reducing the computational cost of these fundamental operations. | 
| DB seminar: | Monday, November 6, 11:00 AM, DC 1304 | 
| Speaker: | Hector Garcia-Molina, Stanford University | 
| Title: | How to Crawl the Web | 
| DB meeting: | Friday, November 10th, 2:00 pm, DC1331 | 
| Speaker: | Hui Zhang | 
| Topic: | I will present the paper titled "Quilt: An XML Query Language for Heterogeneous Data Sources" and review Quilt solutions to some use cases posed in Appendix B of the "XML Query Requirements" document. | 
| Snacks: | Airi Salminen | 
| DB meeting: | Friday, November 17th, 2:00 pm, DC1331 | 
| Speaker: | Yasser Ebrahim | 
| Topic: | I intend to present the paper "Why a Diagram Is (Sometimes) Worth Ten Thousand Words". This paper compares textual and diagrammatic representations and tries to explain why the latter could be superior to the former. | 
| Reference: | Larkin, J. and Simon, H., Why a Diagram Is (Sometimes) Worth Ten Thousand Words. In Diagrammatic Reasoning, AAAI/The MIT Press, Menlo Park, California, pp. 69-109, 1995. | 
| Snacks: | Hui Zhang | 
| DB seminar: | Monday, November 20, 11:00 AM, DC 1304 | 
| Speaker: | Avigdor Gal, Rutgers University | 
| Title: | An Authorization Model for Temporal Data | 
| DB meeting: | Friday, November 24th, 2:00 pm, DC1331 | 
| Speaker: | Jianchao Han | 
| Topic: | Interactive Construction of Classifiers. This talk will describe two systems that I implemented: CVizT, for interactive construction of classification rules based on the Table Lens visualization technique, and DTViz, for interactive construction of decision trees based on the Parallel Segments visualization technique. | 
| Snacks: | Yasser Ebrahim | 
| DB meeting: | Friday, December 1st, 2:00 pm, DC1331 | 
| Speaker: | Ian Davis | 
| Topic: | Indexing XML. Since 1997 I have been developing an SQL2 database search engine that facilitates efficient access to fragments of structured text. As part of this software development, a subordinate text engine was developed. The objective of a text engine is to provide rapid retrieval of useful textual information, to be efficient in both space and time, and to provide sufficient flexibility to allow for easy growth in as yet undefined directions, as the needs of the XML community become clearer. A further, as yet unrealised, goal is to allow rapid update of text indices as minor changes are made to the text being indexed. In this talk I will explore the approach I use to index structured text, the methods used to compress these indices, the data structures produced by the indexing process, and the use made of these data structures in resolving queries made against instances of text. | 
| Snacks: | Jianchao Han | 
| DB seminar: | Monday, December 4, 11:00 AM, DC 1304 | 
| Speaker: | Jan Chomicki, SUNY at Buffalo | 
| Title: | Consistent Query Answers in Inconsistent Databases | 
| Seminar: | Thursday, December 7th, 10:30 am, DC1304 | 
| Speaker: | Laks V. S. Lakshmanan, Concordia University | 
| Topic: | Constraints and Structures in Data Mining. Spurred by the potential of discovering interesting and useful knowledge, substantial research has been done on data mining. Most previous work essentially falls into one of two "generations". In the first generation, the focus was on identifying which patterns are interesting and significant, and devising fast algorithms for mining them from large data sets. In the second generation, the importance of integrating data mining with the other key components of the knowledge discovery (KDD) process has been recognized. One such component is the underlying DBMS, and strategies and architectures for integrating association rule discovery with the DBMS have been studied. In this talk, I will focus on the other, equally (if not more) important component of KDD: the human user. Many mining algorithms devised in the first generation implicitly assume data mining is a one-shot exercise, as opposed to the iterative exploratory process it really is. This is a significant shortcoming since mining algorithms tend to be computationally intensive. Thus, the concerns in integrating the user in the loop are: (i) how can the user control the nature of the mining computation undertaken by the system at any point? (ii) how can the user enforce focus on mining based on his knowledge of application semantics? (iii) how can the user migrate from specific mining tasks performed, to issuing ad hoc mining queries? I will discuss how constraints can play a significant role in addressing these concerns, specifically in the domains of frequent sets and clustering. Many a time, finding when a pattern holds in a data set can be as interesting as the pattern itself: e.g., when is an itemset frequently purchased? I will briefly discuss algorithms and structures that facilitate this kind of dual mining. Finally, the data mining process tends to involve multiple mining tasks, such as classification, frequent sets, data cube, etc. I will briefly discuss our recent work on a model and an algebra for data mining that neatly integrates many mining tasks into one framework so the input of one mining task can be the output of another. | 
| DB meeting: | Friday, December 8th, 2:00 pm, DC1304 (Not the usual room!) | 
| Speaker: | Matthew Young-Lai | 
| Topic: | Longest Regular Expression Matching. In general, it is inefficient to search for longest matches for a regular expression using a single pass. I will describe two ways of dealing with this. | 
| Snacks: | Ian Davis | 
This page is maintained by Frank Tompa and Ken Salem.