Database Research Group Events

Fall 2014

Events of interest to the Database Research Group are posted here, and are also mailed to the uw.cs.database newsgroup and the db-faculty, db-grads, and db-friends mailing lists. Subscribe to one of these mailing lists to receive e-mail notification of upcoming events.

The DB group meets Wednesday afternoons at 2:30pm. The list below gives the times and locations of upcoming meetings. Each meeting lasts for an hour and features either a local speaker or, on Seminar days, an invited outside speaker. Everyone is welcome to attend.

Fall 2014 Events

Cheriton Symposium: Friday, September 19, DC 1302
  • David Cheriton, Stanford University, 10:45am
    "HICAMP Bitmap: Space-efficient updatable bitmap index for in-memory databases"
  • Ihab Ilyas, University of Waterloo, 3:00pm
    "Data Cleaning from Theory to Practice"
  • M. Tamer Özsu, University of Waterloo, 3:45pm
    "Web Data Management in the RDF Age"

    DB Seminar: Wednesday, September 24, 2:30pm, DC 1302
    Speaker: Frank Dehne, Carleton University
    Title: Real-Time On-line Analytical Processing (OLAP) On Multi-Core and Cloud Architectures

    DB Meeting: Wednesday, October 1, 2:30pm, DC 1331
    Speaker: Greg Drzadzewski
    Title: Partial Materialization for On-Line Analytical Processing on Multi-Tagged Document Collections
    Abstract: On-Line Analytical Processing (OLAP) systems are commonly used on top of structured data to help users make sense of large data collections by providing them with summary information that can be examined at various levels of detail. Partial materialization has been used as part of these OLAP systems as a way of reducing the time required to calculate summaries as well as satisfying the constraints of limited storage and available time for updates.
    When dealing with large collections of tagged documents, one would also benefit from the summarization operations provided by an OLAP system. Such a system could make it less time-consuming for users to explore and understand the information contained in large document collections. Tagged document collections, however, require different types of measures for summarizing the data, and the data exhibits considerably different properties from the data in traditional OLAP. To address these issues, an OLAP system for documents will require a different design and partial materialization approach.
    In this talk I will describe a new document-centric partial materialization strategy that offers a faster average response time for the expected query workload than current partial materialization approaches, along with a lower storage space requirement. The performance of this new partial materialization strategy is evaluated over real and synthetic document collections.

    DB Meeting: Wednesday, October 8, 2:30pm, DC 2585
    Speaker: Jeff Pound, SAP Waterloo
    Title: Distributed Databases: the Good, the Bad, and the Ugly
    Abstract: This talk is inspired by the pop sensation and distributed database enthusiast Carly Rae Jepsen. In this talk we will discuss her hit song "Call Me Maybe", a poetic ode to the challenges in building fault-tolerant distributed systems. We will see how Ms. Jepsen's concerns about being called affect a variety of open source database systems, and explore how these systems behave when the "calls" fail or are delayed (i.e., under network partition). In particular, we will look at consistency models as advertised vs. actual system behaviour under faults. We will survey "good" systems that make consistency guarantees and adhere to them, "bad" systems that forgo consistency for scalability and availability, and "ugly" systems that guarantee consistency but do not actually provide it in practice.
    This talk is based on a number of blogs (yes, blogs): primarily Kyle Kingsbury's Jepsen blog series, but also blog articles by Daniel Abadi and LinkedIn's Jay Kreps.

    DB Seminar: Wednesday, October 15, 2:30pm, DC 1302
    Speaker: Jignesh Patel, University of Wisconsin
    Title: Towards hardware-software co-design for data analytics: A plea and a proposal

    DB Meeting: Wednesday, October 29, 2:30pm, DC 2585
    Speaker: Ani Nica, SAP Waterloo

    DB Meeting: Wednesday, November 5, 2:30pm, DC 2585
    Speaker: Arian Baer, FTW Telecom Research Centre, Vienna
    Title: Cache-Oblivious Scheduling of Shared Workloads
    Abstract: Shared workload optimization is feasible if the set of tasks to be executed is known in advance, as is the case in updating a set of materialized views or executing an extract-transform-load workflow. In this talk, we consider data-intensive shared workloads with precedence constraints arising from data dependencies, i.e., before executing some task, other tasks may have to run first and generate some data needed by the next task(s). While there has been previous work on identifying common subexpressions in shared workloads and task re-ordering to enable shared scans, we go a step further and solve the problem of scheduling shared data-intensive workloads in a cache-oblivious way. Our solution relies on a novel formulation of precedence constrained scheduling with the additional constraint that once a data item is in the cache, all tasks that require this data item should execute as soon as possible thereafter. The intuition behind this formulation is that the longer a data item remains in the cache, the more likely it is to be evicted regardless of the cache size. We give an optimal ordering algorithm using A* search over the space of possible orderings, and we propose efficient and effective heuristics that obtain nearly optimal results in much less time. We present experimental results on real-life data warehouse workloads and the TPC-DS benchmark to validate our claims.

    DB Meeting: Wednesday, November 12, 2:30pm, DC 2585
    Speaker: Peter Bumbulis, SAP Waterloo
    Abstract: I'll be talking about the following paper as well as a bit about time synchronization. There will probably be a bit about Spanner as well.
    Clock-SI: Snapshot Isolation for Partitioned Data Stores Using Loosely Synchronized Clocks, SRDS 2013.

    DB Seminar: Wednesday, November 19, 2:30pm, DC 1304 (CANCELLED)
    Speaker: Nesime Tatbul, Intel Labs and MIT
    Title: S-Store: A Streaming NewSQL System for Big Velocity Applications

    This page is maintained by Khuzaima Daudjee.

    Data Systems Group
    David R. Cheriton School of Computer Science
    University of Waterloo
    Waterloo, Ontario, Canada N2L 3G1
    Tel: 519-888-4567
    Fax: 519-885-1208

    Last modified: Sunday, 16-Nov-2014 08:44:52 EST