Data Systems Seminar Series (2017-2018)

The Data Systems Seminar Series provides a forum for presentation and discussion of interesting and current database issues. It complements our internal database meetings by bringing in external colleagues. The talks that are scheduled for this year are listed below.

The talks are usually held on a Monday at 10:30am in room DC 1302. Exceptions are flagged.

We will try to post presentation notes whenever possible. Please click on the presentation title to access these notes.


The Database Seminar Series is supported by


Benny Kimelfeld
Heng Ji
Aditya Parameswaran
Panos K. Chrysanthis

5 September 2017, 2:30 pm, DC 1331 (Please note the special time and place)

Title: Enumerating Tree Decompositions: Why and How
Speaker: Benny Kimelfeld, Technion
Abstract:

Many intractable problems on graphs have efficient solvers when graphs are trees or forests. Tree decompositions often allow such solvers to be applied to general graphs by grouping nodes into bags laid out in a tree structure, thereby decomposing the problem into the sub-problems induced by the bags. This approach has applications in a plethora of domains, partly because it allows optimizing inference on probabilistic graphical models, as well as the evaluation of database queries. Nevertheless, a graph can have exponentially many tree decompositions and finding an ideal one is challenging, for two main reasons. First, the measure of goodness often depends on subtleties of the specific application at hand. Second, theoretical hardness is met already for the simplest measures such as the maximal size of a bag (a.k.a. “width”). Therefore, we explore the approach of producing a large space of high-quality tree decompositions for the application to choose from.

I will describe our application of tree decompositions in the context of “worst-case optimal” joins, a new breed of in-memory join algorithms that satisfy strong theoretical guarantees and were found to feature a significant speedup compared to traditional approaches. Specifically, I will explain how this development led us to the challenge of enumerating tree decompositions. Then, I will describe a novel enumeration algorithm for tree decompositions with a theoretical guarantee on the delay (the time between consecutive answers), and an experimental study thereof (on graphs from various relevant domains). Finally, I will describe recent results that provide guarantees on both the delay and the quality of the generated tree decompositions.

The talk will be based on papers that appeared in EDBT 2017 and PODS 2017, co-authored with Nofar Carmeli, Yoav Etsion, Oren Kalinsky and Batya Kenig.

Bio: Benny Kimelfeld is an Associate Professor at Technion, Israel.  In the past he has been at LogicBlox and at IBM Research – Almaden. His research interests are around aspects of data management, such as database theory and systems, algorithms for query evaluation, information extraction, information retrieval, data mining, and database uncertainty. He received his Ph.D. in Computer Science from The Hebrew University of Jerusalem, under the supervision of Prof. Yehoshua Sagiv.

16 October 2017, 10:30 am, DC 1304 (Please note the unusual room)

Title: Universal Information Extraction (notes, video)
Speaker: Heng Ji, Rensselaer Polytechnic Institute
Abstract: The goal of Information Extraction (IE) is to extract structured facts from a wide spectrum of heterogeneous unstructured data types including texts, speech, images and videos. Traditional IE techniques are limited to a certain source X (X = a particular language, domain, limited number of pre-defined fact types, single data modality...). When we move from X to a new source Y, we need to start from scratch again by annotating a substantial amount of training data and developing Y-specific extraction capabilities. We propose a new Universal Information Extraction (IE) paradigm to combine the merits of traditional IE (high quality and fine granularity) and Open IE (high scalability). This framework aims to discover schemas and extract facts from any input corpus, without any annotated training data or predefined schema. It can also be extended to multiple data modalities (images, videos) and 282 languages by constructing a common semantic space and transfer learning across sources.
Bio: Heng Ji is the Edward P. Hamilton Development Chair Professor in the Computer Science Department of Rensselaer Polytechnic Institute. She received her Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially on Information Extraction and Knowledge Base Population. She was selected as a "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. She received the "AI's 10 to Watch" Award from IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, Google Research Awards in 2009 and 2014, the IBM Watson Faculty Award in 2012 and 2014, and Bosch Research Awards in 2015 and 2016. She has coordinated the NIST TAC Knowledge Base Population task since 2010. She is now serving as the Program Committee Co-Chair of NAACL 2018.

2 November 2017, 10:30 am, DC 1302 (Please note the unusual day)

Title: TBD

Speaker: Aditya Parameswaran, University of Illinois at Urbana-Champaign
Abstract:

TBD

Bio: TBD

4 December 2017, 10:30 am, DC 1302

Title: TBD
Speaker: Rumi Chunara, New York University
Abstract:

TBD

Bio: Rumi Chunara is an Assistant Professor at New York University, jointly appointed in Computer Science and in Global Public Health. Her research interests combine data mining and machine learning with social and ubiquitous computing. Specifically, she focuses on feature extraction from, and statistical modeling of, unstructured and observational personally generated data for epidemiological applications. She received her Ph.D. from MIT and was named an MIT Technology Review Innovator Under 35 in 2014.

8 January 2018, 10:30 am, DC 1302

Title: TBD

Speaker: TBD
Abstract:

TBD

Bio: TBD

23 April 2018, 10:30 am, DC 1302

Title: TBD

Speaker: TBD
Abstract: TBD
Bio: TBD

14 May 2018, 10:30 am, DC 1302

Title: TBD
Speaker: Panos K. Chrysanthis, University of Pittsburgh
Abstract: TBD
Bio: TBD

11 June 2018, 10:30 am, DC 1302

Title: TBD
Speaker: TBD
Abstract:

TBD

Bio: TBD



Data Systems Group
David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1
Tel: 519-888-4567
Fax: 519-885-1208

Contact | Feedback: db-webmaster@cs.uwaterloo.ca | Data Systems Group


Last modified: Monday, 18-Sep-2017 09:27:00 EDT

