Upcoming Talks

Understanding distributed dataflow systems and applications

Date: Monday, March 18, 2019 10:00 - 11:00
Speaker: Ioannis Liagouris (ETH Zurich)
Location: Mondi Seminar Room 2, Central Building
Series: Mathematics and CS Seminar
Host: Krish Chatterjee

Abstract:

Distributed dataflow systems have become the de facto standard for large-scale data processing. So far, most efforts in academia and industry have focused on improving the performance, scalability, and reliability of these systems. However, as their complexity increases, explainability emerges as a first-class concern.

In this talk, I will present my work on understanding the behavior of distributed dataflow systems and applications. First, I will describe a framework for explaining semantics: Why and how does a dataflow return certain results? To answer such questions, the framework leverages ideas from database provenance to provide output explanations that are both sufficient and concise. Second, I will focus on understanding performance: Why is a dataflow execution slow, and where are the bottlenecks in the pipeline? My work in this area generalizes existing critical path analysis methods to dynamic and continuous computations. I will conclude the talk with a discussion of the challenges of explaining emerging AI applications and an overview of my future research agenda.
