Session Co-Chairs: Yongluan Zhou (University of Copenhagen), Panos K. Chrysanthis (University of Pittsburgh), Vincenzo Gulisano (Chalmers University of Technology) and Sihem Amer-Yahia (CNRS, Univ. Grenoble Alpes)
Abstract: Rising temperatures on Earth have alerted communities around the globe to the need for immediate solutions to help curb the severe effects of climate change, which is attributed to greenhouse gas emissions caused by human activities. Even though the Computing field is a source of emissions in its own right, it also has the potential to increase the efficiency of human workflows in all sectors, such as transportation, buildings, energy and heat production, industry, agriculture and livestock. In this panel discussion, we start with a general overview of terminology, factors, metrics, and objectives related to climate change, and then survey: (i) Green Conferences; (ii) Green Mobility; (iii) Green Cities; and (iv) Green Smart Spaces. The participants are expected to bring into the discussion their own perspectives from the academic, governmental, and industrial sectors, reporting on how they perceive the Computing field in a future shaped by climate change, and how we can all help achieve the goals of the Paris Agreement.
Bio: Antoine Amarilli is Associate Professor in Computer Science at Télécom Paris in the DIG team. His research focuses on data management and theoretical computer science. He is a maintainer of the TCS4F initiative on the climate crisis and of the 'No free view? No review!' manifesto on open access to scientific publications.
Christophe Claramunt is a professor in computer science at the French Naval Academy. His research addresses theoretical, pluridisciplinary and practical aspects of geographical information science (GIS). His main research interests lie in environmental, maritime and urban GIS. He has long contributed to the development of computing and GIS systems in developing countries and actively orients his research towards the development of green and environmentally friendly computing applications.
Demetrios Zeinalipour-Yazti is an Associate Professor of Computer Science at the University of Cyprus. His primary research interests include Data Management in Computer Systems and Networks. He actively engages in activities to help curb the climate crisis, including research and practice on green planning systems for the self-consumption of renewable energy and the development of virtual conference and tourism platforms to tackle the climate, COVID-19 and energy crises.
Abstract: Complex Event Recognition (CER) refers to the activity of detecting patterns in streams of continuously arriving “event” data over (geographically) distributed sources. CER is a key ingredient of many contemporary Big Data applications that require the processing of such event streams in order to obtain timely insights and implement reactive and proactive measures. Examples of such applications include the recognition of attacks on computer network nodes, human activities in video content, emerging stories and trends on the Social Web, traffic and transport incidents in smart cities, error conditions in smart energy grids, violations of maritime regulations, cardiac arrhythmias and epidemic spread. In each application, CER allows us to make sense of streaming data, react accordingly, and prepare counter-measures. In this tutorial, we will present the formal methods for CER, as they have been developed in the artificial intelligence community. To illustrate the reviewed approaches, we will use the domain of maritime situational awareness.
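To make the flavor of CER concrete, the short sketch below detects a toy sequence pattern over a maritime-style event stream: a vessel reporting low speed followed by an entry into a restricted area within ten minutes. This is a hand-rolled illustration only; the event schema, pattern, and threshold are hypothetical and do not reflect the formal CER languages covered in the tutorial.

```python
# Illustrative complex event recognition over a simple event stream:
# match the sequence (low_speed -> area_entry) for the same vessel
# within a 10-minute window. Event fields are hypothetical.
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)

def detect_suspicious_stops(events):
    """events: iterable of dicts with 'vessel', 'type', 'time' keys,
    ordered by time. Yields (vessel, stop_time, entry_time) matches."""
    pending = {}  # vessel -> time of its last low-speed report
    for e in events:
        if e["type"] == "low_speed":
            pending[e["vessel"]] = e["time"]
        elif e["type"] == "area_entry" and e["vessel"] in pending:
            start = pending.pop(e["vessel"])
            if e["time"] - start <= WINDOW:
                yield (e["vessel"], start, e["time"])

if __name__ == "__main__":
    stream = [
        {"vessel": "V1", "type": "low_speed", "time": datetime(2022, 6, 27, 10, 0)},
        {"vessel": "V1", "type": "area_entry", "time": datetime(2022, 6, 27, 10, 7)},
    ]
    for match in detect_suspicious_stops(stream):
        print("complex event recognized:", match)
```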
Bio: Alexander Artikis is an Associate Professor at the University of Piraeus (GR) and a Research Associate at NCSR Demokritos, leading the complex event recognition group. He holds a PhD from Imperial College London on Multi-Agent Systems, and his research interests lie in the area of Artificial Intelligence. He has published over 100 papers in related journals and conferences. Alexander has been developing complex event processing techniques in the context of several EU-funded Big Data projects, and he was the scientific coordinator of some of them. He has given tutorials on complex event processing at IJCAI, KR, VLDB and ECAI. In 2020, he co-organised the Dagstuhl seminar on the “Foundations of Composite Event Recognition”.
(#14)Optimizing Complex Event Forecasting, Vasileios Stavropoulos (National Centre of Scientific Research "Demokritos"); Elias Alevizos (NCSR'D)*; Nikos Giatrakos (Technical University of Crete); Alexander Artikis (University of Piraeus)
Abstract: In our increasingly connected world, where people are used to managing their lives via digital services, it has become mandatory for a successful company to build applications that can scale with the popularity of the company’s services. Scalability is not the only requirement: modern applications must also be highly available and fast, because users are not willing to wait in our ever faster-moving world. Due to this, we have seen a shift from the classic monolith towards microservice architectures, which promise to be more easily scalable. The emergence of serverless functions has further strengthened this trend more recently. By implementing a microservice architecture, application developers are all of a sudden exposed to the realm of distributed applications, with its seemingly limitless scalability but also its pitfalls nobody tells you about upfront. So instead of solving business domain problems, developers find themselves fighting with race conditions, distributed failures, inconsistencies and, in general, a drastically increased complexity. In order to solve some of these problems, people introduce endless retries, timeouts, sagas and distributed transactions. These band-aids can quickly result in a not-so-scalable system that is brittle and hard to maintain. The underlying problem is that developers are responsible for ensuring reliable communication and consistent state changes. Having a system that takes care of these aspects could drastically reduce the complexity of developing scalable distributed applications. By inverting the traditional control flow from application-to-database to database-to-application, we can put the database in charge of ensuring reliable communication and consistent state changes and thus free the developer from having to think about them. In this keynote, I want to explore the idea of putting the database in charge of driving the application logic using the example of Stateful Functions, a library built on top of Apache Flink that follows this idea. I will explain how Stateful Functions achieves scalability and consistency, but also what its limitations are. Based on these results, I would like to sketch the requirements for a runtime that can truly realise the full potential of Stateful Functions and discuss with you ideas on how it could be implemented.
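The toy sketch below illustrates the inversion of control described above: a runtime standing in for the database owns durable state and message delivery and invokes user-defined stateful functions, so the functions themselves never coordinate. This is a conceptual sketch only and does not use the actual Apache Flink Stateful Functions API; all names and structure are illustrative.

```python
# Conceptual sketch of database-to-application control flow: the runtime
# (standing in for the database) delivers messages and persists state;
# user code is a plain function invoked with its state and a message.
# Not the Stateful Functions API -- purely illustrative.
from collections import defaultdict, deque

class Runtime:
    def __init__(self):
        self.functions = {}                 # function name -> handler
        self.state = defaultdict(dict)      # (function, key) -> state dict
        self.mailbox = deque()              # pending (function, key, message)

    def register(self, name, handler):
        self.functions[name] = handler

    def send(self, name, key, message):
        self.mailbox.append((name, key, message))

    def run(self):
        # The runtime drives the application: it delivers each message
        # (in this toy, in order) and keeps the state the handler mutated.
        while self.mailbox:
            name, key, message = self.mailbox.popleft()
            state = self.state[(name, key)]
            self.functions[name](self, state, key, message)

def greeter(runtime, state, key, message):
    state["seen"] = state.get("seen", 0) + 1
    print(f"Hello {key}, message #{state['seen']}: {message}")

if __name__ == "__main__":
    rt = Runtime()
    rt.register("greeter", greeter)
    rt.send("greeter", "alice", "hi")
    rt.send("greeter", "alice", "hi again")
    rt.run()
```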
Bio: Till is a PMC member of Apache Flink and a co-founder of Ververica. During his time at Ververica, his work focused on enhancing Flink’s scalability and high availability. He also bootstrapped Flink’s CEP and initial ML libraries. Nowadays, Till focuses on making the development of distributed applications easier by applying stream processing techniques to this space.
Abstract: Freedom of the press is under threat worldwide, and the quality of information that people have access to is dangerously degraded, under the joint threats of non-democratic governments and fake information propagation. The press as an industry needs powerful data management tools to help them interpret the complex reality surrounding us. Since 2018, I have been cooperating with journalists from Le Monde, France’s leading newspaper, to devise tools for analyzing the large and heterogeneous data sources that they are interested in. This research has been embodied in ConnectionLens, a graph ETL tool capable of ingesting heterogeneous data sources into a graph, enriched (with the help of ML methods) with entities extracted from data of any type. On such integrated graphs, we devised novel algorithms for keyword search and, in more recent research, combined them with structured querying. The talk describes the architecture and main algorithmic challenges in building and exploiting ConnectionLens graphs, illustrated in particular by an application in which we study conflicts of interest in the biomedical domain. This is joint work with A. Anadiotis, O. Balalau, H. Galhardas and many others. ConnectionLens Web site (papers+code): https://team.inria.fr/cedar/connectionlens/. This research has been funded by Agence Nationale de la Recherche AI Chair SourcesSay (https://sourcessay.inria.fr).
Abstract: Meet AI - It’s a very strange time in its life. We have a tremendous amount of data at our disposal and a tremendous potential to do good, immense interest to build and widely deploy these systems and a very real impact (including irrevocable harm) associated with this technology. The landscape is rife with problems – incentive structures and gold-rush mentality in scholarship, celebrity culture and media hype, unhealthy extremes of techno-bashing and techno-optimism and the false dichotomy between “social problems” and “engineering problems”. Nuance and critical thinking are the most valuable, yet scarce commodities! A possible first step at self-correction could be for us – as practitioners and designers of these systems – to stop taking *ourselves* so seriously and instead direct this gravitas onto the *consequences* of our work (beyond citation counts and academic accolades). How, you ask? Using the marvelous world of comics! In this talk, I’ll present my thoughts (in comic form) on some of the pressing problems in the AI landscape (broadly defined) and attempt to motivate artistic interventions as a possible solution to catechisms of the scientific landscape that we take too seriously and perhaps need to rethink. A running theme through the talk will be the need to include diverse voices and methodologies, and to define the scope of impact more broadly — what difference does any of our work make if we can’t communicate it to the people that matter?
Bio: Falaah is a first-year Data Science PhD student at NYU, working with Prof. Julia Stoyanovich on the ‘fairness’ and ‘robustness’ of algorithmic systems. An engineer by training and an artist by nature, Falaah creates scientific comic books to bridge scholarship from different disciplines and to disseminate the nuances of her research in a way that is more accessible to the general public. She runs the ‘Data, Responsibly’ and ‘We are AI’ comic series with Prof. Julia Stoyanovich at NYU’s Center for Responsible AI, and the ‘Superheroes of Deep Learning’ comic series with Prof. Zack Lipton (CMU). Falaah holds an undergraduate degree in Electronics and Communication Engineering (with a minor in Mathematics) from Shiv Nadar University, India, and has industry experience in building machine learning models for access management and security at Dell EMC.
Abstract: Feature stores for machine learning are a new category of systems software that centralizes the management of data for AI, both for training models and for serving data to them. They solve problems related to ensuring consistent transformations of data between training and serving, the reuse of pre-engineered features across different models, preventing future data leakage in training data through point-in-time correct joins, and enabling collaboration between the different personas putting AI in production, including data engineers, data scientists, and ML engineers. In this tutorial, we will present the historical evolution of feature stores and the needs that drove their development. We will deep-dive into the first open-source feature store, Hopsworks, and show how you can use Hopsworks to build both an analytical ML application and an operational ML application. This will involve showing an end-to-end system that includes feature engineering, the feature store, model training, and pipeline orchestration. You will need experience in programming in Python; some knowledge of Pandas will be helpful, but no prior experience of machine learning is needed, just enthusiasm for building real-world machine learning systems.
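As a small illustration of one of the problems mentioned above, the snippet below performs a point-in-time correct join with plain pandas: each training label is matched only with the latest feature values observed at or before the label’s timestamp, so no future information leaks into the training set. It deliberately avoids the Hopsworks API; the column names and data are hypothetical.

```python
# A point-in-time correct join with pandas: for every label row, pick the
# latest feature row whose timestamp is <= the label timestamp, per entity.
import pandas as pd

labels = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "label_time": pd.to_datetime(["2022-06-01", "2022-06-10", "2022-06-05"]),
    "churned": [0, 1, 0],
}).sort_values("label_time")

features = pd.DataFrame({
    "customer_id": [1, 1, 2, 2],
    "feature_time": pd.to_datetime(["2022-05-28", "2022-06-08",
                                    "2022-06-02", "2022-06-09"]),
    "purchases_30d": [3, 1, 7, 2],
}).sort_values("feature_time")

# direction="backward" ensures only past (or same-time) feature values
# are joined to each label, preventing future data leakage.
training_set = pd.merge_asof(
    labels,
    features,
    left_on="label_time",
    right_on="feature_time",
    by="customer_id",
    direction="backward",
)
print(training_set)
```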
Bio: Jim Dowling is CEO of Hopsworks and an Associate Professor at KTH Royal Institute of Technology. He is one of the main developers of the open-source Hopsworks platform, a horizontally scalable data platform for machine learning that includes the industry’s first Feature Store. His research interests are in the areas of distributed file systems, decentralized systems, and systems support for real-time machine learning. Jim is a former Marie Curie Scholar, and has won awards for his research including the IEEE Scale Prize, awarded by CCGrid, for his work on the HopsFS file system. Jim is a regular speaker at industry conferences on data and AI and is currently writing a book on feature stores for Manning.
Abstract: Modern applications handle increasingly large volumes of data, generated at an unprecedented and constantly growing pace. They introduce challenges that are transforming all research fields that gravitate around data management and processing, resulting in a proliferation of distributed data-intensive systems. Each data-intensive system comes with its specific assumptions, data and processing model, design choices, implementation strategies, and guarantees. Yet, the problems data-intensive systems face and the solutions they propose frequently overlap. This tutorial presents a unifying model for data-intensive systems that dissects them into core building blocks, enabling a precise and unambiguous description and a detailed comparison. We show a list of classification criteria that derive from the model and use them to build a taxonomy of state-of-the-art systems. The tutorial aims to offer a global view of the vast research field of data-intensive systems, highlight interesting observations on the current state of the field, and suggest promising research directions.
Bio: Alessandro Margara is an associate professor at Politecnico di Milano. He obtained his PhD from Politecnico di Milano and worked as a postdoctoral researcher at the Vrije Universiteit (VU) Amsterdam and the Università della Svizzera italiana (USI). Alessandro’s research interests are in the area of software engineering and distributed systems. His research focuses on defining abstractions and building systems that simplify the design, development, and operation of complex distributed applications. Alessandro is a long-term member of the DEBS community and a regular member of the DEBS Program Committee. His DEBS 2010 paper received the DEBS 2020 Test of Time award, and his DEBS 2014 paper received the Best Paper award. Alessandro was DEBS 2021 General Co-Chair. He previously presented tutorials at DEBS 2011 and DEBS 2016; both had a similar goal and format to the proposed one, presenting a model and classification of heterogeneous software systems.
Session Chairs: Sebastian Frischbier (Infront Financial Technology GmbH), Arne Hormann (Infront Quant AG), Ruben Mayer (TU Munich), Jawad Tahir (TU Munich), and Christoph Doblander (TU Munich)
(#54)The DEBS 2022 Grand Challenge: Detecting Trading Trends in Financial Tick Data, Sebastian Frischbier (Infront Financial Technology GmbH); Jawad Tahir (Technical University of Munich)*; Christoph Doblander (Technical University of Munich); Arne Hormann (Infront Quant AG); Ruben Mayer (Technical University of Munich); Hans-Arno Jacobsen (University of Toronto)
(#26)A High-Performance Stream Processing System Implementation for Monitoring Stock Market Data Stream, Kevin A Li (The University of Texas at Austin); Daniel Fernandez (The University of Texas at Austin); David Klingler (The University of Texas at Austin); Yuhan Gao (The University of Texas at Austin); Jacob Rivera (The University of Texas at Austin); Kia Teymourian (The University of Texas at Austin)*
Abstract: In my 44 years building software, technology trends have dramatically changed what's easy and what's hard. In 1978, CPU, storage, and memory were precious and expensive, but coordinating across work was effectively free. Running on a single server, networking was infinitely expensive as we had none. Now, there's an abundance of computation, memory, storage, and network, with even more on the way! The only challenge is coordination. Year after year, the cost of coordination gets larger in terms of instruction opportunities lost while waiting. The first half of the talk explains these changes and their impact on our systems. In response, there are many approaches to avoiding or minimizing the pain of coordination. We taxonomize these solutions and discuss how our systems are evolving and likely to evolve as the world changes around us. I am, indeed, a person who's uncoordinated and very likely to drop and/or break stuff. I've adapted to that in my personal life and spend a great deal of my professional life looking for ways our systems can avoid the need to coordinate.
Bio: Pat Helland has been building distributed systems, database systems, high-performance messaging systems, and multiprocessors since 1978, shortly after dropping out of UC Irvine without a bachelor's degree. That hasn't stopped him from having a passion for academics and publication. From 1982 to 1990, Pat was the chief architect for TMF (Transaction Monitoring Facility), the transaction logging and recovery system for NonStop SQL, a message-based fault-tolerant system providing high-availability solutions for business-critical applications. In 1991, he moved to HaL Computers where he was chief architect for the Mercury Interconnect Architecture, a cache-coherent non-uniform memory architecture multiprocessor. In 1994, Pat moved to Microsoft to help the company develop a business providing enterprise software solutions. He was chief architect for MTS (Microsoft Transaction Server) and DTC (Distributed Transaction Coordinator). Starting in 2000, Pat began the SQL Service Broker project, a high-performance transactional exactly-once in-order message processing and app execution engine built deeply into Microsoft SQL Server 2005. From 2005 to 2007, he worked at Amazon on scalable enterprise solutions, scale-out user-facing services, integrating product catalog feeds from millions of sellers, and highly available, eventually consistent storage. From 2007 to 2011, Pat was back at Microsoft working on a number of projects including Structured Streams in Cosmos. Structured streams kept metadata within the "big data" streams that were typically tens of terabytes in size. This metadata allowed affinitized placement within the cluster as well as efficient joins across multiple streams. On launch, this doubled the work performed within the 250PB store. Pat also did the initial design for Baja, the distributed transaction support for a distributed event-processing engine implemented as an LSM atop structured streams, providing transactional updates targeting the ingestion of "the entire web in one table" with changes visible in seconds. Since 2012, Pat has worked at Salesforce on database technology running within cloud environments. His current interests include latency bounding of online enterprise-grade transaction systems in the face of jitter, the management of metastability in complex environments, and zero-downtime upgrades to databases and stateful applications. In his spare time, Pat regularly writes for ACM Queue, Communications of the ACM, and various conferences. He has been deeply involved in the organization of the HPTS (High Performance Transaction Systems - www.hpts.ws) workshop since 1985. His blog is at pathelland.substack.com and he parsimoniously tweets with the handle @pathelland.
Abstract: We are living in a data deluge era, where data is being generated by a large number of sources. This has only been exacerbated by the emergence of the Internet of Things (IoT). Nowadays, a large number of different devices generate data at an unprecedented scale: smartphones, smartwatches, embedded sensors in cars, smart homes, wearable technology, just to mention a few. We are simply surrounded by data without even noticing it. This represents a great opportunity to improve our everyday lives by applying recent advances in AI, a combination also called AIoT. Connecting IoT with data storage and AI technology is gaining more and more attention. Yet, performing AIoT in an efficient and scalable manner is a cumbersome task. Today, users have to implement different ad hoc solutions to move data from the IoT to “stable” storage on which they can perform AI (typically on the Cloud). In this tutorial, we will discuss and learn how Apache Wayang (Incubating) frees users from this burden. In particular, we explain how Wayang enables users to seamlessly run their AI tasks on the Fog and Cloud via its cross-platform optimizer.
Bio: Jorge Quiané is the head of the Big Data Systems research group at the Berlin Institute for the Foundations of Learning and Data (BIFOLD) and a Principal Researcher at DIMA (TU Berlin). He also acts as the Scientific Coordinator of the IAM group at the German Research Center for Artificial Intelligence (DFKI). His current research is in the broad area of big data: mainly in federated data analytics, scalable data infrastructures, and distributed query processing. He has published numerous research papers on data management and novel system architectures. He has recently been honoured with the 2022 ACM SIGMOD Research Highlight Award, the Best Demo Award at ICDE 2022, and the Best Paper Award at ICDE 2021 for his work on “Efficient Control Flow in Dataflow Systems”. He holds five patents in core database areas and on machine learning. Earlier in his career, he was a Senior Scientist at the Qatar Computing Research Institute (QCRI) and a Postdoctoral Researcher at Saarland University. He obtained his PhD in computer science from INRIA (Nantes University).
Abstract: In this talk, I will present RisingWave, a distributed SQL streaming database designed for the cloud. RisingWave provides standard SQL as the interactive interface. It speaks the PostgreSQL dialect and can be seamlessly integrated with the PostgreSQL ecosystem with no code change. RisingWave treats streams as tables and allows users to compose complex queries over streaming and historical data declaratively. RisingWave is designed for the cloud: its cloud-native architecture enables it to scale compute and storage resources separately and infinitely based on users’ demands. We have open-sourced the RisingWave kernel under the Apache License 2.0. Together with the open-source community, we are on a mission to democratize stream processing: to make stream processing simple, affordable, and accessible to everyone.
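Because RisingWave exposes a PostgreSQL-compatible interface, an ordinary PostgreSQL driver can talk to it. The sketch below, using psycopg2, assumes a stream has already been registered as a table named trades and declares a continuously maintained materialized view over it; the connection settings, schema, and view are illustrative assumptions rather than material from the talk.

```python
# Hypothetical sketch of using a SQL streaming database through its
# PostgreSQL-compatible interface with a stock PostgreSQL driver.
# Table name, schema, and connection settings are illustrative only.
import psycopg2

conn = psycopg2.connect(host="localhost", port=4566, user="root", dbname="dev")
conn.autocommit = True
cur = conn.cursor()

# Treat the stream as a table and declare the computation declaratively:
# a continuously maintained per-symbol aggregate over a 'trades' stream.
cur.execute("""
    CREATE MATERIALIZED VIEW trade_stats AS
    SELECT symbol, COUNT(*) AS num_trades, AVG(price) AS avg_price
    FROM trades
    GROUP BY symbol
""")

# Query the maintained view like any ordinary PostgreSQL table.
cur.execute("SELECT symbol, num_trades, avg_price FROM trade_stats ORDER BY symbol")
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```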
Bio: Yingjun Wu is the founder and CEO of Singularity Data, a startup building next-generation cloud-native database systems. Before starting his adventure, Yingjun was a software engineer on the Redshift team at Amazon Web Services, and a researcher in the Database group at IBM Almaden Research Center. Yingjun received his PhD from the National University of Singapore, where he was affiliated with the Database Group (advisor: Kian-Lee Tan). He was also a visiting PhD student in the Database Group at Carnegie Mellon University (advisor: Andrew Pavlo). Yingjun is passionate about integrating research into real-world system products. During his time at AWS, Yingjun was responsible for boosting Amazon Redshift performance using advanced vectorization and compression techniques. Before that, he participated in the development of IBM Db2 Event Store’s indexing structure and transaction processing mechanism. Yingjun was an early contributor to Stratosphere, which is now widely known as Apache Flink. Yingjun is also active in academia, serving as a Program Committee member for several top-tier database conferences, such as SIGMOD, VLDB, and ICDE.
Abstract: Large network data evolving over time have become ubiquitous across most industries, ranging from automotive and pharma to e-commerce and banking. Despite recent efforts, using temporal graph neural networks on continuously changing data in an effective and scalable way remains a challenge. This talk presents one of the research tracks studied at Euranova. We provide an overview of relevant continual learning methods that are directly applicable to real-world use cases. As explainability has become a central ingredient of trustworthy AI, we also introduce the landscape of state-of-the-art methods designed for explaining node-, link- or graph-level predictions.
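As a generic illustration of explaining a node-level prediction, the sketch below uses no graph library: it runs a one-layer graph convolution in plain PyTorch and computes a gradient-based saliency map for one node’s prediction. It is not one of the methods surveyed in the talk; the graph, features, and model are hypothetical.

```python
# Gradient-based saliency for a node-level prediction on a tiny,
# untrained graph neural network written in plain PyTorch.
# Purely illustrative; not a method from the talk.
import torch

torch.manual_seed(0)

# Toy graph: 4 nodes, symmetric adjacency with self-loops, 3 features each.
adj = torch.tensor([[1., 1., 0., 0.],
                    [1., 1., 1., 0.],
                    [0., 1., 1., 1.],
                    [0., 0., 1., 1.]])
norm_adj = adj / adj.sum(dim=1, keepdim=True)   # simple row normalization
x = torch.randn(4, 3, requires_grad=True)       # node features

# One graph-convolution layer followed by a linear classifier (2 classes).
w1 = torch.randn(3, 8, requires_grad=True)
w2 = torch.randn(8, 2, requires_grad=True)

h = torch.relu(norm_adj @ x @ w1)               # aggregate neighbors, transform
logits = h @ w2                                 # per-node class scores

# Explain the prediction for node 2: the gradient of its top logit with
# respect to all input features gives a crude node/feature saliency map.
logits[2].max().backward()
saliency = x.grad.abs()
print("feature saliency per node for node 2's prediction:\n", saliency)
```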
Bio: Madalina Ciortan is the head of the data science department at Euranova. After earning an engineering degree, a master’s in computer science and a postmaster’s in bioinformatics, she completed a doctorate in data science. She has over 15 years of experience in roles spanning development, architecture, team leadership, coaching and research. She has worked on topics including computer vision, NLP, time-series analysis, unsupervised and self-supervised learning, as well as the analysis of high-dimensional and noisy data in industry.
Abstract: Across our business and personal lives, we’re used to getting information in real time. How many stops until my parcel arrives? Where’s my Uber? Is it faster for me to walk or take the bus? Are there any new videos to watch? We all carry super-computers in our pockets with instant access to all the streaming data we want, customised to our personal preferences. Naturally, financial services professionals in brokerage, trading, and wealth management expect their market data to be real-time as well, allowing them to make split-second decisions to buy and sell. But real-time market data is hard to provide with sufficient Quality of Information (QoI) and Quality of Service (QoS): it is massive, constant, and complex. And most days, real-time is much more than people need, even if it is what they ask for. We’ll look at what the real business drivers are for going real-time, when delayed or daily data is better, and how to balance market data needs against wants.
Bio: Anna Almén, CTO, joined Infront, a leading European provider of information and technology solutions, in March 2022. Anna has extensive experience in integrating companies using a data-driven approach, leading to more efficient organisations. She has held various positions in the Swedish financial technology sector and has worked with startup organisations going through exponential growth. Most recently, she was CTO of eCommerce at Worldline following the merger with the startup Bambora, where she played a key role in technology leadership. She has also worked for many years at companies in the trading segment, including Nasdaq OMX. Anna earned her Master’s degree in Computer Science at KTH Royal Institute of Technology.
Abstract: Materialize is a system that presents itself to users as SQL over continually changing data. It transforms inbound streams of *change data capture* events into streams that exactly correspond to the transformed data, and maintains indexed representations of the results for efficient access and operation. SQL over changing data is surprisingly (for me) expressive: Materialize can operate on unbounded data, implement data-driven windows, and perform event-based queries, all with ANSI-standard SQL. We will discuss what an event-based SQL system looks like, the SQL idioms that give rise to traditionally stream-exclusive behavior, and how one architects such a system to scale across multiple dimensions.
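To make the “change events in, maintained results out” idea concrete, here is a tiny, generic sketch that incrementally maintains a grouped count from a stream of insert/delete change records and emits its own output as changes. It is purely illustrative and does not reflect Materialize’s internals, which build on timely and differential dataflow.

```python
# Generic sketch of incrementally maintaining the result of
#   SELECT key, COUNT(*) FROM input GROUP BY key
# from change-data-capture events tagged +1 (insert) or -1 (delete).
# Illustrative only; not Materialize's actual architecture.
from collections import defaultdict

class MaintainedCount:
    def __init__(self):
        self.counts = defaultdict(int)   # indexed representation of the view

    def apply(self, key, diff):
        """Apply one change event and return the resulting output changes."""
        old = self.counts[key]
        new = old + diff
        if new == 0:
            del self.counts[key]
        else:
            self.counts[key] = new
        # The output is itself a change stream: retract the old row, add the new.
        out = []
        if old != 0:
            out.append((key, old, -1))
        if new != 0:
            out.append((key, new, +1))
        return out

if __name__ == "__main__":
    view = MaintainedCount()
    for key, diff in [("a", +1), ("a", +1), ("b", +1), ("a", -1)]:
        print(view.apply(key, diff))
    print(dict(view.counts))   # {'a': 1, 'b': 1}
```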
Bio: Frank McSherry is Chief Scientist at Materialize. He was initially a graduate student at the University of Washington, where he worked with Anna Karlin on spectral graph theory, then a researcher at Microsoft Research SVC, where he co-developed differential privacy and led the Naiad research project, and later a visiting researcher at ETH Zürich, where he honed the timely and differential dataflow systems.