My research interests include distributed stream processing and large-scale graph analytics.
Before coming to BU, I was a member of the Systems Group at ETH Zurich, where I worked on Strymon, a system for predictive datacenter analytics. In September 2017 I was awarded the ETH Zurich Postdoctoral Fellowship for my research project "Automatic Scaling of Distributed Streaming Computations Using Graph Analytics on Real-Time Monitoring Data". I received my PhD from KTH, Stockholm, and UCL, Belgium, where I was admitted to a double doctoral program as an EMJD-DC fellow. My thesis, "Performance Optimization Techniques and Tools for Distributed Graph Processing", received the IBM Innovation Award 2017. During my PhD studies, I also spent time at DIMA TU Berlin, Telefonica Research Barcelona, and data Artisans.
Our USENIX HotStorage'20 paper "In support of workload-aware streaming state management" won the Outstanding New Research Direction Award and was a finalist for the Best Presentation Award.
I contributed an article to the SIGOPS Blog where I discuss the evolution and future of stream processing systems.
I gave a guest lecture over Zoom on large-scale stream processing in Vijay Chidambaram's distributed systems class.
Our position paper, "In support of workload-aware streaming state management", was accepted for presentation at the upcoming 12th USENIX Workshop on Hot Topics in Storage and File Systems. The workshop will take place on July 13, 2020, and will be virtually co-located with USENIX ATC'20.
We will be presenting the tutorial "Beyond Analytics: the Evolution of Stream Processing Systems" at ACM SIGMOD'20.
I will be one of the keynote speakers at the North East Database Day 2020.
How can we design and implement scalable data processing systems whose capabilities stretch beyond those of traditional data management platforms? My recent work in this area includes understanding the performance of streaming dataflows and enabling accurate automatic scaling of streaming jobs.
How can we represent, partition, summarize, and analyze possibly unbounded data of various formats and originating from diverse, distributed sources? My recent work in this area includes a distributed graph summarization technique and a survey of streaming graph partitioning methods in the context of data-parallel continuous processing.
How can we achieve end-to-end, efficient big data processing while providing expressive, high-level programming models, accessible to data scientists and non-expert users? My recent work in this area includes a survey of high-level programming abstractions for distributed graph processing.
IEEE ICDE 2021 (PC Member), ACM SIGMOD 2020 Student Research Competition (Judge), EuroSys 2021 (Travel Grant co-Chair), ACM DEBS 2020 (PC Member), ICDE 2020 (Demonstration Track), EDBT 2020 (Demonstration Track), CCGrid 2019 (Applications and Data Science Track co-Chair), OPODIS 2018 (PC Member), Middleware Doctoral Symposium (ACM/IFIP Middleware 2020), GRADES-NDA 2020 (co-located with SIGMOD 2020), USENIX HotStorage 2020, DBPL 2019 (co-located with PLDI 2019), GRADES-NDA 2019 (co-located with SIGMOD 2019), DBTest 2018 (co-located with SIGMOD 2018), GRADES-NDA 2018 (co-located with SIGMOD 2018), GABB 2018 (co-located with IPDPS 2018), GABB 2017 (co-located with IPDPS 2017), DEEM 2017 (co-located with SIGMOD 2017).
Flink Forward Berlin 2019, Flink Forward San Francisco 2017, Berlin Buzzwords 2017, Flink Forward Berlin 2016, Berlin Buzzwords 2016
"From data stream management to distributed dataflows and beyond" at North East Database Day 2020. [slides]