
Stefan Grafberger
@sgrafberger
Ph.D. Student at @bifoldberlin, researching data management for ML
ID: 1044475802824450048
https://stefan-grafberger.com 25-09-2018 06:36:51
78 Tweets
352 Followers
416 Following



Excited to present BtrBlocks, our new columnar compression format for data lakes at the SIGMOD/PODS 2025 compression and fairness session at ~5pm PDT github.com/maxi-k/btrbloc…


Today I started my research internship in Redmond with the Microsoft Gray Systems Lab. Looking forward to an amazing summer!


Awesome paper by xiaozhong lyu, Stefan Grafberger, @ce__zhang, and Sebastian that shows how #RAG can be improved through data importance learning. The approach learns weights for data sources based on their performance on a validation set and then re-weights or prunes the corpus. 1/3

Our paper "Towards Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'" has been accepted for the DEEM Workshop at SIGMOD! 🎉 In this vision paper, we present our initial ideas for my next research project. Joint work with Sebastian and Paul Groth.


We can't wait for DEEM Workshop @ SIGMOD 2024 to get started @sigmodconf! Join us tomorrow, Sunday 9 June, from 9am in the Tupungato room! Check out the full program at: deem-workshop.github.io 🇨🇱

Looking forward to presenting our vision "Towards Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'" at the DEEM Workshop @ SIGMOD today! The talk will be around 10:40 a.m. in the Tupungato room. stefan-grafberger.com/shadow-pipelin…


📢 Excited to share Feature Clock, an open-source library and paper accepted at IEEE VIS! Feature Clock enhances the explainability and compactness of visualizations of high-dimensional effects in two-dimensional plots. Big thanks to my co-authors Valentina Boeva and Rita Sevastjanova!


"Snapcase" allows users to regain control over their recommendations in online shopping platforms. #VLDB24: a.o. BIFOLD researchers introduced "Snapcase," a demo paper that addresses the concept of machine unlearning. bifold.berlin/news-events/ne… Sebastian Stefan Grafberger Maarten de Rijke



The Data Management for End-to-End Machine Learning workshop (DEEM Workshop @ SIGMOD) will be back at #SIGMOD2025! ✨ 🔗 Check out the CfP: deem-workshop.github.io 📝 Submission deadline: March 21 📢 Notifications: April 25 Join us for the 9th edition in Berlin! #DEEM2025

Our vision paper "Towards Regaining Control over Messy ML Pipelines" was accepted at DAIS@ICDE2025 at the IEEE ICDE Conference! Initial experiments show LLMs are promising for extracting declarative query plans from messy ML code. Joint work w/ Hao Chen, Olga Ovcharenko, Sebastian



📢 Deadline extension for DEEM 2025 at SIGMOD/PODS 2025! Following requests, we're extending the submission deadline to April 1, 5pm Pacific Time. More info at: deem-workshop.github.io
