Stefan Grafberger (@sgrafberger) 's Twitter Profile
Stefan Grafberger

@sgrafberger

Ph.D. Student at @bifoldberlin, researching data management for ML

ID: 1044475802824450048

Website: https://stefan-grafberger.com

Joined: 25-09-2018 06:36:51

78 Tweets

352 Followers

416 Following

Stefan Grafberger (@sgrafberger) 's Twitter Profile Photo

We just open-sourced our prototype StreamDQ, a library built on top of Apache Flink for defining "unit tests for data", which measure data quality in large data streams. github.com/stefan-grafber… Joint work with Sebastian and Paul Groth.

Ce Zhang (@ce_zhang) 's Twitter Profile Photo

Congrats Sebastian and Stefan Grafberger! Data quality for ML is becoming increasingly important, excited to see ArgusEyes bring PTIME Data Shapley into practice -- to help improve the data to improve the model! - system: ssc.io/publication/pr… - PTIME Shapley: arxiv.org/abs/2204.11131

Maximilian Kuschewski (@maxikuschewski) 's Twitter Profile Photo

Excited to present BtrBlocks, our new columnar compression format for data lakes at the SIGMOD/PODS 2025 compression and fairness session at ~5pm PDT github.com/maxi-k/btrbloc…

TruLens (@trulensml) 's Twitter Profile Photo

Awesome paper by xiaozhong lyu Stefan Grafberger @ce__zhang Sebastian shows how #RAG can be improved through data importance learning. The approach learns weights for data sources based on their performance on a validation set and then re-weights or prunes the corpus. 1/3

Stefan Grafberger (@sgrafberger) 's Twitter Profile Photo

Our paper "Towards Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'" has been accepted for the DEEM Workshop @ SIGMOD at SIGMOD! 🎉 In this vision paper, we present our initial ideas for my next research project. Joint work with Sebastian and Paul Groth.

DEEM Workshop @ SIGMOD (@deem_workshop) 's Twitter Profile Photo

We can't wait for DEEM Workshop @ SIGMOD 2024 to get started @sigmodconf! Join us tomorrow Sunday 9 June from 9am in the Tupungato room!! Check out the full program at: deem-workshop.github.io 🇨🇱

Stefan Grafberger (@sgrafberger) 's Twitter Profile Photo

Looking forward to presenting our vision "Towards Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'" at the DEEM Workshop @ SIGMOD today! The talk will be around 10:40 a.m. in the Tupungato room. stefan-grafberger.com/shadow-pipelin…

Stefan Grafberger (@sgrafberger) 's Twitter Profile Photo

Life update: After three amazing years in Amsterdam, I moved to Berlin to finish my PhD with Sebastian at BIFOLD. Very excited to join the data management community in Berlin!

Olga Ovcharenko (@o_ovcharenko) 's Twitter Profile Photo

📢 Excited to share Feature Clock, an open-source library and paper accepted at IEEE VIS! Feature Clock enhances the explainability and compactness of visualizations of high-dimensional effects in two-dimensional plots. Big thanks to my co-authors Valentina Boeva and Rita Sevastjanova!

BIFOLD (@bifoldberlin) 's Twitter Profile Photo

"Snapcase" allows users to regain control over their recommendations in online shopping platforms. #VLDB24: BIFOLD researchers, among others, introduced "Snapcase," a demo paper that addresses the concept of machine unlearning. bifold.berlin/news-events/ne… Sebastian, Stefan Grafberger, Maarten de Rijke

Sebastian (@sscdotopen) 's Twitter Profile Photo

Interested in a *PhD in Data Engineering* in Berlin? Our institute has several openings for PhD positions as part of its graduate school, see the post below! Here is how to work with the DEEM Lab as part of the graduate school deem.berlin/#jobs-189196

DEEM Workshop @ SIGMOD (@deem_workshop) 's Twitter Profile Photo

The Data Management for End-to-End Machine Learning workshop (DEEM Workshop @ SIGMOD) will be back at #SIGMOD2025! ✨ 🔗 Check out the CfP: deem-workshop.github.io 📝 Submission deadline: March 21 📢 Notifications: April 25 Join us for the 9th edition in Berlin! #DEEM2025

Stefan Grafberger (@sgrafberger) 's Twitter Profile Photo

Our vision paper "Towards Regaining Control over Messy ML Pipelines" was accepted for the DAIS@ICDE2025 at IEEE ICDE Conference! Initial experiments show LLMs are promising for extracting declarative query plans from messy ML code. Joint work w/ Hao Chen, Olga Ovcharenko, Sebastian

DEEM Workshop @ SIGMOD (@deem_workshop) 's Twitter Profile Photo

📢 Deadline extension for DEEM 2025 SIGMOD/PODS 2025! Following requests, we're extending the submission deadline to April 1, 5pm Pacific Time. More info at: deem-workshop.github.io

Stefan Grafberger (@sgrafberger) 's Twitter Profile Photo

Our demo "mlidea: Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'" was accepted at VLDB! 🥳 We demo suggestions for ML pipelines, similar to IntelliJ code inspections or Grammarly suggestions youtu.be/ePGm1J6S2qk Joint work w/ Sebastian Paul Groth
