Andrew Lamb (@andrewlamb1111) 's Twitter Profile
Andrew Lamb

@andrewlamb1111

Apache {DataFusion, Arrow} PMC, Database Engineer

ID: 1326266114805002241

Link: http://andrew.nerdnetworks.org/ Joined: 10-11-2020 20:51:18

571 Tweets

2.2K Followers

58 Following

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

This is so cool -- an example of embedding a special index (a DistinctValues index no less) inside an Apache Parquet file: github.com/apache/datafus… (coming in ApacheDataFusion 49.0.0)

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

Thanks to Jax Liu, DataFusion 49.0.0 will offer `async` user-defined functions: github.com/apache/datafus… 🙏 github.com/goldmedal and Canner

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

🔒 🥁 github.com/apache/datafus… DataFusion will now have Parquet modular encryption thanks to Adam Reeve and Corwin Joy (courtesy of G-Research)

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

Two new (reposted) blogs about Optimizing SQL and DataFrames: datafusion.apache.org/blog/2025/06/1… datafusion.apache.org/blog/2025/06/1…

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

New blog post about cooperative scheduling using tokio and Rust async, and how cancellation works in ApacheDataFusion: datafusion.apache.org/blog/2025/06/3…

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

I publicly apologize for snapping at Yuchen Liang and Andy Pavlo (@andypavlo.bsky.social) and CMU Database Group. "You need to have a push based scheduler to do ..." TUM / DuckDB (CWI) created group-think in databases where push schedulers are required, ClickHouse, Spark, DataFusion, etc. notwithstanding 🤦

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

Quite a list of contributors already to the Rust Apache Parquet implementation of Variant (support for semi-structured data). I was making some slides to explain what Variant is and put together a list I wanted to share. The feature will be amazing: github.com/apache/arrow-r…

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

Example of the kind of low-level obsession we foster in the Rust parquet / arrow / DataFusion community: github.com/apache/arrow-r… I am skeptical that proprietary engines will be able to compete with OSS long term (though I am biased). Huge thanks to Qi Zhu and Daniël Heres @[email protected]

Yingjun Wu 🚀 (@yingjunwu) 's Twitter Profile Photo

From what I can see, commercial open-source software keeps pulling ahead of closed-source alternatives. Trust is the primary driver - technical excellence comes second. I've even seen companies pick a product solely because it's written in Rust Language.

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

I am speaking at the #ApacheIceberg NYC Meetup on July 10th about Variant in Apache Parquet, which enables more efficient processing of semi-structured data such as that found in JSON. lu.ma/95a5qys1

Andrew Lamb (@andrewlamb1111) 's Twitter Profile Photo

Sweet VLDB paper from TUM (Mateusz Gienieczko / github.com/v0ldek) proposing extending Apache Parquet using user-defined encodings (via WASM). Favorite image shows the ease of integrating into ApacheDataFusion: gienieczko.com/anyblox-paper
