Dr. Stefan Baack | @tootbaack@infosec.exchange (@tweetbaack)'s Twitter Profile
Dr. Stefan Baack | @tootbaack@infosec.exchange

@tweetbaack

This account is inactive. Please follow me on Mastodon at @tootbaack@infosec.exchange or Bluesky at @sbaack.com

ID: 45529639

Link: http://sbaack.com/
Joined: 08-06-2009 09:02:23

19 Tweets

787 Followers

1.1K Following

Rasmus Kleis Nielsen (@rasmus_kleis)

Excellent report from Dr. Stefan Baack | @tootbaack@infosec.exchange Mozilla on Common Crawl, used to train many LLMs. Throwaway line for news publishers to ponder: "We will focus on the main crawl because the news crawl is rarely used by AI builders to train their LLMs (only once in our sample of 47 [models])."