samsja (@samsja19)'s Twitter Profile
samsja

@samsja19

training LLMs across the globe at @PrimeIntellect

ID: 1241357619400491008

Link: https://github.com/samsja
Date: 21-03-2020 13:35:10

2,2K Tweet

3,3K Followers

1,1K Following

samsja (@samsja19):

yes and no, dataclasses are not good enough. Pydantic models / Pydantic dataclasses are way more expressive and validate the input. There is a lib maintained by the Pydantic team that allows overriding values from the CLI and loading config via TOML, that's all you need github.com/pydantic/pydan…

will brown (@willccbb):

imagine if gas stations didn't tell you how many gallons you were getting because car mileage was a trade secret and the gas station owned the car companies and you could either buy way overpriced gas per-mile or a monthly "max gas subscription" that turns off randomly sometimes

mania.build (@adilmania):

"winning isn’t just about data and distribution; it’s also about taste and trust - something tech giants lost. midjourney wasn’t built by adobe, perplexity by google, cursor by microsoft. There’s never been a better time to be davids fighting goliaths." Hugo Amsellem said the

samsja (@samsja19):

torchtitan has built-in HSDP + DiLoCo support, it's probably the best place right now to start doing decentralized learning research. It also comes with support for many archs (llama3, llama4, deepseekv3...) as well as all possible parallelisms (6d?). PyTorch team cooked here
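
For readers unfamiliar with DiLoCo, the toy below sketches the general idea on a single scalar parameter: each worker takes several local (inner) steps, and the averaged parameter change is then fed as a pseudo-gradient to an outer Nesterov-momentum step. This is not torchtitan's implementation. The function name, the quadratic losses, and the hyperparameters are assumptions for illustration, and plain SGD replaces the AdamW inner optimizer used in the DiLoCo paper.

```python
def diloco_round(theta, velocity, worker_targets,
                 inner_steps=5, inner_lr=0.1, outer_lr=0.7, momentum=0.9):
    """One communication round of a toy, scalar DiLoCo-style update."""
    deltas = []
    for target in worker_targets:
        local = theta  # every worker starts from the shared parameter
        for _ in range(inner_steps):
            grad = 2.0 * (local - target)  # gradient of (local - target)**2
            local -= inner_lr * grad       # inner step (plain SGD here)
        deltas.append(theta - local)       # this worker's pseudo-gradient

    # The only communication: average the pseudo-gradients across workers.
    outer_grad = sum(deltas) / len(deltas)

    # Outer step with Nesterov momentum on the averaged pseudo-gradient.
    velocity = momentum * velocity + outer_grad
    theta = theta - outer_lr * (outer_grad + momentum * velocity)
    return theta, velocity


theta, velocity = 0.0, 0.0
for _ in range(50):
    theta, velocity = diloco_round(theta, velocity, worker_targets=[1.0, 3.0])
print(theta)  # approaches 2.0, the minimizer of the averaged loss
```

Workers synchronize once per round rather than once per step, which is what makes this family of methods attractive for training over slow links.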

Arthur Douillard (@ar_douillard):

Sharing Zachary Charles' open problems for decentralized learning presented at ICML. Many low-hanging fruits will significantly impact the next years, moving beyond the current monolithic training paradigm.
