Keya Hu (@hulillian39250) Twitter Tweets • TwiCopy

Keya Hu

@hulillian39250

Cornell 24 fall research intern
SJTU 25' CS
Senior undergraduate student

ID: 1682047729634279426

https://lillian039.github.io/

20-07-2023 15:20:22

8 Tweets

160 Followers

42 Following

New ARC-AGI paper
ARC Prize w/ fantastic collaborators Wen-Ding Li @ ICLR'25, Keya Hu, Zenna Tavares, evanthebouncy, Basis
For few-shot learning: better to construct a symbolic hypothesis/program, or have a neural net do it all, ala in-context learning?
cs.cornell.edu/~ellisk/docume…

906 Likes

18 Replies

166 Retweets

Keya Hu

@hulillian39250

9 months ago

🎊 New work and new SOTA on ARC-AGI public evaluation dataset!! We finetune Llama3.1-8B on the 400k synthetic dataset generated by our pipeline and do both transduction (directly output grid) and induction (output transform programs) that complement each other.

12 Likes

1 Reply

0 Retweets

Keya Hu

@hulillian39250

8 months ago

🤩

2 Likes

0 Replies

0 Retweets

Zenna Tavares

@zennatavares

8 months ago

Thrilled that joint work by Kevin Ellis's lab and Basis won 1st prize in ARC Prize Paper Awards and 2nd prize in ARC-AGI-PUB (w/ MIT)
This is our first result from Project MARA: an effort to build Modeling, Abstraction, and Reasoning Agents capable of "everyday science"

39 Likes

2 Replies

9 Retweets

Zhiyuan Zeng

@zhiyuanzeng_

5 months ago

Is a single accuracy number all we can get from model evals?🤔
🚨Does NOT tell where the model fails
🚨Does NOT tell how to improve it
Introducing EvalTree🌳
🔍identifying LM weaknesses in natural language
🚀weaknesses serve as actionable guidance
(paper&demo 🔗in🧵) [1/n]

240 Likes

4 Replies

89 Retweets

Keyon Vafa

@keyonv

4 months ago

AI models appear to mimic the real world. But how can we tell if they truly understand it? Excited to announce the ICML 2025 Workshop on Assessing World Models! Working on related questions? Submit a paper (max. 4 pages) by May 20!

42 Likes

1 Reply

9 Retweets