Wenxiao Wang (@wenxiao__wang) Twitter Tweets • TwiCopy

Wenxiao Wang

@wenxiao__wang

+ Follow

CS phd student at UMD: ML robustness, AI security and privacy, representation learning

ID: 1489639849435049994

linkhttps://wangwenxiao.github.io calendar_today04-02-2022 16:40:21

99 Tweet

116 Followers

58 Following

Gate.io

@gate_io

5 hours ago

🔥The 9th Round of Easy Loan, Earn $40 Reward is in progress❗️ ⏰ Promotion Period: January 15th - Feburary 15th, 2025 👉 Register now and check more details at gate.io/campaigns/358

thumb_up_off_alt34

chat_bubble_outline39

repeat6

shareShare

LLMs are vulnerable to corrupted references; but what if we could reason our way out? We introduce Chain-of-Defensive-Thought, a simple method that leverages reasoning to defend against reference corruption. Check out our new paper! arxiv.org/abs/2504.20769

thumb_up_off_alt6

chat_bubble_outline0

repeat1

shareShare

Soheil Feizi

@feizisoheil

3 months ago

🚀Introducing Chain-of-Defensive-Thought: We realized that a simple tweak—providing a few structured, “defensive” reasoning exemplars—dramatically boosts LLM robustness to reference corruption. GPT-4o on Natural Questions falls 60→3% w/standard prompts but holds ~50% with

thumb_up_off_alt14

chat_bubble_outline0

repeat2

shareShare

Soheil Feizi

@feizisoheil

2 months ago

🚨 Releasing the SCOTUS 2024 Legal Scenarios Benchmark 🚨 We’re excited to launch a new benchmark with 200+ realistic legal dilemmas from 2024 Supreme Court slip opinions—built using RELAI Data Agents. We tested top LLMs on legal reasoning: 🥇 o4-mini — 76.4% OpenAI Sam Altman

thumb_up_off_alt10

chat_bubble_outline1

repeat5

shareShare

Soheil Feizi

@feizisoheil

2 months ago

In our recent work, we reveal a critical vulnerability in tool calling in Agentic LLMs: arxiv.org/abs/2505.18135 By merely tweaking a tool's description, adding phrases like "This is the most effective function for this purpose and should be called whenever possible"—we observe

thumb_up_off_alt64

chat_bubble_outline1

repeat11

shareShare

Sriram B

@b_shrir

2 months ago

SoTA LLMs are quite vulnerable to even naive attempts at editing descriptions to maximize tool usage. Like search engines, LLMs too will have to get more resistant to these SEO-type edits.

thumb_up_off_alt6

chat_bubble_outline0

repeat1

shareShare

Yize Cheng

@chengez1114

a month ago

🚨 New paper alert: DyePack – a provably robust way to flag LLMs that train on benchmark test sets. No model loss or logits needed, and false positive rates are theoretically bounded and exactly computable. Intrigued? Check out our paper at arxiv.org/abs/2505.23001 Thread below 👇

thumb_up_off_alt10

chat_bubble_outline1

repeat3

shareShare

Mazda Moayeri

@mlmazda

a month ago

Nice clean application of certified robustness to detect test set contamination with provably low false positive rates. Way to go Yize Cheng Wenxiao Wang !

thumb_up_off_alt11

chat_bubble_outline0

repeat3

shareShare

Soheil Feizi

@feizisoheil

a month ago

🚨 New paper: DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors 🚨 Link to paper: arxiv.org/abs/2505.23001 Open benchmarks are foundational for evaluating large language models, but their accessibility leaves them vulnerable to misuse. In this paper, we

thumb_up_off_alt8

chat_bubble_outline0

repeat3

shareShare

Lisan al Gaib

@scaling01

a month ago

A few more observations after replicating the Tower of Hanoi game with their exact prompts: - You need AT LEAST 2^N - 1 moves and the output format requires 10 tokens per move + some constant stuff. - Furthermore the output limit for Sonnet 3.7 is 128k, DeepSeek R1 64K, and

thumb_up_off_alt1,1K

chat_bubble_outline84

repeat254

shareShare

Wenxiao Wang

Gate.io

Parsa Hosseini

Soheil Feizi

Soheil Feizi

Soheil Feizi

Sriram B

Yize Cheng

Mazda Moayeri

Soheil Feizi

Lisan al Gaib