Alex Cloud (@cloud_kx) Twitter Tweets • TwiCopy

Alex Cloud

@cloud_kx

ID: 1176952955590905864

Joined: 25-09-2019 20:15:45

26 Tweets

103 Followers

59 Following

Alex Turner

@turn_trout

a year ago

1) AIs are trained as black boxes, making it hard to understand or control their behavior. This is bad for safety! But what is an alternative? Our idea: train structure into a neural network by configuring which components update on different tasks. We call it "gradient routing."

703 Likes

23 Replies

87 Retweets

Alex Turner

@turn_trout

6 months ago

Thought real machine unlearning was impossible? We show that distilling a conventionally “unlearned” model creates a model resistant to relearning attacks. 𝐃𝐢𝐬𝐭𝐢𝐥𝐥𝐚𝐭𝐢𝐨𝐧 𝐦𝐚𝐤𝐞𝐬 𝐮𝐧𝐥𝐞𝐚𝐫𝐧𝐢𝐧𝐠 𝐫𝐞𝐚𝐥.

327 Likes

16 Replies

46 Retweets

Owain Evans

@owainevans_uk

5 months ago

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵