DeepSPIN (@deep_spin)'s Twitter Profile
DeepSPIN

@deep_spin

Deep structured prediction in NLP. ERC project coordinated by @andre_t_martins. Instituto de Telecomunicações.

ID: 1026171704744325120

Link: https://deep-spin.github.io/ · Joined: 05-08-2018 18:22:54

23 Tweets

349 Followers

73 Following

DeepSPIN (@deep_spin):

"Structure Back in Play, Translation Wants More Context" DeepSPINner André Martins writes on the Unbabel R&D blog his notes from this year's #icml2018 and #ACL2018: medium.com/unbabel/icml-a…

timorous bestie 😷 (@vnfrombucharest):


Towards Dynamic Computation Graphs via Sparse Latent Structure: #emnlp2018 + André Martins, Claire Cardie

- marginalize over structured latent variables with SparseMAP
- the computation graph is a function of the discrete structure
- e.g. a latent dependency TreeLSTM

pdf arxiv.org/abs/1809.00653
code github.com/vene/sparsemap…
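The sketch below (not the authors' code) illustrates the idea under simplifying assumptions: a sparse distribution is computed over a small enumerated set of candidate dependency structures, and the downstream computation graph is built only for structures that receive nonzero probability. The actual SparseMAP solver in the linked repo handles combinatorially many structures via an active-set method; plain sparsemax over an enumerated candidate set stands in for it here, and all helper names (run_tree_model, the candidate list, the scorer) are hypothetical.

```python
# Toy illustration: sparse distribution over latent structures -> the
# computation graph depends on which discrete structures survive.
import torch

def sparsemax(z: torch.Tensor) -> torch.Tensor:
    """Euclidean projection of a score vector onto the probability simplex."""
    z_sorted, _ = torch.sort(z, descending=True)
    k = torch.arange(1, z.numel() + 1, dtype=z.dtype)
    cssv = torch.cumsum(z_sorted, dim=0) - 1.0
    support = z_sorted - cssv / k > 0          # entries kept in the support
    rho = k[support][-1]
    tau = cssv[support][-1] / rho
    return torch.clamp(z - tau, min=0.0)

def run_tree_model(words: torch.Tensor, heads: list) -> torch.Tensor:
    """Toy stand-in for a TreeLSTM: mix each word with its head's embedding."""
    mixed = torch.stack([words[i] + 0.5 * words[h] for i, h in enumerate(heads)])
    return mixed.mean(dim=0)

# Three candidate dependency structures for a 4-word sentence
# (toy head assignments, not necessarily valid trees).
candidates = [[1, 1, 1, 1], [3, 0, 1, 3], [2, 2, 3, 3]]
words = torch.randn(4, 16)                      # toy word representations
scorer = torch.nn.Linear(16, len(candidates))   # one logit per candidate structure

p = sparsemax(scorer(words.mean(dim=0)))        # sparse distribution over structures
# Zero-probability candidates are never evaluated, so the graph that is built
# (and backpropagated through) is a function of the surviving discrete structures.
output = sum(p[i] * run_tree_model(words, heads)
             for i, heads in enumerate(candidates) if p[i] > 0)
print("structure probs:", p)
```

Because sparsemax zeroes out most candidates, only a handful of tree-shaped computation graphs are instantiated per example, which is what "CG a function of discrete structure" refers to in the tweet.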
DeepSPIN (@deep_spin):

DeepSPIN talks at #emnlp2018!
- Thu, 11:00 AM, talk @ BlackboxNLP: Interpretable Structure Induction via Sparse Attention. Peters/Niculae/Martins.
- Fri, 3:36 PM, main conference talk @ ML(3B): Towards Dynamic Computation Graphs via Sparse Latent Structure. Niculae/Martins/Cardie.

DeepSPIN (@deep_spin):

A nice write-up of the challenges of lemmatization by DeepSPINner Erick! Multilingual examples reveal complexities that are hard to imagine when focusing only on English.

timorous bestie 😷 (@vnfrombucharest):


Adaptively Sparse Transformers
@emnlp2019, with Gonçalo Correia and André Martins

α-entmax attention:
α=1 is softmax, α=2 is sparsemax, continuous in between.
The twist: we learn α for each head, with gradients! Some heads become dense, some sparse.

arxiv.org/abs/1909.00015
github.com/deep-spin/entm…
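A minimal sketch of the per-head learned-α idea, assuming the entmax package from the linked deep-spin repo is installed (pip install entmax). The entmax_bisect call and the sigmoid parametrization of α into (1, 2) follow my reading of the tweet and paper and may differ in detail from the released code; the module and shapes below are illustrative, not the authors' implementation.

```python
import torch
from entmax import entmax_bisect  # assumption: top-level import in the entmax package

class AdaptiveSparseAttention(torch.nn.Module):
    """Hypothetical attention head wrapper with a learnable α per head."""

    def __init__(self, num_heads: int):
        super().__init__()
        # One learnable scalar per head; mapped into (1, 2) so that
        # α near 1 behaves like softmax and α = 2 is sparsemax.
        self.alpha_logit = torch.nn.Parameter(torch.zeros(num_heads))

    def forward(self, scores: torch.Tensor) -> torch.Tensor:
        # scores: (batch, heads, queries, keys)
        batch, heads, q_len, k_len = scores.shape
        alpha = 1.0 + torch.sigmoid(self.alpha_logit)             # (heads,), in (1, 2)
        alpha = alpha.view(1, heads, 1, 1).expand(batch, heads, q_len, 1)
        # α-entmax over the key dimension; gradients flow into α as well,
        # so each head can drift toward dense or sparse attention.
        return entmax_bisect(scores, alpha=alpha, dim=-1)

if __name__ == "__main__":
    attn = AdaptiveSparseAttention(num_heads=8)
    scores = torch.randn(2, 8, 5, 5)              # (batch, heads, queries, keys)
    probs = attn(scores)
    print(probs.sum(-1))                          # each row sums to 1
    print((probs == 0).float().mean())            # fraction of exactly-zero weights
```

At initialization α = 1.5 for every head (entmax-1.5-like behavior); during training each head's α is free to move toward the softmax-like or sparsemax-like end of the range.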