Sepp Hochreiter (@hochreitersepp) 's Twitter Profile
Sepp Hochreiter

@hochreitersepp

Pioneer of deep learning, known for the vanishing gradient problem and the LSTM.

ID: 1463119548115087362

Link: https://www.nx-ai.com/ · Joined: 23-11-2021 12:18:29

666 Tweets

13.13K Followers

375 Following

Florian (@fses91) 's Twitter Profile Photo

Happy to introduce 🔥LaM-SLidE🔥! 

We show how trajectories of spatial dynamical systems can be modeled in latent space by

--> leveraging IDENTIFIERS.

📚Paper: arxiv.org/abs/2502.12128 
💻Code: github.com/ml-jku/LaM-SLi…
📝Blog: ml-jku.github.io/LaM-SLidE/
1/n
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for the classification of assembly tasks: arxiv.org/abs/2505.18012

"xLSTM model demonstrated better generalization capabilities to new operators. The results clearly show that for this type of classification, the xLSTM model offers a slight edge over Transformers."
Günter Klambauer (@gklambauer) 's Twitter Profile Photo

Recommended read for the weekend: Sepp Hochreiter's book on AI!

Lots of fun anecdotes and easily accessible basics on AI!

beneventopublishing.com/ecowing/produk…
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

Attention!! Our TiRex time series model, built on xLSTM, is topping all major international leaderboards. A European-developed model is leading the field—significantly ahead of U.S. competitors like Amazon, Datadog, Salesforce, and Google, as well as Chinese models from Alibaba.

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

My book "Was kann Künstliche Intelligenz?" has been published. It is an easily accessible introduction to the topic of artificial intelligence. Readers, even those without a technical background, learn what AI actually is, what potential it holds, and what impact it has.

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Human Action Segmentation: arxiv.org/abs/2506.09650

"HopaDIFF, leveraging a novel cross-input gate attentional xLSTM to enhance holistic-partial long-range reasoning"

"HopaDIFF achieves state-of-the-art results on RHAS133 in diverse evaluation settings."
KorbinianPoeppel (@korbipoeppel) 's Twitter Profile Photo

Ever wondered how linear RNNs like #mLSTM (#xLSTM) or #Mamba can be extended to multiple dimensions?
Check out "pLSTM: parallelizable Linear Source Transition Mark networks". #pLSTM works on sequences, images, (directed acyclic) graphs.
Paper link: arxiv.org/abs/2506.11997
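For readers unfamiliar with the linear-RNN family this tweet refers to, a minimal sketch of the state update these models share (plain NumPy; the function name and the constant gate values are illustrative, not the actual mLSTM/pLSTM parameterization):

```python
import numpy as np

def linear_rnn_scan(a, b):
    """Sequential form of the linear recurrence h_t = a_t * h_{t-1} + b_t.

    Linear RNNs (mLSTM, Mamba, and relatives) share this gated linear state
    update; because there is no nonlinearity between steps, the same result
    can also be computed with a parallel (associative) scan instead of this
    loop, which is what makes these models trainable in parallel.
    """
    h = np.zeros_like(b[0])
    out = []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t  # elementwise decay of old state plus new input
        out.append(h)
    return np.stack(out)

# Toy example: 4 time steps, state dimension 2.
a = np.array([[0.9, 0.5]] * 4)   # transition (decay) gates
b = np.array([[1.0, 2.0]] * 4)   # gated inputs
h = linear_rnn_scan(a, b)        # h[0] = [1.0, 2.0], h[1] = [1.9, 3.0]
```

pLSTM's contribution, per the tweet, is generalizing this one-dimensional recurrence to multiple dimensions (images, DAGs) via source/transition/mark decompositions; the sketch above only shows the 1-D case.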
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

NXAI has successfully demonstrated that their groundbreaking xLSTM (Extended Long Short-Term Memory) architecture achieves exceptional performance on AMD Instinct™ GPUs, a significant advancement in RNN technology for edge computing applications. amd.com/en/blogs/2025/…

Günter Klambauer (@gklambauer) 's Twitter Profile Photo

Great application but built on the wrong model architecture... We've already shown that Transformer is inferior to xLSTM on DNA: arxiv.org/abs/2411.04165

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for multivariate time series anomaly detection: arxiv.org/abs/2506.22837

“In our results, xLSTM showcases state-of-the-art accuracy, outperforming 23 popular anomaly detection baselines.”

Again, xLSTM excels in time series analysis.
Jürgen Schmidhuber (@schmidhuberai) 's Twitter Profile Photo

10 years ago, in May 2015, we published the first working very deep gradient-based feedforward neural networks (FNNs) with hundreds of layers (previous FNNs had a maximum of a few dozen layers). To overcome the vanishing gradient problem, our Highway Networks used the residual…
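The gated residual mechanism this tweet describes can be sketched in a few lines (plain NumPy with random illustrative weights, not the original implementation): a transform gate T mixes a candidate activation H(x) with the unchanged input, so when T is near zero the layer is close to the identity and gradients survive many layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer: y = T(x) * H(x) + (1 - T(x)) * x.

    H is an ordinary nonlinear transform; T is a learned gate in (0, 1).
    As T -> 0 the layer passes x through unchanged, giving the gradient an
    unobstructed path through hundreds of stacked layers.
    """
    h = np.tanh(x @ W_h + b_h)      # candidate transform H(x)
    t = sigmoid(x @ W_t + b_t)      # transform gate T(x)
    return t * h + (1.0 - t) * x

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4))
W_h = rng.standard_normal((4, 4))
W_t = rng.standard_normal((4, 4))
# A strongly negative gate bias drives T toward 0, so the layer is
# nearly the identity map and y stays close to x.
y = highway_layer(x, W_h, np.zeros(4), W_t, np.full(4, -20.0))
```

A ResNet block can be seen as the special case where the gate is fixed rather than learned, which is the connection the truncated tweet appears to be drawing.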

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Aspect-based Sentiment Analysis: arxiv.org/abs/2507.01213

Another success story of xLSTM. MEGA: xLSTM with Multihead Exponential Gated Fusion.

“Experiments on 3 benchmarks show that MEGA outperforms state-of-the-art baselines with superior accuracy and efficiency”
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Monaural Speech Enhancement: arxiv.org/abs/2507.04368

xLSTM has superior performance vs. Mamba and Transformers but is slower than Mamba.

With new Triton kernels, xLSTM is faster than Mamba at training and inference: arxiv.org/abs/2503.13427 and arxiv.org/abs/2503.14376
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

General relativity modeled by neural tensor fields. Super exciting work: geometric modeling of gravitational fields through tensor-valued PDEs. Cool stuff: a simulation of a black hole.

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Cellular Traffic Forecasting: arxiv.org/abs/2507.19513

"Empirical results showed a 23% MAE reduction over the original STN and a 30% improvement on unseen data, highlighting strong generalization."

xLSTM shines again in time series forecasting.