Sepp Hochreiter (@hochreitersepp) 's Twitter Profile
Sepp Hochreiter

@hochreitersepp

Pioneer of deep learning, known for the vanishing gradient problem and the LSTM.

ID: 1463119548115087362

Link: https://www.nx-ai.com/ · Joined: 23-11-2021 12:18:29

666 Tweets

13.13K Followers

375 Following

Florian (@fses91) 's Twitter Profile Photo

Happy to introduce 🔥LaM-SLidE🔥! 

We show how trajectories of spatial dynamical systems can be modeled in latent space by

--> leveraging IDENTIFIERS.

📚Paper: arxiv.org/abs/2502.12128 
💻Code: github.com/ml-jku/LaM-SLi…
📝Blog: ml-jku.github.io/LaM-SLidE/
1/n
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for the classification of assembly tasks: arxiv.org/abs/2505.18012

"xLSTM model demonstrated better generalization capabilities to new operators. The results clearly show that for this type of classification, the xLSTM model offers a slight edge over Transformers."
Günter Klambauer (@gklambauer) 's Twitter Profile Photo

Recommended read for the weekend: Sepp Hochreiter's book on AI!

Lots of fun anecdotes and easily accessible basics on AI!

beneventopublishing.com/ecowing/produk…
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

Attention!! Our TiRex time series model, built on xLSTM, is topping all major international leaderboards. A European-developed model is leading the field—significantly ahead of U.S. competitors like Amazon, Datadog, Salesforce, and Google, as well as Chinese models from Alibaba.

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

My book "Was kann Künstliche Intelligenz?" has been published. It is an easily accessible introduction to the topic of artificial intelligence. Readers, even those without a technical background, learn what AI actually is, what potential it holds, and what impact it has.

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Human Action Segmentation: arxiv.org/abs/2506.09650

"HopaDIFF, leveraging a novel cross-input gate attentional xLSTM to enhance holistic-partial long-range reasoning"

"HopaDIFF achieves state-of-the-art results on RHAS133 in diverse evaluation settings."
KorbinianPoeppel (@korbipoeppel) 's Twitter Profile Photo

Ever wondered how linear RNNs like #mLSTM (#xLSTM) or #Mamba can be extended to multiple dimensions?
Check out "pLSTM: parallelizable Linear Source Transition Mark networks". #pLSTM works on sequences, images, (directed acyclic) graphs.
Paper link: arxiv.org/abs/2506.11997
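For readers unfamiliar with the linear-RNN family this tweet refers to, a minimal sketch of the state update these models share (plain NumPy; the function name and the constant gate values are illustrative, not the actual mLSTM/pLSTM parameterization):

```python
import numpy as np

def linear_rnn_scan(a, b):
    """Sequential form of the linear recurrence h_t = a_t * h_{t-1} + b_t.

    Linear RNNs (mLSTM, Mamba, and relatives) share this gated linear state
    update; because there is no nonlinearity between steps, the same result
    can also be computed with a parallel (associative) scan instead of this
    loop, which is what makes these models trainable in parallel.
    """
    h = np.zeros_like(b[0])
    out = []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t  # elementwise decay of old state plus new input
        out.append(h)
    return np.stack(out)

# Toy example: 4 time steps, state dimension 2.
a = np.array([[0.9, 0.5]] * 4)   # transition (decay) gates
b = np.array([[1.0, 2.0]] * 4)   # gated inputs
h = linear_rnn_scan(a, b)        # h[0] = [1.0, 2.0], h[1] = [1.9, 3.0]
```

pLSTM's contribution, per the tweet, is generalizing this one-dimensional recurrence to multiple dimensions (images, DAGs) via source/transition/mark decompositions; the sketch above only shows the 1-D case.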
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

NXAI has successfully demonstrated that their groundbreaking xLSTM (Extended Long Short-Term Memory) architecture achieves exceptional performance on AMD Instinct™ GPUs, a significant advancement in RNN technology for edge computing applications. amd.com/en/blogs/2025/…

Günter Klambauer (@gklambauer) 's Twitter Profile Photo

Great application but built on the wrong model architecture... We've already shown that Transformer is inferior to xLSTM on DNA: arxiv.org/abs/2411.04165

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for multivariate time series anomaly detection: arxiv.org/abs/2506.22837

“In our results, xLSTM showcases state-of-the-art accuracy, outperforming 23 popular anomaly detection baselines.”

Again, xLSTM excels in time series analysis.
Jürgen Schmidhuber (@schmidhuberai) 's Twitter Profile Photo

10 years ago, in May 2015, we published the first working very deep gradient-based feedforward neural networks (FNNs) with hundreds of layers (previous FNNs had a maximum of a few dozen layers). To overcome the vanishing gradient problem, our Highway Networks used the residual…
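The gated residual mechanism this tweet describes can be sketched in a few lines (plain NumPy with random illustrative weights, not the original implementation): a transform gate T mixes a candidate activation H(x) with the unchanged input, so when T is near zero the layer is close to the identity and gradients survive many layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer: y = T(x) * H(x) + (1 - T(x)) * x.

    H is an ordinary nonlinear transform; T is a learned gate in (0, 1).
    As T -> 0 the layer passes x through unchanged, giving the gradient an
    unobstructed path through hundreds of stacked layers.
    """
    h = np.tanh(x @ W_h + b_h)      # candidate transform H(x)
    t = sigmoid(x @ W_t + b_t)      # transform gate T(x)
    return t * h + (1.0 - t) * x

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4))
W_h = rng.standard_normal((4, 4))
W_t = rng.standard_normal((4, 4))
# A strongly negative gate bias drives T toward 0, so the layer is
# nearly the identity map and y stays close to x.
y = highway_layer(x, W_h, np.zeros(4), W_t, np.full(4, -20.0))
```

A ResNet block can be seen as the special case where the gate is fixed rather than learned, which is the connection the truncated tweet appears to be drawing.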

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Aspect-based Sentiment Analysis: arxiv.org/abs/2507.01213

Another success story of xLSTM. MEGA: xLSTM with Multihead Exponential Gated Fusion.

“Experiments on 3 benchmarks show that MEGA outperforms state-of-the-art baselines with superior accuracy and efficiency”
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Monaural Speech Enhancement: arxiv.org/abs/2507.04368

xLSTM has superior performance vs. Mamba and Transformers but is slower than Mamba.

With new Triton kernels, xLSTM is faster than Mamba at training and inference: arxiv.org/abs/2503.13427 and arxiv.org/abs/2503.14376
Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

General relativity modeled by neural tensor fields. Super exciting work: geometric modeling of gravitational fields through tensor-valued PDEs. Cool stuff: a simulation of a black hole.

Sepp Hochreiter (@hochreitersepp) 's Twitter Profile Photo

xLSTM for Cellular Traffic Forecasting: arxiv.org/abs/2507.19513

"Empirical results showed a 23% MAE reduction over the original STN and a 30% improvement on unseen data, highlighting strong generalization."

xLSTM shines again in time series forecasting.