Sepp Hochreiter

From Wikipedia, the free encyclopedia
Born: 14 February 1967 (age 54)
Nationality: German
Alma mater: Technische Universität München
Scientific career
Fields: Machine learning, bioinformatics
Institutions: Johannes Kepler University Linz
Thesis: Generalisierung bei neuronalen Netzen geringer Komplexität (1999)
Doctoral advisor: Wilfried Brauer

Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University Linz, after having led the Institute of Bioinformatics there from 2006 to 2018. In 2017 he became head of the Linz Institute of Technology (LIT) AI Lab. Previously, he held positions at the Technical University of Berlin, the University of Colorado at Boulder, and the Technical University of Munich.

Hochreiter has made numerous contributions to machine learning, deep learning and bioinformatics, most notably the development of the long short-term memory (LSTM),[1][2] but also work in meta learning,[3] reinforcement learning[4][5] and biclustering with application to bioinformatics data.

Hochreiter chairs the Critical Assessment of Massive Data Analysis (CAMDA) conference[6] and is a founding director of the Institute of Advanced Research in Artificial Intelligence (IARAI).[7]

Scientific career

Long short-term memory (LSTM)

Hochreiter developed the long short-term memory (LSTM), for which the first results were reported in his diploma thesis in 1991.[1][2] LSTM overcomes the tendency of recurrent neural networks (RNNs) and deep networks to forget information over time or, equivalently, across layers (the vanishing or exploding gradient problem).[1][8][9] LSTM with an optimized architecture was successfully applied to very fast protein homology detection without requiring a sequence alignment.[10] LSTM networks are also used in Google Voice transcription,[11] Google voice search,[12] and Google's Allo[13] for voice searches and commands.
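The key idea can be illustrated with a minimal sketch of a standard LSTM cell's forward step (this is a modern textbook formulation, not Hochreiter's original 1991 code; the function and variable names are illustrative). The additive update of the cell state c, gated by a forget gate, is what lets error signals flow over many time steps without vanishing:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    W: weight matrix of shape (4*n_hidden, n_input + n_hidden)
    b: bias vector of shape (4*n_hidden,)
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * n:1 * n])    # input gate
    f = sigmoid(z[1 * n:2 * n])    # forget gate
    o = sigmoid(z[2 * n:3 * n])    # output gate
    g = np.tanh(z[3 * n:4 * n])    # candidate cell update
    c = f * c_prev + i * g         # additive cell-state update keeps gradients alive
    h = o * np.tanh(c)             # hidden state exposed to the next layer
    return h, c
```

Because the cell state is updated additively rather than squashed through a nonlinearity at every step, the gradient along the cell state is scaled only by the forget gate, avoiding the repeated multiplication by small Jacobians that causes vanishing gradients in plain RNNs.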

Other machine learning contributions

Beyond LSTM, Hochreiter has developed algorithms to avoid overfitting in neural networks[14] and introduced rectified factor networks (RFNs),[15][16] which have been applied in bioinformatics and genetics.[17] He also introduced modern Hopfield networks with continuous states[18] and applied them to the task of immune repertoire classification.[19]
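The retrieval rule of a modern Hopfield network with continuous states can be sketched in a few lines (an illustrative toy, with assumed function names; see Ramsauer et al., 2020 for the actual formulation and its connection to transformer attention). Stored patterns are columns of a matrix X, and a query state is pulled toward the most similar stored pattern by a softmax-weighted average:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                # numerical stability
    e = np.exp(z)
    return e / e.sum()

def hopfield_retrieve(X, xi, beta=8.0, n_steps=1):
    """Retrieve a stored pattern from a (possibly noisy) query.

    X:  (d, N) matrix whose columns are the stored patterns.
    xi: (d,) query (state) vector.
    The continuous update xi <- X @ softmax(beta * X.T @ xi)
    typically converges in a single step for large beta.
    """
    for _ in range(n_steps):
        xi = X @ softmax(beta * (X.T @ xi))
    return xi
```

With a large inverse temperature beta, the softmax concentrates on the stored pattern closest to the query, so one update nearly recovers it; this update is structurally the same as the attention mechanism in transformers.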

In the field of reinforcement learning, Hochreiter has worked on actor-critic systems that learn by "backpropagation through a model".[4][20] He also introduced RUDDER, a method designed to learn optimal policies for Markov decision processes (MDPs) with highly delayed rewards.[5]
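The core idea of RUDDER, redistributing a delayed episodic return to the steps that caused it, can be sketched in a toy form (this is a simplified illustration with assumed names, not the published algorithm, which trains an LSTM return predictor and uses contribution analysis). Given a predictor's running estimates of the final return, credit is assigned where the estimate changes:

```python
import numpy as np

def redistribute_rewards(return_predictions, episode_return):
    """RUDDER-style reward redistribution (toy sketch).

    return_predictions: estimates of the episode's final return after
    seeing the state-action sequence up to each step t, shape (T,).
    The redistributed reward at step t is the difference of consecutive
    predictions, so credit lands at the steps where the prediction jumps.
    """
    preds = np.asarray(return_predictions, dtype=float)
    diffs = np.diff(np.concatenate([[0.0], preds]))
    # spread any residual evenly so the redistributed rewards
    # sum exactly to the true episodic return
    diffs += (episode_return - diffs.sum()) / len(diffs)
    return diffs
```

Because the redistributed rewards sum to the original return, the optimal policy is unchanged, but the reward signal arrives immediately at the decisive steps instead of only at the end of the episode.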

Hochreiter has been involved in the development of factor analysis methods with application to bioinformatics, including FABIA for biclustering,[21] HapFABIA for detecting short segments of identity by descent[22] and FARMS for preprocessing and summarizing high-density oligonucleotide DNA microarrays to analyze RNA gene expression.[23]

Hochreiter proposed an extension of the support vector machine (SVM), the "Potential Support Vector Machine" (PSVM),[24] which can be applied to non-square kernel matrices and can be used with kernels that are not positive definite. The PSVM has been applied to feature selection, including gene selection for microarray data.[25][26][27]

Awards

Hochreiter was awarded the IEEE CIS Neural Networks Pioneer Award in 2021 for his work on the LSTM architecture.[28]

References

  1. Hochreiter, S. (1991). Untersuchungen zu dynamischen neuronalen Netzen (PDF) (diploma thesis). Technical University Munich, Institute of Computer Science.
  2. Hochreiter, S.; Schmidhuber, J. (1997). "Long Short-Term Memory". Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276. S2CID 1915014.
  3. Hochreiter, S.; Younger, A. S.; Conwell, P. R. (2001). Learning to Learn Using Gradient Descent (PDF). Lecture Notes in Computer Science - ICANN 2001. Lecture Notes in Computer Science. 2130. pp. 87–94. CiteSeerX 10.1.1.5.323. doi:10.1007/3-540-44668-0_13. ISBN 978-3-540-42486-4. ISSN 0302-9743.
  4. Hochreiter, S. (1991). Implementierung und Anwendung eines neuronalen Echtzeit-Lernalgorithmus für reaktive Umgebungen (PDF) (Report). Technical University Munich, Institute of Computer Science.
  5. Arjona-Medina, J. A.; Gillhofer, M.; Widrich, M.; Unterthiner, T.; Hochreiter, S. (2018). "RUDDER: Return Decomposition for Delayed Rewards". arXiv:1806.07857 [cs.LG].
  6. "CAMDA 2021". 20th International Conference on Critical Assessment of Massive Data Analysis. Retrieved 2021-02-13.
  7. "IARAI – Institute of Advanced Research in Artificial Intelligence". www.iarai.ac.at. Retrieved 2021-02-13.
  8. Hochreiter, S. (1998). "The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions". International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 6 (2): 107–116. doi:10.1142/S0218488598000094. ISSN 0218-4885.
  9. Hochreiter, S.; Bengio, Y.; Frasconi, P.; Schmidhuber, J. (2000). Kolen, J. F.; Kremer, S. C. (eds.). Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. A Field Guide to Dynamical Recurrent Networks. New York City: IEEE Press. pp. 237–244. CiteSeerX 10.1.1.24.7321.
  10. Hochreiter, S.; Heusel, M.; Obermayer, K. (2007). "Fast model-based protein homology detection without alignment". Bioinformatics. 23 (14): 1728–1736. doi:10.1093/bioinformatics/btm247. PMID 17488755.
  11. "The neural networks behind Google Voice transcription".
  12. "Google voice search: faster and more accurate".
  13. "Chat Smarter with Allo".
  14. Hochreiter, S.; Schmidhuber, J. (1997). "Flat Minima". Neural Computation. 9 (1): 1–42. doi:10.1162/neco.1997.9.1.1. PMID 9117894. S2CID 733161.
  15. Clevert, D.-A.; Mayr, A.; Unterthiner, T.; Hochreiter, S. (2015). "Rectified Factor Networks". arXiv:1502.06464v2 [cs.LG].
  16. Clevert, D.-A.; Mayr, A.; Unterthiner, T.; Hochreiter, S. (2015). Rectified Factor Networks. Advances in Neural Information Processing Systems 29. arXiv:1502.06464.
  17. Clevert, D.-A.; Unterthiner, T.; Povysil, G.; Hochreiter, S. (2017). "Rectified factor networks for biclustering of omics data". Bioinformatics. 33 (14): i59–i66. doi:10.1093/bioinformatics/btx226. PMC 5870657. PMID 28881961.
  18. Ramsauer, H.; Schäfl, B.; Lehner, J.; Seidl, P.; Widrich, M.; Gruber, L.; Holzleitner, M.; Pavlović, M.; Sandve, G. K.; Greiff, V.; Kreil, D.; Kopp, M.; Klambauer, G.; Brandstetter, J.; Hochreiter, S. (2020). "Hopfield Networks is All You Need". arXiv:2008.02217 [cs.NE].
  19. Widrich, M.; Schäfl, B.; Ramsauer, H.; Pavlović, M.; Gruber, L.; Holzleitner, M.; Brandstetter, J.; Sandve, G. K.; Greiff, V.; Hochreiter, S.; Klambauer, G. (2020). "Modern Hopfield Networks and Attention for Immune Repertoire Classification". arXiv:2007.13505 [cs.LG].
  20. Schmidhuber, J. (1990). Making the world differentiable: On Using Fully Recurrent Self-Supervised Neural Networks for Dynamic Reinforcement Learning and Planning in Non-Stationary Environments (PDF) (Technical report). Technical University Munich, Institute of Computer Science. FKI-126-90 (revised).
  21. Hochreiter, S.; Bodenhofer, U.; Heusel, M.; Mayr, A.; Mitterecker, A.; Kasim, A.; Khamiakova, T.; Van Sanden, S.; Lin, D.; Talloen, W.; Bijnens, L.; Göhlmann, H. W. H.; Shkedy, Z.; Clevert, D.-A. (2010). "FABIA: factor analysis for bicluster acquisition". Bioinformatics. 26 (12): 1520–1527. doi:10.1093/bioinformatics/btq227. PMC 2881408.
  22. Hochreiter, S. (2013). "HapFABIA: Identification of very short segments of identity by descent characterized by rare variants in large sequencing data". Nucleic Acids Research. 41 (22): e202. doi:10.1093/nar/gkt1013. PMC 3905877. PMID 24174545.
  23. Hochreiter, S.; Clevert, D.-A.; Obermayer, K. (2006). "A new summarization method for affymetrix probe level data". Bioinformatics. 22 (8): 943–949. doi:10.1093/bioinformatics/btl033. PMID 16473874.
  24. Hochreiter, S.; Obermayer, K. (2006). "Support Vector Machines for Dyadic Data". Neural Computation. 18 (6): 1472–1510. CiteSeerX 10.1.1.228.5244. doi:10.1162/neco.2006.18.6.1472. PMID 16764511. S2CID 26201227.
  25. Hochreiter, S.; Obermayer, K. (2006). Nonlinear Feature Selection with the Potential Support Vector Machine. Feature Extraction, Studies in Fuzziness and Soft Computing. pp. 419–438. doi:10.1007/978-3-540-35488-8_20. ISBN 978-3-540-35487-1.
  26. Hochreiter, S.; Obermayer, K. (2003). "Classification and Feature Selection on Matrix Data with Application to Gene-Expression Analysis". 54th Session of the International Statistical Institute. Archived from the original on 2012-03-25.
  27. Hochreiter, S.; Obermayer, K. (2004). "Gene Selection for Microarray Data". Kernel Methods in Computational Biology. MIT Press: 319–355. Archived from the original on 2012-03-25.
  28. "Sepp Hochreiter receives IEEE CIS Neural Networks Pioneer Award 2021 - IARAI". www.iarai.ac.at. Retrieved 3 June 2021.
