Stanford Seminar - Deep Learning in Speech Recognition

EE380: Computer Systems Colloquium Seminar
Deep Learning in Speech Recognition
Speaker: Alex Acero, Apple Computer

While neural networks had been used in speech recognition in the early 1990s, they did not outperform the traditional machine learning approaches until 2010, when Alex's team members at Microsoft Research demonstrated the superiority of Deep Neural Networks (DNN) for large vocabulary speech recognition systems. The speech community rapidly adopted deep learning, followed by the image processing community and many other disciplines. In this talk I will give an introduction to speech recognition, go over the fundamentals of deep learning, explain what it took for the speech recognition field to adopt deep learning, and show how that has contributed to popularizing personal assistants like Siri.

About the Speaker:
Alex Acero (PhD, Carnegie Mellon, 1990) is Sr. Director in the Siri team in charge of speech recognition, speech synthesis, and machine translation. Prior to joining Apple in 2013, he spent 20 years at Microsoft Research managing teams in speech, audio, multimedia, computer vision, natural language processing, machine translation, machine learning, and information retrieval. Dr. Acero is an IEEE Fellow and ISCA Fellow. Alex has served as President of the IEEE Signal Processing Society and is currently a member of the IEEE Board of Directors. He is the author of the textbook Spoken Language Processing. Dr. Acero has published over 250 technical papers and has over 150 US patents.

For more information about this seminar and its speaker, you can visit http://ee380.stanford.edu/Abstracts/1...

Support for the Stanford Colloquium on Computer Systems Seminar Series provided by the Stanford Computer Forum.

Colloquium on Computer Systems Seminar Series (EE380) presents the current research in design, implementation, analysis, and use of computer systems. Topics range from integrated circuits to operating systems and programming languages.

It is free and open to the public, with new lectures each week.

Learn more: http://bit.ly/WinYX5

0:00 Introduction
1:31 Birth of Artificial Intelligence
1:57 Checkers (Arthur Samuel, 1956)
2:09 ELIZA (Weizenbaum 1966)
2:37 2001: A Space Odyssey (Stanley Kubrick, 1968)
4:06 Deep Blue (IBM, 1997)
4:50 Deep Learning (Hinton, 2006)
5:21 Jeopardy (IBM, 2011)
6:59 The Imitation Game (2014)
10:44 Improve on Task T with respect to performance metric P based on experience E
14:55 Perceptron Learning (Rosenblatt, 1957)
16:12 A probabilistic framework
18:38 Loss function between two probability distributions
20:39 Stochastic gradient descent
22:23 N-ary classification
24:03 Multi-layer Perceptron (Werbos, 1974)
28:50 Binary Classification Tasks
30:08 Fundamental Equation of Speech Recognition
32:32 Language Model
35:19 Acoustic Model (Hidden Markov Models)
36:34 Neural Networks for Speech Recognition in the 1990s
37:17 Neural Network Winter for Speech Recognition
37:32 Open Challenge Tasks (DARPA)
43:32 Deep Belief Networks = Deep Neural Networks
44:14 Deep Learning for Speech (Deng et al., 2010)
47:35 Deep Neural Networks: What was new?
51:08 DNN (Deep Belief Net) on Face Images (2012)
52:39 Deep Learning in Speech Recognition
57:32 Machine Learning across Apple Products
58:49 Siri Architecture
59:30 Hands-Free Siri
59:39 Dictation
59:52 Voicemail transcription
