Program
Deep Learning @UCA Event 2019, from July 15 to 19
• 2 h 45 lecture each morning (9:00 am – 12:15 pm)
• Coffee break (around 10:15 am)
• Optional: UCA lab sessions (2:00 pm to 4:30 pm)
If you are a researcher, an engineer, a PhD student, or a master's student working in a tech company or an academic lab, this school is made for you!
Speakers
July 15th
Automatic speech recognition (ASR) systems, which convert input speech into word hypotheses, are becoming ubiquitous in our daily lives. ASR technologies have become the backbone that powers our interactions with smartphones and digital assistants, allowing us to access a wealth of information, improve productivity, and communicate faster and more easily than ever before.
The technologies and the underlying approaches that power traditional ASR systems have remained fairly stable over the last few years. Traditional ASR systems consist of a set of separate components: an acoustic model (AM), a pronunciation model (PM), and a language model (LM). The AM takes acoustic features as input and predicts a distribution over sub-word units (i.e., the individual sounds in the target language). The PM, traditionally a hand-engineered pronunciation dictionary, maps the sequence of sub-word units produced by the acoustic model to words. Finally, the LM assigns probabilities to the various word hypotheses. In traditional ASR systems, these components are trained independently on separate datasets, making a number of independence assumptions for tractability.
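As a rough illustration of how these separately trained components combine at decoding time (all units, words, and probabilities below are invented for the sketch, not taken from the lecture), one can score each word hypothesis by adding the acoustic log-likelihoods from the AM, the pronunciation looked up in the PM, and the LM prior:

```python
import numpy as np

# Toy per-frame acoustic posteriors over three sub-word units (AM output).
units = ["k", "ae", "t"]
am_posteriors = np.array([
    [0.7, 0.2, 0.1],   # frame 1: most likely "k"
    [0.1, 0.8, 0.1],   # frame 2: most likely "ae"
    [0.1, 0.2, 0.7],   # frame 3: most likely "t"
])

# Hand-written pronunciation dictionary (PM): word -> unit sequence.
lexicon = {"cat": ["k", "ae", "t"], "at": ["ae", "t", "t"]}

# Unigram language model (LM): prior probability of each word.
lm = {"cat": 0.6, "at": 0.4}

def score(word):
    """Combine the components: log P(X | W) from the AM via the PM's
    pronunciation, plus log P(W) from the LM."""
    idx = [units.index(u) for u in lexicon[word]]
    am_score = sum(np.log(am_posteriors[t, i]) for t, i in enumerate(idx))
    return am_score + np.log(lm[word])

best = max(lexicon, key=score)
print(best)  # prints "cat"
```

A real decoder searches over a huge hypothesis space with weighted finite-state transducers rather than enumerating words, but the additive combination of independently trained log-scores is the same.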
The dominance of traditional ASR systems, however, has recently been challenged by growing interest in end-to-end ASR systems, which attempt to learn these separate components jointly in a single system. Examples of such systems include attention-based models [1, 5], the recurrent neural transducer [2, 3], and connectionist temporal classification with word targets [4]. A common feature of all of these models is that they are composed of a single neural network, which accepts acoustic frames as input and outputs a probability distribution over characters or word hypotheses. In fact, as has been demonstrated in recent work, such end-to-end models can surpass the performance of conventional ASR systems [5].
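To make the end-to-end idea concrete, here is a minimal sketch of greedy CTC-style decoding (my own illustration, not code from any of the cited systems): take the network's most likely unit for each frame, collapse consecutive repeats, and drop the blank symbol.

```python
# Greedy collapse of per-frame CTC outputs into a label sequence.
BLANK = "_"

def ctc_greedy_decode(frame_labels):
    """Collapse repeated labels, then remove blanks. A blank between two
    identical labels (e.g. "c", "_", "c") keeps them distinct."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != BLANK:
            out.append(lab)
        prev = lab
    return "".join(out)

print(ctc_greedy_decode(["_", "c", "c", "_", "a", "a", "t", "_"]))  # prints "cat"
```

Real systems use beam search over the full per-frame distributions instead of a per-frame argmax, but the collapsing rule is exactly this one.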
In this lecture, I shall provide a detailed introduction to the topic of end-to-end modeling in the context of ASR. I shall begin by charting out the historical development of these systems, while emphasizing commonalities and differences between the various end-to-end approaches that have been considered in the literature. Next, I shall discuss a number of recently introduced innovations that have significantly improved the performance of end-to-end models, allowing these to surpass the performance of conventional ASR systems. I shall then describe some of the exciting applications of this research, along with possible fruitful directions to explore. Finally, I shall discuss some of the shortcomings of existing end-to-end modeling approaches and discuss ongoing efforts to address these challenges.
[1] W. Chan, N. Jaitly, Q. V. Le, and O. Vinyals, “Listen, Attend and Spell,” in Proc. ICASSP, 2016.
[2] A. Graves, “Sequence Transduction with Recurrent Neural Networks,” 2012.
[3] K. Rao, H. Sak, and R. Prabhavalkar, “Exploring Architectures, Data and Units for Streaming End-to-End Speech Recognition with RNN-Transducer,” in Proc. ASRU, 2017.
[4] H. Soltau, H. Liao, and H. Sak, “Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition,” in Proc. Interspeech, 2017.
[5] C.-C. Chiu, T. N. Sainath, Y. Wu, R. Prabhavalkar, P. Nguyen, Z. Chen, A. Kannan, R. J. Weiss, K. Rao, E. Gonina, N. Jaitly, B. Li, J. Chorowski, and M. Bacchiani, “State-of-the-Art Speech Recognition with Sequence-to-Sequence Models,” in Proc. ICASSP, 2018.
LABS:
Speech Recognition Using Deep Learning
Open the link below, then choose File -> Save a copy in Drive:
https://colab.research.google.com/drive/1x9qUCIn-8hCq2428exXMU4ZrFNBXlx8Y
July 16th
The past decade has seen a remarkable increase in the performance of computer vision techniques, notably with the introduction of effective deep learning methods. Much of this progress takes the form of rapidly improving performance on standard, curated datasets. However, translating these results into operational vision systems for robotics applications remains a formidable challenge. This talk will explore some of the fundamental questions at the boundary between deep learning/computer vision and robotics that need to be addressed. These include minimizing supervision (low-shot learning, meta-learning, self-supervision), introspection/self-awareness of performance, anytime algorithms for computer vision, multi-hypothesis generation, and rapid learning and adaptation. The discussion will be illustrated by examples from autonomous air and ground robots.
LABS:
Deep Reinforcement Learning
Open the link below, then choose File -> Save a copy in Drive:
https://colab.research.google.com/drive/13Q3nTOJY9vYbhg1C0aXXIVM-IQaVVSaU
July 17th
Will deep learning reshape the AI pillars of self-driving cars, or of any autonomous moving platform? In this talk, we will give an overview of state-of-the-art deep-learning-based methods for perception, prediction, and planning.
LABS:
Object Detection Using Deep Learning
Open the link below, then choose File -> Save a copy in Drive:
https://colab.research.google.com/drive/15cRmY36JkM4EiV30FzBQXv0rPy5AKk41
July 18th
The scale of research and application of deep learning continues to accelerate. In our work and our daily lives, we see growing dependence on deep learning systems for making predictions. In some cases these are automated, and in others they involve humans "in the loop". Unfortunately, these systems can fail silently. Examples of failure include systems that make erroneous predictions yet believe, with high confidence, that they are correct, as well as susceptibility to adversarial attacks and so-called fooling images.
In the first part of the tutorial, I will review the adversarial examples phenomenon and current work that aims to address a model's fault tolerance with respect to its input. I will present some recent work from our lab that aims to characterize tolerance to diverse input faults, and also a surprising result that relates the widely-used batch normalization technique to adversarial vulnerability.
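As a toy illustration of the adversarial-examples phenomenon described above (the classifier, its weights, the input, and the step size are all invented for this sketch, which does not reproduce the lecture's experiments), a single FGSM-style step on a fixed logistic-regression model already lowers the model's confidence:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A fixed linear classifier (assumed weights, for illustration only).
w = np.array([1.0, -2.0, 0.5])
b = 0.1

x = np.array([0.5, -0.5, 1.0])        # an input the model rates positive
p_clean = sigmoid(w @ x + b)          # high confidence in class 1

# Gradient of the negative log-likelihood of the positive class w.r.t. the
# input is (p - 1) * w; FGSM perturbs along the sign of that gradient.
eps = 0.5
grad = (p_clean - 1.0) * w
x_adv = x + eps * np.sign(grad)
p_adv = sigmoid(w @ x_adv + b)

print(p_clean, p_adv)  # confidence in the positive class drops sharply
```

The same one-line perturbation scales to deep networks once the input gradient is available via backpropagation, which is why the phenomenon is so pervasive.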
In the second part of the tutorial, I will discuss one way to build more fault-tolerant machine learning systems: that is, by calibrating their confidence or uncertainty measures for interpretation by humans or other systems. Such measures are useful, for example, to refer data points to a human expert or another system for further processing. In automated scenarios, they can be used to hand control to a human. I will also highlight how calibration is related to fairness in machine learning. I will demonstrate the use of confidence and uncertainty measures for out-of-distribution detection and for improving exploration in reinforcement learning.
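One standard way to quantify the kind of (mis)calibration discussed here is the expected calibration error: bin predictions by confidence and compare each bin's average confidence to its accuracy. The sketch below uses made-up data and is an illustration, not the lecture's code.

```python
import numpy as np

def ece(confidences, correct, n_bins=5):
    """Expected calibration error: weighted average, over confidence bins,
    of |mean confidence - accuracy| in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = len(confidences)
    err = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            err += in_bin.sum() / total * gap
    return err

# A well-calibrated toy model: 80%-confident predictions are right 80% of
# the time, so the calibration error is zero.
print(ece([0.8] * 10, [1] * 8 + [0] * 2))  # prints 0.0
```

Techniques such as temperature scaling adjust the model's logits after training to drive exactly this quantity down on held-out data.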
LABS:
Sentiment analysis lab - From a baseline to an RNN with attention
Download the zip file: https://drive.google.com/file/d/1ojtEkJUGzAduSimYhMSkKNPyXyhzU1es/view?usp=sharing
Then uncompress it directly in Google Drive, inside a folder "Colab Notebooks" that you must create first.
From your Google Drive, you will then be able to "Open with" Colaboratory the .ipynb files.
July 19th
We will cover representation learning from text, including algorithms such as word2vec and fastText. We will describe the differences between the various algorithms for learning representations. Efficient supervised text classification with the fastText algorithm will also be discussed. Statistical language models based on neural networks will be introduced, and advanced topics such as vanishing and exploding gradients, as well as learning longer-term memory in recurrent networks, will be explained. We will also talk about the limitations of current learning algorithms, and discuss the limits of generalization in the context of sequential data and of learning from language in general.
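To make the skip-gram idea behind word2vec concrete, the sketch below shows how training pairs are formed from a context window: each word is trained to predict the words around it. This is an illustration only; real implementations add subsampling, negative sampling, and learned embedding matrices.

```python
def skipgram_pairs(tokens, window=2):
    """Emit (target, context) pairs for every word within `window`
    positions of each target word."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = "the cat sat on the mat".split()
for target, context in skipgram_pairs(sentence, window=1):
    print(target, "->", context)
```

Training a pair of embedding matrices so that each target's vector scores its true contexts above sampled negatives is what gives word2vec its well-known geometry of word similarities.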
No LAB
Registration
Access & Accommodation
The University provides individual student rooms and studios in Nice at very attractive prices.
Please visit the following site:
Résidences ICARE - Crous de Nice-Toulon
For the English version, go to the bottom of the page and click on "Langage".
Partners
To become a partner, please contact us.