I am about to run my third lecture series on deep learning (DL). Here I would like to explain how I designed and revised all 12 weeks of lectures from the ground up. The approach is not hard to follow and it works well for me, but I cannot guarantee that it will work for others! Here are my basic principles and guidelines. Note that the lectures will be made public if I have more time to fix some areas :).
- Don’t Create Lecture Materials. The first step is to combine the available materials from the world’s top universities and MOOCs, capture the key concepts, and innovate on the “teaching style” based on the participants’ backgrounds. Start from what students have already learned in school (we can safely assume statistics, numerical methods, and probability), then introduce the key concepts bottom-up. I usually cover the “learning from data” problem domain first, then make clear why learning is feasible using theory such as probability, VC dimension/generalization, and the bias-variance trade-off. Prof. Yasser gave beautiful explanations of those concepts. From there I introduce the common types of machine learning (supervised, unsupervised, reinforcement) and the elements of the learning lifecycle, such as noisy targets, error measures, validation sets, model selection, and cross-validation. I commonly use the simplest linear model to introduce the basic concept of “learning from data”, and I simplify the math (but never skip it) to explain the fundamentals. Remember, math is the best language on earth for explaining complex concepts. All of these materials and teaching techniques can already be found in the best books and lectures, so don’t re-create them.
- Start with Simple Perceptron. I introduce the neural network concept starting from the simple perceptron model. It shows clearly why and how connected simple neurons can be used as a generic learning model. I introduce the concepts of neuron, weight, activation, and objective/cost function, along with a simple iterative algorithm for a classification task (handwritten digit recognition with the MNIST dataset, for example). In my experience, these concepts are easier to understand through visual animations than through textual descriptions. Once students have a good understanding of the perceptron learning algorithm, I think they are ready to learn more complex concepts, starting with the common problems of the perceptron and how we can fix them. For example, if the activation is a step function, the objective is not differentiable, so the problem becomes combinatorial and we can’t use differential calculus to minimize it. The problems of the perceptron are the reason we have the modern neural network model.
- Animate How Basic Models Work. In my experience, Indonesian students do not learn much from reading, but they have very good visual memory. I use that fact to accelerate the learning process for basic neural network models. To arrive at the neural network concept, I first introduce three modifications of the perceptron: (1) a smooth non-linear activation function (from step to sigmoid), (2) a change of objective function, and (3) additional hidden layers. These modifications are necessary to improve the performance of the perceptron model. However, they create a new problem: how to calculate the weight gradients for the deeper layers of the network in order to minimize the objective/cost. This is why we come to the well-known dynamic programming technique called back-propagation, the most important algorithm that makes deep feed-forward neural networks (DNNs) possible. Again, I think we should still use visual animation to explain the basic DNN concepts without skipping the math. Matrix computation and partial derivatives (to calculate gradients) are the two prerequisites here, and we can animate them too! Other important concepts are stochastic gradient descent (SGD) for optimizing the objective function, over-fitting vs. regularization, and model evaluation. We can use tools like TensorFlow Playground to explain all of those concepts comprehensively.
- Teach The Problem First. I have reviewed many deep learning lectures that tried to teach with a bottom-up approach. However, some failed to explain the main problems because they mostly focused on explaining solutions. I think engineers who understand problems are much better than engineers who memorize solutions. Why? First, a great many heuristic techniques have been invented in this field and the number is still growing very fast. That implies we can’t memorize all the techniques, but we can build an understanding of a few important concepts. Second, an engineer who knows a problem more deeply is more likely to find a good solution faster. The deep learning field is growing fast in terms of techniques, but the problem domains we are trying to solve with it are still limited. So I prefer to explain the key problem domains in computer vision, speech recognition, and NLP/NLU. Once students understand problems such as image classification, segmentation, and captioning, they will have the right curiosity to learn the state-of-the-art techniques for applying deep learning to computer vision, where the task is to recognize spatial patterns. The same holds for NLP/NLU and Automatic Speech Recognition (ASR), which usually consist of sequential pattern recognition.
- Introduce State of the Art Models. Once students understand a problem domain well, I am usually able to trigger their curiosity to learn its state-of-the-art techniques. In computer vision, for example, we should pick a problem like feature extraction in image recognition, then teach the convolutional neural network (CNN, with its convolution and pooling operations) as the best solution. There are currently too many CNN-based model architectures to cover, so pick a few of them based on the problems you have selected. Besides the CNN, the LSTM is a well-known state-of-the-art recurrent neural network (RNN) model for sequential pattern learning problems. It is widely used in the NLP/NLU and speech problem domains. The most important concepts in the LSTM are the cell unit (instead of the plain neuron) and back-propagation through time (BPTT). BPTT extends the back-propagation algorithm along the time dimension. Based on my previous experience, the CNN and the LSTM are very important models to introduce for the vision, NLP/NLU, and speech problem domains. Many advanced models in deep learning are combinations of DNN, CNN, and LSTM. Take image captioning, for example: after the recognition part with a DNN/CNN, we use an LSTM to generate the caption.
- Introduce Deep Generative Model Earlier. Beyond discriminative learning tasks (such as classification and regression), deep learning can now be used for generative tasks, in which we generate synthetic samples with the same distribution as the training data. This means we have to capture/approximate the probability density of the real data, model it, and generate new data samples from the resulting model density. This field is growing very fast nowadays, led by the state-of-the-art models called GANs (Generative Adversarial Networks). To teach GAN models there are many concepts to introduce (maximum likelihood, probability, game theory). However, the essence is how GANs model the probability density of real data implicitly, so that they can be applied to supervised, semi-supervised, and unsupervised learning. Again, animation tools like GAN Lab are very helpful for explaining hard concepts. The reason to teach GANs early is that combining discriminative and generative capabilities gives a model more intelligence to solve bigger problems in vision, NLP/NLU, and ASR.
- Introduce Deep Reinforcement Learning Earlier. Just like generative learning, reinforcement learning (RL) is a very important topic in the deep learning field, because humans naturally learn within this paradigm. RL is the most natural learning paradigm we can give to machines. I plan to introduce RL in an overview style: explain the problem, draw the big picture, and fill in the details iteratively. The common way is to start with background on the learning paradigms: supervised, unsupervised, and reinforcement learning. Then introduce the core RL concepts, including value function, policy, reward, model, exploration vs. exploitation, and representation. We also have to cover important mechanisms for RL, including attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn. To close, I should discuss RL applications, including games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer systems, science, engineering, and art. With that final discussion, I should be able to tell my students that AI is still a very young field with unlimited opportunities for them to solve more problems. I hope the discussion will give them new curiosity to invest more time in learning and contributing to the deep learning field.
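To make the perceptron learning algorithm from the list above concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions, not taken from any lecture material: the AND-gate toy data, the names `train_perceptron` and `predict`, and the learning rate are all hypothetical choices. It uses the step activation, which is exactly why the update is an error-driven rule rather than a gradient step.

```python
def step(z):
    """Step activation: fires 1 when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x):
    """Output of a single neuron: step(w . x + b)."""
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(samples, labels, lr=1.0, max_epochs=100):
    """Classic perceptron rule: adjust the weights only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly
            break
    return weights, bias


# Toy data (an assumption for illustration): the linearly separable AND gate.
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_LABELS = [0, 0, 0, 1]

weights, bias = train_perceptron(AND_INPUTS, AND_LABELS)
predictions = [predict(weights, bias, x) for x in AND_INPUTS]
print(weights, bias, predictions)
```

Swapping the labels for XOR (`[0, 1, 1, 0]`) is a quick way to show students a problem that a single perceptron cannot solve, which motivates hidden layers.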
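The back-propagation idea from the list above can also be sketched in a few lines. This is a hedged illustration, assuming a tiny 2-2-1 sigmoid network with a squared-error loss; the parameter values, the helper names, and the finite-difference check are my own hypothetical choices. The point to show students is that the output-layer error term is computed once and reused for the hidden layer, which is the dynamic programming step.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return the hidden activations and the output for one input vector."""
    W1, b1, W2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output


def loss(params, x, y):
    """Squared error for a single sample."""
    _, output = forward(params, x)
    return 0.5 * (output - y) ** 2


def backprop(params, x, y):
    """Gradients of the loss with respect to every parameter (chain rule)."""
    W1, b1, W2, b2 = params
    hidden, output = forward(params, x)
    # Output layer: dL/dz = (a - y) * sigmoid'(z), with sigmoid' = a * (1 - a).
    delta_out = (output - y) * output * (1.0 - output)
    grad_W2 = [delta_out * h for h in hidden]
    # Hidden layer: reuse delta_out instead of recomputing it for each weight.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    return grad_W1, delta_hidden, grad_W2, delta_out


def numerical_grad_W1(params, x, y, i, j, eps=1e-6):
    """Central finite difference for one first-layer weight (sanity check)."""
    W1, b1, W2, b2 = params

    def loss_with(value):
        W1_mod = [row[:] for row in W1]
        W1_mod[i][j] = value
        return loss((W1_mod, b1, W2, b2), x, y)

    return (loss_with(W1[i][j] + eps) - loss_with(W1[i][j] - eps)) / (2 * eps)


def sgd_step(params, grads, lr=0.1):
    """One gradient descent update on a single sample."""
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = grads
    return ([[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)],
            [b - lr * g for b, g in zip(b1, gb1)],
            [w - lr * g for w, g in zip(W2, gW2)],
            b2 - lr * gb2)


# Arbitrary example values (assumptions for illustration only).
params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, y = [1.0, 0.5], 1.0

grads = backprop(params, x, y)
analytic = grads[0][0][1]
numeric = numerical_grad_W1(params, x, y, 0, 1)
loss_before = loss(params, x, y)
loss_after = loss(sgd_step(params, grads), x, y)
print(analytic, numeric, loss_before, loss_after)
```

Comparing the analytic gradient with the finite-difference estimate is a cheap way to convince students (and yourself) that the chain-rule derivation is right before moving on to matrix form.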
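For the CNN part, the two core operations (convolution and pooling) are small enough to write by hand. The sketch below is an assumption-laden toy, not a real CNN layer: it uses a single channel, no padding, stride 1 for the convolution, a hand-picked vertical-edge kernel, and hypothetical function names. As in most deep learning libraries, the "convolution" here does not flip the kernel, so strictly speaking it is a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


# Toy 5x5 image: dark on the left, bright on the right (one vertical edge).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to a dark-to-bright transition.
vertical_edge = [[-1, 1],
                 [-1, 1]]

feature_map = conv2d_valid(image, vertical_edge)  # 4x4 map, peaks at the edge
pooled = max_pool2d(feature_map)                  # 2x2 summary of the map
print(feature_map)
print(pooled)
```

The feature map is non-zero only in the column where the edge sits, and pooling keeps that response while shrinking the map, which is the spatial pattern recognition story in miniature.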
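Finally, some of the core RL concepts listed above (reward, value estimates, exploration vs. exploitation) can be shown with a multi-armed bandit, which is about the simplest RL setting I know. This is only a sketch under my own assumptions: the three reward probabilities, the epsilon value, the seed, and the name `run_bandit` are hypothetical, and a bandit has no states, so it leaves out policies over states and environment models entirely.

```python
import random


def run_bandit(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit.

    With probability epsilon the agent explores (random arm); otherwise it
    exploits the arm with the highest estimated value so far.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    estimates = [0.0] * n_arms  # running estimate of each arm's value
    counts = [0] * n_arms       # how many times each arm was pulled
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                          # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        # Incremental mean: move the estimate toward the observed reward.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward


# Hypothetical environment: the third arm pays off most often.
estimates, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(estimates, counts, total_reward, best_arm)
```

Setting `epsilon=0.0` in class is a nice experiment: the agent can lock onto the first arm that ever pays out, which makes the need for exploration visible.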
Hope this helps!
TSMRA – 2019.