I went to ICML 2019 in Long Beach, 10-13 June 2019, one of the major machine learning conferences, alongside ICLR and NeurIPS. I very much enjoyed the conference. There have been enormous advances in neural networks, starting with supervised learning for image recognition, and now there were also very many papers on reinforcement learning and games. I had been afraid that ICML would be mostly supervised learning, but I was pleasantly surprised to find that on all three days there was a reinforcement learning track, which was of very high quality and which I found interesting and could understand.
As you know, I am interested in reinforcement learning, coming from the area of search (alpha-beta, MCTS) in combinatorial games (chess, checkers, Go). My conferences used to be the classical AI conferences: AAAI, IJCAI, ECAI. Since AlphaGo it has been clear that a combination of search and neural networks yields amazing results, and the neural network field has also shifted much of its interest from supervised image processing to reinforcement learning and games. That is the reason I now go to NeurIPS, ICML, and ICLR.
Let me list for you the highlights of the conference as I understood them. The advances in deep learning for image processing are great, and have been driven by better hardware (GPUs), better algorithms, and big data (the labeled ImageNet dataset), which together give large deep networks huge amounts of data to train on. For reinforcement learning in games, the advances have likewise come from ways to generate many examples for large deep networks to train on. So there are great successes, but at a high training cost. A central trend at the conference was toward cheaper training, since the high training costs are unsustainable and a major impediment to wider application of deep learning.
Sample efficiency needs to be improved, since training is too expensive. From the sessions I attended, I saw the following approaches to improving sample efficiency:
- few-shot learning/meta learning
- model-free planning
- off-policy learning
Few-shot learning takes sample efficiency head on. There are very exciting successes in the field in training a network, and then retraining it so that the resulting network is suitable for very fast adaptation (few-shot) to new examples. MAML does this by differentiating through the adaptation step itself, which requires second derivatives. Work by Chelsea Finn and others on MAML and related meta-learning methods: https://arxiv.org/abs/1606.04080, https://arxiv.org/abs/1902.08438, https://arxiv.org/abs/1703.03400, https://arxiv.org/pdf/1903.08254.pdf, http://proceedings.mlr.press/v97/liu19g/liu19g.pdf, https://arxiv.org/abs/1810.09502
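To make the second-derivative point concrete, here is a minimal MAML-style sketch on scalar toy tasks. This is entirely my own illustration, not from any of the papers above: the tasks, step sizes, and loss are made up, and the real work uses neural networks, but the "gradient through the gradient" structure is the same.

```python
# Toy MAML: task i has loss L_i(theta) = (theta - c_i)^2 with optimum c_i.
# We meta-train theta so that ONE inner gradient step adapts well to any task.
alpha = 0.1   # inner-loop (adaptation) step size
beta = 0.05   # outer-loop (meta) step size
tasks = [1.0, 2.0, 3.0, 6.0]  # per-task optima c_i (hypothetical values)

theta = 0.0
for _ in range(200):
    meta_grad = 0.0
    for c in tasks:
        # inner step: theta' = theta - alpha * dL/dtheta = theta - alpha*2*(theta - c)
        theta_adapted = theta - alpha * 2.0 * (theta - c)
        # outer gradient: differentiate L(theta') w.r.t. theta.
        # The chain rule brings in d(theta')/d(theta) = 1 - 2*alpha,
        # which contains the second derivative of the inner loss --
        # this is the term MAML needs second-order gradients for.
        meta_grad += 2.0 * (theta_adapted - c) * (1.0 - 2.0 * alpha)
    theta -= beta * meta_grad / len(tasks)

print(round(theta, 3))  # → 3.0, the mean of the task optima
```

For this quadratic toy the meta-optimal initialization is simply the mean of the task optima; with neural networks the same outer loop finds an initialization from which a few gradient steps adapt to a new task.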
Model-free planning notes that explicit tree search such as MCTS is very expensive, and tries to improve sample efficiency by seeing whether explicit planning can be subsumed into a neural network. This is a very ambitious and bold approach, which can expect a lot of resistance from the explicit planning field (ICAPS and the like). Do these neural network people want to take over everything? Still, there were a few papers on this, claiming success, among others in Sokoban: https://arxiv.org/abs/1901.03559, https://arxiv.org/abs/1811.04551, http://reinforcement-learning.ml/papers/pgmrl2018_azizzadenesheli.pdf
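As a toy illustration of the underlying idea, amortizing an expensive explicit planner into a cheap reactive policy, here is a sketch of my own (the papers use deep networks on Sokoban; I use breadth-first search and a lookup table on a 4x4 gridworld, so all names and numbers below are assumptions):

```python
from collections import deque

N, GOAL = 4, (3, 3)  # 4x4 gridworld, goal in the corner
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def neighbors(s):
    for a in ACTIONS:
        r, c = s[0] + a[0], s[1] + a[1]
        if 0 <= r < N and 0 <= c < N:
            yield (r, c)

def plan_first_action(start):
    # "expensive" explicit planner: BFS backwards from the goal,
    # then return the first step of a shortest path from `start`
    queue, parent = deque([GOAL]), {GOAL: None}
    while queue:
        s = queue.popleft()
        for s2 in neighbors(s):
            if s2 not in parent:
                parent[s2] = s  # parent is one step closer to the goal
                queue.append(s2)
    if start == GOAL:
        return None
    nxt = parent[start]
    return (nxt[0] - start[0], nxt[1] - start[1])

# offline distillation: query the planner once per state, store the answer
policy = {(r, c): plan_first_action((r, c)) for r in range(N) for c in range(N)}

# online: act reactively from the table -- no search in the loop
s, steps = (0, 0), 0
while s != GOAL:
    a = policy[s]
    s = (s[0] + a[0], s[1] + a[1])
    steps += 1
print(steps)  # → 6, the shortest-path length from (0,0) to (3,3)
```

The table here plays the role of the neural network in the papers: the cost of planning is paid once at training time, and acting afterwards is a constant-time lookup.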
Off-policy learning improves sample efficiency by learning about one policy from experience generated by another, so that data can be reused instead of thrown away after each policy update. There were a few papers with very nice approaches that I could not fit into the other categories: https://arxiv.org/abs/1902.01119, https://arxiv.org/abs/1812.02900, https://arxiv.org/abs/1903.08254
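To remind ourselves what off-policy means in the simplest setting, here is a sketch of my own (not from the listed papers): tabular Q-learning on a 5-state chain, where the behavior policy acts uniformly at random, yet the max operator in the update learns the values of the greedy target policy. Data collection and learning are decoupled, which is exactly what lets experience be reused.

```python
import random

random.seed(0)
n_states, gamma, lr = 5, 0.9, 0.5
Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == n_states - 1 else 0.0  # reward only at the right end
    return s2, r

for _ in range(2000):
    s = random.randrange(n_states - 1)
    a = random.randrange(2)            # behavior policy: uniform random
    s2, r = step(s, a)
    # off-policy target: bootstrap from the greedy (max) next action,
    # regardless of which action the behavior policy will actually take
    Q[s][a] += lr * (r + gamma * max(Q[s2]) - Q[s][a])

# greedy policy recovered purely from random experience: always move right
print([max(range(2), key=lambda a: Q[s][a]) for s in range(n_states - 1)])
# → [1, 1, 1, 1]
```

An on-policy method such as SARSA would instead bootstrap from the action the behavior policy takes next, and would learn the value of the random policy rather than the greedy one.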
There were a few very inspiring papers on the generalization/overfitting trade-off. I found them very nice. https://arxiv.org/abs/1804.06893, https://arxiv.org/abs/1812.02341, https://arxiv.org/abs/1806.07937, https://arxiv.org/abs/1706.05394, https://arxiv.org/abs/1810.07052
Note that some of the listed papers were not presented at ICML, but are major works that the other papers refer to and build on.
I hope this write-up of the papers that I enjoyed is as useful and inspiring for you as being here was for me!