Deep neural network training is a highly non-convex optimisation problem with poorly understood properties. We know that a solution can be found by following the negative gradient down the loss landscape, but we have few guarantees that the discovered minimum will, for example, generalise well to unseen data. The shape of the loss landscape matters when choosing a strategy for finding a minimum, and may also matter when differentiating between two solutions that are equivalent in terms of training error, but markedly different when performing inference on unseen data. Thus, quantifying and visualising neural network loss landscapes has much practical relevance. This talk aims to review the various methods used for loss landscape analysis, and to highlight some of the practical insights that this lens offers.
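One common family of visualisation methods slices the high-dimensional loss surface along a low-dimensional direction through a trained point. The sketch below is a minimal, hypothetical illustration of that idea (not the speaker's specific method): it trains a simple linear model with gradient descent, then evaluates the loss along a random unit direction through the discovered minimum. A convex model is used here only so the example stays small and self-contained; the same slicing procedure applies to non-convex deep networks.

```python
import numpy as np

# Synthetic regression data (assumed setup for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=100)

def loss(w):
    # Mean squared error of a linear model.
    return np.mean((X @ w - y) ** 2)

# Follow the negative gradient down the loss surface.
w = np.zeros(5)
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad

# 1-D loss-landscape slice: L(alpha) = loss(w + alpha * d)
# along a random unit direction d through the minimum.
d = rng.normal(size=5)
d /= np.linalg.norm(d)
alphas = np.linspace(-1.0, 1.0, 21)
slice_vals = [loss(w + a * d) for a in alphas]
```

Plotting `slice_vals` against `alphas` gives a one-dimensional cross-section of the landscape around the minimum; flatness or sharpness of that cross-section is one of the quantities loss landscape analysis tries to measure.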
Anna Bosman: I am a senior lecturer of Computer Science at the University of Pretoria and the leader of the Computational Intelligence Research Group (CIRG). I completed my PhD in 2019, with a thesis focused on fitness landscape analysis of neural network loss surfaces. I am interested in developing a better fundamental understanding of how and why neural networks work. I enjoy working on hybrid approaches that combine different computational intelligence paradigms. I also love seeing machine learning in action, so I am always keen to try my hand at real-life applications. In particular, I dabble in computer vision in various applied domains, from satellite images to radio astronomy.
27 September 2023