This reading group is internal to the University of Michigan, and is intended for general, rigorous ML theory. It is more informal than other reading groups on this list, so it’s a good place to practice talks/presentations for conferences or research meetings. Some presenters (me) discuss ML models for chemistry, but that is not the intent of the group. This group is organized by Matt Raymond (me) and is not currently funded. We are always looking for new presenters, so please consider joining! To maintain the informal and low-risk nature of this reading group, meetings are not recorded.
This reading group specializes in learning on graphs and other geometric structures. Although it is not specifically about chemistry, many of the papers cover chemistry, particle physics, and similar application areas. The group is organized by Hannes Stark, a PhD student at MIT, and is funded by Valence Discovery. Recordings are available here.
This reading group is directly organized and funded by Valence Discovery, which specializes in ML for chemistry. Their talks are typically more applied than the talks from LoGaG but appear to be slightly less rigorous. Recordings are available here.
I’ve only started this book recently, but it has very intuitive explanations and many exercises. I wouldn’t call it a “hardcore” topology book, but it’s definitely helpful for picking up some topology on the side. It also comes with several associated lectures, which are available through YouTube links on the site. Overall, I think it’s a very enjoyable read.
This book gives a good introduction to Gaussian Processes. I found it to be very helpful.
Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.
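To make the book’s subject concrete, here is a minimal sketch of GP regression with a squared-exponential (RBF) covariance function, one of the kernels the book discusses. The sine dataset, length scale, and noise level are all hypothetical choices for illustration, not values taken from the book:

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    # Squared-exponential (RBF) covariance between two 1-D point sets
    sq_dists = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dists / length_scale**2)

# Toy data (hypothetical): noisy observations of sin(x)
rng = np.random.default_rng(0)
X = np.linspace(0.0, 5.0, 8)
y = np.sin(X) + 0.1 * rng.standard_normal(8)

noise_var = 0.1 ** 2
K = rbf(X, X) + noise_var * np.eye(len(X))   # train covariance plus noise
X_test = np.linspace(0.0, 5.0, 50)
K_star = rbf(X_test, X)                      # test/train cross-covariance

# Standard GP posterior mean and pointwise variance
alpha = np.linalg.solve(K, y)
mean = K_star @ alpha
var = np.diag(rbf(X_test, X_test) - K_star @ np.linalg.solve(K, K_star.T))
```

In practice one would use a Cholesky factorization of K (as the book’s algorithms do) rather than a direct solve, and fit the kernel hyperparameters by maximizing the marginal likelihood.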
To me, this seems like the definitive work on Geometric Deep Learning. There’s more than enough material to warrant several readings.
The last decade has witnessed an experimental revolution in data science and machine learning, epitomized by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach – such as computer vision, playing Go, or protein folding – are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simple algorithmic principles: first, the notion of representation or feature learning, whereby adapted, often hierarchical, features capture the appropriate notion of regularity for each task, and second, learning by local gradient-descent type methods, typically implemented as backpropagation. While learning generic functions in high dimensions is a cursed estimation problem, most tasks of interest are not generic, and come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world. This text is concerned with exposing these regularities through unified geometric principles that can be applied throughout a wide spectrum of applications. Such a ‘geometric unification’ endeavour, in the spirit of Felix Klein’s Erlangen Program, serves a dual purpose: on one hand, it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers. On the other hand, it gives a constructive procedure to incorporate prior physical knowledge into neural architectures and provides a principled way to build future architectures yet to be invented.
Tai-Danae Bradley is a research mathematician at Alphabet. Her blog focuses on the intersection of category theory and natural language processing.
Will Kurt is a statistics author. His blog covers topics in frequentist and Bayesian statistics, and typically provides in-depth derivations.
Michael Bronstein is the DeepMind Professor of AI at Oxford and the Head of Graph ML Research at Twitter. His blog provides non-rigorous introductions to the ideas in papers that he and his colleagues have published. It’s probably one of the more interesting ML blogs on Medium.
Lilian Weng is a researcher at OpenAI. Her blog typically provides overviews of existing ML architectures, rather than covering specific theory topics. It’s generally readable by someone without too much ML background, and provides more derivations than the source papers.
Jason Brownlee is an ML author. His blog provides basic introductions to ML concepts, but is not especially rigorous or in-depth. It’s good if you know very little about a given topic, but does not provide more than heuristic justifications for its assertions.
Topology, differential geometry, and algebraic topology. “A gentle introduction to insanity.”
Provides useful intuitions for abstract mathematical concepts using beautiful animations (made with Manim), and typically focuses on ML.