11

Recently, I have stumbled upon articles and lecture videos that use category theory to explain aspects of machine learning or deep learning (e.g., Cats for AI and the paper An enriched category theory of language: from syntax to semantics). On the other hand, people have looked into how higher categories and homotopy type theory can be used to talk about computer science (e.g., The Quantum Monadology). My question is this:

Are there potential applications of higher category theory (in particular, $\infty$-categories) to machine and deep learning? If so, what would be the merit of the homotopy-coherent formalization that $\infty$-categories bring along with them in this computer science picture?

h3fr43nd

2 Answers

7

Sure, there are plenty of potential applications of higher category theory to machine learning and deep learning. Such applications are still in their infancy, so don't expect a new algorithm that gets better results in less time in the immediate future. If that's your goal, then I think Mirco's answer is great (simplicial neural networks) and it's the one I would have given. Since he's already done so, let me instead sketch some higher-level, conceptual connections between higher category theory and machine learning.

To understand such connections, it's helpful to zoom way out and focus on the forest instead of the trees. What is machine learning really about?

  1. Data, since the model needs to train on data.
  2. Iterating towards a good model.
  3. Building up complexity from simple pieces.

Towards (1), category theory has been proposed as an engine for the study of databases. I learned this from David Spivak's work, and wrote about it previously in this MO answer (which has tons of applications of category theory in the sciences). The idea is that a category can be a model for a database, and you can use category theory to enforce constraints, to make sure your database doesn't contain bad data that could break the machine learning algorithm. In my book on Data Systems, we had three chapters devoted to databases and three chapters devoted to constraints. Similarly, you can use category theory to enforce constraints on the types of solutions your machine learning algorithm will come up with. Spivak has also written papers about the uses of category theory in experimental design (also summarized in the MO answer linked above), which matters if you want to be sure you get good data on which to build your model. You probably know that Google is running experiments on users all the time; experimental design still matters. You might also be interested in Spivak's current work with the Topos Institute, which applies (higher) category theory to all sorts of modern, real-world issues, including artificial intelligence.
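To make the constraint-enforcement idea concrete, here is a minimal sketch in Python in the spirit of Spivak's functorial data model; the tables, the `works_in`/`secretary` columns, and the path equation are all invented for illustration, not taken from any of the works cited:

```python
# A toy "category as database schema": objects are tables, morphisms are
# columns (functions between tables), and a constraint is a path equation
# that must commute -- in the spirit of Spivak's functorial data model.

employees = {"e1", "e2", "e3"}
departments = {"d1", "d2"}

# Morphisms of the schema, realized as functions (dicts) on row ids.
works_in = {"e1": "d1", "e2": "d1", "e3": "d2"}   # Emp -> Dept
secretary = {"d1": "e2", "d2": "e3"}              # Dept -> Emp

assert set(works_in) == employees  # works_in is total: every row is mapped

def compose(f, g):
    """Composite morphism g . f, computed pointwise on row ids."""
    return {x: g[f[x]] for x in f}

def constraint_holds():
    """The diagram commutes: works_in . secretary = id_Dept, i.e. the
    secretary of a department must actually work in that department."""
    composite = compose(secretary, works_in)      # Dept -> Dept
    return all(composite[d] == d for d in departments)

assert constraint_holds()      # clean data passes

secretary["d1"] = "e3"         # corrupt the data: e3 works in d2, not d1
assert not constraint_holds()  # the categorical constraint catches it
```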

Towards (2), iterating towards success makes a machine learning algorithm an evolutive system, and Spivak has studied those, too; I wrote about them in the linked MO answer. This also brings in an area where $\infty$-categories can play a role. In any dynamical system, we care about the time axis. Often, we want to study aspects of the system that are invariant under reparameterizing time, e.g., the long-term tendency towards equilibrium. When you start considering all possible ways to reparameterize time, you're doing homotopy theory, and $\infty$-categories can appear. There have been many papers studying the homotopy theory of dynamical systems: check out work of Bubenik, Jardine, Gaucher, and Sanjeevi Krishnan, among others.
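To make the reparameterization point concrete, here is a toy illustration (my own, not from the papers just cited): if $x(t)$ solves the gradient flow $\dot{x} = -\nabla f(x)$ and $\phi$ is any smooth, increasing, surjective reparameterization of time, then $y(t) = x(\phi(t))$ solves $\dot{y} = -\phi'(t)\,\nabla f(y)$. Since $\phi' > 0$, the same trajectory is merely traversed at a different speed: the equilibria $\{x : \nabla f(x) = 0\}$ and the limit $\lim_{t \to \infty} x(t)$ are unchanged. The homotopy theory enters when you treat all such $\phi$, and the paths between them, as equivalent.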

Towards (3), building up complexity from simple pieces is a fundamental use of category theory, via colimits. The mindset of category theory could potentially be useful for one of the fundamental issues of deep learning, which is to understand the inner workings of a neural network and why it comes up with the answer it does. You could even imagine viewing the neural network as a functor, from data to models, and breaking it down using functor calculus. There are SO MANY ways to apply category theory to these questions, and I expect to see plenty of such papers in the years to come.
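As a toy sketch of the compositional mindset (my own illustration, not code from any paper cited here): a feedforward network is literally a composite of morphisms, and reasoning about it layer-by-layer is reasoning about that factorization.

```python
import numpy as np

# Objects are R^n; morphisms are (parameterized) smooth maps; a deep
# network is a composite of simple morphisms.

def dense(W, b):
    """One layer R^m -> R^n, as a morphism with a tanh nonlinearity."""
    return lambda x: np.tanh(W @ x + b)

def compose(*fs):
    """Categorical composition: compose(f, g, h)(x) = h(g(f(x)))."""
    def composite(x):
        for f in fs:
            x = f(x)
        return x
    return composite

rng = np.random.default_rng(0)
layers = [dense(rng.normal(size=(8, 4)), np.zeros(8)),
          dense(rng.normal(size=(8, 8)), np.zeros(8)),
          dense(rng.normal(size=(2, 8)), np.zeros(2))]

network = compose(*layers)   # the whole model, viewed as one morphism
print(network(np.ones(4)))   # a map R^4 -> R^2

# Interpretability questions become questions about the factorization,
# e.g. inspecting the intermediate object after the first two layers:
hidden = compose(*layers[:2])(np.ones(4))
print(hidden.shape)          # (8,)
```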

Here are some other, more concrete, connections between category theory and deep learning:

I'll add more if I think of any. The take-away is that there are tons of potential connections and I encourage you to investigate any that seem interesting to you!

David White
  • Hei David, cool answer, and yes, quite complementary to mine. Dunno why someone decided to downvote both... – Mirco A. Mannucci Feb 22 '24 at 14:44
  • @MircoA.Mannucci I don't take it personally. People downvote all the time without giving a reason. Probably this person didn't like the nature of the question. Or maybe they hate category theory. Plenty of folks out there like that. – David White Feb 22 '24 at 14:59
  • Hi David, thanks for the detailed answer and all the references. Maybe a very general, probably ill-posed, follow-up question: If one wanted to get into categorical aspects of machine learning more seriously (but only has a background in higher category theory, and knows very little of ML), what would be the ideal starting point for learning these things (especially when one is interested in LLMs)? Would you recommend one of the given references specifically for such an endeavor? – h3fr43nd Feb 22 '24 at 15:49
  • @h3fr43nd Perhaps not exactly what you're looking for, but there's a book called Data Science for Mathematicians, with a chapter on machine learning and another on TDA. Full disclosure: I wrote the stats chapter of that book (but I don't get any money from sales): https://ds4m.github.io/site/index.html – David White Feb 22 '24 at 16:04
  • Thanks I will look into this :) – h3fr43nd Feb 22 '24 at 16:10
6

After the success of graph deep learning, folks are moving up the chain: a few articles (see for instance this one) have begun considering simplicial neural network (SNN) architectures; the main motivation is that one can study complex data structures where the vertices interact in non-dyadic relationships.

Now, all of the above is still in its infancy, but I would venture to think that in a few years hyper-data structures and hyper-neural networks will be part and parcel of the ML arsenal. At that point, higher category theory may jump in.

How? For instance, by describing operations on hyper-architectures, such as morphing one architecture into another.

Here are just some freewheeling suggestions:

Morphisms as Feature Maps: In an $\infty$-category framework, morphisms between simplices can be interpreted as feature maps or transformation functions in SNNs. This includes not only the direct relationships between simplices but also the relationships between their faces, edges, etc.

Higher Morphisms for Aggregation and Propagation: Higher morphisms can model the aggregation and propagation of features across different dimensions of the simplicial complex. For instance, a 2-morphism could represent a transformation that involves not just a direct feature map from one simplex to another but also incorporates information from their mutual relationships and boundaries. (A numerical sketch of this dimension-crossing propagation follows these suggestions.)

Homotopy Coherence for Regularization: The concept of homotopy coherence from $\infty$-categories can be applied to ensure that operations in SNNs preserve certain topological invariants or properties. This could be seen as a form of regularization that maintains the integrity of the data's topological structure, ensuring that learned features are robust and meaningful with respect to the data's inherent geometry.
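To ground the aggregation/propagation point numerically, here is a minimal sketch (my own toy example, with made-up features) of pushing node features onto edges and edge features onto triangles via incidence matrices, which is the basic linear-algebra step underlying most SNN layers:

```python
import numpy as np

# A tiny simplicial complex: a filled triangle (0,1,2) plus an edge (2,3).
# Edges, in a fixed order: (0,1), (0,2), (1,2), (2,3).
# B1: node-to-edge signed boundary operator, shape (4 nodes, 4 edges).
B1 = np.array([[-1, -1,  0,  0],
               [ 1,  0, -1,  0],
               [ 0,  1,  1, -1],
               [ 0,  0,  0,  1]], dtype=float)

# B2: edge-to-triangle boundary operator for the triangle (0,1,2),
# whose boundary is (1,2) - (0,2) + (0,1). Shape (4 edges, 1 triangle).
B2 = np.array([[1], [-1], [1], [0]], dtype=float)

# Sanity check: the boundary of a boundary is zero, B1 @ B2 = 0.
assert np.allclose(B1 @ B2, 0)

x0 = np.array([1.0, 2.0, 3.0, 4.0])   # toy features on the 4 nodes

# "Propagation across dimensions": lift node features to edges, then to
# triangles, via the (co)boundary maps. A learned SNN layer would wrap
# weight matrices and nonlinearities around these same operators.
x1 = np.abs(B1).T @ x0      # each edge aggregates its two endpoints
x2 = np.abs(B2).T @ x1      # the triangle aggregates its three edges
print(x1)                   # [3. 4. 5. 7.]
print(x2)                   # [12.]
```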

P.S. There are a few basic implementations of simplicial NNs that you can play with; one is TopoX.
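As a hedged sketch of how that library is used (the class and method names below are my recollection of the TopoNetX package within TopoX, so double-check them against the current documentation):

```python
import toponetx as tnx

# The same tiny complex as above: a filled triangle plus a dangling edge.
sc = tnx.SimplicialComplex([[0, 1, 2], [2, 3]])

print(sc.shape)                    # simplex counts per dimension: (4, 4, 1)
B1 = sc.incidence_matrix(rank=1)   # node-to-edge boundary operator (sparse)
B2 = sc.incidence_matrix(rank=2)   # edge-to-triangle boundary operator
print((B1 @ B2).toarray())         # boundary of a boundary: all zeros
```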

  • Wow this seems really exciting, thanks for the neat answer! :) I will have to look into this more. – h3fr43nd Feb 22 '24 at 15:31