# Manifold Learning

One of the perennial quant guessing games is speculating on RenTech (*e.g.* see the amusing *5-year* NP thread), particularly given the fascinating background of Jim Simons (see his arXiv for recent work on differential cohomology). Ignoring public commentary, whose veracity is obviously questionable, careful consideration of historical hiring trends and the corresponding employee backgrounds is suggestive. While such speculation is amusing, its potential relevance lies in *helping to filter which lines of research to explore*.

Specifically, several themes are consistent:

- Infrastructure / execution: computer scientists speak to the mundane realities of large-scale offline and online data management, risk management, multi-venue execution, and the usual collection of optimal execution concerns (particularly relevant for liquidity providing and statarb)
- Applied mathematics: “natural scientists”, with an emphasis on modern physics (much of which is built upon differential geometry and statistical mechanics), seem reasonable given heavy mathematical and statistical modeling
- High dimensionality: analysis and signal generation from high-dimensional spaces, which seems reasonable given many trading problems can elegantly be formulated in such a context; plus a deep well exists of both pure and applied math built by academia over the past 20 years; further, this makes obvious sense given Jim’s academic background (*e.g.* see Chern-Simons)
- Mixing models: RenTech grew through a combination of small acquisitions and internal development, suggesting “the predictive model” (historically referred to as “Basic System”) is not one but rather a collection of heterogeneous models which are dynamically overlaid and mixed; seems reasonable, given market regimes and consistent Medallion performance over the past 20 years
- Computational linguistics / NLP: numerous high-profile folks originated from speech recognition, in which numerous advancements over the past 30 years are based upon applied signal processing and statistical information theory (*e.g.* The Mathematics of Statistical Machine Translation, by Brown, Della Pietra, Della Pietra, and Mercer); a particularly consistent theme is HMMs (going back to the Dragon system by Baker in 1975), which naturally support mixing via HHMMs, and causal filtering (see also Berlekamp, who worked with Kelly)

Given this context, let’s return to the point: Partha Niyogi presented numerous times over the past few years (prior to his untimely death) on the topic of *manifold learning*. For example, he presented A Geometric Perspective on Learning Theory and Algorithms at the AMS/MAA/SIAM joint meeting (with collaborator Mikhail Belkin).

Manifold learning is particularly interesting as it lies at the intersection of many of the above themes. Specifically, the presentation theme is: *“Geometrically motivated approach to learning: nonlinear, nonparametric, high dimensions”*. This should ring bells.

The abstract tells more:

Increasingly, we face machine learning problems in very high dimensional spaces. We proceed with the intuition that although natural data lives in very high dimensions, they have relatively few degrees of freedom. One way to formalize this intuition is to model the data as lying on or near a low dimensional manifold embedded in the high dimensional space. This point of view leads to a new class of algorithms that are “manifold motivated” and a new set of theoretical questions that surround their analysis. A central construction in these algorithms is a graph or simplicial complex that is data-derived and we will relate the geometry of these to the geometry of the underlying manifold. Applications to embedding, clustering, classification, and semi-supervised learning will be considered.

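In other words, *formulating machine learning within the context of differential geometry*. One of the most common applications is nonlinear dimensionality reduction. This is relevant to quantitative modeling because, as with the high-dimensionality theme above, many trading problems live in high-dimensional spaces that plausibly have far fewer degrees of freedom.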
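To make the "data-derived graph" idea from the abstract concrete, here is a minimal sketch of one manifold-motivated algorithm, Laplacian eigenmaps (due to Belkin and Niyogi), written with NumPy only. It is an illustration under stated assumptions, not code from the presentations: the function name `laplacian_eigenmaps`, its parameters, the binary k-nearest-neighbor weights, and the helix test data are all choices made here for brevity. It also assumes the neighbor graph is connected.

```python
import numpy as np

def laplacian_eigenmaps(X, n_neighbors=10, n_components=2):
    """Embed the rows of X using eigenvectors of a kNN graph Laplacian.

    Hypothetical helper for illustration; assumes the kNN graph is connected.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # exclude each point from its own neighbors
    # Symmetrized k-nearest-neighbor graph with binary weights.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    deg = W.sum(axis=1)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Rescale to solve L y = lambda D y; skip the trivial first eigenvector.
    Y = d_inv_sqrt[:, None] * vecs[:, 1:n_components + 1]
    return Y, vals

# Toy data: a one-dimensional curve (helix) embedded in three dimensions.
t = np.linspace(0.0, 4.0 * np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t), 0.5 * t])
Y, vals = laplacian_eigenmaps(X, n_neighbors=10, n_components=2)
```

On this toy curve, the first embedding coordinate `Y[:, 0]` should track the position `t` along the helix (up to sign), which is the single underlying degree of freedom. Real data would typically call for distance-based edge weights and a sparse eigensolver rather than the dense matrices used here.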

For readers wanting to brave the mathematics, consider the following:

- A Geometric Perspective on Learning Theory and Algorithms
- Geometric Methods and Manifold Learning
- Manifold Learning
- Manifold Learning: With Applications to Object Recognition

Depending upon reader interest, subsequent post(s) may explore this topic further.

Hey Quantivity,

I’m very interested in further exploration of this topic, which is completely new to me. Keep up the great posts.

Thanks,

Trey

Hmm, haven’t looked at the links yet, but does the description remind anyone else of SVMs?

@Uzair: correct; SVM can be formalized as a special case of manifold learning; Partha discusses this relationship in his AMS/MAA/SIAM presentation.

Howdy. I’m 7 years late. How many firms around the world are trying to replicate RenTech? I would like to see a CAGR ranking of them.