# Trading the Unobservable

Security prices are driven by diverse factors and constraints, many of which are neither directly observable nor quantifiable by traders. Examples include fundamental (*e.g.* corporate actions), behavioral (*e.g.* herd mentality), financial (*e.g.* liquidity), macro (*e.g.* central bank intervention), and microstructure (*e.g.* market impact algos) factors. Yet, many classic quant models are formulated exclusively using variables which are directly observable: quotes, trades, prices, volumes, spreads, *etc*.

*This is an odd contradiction*.

Unraveling this contradiction is central to exploring market regimes, as they defy characterization by observable variables.

A short detour through a bit of mathematical intuition helps illustrate this contradiction and points a way towards potential solutions not found in classic time-series statistics. Consider the price $p_t$ for any security at tick time $t$:

$$p_t = f(O_{t-1}, U_{t-1})$$

In other words, the price of the security at time $t$ is determined by some function $f$ of observed ($O_{t-1}$) and unobserved ($U_{t-1}$) variables evaluated during the preceding time $t-1$. Admittedly, an apparent tautology if there ever was. Yet, from this simple equivalence, we can highlight the *a priori* constraints imposed by traditional time-series quant models:

- Probabilistic: the future cannot be predicted with certainty, thus $f$ is assumed to be probabilistic, usually drawing from a single distribution $D$ (usually analytically tractable)
- Independent and identically distributed (i.i.d.): values of $O$ are drawn from $D$, and assumed to be i.i.d.
- Observability: unobservable variables are excluded (since they cannot be quantified), thus $U$ is omitted (*i.e.* the empty set, $U_t = \emptyset$)
- Statistical significance: achieving statistical significance requires a minimum set of observations, thus $f$ is commonly assumed the same for long contiguous sequences of $t$, if not all (*i.e.* $f_t = f$)
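To make the single-distribution and i.i.d. constraints concrete, here is a minimal sketch (synthetic data, purely illustrative): returns are generated by two distinct volatility regimes, yet a classic model fits one Gaussian $D$ to the whole history of $O$, matching neither regime.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic returns from two regimes: the generating process switches,
# but a classic model fits a single i.i.d. distribution D to all of O.
calm = rng.normal(0.0, 0.01, 500)      # low-volatility regime
stressed = rng.normal(0.0, 0.03, 500)  # high-volatility regime
returns = np.concatenate([calm, stressed])

# Classic i.i.d. assumption: one Gaussian for the entire sample
mu, sigma = returns.mean(), returns.std(ddof=1)
print(f"single-distribution fit: mu={mu:.5f}, sigma={sigma:.5f}")

# The pooled sigma sits between the two regime sigmas, matching neither
print(f"regime sigmas: calm={calm.std(ddof=1):.5f}, "
      f"stressed={stressed.std(ddof=1):.5f}")
```

The pooled estimate is a compromise that understates risk in the stressed regime and overstates it in the calm one, which is precisely the cost of assuming $f_t = f$ for the entire sequence.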

These constraints are compounded by two unfortunate facts:

- Unknown $f$: the true function $f$ is not known to the model, as it is unknowable
- Lack of objective function: there is no quantitative way to know how to improve any given model $\hat{f}$ relative to $f$, as $f$ is unknown

These result in the following methodological problems when seeking to choose one quant model over another for purposes of profitable trading:

- Heuristic: models are ultimately “best educated guesses”
- Non-optimizable: not generally amenable to techniques from mathematical optimization
- Data snooping: data snooping is inevitable, due to the lack of an unambiguous fitness statistic (*e.g.* Sharpe ratio)
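The data snooping point can be illustrated with a small sketch (synthetic, zero-edge "strategies"; the annualization convention is an assumption): when the Sharpe ratio is used as the fitness statistic over many candidate models, the best in-sample candidate looks skillful even when no candidate has any edge at all.

```python
import numpy as np

rng = np.random.default_rng(2)

def sharpe(returns):
    # Annualized Sharpe ratio, assuming daily returns and zero risk-free rate
    return np.sqrt(252) * returns.mean() / returns.std(ddof=1)

# 200 "strategies", each one year of pure noise: zero true edge
noise = rng.normal(0.0, 0.01, size=(200, 252))
sharpes = np.array([sharpe(r) for r in noise])

# Selecting the best in-sample Sharpe is data snooping:
# the winner looks impressive despite being indistinguishable from luck
print(f"best in-sample Sharpe: {sharpes.max():.2f}")
print(f"mean Sharpe across strategies: {sharpes.mean():.2f}")
```

The mean Sharpe is near zero, as it should be, but the maximum over 200 candidates is reliably large; selecting on it amounts to fitting noise.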

Yet, not all is lost. There is a beautiful mathematical trick:

$$p_t = f(O_{t-1}, Z_{t-1})$$

Where $Z_{t-1}$ are *unobserved variables* (usually estimated in a state space) and the system is evaluated with Bayesian inference.

This trick is conceptually simple, yet manifests in myriad elegant ways, from hidden Markov models and principal components to Kalman / particle filters and state space models. So many ways, in fact, that the fledgling discipline of machine learning is seeking to unify them.
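As one concrete instance of the trick, consider a minimal scalar Kalman filter for a local-level model (a sketch on synthetic data; the variances `q` and `r` are illustrative assumptions, not calibrated values): the latent level $Z_t$ is never observed directly, yet recursive Bayesian updating recovers it from noisy observations.

```python
import numpy as np

def kalman_local_level(y, q=1e-4, r=1e-2):
    """Scalar Kalman filter for a local-level model:
    latent level z_t = z_{t-1} + w_t (var q), observed y_t = z_t + v_t (var r)."""
    n = len(y)
    z = np.zeros(n)    # filtered estimates of the latent level
    p = 1.0            # state variance (diffuse-ish initialization)
    z_hat = y[0]
    for t in range(n):
        p_pred = p + q                    # predict: level is a random walk
        k = p_pred / (p_pred + r)         # Kalman gain
        z_hat = z_hat + k * (y[t] - z_hat)  # update: blend prediction and data
        p = (1 - k) * p_pred
        z[t] = z_hat
    return z

rng = np.random.default_rng(1)
true_level = 100.0 + np.cumsum(rng.normal(0, 0.01, 300))  # unobserved Z
observed = true_level + rng.normal(0, 0.1, 300)           # noisy "prices" O
filtered = kalman_local_level(observed, q=1e-4, r=1e-2)

print(f"rmse raw:      {np.sqrt(np.mean((observed - true_level)**2)):.4f}")
print(f"rmse filtered: {np.sqrt(np.mean((filtered - true_level)**2)):.4f}")
```

The filtered estimate tracks the unobserved level substantially better than the raw observations do, despite the model never seeing $Z_t$ itself.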

Yet, machine learning (ML) remains shrouded in mystery and corresponding intrigue for many traders. Although numerous causes may be at root, two seem to regularly stand out:

- Incoherent intuition: ML lacks coherent unifying intuition, due to non-standardized terminology (both conceptual and symbolic) and conceptual disjointness arising from its intersection of Bayesian statistics, mathematical optimization, information theory, dynamical systems, computer science, control theory, and decision theory
- Practical application: going from ML theory to trading algos is technically non-trivial (due to both mathematical and computational complexity) and the ML trading literature is relatively sparse and specialized

Or, as summed up nicely by Gappy in a recent post comment, many modern quantitative trading techniques:

“Take too long to learn how to apply the concepts well, and it’s all too easy to misapply them.”

Given that understanding market regimes depends in part upon quantifying the unobservable, a subsequent series of posts will seek to illuminate selected ML techniques. Motivated by the above discussion, the first post in this series will introduce two elegant and ubiquitous workhorses of machine learning: expectation maximization and the Kullback-Leibler divergence.
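As a small preview of the second workhorse, here is a self-contained sketch of the Kullback-Leibler divergence for discrete distributions (the probabilities are hypothetical, chosen only for illustration):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions, in nats."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(kl_divergence(p, q))  # small and positive: q is close to p
print(kl_divergence(p, p))  # 0.0: a distribution has zero divergence from itself
print(kl_divergence(q, p))  # note the asymmetry: KL(q||p) != KL(p||q) in general
```

Two properties matter for what follows: the divergence is non-negative, vanishing only when the distributions coincide, and it is not symmetric, so it is not a metric in the usual sense.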

I enjoy reading your posts and seeing a well-written viewpoint from a mathematician and economist. I think you hit the nail on the head in the sense that it is extremely difficult to even begin to isolate the myriad factors that move any particular segment of the market at any given time, let alone have observable access to a fraction of them. Having studied econometrics and time series, I often find that although there is a rigorous approach to modeling and fitting series down to minute details (for instance, making certain there is no serial correlation in the residuals of AR, ARMA models, etc.), these models only serve to describe the past and, at best, overfit the future. I haven’t found too many arguments to demonstrate that they perform much better than common TA on out-of-sample data. Some of the best results that come from ML deal with averaging and ensemble-type methodologies rather than precise modeling (the Netflix prize is a good example), which, in a sense, mirrors following average ‘trends’ and ‘momentum’ rather than tracking high-frequency fluctuations.

I look forward to seeing some of your discussions on Markov chains and applications towards regime switching. I truly wish that the academic texts in this area were more focused on practical applications that users could build, rather than abstract explanations that are difficult to build, observe, and verify.

Most state space model approaches suffer tremendously in this respect (Kalman, particle, etc.).

@ intelligent trading: thanks for your comments; I agree. A key part remains admitted trial-and-error in the formulation, whether algo selection or latent variable structure. Also, as another reader pointed out: the literature is rarely directly applicable, and thus requires customization of either/both mathematics or computational implementation; for example, assumptions of either/both normality and i.i.d. remain pervasive (e.g. standard Q in EM).

nice blog, but please, you didn’t need to look up ‘coagulating’ in the thesaurus to impress us.

@Bob: thanks for the compliment. Let me clarify my word choice (with edit to original), emphasizing my intended meaning: (a) ML theoretical literature is immature and often lacks coherent intuition, making it non-trivial to adapt algos to specific trading problems; (b) the literature lacks practical trading use cases, as compared with, say, well-worn PCA and residual analysis for statarb. Of course, it is precisely this immaturity which makes the discipline so interesting to apply to trading.