Delay Embedding as Regime Signal

February 24, 2011

Infantino and Itzhaki, in their 2010 thesis Developing High-Frequency Equities Trading Models, use a regime switching signal based upon time delay embedding. The intuition underlying this signal, and its use for regime discovery, are unexpectedly interesting.

Conceptually, their signal is framed within the context of a two-state switching regime (interpreted in the classic technical sense): “momentum” or “mean reversion”. With high-frequency equities portfolio data, they informally observe the familiar volatility-regime correlation: high volatility implies momentum (e.g. herd effects), low volatility implies mean reversion (e.g. market making).

In their words: “As the short-term changes in $\sigma_D$ appeared to be more pronounced — identified by very narrow peaks in the $\sigma_D$ time series — cumulative returns from the basic mean-reverting strategy seemed to decrease” (p. 44). Note $\sigma_D$ is a measure of cross-sectional volatility on dimensionally reduced returns (i.e. standard deviation of returns projected on dominant PCA eigenvectors). This relationship is illustrated in the right graphic.

How they translate this intuitive volatility-regime correlation into a switching signal is the fun part. They define the first difference of $\sigma_D$ as $\psi$, then define the following distance metric (illustrated in the left image):

$E_H (t) = \sqrt{\sum\limits_{i=1}^H{[\psi(t - i)]^2}} = \sqrt{\sum\limits_{i=1}^H{[\sigma_D(t - i) - \sigma_D (t - i - 1)]^2}}$

This is an interesting starting point, as dynamical systems theory reminds us that $\psi(t - i)$ is a phase space reconstruction for $\sigma_D$, given $\psi$ is the delayed chain of discrete i.i.d. steps walking backwards in time for $\sigma_D$. In other words, for each observed time $t$, the following vector reconstructs the volatility, and $E_H(t)$ is its Euclidean length:

$[ \psi(t - 1), \psi(t - 2), \cdots , \psi(t - H) ]$
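To make the indexing concrete, here is a minimal sketch that builds the delay vector and computes $E_H(t)$ as its Euclidean norm. The function names (`psi`, `delay_vector`, `e_h`) and the sample $\sigma_D$ values are made up for illustration; they are not taken from the thesis.

```python
import math

def psi(sigma_d):
    """First difference of sigma_D: psi(k) = sigma_D(k) - sigma_D(k - 1).

    Index 0 holds None so that psi(k) lines up with sigma_d[k].
    """
    return [None] + [sigma_d[k] - sigma_d[k - 1] for k in range(1, len(sigma_d))]

def delay_vector(psi_series, t, H):
    """Return [psi(t - 1), psi(t - 2), ..., psi(t - H)]."""
    if t - H < 1 or t - 1 >= len(psi_series):
        raise ValueError("not enough history for this t and H")
    return [psi_series[t - i] for i in range(1, H + 1)]

def e_h(sigma_d, t, H):
    """E_H(t): Euclidean norm of the delay vector at time t."""
    vec = delay_vector(psi(sigma_d), t, H)
    return math.sqrt(sum(x * x for x in vec))

# Hypothetical sigma_D observations, chosen only to keep the arithmetic simple.
sigma_d = [1.0, 1.5, 1.25, 2.0, 1.75]
vec = delay_vector(psi(sigma_d), t=4, H=3)   # [psi(3), psi(2), psi(1)]
norm = e_h(sigma_d, t=4, H=3)
```

Note that $E_H(t)$ only touches observations up to $t - 1$, so the sketch allows $t$ to be one past the last available index of `sigma_d`.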

From which they define a binary regime signal $\omega$ as the positive first difference of $E_H(t)$ :

$\omega = [ E_H(t) - E_H(t - 1) ] > 0$

The regime switch follows directly: when $\omega$ holds (i.e. $E_H(t) > E_H(t - 1)$), volatility is increasing and thus a “momentum” regime is appropriate. Conversely, when $\omega$ does not hold (i.e. $E_H(t) \le E_H(t - 1)$), volatility is decreasing and thus a “mean-reverting” regime is appropriate.
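The switching rule can be sketched in a few lines, assuming a series of $E_H$ values has already been computed. The function name `regime_signal`, the label strings, and the sample values are illustrative assumptions, not the authors' code.

```python
def regime_signal(e_h_series):
    """Label each time step from the first difference of E_H.

    The result is aligned with e_h_series; entry 0 is None because
    no previous value exists to difference against.
    """
    labels = [None]
    for prev, curr in zip(e_h_series, e_h_series[1:]):
        omega = (curr - prev) > 0          # strictly positive first difference
        labels.append("momentum" if omega else "mean_reversion")
    return labels

# Hypothetical E_H values.
labels = regime_signal([0.9, 1.1, 1.1, 0.7])
```

A tie ($E_H(t) = E_H(t - 1)$) falls on the mean-reverting side, since the condition requires a strictly positive difference.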

This signal is quite interesting when considered within the larger context of several familiar time series analysis traditions:

Time delay embedding: $\psi(t - i)$ is a delay embedding of $\sigma_D$ , and thus benefits from classic theorems of Takens, Mañé, and Sauer et al.
Frequency analysis: delay embedding hints at potential applicability of frequency techniques from signal processing, such as singular spectrum analysis
Distance metrics: $E_H(t)$ is indeed the familiar Euclidean distance metric, and thus invites consideration of non-Euclidean metric spaces and reframing the notion of temporal distance, such as via dynamic time warping
Markov chains: interesting questions arise when considering the structure of $\psi$ , such as whether it is Markovian and thus may benefit from corresponding Markov chain / HMM machinery
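As one illustration of the distance-metric point above, here is a minimal sketch of textbook dynamic time warping with an absolute-difference local cost. How DTW would actually be applied to the $\psi$ delay vectors is an open question; the thesis does not use it, and the sample sequences are made up.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Local cost is |x - y|; each step may advance in a, in b, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]

# Two hypothetical sequences with the same shape, shifted by one step.
a = [0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 0.0, 1.0, 2.0, 1.0]
shifted = dtw_distance(a, b)
```

Unlike the pointwise Euclidean distance, the warping path can absorb the one-step shift between `a` and `b`, leaving only the mismatch at the final sample.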

Undoubtedly not by accident, the authors conveniently omit their choice of embedding dimension $H$ . Such is presumably left as an exercise for the reader, as selecting optimal embedding dimension is indeed well-known to be one of the most significant challenges in reconstruction.
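One crude way to see why the choice of $H$ matters is to compute the regime labels for two candidate values and measure how often they disagree. The sketch below does only that; it is not a method for selecting an optimal embedding dimension, and the names `regime_labels` and `disagreement` and the sample data are assumptions for illustration.

```python
import math

def regime_labels(sigma_d, H):
    """Map each usable time t to True (momentum) or False (mean reversion)."""
    # diffs[j] holds psi(j + 1) = sigma_d[j + 1] - sigma_d[j].
    diffs = [sigma_d[k] - sigma_d[k - 1] for k in range(1, len(sigma_d))]

    def e_h(t):
        # Sum of psi(t - i)^2 for i = 1..H, then the square root.
        return math.sqrt(sum(diffs[t - i - 1] ** 2 for i in range(1, H + 1)))

    # E_H(t) needs t >= H + 1; the first difference also needs E_H(t - 1).
    return {t: e_h(t) > e_h(t - 1) for t in range(H + 2, len(sigma_d) + 1)}

def disagreement(sigma_d, h_a, h_b):
    """Fraction of shared time steps on which two choices of H disagree."""
    a, b = regime_labels(sigma_d, h_a), regime_labels(sigma_d, h_b)
    common = sorted(set(a) & set(b))
    return sum(a[t] != b[t] for t in common) / len(common)

# Hypothetical sigma_D observations.
sigma_d = [1.0, 1.5, 1.25, 2.0, 1.75, 1.75, 2.5, 2.25]
frac = disagreement(sigma_d, 1, 3)
```

On real data, a high disagreement fraction between neighbouring values of $H$ would suggest the signal is sensitive to the window and that the choice deserves more care.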

11 Comments

Aleksey

March 1, 2011 3:42 pm

I really find this approach interesting, and I decided to experiment with it a bit. If I achieve anything interesting, I will share with you via email. Thanks for the nice blog!

- Quantivity
  
  March 1, 2011 9:55 pm
  
  @Aleksey: glad you found it useful; I look forward to hearing if you have similarly positive results. Also, I am drafting a follow-up post describing the cross-sectional volatility metric in more detail, including drilling into the principal component space from which it originates.
  
Scott Locklin

April 12, 2011 7:25 pm

I’ve not looked at SSD, though its use by climatologists makes it suspect to me (HH-transforms are on my short list of things to look at though). On the other hand, I have been using this “trick” for a while now, with some success. Carol Alexander wrote about it in one of her books, but I kind of thought of it because I wrote a dissertation which talked a bit about the trace formula, and I thought it was an obvious thing to try.

- quantivity
  
  April 12, 2011 10:25 pm
  
  @Scott: Curious to hear what problem domains you have found success with embedding, as it’s a rather general technique.
  
  Re SSD: am I correct in presuming from context that you mean SSA (as first step of SSA is embedding, and climatology research is a frequent user), rather than SSD? I have found the necessity of a priori parameter specification to be an annoyance with applying SSA to trading. While specifying window length is modestly annoying (but not conceptually different than selecting trading clock and time-series sample length), having to evaluate separability for use in X_i groupings gives it an unpleasant whiff of data snooping.
  
Maxim

June 29, 2012 12:51 pm

Thank you for initiating this discussion. I have one question. It seems to me that your definition of E_H(t) is different from the paper’s. Or, to be precise, your interpretation of Psi(t-i). You define Psi(t-i)=sigma_D(t) – sigma_D(t-i), when the paper defines it simply as Psi=d sigma_D / dt, which I believe implies Psi(t-i)=sigma_D(t-i) – sigma_D(t-i-1).
Am I missing something here?

- quantivity
  
  June 30, 2012 9:25 am
  
  @Maxim: correct, thanks for comment. Post updated accordingly.
  
  - Maxim
    
    July 2, 2012 7:24 am
    
    @quantivity: thank you. Do you have any idea for the reasonable range for H?
  - quantivity
    
    July 2, 2012 9:21 pm
    
    Having not optimized on live data, my speculation is that the optimal H varies dynamically with tick velocity.
manish

October 8, 2015 7:28 pm

The authors state in the paper that the value of H is the same as the accumulation parameter they chose earlier for the regression.
