
Direction of Change Forecasting

January 16, 2012

Index Return Decomposition prompted several readers to inquire about forecasting the signs of returns, as implied by the s_t decomposition variable. This is an interesting topic worth reviewing, with a quick survey of intuition from the literature and some R code for exploratory analysis.

This topic is known as direction-of-change forecasting in the literature. Needless to say, successful prediction of the sign of future returns is quite interesting from a trading perspective. Traditionally, only univariate return series were considered; Anatolyev (2008) is an exception, modeling two or more interrelated markets via dependence ratios. This literature tends to be a bit abstruse, due to commonly unstated stylized assumptions regarding conditional return dynamics.

The traditional formulation for this topic considers the estimation of the probabilities of returns exceeding an upper or lower threshold c, optionally conditioned on an information set I from the previous time step:

   U_t(c) = \text{Pr}(r_t > c | I_{t-1})
   D_t(c) = \text{Pr}(r_t < -c | I_{t-1})

For trading, a natural choice for c is roundtrip transaction costs. If c = 0, the probabilities reduce to forecasting positive or negative returns:

   U_t(0) = \text{Pr}(r_t > 0 | I_{t-1})
   D_t(0) = \text{Pr}(r_t < 0 | I_{t-1})
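
As a rough baseline before any conditioning, the unconditional versions of these probabilities can be estimated as sample frequencies. The sketch below uses simulated returns (an assumption for illustration, not data from this post); the function name `directionFreq` and the cost level 0.002 are hypothetical:

```r
# Unconditional sample-frequency estimates of U(c) and D(c).
directionFreq <- function(r, bound=0)
{
  c(up=mean(r > bound), down=mean(r < -bound))
}

set.seed(42)
r <- rnorm(1000, mean=0.0005, sd=0.01)  # simulated daily returns (assumption)
freq0 <- directionFreq(r)               # c = 0
freqC <- directionFreq(r, bound=0.002)  # c = hypothetical roundtrip cost
```

With c > 0 the two frequencies no longer sum to one, since returns inside the band (-c, c) count as neither up nor down.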

Estimating these probabilities can be undertaken via several techniques. One approach is to use a logit model, based upon the logistic function:

   U_t(c) = \frac{\exp(\boldsymbol{\theta}_t)}{1 + \exp(\boldsymbol{\theta}_t)} : \boldsymbol{\theta}_t = \hat{\boldsymbol{\beta}} \textbf{X}_t

Where \textbf{X}_t \in I_{t-1} are explanatory variables from the previous time step. The challenge with this model is proper selection of explanatory variables.
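
A minimal sketch of such a logit fit using base R's `glm`, on simulated returns. The explanatory variables here (lagged return and lagged absolute return) are hypothetical placeholders chosen only to make the example run, not a recommended specification:

```r
# Logit sketch: regress the up indicator on variables from the previous step.
set.seed(1)
n <- 500
r <- rnorm(n, mean=0, sd=0.01)          # simulated returns (assumption)

# Hypothetical explanatory variables, both known at time t-1
d <- data.frame(up     = as.integer(r[-1] > 0),
                lagRet = r[-n],
                lagAbs = abs(r[-n]))

fit <- glm(up ~ lagRet + lagAbs, data=d, family=binomial(link="logit"))

# Fitted U_t(0): logistic function applied to the linear predictor theta_t
theta <- predict(fit, type="link")
probUp <- exp(theta) / (1 + exp(theta))
```

On simulated noise the fitted probabilities should hover near 0.5; any real use would need out-of-sample evaluation.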

An alternative approach is to consider the following functional decomposition for univariate return series:

   r_t = \mu_t + \sigma_t \epsilon_t

Where \mu_t = \text{E} (r_t | I_{t-1}) is the conditional expected value, \sigma_t^2 = \text{Var}(r_t | I_{t-1}) is the conditional variance, and \epsilon_t is a martingale difference sequence with zero mean, unit variance, and conditional distribution function F_{\epsilon}(\cdot | I_{t-1}). From this decomposition, the direction-of-change probabilities can be expressed:

   U_t(c) = \text{Pr} \left[ (\mu_t + \sigma_t \epsilon_t) > c \right] = \text{Pr} \left[ \epsilon_t > \left( \frac{c - \mu_t}{\sigma_t} \right) \right]

   D_t(c) = \text{Pr} \left[ (\mu_t + \sigma_t \epsilon_t) < -c \right] = \text{Pr} \left[ \epsilon_t < \left( \frac{-c - \mu_t}{\sigma_t} \right) \right]

With corresponding conditional expectations:

   \text{E} \left[U_t(c) | I_{t-1} \right] = 1 - F_r(c | I_{t-1}) = 1 - F_{\epsilon} \left( \frac{c - \mu_t}{\sigma_t} | I_{t-1} \right)
   \text{E} \left[D_t(c) | I_{t-1} \right] = F_r(-c | I_{t-1}) = F_{\epsilon} \left( \frac{-c - \mu_t}{\sigma_t} | I_{t-1} \right)

These expectations simplify to the following when c = 0, assuming \mu_t \ne 0 (otherwise, the expectation is constant and thus uninteresting):

   \text{E} \left[U_t(0) | I_{t-1} \right] = 1 - F_{\epsilon} \left( \frac{- \mu_t}{\sigma_t} | I_{t-1} \right)
   \text{E} \left[D_t(0) | I_{t-1} \right] = F_{\epsilon} \left( \frac{- \mu_t}{\sigma_t} | I_{t-1} \right)
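
A minimal sketch of the expressions above, assuming (purely for illustration) that \epsilon_t is standard normal. The function name `directionProb` and the forecast values passed to it are hypothetical:

```r
# U_t(c) and D_t(c) from r_t = mu_t + sigma_t * epsilon_t.
# Assumption: epsilon_t ~ N(0,1); pass a different cdf to change this.
directionProb <- function(mu, sigma, bound=0, cdf=pnorm)
{
  list(up   = 1 - cdf((bound - mu) / sigma),
       down = cdf((-bound - mu) / sigma))
}

p <- directionProb(mu=0.0005, sigma=0.01)                   # c = 0
pCost <- directionProb(mu=0.0005, sigma=0.01, bound=0.002)  # c > 0
```

Note that with \mu_t fixed and nonzero, the probabilities still move with \sigma_t, so volatility forecasts alone can drive sign forecasts.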

These expectations can be evaluated explicitly by calculating the empirical distribution function \hat{F}_{\epsilon} of the standardized returns (which avoids assuming a parametric distribution), where \mathbb{I} is the indicator function and the sum runs over k past observations indexed by j:

   \hat{F}_{\epsilon} \left( \frac{- \mu_t}{\sigma_t} | I_{t-1} \right) = \frac{1}{k} \sum\limits_{j=1}^k \mathbb{I} \left( \frac{r_j - \mu_j}{\sigma_j} \le \frac{-\mu_t}{\sigma_t} \right)
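
The empirical distribution function can be sketched in a few lines of base R. As placeholders, the sample mean and standard deviation stand in for the conditional \mu and \sigma, which a real model would supply; the returns are simulated and all variable names are hypothetical:

```r
# Empirical distribution function of standardized returns.
set.seed(7)
k <- 750
r <- rnorm(k, mean=0.0005, sd=0.01)   # simulated past returns (assumption)

# Placeholder estimates: constants stand in for conditional mu and sigma
mu <- rep(mean(r), k)
sigma <- rep(sd(r), k)
z <- (r - mu) / sigma                  # standardized returns

# Hypothetical next-step forecasts of mu and sigma
muNext <- mean(r)
sigmaNext <- sd(r)

threshold <- -muNext / sigmaNext
probDown <- mean(z <= threshold)       # F-hat evaluated at the threshold
probUp <- 1 - probDown
```

The same value is returned by `ecdf(z)(threshold)`.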

Alternatively, this decomposition suggests one potential formulation for the logit parameter \boldsymbol{\theta}_t from the above model, where \hat{\mu_t} is estimated by the logit and \hat{\sigma_t} is estimated from historical observations:

   \boldsymbol{\theta}_t = \frac{\hat{\mu_t}}{\hat{\sigma_t}}

Of course, the non-trivial work is generating forecast estimates for the next-step conditional mean return \hat{\mu_t} and conditional volatility \hat{\sigma_t}.
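
Once such forecasts exist, mapping them to an up probability is mechanical. A small sketch, with made-up forecast values (assumptions, not estimates from any data):

```r
# Map theta_t = mu_t / sigma_t through the logistic function to get U_t(0).
mu <- c(-0.001, 0, 0.001)       # hypothetical conditional mean forecasts
sigma <- c(0.01, 0.01, 0.02)    # hypothetical conditional volatility forecasts
theta <- mu / sigma
probUp <- plogis(theta)         # same as exp(theta) / (1 + exp(theta))
```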

An alternative way to model the logit parameters \boldsymbol{\theta}_t is to apply ARMA intuition with a binary autoregression (BARMA) due to Startz (2006), including lags of both \theta_t itself and past indicator values I_{t-j} (here denoting the sign indicator, not the information set), due to Anatolyev (2008):

   \boldsymbol{\theta}_t = w + \sum\limits_{j=1}^p \alpha_j \theta_{t-j} + \sum\limits_{j=1}^p \beta_j I_{t-j}
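
The recursion can be sketched as a simple filter for given parameters. This does not estimate anything: the parameter values are hypothetical, the returns are simulated, the function name `barmaTheta` is made up, and initializing the leading values of \theta at zero is an assumption. The lag orders are taken from the lengths of `alpha` and `beta`, which may differ:

```r
# Filter theta_t through the BARMA-style recursion for given parameters.
barmaTheta <- function(ind, w, alpha, beta, init=0)
{
  p <- length(alpha)               # lags of theta
  q <- length(beta)                # lags of the indicator
  m <- max(p, q)
  n <- length(ind)
  theta <- rep(init, n)            # first m values stay at the starting value
  if (n > m)
  {
    for (i in (m + 1):n)
    {
      theta[i] <- w + sum(alpha * theta[i - seq_len(p)]) +
                      sum(beta * ind[i - seq_len(q)])
    }
  }
  theta
}

set.seed(3)
r <- rnorm(250, sd=0.01)                               # simulated returns
ind <- as.integer(r > 0)                               # indicator of an up move
theta <- barmaTheta(ind, w=-0.1, alpha=0.5, beta=0.2)  # hypothetical parameters
probUp <- plogis(theta)                                # implied U_t(0)
```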

A survey of estimation techniques may be considered in a subsequent post, depending on reader interest.


One exploratory analysis technique relevant to sign forecasting is visualizing up/down runs, signed difference (i.e. up-down), and corresponding averages for a return series.

library(xts)  # provides xts() and index()

returnRuns <- function(r, bound=0, doPlot=TRUE, startAvg=5, avgLen=-1)
{
  # Generate up/down runs and average runs for a return series, optionally
  # plotting them.
  #
  # Args:
  #   r: return series (xts)
  #   bound: symmetric upper and lower bound, aka c
  #   doPlot: flag indicating whether plots should be generated for runs
  #   startAvg: index of first average to plot; earlier averages are excluded
  #             because they are unstable with few leading observations
  #   avgLen: number of periods over which to generate average; or -1 for
  #           entire period
  #
  # Returns: list with cumulative counts (up, down) and average
  #          frequencies (avgUp, avgDown)
  
  runColors <- c("black", "red")  # up, down
  
  # Running counts of returns above the upper and below the lower bound
  up <- cumsum(ifelse(r > bound, 1, 0))
  down <- cumsum(ifelse(r < -bound, 1, 0))
  
  if (doPlot)
  {
    plot(up, main='Signed Runs: Up & Down', ylim=range(up,down))
    lines(down, col='red')
    legend("topleft",legend=c("Up","Down"), fill=runColors, cex=0.5)
    
    plot(up-down, main="Signed Run Difference (up-down)")
  }
  
  if (avgLen == -1)
  {
    # Running average over the entire period: cumulative count / observations
    avgUp <- xts(as.numeric(up) / seq_along(up), order.by=index(up))
    avgDown <- xts(as.numeric(down) / seq_along(down), order.by=index(down))
  } else
  {
    # Rolling average over the trailing avgLen periods
    idx <- index(up)[avgLen:length(up)]
    avgUp <- xts(sapply(avgLen:length(up), function(i) {
      start <- i - avgLen + 1
      sum(r[start:i] > bound) / avgLen
    }), order.by=idx)
    avgDown <- xts(sapply(avgLen:length(down), function(i) {
      start <- i - avgLen + 1
      sum(r[start:i] < -bound) / avgLen
    }), order.by=idx)
  }
  
  if (doPlot)
  {
    n <- length(avgUp)
    plot(avgUp[startAvg:n], main=paste("Average Runs: Up & Down (",avgLen," periods)",sep=""), type='l', ylim=range(avgUp,avgDown))
    lines(avgDown[startAvg:n], col='red')
    legend("topleft",legend=c("Up","Down"), fill=runColors, cex=0.5)
  }
      
  return (list(up=up, down=down, avgUp=avgUp, avgDown=avgDown))
}

For example, the following plots illustrate CRM run dynamics from 2005 to present. The first plot illustrates the running sums for both up and down returns, indicating negative returns are more prevalent:

The second plot illustrates the difference in signed sums, showing time-dynamics for the difference in up and down returns. Not surprisingly, this difference mirrors the CRM price curve closely:

The third plot illustrates the average probabilities for both up and down, running incrementally over the entire timeframe:


The following are representative papers from the direction-of-change literature, ignoring the early papers focused on evaluating market efficiency (e.g. run tests):

Finally, Kinlay briefly surveyed this topic in two posts on volatility sign prediction.

8 Comments
  1. sethm permalink
    January 16, 2012 9:42 pm

    Excellent post. My research into direction of change forecasting has revolved around simpler, intuitive use of conditional probabilities based on volatility, LIBOR, etc (largely similar to Kinlay’s posts). Looking forward to delving into more advanced treatment of the subject via the papers you linked to. Would be very interested in any further treatment on this blog.

    Also an important paper on volatility that might be of interest here is: Forecasting Volatility in Financial Markets: A Review, Poon and Granger (2003)

    • quantivity permalink*
      January 16, 2012 10:10 pm

      @sethm: thanks for compliment; in your research, do you find success with those explanatory variables; if so, curious for what types of equities?

      Thanks for paper reference. Volatility forecasting is a beautiful topic; my appetite for lit survey is only tempered by its massive scope.

      • sethm permalink
        January 16, 2012 10:42 pm

        For equity, I have stuck to indices. Thus far I have been unable to reliably forecast volatility for individual stocks. I’ve found that the volatility of volatility of indices to be lower and thus more readily forecastable.

  2. tyler permalink
    June 8, 2012 9:16 pm

    Thanks for the post. Happened upon it after happening upon your blog. Looking into whether these methodologies would be useful for forecasting pairwise relative returns among asset classes over longer periods, e.g, ~4 weeks, for purposes of tilting allocation levels within an overall portfolio. Just delving in now and this is a great start. Please do keep it up, your site will definitely go onto my favorites.

  3. August 6, 2012 9:17 am

    Hi

    There has been lots of quantitative finance modeling and techniques available from books and on the net but most of them are focused on helping people to find a sell-side job in investment bank. This blog is very unique that it applies those techniques from retail investors perspective.

    Thank you so much for sharing your knowledge.

    I have a question regarding this formula (hope the HTML tag displays…)

    I guess the sigma(t) is probably estimated by historical methods such as GARCH or EWMA or looking at the options IV, but how should the mu(t) be estimated?

    I have been looking at the HSI index in Hong Kong market and most of the time the student’s T test told me the daily mean return is zero.

    Thanks a lot.
    Paul

    • August 6, 2012 9:19 am

      oops the HTML does not show up. I mean this formula

      http://s0.wp.com/latex.php?latex=r_t+%3D+%5Cmu_t+%2B+%5Csigma_t+%5Cepsilon_t+&bg=ffffff&fg=333333&s=0

      Thanks

    • quantivity permalink*
      August 6, 2012 8:53 pm

      @plchung3: thanks for your kind comments.

      Re HSI mu: zero daily mean return is not surprising. Use of a longer sample duration is more likely to generate non-zero mean return, whose sign will depend on market regime(s) prevailing during estimation period.

      Logit estimation of mu is one approach, as mentioned above; other approaches exist (e.g. quantile estimation has shown promising results). Recall the literature has consistently demonstrated much higher accuracy in forecasting volatility (esp. conditional) than mean return (whether conditional or unconditional), hence many folks focus on volatility. Note this is the same rationale which motivates use of minimum variance portfolios.
