This post begins by identifying *macroeconomic market regimes arising from multi-asset economic activity*.

One big challenge in analyzing market regimes is *identification*, as they are not directly observable. As an unsupervised statistical learning problem, there is no verifiable “right answer”. Beyond identifying regimes, we also want to know the probability of being in a given regime at any given point in time. Finally, our economic activity is measured via time series.

Fortunately, a standard ML technique exists which possesses these attributes: the *hidden Markov model* (HMM). For readers unfamiliar with HMMs, here is a brief summary of the theory relevant to our problem. See Hidden Markov Models for Time Series by Zucchini and MacDonald (2009) for more details.

HMMs are useful because they estimate both unobserved regimes and corresponding probabilities of being in each regime at every point in time. The latter is termed “local decoding”, and expressed as the conditional state probabilities:

$$\Pr\left(C_t = i \mid \mathbf{X}^{(T)} = \mathbf{x}^{(T)}\right)$$

In other words, the probability of being in regime $i$ at time $t$, given the observed data $\mathbf{x}^{(T)}$.

Here $\mathbf{X}^{(T)} = (X_1, \ldots, X_T)$ is the observed time-series data and $\{C_t\}$ is an unobserved parameter process whose conditional probability at time $t$ depends at most on the previous time $t-1$, also known as the Markov property:

$$\Pr\left(C_t \mid C_{t-1}, \ldots, C_1\right) = \Pr\left(C_t \mid C_{t-1}\right)$$

Thus, $\{C_t\}$ is an unobservable Markov chain, because of which the conditional probability at time $t$ nicely reduces to (see equation 5.6 in Zucchini, derived from 4.9):

$$\Pr\left(C_t = i \mid \mathbf{X}^{(T)} = \mathbf{x}^{(T)}\right) = \frac{\alpha_t(i)\,\beta_t(i)}{L_T}$$

Here $\alpha_t(i)$ and $\beta_t(i)$ are the forward and backward probabilities along the Markov chain at time $t$ (estimated via iterative maximum likelihood using expectation maximization), and $L_T$ is the likelihood at time $T$.

For those who prefer code, where `val` is the observed time series and `numOfStates` is the number of unobserved regimes:

```r
hmmFit <- HMMFit(val, nStates=numOfStates)
fb <- forwardBackward(hmmFit, val)
eu <- exp(fb$Alpha + fb$Beta - fb$LL)
```
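The local decoding formula can also be illustrated end-to-end in base R. Below is a toy sketch, assuming a two-state Gaussian HMM with known (not fitted) parameters; `localDecode`, the parameter values, and the simulated series are all hypothetical illustrations, not part of the original analysis:

```r
localDecode <- function(x, Gamma, mu, sigma, delta) {
  # Forward-backward local decoding: Pr(C_t = i | x) = alpha_t(i) beta_t(i) / L_T
  n <- length(x); m <- length(mu)
  p <- sapply(1:m, function(j) dnorm(x, mu[j], sigma[j]))  # state densities
  alpha <- matrix(0, n, m)
  beta <- matrix(0, n, m)
  alpha[1, ] <- delta * p[1, ]                             # forward pass
  for (t in 2:n) alpha[t, ] <- (alpha[t - 1, ] %*% Gamma) * p[t, ]
  beta[n, ] <- 1                                           # backward pass
  for (t in (n - 1):1) beta[t, ] <- Gamma %*% (p[t + 1, ] * beta[t + 1, ])
  L <- sum(alpha[n, ])                                     # likelihood L_T
  alpha * beta / L                                         # conditional state probabilities
}

set.seed(3)
x <- c(rnorm(50, 0, 1), rnorm(50, 4, 1))                   # regime switch at t = 51
Gamma <- matrix(c(0.95, 0.05, 0.05, 0.95), 2, byrow=TRUE)  # sticky transitions
probs <- localDecode(x, Gamma, mu=c(0, 4), sigma=c(1, 1), delta=c(0.5, 0.5))
```

Each row of `probs` sums to one, and the second column jumps toward one around the simulated regime switch, mirroring the local decodings plotted later in this post.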

Readers with ML background will recognize HMM as an elementary temporal graphical model (see § 6.2 of Koller and Friedman (2009)).

With this tiny bit of HMM theory, we can now formulate our regime analysis problem in economic terms and use a bit of R to solve it.

Consider four measures of economic activity: US equities, real gross domestic product (GDP), inflation, and G10 currencies. Posit that each measure of economic activity can be characterized at any point in time as being in one of *two* states: stable, with correspondingly small downside volatility, or contracting, with correspondingly high volatility. This broadly matches traditional wisdom, namely that the macroeconomy is either acting "fairly normal" or is exceptional (either panic or exuberance).

Worth noting is the obvious caveat that this bi-state model oversimplifies reality, in particular ignoring the potential distinction between "growth" and "stagnancy" (of potential importance during the 1970s and 2010s). Intuition is unclear *a priori* as to whether a bi- or tri-state model is preferable. A subsequent post will take up this model selection question, as it lacks an easy answer.

Illustrating the conditional probabilities of this model graphically, where a value of 0 indicates 100% likelihood of being in the “normal” regime and value of 1 indicates 100% likelihood of being in “exceptional” regime:

These regimes match our economic recollection. Equities were normal through much of the 1990s and mid-2000s, and in panic the remaining time. GDP growth was exceptionally strong in the mid-1990s, during the recovery from the dotcom bubble in the early 2000s, and during the recovery from the mortgage bubble in the late 2000s; growth during all other times was "normal". Inflation was high during the late 1970s and flanking the mortgage bubble. Currencies were volatile throughout the 1980s and 1990s, and then again flanking the mortgage bubble.

These regimes also illustrate just how unusual the mortgage bubble was in a historical sense, as it is the only time in the past 30 years during which all four measures of macroeconomic activity were simultaneously in the exceptional regime.

Code to replicate the above results follows, along with quite a bit more to be discussed in subsequent posts.

Note: the US equity regime is estimated using daily returns from SPX, rather than the equally-weighted basket of S&P 500 sector indices used in the original article. Doing so results in nearly identical equity regime conditional probabilities, hence SPX is chosen in recognition of Occam.

```r
library("RHmm")
library("TTR")

displayKritzmanRegimes <- function() {
  # Display regimes from Kritzman et al. (2012), printing regime
  # statistics and plotting local decoding.
  equityRegime <- getEquityTurbulenceRegime()
  inflationRegime <- getInflationRegime()
  growthRegime <- getGrowthRegime()
  currencyTurbulenceRegime <- getCurrencyTurbulenceRegime()

  print(equityRegime)
  print(inflationRegime)
  print(growthRegime)
  print(currencyTurbulenceRegime)

  plotMarkovRegimes(equityRegime, "Equity (SPX)", plotDensity=F)
  plotMarkovRegimes(inflationRegime, "Inflation (CPIAUCNS)", plotDensity=F)
  plotMarkovRegimes(growthRegime, "Real GDP (GDPC1)", plotDensity=F)
  plotMarkovRegimes(currencyTurbulenceRegime, "G10 Currency Turbulence",
                    plotDensity=F)

  plotLocalDecodings(list(equityRegime, growthRegime, inflationRegime,
                          currencyTurbulenceRegime),
                     list("US Equity (SPX)", "Real GDP (GDPC1)",
                          "Inflation (CPIAUCNS)", "G10 Currency Turbulence"),
                     regimeNums=c(2, 2, 2, 2))
}

getEquityTurbulenceRegime <- function(startDate=as.Date("1977-12-01"),
                                      endDate=Sys.Date(), numOfStates=2) {
  # Estimate two-state markov (SPX-based) equity regime. In lieu of S&P 500
  # sector indices, use SPX instead.
  #
  # Args:
  #   startDate: date on which to begin panel for regime estimation
  #   endDate: date on which to end panel for regime estimation
  #   numOfStates: number of hidden states in regime
  #
  # Returns: hmmFit from HMMFit(), suitable for display with plotMarkovRegimes()
  spx <- dROC(getOhlcv(instrumentSymbol="^GSPC", startDate=startDate,
                       endDate=endDate, quote=c("close")))
  spxTurb <- rollingTurbulence(spx, avgWidth=(250 * 10), covarWidth=(250 * 10))
  meanTurb <- apply.monthly(spxTurb, mean)
  estimateMarkovRegimes(meanTurb, numOfStates=numOfStates)
}

getInflationRegime <- function(startDate=as.Date("1946-01-01"),
                               endDate=Sys.Date(), numOfStates=2) {
  # Estimate two-state markov (CPI-based) inflation regime.
  #
  # Args:
  #   startDate: date on which to begin panel for regime estimation
  #   endDate: date on which to end panel for regime estimation
  #   numOfStates: number of hidden states in regime
  #
  # Returns: hmmFit from HMMFit(), suitable for display with plotMarkovRegimes()
  val <- 100 * dROC(getFREDData(symbol="CPIAUCNS", startDate=startDate,
                                endDate=endDate))
  estimateMarkovRegimes(val, numOfStates=numOfStates)
}

getGrowthRegime <- function(startDate=as.Date("1946-01-01"),
                            endDate=as.Date("2012-12-31"), numOfStates=2) {
  # Estimate two-state markov (GDP-based) growth regime.
  #
  # Note: Growth regime appears to be bi-modal, and thus need to estimate
  # several times to get convergence on the regime reported by Kritzman.
  #
  # Args:
  #   startDate: date on which to begin panel for regime estimation
  #   endDate: date on which to end panel for regime estimation
  #   numOfStates: number of hidden states in regime
  #
  # Returns: hmmFit from HMMFit(), suitable for display with plotMarkovRegimes()
  val <- 100 * dROC(getFREDData(symbol="GDPC1", startDate=startDate,
                                endDate=endDate))
  estimateMarkovRegimes(val, numOfStates=numOfStates)
}

getCurrencyTurbulenceRegime <- function(startDate=as.Date("1971-01-01"),
                                        endDate=Sys.Date(), numOfStates=2) {
  # Estimate two-state markov (G10-based) currency turbulence regime.
  #
  # Args:
  #   startDate: date on which to begin panel for regime estimation
  #   endDate: date on which to end panel for regime estimation
  #   numOfStates: number of hidden states in regime
  #
  # Returns: hmmFit from HMMFit(), suitable for display with plotMarkovRegimes()
  g10rates <- getG10Currencies()
  avgg10rates <- xts(100 * rowMeans(dROC(g10rates), na.rm=T),
                     order.by=last(index(g10rates), -1))
  turbG10rates <- rollingTurbulence(avgg10rates, avgWidth=(250 * 3),
                                    covarWidth=(250 * 3))
  meanTurbG10rates <- apply.monthly(turbG10rates, mean)
  estimateMarkovRegimes(meanTurbG10rates, numOfStates=numOfStates)
}

estimateMarkovRegimes <- function(val, numOfStates=2) {
  # Estimate n-state hidden markov model (HMM) for val.
  #
  # Args:
  #   val: series
  #   numOfStates: number of hidden states in HMM
  #
  # Returns: hmmFit from HMMFit(), suitable for display with plotMarkovRegimes()
  hmmFit <- HMMFit(val, nStates=numOfStates)
  return(list(val=val, hmmFit=hmmFit))
}

plotLocalDecodings <- function(regimes, symbols, plotDateRange="1900::2012",
                               regimeNums) {
  # Plot local decodings for a list of HMM regimes, optionally over a set
  # date range.
  #
  # Args:
  #   regimes: list of regimes, as returned by estimateMarkovRegimes()
  #   symbols: list of human-readable symbols for regimes
  #   plotDateRange: optional date range over which to plot local decodings
  #   regimeNums: index of HMM regime, into regimes, to plot
  oldpar <- par(mfrow=c(1, 1))
  on.exit(par(oldpar))
  layout(c(1, 2, 3, 4))

  # generate merge of local decodings
  lapply(c(1:length(regimes)), function(i) {
    regime <- regimes[[i]]
    fb <- forwardBackward(regime$hmmFit, regime$val)
    eu <- exp(fb$Alpha + fb$Beta - fb$LL)
    local <- xts(eu[, regimeNums[i]], index(regime$val))[plotDateRange]
    plota(local, type='l', plotX=T, col=drawColors[i], main=symbols[i])
  })
}

plotMarkovRegimes <- function(regime, symbol, plotDateRange="1900::2012",
                              plotDensity=T, plotTimeSeries=T) {
  # Plot markov regimes from HMM: kernel densities and per-regime local
  # decodings.
  #
  # Args:
  #   regime: regime for HMM, as generated by estimateMarkovRegimes()
  #   symbol: human-readable description of series with markov regimes
  #   plotDateRange: contiguous range of time which to plot
  val <- regime$val
  hmmFit <- regime$hmmFit

  # calculate local decoding
  fb <- forwardBackward(hmmFit, val)
  eu <- exp(fb$Alpha + fb$Beta - fb$LL)
  hmmMeans <- hmmFit$HMM$distribution$mean
  hmmSD <- sqrt(hmmFit$HMM$distribution$var)

  # plot kernel density with regime means
  oldpar <- par(mfrow=c(1, 1))
  on.exit(par(oldpar))
  if (plotDensity) {
    plot(density(val), main=paste("Density with Regime Means:", symbol))
    abline(v=mean(val), lty=2)
    sapply(c(1:length(hmmMeans)), function(i) {
      abline(v=hmmMeans[i], lty=2, col=drawColors[(i + 1)])
      curve(dnorm(x, hmmMeans[i], hmmSD[i]), add=T, lty=3,
            col=drawColors[(i + 1)])
    })
  }

  # plot time series of percent change and local decoding for each regime
  if (plotTimeSeries) {
    merged <- merge(val, eu)
    layout(c(1:(1 + ncol(eu))))
    plota(merged[, 1][plotDateRange], type='l',
          main=paste("Regime:", symbol), plotX=F)
    sapply(c(1:length(hmmMeans)), function(i) {
      abline(h=hmmMeans[i], lty=2, col=drawColors[(i + 1)])
    })
    plota.legend("Percent Change:", drawColors[1], last(merged[, 1]))
    sapply(c(1:ncol(eu)), function(i) {
      plota(xts(merged[, (i + 1)], index(val))[plotDateRange], type='l',
            plotX=(i == (ncol(eu))), col=drawColors[(i + 1)])
      plota.legend(paste0("Event Regime ", i, ":"), drawColors[(i + 1)],
                   last(merged[, (i + 1)]))
    })
  }
}

dROC <- function(x, n=1) {
  # Return discrete rate-of-change (ROC) for a series, without padding
  ROC(x, n, type="discrete", na.pad=F)
}
```

Yet, perhaps there is a common root cause at work, not yet stated: *implicit momentum bias*.

Let me break that down, as there is a bit of intentional double entendre. By “momentum”, I intend a sustained trend, whether conceptual or technical. By “implicit bias”, I intend a cognitive preference which is rarely stated and often partially or fully unconscious.

You can see this bias in numerous ways, affecting both blog readers and authors:

- **Retail trading**: the vast majority of retail investors and traders pursue discretionary strategies which boil down to technical momentum, whether conceived explicitly (*e.g.* ad hoc moving average strategies or more arcane technical analysis voodoo) or implicitly (*e.g.* cocktail party stock tips or watching CNBC). Thus, the vast majority of retail exhibits this bias.
- **Asset managers**: many active asset managers pursue strategies which also essentially boil down to technical momentum, whether due to stock picking or executing strategies aligned with the historically strong long-term risk-adjusted returns of momentum. This is exemplified by a quick scan of the Top Papers from SSRN, in which 2 of the top 10 are momentum papers (followed by many more of varying sophistication). Thus, the majority of asset managers exhibit this bias.
- **Quant funds**: many low- and medium-frequency funds run strategies which are essentially momentum dressed in sophisticated clothing, often tracking one or more risk premia (whether due to anomalies or market structure) rather than raw prices. This is nicely exemplified by Ilmanen's well-reviewed Expected Returns; he now works at AQR. Thus, many hedge funds exhibit this bias.
- **Mainstream financial media**: ratings are driven by viewers' perception that "something is happening", which almost always occurs in strongly trending markets: bubbles or crashes. In contrast, range-bound markets are often remarked to be "boring" or "frustrating", with ratings tanking correspondingly. Thus, mainstream media exhibits this bias.
- **Blogs / Twitter**: building readership depends on establishing a distinctive perspective or "voice", enabling a blog or stream to stand out amongst all the noise. This demands the corresponding content be formulaic in some predictable sense. Blogs have an even higher bar than Twitter, as blogs are expected to hold up as being "good" despite hindsight bias—meaning their shelf life must extend until they are read at some indeterminate point in the future. Falkenblog is a good example, having been a tireless advocate of the low-vol anomaly for many years. Thus, many blogs and Twitter streams exhibit this bias.
- **Alpha research**: much of the search for alpha is tested via historical backtests spanning many years, invariably including both trending and range-bound regimes. The majority of both alpha and beta strategies perform best in momentum-driven trending markets, which the past 30 years have mostly been; strategies that perform well in range-bound markets are often regime-aware, trading differently across regimes. Careful review of nearly all popular tactical and dynamic asset allocation strategies exhibits this (along with a pervasive, unstated bias from the 30-year bond bull run in many). Thus, alpha research often exhibits this bias.

Momentum originates from one of the most influential aspects of human psychology: confirmation bias. Simply put, humans “like to be right”: enjoying and finding comfort in writing and reading information that confirms their existing beliefs or hypotheses (scientifically attributable to stimulus-response dopamine effects).

Thus, range-bound markets are frustrating for many folks because they form opinions conceived as directional bets. For them, *their momentum bias generates significant pain in markets like this—which makes them tune out*. This is also seen in hedge funds, which in aggregate have a comparatively poor record in range-bound markets. In contrast, many computers find enjoyment in markets which generally defy non-quantitative directional prediction.

Coming full circle back to Tadas’ question, perhaps this implicit momentum bias partially explains the durability of his blog—and, in doing so, perhaps contributes to the question he raised. He astutely blends the benefits of momentum, while avoiding its detriments:

- Is a “forecast free blog”, thus ensuring he cannot be wrong in hindsight
- Follows a well-established presentation style, thus ensuring his distinctive voice
- Aggregates and synthesizes from materials authored by others, ensuring both a constant stream of material and diversity of opposing perspectives to satisfy both sides of any debate
- Varies topical coverage on a daily basis to remain relevant and responsive to the trends which dominate investor and trader mindshare

If only we could all replicate alpha strategies as durably as his blog has effectively contributed to the financial blogosphere over the years.


Begin by taking a look at the time-series plot of intraday trade prices for GOOG from the 18th (from one exchange):

The tape clearly illustrates some unhappy traders, with what appears to be a gap down in the noon hour (followed by a trading halt until 3pm). Intraday dynamics can be better understood with a time-series summary of the corresponding trades and quotes:

Although the price appears to gap when viewed on a low-frequency daily chart, drilling into the noon hour illustrates the contrary:

Rather than a single print gap, the price action evolved more slowly over 12:30 to 12:40. The spread quickly expanded after 12:30, presumably driven by the jump in volume subsequent to disclosure. A 5,000 share trade printed shortly after 12:32, as spreads were beginning to stabilize. This precipitated some fear and greed, as market makers blew the spreads out to $4. Spread dynamics after noon appear to be fairly typical response to uncertainty and toxicity, as the summary is qualitatively similar to other similar events such as the morning after Bear in 2008.

By way of comparison, below is a plot of the TAQ summary for 3:25 – 4pm:

Now that we are familiar with the TAQ action as measured in human chronology, let’s view it through a different lens—following the intuition from Mandelbrot (1963), Clark (1971), and most recently Easley *et al.* (2012).

To do so, begin by considering trades for GOOG from another day which is more normal, say the following day Oct 19. Start with human chronology, and calculate the time series of first differences for 1 and 5 minute bars. Next estimate the Epanechnikov kernel density from first differences. Finally, fit the corresponding Gaussian for the same first differences. Or, in R:

```r
library(xts)    # to.minutes, to.minutes5, coredata
library(MASS)   # fitdistr

# first differences
min1Diff <- diff(to.minutes(trades$price)[,4], na.pad=F)
min5Diff <- diff(to.minutes5(trades$price)[,4], na.pad=F)

# kernel densities
min1Density <- density(min1Diff, kernel="epanechnikov")
min5Density <- density(min5Diff, kernel="epanechnikov")

# corresponding normals
min1Normal <- fitdistr(coredata(min1Diff), "normal")$estimate
min5Normal <- fitdistr(coredata(min5Diff), "normal")$estimate
```

These are plotted below for Oct 19, with solid red being the 1-minute difference density and solid blue being the 5-minute difference density (ignore black for now); Gaussian fits for the same differences are dashed:

This reproduces the familiar stylized distribution of high-frequency returns, consistent with Figure 1 from Clark (1971) and Exhibit 2 from Easley *et al.* (2012). Although certainly leptokurtic, the distributions are not too dissimilar from normal.

Now, ignore chronological time and instead consider the trades from the perspective of a *clock based on volume of shares traded over the day*. In other words, generate a new process defined as the sequence of prices coinciding with shares traded on equal-sized partitions of total volume over the day. This process generates a “volume clock” for the trades, in doing so exemplifying a beautiful time-series transformation.
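Under stated assumptions, this transformation can be sketched in R: `volumeClock` is a hypothetical helper (not from the original code), and the trades below are simulated for illustration since the original TAQ data is not reproduced here.

```r
# Sketch of a volume clock: sample the last trade price within each
# equal-sized partition of cumulative traded volume.
volumeClock <- function(price, size, binSize) {
  bucket <- ceiling(cumsum(size) / binSize)   # equal-volume partitions
  lastIdx <- cumsum(rle(bucket)$lengths)      # last trade in each bucket
  price[lastIdx]                              # price process in volume time
}

# toy trades: random-walk prices with random trade sizes
set.seed(1)
trades <- data.frame(price = 700 + cumsum(rnorm(2000, 0, 0.05)),
                     size  = sample(c(100, 200, 500), 2000, replace=TRUE))
vc <- volumeClock(trades$price, trades$size, binSize=5000)
vcDiff <- diff(vc)                            # volume-clock first differences
```

The kernel density of `vcDiff` can then be estimated exactly as in the chronological case above.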

A similar kernel density can be estimated from this trade volume clock, which is plotted above in black (with dashed corresponding normal fit), assuming partition size of 50. This exhibits distribution characteristics consistent with Exhibit 2 from Easley *et al.* (2012).

With intuition on volume clock densities, we can finally take a look at Oct 18 through the lens of a computer via the following plot:

Needless to say, this plot for Oct 18 looks nothing like the previous plot for Oct 19. The normal fits for both 5-min bars (blue) and volume clock (black) are especially interesting, as they have little resemblance to their corresponding kernel estimates. Care is required for trading on days that look like this.

For those interested to learn more, see Clark (1971) for theory, including subordinated stochastic processes (originally due to Bochner (1960)). Ané and Geman (2000) apply a bit more sophistication, including arguing in favor of cumulative transaction count rather than cumulative traded volume. Perhaps worth a follow-up post to see if this still makes sense, given current era of predominantly algorithmic program trading.

The R code to generate the above plots is:

```r
plot(trades20121018$price, main="GOOG Trades (2012-10-18)")
plotTAQSummary("GOOG", "2012-10-18",
               as.POSIXct("2012-10-18 09:30:00"),
               as.POSIXct("2012-10-18 16:05:00"))
plotTAQSummary("GOOG", "2012-10-18 | 12:00 - 13:00",
               as.POSIXct("2012-10-18 12:00:00"),
               as.POSIXct("2012-10-18 13:00:00"))
plotTAQSummary("GOOG", "2012-10-18 | 15:00 - 16:00",
               as.POSIXct("2012-10-18 15:25:00"),
               as.POSIXct("2012-10-18 16:05:00"))
plotTimeChangeDensity("GOOG", as.Date("2012-10-19"), trades20121019)
plotTimeChangeDensity("GOOG", as.Date("2012-10-18"), trades20121018)
```

And, a bit of literature on subordinated processes, volume clocks, and price processes:

- Bochner, Harmonic Analysis and the Theory of Probability, University of California Press, Berkeley, 1960.
- Mandelbrot, “The Variation of Certain Speculative Prices”, Journal of Business, Vol. 36 (1963), p. 394-419.
- Clark, “A Subordinated Stochastic Process Model with Finite Variance for Speculative Prices”. Center for Economic Research. University of Minnesota, 1971.
- Ané and Geman, “Order Flow, Transaction Clock, and Normality of Asset Returns”, Journal of Finance, Vol. 55, No. 5 (2000), pp. 2259-2284.
- Murphy and Izzeldin, “Order Flow, Transaction Clock, and Normality of Asset Returns: Ané and Geman (2000) Revisited”, 2006.
- Easley, de Prado, and O’Hara, “The Volume Clock: Insights into the High Frequency Paradigm”. Journal of Portfolio Management, May 2012.

Few folks could be blamed for such flippancy, as it was mostly harmless throughout the great moderation. In fact, traders took apparent pride in their ignorance of macro—except the global macro guys, obviously. Then, along came a credit crisis.

With that swan, Quantivity concluded it was high time to formulate a *systematic* macro perspective: a “top down” complement to calibrate “bottom up” quant models. Quantivity brought great humility to this effort, due to both intrinsic complexity and comparatively weaker background in macro.

This post kicks off a few thoughts derived from this effort; hopefully a welcome addition alongside micro analysis. Two caveats are worth noting. First, confirmation bias is particularly dangerous with macro, and thus emphasis is placed on broadly considering diverse viewpoints. Second, these thoughts are posted with a bit of trepidation, given intense desire to avoid politics and policymaking.

Understanding macro is built from formulating intuition around two themes:

- Combining both *backward-* and *forward-looking* perspectives
- Contextualizing sovereign and institutional politics within those joint perspectives, in real-time

Backward-looking perspective requires careful study of economic history—admittedly, not a topic many folks enjoy reading at bedtime. Much more empirically challenging is the statistical rarity of credit crises, which have occurred only twice in the past century: globally in 1927 – 1934 and in Japan in 1989 – 2005.

The following are excellent long form introductions (ignore Koo’s ridiculous title):

- Does Central Bank Independence Frustrate the Optimal Fiscal-Monetary Policy Mix in a Liquidity Trap?, by McCulley and Pozsar (2012)
- Holy Grail of Macroeconomics, by Koo (2011)
- Has Financial Development Made the World Riskier?, by Rajan (2005)

The first two eloquently summarize history and key dynamics of the current macroeconomic climate, including *credit crisis*, *balance sheet recession*, and *liquidity trap*. Rajan presents a shockingly prescient perspective on the impact and dangers caused by financial development, equally applicable today as in 2005.

Arguably the most important collective insight is the *remarkably* important role which *economic orthodoxy* plays among market participants (including regulators). McCulley and Pozsar (p. 3) nicely summarize as:

Intellectual paralysis borne of inertia from dogma that, in the present circumstances, do not apply.

Neatly tucked into this definition are three fundamental scaffolds:

- **Dogma**: understanding the prevailing *conventional* economic "wisdom" *among market participants*, in sufficient detail to trade on it; in other words, economic theory, market structure, regulation, secular demographic trends, sociopolitical context, *etc.*
- **Present circumstances**: accessing and understanding *real-time* economic data and analysis, again in sufficient detail; in other words, market data and corresponding econometric analysis
- **Applicability**: speculating when dogmatic wisdom is applicable to present circumstances, versus when it is not and thus results in intellectual inertia

Although in a new guise, these three boil down to a familiar trading concept: contrarian versus trend following. When dogma is appropriate for present circumstances, follow the trend; if not, go contrarian. The challenge is having to make this call in *real-time* based on incomplete information, simultaneously understanding all three with sufficient confidence to risk capital.

Towards this end, FRED is a nice source for macro data. Given data, next is building intuition and understanding of dogmatic applicability via analysis and synthesis while seeking to minimize confirmation bias.

The blogosphere is as good a source for this as any (certainly better than nearly all sell side research), although much of it suffers from varying degrees of partisanship and politicking. A few exemplar blogs, from the roll, include:

- Econbrowser, by James Hamilton
- Interfluidity, by Steve Waldman
- Macro Man

This topic is known as *direction-of-change* forecasting in the literature. Needless to say, successful prediction of the sign for future returns is quite interesting from a trading perspective. Traditionally, only univariate return series were considered; Anatolyev (2008) is an exception, modeling two or more interrelated markets via dependence ratios. This literature tends to be a bit obtuse, due to commonly unstated stylistic assumptions regarding conditional return dynamics.

The traditional formulation for this topic considers estimating the probabilities of returns $r_t$ exceeding an upper or lower threshold $c$, optionally conditioned on an information set $\mathcal{F}_{t-1}$ from the previous time step:

$$\Pr\left(r_t > c \mid \mathcal{F}_{t-1}\right) \quad \text{and} \quad \Pr\left(r_t < -c \mid \mathcal{F}_{t-1}\right)$$

If $c = 0$, the probabilities reduce to forecasting positive or negative returns; for trading, a natural choice for $c$ is roundtrip transaction costs:

$$\Pr\left(r_t > 0 \mid \mathcal{F}_{t-1}\right) \quad \text{and} \quad \Pr\left(r_t < 0 \mid \mathcal{F}_{t-1}\right)$$

Estimating these probabilities can be undertaken via several techniques. One approach is to use a logit model, based upon the logistic function:

$$\Pr\left(r_t > c \mid \mathcal{F}_{t-1}\right) = \frac{1}{1 + e^{-\mathbf{x}_{t-1}'\boldsymbol{\beta}}}$$

Here $\mathbf{x}_{t-1}$ are explanatory variables from the previous time step. The challenge of this model is proper selection of the explanatory variables.
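As a hedged illustration, such a logit can be fit in R with `glm()`. The toy return series and the choice of lagged return and lagged absolute return as regressors are placeholder assumptions for the sketch, not prescriptions from the literature above:

```r
set.seed(1)
r <- rnorm(500, 0.0002, 0.01)                 # toy daily return series
cThresh <- 0.001                              # threshold c, e.g. roundtrip costs

# binary target I(r_t > c) with lagged explanatory variables x_{t-1}
d <- data.frame(y    = as.integer(r[-1] > cThresh),
                rLag = head(r, -1),           # r_{t-1}
                vLag = abs(head(r, -1)))      # |r_{t-1}|, crude volatility proxy

fit <- glm(y ~ rLag + vLag, data=d, family=binomial(link="logit"))
pUp <- predict(fit, type="response")          # fitted Pr(r_t > c | F_{t-1})
```

With real data, the regressor set would be chosen via the selection problem noted above.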

An alternative approach is to consider the following functional decomposition for a univariate return series:

$$r_t = \mu_t + \sigma_t \varepsilon_t$$

Here $\mu_t$ is the conditional expected value, $\sigma_t^2$ is the conditional variance, and $\varepsilon_t$ is a martingale difference with zero mean, unit variance, and conditional distribution function $F$. From this, the direction-of-change probabilities can be expressed:

$$\Pr\left(r_t > c \mid \mathcal{F}_{t-1}\right) = 1 - F\!\left(\frac{c - \mu_t}{\sigma_t}\right)$$

With corresponding conditional expectations:

$$\mathrm{E}\left[\mathbb{1}(r_t > c) \mid \mathcal{F}_{t-1}\right] = 1 - F\!\left(\frac{c - \mu_t}{\sigma_t}\right)$$

These expectations simplify to the following when $c = 0$, assuming a constant nonzero mean $\mu_t = \mu \neq 0$ (otherwise, the expectation is constant and thus uninteresting):

$$\Pr\left(r_t > 0 \mid \mathcal{F}_{t-1}\right) = 1 - F\!\left(-\frac{\mu}{\sigma_t}\right)$$

These expectations can be evaluated explicitly by calculating the empirical distribution function (avoiding the assumption of a parametric distribution), where $\mathbb{1}(\cdot)$ is the indicator function:

$$\hat{F}(z) = \frac{1}{T}\sum_{t=1}^{T} \mathbb{1}\left(\hat{\varepsilon}_t \leq z\right)$$
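A minimal sketch of this empirical evaluation in R, assuming a constant mean and (as a labeled simplification) a rolling standard deviation as a crude stand-in for a fitted conditional volatility model such as GARCH:

```r
set.seed(1)
r <- rnorm(1000, 0.0005, 0.01)                   # toy return series
mu <- mean(r)                                    # constant conditional mean
w <- 50                                          # volatility window (arbitrary)

# crude conditional volatility proxy: trailing standard deviation
sigmaT <- sapply(w:length(r), function(i) sd(r[(i - w + 1):i]))

eps <- (r[w:length(r)] - mu) / sigmaT            # standardized residuals
Fhat <- ecdf(eps)                                # empirical distribution function
pUp <- 1 - Fhat(-mu / sigmaT)                    # Pr(r_t > 0) per period
```

Note how `pUp` varies over time purely through the volatility path, per the simplification above.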

Alternatively, this decomposition suggests one potential formulation for the logit parameters of the above model, where $\boldsymbol{\beta}$ is estimated by the logit and $\hat{\mu}_t$ and $\hat{\sigma}_t$ are estimated from historical observations:

$$\mathbf{x}_{t-1} = \left(1, \; \frac{\hat{\mu}_t}{\hat{\sigma}_t}\right)$$

Of course, the non-trivial work is generating forecast estimates for the next-step conditional mean $\hat{\mu}_t$ and conditional variance $\hat{\sigma}_t^2$.

An alternative way to model the logit parameters is to apply ARMA intuition with a binary autoregression (BARMA), due to Startz (2006), including lags for both the autoregressive parameters and past indicator values, per Anatolyev (2008):

$$\Pr\left(I_t = 1 \mid \mathcal{F}_{t-1}\right) = \Lambda(\theta_t), \qquad \theta_t = \omega + \sum_{i=1}^{p} \rho_i\, \theta_{t-i} + \sum_{j=1}^{q} \gamma_j\, I_{t-j}$$

where $I_t = \mathbb{1}(r_t > c)$ and $\Lambda(\cdot)$ is the logistic function.

A survey of estimation techniques may be considered in a subsequent post, depending on reader interest.

One exploratory analysis technique relevant to sign forecasting is visualizing up/down runs, signed difference (*i.e.* up-down), and corresponding averages for a return series.

```r
library(xts)

returnRuns <- function(r, bound=0, doPlot=TRUE, startAvg=5, avgLen=-1) {
  # Generate up/down runs and average runs for a return series, optionally
  # plotting them.
  #
  # Args:
  #   r: return series
  #   bound: symmetric upper and lower bound, aka c
  #   doPlot: flag indicating whether plots should be generated for runs
  #   startAvg: number of leading average runs to exclude, eliminating
  #             unstable averages with few observations
  #   avgLen: number of periods over which to generate average; or -1 for
  #           entire period
  #
  # Returns: list of up/down runs and average up/down runs
  colors <- c('black', 'red')
  up <- cumsum(ifelse(r > bound, 1, 0))
  down <- cumsum(ifelse(r < -bound, 1, 0))
  if (doPlot) {
    plot(up, main='Signed Runs: Up & Down', ylim=range(up, down))
    lines(down, col='red')
    legend("topleft", legend=c("Up", "Down"), fill=colors, cex=0.5)
    plot(up - down, main="Signed Run Difference (up-down)")
  }
  if (avgLen == -1) {
    avgUp <- xts(sapply(c(1:length(up)), function(i) { up[i] / i }),
                 order.by=index(up))
    avgDown <- xts(sapply(c(1:length(down)), function(i) { down[i] / i }),
                   order.by=index(up))
  } else {
    avgUp <- xts(sapply(c(avgLen:length(up)), function(i) {
        start <- i - avgLen + 1
        last(cumsum(ifelse(r[start:i] > bound, 1, 0))) / avgLen
      }), order.by=index(up[avgLen:length(up)]))
    avgDown <- xts(sapply(c(avgLen:length(down)), function(i) {
        start <- i - avgLen + 1
        last(cumsum(ifelse(r[start:i] < -bound, 1, 0))) / avgLen
      }), order.by=index(up[avgLen:length(up)]))
  }
  if (doPlot) {
    n <- length(avgUp)
    plot(avgUp[startAvg:n],
         main=paste("Average Runs: Up & Down (", avgLen, " periods)", sep=""),
         type='l', ylim=range(avgUp, avgDown))
    lines(avgDown[startAvg:n], col='red')
    legend("topleft", legend=c("Up", "Down"), fill=colors, cex=0.5)
  }
  return(list(up=up, down=down, avgUp=avgUp, avgDown=avgDown))
}
```

For example, the following plots illustrate CRM run dynamics from 2005 to present. The first plot illustrates the running sums for both up and down returns, indicating negative returns are more prevalent:

Second plot illustrates the difference in signed sums, showing the time-dynamics of the difference between up and down returns. Not surprisingly, this difference closely mirrors the CRM price curve:

Third plot illustrates the average probabilities for both up and down, running incrementally over the entire timeframe:

The following are representative papers from the direction-of-change literature, ignoring the early papers focused on evaluating market efficiency (*e.g.* run tests):

- Forecasting Stock Indices: A Comparison of Classification and Level Estimation Models, by Leung, Daouk, and Chen (2000)
- Financial, Asset Returns, Market Timing, and Volatility Dynamics, by Christoffersen and Diebold (2002)
- Direction-of-Change Forecasts Based on Conditional Variance, Skewness and Kurtosis Dynamics: International Evidence, by Christoffersen *et al.* (2004)
- Are the Directions of Stock Price Changes Predictable? Statistical Theory and Evidence, by Hong and Chung (2003)
- Evaluating Direction-of-change Forecasting: Neurofuzzy Models vs. Neural Networks, by Bekiros and Georgoutsos (2005)
- Modeling Financial Return Dynamics via Decomposition, by Anatolyev and Gospodinov (2007)
- Direction-of-change Forecasts and Trading Strategy Profitability at Intra-Day Horizons, by Deliya (2007)
- Multi-Market Direction-of-Change Modeling Using Dependence Ratios, by Anatolyev (2008)
- Forecasting the Direction of the U.S. Stock Market with Dynamic Binary Probit Models, by Nyberg (2008)
- Direction-of-Change Financial Time Series Forecasting using Bayesian Learning for MLPs, by Skabar (2008)
- A Kernel-Based Technique for Direction-of-Change Financial Time Series Forecasting, by Skabar (2008)
- Optimal Probabilistic and Directional Predictions of Financial Returns, by Thomakos and Wang (2009)
- Directional Prediction of Returns under Asymmetric Loss: Direct and Indirect Approaches, by Anatolyev and Kryzhanovskaya (2009)
- Markets Change Every Day: Evidence from the Memory of Trade Direction, by Skouras and Axioglou (2011)

Finally, Kinlay briefly surveyed this topic in two posts on volatility sign prediction.

This focus is *not* behavioral finance, in search of anomalies driven by cognitive biases diverging from equilibrium (although the majority do that too). Rather, it asks inferential sociological questions, such as: was the market "efficient", in the Fama sense, during the post-war decades prior to 2000 *because people expected it to be* (blissfully ignoring a few hiccups)? This stands in contrast to how it is commonly understood and formalized, with reverse causality: the market is assumed to be efficient, and thus people understand it as such.

Similarly, have the past 15 years been “inefficient”, in the bubble and anomaly sense, *because* cultural faith among investors in such “efficiency” was lost; or, did they lose faith because the market became inefficient? Big difference.

In other words: *is finance governed by physics, biology, or Peltzman*?

The traditional answer, provided by financial economics via microeconomic principles of equilibrium and efficiency, is the *efficient market hypothesis*: causality flows from market to investor. This explanation comes in two variants, known by their colloquial analogical fields:

- **Physics**: the market is governed by *immutable mathematical principles* and can be formalized into coherent *predictive* models, either in favor or contradiction of excess returns; exemplified by classic weak/strong EMH theory
- **Biology**: the market is governed by evolutionary principles à la Darwin, as exemplified by Lo's 2004 AMH article: "Very existence of active liquid financial markets implies that profit opportunities must be present. As they are exploited, they disappear. But new opportunities are also constantly being created as certain species die out, as others are born, and as institutions and business conditions change." (p. 24)

Yet, both these explanations suffer from implicitly begging the question: conjure "a market" with desired attributes and then derive conclusions. The physics perspective assumes immutability, conceivability, and mathematical expressiveness for its hypothesized market. The biology perspective endows the hypothesized market with even more sophisticated Darwinian traits, presumably driven by underlying physical principles so inscrutable as to defy mathematical formalization.

An alternative explanation is to apply the self-fulfilling Peltzman effect to financial markets, and reverse the causality: *markets behave as they do because of investor sociology*, rather than emerging from the implicit cooperation of equilibrium-seeking rational microeconomic agents.

In other words: when investors *believe* the market is rational (irrespective of whether that belief is well-founded), they embody Dunning-Kruger by *ex ante* faithfully dumping money into their 401K each month; in doing so collectively, the investment management industry undertakes its rent-seeking activity, resulting in a market possessing *ex post* "efficient" characteristics. Conversely, when investors believe the market is irrational, they either go to cash, pursue uninformed non-collective trading, or both. Either way, the result is anomalous market behavior, uncontrollable by the industry, due to decreased liquidity or the absence of predictable momentum.

If the market is indeed Peltzmanian, then the real question is how to best quantify and model *primary and spillover effects* resulting from investor sociology as they unfold ephemerally.

Recall $z$ is *unobserved*, and thus the model cannot be directly estimated via MLE. Thus, we need to decide how to approach estimation for this latent variable. One way is to be naive, and simply assume $z$ is the deterministic difference in return between stock and index (technically, this generates a profile likelihood as formalized by Severini and Wong [1992], which Murphy and Van Der Vaart [2000] verify is well-behaved and consistent with the exact likelihood):

$$z_t = r_t - i_t$$

This assumption permits focus on estimating $\alpha$, providing insight into the *mixing behavior* of the return being decomposed: if a stock return behaves like its index, then mixing is low with small $\alpha$ (in the limit, $\alpha = 0$ when a stock behaves identically to its index, as no mixing is required); in contrast, if the stock return regularly behaves independently from its index, then mixing is high with a large $\alpha$.

Autocorrelation of $r$ and $i$ is worth consideration, as it helps determine whether time indexing is required for $\alpha$. For returns with insignificant autocorrelation (common for *signed* equity returns), the time index is dropped and a single $\alpha$ is estimated. Yet, *conditional dependence* often exists between $r$ and $z$, consistent with previous posts in the Proxy / Cross Hedging series (illustrated by the r-z copula for the CRM / QQQ example below):

Use of this identity for $z$ transforms the decomposition model into:

The model is further simplified into a familiar *independent mixture model* by dropping the sign decomposition, and estimating via MLE using densities $f_1$ and $f_2$ for the two return distributions:

MLE estimation requires assumption of parametric distributions for $f_1$ and $f_2$, of which common choices from the literature are normal, Student-t, skew-t, or skew hyperbolic Student-t (Aas and Haff [2006]). The next question is how to estimate the parameters: $\alpha$ and the family of distribution parameters (*e.g.* $\mu$ and $\sigma$ if $f$ is assumed to be normal). As $i$ is observed, one way to proceed is via two-step estimation:

- Estimate $f_2$ parameters via MLE from the observed index series $i$
- Jointly estimate $\alpha$ and the $f_1$ parameters via MLE on the mixture, holding the $f_2$ parameters constant
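Step one of this procedure, fitting the index distribution under each candidate family, can be sketched as follows (mirroring the estimation calls used in the full code below; the helper name `fitIndexCandidates` is illustrative):

```r
library("MASS")           # fitdistr
library("sn")             # st.mle
library("SkewHyperbolic") # skewhypFit

fitIndexCandidates <- function(i) {
  # Fit each candidate parametric family to the observed index returns;
  # these parameters are then held constant in the mixture MLE (step two).
  list(
    normal  = fitdistr(i, "normal")$estimate,
    skewT   = st.mle(y=i)$dp,
    skewHyp = skewhypFit(i, plots=FALSE, printOut=FALSE)$param
  )
}
```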

For both steps, recall the likelihood $L$, and log likelihood $\ell = \log L$, are defined as:

$$L(\alpha, \theta_1) = \prod_t \left[ \alpha f_1(z_t; \theta_1) + (1-\alpha) f_2(i_t; \hat\theta_2) \right], \qquad \ell(\alpha, \theta_1) = \sum_t \log\left[ \alpha f_1(z_t; \theta_1) + (1-\alpha) f_2(i_t; \hat\theta_2) \right]$$

From which MLE of the mixture is maximization of the likelihood over $(\alpha, \theta_1)$, where log is chosen for numeric stability:

$$(\hat\alpha, \hat\theta_1) = \arg\max_{\alpha,\, \theta_1} \ \ell(\alpha, \theta_1)$$

This optimization can be performed numerically in R via minimization using `DEoptim` of the negative log likelihood `negLogLikeFun` (negative due to minimization in `DEoptim`, versus maximization of $\ell$). `DEoptim` is chosen due to its rapid convergence on non-smooth global optimizations.

For example, continuing the CRM / QQQ example introduced in the previous posts on Proxy / Cross Hedging generates the following results:

```r
> symbols <- c("CRM", "QQQ")
> endDate <- Sys.Date()
> startDate <- endDate - as.difftime(52*5, unit="weeks")
> quoteType <- "Close"
> p <- do.call(cbind, lapply(symbols, get.hist.quote, start=startDate,
    end=endDate, quote=quoteType))
> colnames(p) <- symbols
> doReturnDecomp(p)
normal mix likelihood: -3485.55
phi1 params: 0.0003471366 0.01673634
params 0.2546208 -0.001208877 0.0113988
skew-t mix likelihood: -3566.512
phi1 params: 0.003844969 0.01079941 -0.3252923 2.893228
params 0.2357737 -0.004188977 0.01099266 0.4157174 26.5643
skew-hyp-t likelihood: -3083.700
phi1 params: 0.01051675 0.1485529 -3.945452 10.10836
params 0.8295289 -0.0003636940 0.03071332 -0.5 5
```

These results correspond to the following density functions for the skew-t mixture:

One interesting observation of these densities is that their location parameters are on opposing sides of zero: one component has positive location, while the other has negative location. One interpretation of this is that positive returns from CRM disproportionately originate from the idiosyncratic $z$, while negative returns originate from the index. Economically, this is plausible: positive news is often idiosyncratic, while negative news is often market-wide.

Several additional inferences can be drawn from these results:

- Model selection: likelihood suggests skew-t is the preferred model, indicating long tails and skewness (matching stylized facts)
- Mixing: $\hat\alpha \approx 0.24$ (skew-t), indicating that over 75% of CRM returns are determined by the corresponding QQQ index; the remaining ~25% are determined by the unobserved return series $z$
- Tails: CRM df = 2.89 which indicates significantly thicker tails than QQQ df = 26.56 (matching stylized facts for individual stocks versus indices)

Subsequent posts may consider alternative estimation techniques for this model.

R code for generating two-stage MLE estimation of return decomposition via mixing:

```r
library("MASS")
library("stats")
library("DEoptim")
library("sn")
library("SkewHyperbolic")

normalMixtureIndexDecomp <- function(r, i) {
  # Two-step MLE estimation of return decomposition model, assuming both
  # return distributions are normal.
  #
  # Args:
  #   r: return series being decomposed
  #   i: index series used for decomposition
  #
  # Return value: MLE parameter estimates
  z <- r - i
  id <- fitdistr(i, "normal")$estimate
  negLogLikeFun <- function(p) {
    a <- p[1]; mu1 <- p[2]; s1 <- p[3];
    ll <- (-sum(log(a * dnorm(z, mu1, s1) +
                    (1 - a) * dnorm(i, id[1], id[2]))));
    return (ll);
  }
  mle <- DEoptim(negLogLikeFun, c(0, -0.5, 0), c(1, .5, .5),
                 control=list(trace=FALSE))
  cat("normal mix likelihood:", last(mle$member$bestvalit),
      "phi1 params:", id, "params", last(mle$member$bestmemit), "\n")
  mle <- last(mle$member$bestmemit)
  x <- seq(-.25, .25, length.out=500)
  dnorm1 <- dnorm(x, id[1], id[2])
  dnorm2 <- dnorm(x, mle[2], mle[3])   # was dst(); normal mixture uses dnorm
  plot(x, dnorm1, type='l', ylim=c(0, max(dnorm1, dnorm2)), ylab="Density",
       main="Normal Mixture")
  lines(x, dnorm2, col='red')
  abline(v=id[1], lty=2)
  abline(v=mle[2], col='red', lty=2)
  legend("topleft", legend=c("phi1", "phi2"), fill=c("black", "red"), cex=0.5)
  return (mle)
}

mixtureSkewTIndexDecomp <- function(r, i) {
  # Two-step MLE estimation of return decomposition model, assuming both
  # return distributions are skew-t.
  #
  # Args:
  #   r: return series being decomposed
  #   i: index series used for decomposition
  #
  # Return value: MLE parameter estimates
  z <- r - i
  idp <- st.mle(y=i)$dp
  negLogLikeFun <- function(p) {
    a <- p[1]; mu1 <- p[2]; s1 <- p[3]; s2 <- p[4]; df1 <- p[5]
    ll <- (-sum(log(a * dst(z, location=mu1, scale=s1, shape=s2, df=df1) +
                    (1 - a) * dst(i, dp=idp))));
    return (ll);
  }
  mle <- DEoptim(negLogLikeFun, c(0, -0.5, 0, 0, 2), c(1, .5, .5, 5, 50),
                 control=list(trace=FALSE))
  cat("skew-t mix likelihood:", last(mle$member$bestvalit),
      "phi1 params:", idp, "params", last(mle$member$bestmemit), "\n")
  mle <- last(mle$member$bestmemit)
  x <- seq(-.25, .25, length.out=500)
  dst1 <- dst(x, dp=idp)
  dst2 <- dst(x, dp=mle[2:5])
  plot(x, dst1, type='l', ylim=c(0, max(dst1, dst2)), ylab="Density",
       main="Skew T Mixture")
  lines(x, dst2, col='red')
  abline(v=idp[1], lty=2)
  abline(v=mle[2], col='red', lty=2)
  legend("topleft", legend=c("phi1", "phi2"), fill=c("black", "red"), cex=0.5)
  return (mle)
}

mixtureSkewHypTIndexDecomp <- function(r, i) {
  # Two-step MLE estimation of return decomposition model, assuming both
  # return distributions are skew hyperbolic student-t.
  #
  # Args:
  #   r: return series being decomposed
  #   i: index series used for decomposition
  #
  # Return value: MLE parameter estimates
  z <- r - i
  iparam <- skewhypFit(i, plots=FALSE, printOut=FALSE)$param
  negLogLikeFun <- function(p) {
    a <- p[1];
    ll <- (-sum(log(a * dskewhyp(z, param=p[2:5]) +
                    (1 - a) * dskewhyp(i, param=iparam))));
    return (ll);
  }
  mle <- DEoptim(negLogLikeFun, c(0, -5, 0, -0.5, 0), c(1, 5, .5, -0.5, 5),
                 control=list(trace=FALSE))
  cat("skew-hyp-t likelihood:", last(mle$member$bestvalit),
      "phi1 params:", iparam, "params", last(mle$member$bestmemit), "\n")
  mle <- last(mle$member$bestmemit)
  x <- seq(-.25, .25, length.out=500)
  dskewhyp1 <- dskewhyp(x, param=iparam)
  dskewhyp2 <- dskewhyp(x, param=mle[2:5])
  plot(x, dskewhyp1, type='l', ylim=c(0, max(dskewhyp1, dskewhyp2)),
       ylab="Density", main="Skew Hyperbolic Student-T")
  lines(x, dskewhyp2, col='red')
  abline(v=iparam[1], lty=2)
  abline(v=mle[2], col='red', lty=2)
  legend("topleft", legend=c("phi1", "phi2"), fill=c("black", "red"), cex=0.5)
  return (mle)
}

doReturnDecomp <- function(p) {
  # Decompose return of two series, using several parametric distributions.
  #
  # Args:
  #   p: p[,1] is return being decomposed; p[,2] is index returns
  #
  # Return value: none
  r <- ROC(p[,1], type="discrete", na.pad=FALSE)
  i <- ROC(p[,2], type="discrete", na.pad=FALSE)
  normalMixtureIndexDecomp(r, i)
  mixtureSkewTIndexDecomp(r, i)
  mixtureSkewHypTIndexDecomp(r, i)
}
```

Moreover, it is misunderstood—even by many who have smelled it up close personally via big trading losses on hedged positions. Aaron Brown's most recent text, Red-Blooded Risk, explains why.

In doing so, it is *simultaneously brilliant and flawed*. For the former, Brown deserves credit; for the latter, the publisher presumably deserves most of the blame.

First, the brilliance; summarized in one word, with two intended meanings: *pragmatics*. Oh yeah, the book also includes manga-style comic strips, helpfully provided as idiot self-detectors.

First, the well-known meaning is the philosophical tradition linking practice and theory. Arguably unique for risk books, Brown builds deep intuition around the concept of risk and its manifestation, from structural to marked positions to historical roots in tulips. Brown has clearly lived and breathed risk management for many years (self-proclaimed from its modern origin in the 1980s), and that wisdom shines through via first-person prose combining insight, intuition, arrogance, pride, greed, humility, regret, condescension, and insecurity. Perhaps the first book ever to make VaR sound geek-sexy—*with nary a technical definition*.

Second, the lesser-known meaning is the subfield of linguistics which investigates the ways in which context contributes to meaning. Unlike technical risk texts, Brown spends much of the book diving deep into context and letting meaning emanate from there. As he states, "different aspects are easier to understand from different vantages" (p. 57). For a reader who never personally worked at a big bank, this is deeply informative for attuning mental models (akin to Harris for trading mechanics, Rebonato for derivatives, and Taleb for hedging). Explaining the lineage of quant hedge funds through the corresponding frequentist versus Bayesian disposition of their founders is fascinating, and indeed makes sense in retrospect. He slices away the credibility mystique and exposes the raw underbelly of banks, down to explaining front, middle, and back office in depth. For readers familiar with disciplined metrics-driven tech companies, the apparent intellectual and technical sloppiness of big banks is simply jaw-dropping.

One excerpt simply must be quoted, from the beginning of Chapter 4, as it is perhaps the most accurate and beautiful summary of the self-entitlement believed by high-IQ geeks since time immemorial:

The rocket scientists came together on Wall Street in the 1980s and began the process that eventually explained the modern concept of probability and reconstructed the global financial system. We were not individually ambitious. All we wanted was to make more money than any rational person could spend, without ever putting on a tie or being polite to anyone we didn’t like. We didn’t have any use for the money, except for maybe some books and cool computer equipment. We didn’t want to throw (or go to) fancy parties or buy political power—and we didn’t spend it on cars, jewelry, or places to live, and least of all on clothes. We’d probably give the money away, but until then, it would give us the power to say “f- you” to anyone, except that we were mostly pretty soft-spoken and civil in our expressions.

Now, the flawed parts.

One reviewer caveat worth mentioning in advance: 9 out of 10 books read by Quantivity are dense with math, code, or both. On the positive side, this indicates Brown's book is noteworthy due to its statistical abnormality (given it has neither); on the negative side, any review here is biased through such a lens.

First, *lots* of effort was expended by the publisher making this text appeal to a mass audience, clearly rushing to fill the void of a perceived post-financial-crisis publishing opportunity. From the ridiculous title (and cover), to the silly use of the "secret history" meme, to the hilarious tongue-in-cheek back-cover reviewer comments by Gatheral, Taleb, Wilmott, and Thorp. Parts of the book are prone to hyperbole, and read like they were edited in for sales effect. While these nuisances detract from credibility, they can be ignored and arguably contribute the positive benefit of reducing its purchase price to mass market (*i.e.* under $25).

Second, the book lacks unifying organization. While Brown provides the following disclaimer for such, the editor equally deserves some blame (p. 57):

If I had all the theory worked out, I could write a textbook organized in logical sequence. Instead, I’m going to intersperse theoretical discussions with accounts of the development of the ideas.

While this makes sense, it is admittedly a bit jarring to see that disclaimer juxtaposed alongside the supposition of having "explained the modern concept of probability". Thus, the reader is left wondering whether either of the following may be true:

- Brown has a theory of risk, but it was refused in editing due to an overabundance of equations
- Brown has a new theory of probability, but could not muster a theory of risk

Either is intriguing, although the former seems more likely, as Brown includes the following tongue-in-cheek disclaimer regarding the tiny bit of high school-level math included in Chapter 5 (p. 73):

Warning, this chapter contains a little math. It’s nothing intimidating, mostly multiplication and some simple algebra, but I know a lot of people don’t like it. If that describes you, I urge you to read the chapter anyway. It’s one of the most important in the book. You can skip the math and get the ideas anyway.

Having never met Brown, and thus being unfamiliar with his personality, one cannot escape the sense that he is making gentle fun of readers who possess such bias. Either way, it's amusing.

While modest disorganization is a textual flaw, astute readers may perceive it as *subtle financial opportunity*: if such a theory were sufficiently well-defined to warrant standard textbook treatment, then there would undoubtedly be much less juice left in doing it really well.

In either case, the remedy for this shortcoming is to read the book in as few distinct sittings as possible. Having read it in two sittings, the wisdom was able to percolate together and expand personal mental models nicely. Well worth the read.


This technique finds use surprisingly often in quant models.

Ongoing analysis and trading based on proxy hedging, exemplified by the series beginning with Proxy / Cross Hedging, suggests potential for an equity decomposition model based on the relationship between the returns of a stock $r$ and its corresponding index $i$:

To explain this model, let’s build it up from intuition.

To begin, consider a trading observation: interday returns of individual stocks have a subtle relationship with their corresponding index. On some days, the return for a given stock follows its index; on other days, the returns of stock and index diverge strongly. This distinction in behavior is commonly attributed to stock-specific "news", interpreted broadly—whether known publicly or only privately.

This intuition can be formalized into a two-state regime model:

- **Uninformed regime**: stock return follows its index $i$, scaled by a proportional factor
- **Informed regime**: stock return follows an idiosyncratic path $z$, conditionally independent of its index

The relationship between regimes can be modeled in two ways via a regime parameter $\alpha$. A switching model arises when regimes are binary: $\alpha \in \{0, 1\}$. An ensemble model arises when regimes are smooth: $\alpha \in [0, 1]$. For the latter, $\alpha$ can be understood as a proportional decomposition weighting of the respective return series, and thus can provide smooth mixing between the regimes. Finally, the sign of returns is explicitly decomposed, acknowledging the greater regularity of absolute-valued return series.
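As a quick illustration of the difference between switching and ensemble regimes, the following sketch simulates both forms of the mixing $r = \alpha z + (1-\alpha) i$, ignoring the proportional factor and sign decomposition for clarity (all parameter values are illustrative, not estimates):

```r
set.seed(1)
n <- 10000
alpha <- 0.25                      # assumed mixing weight
i <- rnorm(n, 0.0003, 0.01)        # simulated index returns
z <- rnorm(n, 0.0005, 0.02)        # simulated idiosyncratic returns (wider)

# ensemble model: smooth proportional blend each period
rEnsemble <- alpha * z + (1 - alpha) * i

# switching model: binary regime indicator per period
s <- rbinom(n, 1, alpha)
rSwitch <- s * z + (1 - s) * i

# switching preserves more of z's dispersion than the smooth blend
c(sd(rEnsemble), sd(rSwitch))
```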

Worth noting is that the following are *latent* variables: the idiosyncratic path $z$ from the informed regime, the proportional factor, and the regime parameter $\alpha$. Obviously, the challenge of this model lies in their estimation. One potential trick is to exploit triangular relationships, as described below.

One stylized fact *not* explicitly accommodated by this model is the well-known *asymmetry of uninformed regimes*, arising from analysis of market breadth: stocks uniformly go down together (think big down days), but much less often uniformly go up together (the majority of rallies). It is unclear whether this fact arises naturally or needs to be explicitly modeled.

Readers familiar with machine learning (ML) may recognize how to reformulate this as an *additive model*:

$$r = f_1(i) + f_2(z)$$

Where $f_1$ and $f_2$ absorb the proportional factor and regime weighting from the decomposition above.

This model can be interpreted in numerous ML ways, depending on the desired objective. For example, $f_1$ and $f_2$ can be interpreted as basis functions. Alternatively, boosting can be applied by interpreting them as weak classifiers. Graphical models can be applied by introducing conditional dependence between $z$, $i$, and $\alpha$. Hierarchical models and decision trees naturally arise when $f_1$ and $f_2$ are further functionally decomposed.

Given this model, an interesting question is how to use it *predictively*—whether directional or not. For example, combining models for two stocks which share a common index introduces the notion of equity triangle arbitrage on their joint behavior.

Head back to curation and watch new algos emerge on top of that next-gen curation again. Think of Twitter as a new stab at curation. Curated sites will re-seed a new generation of algorithmic search sites. In short, curation is the new search.

Indeed, the intent of curation here is to maintain a *high signal-to-noise ratio* for a mix of preprint and classic works in a *highly-specialized* literature (*i.e.* combo of retail and prop ) for which strong motivation exists elsewhere to obfuscate; and search over the stream provides the ability both to rewind time and to integrate *conceptual connectivity* spanning time.

One addition being contemplated is keyword search over all literature cited in the feed, providing *deep content search* over the feed. Although, it is as yet unclear what the best technical avenue is for implementing this (please comment if you have suggestions).

So, with this positive start, the curation input set is being modestly expanded to coincide with increased personal research activity and the availability of several new quant sources—*while maintaining the same focus and high signal-to-noise goal*. Specifically, curation is expanding to include the following SSRN working papers: ARPM Series and JEL codes G11 (Portfolio Choice), G12 (Asset Pricing), G13 (Contingent Pricing; Futures Pricing), G14 (Information and Market Efficiency), C21 (Cross-Sectional Models), C22 (Time-Series Models), C51 (Model Construction and Estimation), and C53 (Forecasting and Other Model Applications). Selection of JEL codes is data-driven: feed links were ranked by JEL classification, and the most cited classifications were chosen.

Authors are encouraged to use JEL codes correctly, to ensure your articles are picked up.

Curious what readers think: are there other high-value sources worth adding to the curation input set? What else could make this more useful?
