
Many trading analysis problems boil down to a combination of regression and ranking, in one guise or another (e.g. classical formulations or minimizing custom loss functions). Yet the relationship between these two techniques is subtle, and their interdependence is subject to myriad practical difficulties. One familiar example is that good performance on one does not imply good performance on the other: excellent regression may result in poor ranking, and vice versa.

KDD-2010 recently included a paper, Combined Ranking and Regression by Sculley, which describes an approach combining both techniques by simultaneously optimizing dual objective functions. Specifically, from p. 1:

Model that performs well on two distinct families of metrics. The first set of metrics are regression based metrics, such as Mean Squared Error, which reward a model for predicting a numerical value y′ that is near to the true target value y for a given example, and penalize predictions far from y. The second set are rank-based metrics, such as Area under the ROC curve (AUC), which reward a model for producing predicted values with the same pairwise ordering y′1 > y′2 as the true values y1 > y2 for a pair of given examples.
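To make the two families of metrics concrete, here is a minimal Python sketch with made-up numbers (the data and the helper names `mean_squared_error` and `pairwise_accuracy` are illustrative assumptions, not taken from the paper). It computes MSE and the fraction of correctly ordered pairs, a rank metric that coincides with AUC when the targets are binary, and shows that the two can disagree sharply:

```python
from itertools import combinations

def mean_squared_error(y_true, y_pred):
    """Regression metric: average squared distance between prediction and target."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pairwise_accuracy(y_true, y_pred):
    """Rank metric: fraction of pairs with distinct targets whose predicted
    order matches the true order. Tied predictions count as half correct."""
    correct, total = 0.0, 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # no true ordering to recover for this pair
        total += 1
        true_diff = y_true[i] - y_true[j]
        pred_diff = y_pred[i] - y_pred[j]
        if pred_diff == 0:
            correct += 0.5
        elif (true_diff > 0) == (pred_diff > 0):
            correct += 1.0
    return correct / total

# Hypothetical targets and two hypothetical sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
close_but_misordered = [2.6, 2.5, 2.4, 2.3]    # near the targets, order reversed
far_but_ordered = [10.0, 20.0, 30.0, 40.0]     # far from the targets, order preserved

for name, y_pred in [("close_but_misordered", close_but_misordered),
                     ("far_but_ordered", far_but_ordered)]:
    print(name,
          "MSE:", mean_squared_error(y_true, y_pred),
          "pairwise accuracy:", pairwise_accuracy(y_true, y_pred))
```

The first set of predictions has by far the lower MSE yet orders every pair incorrectly, while the second orders every pair correctly with a very large MSE.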

The purported benefits of this combined approach are particularly interesting for financial data:

• Stability: Guards against learning degenerate models that perform well on one set of metrics but poorly on another
• Non-normal distributions: Improves regression performance in the case of rare events, including long-tailed and extreme minority class distributions

The optimization objective, equation (3) on p. 3, is:

$\min_{{\bf w} \in \mathbb{R}^m} [ \alpha L({\bf w}, D) + (1 - \alpha) L({\bf w}, P) + \frac{\lambda}{2} ||{\bf w}||^2_2 ]$

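Here $L({\bf w}, D)$ is the regression loss over the data set $D$, $L({\bf w}, P)$ is the ranking loss over the set of candidate pairs $P$, and $\alpha \in [0, 1]$ is the loss weight parameter: $\alpha = 1$ recovers pure regression and $\alpha = 0$ pure ranking. The algorithm uses stochastic gradient descent.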
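The objective above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: squared loss is assumed for both terms, the learning rate is held constant, and the weighting by $\alpha$ is realized by choosing at random, with probability $\alpha$, whether each step uses a single example or a pair. The function names (`train_combined`, `sgd_step`) and the toy data are hypothetical.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def sgd_step(w, x, y, lr, lam):
    """One SGD step on 0.5 * (w.x - y)^2 + (lam / 2) * ||w||^2."""
    err = dot(w, x) - y
    return [wi - lr * (err * xi + lam * wi) for wi, xi in zip(w, x)]

def train_combined(X, y, alpha=0.5, lam=1e-4, lr=0.05, steps=5000, seed=0):
    """Assumed scheme: with probability alpha take a regression step on one
    example, otherwise take a ranking step on the difference of a random pair."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    w = [0.0] * m
    for _ in range(steps):
        if rng.random() < alpha:
            # Regression step: fit the target value of a single example.
            i = rng.randrange(n)
            w = sgd_step(w, X[i], y[i], lr, lam)
        else:
            # Ranking step: fit the target difference of a candidate pair.
            i, j = rng.randrange(n), rng.randrange(n)
            if y[i] == y[j]:
                continue  # pairs with equal targets carry no ordering information
            x_diff = [a - b for a, b in zip(X[i], X[j])]
            w = sgd_step(w, x_diff, y[i] - y[j], lr, lam)
    return w

# Toy, noise-free data (an assumption for illustration): y = 2*x1 - x2.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(50)]
y = [2 * a - b for a, b in X]

w = train_combined(X, y, alpha=0.5)
print("learned weights:", w)
```

Because each step is a regression step with probability $\alpha$ and a ranking step otherwise, the expected update follows the $\alpha$-weighted sum of the two loss gradients, with the regularization term applied in both cases. On this noise-free toy data both losses share the same minimizer, so any $\alpha$ should land near the same weights; the trade-off only becomes visible on noisy or skewed data.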

2 Comments
1. September 6, 2010 5:17 pm

Interesting paper. It's great to see you posting again.