Feature Neutral Correlation

Motivation

Feature neutral correlation (FNC) is the correlation of a model with the target, after its predictions have been neutralized to all of Numerai’s features.
A model that relies too heavily on a small set of features will have a low FNC, even though its raw correlation may still be high in the short term. Such a model is also more likely to burn significantly in the long term.
A model that uses a diverse set of features and is still correlated with the target will have a high FNC, and is more likely to perform consistently over the long term.

Calculation

To calculate a user's FNC for a given round, we:
  • Normalize the predictions in their submission
  • Neutralize their submission to Numerai's features for that round
  • Calculate the Spearman rank-order correlation of their neutralized submission to the target
import numpy as np
import pandas as pd


def calculate_fnc(sub, targets, features):
    """
    Args:
        sub (pd.Series): predictions from the submission
        targets (pd.Series): targets, in the same row order as sub
        features (pd.DataFrame): features, in the same row order as sub
    """

    # Normalize submission: rank the predictions and scale the ranks into (0, 1)
    sub = (sub.rank(method="first").values - 0.5) / len(sub)

    # Neutralize submission to features: subtract the least-squares
    # projection of the predictions onto the feature columns
    f = features.values
    sub -= f.dot(np.linalg.pinv(f).dot(sub))
    sub /= sub.std()

    sub = pd.Series(np.squeeze(sub))  # Convert np.ndarray to pd.Series

    # FNC: Spearman rank-order correlation of neutralized submission to target
    # (np.corrcoef compares the values by position, not by index)
    fnc = np.corrcoef(sub.rank(pct=True, method="first"), targets)[0, 1]

    return fnc
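
The neutralization step can also be checked in isolation. The sketch below is an illustration on synthetic data, not Numerai's own code: `neutralize` is a hypothetical helper that applies the same pseudo-inverse projection used in `calculate_fnc`, and the random features and predictions are assumptions made up for the example. It shows that predictions leaning heavily on one feature end up orthogonal to every feature column after neutralization, while the part of the signal that does not come from the features survives.

```python
import numpy as np


def neutralize(preds: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Hypothetical helper: subtract the least-squares projection of
    preds onto the feature columns (same projection as in calculate_fnc)."""
    return preds - features @ (np.linalg.pinv(features) @ preds)


# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
n_rows, n_features = 1000, 5
features = rng.normal(size=(n_rows, n_features))

# Predictions that lean heavily on feature 0, plus a small signal
# that is independent of all features.
independent = rng.normal(size=n_rows)
preds = 3.0 * features[:, 0] + 0.5 * independent

neutral = neutralize(preds, features)

# Linear exposure to feature 0 before neutralization.
exposure_before = np.corrcoef(preds, features[:, 0])[0, 1]

# Dot product of the neutralized predictions with each feature column.
dot_after = features.T @ neutral

# Correlation between the neutralized predictions and the independent signal.
kept_signal = np.corrcoef(neutral, independent)[0, 1]

print("exposure to feature 0 before:", round(exposure_before, 3))
print("largest |feature . neutralized| after:", np.abs(dot_after).max())
print("correlation with independent signal:", round(kept_signal, 3))
```

In this setup the neutralized predictions keep little beyond the independent signal, which is the part FNC is meant to score.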

Discussion

Read more about feature neutralization and feature exposure here.