Meta Model Contribution (MMC & BMC)

What is MMC & BMC?

Meta Model Contribution (MMC) is the covariance of a model with the target, after its predictions have been neutralized to the Meta Model. Similarly, Benchmark Model Contribution (BMC) is the covariance of a model with the target, after its predictions have been neutralized to the stake-weighted Benchmark Models.
These metrics tell us how the unique component of a model contributes to the correlation of the Meta Model (or of the Benchmark Models, in the case of BMC). After the model's predictions are neutralized to the Meta Model or Benchmark Models, the covariance of the remaining orthogonal component with the target is that model's contribution.
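Concretely, for a single model's normalized predictions p, the normalized Meta Model m, and centered live targets y (where "normalized" means the tie-kept ranking and gaussianization described under Calculation below), MMC can be written roughly as:

$$
\mathrm{MMC} = \operatorname{Cov}\!\left(p^{\perp},\, y\right), \qquad p^{\perp} = p - \frac{p \cdot m}{m \cdot m}\, m
$$

BMC is the same quantity with m replaced by the stake-weighted combination of the Benchmark Models.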

Calculation

To calculate a user's MMC for a given round, we:
  • Normalize the predictions in their submission
  • Normalize the Meta Model
  • Neutralize their submission to the Meta Model
  • Find the covariance of the neutralized submission with the target
```python
import pandas as pd

# tie_kept_rank, gaussian, orthogonalize, and filter_sort_index are helper
# functions from Numerai's open-source scoring utilities (the numerai-tools
# package); they are assumed to be importable here.
from numerai_tools.scoring import (
    filter_sort_index,
    gaussian,
    orthogonalize,
    tie_kept_rank,
)


def contribution(
    predictions: pd.DataFrame,
    meta_model: pd.Series,
    live_targets: pd.Series,
) -> pd.Series:
    """Calculate the contributive correlation of the given predictions
    wrt the given meta model.

    Contributive correlation is calculated by:
    1. tie-kept ranking each prediction and the meta model
    2. gaussianizing each prediction and the meta model
    3. orthogonalizing each prediction wrt the meta model
    4. multiplying the orthogonalized predictions and the targets

    Arguments:
        predictions: pd.DataFrame - the predictions to evaluate
        meta_model: pd.Series - the meta model to evaluate against
        live_targets: pd.Series - the live targets to evaluate against

    Returns:
        pd.Series - the resulting contributive correlation
                    scores for each column in predictions
    """
    # filter and sort preds, mm, and targets wrt each other
    meta_model, predictions = filter_sort_index(meta_model, predictions)
    live_targets, predictions = filter_sort_index(live_targets, predictions)
    live_targets, meta_model = filter_sort_index(live_targets, meta_model)

    # rank and normalize meta model and predictions so mean=0 and std=1
    p = gaussian(tie_kept_rank(predictions)).values
    m = gaussian(tie_kept_rank(meta_model.to_frame()))[meta_model.name].values

    # orthogonalize predictions wrt meta model
    neutral_preds = orthogonalize(p, m)

    # center the target
    live_targets -= live_targets.mean()

    # multiply target and neutralized predictions
    # this is equivalent to covariance b/c mean = 0
    mmc = (live_targets @ neutral_preds) / len(live_targets)

    return pd.Series(mmc, index=predictions.columns)
```
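As a usage sketch, suppose you have live predictions (one column per model), the Meta Model's live predictions, and the live targets, all indexed by the same ids. The file names and column names below are purely illustrative:

```python
import pandas as pd

# hypothetical files - any source of per-id predictions and targets works
predictions = pd.read_parquet("live_predictions.parquet")  # one column per model
meta_model = pd.read_parquet("meta_model.parquet")["numerai_meta_model"]
live_targets = pd.read_parquet("live_targets.parquet")["target"]

# per-model MMC scores for the round
mmc_scores = contribution(predictions, meta_model, live_targets)
print(mmc_scores)
```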

BMC in Diagnostics

In Diagnostics, BMC is calculated not against the stake-weighted Benchmark Models, but against a single model: the benchmark model with the highest stake (see here; a code sketch follows the list below). There is a difference between BMC on the Leaderboard and BMC in Diagnostics because:
  • The Leaderboard (LB) shows live performance - the BMC calculated here reflects what Numerai and data scientists knew at the time, so it is fair to judge models against the stake-weighted Benchmark Models as they existed then. For early rounds, the stake-weighted benchmark model is just the example predictions from the v2 dataset.
  • Diagnostics shows validation performance - you might be using better modeling techniques with better data and better targets, and it would be misleading to judge you against the example predictions from the v2 dataset. Instead, we judge you against the latest and greatest model we can make.
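In terms of the `contribution` function above, only the series passed as `meta_model` changes between the two settings. The benchmark data below is hypothetical, and `predictions` and `live_targets` are reused from the usage sketch above:

```python
import pandas as pd

# hypothetical inputs: per-id Benchmark Model predictions and illustrative stake weights
benchmark_preds = pd.read_parquet("benchmark_predictions.parquet")   # one column per benchmark model
stake_weights = pd.Series({"benchmark_a": 0.7, "benchmark_b": 0.3})  # illustrative weights

# Leaderboard-style BMC: neutralize to the stake-weighted combination of Benchmark Models
stake_weighted_benchmark = (benchmark_preds[stake_weights.index] * stake_weights).sum(axis=1)
lb_bmc = contribution(predictions, stake_weighted_benchmark, live_targets)

# Diagnostics-style BMC: neutralize to the single highest-staked Benchmark Model
diagnostics_bmc = contribution(predictions, benchmark_preds["benchmark_a"], live_targets)
```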
If you still aren't sure why BMC is different between the LB and diagnostics, please take a look at our target ensemble notebook which touches on these ideas.

Discussion

Read more about MMC & BMC here.