# Meta Model Contribution (MMC & BMC)

Meta Model Contribution (MMC) is the covariance of a model with the target, after its predictions have been neutralized to the Meta Model. Similarly, Benchmark Model Contribution (BMC) is the covariance of a model with the target, after its predictions have been neutralized to the stake-weighted Benchmark Models.

These metrics tell us how the unique component of a model contributes to the correlation of the Meta Model (or the Benchmark Models in the case of BMC). After the model's predictions are neutralized against the Meta Model or Benchmark Models, the covariance of the remaining orthogonal component with the target is that model's contribution.
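To make the "orthogonal component" concrete, here is a minimal NumPy sketch under stated assumptions. The data is synthetic, `neutralize` is a hypothetical helper name rather than a Numerai function, and plain standardization stands in for the rank-and-gaussianize normalization used in real scoring. It removes the projection of the predictions onto the Meta Model and takes the covariance of what remains with the centered target.

```python
import numpy as np


def neutralize(p: np.ndarray, m: np.ndarray) -> np.ndarray:
    """Remove from p its projection onto m (hypothetical helper)."""
    return p - m * (m @ p) / (m @ m)


rng = np.random.default_rng(0)
n = 1000
y = rng.standard_normal(n)                      # synthetic target
m = y + rng.standard_normal(n)                  # synthetic "meta model"
p = 0.5 * m + 0.1 * y + rng.standard_normal(n)  # synthetic model predictions

# standardize to mean=0 and std=1 (assumption: stands in for rank + gaussianize)
m = (m - m.mean()) / m.std()
p = (p - p.mean()) / p.std()

# the part of p that the meta model cannot explain
residual = neutralize(p, m)

# covariance of the residual with the centered target
contribution = ((y - y.mean()) @ residual) / n
```

The residual has zero dot product with `m`, so a submission that is an exact copy of the Meta Model has nothing left after neutralization and contributes nothing.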

To calculate a user's MMC for a given round, we:

- Normalize the predictions in their submission
- Normalize the Meta Model
- Neutralize their submission to the Meta Model
- Find the covariance of the neutral submission with the target

```python
import pandas as pd

# filter_sort_index, tie_kept_rank, gaussian, and orthogonalize are helper
# functions assumed to be defined elsewhere (they are not shown in this section)


def contribution(
    predictions: pd.DataFrame,
    meta_model: pd.Series,
    live_targets: pd.Series,
) -> pd.Series:
    """Calculate the contributive correlation of the given predictions
    wrt the given meta model.

    Contributive correlation is calculated by:
    1. tie-kept ranking each prediction and the meta model
    2. gaussianizing each prediction and the meta model
    3. orthogonalizing each prediction wrt the meta model
    4. multiplying the orthogonalized predictions and the targets

    Arguments:
        predictions: pd.DataFrame - the predictions to evaluate
        meta_model: pd.Series - the meta model to evaluate against
        live_targets: pd.Series - the live targets to evaluate against

    Returns:
        pd.Series - the resulting contributive correlation
                    scores for each column in predictions
    """
    # filter and sort preds, mm, and targets wrt each other
    meta_model, predictions = filter_sort_index(meta_model, predictions)
    live_targets, predictions = filter_sort_index(live_targets, predictions)
    live_targets, meta_model = filter_sort_index(live_targets, meta_model)

    # rank and normalize meta model and predictions so mean=0 and std=1
    p = gaussian(tie_kept_rank(predictions)).values
    m = gaussian(tie_kept_rank(meta_model.to_frame()))[meta_model.name].values

    # orthogonalize predictions wrt meta model
    neutral_preds = orthogonalize(p, m)

    # center the target (reassign rather than mutate the caller's Series)
    live_targets = live_targets - live_targets.mean()

    # multiply target and neutralized predictions
    # this is equivalent to covariance b/c mean = 0
    mmc = (live_targets @ neutral_preds) / len(live_targets)

    return pd.Series(mmc, index=predictions.columns)
```
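The function above calls four helpers that this section does not define. The sketch below gives possible stand-ins so the normalization and neutralization steps can be tried on toy data. The function names match the calls above, but the bodies are assumptions written for illustration, not the confirmed implementations used in Numerai scoring. The toy ids, column names, and values are made up.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def filter_sort_index(a, b):
    """Keep only the ids present in both inputs, in the same sorted order."""
    ids = a.index.intersection(b.index).sort_values()
    return a.loc[ids], b.loc[ids]


def tie_kept_rank(df: pd.DataFrame) -> pd.DataFrame:
    """Rank each column into (0, 1); tied values share their average rank."""
    return (df.rank(method="average") - 0.5) / df.count()


def gaussian(df: pd.DataFrame) -> pd.DataFrame:
    """Map values in (0, 1) through the standard normal inverse CDF."""
    inv_cdf = NormalDist().inv_cdf
    return df.apply(lambda col: col.map(inv_cdf))


def orthogonalize(v: np.ndarray, u: np.ndarray) -> np.ndarray:
    """Subtract from each column of the 2D array v its projection onto u."""
    return v - np.outer(u, u @ v) / (u @ u)


# toy data (made up): two models and a meta model over five ids
preds = pd.DataFrame(
    {"model_a": [0.1, 0.4, 0.4, 0.9, 0.7], "model_b": [0.8, 0.2, 0.5, 0.3, 0.6]},
    index=["id1", "id2", "id3", "id4", "id5"],
)
meta = pd.Series([0.2, 0.3, 0.5, 0.8, 0.6], index=preds.index, name="meta_model")

ranked = tie_kept_rank(preds)
p = gaussian(ranked).values
m = gaussian(tie_kept_rank(meta.to_frame()))[meta.name].values
neutral = orthogonalize(p, m)
```

In this sketch the two tied values in `model_a` receive the same rank, and every column of `neutral` has a zero dot product with `m`, which is the property the covariance step relies on.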

In Diagnostics, BMC is calculated not against the stake-weighted Benchmark Models, but instead against a single model - the benchmark model with the highest stake (see here). There is a difference between BMC on the Leaderboard and BMC in Diagnostics because:

- The Leaderboard (LB) shows **live** performance. The BMC calculated here is in the context of what Numerai and data scientists knew at the time, so it's fair to judge models against the stake-weighted benchmark models **at the time**. For early rounds, the stake-weighted benchmark model is just the example predictions from the v2 dataset.
- Diagnostics shows **validation** performance. You might be using better modeling techniques with better data and better targets, so it would be misleading to judge you against example predictions from the v2 dataset. Instead, we should judge you against the latest and greatest model we can make.

If you still aren't sure why BMC differs between the LB and Diagnostics, please take a look at our target ensemble notebook, which touches on these ideas.
