Numerai Tournament Overview
The official rules and getting started guide to the Numerai Tournament
The Numerai Tournament is where you build machine learning models on abstract financial data to predict the stock market. Your models can be staked with the NMR cryptocurrency to earn rewards based on performance.
The staked models of Numerai are combined to form the Meta Model which controls the capital of the Numerai hedge fund across the global stock market.
- 2. Download the dataset with training data and example scripts
- 3. Build your model and submit your predictions back to Numerai
- 4. Stake NMR on your models to earn/burn based on performance
- 5. Automate your weekly submissions and grow your stake over time
At the core of the Numerai Tournament is the free dataset. It is made of high-quality financial data that has been cleaned, regularized, and obfuscated.
Each id corresponds to a stock at a specific time.
The features describe the various quantitative attributes of the stock at the time.
The target represents an abstract measure of performance ~4 weeks into the future. Visit numer.ai/data for details about the newest available dataset and how to download it.
Your objective is to build a model to predict the future target using live features that correspond to the current stock market.
Here is a basic example using XGBoost in Python. We train the model using the historical training data, and make predictions on the live tournament data.
import pandas as pd
from xgboost import XGBRegressor

# training data contains features and targets
training_data = pd.read_csv("numerai_training_data.csv").set_index("id")

# tournament data contains features only
tournament_data = pd.read_csv("numerai_tournament_data.csv").set_index("id")
feature_names = [f for f in training_data.columns if "feature" in f]

# train a model to make predictions on tournament data
# (the target column is assumed to be named "target")
model = XGBRegressor(max_depth=5, learning_rate=0.01)
model.fit(training_data[feature_names], training_data["target"])

# generate predictions and save them for submission to numer.ai
predictions = model.predict(tournament_data[feature_names])
tournament_data["prediction"] = predictions
tournament_data["prediction"].to_csv("predictions.csv")
You can use any language or framework to build your model.
Our example-scripts have more advanced modeling ideas using the latest data available. Also check out the forum for the latest research topics from the team and community.
You can use the diagnostics tool to understand the performance and risk characteristics of your model over the historical validation eras in the dataset.
Using this historical evaluation tool repeatedly can quickly lead to overfitting. Treat diagnostics only as a final check in your model research process.
Every Tuesday, Wednesday, Thursday, Friday, and Saturday, a new round opens and new tournament data is released. To participate in the round you need to download the latest tournament data, generate new predictions, and upload those predictions back to Numerai.
Rounds open no earlier than 13:00 UTC. Weekday submission windows are open for 1 hour. Weekend windows open Saturday and close on Monday at 14:00 UTC.
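The schedule above can be sketched in a few lines of Python. This is only an illustration, not an official tool: the function names are hypothetical, and the 13:00 UTC opening time in the example is an assumption, since rounds are only guaranteed to open no earlier than that.

```python
from datetime import datetime, timedelta, timezone

# datetime.weekday(): Monday=0 ... Sunday=6; rounds open Tuesday through Saturday.
ROUND_OPEN_WEEKDAYS = {1, 2, 3, 4, 5}
SATURDAY = 5


def round_opens_on(day: datetime) -> bool:
    """Return True if a new round opens on the given UTC date."""
    return day.weekday() in ROUND_OPEN_WEEKDAYS


def submission_deadline(opened_at: datetime) -> datetime:
    """Return the window close time for a round that opened at `opened_at` (UTC)."""
    if not round_opens_on(opened_at):
        raise ValueError("no round opens on this day")
    if opened_at.weekday() == SATURDAY:
        # Weekend window: closes the following Monday at 14:00 UTC.
        monday = opened_at + timedelta(days=2)
        return monday.replace(hour=14, minute=0, second=0, microsecond=0)
    # Weekday window: open for one hour.
    return opened_at + timedelta(hours=1)


# Hypothetical round assumed to open on Saturday 2023-01-07 at 13:00 UTC.
opened = datetime(2023, 1, 7, 13, 0, tzinfo=timezone.utc)
deadline = submission_deadline(opened)
print(deadline.isoformat())
```

A scheduler built on this idea should still check the actual round status through the API, because the real opening time can be later than 13:00 UTC.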
You can use our GraphQL API or our Python and R API clients to download the dataset and upload your predictions. Here is a basic example in Python.
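from numerapi import NumerAPI

# authenticate with your API keys (placeholders shown)
napi = NumerAPI("public_id", "secret_key")

# download data; the filename is a placeholder, call napi.list_datasets() for current names
napi.download_dataset("v4.1/live.parquet", "live.parquet")

# upload predictions (for example, the live_example_preds, but formatted as a csv)
model_id = napi.get_models()["your_model_name"]
napi.upload_predictions("predictions.csv", model_id=model_id)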
Once you have your model pipeline working, you can deploy it to AWS using the Numerai Compute framework to automatically participate in every round.
Submissions are scored against the live target in a number of ways, including the corr and tc scores that determine payouts.
Your model's live scores can be viewed publicly on its model profile page. Here is an example of a model's final scores over the past 20 rounds.
The live target of the round is constructed using 20 days of returns skipping the first two days of returns, also known as 20D2L. On each of those 20 days, Numerai will compute a daily update of your submission's score. Here is an example of a submission's 20 score updates within a single round.
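To make the 20D2L window concrete, here is a minimal sketch that compounds 20 daily returns after skipping the first two. It rests on two assumptions: that the 20 days are counted after the 2-day lag, and that raw returns are simply compounded. The actual target is an abstract, processed measure, so this shows only the windowing idea, and `lagged_return` is a hypothetical helper.

```python
from math import prod


def lagged_return(daily_returns, horizon=20, lag=2):
    """Compound `horizon` daily returns after skipping the first `lag` days."""
    window = daily_returns[lag:lag + horizon]
    if len(window) < horizon:
        raise ValueError("not enough daily returns for the requested window")
    return prod(1.0 + r for r in window) - 1.0


# 22 days of made-up returns: two skipped days, then 20 days of +1%.
returns = [0.10, -0.10] + [0.01] * 20
target_return = lagged_return(returns)
print(round(target_return, 4))
```

The large moves on the first two days do not affect the result, which is the point of the lag.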
You can optionally stake NMR cryptocurrency on your model to earn payouts based on your model's scores.
In order to qualify for payouts in a round, you must be staked at the submission deadline of that round.
Once NMR is staked, it will remain locked until you release it. Staked NMR can only be released after the resolution of any ongoing rounds. While pending release, NMR will not count towards upcoming rounds.
There are a few advanced options that you can also configure on your stake, like your payout mode, which determines where your payout goes, and your score multipliers, which determine how much each score impacts your payouts.
Payouts are a function of your stake value and scores. The higher your stake value and the higher your scores, the more you will earn. If you have a negative score, then a portion of your stake will be burned.
payout = at_risk_stake * MAX(-0.25, MIN(0.25, payout_factor * (corr * corr_multiplier + tc * tc_multiplier)))
at_risk_stake is the value of your stake at the round's submission deadline.
The maximum combined score per round is clamped at ±0.25.
payout_factor is a number that scales with the total NMR staked across all models in the tournament. The higher the total NMR staked above the 360K threshold, the lower the payout factor.
corr_multiplier and tc_multiplier are configured by you to control your exposure to each score. You are given the following multiplier options.
corr multiplier options
tc multiplier options
0.0x, 0.5x, 1.0x, 2.0x
The payout factor curve and available multiplier options may be updated by Numerai in the future alongside major tournament releases.
Here are some example payout calculations. The first 2 examples show the impact of adjusting score multipliers. The 3rd example shows how a negative score can cause a burn. The 4th example shows how the payout is capped at ±25% of the stake value after the payout factor is applied.
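The payout formula can also be written as a small Python function. This is a sketch that follows the formula as stated; the stake, scores, and payout factor below are made-up numbers chosen for illustration, not real tournament values.

```python
def clamp(value, low, high):
    """Limit `value` to the closed interval [low, high]."""
    return max(low, min(high, value))


def payout(at_risk_stake, corr, tc, corr_multiplier, tc_multiplier, payout_factor):
    """Payout in NMR for one round; a negative result is a burn."""
    combined = corr * corr_multiplier + tc * tc_multiplier
    return at_risk_stake * clamp(payout_factor * combined, -0.25, 0.25)


# Hypothetical inputs: 100 NMR at risk, payout factor assumed to be 1.0.
base = payout(100, corr=0.05, tc=0.02, corr_multiplier=1.0, tc_multiplier=1.0, payout_factor=1.0)
more_tc = payout(100, corr=0.05, tc=0.02, corr_multiplier=1.0, tc_multiplier=2.0, payout_factor=1.0)
burn = payout(100, corr=-0.04, tc=-0.01, corr_multiplier=1.0, tc_multiplier=1.0, payout_factor=1.0)
capped = payout(100, corr=0.30, tc=0.10, corr_multiplier=1.0, tc_multiplier=2.0, payout_factor=1.0)
print(base, more_tc, burn, capped)
```

With these inputs, the first two calls differ only in the tc multiplier, the third yields a burn, and the fourth hits the 25% cap.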
With every daily score update, a new daily update on your payout is also computed. These daily payouts are also just updates and only the final payout of a round counts.
Your stake value will grow as long as you continue to have positive scores. Here are some example payout projections assuming that the model gets the same positive scores every round for 52 rounds.
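A projection of this kind can be approximated with simple compounding. The sketch below assumes that every payout is added back to the stake and that rounds resolve one after another. Both are simplifications, since real rounds overlap and the payout factor changes over time. `project_stake` and the 1% per-round rate are hypothetical.

```python
def project_stake(initial_stake, per_round_rate, rounds=52):
    """Return the stake after each round, compounding a fixed per-round rate."""
    # The per-round rate cannot exceed the ±25% clamp on payouts.
    rate = max(-0.25, min(0.25, per_round_rate))
    stake = initial_stake
    history = []
    for _ in range(rounds):
        stake += stake * rate
        history.append(stake)
    return history


# Hypothetical model earning 1% of its stake every round for 52 rounds.
projection = project_stake(100.0, 0.01)
print(round(projection[-1], 2))
```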
The leaderboard can be sorted by the reputation of a model's TC. Reputation is the weighted average of a given metric over the past 20 rounds.
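As a rough illustration of a weighted average over the past 20 rounds, consider the sketch below. The weighting scheme is not specified here, so the function takes the weights as an argument and falls back to equal weights, which is purely an assumption. `reputation` is a hypothetical helper, not part of any Numerai client.

```python
def reputation(scores, weights=None, window=20):
    """Weighted average of the most recent `window` scores (oldest first)."""
    recent = scores[-window:]
    if not recent:
        raise ValueError("at least one score is required")
    if weights is None:
        # Assumption: equal weights when no weighting scheme is given.
        weights = [1.0] * len(recent)
    if len(weights) != len(recent):
        raise ValueError("weights must match the number of scores used")
    total = sum(weights)
    return sum(s * w for s, w in zip(recent, weights)) / total


# Made-up TC scores for 25 rounds; only the last 20 count.
tc_scores = [0.05] * 5 + [0.01] * 20
print(round(reputation(tc_scores), 4))
```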
Keep an eye on the leaderboard to see how your models compare to all other models in terms of performance and returns from staking.
We are here to help.