A new round of the Numerai competition opens every week. Each round runs for about one month (31 days), so several rounds are in progress at any given time.
New round opens - every week on Saturday
Submission and staking window - first two (2) days, i.e. Saturday and Sunday
Submission only window - first seven (7) days
Round resolution - after 31 days
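The schedule above can be turned into concrete dates. The following is a minimal sketch, not an official tool: the function name `round_schedule` and the example opening date are made up for illustration, and exact cut-off times within a day are not specified here, so only calendar days are computed.

```python
from datetime import date, timedelta

def round_schedule(open_day: date) -> dict:
    """Return key calendar days for a round opening on `open_day` (hypothetical helper)."""
    if open_day.weekday() != 5:  # Monday is 0, so Saturday is 5
        raise ValueError("rounds open on a Saturday")
    return {
        "opens": open_day,
        # Staking is possible on the first two days: Saturday and Sunday.
        "last_staking_day": open_day + timedelta(days=1),
        # Submissions are accepted during the first seven days.
        "last_submission_day": open_day + timedelta(days=6),
        # The round resolves 31 days after it opens.
        "resolves": open_day + timedelta(days=31),
    }

# Example with an arbitrary Saturday chosen for illustration.
schedule = round_schedule(date(2019, 1, 5))
for name, day in schedule.items():
    print(name, day.isoformat())
```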
Each new round comes with a unique data zip that contains all data you require to compete on Numerai.
Training data - a CSV file containing rows of features and their binary target values. Use this to train your machine learning model.
Tournament data - a CSV file with the same structure containing rows for validation, test and live. Run your trained model against these rows to generate your predictions.
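As a rough illustration of working with the tournament file, the sketch below splits rows into validation, test and live groups. The column names (`id`, `era`, `data_type`, `feature1`, `target`) and the sample rows are assumptions made for this example; check the header of the actual CSV in the data zip before relying on them.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up sample standing in for the tournament CSV.
# Assumption: targets are blank for test and live rows.
SAMPLE = """id,era,data_type,feature1,target
a1,era121,validation,0.25,1
a2,era121,validation,0.75,0
a3,eraX,test,0.50,
a4,eraX,live,0.50,
"""

def split_by_type(csv_file):
    """Group rows by the (assumed) `data_type` column."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["data_type"]].append(row)
    return groups

groups = split_by_type(io.StringIO(SAMPLE))
counts = {name: len(rows) for name, rows in groups.items()}
print(counts)
```

With a real file, pass an open file object instead of the `io.StringIO` wrapper.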
Click <here> to download the current round tournament data
The dataset we provide is unlike typical data sets you will find in other financial data science competitions. A lot of work has been put into assembling this data set for you. Let's break it down.
Each ID represents an asset and each era represents a unit period of time. There are 310 features grouped in 6 categories. The data has been obfuscated in such a way that certain structural relationships are preserved but the true meaning of the features and targets remains secret.
Once you get to play around with the data more, you may also notice how clean it is. This is because we have already done the tedious cleaning and regularization work for you so you can focus on the modelling.
Now that you are familiar with the data, it's time to make some predictions.
Example models - the data zip includes two basic classifier scripts, in Python and R, to get you started. Try generating predictions with the example models before making your own.
Example predictions - the data zip also includes example prediction files that are ready for submission. Hint: if you are having issues submitting your own predictions, compare them against these examples to make sure they are formatted correctly.
Submitting predictions - submit your predictions via the website or directly through the API. You can update your submission as many times as you like while the submission window is open. However, once you have staked on your submission, you cannot update it.
All predictions submitted to Numerai will be scored publicly.
Your submission will be scored in a few ways.
Correlation - the correlation between your percentile ranked predictions and the targets. Learn more here.
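To make the correlation metric concrete, here is a minimal sketch using only the standard library. It assumes the correlation is a Pearson correlation computed after converting predictions to percentile ranks; the authoritative implementation is the open source scoring code, and the function names and sample values below are illustrative only.

```python
from statistics import mean

def percentile_rank(values):
    """Map values to ranks in (0, 1], giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / len(values)
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def correlation_score(predictions, targets):
    return pearson(percentile_rank(predictions), targets)

# Made-up predictions and binary targets for four rows.
predictions = [0.9, 0.2, 0.6, 0.4]
targets = [1, 0, 1, 0]
score = correlation_score(predictions, targets)
print(round(score, 4))
```

Because predictions are ranked first, any strictly increasing transformation of the predictions yields the same score under this sketch.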
Your live correlation score will be calculated and revealed once the round resolves. This score will be used to determine any payouts or bonuses. Test scores are never revealed to prevent overfitting to the test set.
Leaderboard - doing well consistently will earn you a place on the leaderboard. Users are ranked by reputation: the average correlation score over the past 20 rounds. Rounds with a missing submission are filled with a score of -0.1.
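The reputation rule can be sketched as follows. The round numbers and scores are made up, and `reputation` is a hypothetical helper, not part of any Numerai library.

```python
MISSING_SCORE = -0.1  # score used for rounds without a submission
WINDOW = 20           # number of past rounds averaged

def reputation(scores_by_round, latest_round):
    """Average score over the WINDOW rounds ending at `latest_round`."""
    rounds = range(latest_round - WINDOW + 1, latest_round + 1)
    scores = [scores_by_round.get(r, MISSING_SCORE) for r in rounds]
    return sum(scores) / WINDOW

# A user who scored 0.02 in every round from 101 to 120 but skipped round 110.
scores = {r: 0.02 for r in range(101, 121)}
del scores[110]
rep = reputation(scores, latest_round=120)
print(round(rep, 4))
```

A single missed round costs far more than a typical round's score adds, so submitting every week matters for reputation.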
The code to calculate these metrics is open source. Check it out at numerai/submission-criteria.
Prize pools and bonuses
Prize pool - 2000 NMR
Reputation bonus pool - 500 NMR
Staking bonus pool - 100 NMR
Stake amount - an amount of NMR you are willing to lock up for the duration of the round. All payouts and bonuses are a percentage of your stake amount. So the more you stake, the more you can earn!
Confidence - an estimate of your model's expected correlation score. The higher your confidence, the more likely you are to be selected, but the rounds you are selected in will have higher benchmarks to beat. Confidence must be greater than or equal to the minimum benchmark.
Stake selection - once the staking window closes, Numerai will select a group of stakes with the highest confidences such that the entire prize pool is allocated. Stakes that have been selected will be locked up for the duration of the round and processed for payout or burning during resolution. Stakes that have not been selected will be returned immediately.
Benchmark - the correlation benchmark for the round is determined by the results of the stake selection process. If the entire prize pool is allocated, then the benchmark is set to the lowest confidence in the selected group. Otherwise, it is set to a pre-determined minimum of 0.002.
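Stake selection and the resulting benchmark can be sketched as below. This is an illustration under a stated assumption, not the actual selection code: it assumes each selected stake uses up prize pool equal to its stake amount, and that whole stakes are selected in descending order of confidence until the pool is covered. The stake tuples and user names are made up.

```python
MIN_BENCHMARK = 0.002  # pre-determined minimum benchmark

def select_stakes(stakes, prize_pool):
    """stakes: list of (user, amount_nmr, confidence) tuples.

    Assumption: a stake of N NMR allocates N NMR of the prize pool.
    """
    # Confidence must be at least the minimum benchmark.
    eligible = [s for s in stakes if s[2] >= MIN_BENCHMARK]
    eligible.sort(key=lambda s: s[2], reverse=True)

    selected, allocated = [], 0.0
    for stake in eligible:
        if allocated >= prize_pool:
            break  # pool fully allocated; remaining stakes are returned
        selected.append(stake)
        allocated += stake[1]

    if allocated >= prize_pool:
        benchmark = min(s[2] for s in selected)  # lowest selected confidence
    else:
        benchmark = MIN_BENCHMARK  # pool not fully allocated
    return selected, benchmark

stakes = [("a", 1200, 0.010), ("b", 900, 0.006), ("c", 500, 0.004), ("d", 100, 0.001)]
selected, benchmark = select_stakes(stakes, prize_pool=2000)
print([s[0] for s in selected], benchmark)
```

In this example the stake from "d" is never eligible because its confidence is below the minimum benchmark.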
All prize pools, stake amounts and payouts are denominated in NMR. However, $5000 of the 2000 NMR prize pool will always be paid out in USD. This creates a floor on the USD value of the prize pool while allowing it to grow with the value of NMR.
The reputation bonus works slightly differently. Unlike regular payouts, which depend only on a single round's performance relative to the benchmark, this bonus depends on your 20-week average performance relative to others. After 20 weeks, the top 1000 NMR staked, ranked by reputation, earn an additional 50%.
Payout - ±100% based on short term performance
Reputation bonus - +50% based on long term performance
Staking bonus - +5% just for staking
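The percentages above combine as simple arithmetic on the stake amount. The sketch below assumes an all-or-nothing payout (the full stake is earned when the live correlation is strictly above the benchmark, and burned otherwise) and assumes the staking bonus applies to every selected stake; neither detail is spelled out here, so treat both as assumptions. `round_earnings` is a hypothetical helper.

```python
def round_earnings(stake, live_correlation, benchmark, earns_reputation_bonus=False):
    """Return the NMR gained or lost on a selected stake (illustrative only)."""
    # Payout: +100% of the stake if the benchmark is beaten, -100% (burn) otherwise.
    # Assumption: a tie with the benchmark counts as not beating it.
    payout = stake if live_correlation > benchmark else -stake
    # Staking bonus: +5% just for staking.
    staking_bonus = 0.05 * stake
    # Reputation bonus: +50% for stakes within the top 1000 NMR by reputation.
    reputation_bonus = 0.50 * stake if earns_reputation_bonus else 0.0
    return {
        "payout": payout,
        "staking_bonus": staking_bonus,
        "reputation_bonus": reputation_bonus,
        "total": payout + staking_bonus + reputation_bonus,
    }

win = round_earnings(10, live_correlation=0.010, benchmark=0.006, earns_reputation_bonus=True)
loss = round_earnings(10, live_correlation=0.001, benchmark=0.006)
print(win["total"], loss["total"])
```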