Overview
Everything you need to know to get started in under 60 seconds!
Numerai is a data science competition where you build machine learning models to predict the stock market.
Start with our free dataset made of clean and regularized financial data.
The dataset is obfuscated so that it can be given out for free and modeled without any financial domain knowledge.

Numerai's obfuscated dataset
Each row in the dataset corresponds to a stock at a specific point in time, represented by the era. The features are quantitative attributes (e.g. P/E ratio) known about the stock at the time, and the target is a measure of stock market returns 20 days into the future. Your objective is to build machine learning models to predict the target.
Here is an example model in Python using LightGBM, but you can use any language or framework that you like.
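import lightgbm as lgb

# training_data: a pandas DataFrame of the training set, loaded beforehand
model = lgb.LGBMRegressor(
    n_estimators=2000,
    learning_rate=0.01,
    max_depth=5,
    num_leaves=2 ** 5,
    colsample_bytree=0.1,
)
# Train on every column whose name contains "feature"
model.fit(
    training_data[[f for f in training_data.columns if "feature" in f]],
    training_data["target"],
)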
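To make the row layout concrete, here is a minimal sketch that builds a tiny synthetic table with the shape described above: an era column, a few feature columns, and a target. The values, the era labels, and the column names `feature_a` and `feature_b` are made up for illustration and are not real Numerai data. It reuses the column selection rule from the model example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical toy data: 3 eras x 4 stocks, 2 features. Not real Numerai data.
n_eras, n_stocks = 3, 4
n_rows = n_eras * n_stocks
toy = pd.DataFrame({
    "era": np.repeat([f"{i:04d}" for i in range(1, n_eras + 1)], n_stocks),
    "feature_a": rng.random(n_rows),
    "feature_b": rng.random(n_rows),
    "target": rng.random(n_rows),
})

# Same selection rule as the model example: any column with "feature" in its name.
feature_cols = [c for c in toy.columns if "feature" in c]
X, y = toy[feature_cols], toy["target"]

# Each era is one point in time, so rows group naturally by era.
rows_per_era = toy.groupby("era").size()
print(rows_per_era)
```

Grouping by era like this is also a convenient starting point for holding out whole eras when validating a model, since rows within an era share the same point in time.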
Every business day, new live features are released that represent the current state of the stock market. Your job is to generate live predictions and submit them to Numerai.
Here is an example of how you generate and upload live predictions in Python:
import numerapi
import pandas as pd

# Authenticate
napi = numerapi.NumerAPI("api-public-id", "api-secret-key")
# Get current round
current_round = napi.get_current_round()
# Download latest live features
napi.download_dataset(f"v4.1/live_{current_round}.parquet")
live_data = pd.read_parquet(f"v4.1/live_{current_round}.parquet")
live_features = live_data[[f for f in live_data.columns if "feature" in f]]
# Generate live predictions
live_predictions = model.predict(live_features)
# Format submission
submission = pd.Series(live_predictions, index=live_features.index).to_frame("prediction")
submission.to_csv(f"prediction_{current_round}.csv")
# Upload submission
napi.upload_predictions(f"prediction_{current_round}.csv", model_id="your-model-id")
This is what a submission looks like:
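As a rough sketch of the file produced by the formatting step above, the snippet below writes a small submission to an in-memory buffer and prints the CSV text. The ids and prediction values are made up, and the index name `id` is an assumption rather than something stated on this page.

```python
import io

import pandas as pd

# Hypothetical ids and predictions, for illustration only.
live_predictions = [0.48, 0.52, 0.51]
index = pd.Index(["id_a", "id_b", "id_c"], name="id")  # index name is an assumption

# Same formatting step as the upload example: one "prediction" column.
submission = pd.Series(live_predictions, index=index).to_frame("prediction")

# Write to an in-memory buffer instead of a file so the CSV text can be inspected.
buffer = io.StringIO()
submission.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```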

Submissions are scored against two main metrics:
Since the target is a measure of 20-day stock market returns, it takes 20 days for each submission to be scored.
When you are ready and confident in your model's performance, you may stake it with NMR, Numerai's cryptocurrency.
After the 20 days of scoring for each submission, models with positive scores are rewarded with more NMR, while those with negative scores have a portion of their staked NMR burned.
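The reward and burn mechanic can be illustrated with a toy calculation. The function `settle_stake` and its rule (the change equals stake times score) are hypothetical simplifications chosen for illustration; the actual payout formula is not given on this page.

```python
def settle_stake(stake: float, score: float) -> float:
    """Return the stake after one scored submission.

    Toy rule, not Numerai's formula: a positive score adds NMR to the
    stake, and a negative score burns part of it.
    """
    change = stake * score  # assumption: change is proportional to stake and score
    return max(stake + change, 0.0)  # a stake cannot go below zero


rewarded = settle_stake(100.0, 0.02)   # positive score: stake grows
burned = settle_stake(100.0, -0.03)    # negative score: part of the stake is burned
print(rewarded, burned)
```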
Behind the scenes, Numerai combines the predictions of all staked models into the stake-weighted Meta Model, which in turn is fed into the Numerai Hedge Fund for trading.
Staking serves two important functions:
1. "Skin in the game" allows Numerai to trust the quality of staked predictions.
2. Payouts and burns continuously improve the weights of the Meta Model.
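One way to picture a stake-weighted combination is a weighted average in which each model's weight is its share of the total stake. The sketch below assumes exactly that; the model names, stakes, and predictions are made up, and Numerai's actual Meta Model construction may differ, since its steps are not described here.

```python
import pandas as pd

# Hypothetical predictions from three staked models for the same three stocks.
predictions = pd.DataFrame(
    {
        "model_a": [0.2, 0.5, 0.8],
        "model_b": [0.4, 0.5, 0.6],
        "model_c": [0.9, 0.1, 0.5],
    },
    index=["id_a", "id_b", "id_c"],
)

# Hypothetical stakes in NMR for each model.
stakes = pd.Series({"model_a": 10.0, "model_b": 30.0, "model_c": 60.0})

# Each model's weight is its share of the total stake.
weights = stakes / stakes.sum()

# Weighted average across models, one value per stock.
meta_model = predictions.mul(weights, axis=1).sum(axis=1)
print(meta_model)
```

Under this assumption, a model whose stake grows through payouts gains weight in the average, and one whose stake is burned loses weight, which is the sense in which payouts and burns adjust the weights.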