Numerai Signals Overview
The official rules and getting started guide to the Numerai Signals Tournament
Numerai Signals lets you upload stock market signals and find out how original they are compared to all other signals on Numerai. Signals can be staked with the NMR cryptocurrency to earn rewards. The best, most original signals are used in Numerai's hedge fund.
Numerai Signals is a part of the Numerai master plan to build the world's last hedge fund. Read the Medium Post and watch the short film to learn more about how it all fits together.
- 2. Upload your signal on Numerai's stock universe to receive performance, risk, and profitability diagnostics over the historical portion of your signal.
- 3. Stake NMR on the live portion of your signal to earn or lose NMR based on your performance relative to Numerai's custom targets.
- 4. Automate the weekly upload of your signal by connecting directly to our API and grow the value of your stake over time.
Stock market signals are feeds of numerical data about stocks used by quantitative hedge funds like Numerai to construct portfolios.
An example stock market signal
Examples of stock market signals include:
While the underlying data used to generate these signals can be very different (audited financials vs images of parking lots), the signals themselves all come in the same basic format - a list of stock tickers each with an associated numerical value.
To create your own signal, you will first need to acquire some stock market data.
If you do not already have access to stock market data, there are a number of free or cheap data providers on the internet such as Yahoo Finance, Quandl, and Koyfin.
Check out this forum thread for a list of popular data sources, platforms, and tools used by our community.
Finding unique and differentiated datasets is key to creating original signals.
The Numerai Signals stock market universe covers roughly the 5000 largest stocks in the world.
The universe is updated every week, but in general only a couple of low-volume stocks will move in or out in a given week.
You can see the historical universe by downloading the historical targets file. This file has two target columns:
target_20d. You are only scored on the latter.
target_20d become available after they have resolved, 11 and 33 days respectively from round open.
target_20d takes longer to resolve, and so the most recent dates will have a value of
target_20d is what your signal is evaluated against for scoring and payouts.
When you submit a signal to Numerai Signals, you must include at least two columns:
- numerai_ticker column - values must be valid tickers associated with the ticker type in the header.
- signal column - values must be between 0 and 1 (exclusive).
Additionally, for a submission to be valid:
- There must be at least 10 rows with predictions for tickers in the Signals stock market universe for the current live time period.
- A ticker cannot appear in the current live time period more than once.
Submissions with only two columns are assumed to correspond to the current live time period.
You may also upload your signal over a historical validation time period to receive diagnostic metrics on your performance, risk, and potential earnings. The validation time period spans from 20130104 to the present.
Submissions that include the validation time period must include two extra columns:
- friday_date column - values must be Fridays, as week periods begin on Friday in Numerai Signals.
- data_type column - values can only be live or validation. Rows with a data_type of live must contain the date of the most recent Friday.
An example submission with ticker
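The rules for live rows listed above can be checked locally before uploading. The following is a minimal sketch, not Numerai's own validation code: the function name `validate_live_rows`, the row format (a list of dicts), and the tickers are all hypothetical, and the real universe must come from Numerai's data.

```python
def validate_live_rows(rows, universe):
    """Return a list of problems found in live rows (empty list means none found).

    rows: list of dicts with "numerai_ticker" and "signal" keys (assumed format).
    universe: set of tickers in the current Signals universe (supplied by you).
    """
    problems = []
    seen = set()
    for row in rows:
        ticker = row["numerai_ticker"]
        value = float(row["signal"])
        # Signal values must lie strictly between 0 and 1.
        if not 0.0 < value < 1.0:
            problems.append(f"{ticker}: signal {value} is not strictly between 0 and 1")
        # A ticker may appear only once in the live time period.
        if ticker in seen:
            problems.append(f"{ticker}: appears more than once")
        seen.add(ticker)
    # At least 10 rows must refer to tickers in the universe.
    in_universe = len(seen & set(universe))
    if in_universe < 10:
        problems.append(f"only {in_universe} tickers are in the universe; need at least 10")
    return problems


# Hypothetical tickers, for illustration only.
example_universe = {f"TICK{i} US" for i in range(20)}
example_rows = [
    {"numerai_ticker": f"TICK{i} US", "signal": (i + 1) / 13} for i in range(12)
]
print(validate_live_rows(example_rows, example_universe))
```

A check like this only mirrors the rules stated in this doc; the upload itself may apply further checks.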
Once your submission has been accepted, it will be queued for diagnostics. This usually takes 10-15 minutes depending on the number of weeks and tickers that your submission spans.
An example diagnostics report
These diagnostics serve as a guide for you to estimate whether your signal is good enough to be worth staking on. It is important to note that signals with strong diagnostics over the historical validation period may not score well in any current or future round.
Using this historical evaluation tool repeatedly will quickly lead to overfitting. Treat diagnostics only as a final check in your signal creation process.
You must submit your latest signal to Numerai every week.
You can automate your submission workflow by using Numerai Compute and either our GraphQL API or the official Python client.
Numerai has a variety of existing signals. Our existing signals include Barra factors (like size, value, momentum, etc.), country and sector risk factors, and custom stock features.
Definition: A signal or target is considered "neutralized" after Numerai transforms it to have zero correlation with any of Numerai's existing signals such as Barra factors, country, sector factors and other custom stock features.
Every signal uploaded to Numerai Signals is neutralized before being scored. The point of the neutralization is to isolate the original or orthogonal component of the signal that is not already present in known signals.
A visualization of neutralization against a single known signal
If you submit a simple linear combination of a few well-known signals, there will be little to no orthogonal component after neutralization.
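To make the idea concrete, here is a minimal sketch of neutralizing a signal against a single known signal by removing its least-squares projection. This is a simplification for illustration: Numerai neutralizes against many signals at once using data that is not provided, and the function names and numbers below are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def neutralize(signal, factor):
    """Return the part of `signal` that is not linearly explained by `factor`."""
    ms, mf = mean(signal), mean(factor)
    beta = sum((s - ms) * (f - mf) for s, f in zip(signal, factor)) / sum(
        (f - mf) ** 2 for f in factor
    )
    # Residual of a one-variable least-squares fit (with intercept).
    return [(s - ms) - beta * (f - mf) for s, f in zip(signal, factor)]


# Hypothetical values: a known factor and an unrelated "original" component.
known_factor = [0.1, 0.4, 0.2, 0.9, 0.6, 0.3]
original_part = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2]

# A signal that is mostly the known factor plus a little original content.
mixed = [0.8 * f + 0.2 * o for f, o in zip(known_factor, original_part)]
mixed_residual = neutralize(mixed, known_factor)

# A signal that is purely a linear function of the known factor.
copy = [2.0 * f + 1.0 for f in known_factor]
copy_residual = neutralize(copy, known_factor)

print(pearson(mixed_residual, known_factor))   # approximately zero
print(max(abs(r) for r in copy_residual))      # approximately zero: nothing is left
```

The second case shows why a pure linear combination of known signals leaves essentially no orthogonal component to score.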
The targets used to evaluate signals (target_20d) are also neutralized. The targets are in effect Numerai's custom "specific return" or "residual return".
The data used to perform neutralization is not provided, which means the process is a "blackbox". However, you can use the historical diagnostics of your signal to estimate the impact neutralization will have on your signal in the future. Note that signals with strong scores over the historical period may not score well in any current or future round.
The code that is used to implement neutralization is open source. You can learn more about the neutralization process in this example notebook:
Or check out this forum post to understand broader implications of feature exposure and neutralization.
Signals with very high correlation with subsequent stock returns may score very badly on Numerai Signals and signals with weak correlation with subsequent returns might score well.
In other words, “good” signals with strong predictive value when considered alone may score poorly on Numerai Signals. This highlights the key unique aspect of Signals: Numerai Signals is not about predicting stock returns, it is about finding original signals that Numerai doesn't already have.
Signals are evaluated against a custom blackbox target created by Numerai. This target is based on 22 day neutralized subsequent returns (ignoring the first 2 days), for a total of 20 days worth of returns.
Signals are evaluated on 20 days of returns because signals that only work on short time horizons are impossible for large hedge funds to implement. For example, even if a signal can accurately predict the 1 hour return of stocks, it is not very useful if it takes a hedge fund 24 hours to fully trade into that position. Signals that are most useful to large hedge funds have predictive power over a long time horizon, which is also known as having "low alpha decay".
For more information on the exact market days that make up the 20 days of subsequent neutralized returns, see the following section on dates and deadlines.
Before scoring, signals are first ranked between [0, 1] and then neutralized. Finally, the score is computed by taking the Spearman correlation between the neutralized signal and the target (target_20d). This score is simply referred to as corr throughout this doc and the website.
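The three steps (rank, neutralize, Spearman correlation) can be sketched as follows. This is an illustration under stated assumptions, not Numerai's scoring code: the exact rank scaling is assumed, neutralization is done against a single hypothetical factor rather than Numerai's undisclosed data, and all numbers are made up.

```python
def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def rank01(values):
    """Rank values into (0, 1); tied values share the average rank (assumed scaling)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [(r - 0.5) / len(values) for r in ranks]


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(rank01(xs), rank01(ys))


def neutralize(signal, factor):
    """Residual of a one-variable least-squares fit of signal on factor."""
    ms, mf = mean(signal), mean(factor)
    beta = sum((s - ms) * (f - mf) for s, f in zip(signal, factor)) / sum(
        (f - mf) ** 2 for f in factor
    )
    return [(s - ms) - beta * (f - mf) for s, f in zip(signal, factor)]


def corr_score(signal, factor, target):
    ranked = rank01(signal)                 # step 1: rank
    neutral = neutralize(ranked, factor)    # step 2: neutralize
    return spearman(neutral, target)        # step 3: Spearman vs. target


# Hypothetical values for six stocks.
signal = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6]
factor = [0.3, 0.8, 0.1, 0.5, 0.2, 0.9]
target = [0.1, 0.7, 0.5, 0.6, 0.2, 0.4]
print(corr_score(signal, factor, target))
```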
By neutralizing your signal before scoring, Numerai aligns it with the target which improves its performance against the target. Since the target is also neutralized, the neutralization step effectively optimizes your signal for best performance without Numerai having to give out the data used for neutralization.
For example, if your signal is not neutralized to country risks, Numerai Signals will neutralize your signal against country risks before scoring so you can focus on creating an original signal without having to worry about country risk neutralization.
If you only have signals on a subset of the universe (e.g. only signals on US stocks), you can still submit to Signals and still perform well. For each stock in the universe where you have missing signals, Numerai will automatically fill those in with the median value after the signal is ranked.
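A minimal sketch of this fill step, assuming a simple rank scaling and ignoring ties; the function name `rank_and_fill` and the tickers are hypothetical, and Numerai's actual implementation may differ in detail.

```python
from statistics import median


def rank_and_fill(signal, universe):
    """Rank submitted values into (0, 1), then give missing tickers the median rank.

    signal: dict mapping ticker -> raw signal value (a subset of the universe).
    universe: list of all tickers in the universe.
    """
    ordered = sorted(signal, key=signal.get)
    n = len(ordered)
    ranked = {ticker: (i + 0.5) / n for i, ticker in enumerate(ordered)}
    fill_value = median(ranked.values())
    return {ticker: ranked.get(ticker, fill_value) for ticker in universe}


# Hypothetical universe where only the US stocks have a signal.
universe = ["AAA US", "BBB US", "CCC LN", "DDD JP"]
us_only_signal = {"AAA US": 0.9, "BBB US": 0.2}
filled = rank_and_fill(us_only_signal, universe)
print(filled)
```

Because the fill happens after ranking, stocks without a signal sit in the middle of the ranking and express no view in either direction.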
While corr is a measure of how well your signal correlates to a target that is neutralized to all signals known to Numerai, Meta Model Contribution (MMC) is a measure of how well your signal correlates to a target that is neutralized to all signals known to Numerai and all other staked signals on Numerai Signals. This score is simply referred to as mmc throughout this doc and the website.
The mmc of a signal is computed by first constructing a special signal called the Signals' Meta Model, which is defined as the stake weighted average of all the (ranked and neutralized) signals on Numerai Signals for a given round. The mmc of a signal is the correlation of the signal to the target after being neutralized to the Signals' Meta Model.
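The description above can be sketched in a few lines. This follows only the wording of this doc (stake weighted average, neutralize, correlate) and is not Numerai's actual MMC code; the helper names, stakes, and values are hypothetical, and the inputs are assumed to be already ranked and neutralized.

```python
def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def neutralize(signal, factor):
    """Residual of a one-variable least-squares fit of signal on factor."""
    ms, mf = mean(signal), mean(factor)
    beta = sum((s - ms) * (f - mf) for s, f in zip(signal, factor)) / sum(
        (f - mf) ** 2 for f in factor
    )
    return [(s - ms) - beta * (f - mf) for s, f in zip(signal, factor)]


def stake_weighted_average(signals, stakes):
    """Combine signals into one, weighting each by its stake."""
    total = sum(stakes)
    return [
        sum(sig[i] * stake for sig, stake in zip(signals, stakes)) / total
        for i in range(len(signals[0]))
    ]


def mmc_sketch(signal, meta_model, target):
    """Correlation with the target after removing the meta model component."""
    return pearson(neutralize(signal, meta_model), target)


# Hypothetical staked signals over five stocks, with their stakes in NMR.
staked_signals = [
    [0.1, 0.8, 0.3, 0.6, 0.9],
    [0.4, 0.7, 0.2, 0.9, 0.5],
    [0.6, 0.2, 0.8, 0.3, 0.7],
]
stakes = [100.0, 50.0, 25.0]
target = [0.2, 0.9, 0.1, 0.7, 0.6]

meta_model = stake_weighted_average(staked_signals, stakes)
print(mmc_sketch(staked_signals[0], meta_model, target))
```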
High and consistent MMC on Signals is doubly impressive because it means your signal has an edge over all of Numerai's data and the combination of all other signals on Numerai Signals as well.
MMC is a concept that is taken from the main Numerai Tournament and the scoring system is very similar. See the metamodel contribution section in the Numerai Tournament docs for details on how we compute MMC on Numerai.
Note the computation of Numerai Signals' MMC is completely separate from that of the Numerai Tournament. Specifically, only submissions to Numerai Signals are used to construct the Signals' Meta Model.
Staking means locking up NMR in a smart contract on the Ethereum blockchain. For the duration of the stake, Numerai is given the permission to add payouts to or burn from the NMR locked up.
You can manage your stake on the website. When you increase your stake, NMR is transferred from your wallet to the staking contract. When you decrease your stake, NMR is transferred from the staking contract back into your wallet after a ~4 week delay. You can also change your stake type, which determines which scores (corr and/or mmc) you want to stake on.
It is important to note that the opportunity to stake your signal is not an offer by Numerai to participate in an investment contract, a security, a swap based on the return of any financial assets, an interest in Numerai’s hedge fund, or in Numerai itself or any fees we earn. Payouts will be made at our discretion, based on a blackbox target that will not be disclosed to users. Fundamentally, Numerai Signals is a service offered by Numerai that allows users to assess the value of their signals, using NMR staking as a way to validate “real” signals. In return, Numerai uses the staked signals and related data in the Numerai hedge fund. Users with different expectations should not stake signals.
Payouts are a function of your stake value and scores. The higher your stake value and the higher your scores, the more you will earn. If you have a negative score, then a portion of your stake will be burned. Payouts are limited to ±25% of the stake value per round.
payout = stake_value * payout_factor * (corr * corr_multiplier + mmc * mmc_multiplier)
- stake_value is the value of your stake on the first Friday (scoring day) of the round.
- stake_cap_threshold is a number that determines when the payout_factor begins to decay. At the time of this writing, the Signals stake_cap_threshold is 150K. The stake_cap_threshold can change per round at Numerai's discretion.
- payout_factor is a number that scales with the total NMR staked across all models in the tournament. When the total NMR staked across all models exceeds the stake_cap_threshold, the payout_factor begins to decay.
- corr_multiplier and mmc_multiplier are configured by you to control your exposure to each score. You are given the following multiplier options.
corr multiplier options
mmc multiplier options
0.0x, 0.5x, 1.0x, 2.0x, 3.0x
The payout factor curve and available multiplier options may be updated by Numerai in the future alongside major tournament releases.
Here are some example payout calculations. The first 2 examples show the impact of adjusting score multipliers. The 3rd example shows how a negative score can cause a burn. The 4th example shows how the payout is capped at ±25% of the stake value.
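The payout formula and the ±25% cap can be written as a small function. This is a sketch for illustration: the stake, scores, and payout_factor values below are made up, and payout_factor is taken as an input because its curve is not specified here.

```python
def payout(stake_value, payout_factor, corr, mmc, corr_multiplier, mmc_multiplier):
    """Payout in NMR for one round; negative values are burns."""
    raw = stake_value * payout_factor * (corr * corr_multiplier + mmc * mmc_multiplier)
    cap = 0.25 * stake_value  # payouts are limited to +/-25% of the stake value
    return max(-cap, min(cap, raw))


# Hypothetical inputs: a 100 NMR stake and a payout_factor of 1.0.
corr_only = payout(100.0, 1.0, corr=0.02, mmc=0.01, corr_multiplier=2.0, mmc_multiplier=0.0)
corr_and_mmc = payout(100.0, 1.0, corr=0.02, mmc=0.01, corr_multiplier=2.0, mmc_multiplier=3.0)
burn = payout(100.0, 1.0, corr=-0.03, mmc=0.0, corr_multiplier=2.0, mmc_multiplier=0.0)
capped = payout(100.0, 1.0, corr=0.2, mmc=0.1, corr_multiplier=2.0, mmc_multiplier=3.0)

print(corr_only)      # about 4 NMR
print(corr_and_mmc)   # about 7 NMR
print(burn)           # about -6 NMR, i.e. a burn
print(capped)         # capped at 25 NMR
```

The four calls mirror the four cases described above: two multiplier settings, a negative score that causes a burn, and a payout that hits the cap.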
With every daily score, a new daily update on your payout is also computed. These daily payouts are also just updates and only the final payout of a round counts. Final payouts are paid into your stake at the end of the round (Wednesday).
Your stake value will grow as long as you continue to have positive scores. Here are some example payout projections assuming that the model gets the same positive scores every week for 52 weeks.
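A projection of this kind is a compounding calculation. The sketch below assumes one payout per week, the same return on stake every week, and that each payout is added to the stake before the next week; the function name and the 1% weekly figure are hypothetical, and a constant payout_factor is implied.

```python
def project_stake(initial_stake, weekly_return, weeks=52):
    """Stake value after compounding the same weekly return for `weeks` weeks.

    weekly_return is the payout as a fraction of stake, e.g.
    payout_factor * (corr * corr_multiplier + mmc * mmc_multiplier),
    which cannot exceed 0.25 because of the payout cap.
    """
    stake = initial_stake
    for _ in range(weeks):
        stake += stake * weekly_return  # payout is paid into the stake
    return stake


# Hypothetical example: 100 NMR staked, earning 1% of stake every week.
projected = project_stake(100.0, 0.01)
print(round(projected, 2))
```

The same function with a negative weekly_return shows how repeated burns shrink a stake.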
There are two types of dates in Numerai Signals:
- data_date - dates corresponding to the underlying stock market data. All data_dates refer to the market close of that date and do not include a time. For example, values in the friday_date column of submissions are of type data_date.
- effective_date - dates corresponding to actions or events that take place on Numerai Signals and may include a time, which is always specified in UTC. There is usually a delay between the data_date and the effective_date because of time zones and the time it takes for stock market data to be processed. Unless otherwise specified, all dates mentioned on the website and in this doc are of type effective_date.
Submissions, stakes, scores and payouts are grouped into numbered rounds to make them easier to talk about.
On every Tuesday, Wednesday, Thursday, Friday, and Saturday of the week, a new round opens and new tournament data is released.
Saturday rounds open at 18:00 UTC and the submission window is open until Monday 14:30 UTC. Weekday rounds open at 13:00 UTC and the submission window is open for 1 hour.
Each submission will be scored over the ~4 week duration of the round. A submission receives its first score on the Friday after the Monday deadline and its final score on Thursday 4 weeks later, for a total of 20 scores.
Effective dates for a round
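The effective dates of a Saturday round can be derived with standard date arithmetic. This sketch assumes one score per weekday between the first Friday and the final Thursday and ignores market holidays; the function name and the example date are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def saturday_round_schedule(saturday):
    """Return (opens, closes, scoring_days) for a round opening on `saturday`."""
    if saturday.weekday() != 5:  # Monday is 0, so Saturday is 5
        raise ValueError("expected a Saturday")
    opens = datetime.combine(saturday, time(18, 0), tzinfo=timezone.utc)
    closes = datetime.combine(
        saturday + timedelta(days=2), time(14, 30), tzinfo=timezone.utc
    )
    first_score = saturday + timedelta(days=6)       # Friday after the Monday deadline
    final_score = first_score + timedelta(days=27)   # Thursday 4 weeks later
    scoring_days = []
    day = first_score
    while day <= final_score:
        if day.weekday() < 5:  # keep weekdays only (assumption: no holidays)
            scoring_days.append(day)
        day += timedelta(days=1)
    return opens, closes, scoring_days


opens, closes, scoring_days = saturday_round_schedule(date(2022, 1, 1))
print(opens.isoformat(), closes.isoformat())
print(scoring_days[0], scoring_days[-1], len(scoring_days))
```

Counting weekdays from the first Friday through the final Thursday gives the 20 scores mentioned above.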
The universe of the round is defined by the data_date of the prior Friday. The 20 days of scoring and payouts are based on 22day-2day neutralized returns. There is a 2 day lag between market close for a day and when the data becomes available for scoring. For example, the 22day-2day neutralized returns are up to Tuesday market close, but only become available on Thursday.
Data dates for a round
The leaderboard can be sorted by the reputation of a model's mmc. Reputation is the weighted average of a given metric over the past 20 rounds.
Keep an eye on the leaderboard to see how your models compare to all other models in terms of performance and returns from staking.
We are here to help.