Free, zero-setup automation
Model uploads is a simple and free way to automate your daily submissions.
Once you have a trained model that you are ready to upload, simply wrap it with a function that takes live features and outputs live predictions.
import cloudpickle
import pandas as pd

# Wrap your model with a function that takes live features and returns live predictions
def predict(live_features: pd.DataFrame) -> pd.DataFrame:
    live_predictions = model.predict(live_features[feature_cols])
    submission = pd.Series(live_predictions, index=live_features.index)
    return submission.to_frame("prediction")

# Use the cloudpickle library to serialize your function
p = cloudpickle.dumps(predict)
with open("predict.pkl", "wb") as f:
    f.write(p)
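Before uploading, it can be worth checking that the pickled function survives a round trip. The following is a minimal local sketch, not Numerai's runner. It assumes a hypothetical stand-in model (`MeanModel`), made-up feature names, and a single output column named `prediction`.

```python
import cloudpickle
import pandas as pd


class MeanModel:
    """Hypothetical stand-in for a trained model: predicts the row-wise mean."""

    def predict(self, X: pd.DataFrame):
        return X.mean(axis=1).to_numpy()


model = MeanModel()
feature_cols = ["feature_a", "feature_b"]  # hypothetical feature names


def predict(live_features: pd.DataFrame) -> pd.DataFrame:
    live_predictions = model.predict(live_features[feature_cols])
    submission = pd.Series(live_predictions, index=live_features.index)
    # Assumption: the submission is a single column named "prediction"
    return submission.to_frame("prediction")


# Serialize and immediately deserialize, as a rough local stand-in for an upload
restored = cloudpickle.loads(cloudpickle.dumps(predict))

# Tiny fake "live" frame with one extra column the model should ignore
fake_live = pd.DataFrame(
    {"feature_a": [0.0, 0.5, 1.0], "feature_b": [1.0, 0.5, 0.0], "unused": [9, 9, 9]},
    index=pd.Index(["id_1", "id_2", "id_3"], name="id"),
)
result = restored(fake_live)
print(result)
```

If the restored function raises here, it will most likely also fail once uploaded, so this is a cheap first check.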
Click the Upload Model button to open the model upload modal. Select the pickle file you wish to upload, set the Python version used to create the pickle, then click Upload.
Once your upload is complete, Numerai will immediately run your model on the live data to generate a submission for the current round, and on the validation dataset to generate diagnostics.
Once these runs complete, you should see a submission block in the submissions column, the success status under the latest submission column, and a link to view diagnostics under the diagnostics column.
If everything is working correctly, you will see the latest submission cycle through these 4 statuses:
- 1. Pending: Numerai is provisioning the cloud resources to run your model
- 2. Running: Numerai is now running your model
- 3. Validating: Numerai has run your model and is now validating your submission
- 4. Success: Numerai has accepted your submission
If there was a problem, you will see one of 2 statuses:
- 1. Error: Numerai encountered an unexpected error while running your model, and the team will look into it.
- 2. Failed: Your model failed to run. Please check the logs and re-upload a working model. Examples of model failures include:
  - Python or dependency version mismatch
  - Invalid submission
  - Out of memory
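An invalid submission can often be caught locally before uploading. The text above does not spell out what makes a submission invalid, so the checks in this sketch are assumptions: a single `prediction` column, an index matching the live features, no NaN values, and values within [0, 1]. The helper name `check_submission` is hypothetical.

```python
from typing import List

import pandas as pd


def check_submission(submission: pd.DataFrame, live_index: pd.Index) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems: List[str] = []
    # Assumption: exactly one column, named "prediction"
    if list(submission.columns) != ["prediction"]:
        problems.append(f"unexpected columns: {list(submission.columns)}")
        return problems  # the remaining checks need the prediction column
    preds = submission["prediction"]
    if not submission.index.equals(live_index):
        problems.append("index does not match the live features index")
    if not pd.api.types.is_numeric_dtype(preds):
        problems.append("predictions are not numeric")
        return problems
    if preds.isna().any():
        problems.append("predictions contain NaN")
    # Assumption: valid predictions lie in the closed interval [0, 1]
    if not preds.dropna().between(0, 1).all():
        problems.append("predictions fall outside [0, 1]")
    return problems


live_index = pd.Index(["id_1", "id_2", "id_3"], name="id")
good = pd.DataFrame({"prediction": [0.1, 0.5, 0.9]}, index=live_index)
bad = pd.DataFrame({"prediction": [0.1, float("nan"), 1.5]}, index=live_index)
print(check_submission(good, live_index))
print(check_submission(bad, live_index))
```

Returning a list of messages rather than raising on the first problem makes it easier to see every issue in one pass.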
The main benefit of using cloudpickle over the pickle standard library is that it serializes your local context along with your code. This makes it very convenient to package up code developed locally in a notebook environment like Google Colab.
In the example below, our function references model and feature_cols, both defined in the global scope. Cloudpickle is smart enough to correctly serialize both model and feature_cols by value so that they are also available when this function is run by Numerai.
def predict(features: pd.DataFrame) -> pd.DataFrame:
    # model and feature_cols are defined in the global scope
    live_predictions = model.predict(features[feature_cols])
    submission = pd.Series(live_predictions, index=features.index)
    return submission.to_frame("prediction")
Cloudpickle does not serialize Python itself or the Python libraries in your local environment, so you will need to make sure that your code is compatible with the exact Python versions listed and the libraries supported in the requirements.txt.
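To compare a local environment against the supported versions, it can help to print the Python version and the installed version of each library the pipeline imports. This is a minimal sketch using only the standard library. The helper name `environment_report` and the package list are hypothetical, and the comparison against requirements.txt is left to the reader.

```python
import sys
from importlib import metadata
from typing import Dict, List, Optional


def environment_report(packages: List[str]) -> Dict[str, Optional[str]]:
    """Map 'python' and each package name to its local version (None if absent)."""
    report: Dict[str, Optional[str]] = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}"
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # package is not installed locally
    return report


# Hypothetical package list; substitute whatever your pipeline imports
for name, version in environment_report(["pandas", "cloudpickle", "lightgbm"]).items():
    print(f"{name}: {version or 'not installed'}")
```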
When debugging issues, it may be helpful to download the numerai-predict docker container for local testing.
We aim to support all industry-standard Python machine learning libraries. If your pipeline uses a library that is not currently supported, please let us know and we will consider adding it.
For security reasons, your uploaded model will have no access to the internet.
By default, we will provision each model a machine with 1 CPU and 4 GB of RAM, and allow a runtime of up to 10 minutes (not including time spent queueing). For reference:
- LGBM model with 20K trees (in example notebooks above) using the small feature set runs in under 1 minute
- LGBM model with 90K trees using the full feature set runs in under 6 minutes
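A rough local estimate of runtime and memory can indicate whether a model is likely to fit within these limits. The sketch below uses only the standard library, with a hypothetical `toy_predict` in place of a real model. It treats 4 GB as 4 * 1024**3 bytes, which is an assumption. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory used by native libraries may be under-counted, and a local machine will not match the 1-CPU environment.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple

RUNTIME_LIMIT_SECONDS = 10 * 60      # 10 minutes, from the limits above
MEMORY_LIMIT_BYTES = 4 * 1024 ** 3   # 4 GB, interpreted here as GiB (assumption)


def measure(func: Callable[..., Any], *args: Any) -> Tuple[Any, float, int]:
    """Run func(*args) and return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_predict(n_rows: int) -> list:
    # Hypothetical stand-in for predict(live_features)
    return [i / n_rows for i in range(n_rows)]


result, elapsed, peak = measure(toy_predict, 100_000)
print(f"runtime: {elapsed:.3f}s of {RUNTIME_LIMIT_SECONDS}s allowed")
print(f"peak traced memory: {peak / 1024 ** 2:.1f} MiB")
print("within budget:", elapsed < RUNTIME_LIMIT_SECONDS and peak < MEMORY_LIMIT_BYTES)
```

In practice, `toy_predict` would be replaced by the real `predict` function called on a frame of roughly the size of a live round.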
To avoid race conditions, you will need to disable any existing compute configuration in order to upload your model.
Similarly, once you upload your model, you will no longer be able to upload submissions via the API or configure compute on your model.
This feature is designed for new and intermediate users who don’t want to invest time in setting up and managing their own model hosting infrastructure.
The obvious downside of this feature is that you need to upload (and give Numerai access to) your trained model. If you are not comfortable with this, you are 100% free to continue using Compute Heavy, Compute Lite or any other automation solution of your choice.
Numerai reserves the right to disable your model for any reason, including security concerns or if your account is no longer active.
Numerai will do its best to support your usage of this feature, but ultimately it is still your responsibility to make sure your submission pipeline is set up properly.