
OHwA S02E11

From August 20, 2020 / Carlo Lepelaars interview


Last updated 3 years ago

In episode 11, Arbitrage interviewed Carlo Lepelaars, who recently wrote a guide to competing in the Numerai tournament.

Check out the full interview with Carlo on YouTube.

Questions from Slido

When Sorios?

Another request from the community for Sorios to join Office Hours. Sorios, if you're reading, slide into Arbitrage's DMs; he would love to have you as a guest.

Carlo - would you please share some insights on how you choose and decide between abstraction and elaboration while writing?

Carlo admitted that one of his biggest challenges while writing is wanting to include as many small details as possible, even though he knows that wouldn't make for coherent posts.

"What I do is write everything down and truncate it from there." - Carlo

What if the team use all previous submissions and their metrics to model the relationship between submission stats and live performance and to suggest improvements?

Michael Oliver said that it's technically possible, but the team is definitely not going to give tournament participants any metrics from Numerai's backtesting. Doing so would run the risk of inadvertently leaking information about the test data. He added that they are working on adding new submission metrics based on the validation data, and these should be available soon, saying "obviously these will only be valid if you're not training on your validation data."

Michael Oliver concluded by saying that he doesn't think data scientists need to train on the validation set and it's probably better used for hypervalidation of different parameters or techniques.

Arbitrage added that any suggestions for improvement from Numerai would also introduce bias and would lead to models being overfit to the data.

Which is the metric (or combination of several) that we should optimize for?

"There is no one metric to rule them all," Arbitrage said, "you should consider several." Arbitrage suggested that, since accounts can have up to ten models, participants try optimizing different models for different metrics. He said different people are going to have different techniques and different metrics they optimize for, adding "don't just chase high correlation; I think that's risky."

Can the team elaborate on how predictions are combined? Have you gone from simple averaging to stacking or some sort of IPW?

Without divulging too much, Michael Oliver said that they're not doing anything particularly complicated when combining predictions, adding that it's still stake-weighted.

How can I start staking? Any video tutorials?

🎥 Coming soon 🎥

Arbitrage is working on video tutorials and is close to publishing the first, so stay tuned. Some will be short videos covering the basics, others will be longer tackling different aspects of the notebook.

If you’re passionate about finance, machine learning, or data science and you’re not competing in the most challenging data science tournament in the world, what are you waiting for?

Don’t miss the next Office Hours with Arbitrage: follow Numerai on Twitter or join the discussion on Rocket.Chat for the next time and date.

Thank you to Michael Oliver for contributing to answers during this Office Hours, to Carlo for being interviewed and writing an awesome guide to the tournament, and to Arbitrage for hosting.
https://youtu.be/5j6Nf1s:tTU
Carlo in the hot seat on Arbitrage's right