OHwA S03E02

From September 17, 2020 / Michael Oliver round two


For the second episode of Season Three, longtime competitor and now Numerai team member Michael Oliver returned to talk about Target Nomi, the latest iteration of the Numerai tournament.

The full interview and discussion with Michael Oliver on Target Nomi will be published on YouTube.

Questions from Slido

When new Corr + MMC leaderboard?

Michael Oliver: Soon!

All else being equal, are models trained on the new target always better than those trained on the current targets? Does blending old and new models make sense?

Michael Oliver said that there's no guarantee that every model will perform better on the Nomi targets, but his testing seems to indicate that's the case.

You could blend results from a model trained on each, but Michael Oliver doesn't think there's a compelling reason to do so.

How will you test on live data whether predictions made on the new target are more useful, given that some people will train on the old target and some on the new one?

Michael Oliver explained that because the targets represent the same underlying signal, the introduction of Nomi predictions should blend nicely together with the Kazutsugi predictions and gradually move the meta-model performance in the right direction as users switch over.

Are significant differences between validation mean and feature-neutral mean representative of the quality of the model? If so, is there a minimum ratio to aim for?

Put another way - if there's a small difference between the two, is that a good thing or a bad thing? Or if it's good to be close, what should users aim for?

Michael Oliver said it's a question of how much risk a user is willing to take on. "If your validation mean is much higher than your feature-neutral mean, that means you have a lot of feature exposure," he said, "which means you have a lot of feature risk and your model is pretty linear."

He concluded with, "whether or not that's good is an empirical question. I wouldn't do it."

"Your feature-neutral score is always going to tell you how rich your model is in actual alpha." - Richard

Has Numerai seen a significant jump in average participant validation metrics after the release of the new diagnostic section?

"Yeah." - Richard Craib

Richard explained that because the new diagnostics make the average user more aware of how their models behave, and therefore better able to improve them, the meta-model's performance has continued to improve since March.

What kind of improvements in our predictions do you think the new target will lead to? E.g. better at predicting extremes / overall Spearman / etc?

Michael Oliver explained that Nomi targets have a different distribution: the extreme values (0, 1) are each only going to be 5% of the targets per era (previously they were each 20%). He said that 0.25's and 0.75's will each be 20%, and that 0.5's would be 50%. In terms of correlation, the models seem to perform almost identically on the new targets, but with lower score volatility.
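The quoted proportions can be written down directly; the sampling below is only an illustration of the stated distribution, not Numerai's target-generation code:

```python
import numpy as np

# Nomi distribution per era as quoted above,
# vs. Kazutsugi's flat 20% per bin
values = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
nomi_probs = np.array([0.05, 0.20, 0.50, 0.20, 0.05])
kazutsugi_probs = np.full(5, 0.20)

# draw a large sample and check the observed bin frequencies
rng = np.random.default_rng(42)
sample = rng.choice(values, size=100_000, p=nomi_probs)
observed = {v: float((sample == v).mean()) for v in values}
```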

Arbitrage is working through some technical challenges, but hopes to have one soon.

Did the change from three to ten models help [improve the meta model]?

Richard said that they're not entirely sure because multiple changes are happening at the same time, but he does think it's all helping.

Why don’t convolutional neural nets and LSTM neural networks work very well with Numerai data?

Michael Oliver said that every algorithm you could use has an inductive bias, and just because you can fit one to a data set, doesn't mean it's a good algorithm for generalizing. "If you're using convolutional neural networks," he said, "you should be thinking, 'why do I think there's some kind of invariant structure along the dimensions I'm convolving?'"

He explained that in processing an image, it makes sense that a given feature would work at several locations within that image. The Numerai data set, on the other hand, is a set of features in no particular order. "What dimension would you convolve over?" Michael Oliver asked. "And for LSTMs: it is a time series problem to some degree, but you don't know the correspondences, so making an LSTM work seems fraught at best."

Isn’t it disingenuous for Michael Oliver to post a little tweak to sklearn.model_selection.TimeSeriesSplit in the forum and claim he has open sourced his model?

Michael Oliver provided the exact code he used to create the model, including the parameters he searches over. "There's no benefit for anyone else to have more than that."

Richard added that the important thing is explaining how the solution was found.
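The forum post itself is not reproduced here, but a plausible sketch of that kind of tweak (the function name and details are illustrative assumptions, not Michael Oliver's actual code) applies `TimeSeriesSplit` to the ordered unique eras rather than to individual rows, so folds never cut an era in half:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

def era_time_series_split(eras, n_splits=4):
    """Yield (train_idx, test_idx) pairs that split on whole eras.

    Assumes era labels sort chronologically (as Numerai's
    zero-padded era names do). TimeSeriesSplit is applied to the
    list of unique eras; the era folds are mapped back to rows.
    """
    eras = np.asarray(eras)
    unique_eras = np.unique(eras)
    tscv = TimeSeriesSplit(n_splits=n_splits)
    for train_e, test_e in tscv.split(unique_eras):
        train_idx = np.where(np.isin(eras, unique_eras[train_e]))[0]
        test_idx = np.where(np.isin(eras, unique_eras[test_e]))[0]
        yield train_idx, test_idx

# toy usage: 10 eras with 3 rows each
eras = np.repeat([f"era{i:02d}" for i in range(10)], 3)
folds = list(era_time_series_split(eras, n_splits=3))
```

Because training folds always precede test folds in time, this avoids the look-ahead leakage that a plain shuffled split would introduce on era-ordered data.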

Richard added that one way of thinking about feature-neutral mean is as the score you get when taking on no feature risk. "That's clearly better than returns that come from risks," he said.

When feature neutralization tutorial?

If you’re passionate about finance, machine learning, or data science and you’re not competing in the most challenging data science tournament in the world, what are you waiting for?

Don’t miss the next Office Hours with Arbitrage: follow Numerai on Twitter or join the discussion on Rocket.Chat for the next time and date.

Thank you to Michael Oliver for talking about Nomi, to Richard for joining and contributing to answers, and to Arbitrage for hosting.
