“In this close correlation environment, I want 4% — no higher than 4.5%.”
“You all have backtest data. You may have some gaps where we could predict on something that isn’t in the backtest but is former live data. That would be an out-of-sample validation. I don’t know how it would work in practice, but as long as it’s not in the backtest data, it is held-out data and we could get some results on it. The problem is that those eras you’re looking at, in terms of average sample validation, are not like the current regime. If you fit to that, you’re going to get very bad results.”
“It’s really easy to fall into the trap of wanting to overfit to the data. The way [the tournament] is now, we’re so blind to everything that there is an element of luck. But with time, you do gain some experience with the data; for example, I found a range of correlation scores that works. If I had the opportunity to test my model [against the Validation 2 set], I don’t think I would change anything. I don’t think it would benefit me and I’m not sure how it would benefit anyone else.”
“Do we continue to hide that from you guys? Or do we give it out? I think it’s worth investigating.” — Slyfox
“In the first week or two, people were changing their models to get away from these tree-based approaches, but lately it’s been converging on integration-test type models because MMC is encouraging that. I think there’s going to be a short term where it is more correct to go to these safer models, where everybody is doing good and MMC will discourage bad models. Once everyone converges on the easy approach, it will open up and start being more profitable to diverge from that and come up with more creative ideas.”
“If you have two models that are close in performance but one has significantly lower feature exposure, you might want to consider using that one.”
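To make the comparison in that quote concrete, here is a minimal sketch of one way to measure feature exposure. It assumes a commonly used community definition: correlate a model's predictions with each feature column, then summarize those correlations by their root mean square and their largest absolute value. The function name `feature_exposure`, the synthetic data, and the two toy "models" are all hypothetical illustrations, not the tournament's official metric or data.

```python
import numpy as np


def feature_exposure(features, predictions):
    """Return (rms_exposure, max_exposure) of predictions to the features.

    Assumed definition: Pearson correlation between the predictions and
    each feature column, summarized by RMS and by max absolute value.
    Assumes no feature column (and not the predictions) is constant.
    """
    features = np.asarray(features, dtype=float)
    predictions = np.asarray(predictions, dtype=float)

    # Center both sides so the dot products below are covariances.
    f = features - features.mean(axis=0)
    p = predictions - predictions.mean()

    # Pearson correlation of the predictions with every feature column.
    denom = np.sqrt((f ** 2).sum(axis=0) * (p ** 2).sum())
    corrs = (f * p[:, None]).sum(axis=0) / denom

    rms_exposure = float(np.sqrt(np.mean(corrs ** 2)))
    max_exposure = float(np.max(np.abs(corrs)))
    return rms_exposure, max_exposure


# Synthetic illustration: 1000 rows, 5 independent features.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 5))
noise = rng.normal(size=1000)

# Hypothetical model A leans almost entirely on a single feature.
preds_a = features[:, 0] + 0.1 * noise
# Hypothetical model B depends only weakly on any one feature.
preds_b = noise + 0.1 * features.sum(axis=1)

rms_a, max_a = feature_exposure(features, preds_a)
rms_b, max_b = feature_exposure(features, preds_b)
print(f"model A: rms={rms_a:.3f}, max={max_a:.3f}")
print(f"model B: rms={rms_b:.3f}, max={max_b:.3f}")
```

In this toy setup model A shows much higher exposure than model B on both summaries. With real models, the quote's advice would apply only when their performance is also close.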
“I kind of overlooked it at first, but I think it’s a prerequisite for real MMC. The ability to experiment with different models without having to stake on them yet, while still getting MMC scores and seeing how correlated they are with the meta-model, is going to work in conjunction with MMC to create a better environment for iteration. That’s a big part of why it’s prioritized right now.”