6 Comments
User's avatar
Rainbow Roxy's avatar

Didn't expect this take. Absolutly spot on!

Polina's avatar

You need, of course, both, constant flow of good ideas and strong execution. I'd think of it a pyramid, similar to Maslow's hierarchy where each layer builds on the next (sufficient capital -> space of feasible ideas -> strong execution -> judgement/taste -> ?)

For instance, rockets are earlier in the lifecycle than AI, they are still in capital-intensive stage.

The AI Architect's avatar

The part about evals killing taste is spot on. When everyone optimizes for the same benchmarks, you end up with models that game the metrics instead of actually solving problems. The eval system trains the labs just as much as the models themselves, and that's kinda dangerous when the real world doesn't run on leaderboards.

Giovanni Foglietta's avatar

Very true. It’s the old saying of “follow the incentives”. It’s inevitable that evals end up shaping reality more than assessing it only

Amazon Days's avatar

right on: execution will of course continue to matter, but ideas are back, baby! thx for writing and sharing your ideas!

Giovanni Foglietta's avatar

Indeed!! Ideas are back!