I think that's mostly training costs. There's a serious level of diminishing returns, and the primary way companies have been trying to get ahead is by throwing more and more hardware at it. AI already burns something around 5x as much power as crypto mining, and it's only growing.
You sometimes hear about the AI singularity, how models will keep getting smarter at an exponential pace until they're basically gods in a box, but the reality is we're already near the top of the S-curve: every advance requires more effort than the one before.
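To make the S-curve point concrete, here's a toy sketch (my own illustration, not anything from the article) using a logistic function as a stand-in for capability vs. effort: the marginal gain per unit of effort grows up to the inflection point, then shrinks, no matter how much you keep spending.

```python
import math

def logistic(x, k=1.0, x0=0.0, L=1.0):
    """Toy S-curve: 'capability' as a function of cumulative 'effort' x."""
    return L / (1.0 + math.exp(-k * (x - x0)))

# Marginal gain from one more unit of effort at different points on the curve.
for x in [-4, -2, 0, 2, 4]:
    gain = logistic(x + 1) - logistic(x)
    print(f"effort={x:+d}  marginal gain={gain:.3f}")
```

Past the midpoint, each additional unit of effort buys less and less capability, which is the opposite of the "exponential takeoff" picture.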
You should be more concerned that Chinese labs can train models that are just as good for 10x less, because Americans treat the USD's status as the global reserve currency as the ultimate bitter lesson. Who needs better math and engineering when you can print money to buy more GPUs?
The author literally said GPT is spending 10x more for equivalent performance, which really means ChatGPT had that level of intelligence at that cost a year or two ago. Smaller domain-specific models are better at focused tasks but can't be generalized.
Is that actually true? And is most of it due to the compute requirements of the models themselves, or the cost of scaling to meet exponential growth in usage?
I hope it didn't actually cost ten times more to create ChatGPT-5 than it did ChatGPT-4.