
I think the important conclusion to draw from this is that publicly available code is no longer created or even curated by humans, and it will be fed back into data sets for training.

It's not clear what the consequences are. Maybe not much, but there isn't that much actual emergent intelligence in LLMs, so without culling by actually running the code there seems to be a risk that the end result is a world full of even more nonsense than today.
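
To make "culling by running the code" concrete, here's a minimal sketch (Python; the function name is just illustrative, and a single exit-code check is a crude stand-in for running a real test suite):

    import os
    import subprocess
    import sys
    import tempfile

    def survives_execution(snippet: str, timeout: float = 5.0) -> bool:
        # Write the candidate snippet to a temp file and run it in a
        # subprocess; keep it only if it exits cleanly within the timeout.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(snippet)
            path = f.name
        try:
            result = subprocess.run(
                [sys.executable, path],
                capture_output=True,
                timeout=timeout,
            )
            return result.returncode == 0
        except subprocess.TimeoutExpired:
            return False
        finally:
            os.unlink(path)

    # Only candidates that at least execute cleanly would be considered
    # for a training corpus.
    candidates = ["print('hello')", "prnt('oops')"]
    curated = [s for s in candidates if survives_execution(s)]

Obviously executing cleanly is a much weaker bar than being correct, but it's the kind of automated filter that would be needed at corpus scale.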

This already happened a couple of years ago with research on word frequency in published texts. I think the consensus is that there's no point in collecting new corpora anymore, since all available material is tainted by machine-generated content and no longer reflects human communication.



I think we'll be fine. AIs definitely generate a lot of garbage, but then they have us monkeys sifting through it, looking for gems, and occasionally they do drop some.

My point is, AI-generated code still has a human directing it the majority of the time (I would hope!). It's not all bad.

But yea, if you're 12 and just type "yolo 3d game now" into Claude Code... I started to say I'd be worried about that, but then immediately realized: no, that'd be awesome.

So yea, I think we'll be fine.


This is a really interesting point. I wonder if this will have a similar effect to model poisoning.



