
Frankly, bullshit is the perfect term for it, because ChatGPT doesn't know that it's wrong. A bullshit artist isn't someone whose primary goal is to lie. A bullshit artist is someone whose primary goal is to achieve something (a sale, impressing someone, appearing knowledgeable, whatever) without regard for the truth. The act of bullshitting isn't the same as the act of lying. You can, e.g., bullshit your way through a conversation on a technical topic you know nothing about and be correct by happenstance.




Before someone replies with a fallacious comparison along the lines of: "But humans 'bullshit' too, and humans 'hallucinate' just like LLMs do".

Except that LLMs have no mechanism for transparent reasoning and also have no idea about what they don't know, and they will go to great lengths to generate fake citations to convince you that they are correct.


> Except that LLMs have no mechanism for transparent reasoning

Humans have transparent reasoning?

> and also have no idea about what they don't know

So why can they respond saying they don't know things?


> So why can they respond saying they don't know things?

Because sometimes, the tokens for "I don't know" are the most likely, given the prior context + the RLHF. LLMs can absolutely respond that they don't know something or that they were incorrect about something, but I've only seen that happen after first pointing out that they're wrong, which changes the context window to one where such an admission of fault becomes probable.
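
To make the "most likely tokens" point concrete, here's a minimal sketch of what the next-token distribution looks like. It assumes the Hugging Face transformers library and the small public "gpt2" checkpoint, which is purely illustrative (not what ChatGPT actually runs), and the prompt is made up:

    # Inspect the model's next-token distribution for a given context.
    # Assumption: Hugging Face `transformers` + the public "gpt2" checkpoint,
    # chosen only because it's small and freely downloadable.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Q: What is the 47th digit of pi?\nA:"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]   # scores for the next token only
    probs = torch.softmax(logits, dim=-1)

    # Print the ten most probable continuations. Whether an "I don't know"-style
    # token shows up here depends on the context and on training (including any
    # RLHF), not on the model checking what it actually knows.
    top = torch.topk(probs, k=10)
    for p, tok_id in zip(top.values, top.indices):
        print(repr(tokenizer.decode(tok_id.item())), f"{p.item():.3f}")

The point is just that "I don't know" is one more continuation competing on probability, not the output of some separate self-knowledge mechanism.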


I've actually had ChatGPT admit it was wrong simply by asking a question ("how is X achieved with what you described for Y?"). It responded with "Oh, it's a great question which highlights how I was wrong: this is what really happens...". But still, it couldn't have got there without me understanding the underlying truth (it was about key exchange in a particular protocol that I knew little about, but I know about secure messaging in general), and it would easily confuse less experienced engineers with a fully confident-sounding explanation.

For things I don't understand deeply, I can only check whether it sounds plausible and realistic, but I can't fully trust it.

The "language" it uses when it's wrong is still just an extension of the token-completion it does (because that's what text contains in many of the online discussions etc).


That interpretation is too generous: the word "bullshit" is generally a value judgement and implies that you are almost always wrong, even though you might be correct from time to time. Current LLMs are way past that threshold, which makes them much more dangerous for a certain group of people.

I guess it's a fair point that slop has its own unique flavor, like eggs.


