
Can a good soul explain to this humble layman the arguments behind each side of the "it's just predicting the next character" versus "it's more than that and shows some reasoning for new things" debate?


> "it's just predicting the next character"

That is literally what the model does: these models are trained to predict the next word (more precisely, the next token) in a piece of text, and when you query them they generate the next token for your text over and over to build up a response.
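
To make the "predict the next word over and over" part concrete, here is a minimal sketch of that loop using the Hugging Face transformers library with GPT-2 and greedy decoding (the model choice and decoding strategy are assumptions for the example; real chat systems add sampling, chat formatting, and fine-tuning on top):

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Load a small pretrained causal language model (GPT-2 as an example).
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    prompt = "The capital of France is"
    ids = tok.encode(prompt, return_tensors="pt")

    # Generate 10 tokens, one at a time: each step predicts the next token
    # from everything generated so far, appends it, and repeats.
    with torch.no_grad():
        for _ in range(10):
            logits = model(ids).logits            # (1, seq_len, vocab_size)
            next_id = logits[0, -1].argmax()      # greedy: most likely next token
            ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

    print(tok.decode(ids[0]))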

> "it's more than that and shows some reasoning for new things"

In order to predict the next word well, the model has to encode a lot of structure about words and the contexts they appear in, which means "just a next-word predictor" is a bit reductive.
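
As a rough illustration of "encodes structure about words", the sketch below looks at GPT-2's learned token embeddings and checks that related words end up closer together than unrelated ones (GPT-2 and these particular words are my assumptions for the example; exact similarity values will vary):

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    emb = model.get_input_embeddings().weight  # (vocab_size, hidden_dim)

    def vec(word):
        # Use the first BPE token of the word (with a leading space, as GPT-2 tokenizes it).
        ids = tok.encode(" " + word)
        return emb[ids[0]]

    cos = torch.nn.CosineSimilarity(dim=0)
    # Expect the first similarity to be noticeably higher than the second.
    print(cos(vec("king"), vec("queen")).item())
    print(cos(vec("king"), vec("banana")).item())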

So both sides are correct in some way: it is just a next-word predictor, but there is a lot of complexity involved in predicting the next word well, and that is still very impressive.


Thank you! The state of the art (SotA) of science is still science, not magic.



