> I must have missed the part when it started doing anything algorithmically. Ye...

> I must have missed the part when it started doing anything algorithmically.

Yeah.

"Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers"

@dang there's something weird about this URL in HN. It has 35 points but no discussion (I guess because the original submission is too old and never got any traction or something)