
Yeah, "stochastic" is there because we give up control of the order of operations for speed.

So the order in which floating-point additions happen is not fixed: it depends on how threads are scheduled and how reductions are structured (tree reduction vs warp shuffle vs block reduction).

Floating-point addition is not associative (because of rounding), so:

- (a + b) + c can differ slightly from a + (b + c).

- Different execution orders → slightly different results → tiny changes in logits → occasionally different argmax token.
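You can see the non-associativity directly in plain Python (IEEE-754 doubles), no GPU required:

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6
print(left == right)  # False: same numbers, different order, different result

# Same effect at reduction scale: summing the same values in a
# different order generally gives slightly different totals.
import random
xs = [random.uniform(-1, 1) for _ in range(100_000)]
sums = set()
for _ in range(5):
    random.shuffle(xs)
    sums.add(sum(xs))
print(len(sums))  # usually more than 1
```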



Actually, that's a misconception. It's because of the varying batch sizes that requests get scheduled into: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...


Oh actually yeah that's true. You have correctly out-nitpicked my nitpick lol.

But at that point I feel like we are getting close to "everything that isn't a perfect Turing machine is somewhat stochastic" ;)

Edit: someone corrected me above, it does seem to matter more than I thought


> someone corrected me above, it does seem to matter more than I thought

If your LLM agent makes different decisions from the same prompt, then you have to deal with it:

1) your benchmarks become stochastic, so you need multiple samples to get confidence intervals for your A/B testing

2) if your system assumes at-least-once completion, you have to implement record and replay so you don't get multiple rollouts with different actions
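A minimal sketch of the record-and-replay idea in (2), with hypothetical names (`ActionLog`, `flaky_model` are illustrative, not any real library): record the first decision keyed by the prompt, so a retry replays the recorded action instead of re-sampling the nondeterministic model.

```python
import hashlib
import random

class ActionLog:
    """Record-and-replay cache: the first decision for a given prompt
    is recorded; any retry replays it instead of re-sampling."""
    def __init__(self):
        self._log = {}

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get_or_record(self, prompt: str, sample_fn):
        key = self._key(prompt)
        if key not in self._log:            # first attempt: record
            self._log[key] = sample_fn(prompt)
        return self._log[key]               # retries: replay

# Stand-in for a nondeterministic model call:
def flaky_model(prompt: str) -> str:
    return random.choice(["tool_a", "tool_b"])

log = ActionLog()
first = log.get_or_record("check inventory", flaky_model)
retry = log.get_or_record("check inventory", flaky_model)
assert first == retry  # the retry replays the recorded action
```

In a real agent you'd persist the log so a crashed rollout resumes with the same actions, rather than forking into a second, divergent rollout.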



