
Largest R1, as in the 671B? How do you accomplish that feat?


Just do it? llama.cpp doesn't load the entire thing into RAM. It mmaps the file, and the kernel pages the weights in from disk on demand.
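
If it helps to see the idea, here's a rough Python sketch of reading from a memory-mapped file. This is not llama.cpp's actual code (that's C/C++); the helper name `read_slice_via_mmap` and the sparse stand-in file are made up for illustration.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Return `length` bytes at `offset` without reading the whole file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing touches only the pages backing this range. The kernel
            # faults them in from disk on demand and may evict them later.
            return mm[offset:offset + length]


if __name__ == "__main__":
    # Hypothetical stand-in for a model file: 64 MiB with a marker at the end.
    size = 64 * 1024 * 1024
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.seek(size - 4)
        tmp.write(b"GGUF")
        path = tmp.name
    try:
        print(read_slice_via_mmap(path, size - 4, 4))  # b'GGUF'
    finally:
        os.remove(path)
```

Mapping the file reserves address space, not RAM; resident memory only grows as pages actually get touched.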



