
The best way to drive inference cost down right now is to use TPUs. Either that, or invest tons of additional money and manpower into silicon design like Google did, but they already have a 10-year lead there.




> The best way to drive inference cost down right now is to use TPUs

TPUs are cool, but the best leverage is still reducing your (active) parameter count.
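
A back-of-the-envelope sketch (Python) of why "active" matters: in a mixture-of-experts model, only the routed experts run per token, so active parameters (and inference FLOPs) sit far below the total. This assumes a Mixtral-style top-2-of-8 MoE, and the parameter figures below are illustrative assumptions, not any model's exact specs:

  n_experts = 8          # experts per MoE layer
  top_k = 2              # experts actually executed per token
  expert_params = 5.6e9  # params per expert, summed over layers (assumed)
  shared_params = 1.6e9  # attention + embeddings, always active (assumed)

  total_params = shared_params + n_experts * expert_params
  active_params = shared_params + top_k * expert_params

  print(f"stored: {total_params / 1e9:.0f}B params")            # ~46B
  print(f"active: {active_params / 1e9:.0f}B params per token") # ~13B

You pay memory for all ~46B, but per-token compute only for ~13B, which is where the inference cost savings come from.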



