
Also it's not just about running an obviously worse quant.

Running different GPU kernels / inference engines also matters. It's easy to write an implementation that is faster and thus cheaper but numerically much noisier / less accurate.
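To make the "numerically noisier" point concrete, here's a toy sketch in plain Python. It is not a real GPU kernel. It just emulates a float32 accumulator (via `struct`) and shows that the same dot product gives different answers depending on the order in which the terms are reduced, and reduction order is one of the things a faster kernel is free to change. The helper names (`to_f32`, `dot_f32`) and the input values are made up for illustration.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_f32(a, b):
    """Dot product where every product and partial sum is rounded to float32.

    Hypothetical stand-in for a kernel that accumulates in low precision.
    """
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_f32(acc + to_f32(x * y))
    return acc


def dot_reference(a, b):
    """Reference result: float64 products, summed with math.fsum."""
    return math.fsum(x * y for x, y in zip(a, b))


# Illustrative inputs: one large term and many tiny ones.
weights = [1.0] + [1e-8] * 1000
ones = [1.0] * len(weights)

# Large term first: each tiny term is below half a float32 ulp of 1.0,
# so it is rounded away and the accumulator stays at exactly 1.0.
large_first = dot_f32(weights, ones)

# Tiny terms first: they add up among themselves before meeting the 1.0.
small_first = dot_f32(list(reversed(weights)), ones)

reference = dot_reference(weights, ones)

print("large first:", large_first)
print("small first:", small_first)
print("reference:  ", reference)
```

Same inputs, same math on paper, but the two orderings disagree by roughly 1e-5. Real kernels differ in much subtler ways (tiling, split reductions, fused ops, accumulator dtype), but the mechanism is the same.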


