
The Intel optimization manual has a fun example where they use vpconflict for vectorizing sparse dot products: https://github.com/intel/optimization-manual/blob/main/chap1...

I benchmarked it on Intel, and it was indeed quite fast, a good improvement over the scalar version. It will be interesting to try it on AMD.
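For anyone who hasn't met the instruction: vpconflictd gives each lane a bitmask of the earlier lanes that hold the same value. Here is a rough scalar sketch in Python of that semantics, plus one way it could be used to match indices between two sparse blocks. This is not the code from the manual; the function names `vpconflict` and `sparse_dot_block` are made up for illustration, and it assumes indices are unique within each block.

```python
def vpconflict(lanes):
    """Scalar emulation of VPCONFLICTD semantics.

    For lane i, bit j of the result (j < i) is set when lanes[j] == lanes[i].
    """
    out = []
    for i, x in enumerate(lanes):
        mask = 0
        for j in range(i):
            if lanes[j] == x:
                mask |= 1 << j
        out.append(mask)
    return out


def sparse_dot_block(a_idx, a_val, b_idx, b_val):
    """Dot product of two small sparse blocks (hypothetical helper).

    Assumption: indices are unique within a_idx and within b_idx, so any
    conflict seen by a lane from b points at exactly one lane from a.
    """
    n = len(a_idx)
    # Put a's indices in the low lanes and b's in the high lanes, then let
    # the conflict masks of the b lanes tell us which a lane matches.
    conflicts = vpconflict(list(a_idx) + list(b_idx))
    total = 0.0
    for k, mask in enumerate(conflicts[n:]):
        mask &= (1 << n) - 1  # keep only the bits that refer to lanes from a
        if mask:
            j = mask.bit_length() - 1  # position of the matching a lane
            total += a_val[j] * b_val[k]
    return total
```

A real AVX-512 version would do the comparison for all lanes in one instruction and then permute and multiply the matched values; the loop above only shows what the masks mean.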



Nice! Thanks for linking it :)



