
Well, no, we have the HumanEval results for the June release.


Which is both (1) a subjective selection of problems for measuring the effectiveness of various chatbots and (2) now subject to gaming by companies with opaque, closed, inaccessible, unverifiable systems, like OpenAI.



