
> Developers can run inference on Llama 3.1 405B on their own infra at roughly 50% the cost of using closed models like GPT-4o

Does anyone have details on exactly what this means, or where/how this metric is derived?



I am guessing these are prices on services like AWS Bedrock (their post is down right now).
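
For what it's worth, the comparison presumably boils down to per-token price arithmetic. A minimal sketch (every price below is a hypothetical placeholder, not an actual Bedrock or OpenAI quote):

    # Back-of-envelope API cost comparison.
    # All prices are HYPOTHETICAL placeholders, not real quotes.

    def cost_usd(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
        """Cost of one workload given USD prices per 1M tokens."""
        return ((input_tokens / 1e6) * price_in_per_m
                + (output_tokens / 1e6) * price_out_per_m)

    # Hypothetical per-1M-token prices; substitute whatever your provider charges.
    gpt4o = cost_usd(1_000_000, 1_000_000, price_in_per_m=5.00, price_out_per_m=15.00)
    llama405b = cost_usd(1_000_000, 1_000_000, price_in_per_m=2.50, price_out_per_m=7.50)

    print(f"ratio: {llama405b / gpt4o:.0%}")  # -> 50% under these placeholder prices

Whether the 50% figure comes from Bedrock list prices, amortized self-hosted GPU costs, or something else is exactly the open question.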


A big chunk of that is probably that you don't have to pay a third party's profit margin for running inference off-premises.
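
To put toy numbers on that (all hypothetical, just to show the shape of the argument): if the hosted API price is the provider's raw compute cost plus a markup, self-hosting only has to beat the marked-up price, not the raw cost:

    # Toy margin model -- every number here is HYPOTHETICAL.
    provider_compute = 1.00  # provider's raw cost per 1M tokens
    margin = 0.60            # 60% markup
    api_price = provider_compute * (1 + margin)  # 1.60

    self_hosted = 1.10       # your amortized GPUs + power per 1M tokens

    print(f"savings vs API: {1 - self_hosted / api_price:.0%}")  # ~31%

So even if your own infra is less efficient per token than the provider's, the absent margin can still make self-hosting cheaper.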



