Hacker News

Scraped Reddit text archives (~23B items, according to Reddit's corporate info page) come to ~4 TB of compressed JSON, and that includes metadata, not just the actual comment text.
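
For a rough sense of scale, here is a back-of-envelope sketch of what those two figures imply per item. Both inputs are the approximate numbers quoted above; treating "TB" as decimal terabytes (10**12 bytes) and assuming the archives cover all ~23B items are my assumptions, and the names below are just illustrative.

```python
# Back-of-envelope: average compressed size per archived item.
# Assumptions: "TB" means 10**12 bytes, and the archives cover all ~23B items.
TOTAL_BYTES = 4e12   # ~4 TB of compressed JSON
ITEM_COUNT = 23e9    # ~23B items


def avg_bytes_per_item(total_bytes: float, item_count: float) -> float:
    """Return the mean number of bytes per item."""
    return total_bytes / item_count


bytes_per_item = avg_bytes_per_item(TOTAL_BYTES, ITEM_COUNT)
print(f"~{bytes_per_item:.0f} bytes per item (compressed, metadata included)")
```

That works out to under 200 bytes per item after compression, and since metadata is part of that, the comment text alone would account for less.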



