
This is no longer true. They changed their policy to ignore robots.txt in 2017. I seem to recall that they still respected robots.txt later, though I can’t find any more information on it and may be misremembering. Currently, they do not.


Does it mean archive.org works for any sites?

My main use for archive.is is for sites that somehow cannot be archived on archive.org (a message shows up saying the site cannot be archived, or something along those lines).

archive.is is generally pretty good at forcing an archive: if the HTML capture doesn't work, the screenshot will work fine. It doesn't seem to handle GIFs or videos, though.


> Does it mean archive.org works for any sites?

They still respected exclusion requests after they stopped respecting robots.txt. I don't know their policy for new exclusion requests.


Oh. Did not know that.



