>What scraper or headless browser are you using? it works so well.

>Before 2019 ...

scandox · 2025-11-06T17:40:26 1762450826

I get that it convincingly simulates a human, but so do I (because I am a human), and I don't get through the paywall...

r721 · 2025-11-06T17:45:51 1762451151

Different tricks work for different websites: for NYT, it's enough to manually clear the nytimes.com cookies; FT used to work if you clicked through from Twitter/X; and so on. So I guess there is some set of per-site heuristics.

kazinator · 2025-11-06T19:19:42 1762456782

It seems that archive.is often has the full article for sites that are completely paywalled to every non-paying visitor: no cookie-driven freebies, nothing.

Publicly revealing everything they are doing would be a strategically bad idea, obviously.

It wouldn't be surprising if they actually pay for access to some of the sites.