It would be a good thing, if it would cause anything to change. It obviously won't. As if a single person reading this post wasn't aware that the Internet is centralized, and couldn't name a few specific sources of centralization (Cloudflare, AWS, Gmail, GitHub). As if this were the first time it's happened. As if after the last time AWS failed (or the one before that, or the one before that…) anybody stopped using AWS. As if anybody could viably stop using them.
If anything, centralisation shields companies using a hyperscaler from criticism. You’ll see downtime no matter where you host. If you self host and go down for a few hours, customers blame you. If you host on AWS and “the internet goes down”, customers treat it as an act of God, a natural disaster that affects everyone.
It’s not great being down for hours, but that will happen regardless. Most companies prefer the option that helps them avoid the ire of their customers.
Where it’s a bigger problem is when a critical industry, like retail banking in a country, all chooses AWS. When AWS goes down, all citizens lose access to their money. They can’t pay for groceries or transport; they’re stranded and starving, and life grinds to a halt. But even then, this is not the bank’s problem, because they’re not doing worse than their competitors. It’s something for the banking regulator and government to worry about. I’m not saying the bank shouldn’t worry about it; I’m saying in practice they don’t worry about it unless the regulator makes them.
I completely empathise with people frustrated with this status quo. It’s not great that we’ve normalised a few large outages a year. But for most companies, this is the rational thing to do. And barring a few critical industries like banking, it’s also rational for governments to not intervene.
> If anything, centralisation shields companies using a hyperscaler from criticism. You’ll see downtime no matter where you host. If you self host and go down for a few hours, customers blame you.
Not just customers. Your management take the same view. Using hyperscalers is great CYA. The same goes for any replacement of internally provided services with external ones from big names.
Exactly. No one got fired for using AWS. Advocating for self-hosting or a smaller provider means you get blamed when the inevitable downtime comes around.
If you cannot give a patient life-saving dialysis because you don't have a backup generator, then you are likely facing some liability. If you cannot give a patient life-saving dialysis because your scheduling software is down due to a major outage at a third party and you have no local redundancy, then you are in a similar situation. Obviously this depends on your jurisdiction, and we are probably in different ones, but I feel confident that you want to live somewhere a hospital is held reasonably responsible for such foreseeable disasters.
Yeah, I mentioned banking because it's what I was familiar with, but the medical industry is going to be similar.
But they do differ - it’s never ok for a hospital to be unable to dispense care. But it is somewhat ok for one bank to be down. We just assume that people have at least two bank accounts. The problem the banking regulator faces is that when AWS goes down, all banks go down simultaneously. Not terrible for any individual bank, but catastrophic for the country.
And now you see what a juicy target an AWS DC is for an adversary. They go down on their own as it is, but surely Russia and others are looking at this and thinking “damn, one missile at the right data center and life in this country grinds to a halt”.
> If anything, centralisation shields companies using a hyperscaler from criticism. You’ll see downtime no matter where you host. If you self host and go down for a few hours, customers blame you.
What if you host on AWS and only you go down? How does hosting on AWS shield you from criticism?
This discussion is assuming that the outage is entirely out of your control because the underlying datacenter you relied on went down.
Outages because of bad code do happen and the criticism is fully on the company. They can be mitigated by better testing and quick rollbacks, which is good. But outages at the datacenter level - nothing you can do about that. You just wait until the datacenter is fixed.
This discussion started because companies are actually fine with this state of affairs. They are risking major outages, but so are all their competitors, so in practice it’s fine. The juice isn’t worth the squeeze for them, unless an external entity like the banking regulator makes them care.
I’m pretty Cloudflare-centric. I didn’t start that way. I had services spread out for redundancy. It was a huge pain. Then bots got even more aggressive than usual. I asked why I kept doing this to myself and finally decided my time was worth recapturing.
Did everything become inaccessible during the last outage? Yep. Weighed against the time it saves me throughout the year, I call it a wash. No plans to move.
I'm of a similar mindset... yeah, it's inconvenient when "everything" goes down... but realistically so many things go down now and then, it just happens.
Could just as easily be my home's internet connection, or a service I need from/at work, etc. It's always going to be something, it's just more noticeable when it affects so many other things.
To be honest, it's MUCH easier to have one source to blame when things go down. If a small-to-medium vendor's website goes down on a normal day, some poor IT guy is going to be fielding calls all day.
If that same vendor goes down because Cloudflare went down, oh well. Most people already know and won't bother to ask when your site will be back up.
> It would be a good thing, if it would cause anything to change. It obviously won't.
I agree wholeheartedly. The only change is internal to these organizations (e.g. Cloudflare, AWS). Improvements will be made to the relevant systems, and some teams will also audit for similar behavior, add tests, and fix some bugs.
However, nothing external will change. The cycle of pretending you are going to implement multi-region fades after a week, and each company goes on leveraging all these services to the nth degree, waiting for the next outage.
I'm not advocating that organizations should or could do much; it's all pros and cons. But the collective blast radius is still impressive.
The root cause is customers refusing to punish this downtime.
Check out how hard customers punish blackouts from the grid, both via wallet and via voting/gov't. It's why grids are now more reliable.
So unless the backbone infrastructure gets the same flak, nothing is going to change. After all, any change is expensive, and the cost of that change needs to be worth it.
I think you’re viewing the issue from an office worker’s perspective. For us, downtime might just mean heading to the coffee machine and taking a break.
But if a restaurant loses access to its POS system (which has happened), or you’re unable to purchase a train ticket, the consequences are very real. Outages like these have tangible impacts on everyday life. That’s why there’s definitely room for competitors who can offer reliable backup strategies to keep services running.
I'm talking more about some unrelated function taking down the whole system, not advocating for "offline" credit card transactions (is that even a thing these days?). For example: if the transaction needs to be logged somewhere, it can be built to sync whenever possible rather than blocking all transactions when the central service is down.
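A minimal sketch of that store-and-forward idea, purely to show the shape of it (the queue file, endpoint URL, and function names here are all made up):

```python
# Store-and-forward logging: record each transaction locally first,
# then opportunistically sync to the central service.
# The queue file and endpoint below are illustrative, not real.
import json
import pathlib
import urllib.error
import urllib.request

QUEUE = pathlib.Path("pending_transactions.jsonl")  # local append-only queue
CENTRAL = "https://example.invalid/transactions"    # hypothetical central endpoint

def record_transaction(tx: dict) -> None:
    """Always succeeds locally; never blocks on the central service."""
    with QUEUE.open("a") as f:
        f.write(json.dumps(tx) + "\n")

def sync_pending(timeout: float = 5.0) -> None:
    """Best-effort flush; whatever fails stays queued for the next attempt."""
    if not QUEUE.exists():
        return
    pending = [json.loads(line) for line in QUEUE.read_text().splitlines() if line]
    still_pending = []
    for tx in pending:
        req = urllib.request.Request(
            CENTRAL,
            data=json.dumps(tx).encode(),
            headers={"Content-Type": "application/json"},
        )
        try:
            urllib.request.urlopen(req, timeout=timeout)
        except (urllib.error.URLError, OSError):
            still_pending.append(tx)  # central service unreachable: keep it local
    QUEUE.write_text("".join(json.dumps(tx) + "\n" for tx in still_pending))

record_transaction({"id": 42, "amount_cents": 1250})
sync_pending()  # run periodically; harmless when the central service is down
```

The point is just that the sale goes through and gets recorded either way; only the reporting lags while the central service is out.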
Payment processor being down is payment processor being down.
Do any of those competitors actually have meaningfully better uptime?
At a societal level, having everything shut down at once is an issue. But if you only have one POS system targeting only one backend URL (and that backend has to be online for the POS to work), then Cloudflare seems like one of the best choices.
If the uptime provided by Cloudflare isn't enough, then the solution isn't a Cloudflare competitor; it's the ability to operate offline (which many POS systems have, including for card purchases), or at least multiple backends with different DNS, CDN, server locations, etc.
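For the multiple-backends idea, the client-side part can be as simple as trying a list of independently hosted base URLs in order. The URLs and names below are invented for illustration (a real POS client would also want retries, backoff, and idempotency keys):

```python
# Client-side failover across independently hosted backends.
# Each base URL would sit behind different DNS/CDN/hosting, so one
# provider outage doesn't take out every path at once.
# The URLs are placeholders, not real services.
import urllib.error
import urllib.request

BACKENDS = [
    "https://api-primary.example.invalid",
    "https://api-backup-eu.example.invalid",
    "https://api-backup-us.example.invalid",
]

def fetch_with_failover(path: str, timeout: float = 3.0) -> bytes:
    """Try each backend in order; raise only if every one of them fails."""
    last_error: Exception | None = None
    for base in BACKENDS:
        try:
            with urllib.request.urlopen(base + path, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, OSError) as exc:
            last_error = exc  # this backend is unreachable; try the next one
    raise RuntimeError("all backends unreachable") from last_error
```

The loop is the easy part; the hard part is keeping the backends genuinely independent (different DNS, different CDN, different hosting) so they don't all share the same failure.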
If it’s that easy to get the exact same service/product from another vendor, then maybe your competitive advantage isn’t so high. If Amazon were down I’d just wait a few hours, as I don’t want to sign up on another site.
I agree. These days it seems like everything is a micro-optimization to squeeze out a little extra revenue. Eventually most companies lose sight of the need to offer a compelling product that people would be willing to wait for.
I remember a Google Cloud outage years ago that happened to coincide with one of our customers' massively expensive TV ads. All the people who normally would've gone straight to their website instead got a 502. Probably a 1M+ loss for them, all things considered.
You need to punish the services you "paid" to use but that had downtime. So did you terminate any of those services over the downtime, or impose any sort of penalty on them as a result?
> Check out how hard customers punish blackouts from the grid, both via wallet and via voting/gov't.
What? Since when has anyone ever been free to just up and stop paying for power from the grid? Are you going to pay $10,000 - $100,000 to have another power company install lines? Do you even have another power company in the area? State? Country? Do you even have permission for that to happen near your building? Any building?
The same is true for internet service, although personally I'd gladly pay $10,000 - $100,000 to have literally anything else at my location; but there are no other proper wired providers, and I'll die before I ever install any sort of cellular router. Also, this is a rented apartment, so I'm fucked even if there were competition, although I plan to buy a house in a year or two.
Downtimes happen one way or another. The upside of using Cloudflare is that bringing things back online is their problem and not mine, unlike when I self-host. :]
Their infrastructure went down for a pretty good reason (let the one who has never caused that kind of error cast the first stone) and was brought back within a reasonable time.
Same idea with the CrowdStrike bug: it seems like it didn't have much of an effect on their customers, certainly not on my company at least, and the stock quickly recovered, in fact doing very well. For me, it looks like nothing changed, no lessons learned.
That's true of a lot of "Enterprise" software. Microsoft enjoys success from abusing its enterprise customers on what seems like a daily basis at this point.
For bigger firms, the reality is that it would probably cost more to switch EDR vendors than the outage itself cost them, and up to that point, CrowdStrike was the industry standard and enjoyed a really good track record and reputation.
Depending on the business, there are long-term contracts and early-termination fees; there's the need to run your new solution alongside the old one during migration; and there are probably years of telemetry and incident data that you need to keep on the old platform, so even if you switch, you're still paying for CrowdStrike for the retention period. It was one (major) issue over 10+ years.
Just like with Cloudflare, the switching costs are higher than the outage cost, unless there were major outages of that scale multiple times per year.
That IS the lesson! There are a million questions I can ask myself about those incidents. What dictates that they can't ever screw up? Sure, it was a big screw-up, but understanding the tolerances for screw-ups is important to understanding how fast and loose you can play it. AWS has at least one big outage a year; what's the breaking point? Risk and reward, etc.
I've worked at places where every little thing was yak-shaved, and at places where no one was even sure whether the servers were up during working hours. Both jobs paid well, and both had enough happy customers.
Not that I doubt examples exist (I've yet to be at a large place with zero failures in responding to such issues over the years), but if you're going to bother commenting about it, it would be nice to share the specific examples you have in mind. That helps people understand how much of this is a systemic problem worth taking an interest in, versus a comment that more easily falls into many other buckets. I'd try to build trust from your user profile as well, but it proclaims you're shadowbanned for two different reasons, despite me seeing your comment.
To be fair, AWS (and GCP and Azure) at least is easy to replace with something else, and pretty much all the alternatives are cheaper, less messy, etc. There are very few situations where you cannot viably do so.
We live in a world where you can get things like dedicated servers within similar time spans as creating a "compute engine" node on a big cloud provider.
The fact that cloud services imposed serious constraints on what applications were able to do (things like state management, passing configuration in more unified ways, etc.) means that running your own infrastructure is easier than ever, since your devs won't end up whining at you for something super custom just to make some project a bit easier. But if you really want to, you can.
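To make the "configuration in more unified ways" point concrete: an app that takes everything from environment variables (a habit the cloud platforms pushed) runs the same on a managed node, a dedicated server, or a laptop. A rough sketch, with made-up variable names and defaults:

```python
# Twelve-factor-style configuration: everything comes from the environment,
# so the same build runs anywhere without code changes.
# The variable names and defaults below are illustrative only.
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    database_url: str
    cache_url: str
    listen_port: int

def load_config() -> Config:
    return Config(
        database_url=os.environ.get("DATABASE_URL", "postgres://localhost/app"),
        cache_url=os.environ.get("CACHE_URL", "redis://localhost:6379/0"),
        listen_port=int(os.environ.get("PORT", "8080")),
    )

print(load_config())  # moving hosts means setting different env vars, not rewriting code
```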
GitHub has also become easy to get away from, and indeed many individuals and companies have done so.
CDNs are the bigger thing, but A) there are a lot of other CDNs, and B) having an image, or let's say an Ansible config, lets you quickly deploy something that might be close enough for your use case. Just take any hosting company, or even a dozen around the world.
Of course, if you allowed yourself to end up in complete vendor lock-in, things might be different, but if you think it's a good idea to be completely dependent on the whims of some other company, maybe you deserve that state. In other words, don't run a business without having any kind of fallback for the decisions you make. Yes, profit from the big benefit something might give you, but don't lock the door behind you.
Sure, you might get lucky, and maybe you're fine riding on luck while it lasts. Just don't be surprised when it all shatters.
It is as easy to not use them as it ever was. There has been no actual centralisation. Everything is done using open protocols. I don't know what more you could want.
Compare it to Windows, where there is deep volume discounting and salespeople schmoozing CTOs and getting in with schools, healthcare providers, etc. That's actual lock-in.
These outages are too few and far between. It would force some changes if it were a monthly event: if businesses started losing connectivity for 8 hours every month, maybe the bigger ones would run for self-hosting, or at least some capacity for self-hosting.
Here's where we separate the men from the boys, the women from the girls, the Enbys from the enbetts, and the SREs from the DevOps. If you went down when Cloudflare went down, do you go multicloud so that can't happen again, or do you shrug your shoulders and say "well, everyone else is down"? Have some pride in your work, do better, be better, and strive for greatness. Have backup plans for your backup plans, and get out of the pit of mediocrity.
Or not; shit's expensive, Kubernetes is too complicated, and "no one" needs that.
Same with the big CrowdStrike fail of 2024, especially when everyone kept repeating the laughable claim that these guys have their shit in order, so it couldn't possibly be a simple fuckup on their end. Guess what: they don't, and it was. And nobody has realized the importance of diversity for resilience, so all the major stuff is still running on Windows and using CrowdStrike.
I wrote https://johannes.truschnigg.info/writing/2024-07-impending_g... in response to the CrowdStrike fallout, and was tempted to repost it for the recent CloudFlare whoopsie. It's just too bad that publishing rants won't change the darned status quo! :')
People will not do anything until something really disastrous happens. Even afterwards, memories fade. CrowdStrike has not lost many customers.
Covid is a good parallel. A pandemic was always possible; there is always a reasonable chance of one over the course of a few decades. However, people did not take it seriously until it actually happened.
A lot of Asian countries are much better prepared for a tsunami than they were before 2004.
The UK was supposed to have emergency plans for a pandemic, but they were for a flu variant, and I suspect even those plans were under-resourced and not fit for purpose. We are supposed to have plans for a solar storm, but when another Carrington event occurs, I very much doubt we will deal with it smoothly.