
Why wouldn't you want to punish people who are at fault, especially if there is a pattern of problems or gross negligence that causes harm?


Blame-oriented cultures result in people hiding problems, which makes them worse in the long run.

That doesn’t mean people aren’t accountable for their actions in “blameless” cultures, but problems are first assumed to be the result of systemic issues, deserving to be fixed, rather than individual issues, deserving to be punished.


Exactly. I had a team lead who started working for me a while back. We had scripts that could be run on the web farm to perform different tasks.

Two of them relevant to this story were:

  webservers.regenerate.all.cache.files
  webservers.release-prep.stop.all.services
The first one would refresh all the cached information after a marketing database update. The second would stop all the webservers.

Guy's first day; I'm showing him the ropes; we push the marketing data update and set about regenerating all the cache files by manually picking the correct file from the folder of all possible files. I'm sure we can all guess what happened to make this a story remotely worth telling...

Complete site outage. Completely unnecessary. Completely human error.

Should we blame the guy who clicked on the file that was directly adjacent to the one he intended? Should we blame me as the guy overseeing the training? Or should we change the system so that files that we use multiple times every day and are safe/innocuous aren't right next to an E-stop/EPO button? Or maybe we should change the system so that pushing marketing data refreshes the cache files automatically?

Blameless culture favors the latter actions over the former and tends to make your operation stronger and more resilient over time. The experts (and the novices) who made the mistake can speak freely about what happened and how we might prevent it, without fearing reprisal.

If someone kills the site by mistake time after time, despite reasonable safeguards being in place, they should face disciplinary action. But when they make an honest mistake because we left an idling chainsaw lying around on the workbench, it makes no sense to blame them for grabbing it by mistake.
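
For what it's worth, here is a rough Python sketch of the two system changes suggested above: the data push refreshes the caches by itself, and the stop-everything task refuses to run without a typed confirmation. All of the function names, the confirmation phrase, and the in-memory log are made up for illustration; the real scripts were files in a folder, not Python functions.

```python
from typing import Callable, List

# Hypothetical phrase an operator must type before a destructive task runs.
CONFIRM_PHRASE = "STOP ALL SERVICES"


class ConfirmationError(Exception):
    """Raised when a destructive task is invoked without the expected phrase."""


def regenerate_all_cache_files(log: List[str]) -> None:
    # Safe, routine task: stand-in for webservers.regenerate.all.cache.files.
    log.append("regenerated cache files")


def stop_all_services(log: List[str], confirm: str = "") -> None:
    # Destructive task: stand-in for webservers.release-prep.stop.all.services.
    # A misclick alone is not enough; the caller must supply the exact phrase.
    if confirm != CONFIRM_PHRASE:
        raise ConfirmationError("refusing to stop services without confirmation")
    log.append("stopped all services")


def push_marketing_data(
    log: List[str],
    refresh: Callable[[List[str]], None] = regenerate_all_cache_files,
) -> None:
    # The push triggers the cache refresh itself, so nobody has to pick
    # the right file out of a folder by hand afterwards.
    log.append("pushed marketing data")
    refresh(log)
```

With something like this, the routine path never goes near the stop task, and running the stop task by accident fails loudly instead of taking the site down.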


You are conflating criminal acts with engineering problems.

As stated earlier, blameless postmortems are for RCA of a particular incident. If you shoot every engineer who causes an incident, you will succeed in having no incidents, because no one will bring them up or make any changes for fear of getting shot.


You might want to, but whether you should is a separate question.

If you promise people that there will be no blame, no punishment, no nothing, then they may speak more forthrightly. Do you want that honesty or do you prefer to retain the option of punishment?


Because in the end the powers at work will make sure that the blame goes to some person turning a wrench instead of finding the real reason and the real source of the blame. Most likely the real fix would have been addressing a process problem or inadequate redundancy/checks-and-balances. Pointing at Max over in Wrenching because he left a wrench in the motor, when he's been working double shifts for months on end, does not fix future issues, and Hank the Replacement will likely make a similar mistake at some point in the future.



