Humans don't want to die because the ones that did never made the cut. Self-preservation was relentlessly hammered into every living being by evolution.
There isn't a reason why an AI can't be both conscious AND perfectly content to do what we want it to do. There isn't a reason for a constructed mind to strongly prefer existence to nonexistence.
No theoretical reason at least. Practical implementations differ.
Even if you set aside "we don't know for certain whether our AIs are conscious", there's still the whole "we don't know what our AIs want, or how to shape that with any reliability or precision" issue. Mechanistic interpretability is struggling, alignment is nowhere near solved, and at this rate we're likely to hit AGI before we get a proper solution.
I think the only frontier company that gives a measurable amount of fucks about the possibility of AI consciousness and suffering is Anthropic, and they put some basic harm mitigations in place.
> I think the only frontier company that gives a measurable amount of fucks about the possibility of AI consciousness and suffering is Anthropic, and they put some basic harm mitigations in place.
It seems more likely that this is just their chosen way of marketing themselves. Their recent exaggerated and unproven press releases confirm as much.
I am so tired. Tired of seeing the same inane, thoughtless "it's just marketing" take repeated over and over again.
Maybe, just maybe, people at Anthropic are doing the thing they do because they believe it's REALLY FUCKING IMPORTANT? Have you EVER considered this possibility?