> We remain highly uncertain about the potential moral status of Claude and other LLMs, now or in the future. However, we take the issue seriously, and alongside our research program we’re working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible.
> Our adversaries are great at espionage, stealing models that fit on a thumb drive is relatively easy, and most tech companies are far from operating in a way that would make this more difficult.
Mostly unrelated to the correctness of the article, but this feels like a bad argument. AFAIK, Anthropic/OpenAI/Google are not having issues with their weights being leaked (are they?). Why is it assumed that Meta's would be?
I think it’s hard to say. We simply don’t know much from the outside. Microsoft has had some pretty bad security lapses, for example around guarding access to Windows source code. I don’t think we’ve seen a bad security break-in at Google in quite a few years? It would surprise me if Anthropic and OpenAI had good security since they’re pretty new, and fast-growing startups have a lot of organizational challenges.
It seems safe to assume that not all the companies doing leading-edge LLMs have good security, and that the industry as a whole isn’t set up to keep secrets for long. Things aren’t locked down to the level of classified research. And it sounds like Zuckerberg doesn’t want to play the game that way.
At the state level, China has independent AI research efforts and they’re going to figure it out. It’s largely a matter of timing, which could matter a lot.
There’s still an argument to be made against making proliferation too easy. Just because states have powerful weapons doesn’t mean you want them in the hands of people on the street.
We have nationals/citizens of every major US adversary working in those companies, with looser security practices than a local warehouse. The security check before hiring is a joke (it mostly verifies that the resume checks out), laptops can be taken home, and internal communications are not segmented on a need-to-know basis. Essentially, if China wants weights or source code, it will have hundreds of people to choose from who can provide it.
>AFAIK, Anthropic/OpenAI/Google are not having issues with their weights being leaked. Why is it that Meta's model weights are?
The main threat actors there would be powerful nation-states, in which case they'd be unlikely to leak what they've taken.
It is a bad argument though, because one day possession of AI models (and associated resources) might confer great and dangerous power, and we can't just throw up our hands and say "welp, no point trying to protect this, might as well let everyone have it". I don't think that'll happen anytime soon, but I am personally somewhat in the AI doomer camp.
Not sure if you intended this, but it feels like the first sentence of your argument is more broadly a critique of the credentials of AI Safety proponents. Maybe you are distinguishing between doomers vs. broader AI Safety proponents, but if not, I feel like the counterargument is that most people on the CAIS letter (https://www.safe.ai/work/statement-on-ai-risk) interface quite frequently with these AI models and are also (purportedly) seriously concerned about AI safety.
> it feels like the first sentence of your argument is more broadly a critique of the credentials of AI Safety proponents
It's a not so thinly veiled critique of Eliezer Yudkowsky.
> Maybe you are distinguishing between doomers vs broader AI Safety proponents,
I do. These are different classes of people. But many doomers masquerade as AI Safety proponents, just as many conmen masquerade as ML/AI researchers. I suspect distinguishing the groups is quite difficult for those without domain expertise.
I don't care about the opinion of most of these people (there are some whose opinions I VERY much do care about), nor do I think this is a meaningful letter.
Interfacing with a model does not endow one with any level of expertise. If this were true, the whole thread would be ill-founded, because people using GPT are interfacing with it. Instead, one needs to actually deeply study these models. There are things we know about them, and quite a lot. The term "blackbox" gets thrown around a lot, but that doesn't make everyone's expertise on the matter equally valid. In fact, the more complex something is to understand, the fewer people are qualified to have a reasonable opinion on it. My complaint is that we often act as if the opposite is true.[0]
My second big problem with the CAIS letter is that it means nothing. All it says is "I don't want to kill all humans." That is a fairly universally agreed-upon statement, and is in fact the default position. It does not say anything about the potential risk. That's a completely different matter.
Worse, many of the people who have signed this are literally at the helm of the ships steering us into a dystopian future (which is not covered by this toothless letter). So I'm not sure what meaning this is supposed to have other than pageantry. Do not forget that these are the same exact people pushing and promoting abuse of these tools. I do not blame Average Joe for thinking that GPT is equivalent to Google (which itself cannot be trusted at face value, but this does not make it a useless tool) when that is often the way that it is promoted/advertised. So if you are concerned, I wouldn't use this as evidence.
[0] There's an added problem that you can become above average in any given subject relatively quickly. This is a double-edged sword: the knowledge is valuable, but it often results in overconfidence. And the learning difficulty grows exponentially, which is why there are so few experts in any given subject; expertise is understanding nuance and complexity. The great irony of the doomers is that they fall back on "unknown unknowns" while not putting effort towards putting a bound on them.
What? How is this not saying "Well, it might be in the best interests of humanity for OpenAI to do [hypothetical thing that seems pretty bad that OpenAI has never suggested to do], and because they may consider doing said thing, we shouldn't trust them"?
I think OP is just pointing out that "acting in the best interests of humanity" is fairly ambiguous and leaves enough room for interpretation and spin to cover any number of sins.
No, you can blame EAs for making stupid decisions and being generally bad people if they are making stupid decisions and being generally bad people. I'll second that I'm sorry for the bad experiences you've had with EAs.
In my experience, lots of them are simply good people trying to do more good. I hope you can have better experiences with them in the future.
Take anything you like. A cause, religious group, political party, sports fanbase, etc.
Some subset of each group are generally decent people, who do their best to be good to the people around them, and to live their lives in a generally moral and ethical manner.
Some subset of each group have no interest in morals or ethics, and attach themselves to the group for purely selfish reasons.
Judging the entire group based on the selfish or amoral subset is a logical fallacy of the most basic sort, and borders on religiosity. This is even more problematic when you look at the kinds of negative EA situations that have (rightly) caused controversy: the high-profile, big-money cases are the ones that get attention.
If confronted with someone who subscribes to the EA philosophy and by all measurable indicators has done incredibly good things for the world, would you be willing to change your mind?
While I agree that the rhetoric around AI Safety would be better if it tried to address some of the benefits (and not embody the full doomer vibe), I don't think many of the 'core thinkers' are unaware of the benefits of AGI. I don't fully agree with this paper's conclusions, but I think https://nickbostrom.com/astronomical/waste is one piece that embodies this style of thinking well!
Thanks for the link -- that is a good paper (in the sense of making its point, though I also don't entirely agree), and it hurts the AI risk position that that kind of thinking doesn't get airtime. It may be that those 'core thinkers' are aware, but if so it's counter-productive and of questionable integrity to sweep that side of the argument under the rug.