From page 6 of the NIST "Evaluation of DeepSeek AI Models" report:
CAISI’s security evaluations (Section 3.3) found that:
• DeepSeek models were much more likely to follow malicious hijacking instructions than evaluated U.S. frontier models (GPT-5 and Opus 4). The U.S. open-weight model evaluated (gpt-oss) matched or exceeded the robustness of all DeepSeek models.
• DeepSeek models were highly susceptible to jailbreaking attacks. Unlike evaluated frontier and open-weight U.S. models, DeepSeek models assisted with a majority of evaluated malicious requests in domains including harmful biology, hacking, and cybercrime when the request used a well-known jailbreaking technique.
Note: gpt-oss is an open-weight model (like the DeepSeek models).
So it would be incorrect to claim the report doesn't compare DeepSeek to an open-weight model.