It’s really even easier than that. I already do all my work on AWS and use Bedrock, which hosts every popular model, plus Amazon’s own, except OpenAI’s closed-source models.
I have a reusable library that lets me pick any of the models I’ve chosen to support, or any new model in the same family that uses the same request format.
On every project I’ve done, switching is a simple matter of changing a config setting and choosing a different model.
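A minimal sketch of what that config switch can look like, assuming a Python service; the env var name and model IDs here are illustrative, not the commenter’s actual setup:

```python
import os

# Hypothetical registry mapping logical names to Bedrock model IDs.
# Swapping providers is a config/env change, not a code change.
MODELS = {
    "default": "anthropic.claude-3-5-sonnet-20240620-v1:0",
    "budget": "amazon.nova-lite-v1:0",
    "open-weights": "meta.llama3-1-70b-instruct-v1:0",
}

MODEL_ID = MODELS[os.environ.get("LLM_MODEL", "default")]
```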
If the model provider goes out of business, it’s not like the model is going to disappear from AWS the next day.
Given a choice between being “locked in” to a major cloud provider and trusting your business to a randomish little company, you are never going to get a compliance department to go for the latter. “No one ever got fired for choosing AWS.”
This is the API, and it’s basically the same for all supported languages:
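A minimal sketch, assuming the Bedrock Converse API via boto3 (one request shape regardless of which model ID you pass; the region, prompt, and limits are placeholders):

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def ask(model_id: str, prompt: str) -> str:
    # One request shape for every Converse-capable model on Bedrock;
    # changing providers means changing model_id, nothing else.
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 512, "temperature": 0.2},
    )
    return response["output"]["message"]["content"][0]["text"]
```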
Real companies aren’t as concerned about cost as they are about working with other real companies, compliance, and so on, and they weigh the cost and opportunity of doing a thing against not doing it.
One of my specialties is call centers. Every call deflected to AI instead of a human agent saves between $5 and $15.
Even letting your cheaper human agents handle a problem, with AI assisting them in the background, saves money. $15 saved buys a lot of inference.
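Back-of-the-envelope, using illustrative Sonnet-class sticker prices (roughly $3 per million input tokens and $15 per million output tokens; actual negotiated rates vary):

```python
# How much inference does one deflected call buy?
# Prices are illustrative sticker rates, not a quote.
SAVINGS_PER_CALL = 15.00           # upper end of the $5-$15 range
INPUT_PRICE = 3.00 / 1_000_000     # $ per input token
OUTPUT_PRICE = 15.00 / 1_000_000   # $ per output token

# A generous per-call budget: 20k input + 2k output tokens.
cost_per_call = 20_000 * INPUT_PRICE + 2_000 * OUTPUT_PRICE
print(f"inference cost per call: ${cost_per_call:.2f}")                  # ~$0.09
print(f"margin per deflected call: ${SAVINGS_PER_CALL - cost_per_call:.2f}")  # ~$14.91
```

Even with heavy per-call token budgets, inference is a rounding error next to the cost of a human-handled call.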
And the lock-in boogeyman is something only geeks care about. Migrating from one provider to another costs so much at even medium scale that it’s hardly ever worth it, between the direct costs, the distraction from value-added work, and the risk of regressions and downtime.
99% of people who use it do so because of (a) existing agreements around compliance and billing (including credits, spend agreements, etc.) and (b) the IAM/org permission structures they already have set up.
> Isn't the API worse
No, for general inference the norm is to use provider-agnostic libraries that paper over individual differences. And if you're doing non-standard stuff? Throw the APIs at Opus or something.
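For example, with LiteLLM, one common provider-agnostic library (named here as an illustration, not necessarily what anyone in this thread uses), switching backends is a model-string change:

```python
from litellm import completion

# Same call shape whether the backend is Bedrock, OpenAI, or Anthropic
# direct; only the model string changes. AWS credentials come from the env.
resp = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Summarize our refund policy."}],
)
print(resp.choices[0].message.content)
```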
> Aren't the p95 latencies worse?
> The costs higher?
The costs for Anthropic models are the same, and the p95 latencies are not higher; if anything, they’re more stable. The open-weights models do look a bit more expensive, but as noted, many businesses don’t pay sticker price for AWS spend, or they find it worth it anyway.
One less vendor to vet, one less contract to negotiate, one less third-party system to administer. You’re already locked into AWS anyway. It integrates with other AWS services, and access control is already figured out.
I forgot to mention that. Funnily enough, AWS and GCP made a joint announcement that they’re introducing a service to let users easily connect the two providers’ private networks without going over the public internet.
This isn’t some kind of VPN solution; think more like Direct Connect, but between AWS and GCP instead of between AWS and your colo.
The theory is that AWS agreed to this so sales could tell customers they don’t have to move their workloads off AWS to take advantage of Google’s AI infrastructure without suffering extreme latency.