It’s really hit and miss for me. Well-defined small tasks seem OK. But every time I try some “agentic coding”, it burns through millions of tokens without producing anything that works.
> Beta technology disclaimer
> Rovo Dev in the CLI is a beta product under active development. We can only support a certain number of users without affecting the top-notch quality and user experience we are known for providing. Once we reach this limit, we will create a waiting list and continue to onboard users as we increase capacity. This product is available for free while in beta.
You---can, really--slow, speed up or change, how, things sound, by, -- using cues like this, to control how the voice,,, - tells the story {{3sec}} - once you find a voice you like, you can go in and {{1sec}}
Perhaps ironically (given the somewhat corrupted attribution to Alexandra Kollontai of the flippant remark that "the satisfaction of one's sexual desires should be as simple as getting a glass of water"), Stalin reeled in some of the sexual excesses that characterized the early Soviet regime, because, as it turns out, sex is indeed dangerous and deserving of honor and respect, and sexual degeneracy is a sure way to propel a society toward self-destruction and chaos.
Are the inputs and the responses in a non-English language? LLM APIs can get costly for non-English text, sometimes as much as 10x, because more tokens are consumed. Not sure what the solution is here.
Also, maybe you can use some kind of caching combined with embedding search to serve a previous response when the input is similar above a certain threshold.