Show HN: A ChatGPT TUI with custom bots (askmarvin.ai)
126 points by jlowin on April 8, 2023 | 51 comments
Hi HN! We just shipped a full-featured TUI (Text User Interface) for chatting with your Marvin bots, powered by GPT-4 or GPT-3.5. Like all of Marvin, it's fully open-source and we hope you find it useful. To launch it, upgrade and run `marvin chat`.

The TUI is built with Textual (https://github.com/textualize/textual/) and uses some of its newest features including background workers and modals. We've made basic TUIs before but this is the first one that's a true "app" with many screens and coordinated global state. Happy to answer any questions about working with Textual - once it "clicked" it was surprisingly similar to building a traditional front end! Small note: Terminal.app on macOS isn't great for TUIs, so while it'll work, we suggest an alternative terminal.

One of our goals with the TUI was to integrate Marvin's bots into the familiar chat UX. Bots can have distinct personalities, instructions, and use plugins, so each one is like a mini "conversational application." You might know about Marvin because of AI Functions, but at its core Marvin is a library for building and deploying bots (in fact, AI functions are actually a bot!). We started building the TUI as a way to quickly explore and assess our bots' capabilities. It quickly became so useful that we decided to make it a first-class experience.

We've preloaded several bots, including one that can guide you through an RPG and another that is obsessed with explaining regex, and will add many more. You can even create your own bots just by asking the default bot (Marvin) to help you.

We hope the TUI is a fun way to quickly interact with your bots and it was a great way for us to learn Textual. Please check out the code and let us know what enhancements we can add!



I hadn’t heard of Marvin before but it looks interesting. At first I thought this TUI was just a convenience interface on top of the ChatGPT web service but it’s quite a bit more than that.

Having dug into the docs for ten minutes, the library as a whole seems to be in the same space as langchain. Some of the starting abstractions are similar but overall it seems to take a higher-level approach with a focus on clarity and convenience. Will definitely try this out. Would also love to hear more about the origins and philosophy if any maintainers are about!

Edit: As an aside: I have found Textual quite difficult to get to grips with in the past. Too much magic, maybe. Does anyone know of any good alternatives at a similar level of abstraction? I don’t want to get down in the weeds to knock up a simple TUI.


Sure! We wrote a little bit about the origins in an announce post a couple weeks ago (https://news.ycombinator.com/item?id=35366838).

Marvin (https://www.github.com/prefecthq/marvin) powers our AI efforts at Prefect (https://www.github.com/prefecthq/prefect).

The first version of Marvin was an internal framework that powered our Slackbot. There are close to 30,000 members of our open-source community and we rely heavily on automation to deliver support. Then, as more of our customers started building AI stacks, we began to view Marvin as a platform to experiment with high-level UX for deploying AI. We have a few internal use cases, but it was the diversity of customer objectives that gave us confidence.

Historically, we've always focused on data engineering, but the more we worked with LLMs, the more we saw the same set of issues, basically driven by the need to integrate brittle, non-deterministic APIs that are heavily influenced by external state into well-structured traditional engineering and pipelines. We started using Marvin to codify the high-level patterns we were repeatedly deploying, including getting structured outputs from the LLM and building effective conversational agents for B2B use.

The lightbulb moment was when we designed AI functions, which have no source code and essentially use the LLM as a runtime. It's one of those ideas that feels too simple to actually work... but it actually works incredibly well. It was the first time we felt like we weren't building tools to use AI, but rather using AI to build our tools. We open-sourced with AI functions as the headline and the response has been amazing! Now we're focused on releasing the "core" of Marvin -- the bots, plugins, and knowledge handling -- with a similar focus on usability.
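To make the "LLM as runtime" idea concrete, here's a toy sketch of the pattern (this is not Marvin's actual implementation; the decorator name, prompt format, and `fake_llm` stub are all made up for illustration, so the sketch runs offline):

```python
import functools
import inspect

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; echoes the prompt so the sketch
    runs without an API key. A real version would call the LLM here."""
    return f"<llm would answer: {prompt!r}>"

def ai_fn(func):
    """Toy 'AI function': the body is never executed. The signature and
    docstring become the prompt, and the model's reply becomes the
    return value (a real version would parse it into the return type)."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        prompt = (f"You are the function {func.__name__}{sig}. "
                  f"Docstring: {func.__doc__}. "
                  f"Arguments: {dict(bound.arguments)}. "
                  f"Return only the result.")
        return fake_llm(prompt)

    return wrapper

@ai_fn
def rhymes_with(word: str):
    """Return three words that rhyme with `word`."""
    ...  # no source code: the LLM is the runtime
```

The point is that `rhymes_with` has no implementation at all; calling it just ships its own description to the model.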

Hope that's what you were looking for!


Superb, that’s very helpful, thanks! I’m going to add this to our toolkit and use it alongside our langchain experiments. The space has a real “thousand flowers” feel to it atm.

At risk of running out my question quota: we use dagster at the moment but have found that it has some unexpected rough edges. Would you be able to point me in the direction of your preferred comparison against Prefect?


No quota here! If you like Marvin, you’ll probably like Prefect. Both are designed to be clean, Pythonic interfaces to a complex and hard-to-observe system (a data stack; an AI stack).

I think one of the key differences between Prefect and Dagster is that Prefect views orchestration as coordination, while Dagster views orchestration as reconciliation. The data stack is a complex system whose state is frequently mutated by forces outside our users’ control. Therefore, our product is focused on letting our users understand and react to those events, no matter where they come from. That could include everything from scheduling fully-orchestrated Prefect pipelines, to setting an SLA for database maintenance that Prefect doesn’t have anything else to do with. Reconciliation, in contrast, requires users to define a digital twin of their stack, in order to serve as ground truth and become the reconciliation target. Philosophically, we view Prefect as one piece of an ever-changing stack. We are focused on being as flexible as possible to fit into the stack, rather than the other way around.


Excellent! Thanks so much for your time, it’s very helpful and I really appreciate it.


Hop on the slack, there are plenty of folks (myself included) who are happy to help get you started. I found prefect after slogging through a year of painful development work on an Airflow project, and I have been a rabid fan of it ever since. Been using it at my current gig for 2+ years, on an old version of the open source offering. Despite all that it's been a dream to work with.


Anytime! Thanks for checking out our work!


As a developer eagerly awaiting the chance to access the GPT-4 API, I can't help but express my growing frustration with the waitlist system. I understand the need for a gradual rollout to ensure server stability and mitigate misuse, but it feels like it's been an eternity since I signed up.

The potential of GPT-4 is truly game-changing, and seeing all the amazing projects other developers have built is just adding to the anticipation. It's disheartening to be left on the sidelines while others seem to be getting access and capitalizing on these opportunities.

I believe a more transparent approach to the waitlist would go a long way in alleviating some of this frustration. If we had a better idea of where we stand in the queue or an estimated time for access, it would make the waiting game more bearable. As it is, we're left in the dark, wondering if we'll ever get the chance to dive into this powerful tool.

In the meantime, it's back to refreshing my email inbox and cursing my luck. Hoping for a more equitable distribution of access soon, so that all of us excited developers can start bringing our ideas to life with GPT-4.


I would contact support and check with them. I was given access to gpt4 with 8K context API within a few hours of requesting it. Maybe I got super lucky but my guess is something may have gone wrong with your request or the email notification. Have you tried using your API keys with gpt4 API requests? Maybe it already works? Best of luck, hope it gets resolved


thanks for the advice; I've emailed OpenAI support for an update on my GPT-4 API access – fingers crossed!


Same with me. I got it pretty soon after announcement.


Yeah me too - all I can do now is build around gpt-3.5-turbo and assume the responses will be similar to what I get from GPT-4 with my Plus membership.

You'd think that since I pay for Plus and API credits I'd get access to GPT-4, but nope.


it's surprising that even with a Plus membership and API credits, GPT-4 access remains elusive – hope we both get off the waitlist soon!


Sorry this has been your experience. I signed up and was given access the next day and I've barely even used it. Hopefully you get in soon.


You should be able to just buy GPT4 access through poe.com's iOS app (and even get limited free access to GPT4 through the poe.com website)


Does it work with chatgpt plus credentials, or does it need an API token?


I just got a ChatGPT subscription and quickly realised that it might be much cheaper for me to use the API with an alternative client. Is there any downside to using the API besides not being able to access the default UI?

(when Dall-e 2 came out I saved 7-8x by writing a custom front-end and using the API keys instead of buying tokens, I'm a cheap bastard)
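A quick back-of-envelope check of the claim, using the pricing as I remember it from April 2023 (gpt-3.5-turbo at $0.002 per 1K tokens, ChatGPT Plus at a flat $20/month - verify against current pricing before relying on this):

```python
PLUS_MONTHLY_USD = 20.00          # flat ChatGPT Plus subscription
GPT35_TURBO_PER_1K_TOKENS = 0.002  # gpt-3.5-turbo rate, April 2023

def api_cost_usd(tokens_per_month: int,
                 per_1k: float = GPT35_TURBO_PER_1K_TOKENS) -> float:
    """Approximate monthly API spend for a given token volume."""
    return tokens_per_month / 1000 * per_1k

# Even a heavy month of 2 million tokens of gpt-3.5-turbo
# costs a fifth of the flat subscription.
heavy_month = api_cost_usd(2_000_000)  # -> 4.0
```

The math flips for GPT-4-class models, whose per-token rates were more than an order of magnitude higher - which matches the sibling comment's experience.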


for gpt-3.5-turbo this may be true, depending on your usage rate. but in my experience, for gpt-4 it is more expensive to use my key.


It does need an API token - it will ask you for it when you start if you haven’t stored one already.


I would love something with this sophistication for the web. Currently most offer just a chat on top of API keys and, if you're very lucky, multiple threads. I haven't found one yet with personalities, agents, and plugins.


We’ve done some work on a full UI for Marvin! Some things work great in a terminal, others really need the flexibility of the web.


The TUI looks great! I would love if it could also work with the Azure OpenAI API.


I see Langchain has support for Azure chat models, and Marvin is built on Langchain so it may not be so difficult! Tracking issue here: https://github.com/PrefectHQ/marvin/issues/189


Personally I find the idea of having a Bot Manager quite interesting. I imagine future APIs could be bots that talk to each other, and this Bot Manager would allow humans to join the conversation.


Exactly! Threads in Marvin are designed to support multiple bots and users. Two key user stories:

- multiple users in a Slack thread talking to the same bot. This is something we want to deliver soon, as Marvin powers our existing Slack bots

- one user addressing multiple bots, each of which is designed for a specific purpose (because bots do way better with reduced scope than when you have one bot try to do everything)
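The second pattern - many narrow bots behind one entry point - can be wired up with a thin dispatcher in front. A toy sketch of the idea (the bot names and keyword routing are invented for illustration; a real router might ask a classifier LLM to pick the bot instead of keyword-matching):

```python
from typing import Callable, Dict

# Hypothetical narrow bots: each handles one well-scoped task.
def regex_bot(msg: str) -> str:
    return f"[regex-bot] explaining: {msg}"

def rpg_bot(msg: str) -> str:
    return f"[rpg-bot] narrating: {msg}"

ROUTES: Dict[str, Callable[[str], str]] = {
    "regex": regex_bot,
    "rpg": rpg_bot,
}

def dispatch(msg: str,
             default: Callable[[str], str] = regex_bot) -> str:
    """Route a message to the first bot whose keyword appears in it,
    falling back to a default bot."""
    for keyword, bot in ROUTES.items():
        if keyword in msg.lower():
            return bot(msg)
    return default(msg)
```

Each bot keeps a small, focused scope, which is exactly why they outperform one do-everything bot.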


> bots do way better with reduced scope than when you have one bot try to do everything

Absolutely true and something that perhaps doesn’t get mentioned enough. Context constraints are a superpower.


I'm surprised nobody (that I've seen) has yet made a talking UI for ChatGPT.

That is: you speak your question to it and it speaks back its answer while writing to the screen. Or maybe I've missed something like that?


I created one, focused on language skills, but can be used for anything. https://github.com/drorm/leah


I built one with Eleven Labs (they have the BEST voices so far), but the latency is really slow.

https://twitter.com/tristan_mm/status/1636187642105851906?s=...


Various Telegram bots posted to HN do this. The main issue is that you generally have to wait for the OpenAI request to finish before synthesizing audio while the text can stream in immediately. But it's not bad.

https://github.com/danneu/telegram-chatgpt-bot
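One common way to soften that latency is to start synthesizing audio as soon as the first complete sentence has streamed in, rather than waiting for the full response. A hedged sketch of the buffering half (plain Python, no API calls; the sentence-splitting heuristic is deliberately crude):

```python
import re
from typing import Iterable, Iterator

def sentences_from_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Buffer streamed text fragments and yield complete sentences,
    so TTS can start on the first sentence while the rest streams in."""
    buf = ""
    for tok in tokens:
        buf += tok
        # Naive split: sentence-ending punctuation followed by whitespace.
        while (m := re.search(r"[.!?]\s", buf)):
            yield buf[:m.end()].strip()
            buf = buf[m.end():]
    if buf.strip():
        yield buf.strip()  # flush whatever trails the last delimiter
```

Feeding each yielded sentence straight to the TTS engine means the user hears the first sentence while the model is still generating the rest.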


I played around with this, looks good: https://news.ycombinator.com/item?id=35358873

I realised though that for most of my use cases I actually want it to respond with text.


Yeah, quite a few out there. As long as you can write an OpenAI API integration and integrate with browser apis for TTS & transcription, you're set. Probably 20-30 hours total for an implementation.


I was reviewing my old projects recently and found one, a voice assistant from 9-10 years ago, where I used voice recognition and TTS plus a bit of NLP to communicate with Wolfram Alpha as a sort of more nerdy Siri. It was pretty trivial, just a bunch of open APIs put together.

The funny part about it (at least to me) was that I called it a *HTML5* voice assistant, because that was the cool and trendy tech then, no mention of AI/ML anywhere.


Bing (based on ChatGPT) does that, at least on mobile, if you talk to it then it will talk back to you.


How would you use it?

I built such a voice assistant for myself, but found that audio is a limiting medium.


I think TUIs tend to be overrated. They're not very practical, and mostly end up as novelty applications.

Regardless, I respect the effort in building this - great work.


I disagree. I'm not a huge user, but there are many people I work with who want to work as exclusively with keyboard / inside a terminal pane as possible.

So there are definitely plenty of people out there for whom this stuff isn't just a novelty.

It also then simplifies workflow for those who are SSHing into a machine they have control over etc..


Well, if you're interested in something more lightweight, I wrote

https://github.com/drorm/gish

which is a shell command that lets you interact with GPT with flags, pipes, etc. in a much more unixy way.

This TUI has some impressive features, like the bots and plugins, but I feel gish covers most of the use cases, specifically for software development.


I wrote a simple REPL for chatGPT. In case you are interested you can find it here https://github.com/Phat3/LLM-Repl


using GPT to define agents like this is such an exciting opportunity.

I'm looking forward to the stacked layers of agent + environment definitions that will really explode the ways we can interact with AI.

I'm working on a project to use GPT agents/scenarios for smart contract arbitration, i.e. judging contests, civil disputes, etc.


I can't seem to post code into it; otherwise this is my favourite UI so far.


Unfortunately multiline inputs are still tricky but when Textual supports them, we’ll add them.


As someone who loves to stay in the terminal when possible, this looks awesome. The lack of multi-line input is a bit of a bummer but I guess I can c+p from my editor or something


I played around with the idea of creating a TUI for chatGPT as well but I gave up because of the lack of multi line support in textualize. I created a REPL instead using rich. If you wanna give it a shot you can find it at https://github.com/Phat3/LLM-Repl


Thanks! And I agree - as soon as Textual has multi-line inputs, we'll include them.


what i really want to find is a dead simple chatgpt api interface that lets you store conversations on the filesystem such that they can be easily resumed.

like it just reads from stdin, parses an optional header, and then parses alternating user and assistant messages. such that you can just pipe the whole shebang in from any text editor and get the same thing back with output appended
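The parsing half of that is small enough to sketch. Assuming a made-up convention - an optional `system:` header block, then turns separated by `---` lines (this delimiter is an arbitrary choice, not a standard) - the transcript maps onto the chat API's messages list like this:

```python
from typing import Dict, List

SEP = "---"  # delimiter line between turns (arbitrary convention)

def parse_transcript(text: str) -> List[Dict[str, str]]:
    """Turn a plain-text transcript into chat-API-style messages:
    an optional leading 'system:' block, then alternating
    user/assistant turns separated by SEP lines."""
    blocks = [b.strip() for b in text.split(f"\n{SEP}\n") if b.strip()]
    messages = []
    if blocks and blocks[0].startswith("system:"):
        messages.append({"role": "system",
                         "content": blocks.pop(0)[len("system:"):].strip()})
    roles = ("user", "assistant")
    for i, block in enumerate(blocks):
        messages.append({"role": roles[i % 2], "content": block})
    return messages
```

Resuming is then just: pipe the file in, append the model's reply plus a fresh `---` to the same file, and pick it up again later from any editor.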


https://gist.github.com/shikaan/b978c5553a5545f5a48e2b8a6f42...

You can store conversations somewhere else than /tmp using an argument, for example


brilliant, thank you, that's perfect. i knew this had to be out there



Could you describe a scenario where you need to resume a conversation?


i tend to dump a handful of source files related to a project i'm working on and ask about making changes. i like to keep an ongoing conversation where i ask about different ways i might implement the changes. keeping the conversation on the filesystem makes it easier to think on stuff and come back to them later.



