LLMs are not as great for text adventures as they might seem. They can't be trusted with game logic at all; their world models are too weak for that.
They could maybe be used as glorified parsers (i.e. read human intent and convert to a command) but I have yet to see a good implementation of that.
They could also be used to embellish details which are insignificant and the author has not spent time on, but (a) what's the point? and (b) what the author chooses to leave unimplemented or generic is actually an important signal to the player.
Yeah, LLMs would need to be wildly constrained to a world model. I remember in some LLM text game that was supposed to be a medieval dungeon crawl, I just declared that I pulled a shotgun from my bag and blasted the goblins, and it just rolled with it instead of saying 'no.'
> what the author chooses to leave unimplemented or generic is actually an important signal to the player.
For computer games, yes, because you can get stuck with no apparent way out. But for TTRPGs, where a human GM is present and can steer the PCs out of blind alleys, it could be a very nice world-building aid. “Give me fifty random characters with backstory for a Pathfinder 2e game” is exactly the sort of thing we should use AI for. It doesn’t really matter. It’s flavor. So if the AI messes up… who cares?
LLMs are absolutely fantastic for text adventures. You are straw manning by assuming the idea is to just feed some world description into the context and then chat with the model.
How would you ensure that an LLM "gets" a puzzle mechanic and that reasonable attempts at it will be rewarded with progress, while not letting players sweet-talk it into disregarding the puzzle?
How about maintaining the state outside of the model's context, for example in a SQLite database? The language model's purpose would be to act as a language interface to a statically defined set of (SQL) commands/statements. And so on - there would be more problems to solve, of course, and sweet-talking may always remain a possibility, just as cheating is in any other game.
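A minimal sketch of that split, assuming game state in SQLite and a whitelist of parameterized statements. The `parse_intent` function here is a hypothetical stand-in for the LLM; a real version would prompt the model and validate its output against the whitelist before executing anything:

```python
import sqlite3

# Game state lives entirely in SQLite, never in the model's context.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER)")
db.execute("INSERT INTO inventory VALUES ('torch', 1), ('coin', 5)")

# Statically defined command set: the model may only pick one of these
# names and supply arguments; it never writes raw SQL itself.
COMMANDS = {
    "take": "INSERT INTO inventory VALUES (?, 1) "
            "ON CONFLICT(item) DO UPDATE SET qty = qty + 1",
    "drop": "UPDATE inventory SET qty = qty - 1 WHERE item = ? AND qty > 0",
}

def parse_intent(player_text):
    # Hypothetical stand-in for the LLM parser: maps free text to a
    # whitelisted command plus arguments, or None if nothing matches.
    if "pick up" in player_text:
        return ("take", player_text.rsplit(" ", 1)[-1])
    return None

cmd = parse_intent("pick up the lantern")
if cmd is not None:
    name, arg = cmd
    db.execute(COMMANDS[name], (arg,))  # only whitelisted SQL ever runs

qty = db.execute("SELECT qty FROM inventory WHERE item='lantern'").fetchone()[0]
print(qty)  # the lantern is now tracked in authoritative state
```

The key property is that the model's output is treated as untrusted input: if it names a command outside `COMMANDS`, nothing happens to the game state.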
The current crop of LLMs are not able to consistently/logically update the state in the SQLite database based on player actions. They will update it when they are not supposed to, update it to the wrong value, and fail to update it entirely when they are supposed to.
I tried this. It sounds good on paper, but the LLM will just "forget" to use its tools. Either it will decline to query the database and just make stuff up, or it will decline to update the database when it should. The further along the gameplay gets, the more out of sync the game world gets from the external state. I'm sure there is a clever solution but I never found it.
you're making the mistake of assuming that leaving the structure of the communication and gameplay to the LLM is the only option. the LLM is just a tool serving a specific purpose in the gameplay. the LLM cannot forget to query if the query/state-management task is simply an imperative step in a loop. it's not left to the LLM to remember it.
Do you have some recommendations? I've only had poor experiences with these things. They essentially can't hold the line between letting players "do what they want and the world responds" versus players "making the world do what they want and the world obeys."