This is a great point! What I linked is a quick prototype built in a few hours, and I have quite a few ideas to ensure more world consistency (beyond Pliny-style prompt jailbreaking). I haven't had the time to prove they would work well yet, though.
I ended up giving up. It's incredibly hard to keep it on track while still letting the user be creative. At any time I could just say things like "I jump into the lake" or "I open the chest" even though neither one had been mentioned, and it would happily continue on. I found myself pretty far down the "generate a JSON scene full of JSON objects to interact with" path and quit, because at that point you're just writing a game engine.
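To illustrate where this turns into engine work, here is a minimal sketch of checking a player action against a JSON scene before the story is allowed to continue. The scene schema, the object names, and the `validate_action` helper are all made up for illustration; this is not what any particular prototype does, just the general shape of the idea.

```python
import json

# Hypothetical scene, in the form a model might be asked to emit.
# The schema ("location", "objects", "actions") is an assumption for this sketch.
SCENE_JSON = """
{
  "location": "forest clearing",
  "objects": {
    "campfire": {"actions": ["light", "examine"]},
    "backpack": {"actions": ["open", "examine", "take"]}
  }
}
"""


def validate_action(scene, verb, target):
    """Return (ok, message), rejecting anything the scene does not define."""
    obj = scene["objects"].get(target)
    if obj is None:
        # The player referenced something that was never part of the scene.
        return False, f"There is no {target} here."
    if verb not in obj["actions"]:
        # The object exists, but this interaction is not allowed on it.
        return False, f"You can't {verb} the {target}."
    return True, f"You {verb} the {target}."


scene = json.loads(SCENE_JSON)
print(validate_action(scene, "open", "chest"))     # rejected: no chest in scene
print(validate_action(scene, "open", "backpack"))  # accepted
```

Even this toy version needs a parser to turn "I open the chest" into a verb and a target, plus state updates after each accepted action, which is exactly the game-engine work described above.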