Discussion about this post

NotPeerReviewed:

As someone who works with language models, and has spent some time mucking about in their innards, I'd like to split the "world model" thing into subquestions:

1) Do stacks of encoder and/or decoder layers have a world model in them?

2) Do (a) the outputs vanilla GPTs generate naively, or (b) the outputs some GPTs are tuned to generate using RLHF or similar techniques, draw upon that world model in a consistent and reasonable way, or are they just kind of BSing their way through it in ways that seem humanlike to humans?

The answer to (1) is definitely yes; there are lots of things language models can do that they wouldn't be able to do if they didn't have a world model in them.

I think the answer to (2a) is mostly no; vanilla GPT-3's output lurches around conceptually in ways that are often bizarre and incoherent.

The answer to (2b) is unclear, but I lean toward yes. A system that has access to a world model and is rewarded for behaving as if it is drawing upon that world model is probably drawing upon it at least to some degree.

ChatGPT still has some serious issues using its world model, though: it's extremely prone to making up things that don't exist, and it seems to really want to give affirmative-seeming answers to questions ("does Python library X provide functionality for Y?") even when it should be giving a negative answer. So it may be drawing upon the world model when it can, but BSing in certain cases because it wants to say "yes" even though its world model tells it "no".
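As a rough illustration of the kind of check I mean, here's a minimal sketch for verifying that kind of claim before trusting the model's "yes". The module and attribute names are just placeholders, not anything the model actually said:

```python
# Minimal sketch: sanity-check a claim that "library X provides function Y"
# by importing the module and checking that the attribute really exists.
import importlib

def claim_holds(module_name: str, attribute: str) -> bool:
    """Return True only if the module imports and actually exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # the library itself may not exist
    return hasattr(module, attribute)

print(claim_holds("json", "dumps"))        # True: real module, real function
print(claim_holds("json", "make_coffee"))  # False: plausible-sounding but made up
```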

John Shack:

As a counterpoint, you may find this article interesting to read: https://deoxyribose.github.io/No-Shortcuts-to-Knowledge/
