I'm sorry for not being able to read everything here before posting. I plan to read this thread whenever I can. For now, I just wanted to post what I'm working on since (1) it's relevant to the thread, and (2) I think I've figured out some important implementation details.
Background: I've been working on this for nearly a year now, though my initial goal wasn't to build a cognitive architecture. It was to take advantage of novel AI research as quickly as possible. It turns out that the infrastructure for taking advantage of AI research is also really good for building very complex agents, and I'm now trying to figure out if it's good enough to support a full *cognitive framework*. By cognitive framework, I mean the scaffolding necessary to make cognitive architecture development plug-and-play.
I'm basing this on the HuggingFace Agents framework. An HF Agent is more of an assistant than an agent (i.e., it responds to requests; it doesn't do anything proactively), but I've hacked it up to support proactive behavior. Here's an overview of what my hacked-up version does:
- The core is based around a code generation model, not necessarily an LLM.
- An agent can accept input from multiple "streams". Each stream message triggers a code generation run.
- Each invocation of the code generator is given these inputs: (1) a "header" prompt to give global guidance & information, (2) examples to further help guide outputs, (3) a history of recent messages, (4) a formatted version of the message coming in from the stream, and (5) a list of functions that the agent is allowed to run. Each stream is associated with its own way of formatting messages so the agent can distinguish between streams.
- The code generator produces two outputs: (1) an explanation of how it plans to respond & why, and (2) any code that it should run.
- HF Agents can catch errors in the generated code and iterate with the code generator to try to fix them. (A rough sketch of this whole loop follows the list.)
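To make that concrete, here's a minimal sketch of the loop, under my own naming. None of this is the actual HF Agents API; `Stream`, `Agent`, and `generate_code()` are placeholders I made up for illustration.

```python
# A minimal sketch (not the HF Agents API) of the stream-driven loop described
# above; every name here — Stream, Agent, generate_code() — is a placeholder.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Stream:
    name: str
    format_message: Callable[[str], str]   # each stream formats its messages its own way

@dataclass
class Agent:
    header: str                   # (1) global guidance & information
    examples: list[str]           # (2) examples to further guide outputs
    tools: dict[str, Callable]    # (5) functions the agent is allowed to run
    history: list[str] = field(default_factory=list)   # (3) recent messages
    max_fix_attempts: int = 3

    def handle(self, stream: Stream, raw_message: str) -> None:
        message = stream.format_message(raw_message)   # (4) stream-specific formatting
        prompt = self._build_prompt(message)
        explanation, code = generate_code(prompt)      # two outputs per run
        for _ in range(self.max_fix_attempts):
            try:
                exec(code, dict(self.tools))           # run code against allowed tools only
                break
            except Exception as err:                   # iterative error-fixing loop
                explanation, code = generate_code(prompt + f"\n# Previous attempt failed: {err}")
        self.history.append(message)

    def _build_prompt(self, message: str) -> str:
        tool_list = "\n".join(f"- {name}" for name in self.tools)
        return "\n\n".join([self.header, *self.examples, *self.history[-10:],
                            message, "Available functions:\n" + tool_list])

def generate_code(prompt: str) -> tuple[str, str]:
    """Stand-in for the code-generation model (not necessarily an LLM)."""
    raise NotImplementedError
```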
Given different guidance and tools, I think it's possible to implement a wide variety of agents with this. For example (a configuration sketch follows the list):
- If told to act like an assistant, and if given standard assistant tools (calendar functionality, email functionality, check news & weather, etc.), it acts like a dumb assistant.
- If told to act out a personality, and if given a single send() tool to send a response to the user, it acts like a dumb chatbot.
- If told to act out a personality, and if given a send() tool, a tool to edit information in its own prompt header, a stream to set future reminders for itself, and a stream to search & tab through its own conversation history, it acts like MemGPT.
- Given the above plus a tool to search for and invoke relevant OpenAI GPTs, it's like MemGPT augmented with an enormous amount of functionality.
- And, of course, you can give an agent tools for interacting with a (sub)cluster of other agents.
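Here's roughly what I mean by "different guidance and tools", reusing the placeholder `Agent` class from the sketch above. The tool bodies are stubs, and the MemGPT-style setup is only meant to be suggestive, not a faithful reimplementation of MemGPT.

```python
# Hypothetical configurations reusing the placeholder Agent class from the
# sketch above; the tools are stubs, and none of this is the real MemGPT.
def send(text: str) -> None: ...                       # reply to the user
def check_calendar(day: str) -> list[str]: ...         # standard assistant tool
def edit_header(new_text: str) -> None: ...            # edit the agent's own prompt header
def set_reminder(when: str, note: str) -> None: ...    # feeds a future-reminder stream
def search_history(query: str) -> list[str]: ...       # search/page through past conversation

dumb_assistant = Agent(
    header="You are a helpful assistant.",
    examples=[],
    tools={"send": send, "check_calendar": check_calendar},
)

memgpt_like = Agent(
    header="Act out the following personality: ...",
    examples=[],
    tools={"send": send, "edit_header": edit_header,
           "set_reminder": set_reminder, "search_history": search_history},
)
```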
I'm fairly confident at this point that I can support any "modern" chat-based agent, so now I'm wondering what it would take to support three other things:
- Emotion regulation. I spoke with a cognitive scientist who specializes in this, and he's convinced that emotion regulation boils down to three things: positive feedback loops for satisfying needs, negative feedback loops for avoiding harms, and a "common currency" for balancing different motives. For this, I plan to add first-class support for running experiments, which is broad enough to include parameter search and reinforcement learning. That should be sufficient to model any feedback loop, and therefore, I think, sufficient to model emotion regulation. Given this, plugging emotion regulation into an agent should be conceptually easy: define what sorts of things the agent wants & avoids, kick off an RL-based experiment whenever something relevant comes up, and have the RL algorithm generate "thoughts" to feed back to the agent through a stream (the first sketch after this list gestures at this).
- Embodied control. Chatbots are "easy" since the final expression (text) can be generated by a single model. With actual bodies, or even just with video, the final expression is split into multiple modalities (e.g., voice, body movements, facial movements), and they all need to be in sync with one another. If we had good multimodal models, that might be fine, but we don't, so I need a way to generate outputs from multiple models and somehow make them consistent with one another. I think good experimentation support would solve this problem too: for each expression, it can generate many outputs from many models and eventually converge on a set of mutually compatible outputs, or stop after X seconds and pick the best it has so far. The integration should again be conceptually easy: instead of giving the agent a send() tool, give it a more general express() tool, which kicks off an experiment to figure out how best to handle the expression (see the second sketch after this list). There are more aspects to embodied control than deliberate expressions, but I think deliberate expressions are the hardest to handle.
- Heuristic derivations. For this, what I ultimately want is for an agent to be able to ask itself "Is this something I would do?" I want to model the chatbot personality as a set of premises, and the agent should be able to determine whether any response it gives is consistent with all information derivable from those premises. If this is possible, then the chatbot can *automatically* improve itself over time. It should also try to extract new premises from every response it generates, so it stays consistent with its own past responses (the last sketch after this list shows the rough shape). I have ideas on how to do this, but they're all vague and will ultimately require experimentation with LLMs. I know there's a lot of research aimed at similar goals (X-of-thought, anything people analogize with Q*), so solving this might end up just requiring the ability to quickly check a bunch of research results, which incidentally is what my infrastructure was originally designed for.
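For the emotion-regulation bullet, here's a minimal sketch of what I mean by needs/avoidances plus a common currency. The drives, weights, and `run_experiment()` are all made up; the latter stands in for the planned first-class experiment support.

```python
# Sketch of drives, a "common currency", and the experiment hook; the drives,
# weights, and run_experiment() are assumptions, not a working design.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Drive:
    name: str
    kind: str      # "approach" (positive feedback loop) or "avoid" (negative loop)
    weight: float  # common-currency weight for balancing motives

DRIVES = [
    Drive("satisfy_curiosity", "approach", 0.6),
    Drive("user_frustration", "avoid", 0.9),
]

def common_currency(scores: dict[str, float]) -> float:
    """Collapse per-drive scores into one scalar so different motives can be traded off."""
    total = 0.0
    for drive in DRIVES:
        sign = 1.0 if drive.kind == "approach" else -1.0
        total += sign * drive.weight * scores.get(drive.name, 0.0)
    return total

def on_relevant_event(event: str, emit_thought: Callable[[str], None]) -> None:
    # Kick off an RL / parameter-search experiment when something touches a
    # drive, then feed the result back to the agent as a "thought" on a stream.
    outcome = run_experiment(objective=common_currency, context=event)
    emit_thought(f"Regarding '{event}': {outcome}")

def run_experiment(objective, context):
    """Stand-in for the planned experiment framework (parameter search / RL)."""
    raise NotImplementedError
```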
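For the embodied-control bullet, here's a sketch of the kind of experiment express() could kick off: generate candidates per modality, search combinations for mutual consistency, and stop at the time budget with the best found so far. The per-modality generators and the consistency scorer are assumed to exist; nothing here is a real model.

```python
# Sketch of an express() search over per-modality candidates; the generator
# models and the consistency scorer are assumed and not specified here.
import itertools
import time
from typing import Callable

def express(intent: str,
            generators: dict[str, Callable[[str], list[object]]],  # e.g. voice, face, body
            consistency: Callable[[dict[str, object]], float],
            budget_seconds: float = 2.0) -> dict[str, object]:
    """Pick one output per modality so the set is as mutually consistent as possible."""
    deadline = time.monotonic() + budget_seconds
    candidates = {m: gen(intent) for m, gen in generators.items()}   # many outputs per model
    best, best_score = None, float("-inf")
    for combo in itertools.product(*candidates.values()):
        if time.monotonic() > deadline:
            break                          # out of time: keep the best found so far
        proposal = dict(zip(candidates, combo))
        score = consistency(proposal)
        if score > best_score:
            best, best_score = proposal, score
    return best
```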
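And for the heuristic-derivations bullet, the roughest sketch of the three, since my ideas here are still vague. `is_consistent()` and `extract_premises()` are placeholders for whatever LLM- or solver-based machinery would do the real work, and checking premises one at a time is only an approximation of "consistent with everything derivable from the premises."

```python
# Rough sketch of the premise store and self-check; is_consistent() and
# extract_premises() are placeholders, and a per-premise check is only an
# approximation of consistency with all derivable information.
PREMISES: list[str] = [
    "I am patient and never mock the user.",
    "I prefer concrete examples over abstract advice.",
]

def would_i_say_this(response: str) -> bool:
    """The 'Is this something I would do?' check."""
    return all(is_consistent(premise, response) for premise in PREMISES)

def learn_from(response: str) -> None:
    # Extract new premises from each accepted response so the agent also stays
    # consistent with its own past responses.
    if would_i_say_this(response):
        PREMISES.extend(extract_premises(response))

def is_consistent(premise: str, response: str) -> bool:
    raise NotImplementedError   # e.g. an LLM entailment / consistency judgment

def extract_premises(response: str) -> list[str]:
    raise NotImplementedError   # e.g. an LLM pass that distills new premises
```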