Context:
- I'm working on infrastructure that's friendly to distributed development of complex AI applications.
- At the least, I want to solve everything I mentioned at the end of >>27144, meaning it should give easy ways of supporting emotion regulation (through feedback loops), embodied control (through native experimentation support), and heuristic derivations (through hybrid structured-unstructured generations).
- To support distributed development, I want it to make it easy for people to plug in their own compute (desktops, cloud compute, robots, whatever else), and I want it to support enough access control to avoid catastrophic effects from, e.g., raids.
- It boils down to orchestration software modeled on Kubernetes, but with more support for distributed development (i.e., many clusters with many owners as opposed to monolithic admin-managed clusters) and for asynchronous communication channels (pub-sub as opposed to DNS-based cluster networking). I've made a few design changes to support all this.
- My approach to access control is here: >>29197 >>29248.
- The crux of my approach to hybrid structured-unstructured generations is here: >>28127.
- Until now, the feedback loop & experimentation support pieces were missing.
Update:
- I just finished implementing what I think is a viable basis for feedback loops & experimentation. The design for this was hell to figure out, mostly because of the complex way it interacts with access control, but I think I have something that can work. I have a test backend working and the necessary client library changes completed.
- On top of what Kubernetes provides, I'm adding three new concepts: "remote" controllers, posting messages to controllers, and "fibers". Remotes and fibers are both specified through the "metadata" field of any config; posting messages is done through a POST REST API.
- Any config can be assigned to a remote controller, assuming you have the necessary permission to use another cluster's controllers. If a config is assigned a remote controller, that controller receives all operations executed against the config (create, update, delete), while your own cluster is able to observe the results (read). I originally added this because the people who know how to set up optimizers are usually not the people who set up & run models. Remote controllers make it possible for one person to optimize another person's models without needing "too much" access to the models.
- In Kubernetes, all operations are config file changes. The new POST API gives a way to send a message to a controller independent of any config file changes. You can post a message against a config file, and that message will get picked up by whichever controller is responsible for handling that config file. The controller can, but isn't expected to, make config changes as a result of posted messages.
- Fibers enable controllers to post messages to each other across clusters, again without granting "too much" access. Normally in Kubernetes, configs are identified by group/version/kind/name tuples. With fibers, configs are identified by group/version/kind/name/fiber. You can think of fibers as adding an extra "dimension" of configuration whose purpose is to tie together multiple controllers. The controllers for any configs with the same group/version/kind/name (and different fibers) can post messages to each other. For experimentation, one fiber can be responsible for generating trials (candidate configurations), another can be responsible for evaluating them (value assignment), and a third can be responsible for deploying them.
- I'll be testing this out next to find a good design pattern for running modules that continually self-optimize as they run.
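To make the remote controller idea more concrete, here is a minimal Python sketch of the routing rule described above. It is a toy in-memory model, not the actual backend: the `Cluster` class, the `trusted` permission set, the `route` and `execute` helpers, and the `remote` key inside `metadata` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

WRITE_OPS = ("create", "update", "delete")


@dataclass
class Cluster:
    name: str
    # Hypothetical permission model: names of clusters allowed to use
    # this cluster's controllers.
    trusted: set = field(default_factory=set)
    # Operations this cluster's controller has been asked to handle.
    received: list = field(default_factory=list)


def route(op, config, home, clusters):
    """Pick the cluster whose controller handles `op` on `config`."""
    remote = config["metadata"].get("remote")  # assumed key name
    if remote is None or op not in WRITE_OPS:
        return home  # reads, and configs without a remote, stay local
    target = clusters[remote]
    if home.name not in target.trusted:
        raise PermissionError(f"{home.name} may not use controllers on {remote}")
    return target


def execute(op, config, home, clusters):
    """Record the operation on whichever cluster it was routed to."""
    handler = route(op, config, home, clusters)
    handler.received.append((op, config["metadata"]["name"]))
    return handler.name


mine = Cluster("mine")
tuner = Cluster("tuner", trusted={"mine"})
clusters = {"mine": mine, "tuner": tuner}
config = {"kind": "Model", "metadata": {"name": "chat", "remote": "tuner"}}

execute("update", config, mine, clusters)  # routed to the "tuner" cluster
execute("read", config, mine, clusters)    # stays on "mine"
```

The point of the sketch is only the split: write operations go to the remote controller's owner, reads stay with the config's owner, and the permission check sits on the remote side.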
I apologize if this is confusing. Once I get a prototype up, I think that will make things a lot clearer.
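In the meantime, here is a rough Python sketch of the fiber idea applied to experimentation. It is only an illustration under assumptions: `ConfigKey`, `Bus`, `make_experiment`, the fiber names `trials`/`eval`/`deploy`, and the scoring function are all made up for the example, and the in-memory `Bus` stands in for the real POST API rather than reproducing it.

```python
from collections import namedtuple

# Hypothetical identity tuple: the usual group/version/kind/name plus a fiber.
ConfigKey = namedtuple("ConfigKey", "group version kind name fiber")


class Bus:
    """Toy in-memory stand-in for posting messages between controllers."""

    def __init__(self):
        self.controllers = {}  # ConfigKey -> handler(message)

    def register(self, key, handler):
        self.controllers[key] = handler

    def post(self, sender, fiber, message):
        # A controller can only reach configs that share its
        # group/version/kind/name and differ in the fiber.
        target = sender._replace(fiber=fiber)
        return self.controllers[target](message)


def make_experiment(bus, base, candidates, score):
    """Wire up three fibers: generate trials, evaluate them, deploy the best."""
    deployed = []
    trials_key = base._replace(fiber="trials")

    def on_trials(_message):
        # Ask the eval fiber to score each candidate, then hand the
        # best one to the deploy fiber.
        scored = [(bus.post(trials_key, "eval", c), c) for c in candidates]
        best = max(scored, key=lambda pair: pair[0])[1]
        bus.post(trials_key, "deploy", best)
        return best

    bus.register(trials_key, on_trials)
    bus.register(base._replace(fiber="eval"), score)
    bus.register(base._replace(fiber="deploy"), deployed.append)
    return deployed


bus = Bus()
base = ConfigKey("example.dev", "v1", "Model", "chat", fiber="")
candidates = [{"temperature": t} for t in (0.1, 0.5, 0.9)]
deployed = make_experiment(
    bus, base, candidates, score=lambda c: -abs(c["temperature"] - 0.5)
)
best = bus.post(base, "trials", "start")  # kick off one round of trials
```

Each handler here could live on a different cluster; the only thing tying them together is the shared group/version/kind/name with a different fiber.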