OK guys, this is a pretty hefty post. I've been doing some more holowaifu theorycrafting and research, and also decided to try to flesh out the waifu program's architecture a bit more. There are three basic parts: the control system, the waifu's internal behavioral logic, and the rendering. I have some more details about how each of these would work, though there are still some aspects I'm not sure about; I'm sure the other board members can help with those.
I know this wall of text looks formidable, but please read all of it because the payoff for doing so is potentially colossal. I decided it was too big and covered too many things to fit in a single post, so it comes in two parts.
The Setup
>Control system
For this I had a couple of ideas. The first is extending the tracking marker concept. Up until now this holowaifu project has relied on the principle of summoning the waifu when a suitable tracking marker, such as a QR code, is detected. But there's no reason to limit this to the waifu alone. You could introduce multiple markers that represent different AR objects. For example, if you wear a glove with a marker assigned to a virtual keyboard, you can just look at your hand to summon that keyboard in the AR space (or some other suitable control scheme; I'm a fan of a Mass Effect-style dialogue wheel) and issue commands through it.
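To make that concrete, here's a rough Python sketch of the multi-marker idea: a registry that maps marker IDs to AR objects, so the keyboard marker on your glove spawns a virtual keyboard the same way the waifu's own marker spawns her. Every name in it (detect_marker_ids, ARObject, the marker IDs) is a placeholder I made up for illustration, not a real library or a settled design.

[code]
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

@dataclass
class ARObject:
    name: str
    spawn: Callable[[], None]  # called the first time this object's marker is seen

# Hypothetical marker IDs mapped to the AR objects they summon.
MARKER_REGISTRY: Dict[int, ARObject] = {
    1: ARObject("waifu", lambda: print("instantiating waifu")),
    2: ARObject("virtual_keyboard", lambda: print("showing keyboard over the glove")),
    3: ARObject("dialogue_wheel", lambda: print("opening dialogue wheel")),
}

def detect_marker_ids(frame) -> List[int]:
    """Placeholder for the visor's marker detector (QR, ArUco, whatever it ends up being)."""
    raise NotImplementedError

def handle_frame(frame, active: Set[str]) -> None:
    """Spawn any registered AR object whose marker just came into view."""
    for marker_id in detect_marker_ids(frame):
        obj = MARKER_REGISTRY.get(marker_id)
        if obj is not None and obj.name not in active:
            obj.spawn()
            active.add(obj.name)
[/code]

Adding a new AR object would then just mean registering another marker ID, which is exactly the kind of extensibility I'm after.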
The second is using hand gestures. I picked these two ideas over alternatives like physical buttons or head-movement detection because they operate on the same fundamental principle as the waifu herself: AI image recognition. That simplifies the project, since we don't have to develop a separate method to control the visor; we can do it through the functionality already developed for the holowaifu. I might still want to incorporate a voice command system at some point, simply because it doesn't step on the toes of the other control systems and it adds to the sense of the waifu being your OS. This also parallels how smartphones went: the vast majority have dropped physical keyboards in favor of touch screens, with only BlackBerry and a few other small-market-share outliers holding on, and I think the AR visor market would work the same way.
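Here's the sense in which markers, gestures, and eventually voice are all the same thing from the program's point of view: each recognizer just translates its raw detection into the same small set of command events, and the rest of the visor only ever sees those. Again a hedged sketch with made-up command names, not a fixed design.

[code]
from enum import Enum, auto

class Command(Enum):
    SUMMON_WAIFU = auto()
    OPEN_KEYBOARD = auto()
    DISMISS = auto()

# Each recognizer keeps its own lookup, but they all emit the same Commands.
GESTURE_COMMANDS = {"open_palm": Command.SUMMON_WAIFU, "fist": Command.DISMISS}
MARKER_COMMANDS = {2: Command.OPEN_KEYBOARD}
VOICE_COMMANDS = {"come here": Command.SUMMON_WAIFU}

def dispatch(command: Command) -> None:
    """Single entry point the rest of the program (behavior, rendering) reacts to."""
    print(f"executing {command.name}")

def on_gesture(label: str) -> None:
    # The gesture recognizer only has to translate its label into a Command.
    if label in GESTURE_COMMANDS:
        dispatch(GESTURE_COMMANDS[label])
[/code]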
>Behavioral logic
I think this portion of the program should be written as a state machine, much like the control system likely will be. State machines represent about the only kind of nontrivial programming I might be able to do myself, because they just intuitively make sense to me. The idea is that the waifu switches between states according to hand gestures and other forms of interaction that govern her behavior, including interactions with other AR objects you summon through their separate markers.
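Something like this is what I have in mind, as a minimal sketch of the behavior layer as a state machine; the actual states, events, and transition table here are placeholder examples, not a final behavior set.

[code]
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    FOLLOWING = auto()
    SITTING = auto()
    DANCING = auto()

class Event(Enum):
    WAVE = auto()            # hand gesture
    POINT_AT_CHAIR = auto()  # action marker projected onto a chair
    DANCE_COMMAND = auto()
    STOP = auto()

# (current state, event) -> next state; anything not listed is ignored.
TRANSITIONS = {
    (State.IDLE, Event.WAVE): State.FOLLOWING,
    (State.FOLLOWING, Event.POINT_AT_CHAIR): State.SITTING,
    (State.FOLLOWING, Event.DANCE_COMMAND): State.DANCING,
    (State.SITTING, Event.WAVE): State.FOLLOWING,
    (State.DANCING, Event.STOP): State.IDLE,
}

class WaifuBehavior:
    def __init__(self) -> None:
        self.state = State.IDLE

    def handle(self, event: Event) -> State:
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
[/code]

The nice thing about keeping it as a plain transition table is that adding behavior is mostly just adding rows.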
>Rendering
This is the part that I have the hardest time understanding, which is kind of a problem, because under this system the AR visor is controlled through AR means rather than physical buttons: you obviously need to render the waifu (or an AR-based control scheme) before you can interact with either.
But there's another element to this that might make it possible for the waifu to exist persistently. The current concept has her forced to stay near a marker, vanishing whenever you look away from it. But if the AR visor is capable of scanning the environment and creating a digitized version of it, we could make her stick around and move realistically through the environment without a robot to denote her position. At that point the holowaifu would exist within this mixed VR/AR space and wouldn't vanish when you lose sight of a marker; she only needs the marker to instantiate herself. Obviously, the more realistic the digital clone of your local environment looks, the more processing power is required to render it, so if you want to go this route it's probably best to produce the cloned environment with cel-shaded graphics, since those are more forgiving in terms of system requirements, though more realistic graphics are getting pretty accessible these days.
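Assuming the visor really can scan and track the environment, persistence could be as simple as anchoring the waifu in world coordinates the first time her marker is seen, then rendering from that anchor whether or not the marker stays in view. A rough sketch, with both pose sources stubbed out because I don't know what the actual tracking stack will look like (the 4x4 matrices are just the standard way of composing poses):

[code]
import numpy as np

world_anchor = None  # 4x4 pose of the waifu in world coordinates, set once

def marker_pose_in_camera(frame):
    """Stub: the waifu marker's pose relative to the camera, or None if not in view."""
    return None

def camera_pose_in_world():
    """Stub: the camera's pose from the visor's environment tracking."""
    return np.eye(4)

def render_waifu(pose):
    pass  # hand off to the rendering layer

def update(frame):
    global world_anchor
    cam_to_world = camera_pose_in_world()
    marker_to_cam = marker_pose_in_camera(frame)
    if world_anchor is None and marker_to_cam is not None:
        # First sighting: pin the waifu to a fixed spot in the scanned world.
        world_anchor = cam_to_world @ marker_to_cam
    if world_anchor is not None:
        # Render relative to wherever the camera is now, marker visible or not.
        waifu_in_camera = np.linalg.inv(cam_to_world) @ world_anchor
        render_waifu(waifu_in_camera)
[/code]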
Another possible method is to incorporate the aforementioned smartphone projector into the visor and then use a system similar to Star Wars: Republic Commando's to issue movement commands to the waifu; this only works if the waifu has persistent existence. In RepCom, the player character Delta-38 can issue movement commands to the members of Delta Squad by using the D-pad to project a holographic waypoint, much like the markers we're discussing here. The waypoints are also context-sensitive, so if a Delta Squad member is ordered to a position with special properties (e.g. a sniping position or a bacta tank healing station), the squad member takes the appropriate action for that context. The holowaifu should behave the same way: if you use the visor's built-in projector to project an action marker onto a chair, she'll go to the chair and sit in it, while if you tell her to go to a certain spot and dance, she'll do that. If the visor is capable of recognizing tracking markers for hand gestures and the waifu's position, it should also be capable of recognizing interactables in the environment, particularly if action markers are projected as an assist for the image recognition.
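The context-sensitive part could be as simple as a lookup from whatever the recognizer says the projected marker landed on to the action the waifu should take, with "just walk there" as the fallback. The labels and actions below are illustrative guesses, not a real interactables list.

[code]
# Projected-marker context -> what the waifu does there.
CONTEXT_ACTIONS = {
    "chair": "sit",
    "sofa": "sit",
    "open_floor": "dance",
    "doorway": "wait_here",
}

def classify_surface_at(waypoint) -> str:
    """Stub: whatever the image recognizer says the projected marker landed on."""
    return "chair"

def on_action_marker(waypoint) -> None:
    surface = classify_surface_at(waypoint)
    action = CONTEXT_ACTIONS.get(surface, "walk_to")  # default: just go there
    print(f"waifu action at waypoint: {action}")
[/code]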
To be continued