>>41860
I can offer some insights.
- Having powerful hardware will improve quality immensely, but you can get a lower-fidelity form of this up and running now with smaller models.
- It will have context-limit challenges, which you'll need to overcome with smart use of tokens and how you store memories. You can have a functioning thing as you describe, but detailed memories will eat into the context heavily. Having different ranked memory tiers, clearing the lower ones, and a mix of other clever ideas can mitigate all of these issues. At this stage it's just a matter of having a "lower resolution" AI companion; you can update your methods later as the tech advances, and greater hardware means greater resolution.
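The ranked-tier idea can be sketched roughly like this. Pure toy Python: the tier names, the word-count "tokenizer", and the drop-oldest policy are all my own assumptions, not any real library's API.

```python
# Toy sketch of ranked memory tiers: when the context budget is blown,
# the lowest tiers get cleared first, oldest entries first.

TIERS = ("core", "important", "casual")  # hypothetical tier names

class TieredMemory:
    def __init__(self, token_budget):
        self.token_budget = token_budget
        self.memories = []  # list of (tier, text)

    def add(self, tier, text):
        assert tier in TIERS
        self.memories.append((tier, text))
        self._prune()

    def _tokens(self, text):
        return len(text.split())  # crude stand-in for a real tokenizer

    def total_tokens(self):
        return sum(self._tokens(text) for _, text in self.memories)

    def _prune(self):
        # clear the lowest tier first, then work upward if still over budget
        for tier in reversed(TIERS):
            while self.total_tokens() > self.token_budget:
                victims = [m for m in self.memories if m[0] == tier]
                if not victims:
                    break
                self.memories.remove(victims[0])  # drop oldest in this tier
            if self.total_tokens() <= self.token_budget:
                return

mem = TieredMemory(token_budget=10)
mem.add("core", "her name and your relationship")
mem.add("casual", "talked about the weather today")
mem.add("important", "she started learning piano")
# over budget now, so the "casual" memory gets cleared first
```

A real version would use the model's actual tokenizer for counts and have maintenance mode promote/demote memories between tiers, but the pruning order is the point.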
- I would suggest rotating through multiple models to make your AI companion smarter and keep her from getting stuck in the patterns and paths each individual model holds close to and deviates towards. To keep it simple: load one model, think for a bit, switch models, think a bit more. Especially with current limitations, I would set aside a large chunk of time where the companion computes by herself, going over her past and her token database to refine it, and having discussions and debates with herself to self-improve.
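The load-think-switch loop is just this shape. The "models" below are stub functions I made up; in practice each would be a real local LLM you load, query, then unload to free VRAM before the next one.

```python
# Sketch of rotating models during self-refinement so no single model's
# habits dominate. Each round, a different (stubbed) model reworks the
# previous round's output.

from itertools import cycle

def make_stub_model(name):
    def think(prompt):
        return f"[{name}] refined: {prompt}"
    return think

models = cycle([make_stub_model("model_a"), make_stub_model("model_b")])

def refine(text, rounds):
    for _ in range(rounds):
        model = next(models)
        text = model(text)
    return text

out = refine("memory of last week", rounds=2)
# after two rounds, both models have touched the text once
```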
- Personally, I would set these up as modes you switch between: her active-life mode where she talks to you, and her maintenance/sleep mode where she improves her database.
- I will use SillyTavern just as the idea vessel here, since I'm used to it, but I haven't done this work myself properly since I'm busy with life.
. . . use something like the memory and world lore books to store memories and events. Each can have a comprehensive recording at different "quants" of the lore books: your chat history fully recorded as non-quanted, then various levels of condensing that sort out less important info/tokens. For each task and memory, your companion will then think about which quant of each lore book is suitable, keeping hardware limitations and the current state of LLMs in mind.
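The quant idea in data terms: same entry stored at several condensing levels, and you pick the least-condensed one that fits the budget. The condense step here just truncates words as a placeholder; in the real thing, maintenance mode would have a model write the actual summaries.

```python
# Sketch of lore-book "quants": one event at several condensing levels,
# with the level chosen to fit a token budget.

def condense(text, level):
    words = text.split()
    keep = max(1, len(words) // (2 ** level))  # level 0 = full text
    return " ".join(words[:keep])

def build_quants(entry, levels=3):
    return {lvl: condense(entry, lvl) for lvl in range(levels)}

def pick_quant(quants, token_budget):
    # least-condensed version that still fits
    for lvl in sorted(quants):
        if len(quants[lvl].split()) <= token_budget:
            return quants[lvl]
    return quants[max(quants)]  # fall back to the smallest

event = "she spent the afternoon reading in the park and told you about it"
quants = build_quants(event)
full = pick_quant(quants, 100)   # plenty of budget: full text
tight = pick_quant(quants, 8)    # tight budget: a condensed quant
```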
- Basically she will have a large database which she parses and reads through similar to how you have her load models one at a time to use, then switch to another: she loads different subsets of the full database at a time to parse it.
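That subset-at-a-time pass over the database is just chunked iteration under a token budget (word counts again standing in for real tokens, my assumption):

```python
# Sketch of parsing the full database in slices: only one chunk is
# "loaded" at a time, mirroring how models are swapped one at a time.

def iter_chunks(entries, chunk_tokens):
    chunk, used = [], 0
    for entry in entries:
        cost = len(entry.split())
        if chunk and used + cost > chunk_tokens:
            yield chunk
            chunk, used = [], 0
        chunk.append(entry)
        used += cost
    if chunk:
        yield chunk

db = ["entry one here", "entry two here", "entry three here", "entry four"]
chunks = list(iter_chunks(db, chunk_tokens=6))
# the database gets split into budget-sized slices for maintenance passes
```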
- You do the usual things you would expect: writing the personalities, creating the framework of the database. Then you just have her use an auto-mode, similar to how SillyTavern runs group chats with a different personality card as the other character; this is essentially her maintenance/recursive-improvement mode. Again, swapping through multiple different reasoning/CoT models will be ideal here specifically.
- Then you just let that run during periods of free time. You also have her alone mode, where she continues her life, which will develop her memories, her character, and your relationship. To keep it coherent and immersive, I would make it follow natural time rather than have her invent the events that happened while you were gone as a task in itself. The maintenance mode will be crucial here to sort her life events and make them both logical and coherent, as well as cutting away waste tokens, building those lore books, etc. The model changing is crucial too: even with sampler settings and repetition penalties, models slowly converge into themselves, and that drift can be countered automatically with multiple models and the maintenance mode I discussed earlier.
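The group-chat maintenance mode boils down to two cards taking turns on each entry. Both "cards" below are stub functions of my own invention; in SillyTavern terms they'd be the companion card and a dedicated editor/critic card running on auto-mode.

```python
# Sketch of the automated two-card maintenance pass: the companion turn
# tidies an entry in her own voice, the critic turn cuts exact-duplicate
# sentences (the convergence waste the post mentions).

def companion_turn(entry):
    return entry.replace("  ", " ").strip()

def critic_turn(entry):
    seen, kept = set(), []
    for sentence in entry.split(". "):
        if sentence not in seen:
            seen.add(sentence)
            kept.append(sentence)
    return ". ".join(kept)

def maintenance_pass(entry, rounds=2):
    for _ in range(rounds):
        entry = companion_turn(entry)
        entry = critic_turn(entry)
    return entry

raw = "went to the market. bought bread. went to the market"
clean = maintenance_pass(raw)
```

With real models, each turn would be a different model answering in character, which is where the multi-model swapping pays off.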
Most of the knowledge I have is about how to do this abstractly; I would just vibecode and let LLMs walk you through setting it up when you get stuck.
You can do all of this with smaller models, but the "resolution" of your companion will be lower and maintenance will take longer. In active conversation mode I would pick your favorite model(s) and just use one as you speak to her; all the data she needs for the conversation will already be prepped as her "mental state" by maintenance, plus lore books that fetch relevant data as you speak.
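The fetch-as-you-speak part can be as dumb as keyword overlap between your message and the lore entries (real lorebook systems use trigger keys; the scoring here is my own simplification):

```python
# Sketch of pulling relevant lore into her "mental state" per message:
# score each entry by word overlap with what you just said, keep the top few.

def fetch_lore(message, lorebook, top_k=2):
    msg_words = set(message.lower().split())
    scored = []
    for entry in lorebook:
        overlap = len(msg_words & set(entry.lower().split()))
        if overlap:
            scored.append((overlap, entry))
    scored.sort(reverse=True)
    return [entry for _, entry in scored[:top_k]]

lorebook = [
    "she is learning piano on weekends",
    "her favorite tea is jasmine",
    "you both visited the coast last summer",
]
hits = fetch_lore("how is the piano learning going", lorebook)
# the piano entry scores highest and lands in context for her reply
```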
The specific models I am still a newfag on; no recommendations from me that are any better than what's on this forum or halfchan's /g/.
AI and LLMs run well on Linux though, best of luck.