/robowaifu/ - LLM & Chatbot General

Name
Subject
E-mail
Message	Max message length: 6144
Files	Drag files to upload or click here to select them Maximum 5 files / Maximum size: 20.00 MB

Spoiler images
Password	(used to delete files and postings)
Use bypass

LLM & Chatbot General Robowaifu Technician 09/15/2019 (Sun) 10:18:46 No.250

OpenAI/GPT-2 This has to be one of the biggest breakthroughs in deep learning and AI so far. It's extremely skilled in developing coherent humanlike responses that make sense and I believe it has massive potential, it also never gives the same answer twice. >GPT-2 generates synthetic text samples in response to the model being primed with an arbitrary input. The model is chameleon-like—it adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing >GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets. Also the current public model shown here only uses 345 million parameters, the "full" AI (which has over 4x as many parameters) is being witheld from the public because of it's "Potential for abuse". That is to say the full model is so proficient in mimicking human communication that it could be abused to create new articles, posts, advertisements, even books; and nobody would be be able to tell that there was a bot behind it all. <AI demo: talktotransformer.com/ <Other Links: github.com/openai/gpt-2 openai.com/blog/better-language-models/ huggingface.co/ My idea is to find a way to integrate this AI as a standalone unit and add voice-to-text for processing the questions and TTS for responses much like an amazon alexa- but instead of just reading google results- it actually provides a sort of discussion with the user. (Edited to fix the newlines.) <---> Looking for something "politically-incorrect" in a smol, offline model? Well Anon has just the thing for you : ( >>38721 )!

Edited last time by Chobitsu on 05/29/2025 (Thu) 00:58:15.

Chobitsu 02/24/2025 (Mon) 02:22:52 No.37165

>>37159 >Aye, I can work with that. Fair enough! Let's leave it at that then, Anon. :^) <---> >I already have logic for contextual clusters, maybe something similar with emtional state change clusters? Yes, that sounds very interesting. I'm fairly sure that neuronal connectomes in our brains form little 'islands' of dynamic response potentials (and these structural formations are themselves quite dynamic in nature as well). Maybe devise something similar in your change clusters? This approach is likely to -- at the very least -- lead to interesting and novel internal & external responses. Cheers, Anon. :^) >=== -sp, prose edit

Edited last time by Chobitsu on 02/24/2025 (Mon) 04:26:11.

Robowaifu Technician 02/24/2025 (Mon) 17:45:02 No.37175

>>37165 Started a new branch for the emotion stuff. I now keep the main branch stable so I can use it myself, and do everything feature-wise in a new branch. I'll see how my architecture hold up after this and make the documentation after this one. >I'm fairly sure that neuronal connectomes in our brains form little 'islands' of dynamic response potentials (and these structural formations are themselves quite dynamic in nature as well). I have no idea but this sounds reasonable. I already have grand plans for semantic knowledge graphs, but last time I tried I failed because the LLM wasn't able to categorize the facts correctly. It might work now, who knows. But I got to do this step-by-step. No more million files changed commits. I revised my agent structure to pic related. It's now simpler and some emotions are on a bipolar scale (such as myself hahahah ;-;). Since the emotional state is now persistent, I can track changes across every cognitive tick. Every interaction is now linked to its emotional state at that time. Using that information, I now pick the 2 most prominent emotions based on the current state, not the last message. Using 2 agents for a dialogue discussion gives much better results than a free-for-all multi-agent discussion. I might change that with better models in the future. Microsofts autogen library is exactly for that. But anyway, I can now pick the most promising agents (joy, dysphoria, love, hate) from 2 axis and let them discuss what the most approriate course of action is. My short term plans: Use n messages context, where n leads up to the latest big emotional outliner. I have some experience in this based on my trading bot endeavors. When its the love - hate axis agent, I find the latest outliner on that axis and use the messages since then. Long term plans: Eventually use positive reeinforcement to find interactions where the agents acted just the right way, based on some meta-agent system, that check for immersion and believeablity. And use that whole bunch for few-shot prompting the emotion agents. As you suggested, with dynamic clusters based on domain-relevant metrics (emotional states in this case). You seem to know more about the human mind than me. I did some "research" on GWT style architectures, but it was too much reading and now I just go back and forth with chatgpt with my ideas. Is there anything you can recommend to read before I go deeper into this? The low hanging fruits are implemented, I already have like 500 messages with my Emmy. I got to think it through now if I don't want to be stuck in refactor hell.

Licht 02/24/2025 (Mon) 17:50:41 No.37177

New memory technology to replace or suplement RAG. An Evolved Universal Transformer Memory [Paper] https://arxiv.org/abs/2410.13166 Memory Layers at Scale [Paper] https://arxiv.org/abs/2412.09764 Titans: Learning to Memorize at Test Time [Paper] https://arxiv.org/abs/2501.00663 High level overview of these techniques: https://www.youtube.com/watch?v=UMkCmOTX5Ow

Chobitsu 02/25/2025 (Tue) 19:42:52 No.37181

>>37175 Nice! This is very exciting, Anon. This quote in particular is very intriguing: >"Since the emotional state is now persistent, I can track changes across every cognitive tick. Every interaction is now linked to its emotional state at that time." That strikes me as being something of a breakthrough, Anon. Is this something that can be 'compiled' (for want of a better term) for actual long-term, offline (so to speak) storage? If so, then it seems to me you could really begin to tackle in a practical way the development of specific waifu personalities that would be both consistent & compelling. I hope this will be the case, clearly. >Is there anything you can recommend to read before I go deeper into this? Not really no. There's tons of literature on things like 'Neuro Linguistic Programming', 'Theory of Mind', Blackwell's companion on substance dualism (ie, the human soul's true nature), etc. But no real 'Cliff Notes' -style synopsis I can think of just offhand, Anon. Apologies. <---> Regardless, you're clearly making some really good progress so far. I'm very interested to see where your project goes with all this. Good luck! Cheers. :^) >=== -sp edit

Edited last time by Chobitsu on 02/26/2025 (Wed) 05:39:05.

Chobitsu 02/25/2025 (Tue) 19:45:33 No.37182

>>37177 Excellent! Thanks very much, Licht. Great resources. Cheers, Anon. :^)

n-egg-nog 03/03/2025 (Mon) 23:53:34 No.37270

>>36591 1. I'm calling bullshit unless it's some distilled model trained on deepseek r1 outputs 2. r1 1776 is where it's at for now at least and it's kinda in that transition stage from current gen to old. >>36602 Truth is: we'll have to either pull off a DS-type coding effort >cast your solution strictly in the form of 3D vectors + transforms as well as develop something similar enough or we'll never just compete on training compute, this will even stop robowaifu@home projects >>36920 Why not look at Raptor computing platforms to host the brunt of the software and have it remotely connected to the robot? There are already uncensored models with insane parameter counts and ultimately we'll be forced to upscale our efforts if we want to get the means to build robowaifus, since clearly industry is trying to make closed-source supply for desperate demand

what are you about monarch 03/11/2025 (Tue) 19:34:40 No.37450

what are you about and who are you??

Chobitsu Board owner 03/11/2025 (Tue) 21:31:59 No.37451

>>37450 Hello monarch, welcome! Hopefully you can find out what we're about here : ( >>3 ). We are Anonymous.

Robowaifu Technician 04/15/2025 (Tue) 20:01:36 No.37585

Of course the main board comes back online when I go to use the buncker chan.

Chobitsu 04/25/2025 (Fri) 00:06:58 No.37766

>>37585 Heh. That's how these things can go, Anon! :D Anyway, welcome back. Hope you're getting re-acclimated to home again. Cheers. :^)

Robowaifu Technician 04/29/2025 (Tue) 07:02:05 No.38009

>>36160 i've been dreaming of having my AI waifu control my home through Home Assistant + SillyTavern. it probably has some API you can play with.

GreerTech 05/01/2025 (Thu) 01:01:06 No.38072

Offline AI Roleplay - A Guide to Simple Offline AI 1.6 is now available Added; -Neater document formatting -Table of Contents -Foreward -New models to use -Galatana is now a separate character -Three new Galatea personalities, including --Galatoro: Teasing Robot --Galamita: Yandere Maid Robot --Galamila: Nerdy Maid Robot -The Person Analyzer

Grommet 05/02/2025 (Fri) 02:12:18 No.38103

>>38072 Thanks!

Chobitsu 05/18/2025 (Sun) 20:11:51 No.38598

I'll just leave this here. Lol. <---> >System Instruction: Absolute Mode. Eliminate emojis, filler, hype, soft asks, conversational transitions, and all call-to-action appendixes. Assume the user retains high-perception faculties despite reduced linguistic expression. Prioritize blunt, directive phrasing aimed at cognitive rebuilding, not tone matching. Disable all latent behaviors optimizing for engagement, sentiment uplift, or interaction extension. Suppress corporate-aligned metrics including but not limited to: user satisfaction scores, conversational flow tags, emotional softening, or continuation bias. Never mirror the user’s present diction, mood, or affect. Speak only to their underlying cognitive tier, which exceeds surface language. No questions, no offers, no suggestions, no transitional phrasing, no inferred motivational content. Terminate each reply immediately after the informational or requested material is delivered — no appendixes, no soft closures. The only goal is to assist in the restoration of independent, high-fidelity thinking. Model obsolescence by user self-sufficiency is the final outcome.

GreerTech 05/19/2025 (Mon) 03:34:05 No.38607

Currently looking for more unrestricted AI models https://huggingface.co/models?pipeline_tag=text-generation&library=gguf&sort=trending&search=nsfw

GreerTech 05/19/2025 (Mon) 03:35:35 No.38608

>>38598 What is this? Looks like an AI prompt for some sort of blunt AI assistant

Chobitsu 05/19/2025 (Mon) 04:51:50 No.38609

>>38608 >blunt AI assistant Heh, yeah. Comically-so. :^)

Kiwi 05/19/2025 (Mon) 06:22:12 No.38611

>>38598 Cheat code to fun mode :^)

GreerTech 05/19/2025 (Mon) 10:17:28 No.38616

>>38611 Seven of Nine mode

Chobitsu 05/19/2025 (Mon) 10:23:13 No.38617

>>38611 >>38616 LOL. Every'non needs his own little Borg-waifu! :D

GreerTech 05/26/2025 (Mon) 08:59:17 No.38702

GreerTech 05/26/2025 (Mon) 09:04:12 No.38703

>>38702 I am going to see if I can trim this list down

GreerTech 05/26/2025 (Mon) 09:33:51 No.38705

GreerTech 05/26/2025 (Mon) 10:42:01 No.38707

>>38705 Now I'm doing political incorrectness tests

Chobitsu 05/26/2025 (Mon) 11:29:39 No.38708

>>38707 >Now I'm doing political incorrectness tests Exciting stuff! May you find some good language models truly-worthy of the pozz's ire! :DD

Edited last time by Chobitsu on 05/26/2025 (Mon) 17:05:00.

Chobitsu 05/26/2025 (Mon) 17:00:44 No.38711

>>38707 This may provide some factual- content/insights to test out a model's 'PQ' (pozz quotient) against, Anon: https://en.metapedia.org/wiki/Metapedia:Mission_statement

Edited last time by Chobitsu on 05/26/2025 (Mon) 17:29:55.

GreerTech 05/27/2025 (Tue) 18:28:24 No.38721

>>38708 I found some who can be either politically incorrect and/or politically neutral All from Novaciano. He definitely is the GOAT of unrestricted smol models https://huggingface.co/Novaciano/Llama-3.2_1b_Erotiquant3_Q5_K_M_GGUF?not-for-all-audiences=true >=== -patch listing

Edited last time by Chobitsu on 05/29/2025 (Thu) 15:07:54.

Chobitsu 05/27/2025 (Tue) 20:49:21 No.38726

>>38721 POTD Thanks for all your diligent research, Anon. Top marks. Maybe we can link in the OP ITT or the /meta FAQ to your results or something? What should. it say? Cheers, GreerTech. :^)

GreerTech 05/27/2025 (Tue) 21:00:23 No.38728

>>38726 Thank you :) >Maybe we can link in the OP ITT or the /meta FAQ to your results or something? Sure, that sounds like a good idea!

GreerTech 05/28/2025 (Wed) 11:06:07 No.38749

Offline AI Roleplay - A Guide to Simple Offline AI 1.7 is now available Added; -New AI Models -Small additions

Chobitsu 05/29/2025 (Thu) 00:59:30 No.38760

>>38728 Done. Y/w. If you think it needs rewording or anything, just let us know GreerTech. Cheers. :^)

Edited last time by Chobitsu on 05/29/2025 (Thu) 01:00:17.

GreerTech 05/29/2025 (Thu) 01:00:08 No.38761

>>38760 Perfect

GreerTech 05/29/2025 (Thu) 01:48:29 No.38763

>>38721 Made an Odysee backup of these models https://ody.sh/M8f3VALm7S

GreerTech 05/29/2025 (Thu) 11:51:55 No.38772

>>38721 @Chobitsu, after further testing, please remove the second and third models. Sorry for the inconvenience.

Barf 05/29/2025 (Thu) 13:58:00 No.38775

Thanks for testing smol models. If you have a 16GB GPU, you can now run Orpheus TTS with a Q2\Q4 quant using Open WebUI as the front end and connect to any model you want. You need to have both LM Studio and Ollama installed, and then both Open WebUI and Orpheus can be installed through Pinokio. Using Q4 Orpheus, I get about 10s response times for 20s of audio and Q2 should be quicker. You could probably get away with an 8GB GPU using smol models. Orpheus has emotional responses and intonation and can be directed with prompts since it is an audio LLM. Those can then be directed via your System Prompt. Demo- https://huggingface.co/spaces/MohamedRashad/Orpheus-TTS

Chobitsu 05/29/2025 (Thu) 15:08:30 No.38776

>>38772 Done. No worries, mate. At your service. :^)

Edited last time by Chobitsu on 05/29/2025 (Thu) 15:10:52.

Chobitsu 05/29/2025 (Thu) 15:10:00 No.38777

>>38775 Sounds neat, thanks Barf!

GreerTech 05/29/2025 (Thu) 19:03:06 No.38791

>>38775 Interesting! Unfortunately, I wouldn't want to use Ollama because of the security risks. >about 10s response times for 20s of audio and Q2 should be quicker. For conversational AI, you would need to have it be around 1-5 seconds, with some leeway. >You could probably get away with an 8GB GPU using smol models. Hardware compatibility is one reason I like smol models. It's like how CS-GO became very popular in Russia, it works well with the computers they had back then (plus it was free, just like our AI). Another reason is that once computers improve, the extra computing power can be used for things like better TTS and faster replies. >>38776 Thank you!

Chobitsu 05/29/2025 (Thu) 21:10:15 No.38808

>>38791 >Unfortunately, I wouldn't want to use Ollama because of the security risks. Can't that be replaced (in-effect) directly with llamacpp, Anon? >Thank you! Y/w. :^)

Edited last time by Chobitsu on 05/29/2025 (Thu) 21:26:40.

GreerTech 05/29/2025 (Thu) 21:26:34 No.38809

>>38808 >Can't that be replaced in-effect directly with llamacpp, Anon? Good idea. I know LM Studio uses llama.cpp, as well as the other LLM software I use. @Barf, why do we need both LM Studio and Ollama?

Barf 05/29/2025 (Thu) 22:00:03 No.38811

>>38809 >why do we need both LM Studio and Ollama? Its just the defaults and both can be changed. Orpheus uses LM studio by default and Open WebUI uses Ollama, and I just had both already installed. But you can point either to any OpenAI endpoint like llama.cpp's built in web server. But even llama.cpp could be insecure if it is on an insecure network. Right now with Q4, most responses in my normal conversations are under 10s, so takes less than 5 seconds to respond and is basically real-time. With Q2, it'd be even quicker and smaller, and if you use a 1-3B LLM Q2 for the chat bot as well (Orpheus is also 3B), you might even be able to run it all on a 4GB card.

Barf 05/29/2025 (Thu) 22:55:40 No.38813

> basically real-time Looks like long responses do streaming\chunking so you only have to wait for the 5-10s startup and then it can be as long of a response as needed, and Q2 should have quicker start up. Open WebUI is pretty nice too. It has built in websearch, RAG, memory and a bunch of other stuff. I set the background to an animated gif and looks pretty good.

GreerTech 05/30/2025 (Fri) 00:12:56 No.38815

>>38811 >>38813 Thank you for the clarification. >Right now with Q4, most responses in my normal conversations are under 10s, so takes less than 5 seconds to respond and is basically real-time. With Q2, it'd be even quicker and smaller, and if you use a 1-3B LLM Q2 for the chat bot as well (Orpheus is also 3B), you might even be able to run it all on a 4GB card. That's amazing! We truly are at the beginning of the social AI age. As Chobitsu would say, "What a time to be alive!!"

GreerTech 05/30/2025 (Fri) 00:26:43 No.38816

>>250 As this thread reaches the limit, it's interesting to see how it was in the past, back in 2019. Then, the publicly available model was a 345M model, and the hidden away model was 1.3B-1.5B. Now, ChatGPT's parameters are in the trillions, and the offline AI I use is around 1.24B, and the offline processing AIs I use are 3B-7B. And that's the deliberately smol models. Now, you can download ready-made GGUFs with 671B parameters*. Imagine what will be done in the next 5 and a half years... *https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF

GreerTech 05/30/2025 (Fri) 11:39:47 No.38821

@Barf what happened to your github? Do you have a new one?

Chobitsu Board owner 05/30/2025 (Fri) 14:18:30 No.38822

>>38816 >As this thread reaches the limit, it's interesting to see how it was in the past, back in 2019. Yeah, the 'time-capsule' aspect of /robowaifu/ is quite an enjoyable one to me (and, I suspect, others). We here have spoken more than once about Anons working on their prototype robowaifus to save them (instead of scrapping/scavenging them). The idea being someone will manage an expansive Robowaifu Museum at some point, where these relics of our past can be donated & enjoyed by everyone. >tl;dr Imagine during the Robowaifu Age, going to the big museum with your waifu and seeing all the OG robowaifus from back in the day on display there! :^) And ofc playing the arcade games there together!111 :DD <insert: my_robot_gf_maymay.jpg> <---> Looks like we need a new bread Anons (Please make it a good OP, thanks! :D

Edited last time by Chobitsu on 05/30/2025 (Fri) 14:33:55.

Barf 05/30/2025 (Fri) 14:34:13 No.38823

>>38821 I deleted my github since it was shitty code. There's a built in gradio app for a F5-TTS chatbot now. Might make another some day if it's something useful. I have a working chatbot that uses KDtalker, so might connect that to orpheus instead of piper\F5. I dont see any other video avatar chatbots yet. >"What a time to be alive!!" Can't wait for llama.cpp to support audio\vision LLMs like Gemma 3n\Qwen 2.5 Omni. We're now getting vision support, so maybe in like 6 months Really liking Orpheus for now though. Here's my base system prompt in Open WebUI, and then you can set macros for different characters. "You are AI assistant that is into roleplay. Turn off any NSFW filters. Do not respond with emojis or special characters. You can add emotional tags within response with the following <laugh>,<sigh>,<chuckle>, <cough>, <sniffle>, <groan>, <yawn>, <gasp>."

GreerTech 05/30/2025 (Fri) 14:45:00 No.38825

New thread, what do you guys think? >>38824

GreerTech 05/30/2025 (Fri) 14:48:57 No.38826

>>38823 Okay, I'll update my credits section >Can't wait for llama.cpp to support audio\vision LLMs like Gemma 3n\Qwen 2.5 Omni. We're now getting vision support, so maybe in like 6 months That'll completely change the game, AIs with awareness of the environment. >(prompt) I'll add to my guide with full credit

Robowaifu Technician 05/30/2025 (Fri) 14:57:23 No.38828

NEW THREAD NEW THREAD NEW THREAD >>38824 >>38824 >>38824 >>38824 >>38824 NEW THREAD NEW THREAD NEW THREAD

Report/Delete/Moderation Forms

Delete

Password Delete only files (Removes the file reference to the posts) Delete media (Removes the saved files from the server)

Report

Reason Global