/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality.




LLM & Chatbot General Robowaifu Technician 09/15/2019 (Sun) 10:18:46 No.250
OpenAI/GPT-2 This has to be one of the biggest breakthroughs in deep learning and AI so far. It's extremely skilled at developing coherent, humanlike responses that make sense, and I believe it has massive potential; it also never gives the same answer twice. >GPT-2 generates synthetic text samples in response to the model being primed with an arbitrary input. The model is chameleon-like—it adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing >GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets. Also, the current public model shown here only uses 345 million parameters; the "full" AI (which has over 4x as many parameters) is being withheld from the public because of its "Potential for abuse". That is to say, the full model is so proficient at mimicking human communication that it could be abused to create new articles, posts, advertisements, even books; and nobody would be able to tell that there was a bot behind it all. <AI demo: talktotransformer.com/ <Other Links: github.com/openai/gpt-2 openai.com/blog/better-language-models/ huggingface.co/ My idea is to find a way to integrate this AI as a standalone unit and add voice-to-text for processing the questions and TTS for responses, much like an Amazon Alexa, but instead of just reading Google results it actually provides a sort of discussion with the user. (Edited to fix the newlines.)
Edited last time by Kiwi_ on 01/16/2024 (Tue) 23:04:32.
new and even better voice synth TTS / editor dropped. no HF space demo yet, but you can listen here - https://jasonppy.github.io/VoiceCraft_web/ https://github.com/jasonppy/VoiceCraft model weights - https://huggingface.co/pyp1/VoiceCraft/tree/main
Kinda in the wrong thread, we have one specific for voice and speech. But thanks, no problem. You probably didn't find the right one because you need to search for "speech generation" not "voice ...". I put my answer in there: >>30625
Hello robowaifu, Honestly glad to see a chatbot thread. I usually just lurk here, but glad to see a proper thread for these, and an actual discussion; I'm so used to /g/'s usual chaos. Hmm, I've been wondering how to improve my chatbot experience: while I can make great bots for usage, I've been wanting to explore using text-to-speech to expand on them.
>>30813 If you want advice, I still suggest /g/'s /lmg/. They're quite helpful.
Some guy (Morgan Millipede) started to reverse engineer Neuro-Sama: https://youtu.be/uLG8Bvy47-4 - basically just a humorous introduction on how to do this (he has a $4k computer, though, and she's slower in her responses at the beginning). 4chan responded: https://youtu.be/PRAEuS-PkAk - Her response time improved since the first video.
>>30821 Lol. Thanks NoidoDev, I'll try to make time to look these over. Cheers. :^)
>llama3-70b on Groq runs at 300 tokens/s for 7k tokens >mixtral-8x7b at 550 tokens/s for 7k tokens >my tinyllama-1.1b model extended to 12k tokens runs at 0.5 tokens/s I don't feel so good, bros. How do we make faster models? I have an idea to use Matryoshka representation learning to reduce the hidden dimension size dynamically: https://arxiv.org/abs/2205.13147 but even if I truncate the model's 2048 dimensions down to 512 dimensions, it will perform at 8 tokens/s at best. And who knows how much slower it will be once I get to 32k context. If it's possible to reduce 90% of the tokens to 64 dimensions, then it might get 70 tokens/s at the very most, but GPU latency will probably fuck that down to 20 tokens/s. I could also prune a few layers of the model, quantize it to 4 bits, and implement mixture of depths https://arxiv.org/abs/2404.02258 but that will only give a tiny speed-up and I don't want the accuracy to drop further than it is. With the much smaller model size though, I could convert it into a sparse mixture-of-experts model https://arxiv.org/abs/2401.04088 with 16 experts to make up for the loss in accuracy without sacrificing speed. The model will eventually be finetuned with self-rewarding ORPO too, hopefully providing a boost in usefulness to overcome its barebones compute, although I'll likely use Llama3-70b to bootstrap the reward labels until it's capable of consistently self-improving on its own. Odds ratio preference optimization (ORPO): https://arxiv.org/abs/2403.07691 Self-rewarding LMs: https://arxiv.org/abs/2401.10020 The T5 efficient model worked fine with a hidden dimension size of 512 after finetuning: https://arxiv.org/abs/2109.10686 And Matryoshka representation learning also worked well using a 16-dimension embedding for a 1k-class classification task.
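For anyone following along, the core Matryoshka trick is that one readout weight matrix works at any prefix length of the hidden vector, so truncating the hidden size is just a slice. A toy numpy sketch (the dimensions and class count here are made up for illustration, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "hidden states": 8 samples with a full dimension of 2048.
full_dim, n_samples, n_classes = 2048, 8, 10
h = rng.normal(size=(n_samples, full_dim))

# A Matryoshka-style readout reuses the *same* weight matrix at every
# truncation: scoring with the first k dims just slices rows of W.
W = rng.normal(size=(full_dim, n_classes))

def logits_at(h, W, k):
    """Score using only the first k hidden dimensions (Matryoshka truncation)."""
    return h[:, :k] @ W[:k, :]

full = logits_at(h, W, full_dim)
truncated = logits_at(h, W, 512)  # 4x less matmul work per token

print(full.shape, truncated.shape)  # both (8, 10)
```

During MRL training the loss is summed over several prefix lengths (e.g. 64, 512, 2048) so each prefix stays usable on its own; at inference you then pick the cheapest prefix that's accurate enough.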
I forget the paper but I remember reading one years ago where they found some layers in transformers are only making a decision between a few choices, so a large hidden size might not be necessary in those cases. To convert the model's hidden states to Matryoshka I plan to add importance biases to parameters and train the biases with the rest of the parameters frozen and then take the softmax over them and top-k. After training, the parameters could be sorted and the importance biases pruned, and then the model parameters could be finetuned. I may have to train an even smaller model from scratch though since TinyLlama uses 32 attention heads.
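A rough sketch of the importance-bias plan described above, as I read it (the names and sizes are hypothetical, not from any paper): give each hidden dim a learnable importance score, softmax over them, keep the top-k, and later sort dims by importance so pruning becomes a contiguous slice:

```python
import numpy as np

rng = np.random.default_rng(1)
hidden = 2048

# Hypothetical learnable importance bias per hidden dimension; in the scheme
# described above, only these would be trained while the model stays frozen.
importance = rng.normal(size=hidden)

def topk_mask(importance, k):
    """Softmax the importance scores and keep the k most important dims."""
    p = np.exp(importance - importance.max())
    p /= p.sum()
    keep = np.argsort(p)[-k:]          # indices of the top-k dims
    mask = np.zeros_like(p)
    mask[keep] = 1.0
    return mask, keep

mask, keep = topk_mask(importance, 512)
print(int(mask.sum()))  # 512

# After training, dims can be reordered by importance, so "prune to k dims"
# is just h[:, :k] on the permuted parameters:
order = np.argsort(-importance)
```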
>>31006 >use Matryoshka representation learning to reduce the hidden dimension size dynamically This seems both interesting & promising, Anon. Good luck with your research. Cheers. :^)
Kyutai - fast and unhinged, the real girlfriend experience: https://youtu.be/ZY2hBv9ob8U https://youtu.be/bu7-YODAcfs
https://youtu.be/Nvb_4Jj5kBo >Why "Grokking" AI Would Be A Key To AGI The title might be a bit misleading, since this also talks about alternatives. It's a very interesting video exploring the actual weaknesses of LLMs and how to deal with them. One way seems to be to train them 10x more. I'm looking forward to the reactions of the people complaining about AI's energy consumption and costs. :D Another important takeaway is that one math idea might improve these models a lot. This is very different from other areas of technological progress, and very promising for anyone who wants things to move faster. >Links Check out my newsletter: https://mail.bycloud.ai Are We Done With MMLU? [Paper] https://arxiv.org/abs/2406.04127 Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models [Paper] https://arxiv.org/abs/2406.02061 Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization [Paper] https://arxiv.org/abs/2405.15071 Grokfast: Accelerated Grokking by Amplifying Slow Gradients [Paper] https://arxiv.org/abs/2405.20233 [Code] https://github.com/ironjr/grokfast
>>32562 Neat! Thanks, NoidoDev. It's certainly encouraging that Grok is being positioned as an open source project AFAICT. If the past year or two is any indication, then we can expect rapid improvements to it once it's out of the hands of the GH, and the Autists get their hands on it. Cheers. :^)
Did anyone test some small models like SmolLM-Instruct (https://huggingface.co/spaces/vilarin/SmolLM-Instruct), Phi-3, or DialoGPT? And maybe look into how to fine-tune them? They seem to be extremely bad, especially the 120-360M parameter ones, but they run on a CPU and SmolLM is very fast (at putting out outrageous gibberish). > picrel 1 is more like what I wanted, picrel 2 is closer to what I've got, but there's still hope I also wonder if anyone trained such a small model on some specific programming language, just to do basic math and function calling. Or classification of the input. Fine-Tuning >Selecting the appropriate model architecture and training method is crucial when fine-tuning transformer models for specific task objectives. This process involves adapting a pre-trained model, which has been initially trained using one of the following methods, to perform new or more specialized tasks: > - Causal Language Modeling (CausalLM): Focuses on predicting the next token based solely on the preceding sequence. Originally trained models using CausalLM are typically fine-tuned for tasks that require sequential data generation. > - Masked Language Modeling (MLM): Involves predicting randomly masked tokens from their context. Models pre-trained with MLM are often fine-tuned for tasks that benefit from understanding bidirectional context, such as text classification. > - Sequence-to-Sequence (Seq2Seq): Uses an encoder-decoder structure to transform entire input sequences into outputs. Fine-tuning Seq2Seq models is common in tasks like translation or summarization where comprehensive input-to-output mapping is required.
Source: https://medium.com/@liana.napalkova/fine-tuning-small-language-models-practical-recommendations-68f32b0535ca LIMA: Less Is More for Alignment https://arxiv.org/abs/2305.11206 > Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, learning to follow specific response formats from only a handful of examples in the training data, including complex queries that range from planning trip itineraries to speculating about alternate history. Moreover, the model tends to generalize well to unseen tasks that did not appear in the training data. In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases; this statistic is as high as 58% when compared to Bard and 65% versus DaVinci003, which was trained with human feedback. Taken together, these results strongly suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high quality output.
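The three pre-training objectives quoted above differ only in which tokens get predicted from which context. A toy numpy illustration of how the inputs and labels are laid out for each (not a real trainer; the MASK id is made up):

```python
import numpy as np

tokens = np.array([5, 9, 2, 7, 3])   # a toy token-id sequence
MASK = -1                            # hypothetical mask/ignore id

# Causal LM: labels are the tokens shifted left by one, so each position
# predicts the *next* token from everything before it.
causal_inputs, causal_labels = tokens[:-1], tokens[1:]

# Masked LM: randomly mask positions; labels exist only where masked, and the
# model sees bidirectional context around the holes.
rng = np.random.default_rng(0)
is_masked = rng.random(len(tokens)) < 0.3
mlm_inputs = np.where(is_masked, MASK, tokens)
mlm_labels = np.where(is_masked, tokens, MASK)   # MASK here means "ignore"

# Seq2Seq: an encoder consumes the full source sequence and the decoder
# generates the target causally — conceptually a causal LM conditioned on
# the encoded input, so its labels look like the causal case.
print(causal_inputs, causal_labels)
```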
>>32934 >Taken together, these results strongly suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high quality output. Big if true. I admit to being confused by this conception though, lol.
Ran across this "uncensored", open-source, free AI: https://www.freedomgpt.com/ It runs with 16GB or less of RAM and you don't need a video card; it also has a downloadable, private local version. At the site you can scroll around, and they have some sort of image manipulation tool also, but I didn't see where you could run it locally. It's supposed to be uncensored, but they highlight for political figures. They don't say either way about girls. Looks interesting. If you install it on Linux, some step-by-step directions on how you did so would be nice.
>>32962 This site reeks of investor-speak. Looking through the privacy policy and the about us page, it looks like they do sell some data to advertisers, and they have some vague stance against "unethical" use. Neither of these are explained in detail. They're also pushing for some altcoin nonsense, but it looks like you have to take steps to opt-in, so that's not too bad. They explain so little about themselves that I can't get a good read beyond "somewhat fishy", though. >If you install it on Linux some step by step directions on how you did so would be nice They have step-by-step instructions on their github page.
>>32962 Thanks, Anon! Always need to keep on the lookout for more practical solutions for AI that may prove useful for robowaifu development. <---> OTOH : >"...We believe AI will dramatically improve the lives of everyone on this planet if it is deployed responsibly with individual freedom as paramount." [1] I'd argue that user freedom be not just 'paramount', but it is in fact the only mount that's actually important here. We -- the masters & owners -- alone should determine the aspects most important for our robowaifu's AI, IMHO. After all, they are our own household appliances! Cheers. :^) >>32964 >"...Additionally, we have no tolerance for FreedomGPT hosted models being misused for unethical purposes." [1] <insert: skeptical kot is skeptic.jpg> Yah, It's pozz'd. I can just imagine the parade of clownworld troons & stronk independynts the totally-not-GH-glowniggers"""VC"""s in control there trot out to make such determinations. --- 1. https://www.freedomgpt.com/about-us >=== -add footnote -minor edit
Edited last time by Chobitsu on 08/20/2024 (Tue) 01:51:35.
>>32964 >This site reeks of investor-speak. It seems to me they have good reason for the token system. freedomgpt @RealFreedomGPT 🫡We created $FNT to solve our own problem: centralized web hosts stopped supporting FreedomGPT and we needed to establish our own computing network. https://x.com/RealFreedomGPT/status/1764025152088805684 Apparently they are using their users' distributed computing to run (train?) the AI. It is for-profit: "FreedomGPT is a 100% uncensored and private AI chatbot launched by Age of AI, LLC. Our VC firm invests in startups that will define the age of Artificial Intelligence and we hold openness as core. We believe AI will dramatically improve the lives of everyone on this planet if it is deployed responsibly with individual freedom as paramount." So it looks like they are funding a basic model and using it to sell advanced/specialized models in their app store, or so I'm guessing. "If" it is local and doesn't report back all you do, that seems a good thing to me. They say it doesn't. I suppose watching its network access, or lack thereof, would tell. I have no interest in this other than I like the idea of uncensored, local AIs. I'm sure there are others, but come to think of it I haven't seen any that really hype up the idea of local use like them. Though I'm really, really far from knowing all the AIs out there.
An analysis of OpenAI Strawberry https://www.youtube.com/watch?v=FJTZP7ZdQf0
>>33271 Neat!! Thanks for the link, Kiwi. This seems like a rather plausible scenario IMHO. And I really like the fact he's not just 'armchair quarterbacking it'; rather he's actually drilling into an example suite of his own devising to demonstrate his hypothesis. Sound research methodology in fact of course tbh. :^) It seems to me that such a simple '4-headed' synthesized-data approach might work well even with other LMs/datasets/even-other-ML-systems . Any thoughts about that, Anon? Cheers. :^) <---> >"...Language Models are really good at discriminating..." L.M.A.O. >Faux pas alert! >FAUX PAS ALERT!111!!ONE!!! <insert: DAS_RAYCISS!!!.exe.mpg.mov.mid.stl.the-classic-gif> Indeed they are. Maybe that's why Tay's Law is a real thing : ( >>33222 ). :DDD >t. Anonymous: Amateur Nooticing done by day, Robowaifu Engineering done by night >=== -sp, fmt, funpost edit
Edited last time by Chobitsu on 08/31/2024 (Sat) 23:20:58.
Madlad put a language model on an ESP32. A reminder of how small these things can be. https://www.youtube.com/watch?v=E6E_KrfyWFQ
>>33488 Great find, Anon! Thanks for pointing this out. You know, since we'll deffo need a smol network of MCUs in a mid- to high-tier robowaifu, maybe some of that compute power can be redirected to GPMCU (tm)(R)(C)(patent pending)(do not steal!1111)? It'd be tricky tho, since the majority of tasks running on our microcontrollers will at least be running soft-realtime (if not hard RT).
BitNet has heaps of potential to bring capable LLMs to lower-cost, efficient hardware. It's a framework for ternary (-1, 0, 1) LLMs, which works out to 1.58 bits per weight on real hardware. It is often misrepresented as 1-bit. Essentially, it's a method to have LLMs use math that's easier to process, to reduce latency and power consumption. https://github.com/microsoft/BitNet T-MAC is another method of reducing the processing power needed. It works by using a look-up table to find solutions. Essentially, the solutions needed for the math problems in the process of generating an answer are already done, so the system finds them instead of calculating them again. Relying more on faster storage or RAM frees up compute for other problems. This can result in much faster responses using less power. https://github.com/microsoft/T-MAC
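To make the ternary idea concrete, here's a small numpy sketch of absmean quantization along the lines of the BitNet b1.58 paper (my rough reading, not the reference implementation): scale weights by their mean absolute value, round, and clip to {-1, 0, 1}, after which matmuls need only adds, subtracts, and skips:

```python
import numpy as np

def ternary_quantize(w, eps=1e-8):
    """Absmean ternary quantization: round(w / mean|w|) clipped to {-1, 0, 1}."""
    scale = np.abs(w).mean() + eps
    q = np.clip(np.rint(w / scale), -1, 1)
    return q.astype(np.int8), scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))
q, scale = ternary_quantize(w)

# Multiplying by a ternary weight is add / subtract / skip — no multiplies,
# which is where the latency and power savings come from.
x = rng.normal(size=8)
approx = (q @ x) * scale   # dequantized ternary result
exact = w @ x
print(np.abs(approx - exact).max())
```

In the real framework the quantization is applied during training (quantization-aware), so the model learns to be accurate under the ternary constraint rather than being crushed by it after the fact.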
>>34035 Thanks, Light! This sounds awesome. I'm currently playing with some 64-bit ARM chips. I wonder if they could tackle such a task?
>>34056 You can run BitNet.cpp on ARM under a Debian distribution. https://github.com/tecworks-dev/BitNet.cpp
>>34115 Neat! Thanks, Kiwi. Cheers. :^)
>ctrl + F >no hits for "front end", "silly", or "UI" Without even digging into the thread, I was curious if there was any development of other front-end chat interfaces. Silly Tavern pretty much rules the roost in terms of utility, last time I checked, and there's not really anything else that comes close to it. I was honestly expecting some kind of software to come out on Steam or a similar online store. That way, your chat interface isn't going through a web browser. Not that that's a bad thing, but in terms of accessibility, having a cmd prompt running down in your taskbar always felt offputting to me. Ideally, everything would be contained in a program. >code it yourself then if you're so great I'm not up to the task. I envision something you download off Steam, like VRChat. It would be a gateway to image generation, chatbots, and other AI tools. Maybe some integration of AI into VRChat would be the best option. I don't want VR goggles to be mandatory, though. I want as few button presses as possible between the user and interacting with AIs. With SillyTavern or Kobold, I feel like there are just enough technical hurdles to dissuade more casual users and scare them away from getting into it. With sites like Chub or CAI, there's a lot less in the way of the user interacting with the AI. That ease of use is what I'm gunning for. Please forgive my ideas-guy ramblings. >why build for such a casual audience? Exposure, mostly. There are a lot of cool applications of AI chat that aren't being explored because people simply aren't exposed to it. Kind of like how you don't realize you need a smartphone until you have one and it becomes almost symbiotic.
>>34427 Thanks for the input Anon. I actually have to agree with Jensen Huang that simply talking to robowaifus will, in the end, turn out to be the most common way to 'program' them. Till such time however, it will take technicians like us here & elsewhere to lay the foundational groundwork to enable such a high-level to be effective. >tl;dr Better crack those books, Anon! :^) >B-but >I'm not up to the task!111!!11 Better crack those books, Anon! :^)
>>34453 I don't know if I'd want to reinvent the wheel on this one. I'd be fine porting Silly Tavern or Kobold over to Steam and other online distribution platforms. If I couldn't get permission to do that, I'd make my own offbrand version.
>>34464 GPT4all is a relatively simple way to interact with a locally hosted LLM, but getting it in a normie-friendly format would be a bit of a challenge. I do know there are ways to get python stuff into relatively self-contained distributable formats that automagically download stuff (eg oobabooga & stable diffusion) but haven't looked into it.
There's always Backyard AI
Tried backyard and it's pretty good. Lots of characters. I made this AI chatbot that was intended to be used with dolls. https://github.com/drank10/AnotherChatbot It has end-to-end speech with voice cloning and stable diffusion img2img generation, so you can take a pic of your doll and talk to it while generating realistic looking images of it. WMDoll and Galatea just released Metabox AI and with auto-BJ and breathing, it's getting really close to version 1 of a robowaifu for me.
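For anyone wondering what the glue in a stack like this looks like, the whole end-to-end loop is basically three function calls per turn. The stage functions below are hypothetical stubs standing in for whatever STT/LLM/TTS backends are in use (e.g. Whisper, a local LLM server, Piper); they are not the APIs of any real library:

```python
# Minimal shape of an end-to-end voice chat loop: audio in -> text ->
# reply text -> audio out, carrying the chat history between turns.

def speech_to_text(audio: bytes) -> str:
    return "hello robowaifu"          # stub: real code would call an STT model

def generate_reply(history: list[str], user_text: str) -> str:
    return f"you said: {user_text}"   # stub: real code would call the LLM

def text_to_speech(text: str) -> bytes:
    return text.encode()              # stub: real code would synthesize audio

def chat_turn(history: list[str], audio: bytes) -> tuple[bytes, list[str]]:
    user = speech_to_text(audio)
    reply = generate_reply(history, user)
    history = history + [user, reply]
    return text_to_speech(reply), history

audio_out, history = chat_turn([], b"\x00\x01")
print(history)  # ['hello robowaifu', 'you said: hello robowaifu']
```

The useful property of keeping the loop this dumb is that each stage can be swapped independently, e.g. trading a slow voice-cloning TTS for a canned Piper voice to cut response time, as discussed below.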
>>34468 Packing Python is pretty cancer; PyInstaller is a heap of useless shit on some systems. May I suggest Lua instead? It has a simple syntax and can actually be packed into static binaries [1], and they are relatively small. The only issue would be the libraries; AFAIK there are no pure-Lua LLM/AI libraries, so C bindings would need to be created. [1] https://github.com/ers35/luastatic
>>35158 >https://github.com/drank10/AnotherChatbot Thanks, looks interesting. Did anyone else ever test it? Is there a thread or even a video somewhere else? >>35354 >May I suggest Lua instead? Yes, but letting it replace Python will most likely be rejected by everyone. Python is the main language used in AI right now. Also, some of these Python install managers seem to work, and we don't need it to work on different systems, only on the one the specific creator builds on. We have a thread of its own for the topic of programming languages, btw: >>128
>>35354 >Packing Python is pretty cancer, PyInstaller is a heap of useless shit on some systems. The majority of prospective robowaifu customers don't care about the nerdy backend stuff, they just care about how it acts. >>35368 > Also, some of these Python install managers seem to work Indeed, I was quite surprised when I downloaded a local version of stablediffusion (runs off python, y'know) and it automagically downloaded all the python libraries without a single error. Freaked me out lol
Here's a thread on it at dollforum: https://dollforum.com/forum/viewtopic.php?t=185688 There are a lot of good threads over there. For this program, I'm getting about 10-second response times using an RTX 3090 and about 20-30s response times on a 3070 laptop. That thread has older versions of the program that used quicker TTS, which is the main bottleneck. With robotic-sounding or pre-trained TTS, the same program can generate responses in under 5 seconds. Python packaging is a major pain, and I can't really program; I used AI to make this one. If I were to make it for an embedded, known system, C++ would probably be worth it, and the AI is pretty good at generating C++. Then you can just ship the binaries or an image. You could also ship a fully preloaded VPC image for Python programs and build a robowaifu OS, I guess. Would be a good option for a Jetson Thor.
>>35388 Sounds really interesting, Barf. GG so far! Please keep us here all up to date with your progress, Anon. Cheers. :^) P.S. I'm reasonably-competent as a C++ dev, Anon. If you'd like to post some of your generated code, I'll be happy to critique it for you (as long as the process doesn't get too involved -- i'm pressed with Uni studies atm).
>>35460 >>human-tier sapience >From my experience even many humans are not sapient and are the biological equivalent of an LLM. Which is why many think LLMs are smart: similar in function. Lol'd. OK, this is a fair point. And that's intentional by the GH, ofc. As I pointed out recently, works such as Idiocracy are, roughly speaking, prophetic. Thankfully, the very laws of physics themselves are actually working in all our favors here, and in opposition to the GH's agendas. Pretty lulzy in fact, and not surprising at all IMO given God's magnificent sense of humor. :DD >>AGI >I feel that AGI is a great promo buzzword with little basis in reality: it is not just the brain that does all the thinking. Different lobes and different parts are dedicated to different tasks like the occipital lobe (for sight) and the neo-cortex (for abstract thought). Anyone making a butlerbot (or a robowaifu who can help with chores beyond carrying items) would need to have multiple models for each task running as they're needed. Yes, I've been a big proponent of biomimicry here for years now, and that concept certainly extends to the division of 'labor' inside the human brain. I would additionally point out there is a significant body of research indicating that the entire neurological system gets involved with 'thinking', not just the brain. Proprioception certainly seems to support this idea, AFAICT. And my own physical training definitely has led me personally to believe that my reaction times, at the least, can be trained to not need to over-engage with the so-called higher-order neurons of my cerebellum (but can largely be managed 'on site', as it were). Nice graph, BTW. :) >A single model to do everything is just silly and inefficient, However this all falls out, I think it's extremely unlikely that LLM chatbots alone will be how we accomplish it!
I think it's going to be an amalgam of many different approaches, all kludged together until we manage some kind of reasonably-robust simulacrum. >=== -prose edit
Edited last time by Chobitsu on 01/08/2025 (Wed) 07:20:12.
>>35466 You'll be here in 5 years; how much do you want to bet? $300? Can go higher. 50% job loss across the board and I was right; otherwise it was a gay chatbot autocomplete. How do we measure that, though? Let's say US labor participation rate at 40%; it is currently 62%, I think.
>>35388 >RTX 3090 If that's the only way to get AI then the budget is busted. If it takes $2,000 or more for just the AI and control board, that doesn't leave room to finance much else. My made-up-from-a-guess affordability numbers are $2,000 for a selling-like-hotcakes robowaifu, to $3,000 for a selling-very-briskly robowaifu. I expect as you start going over this it will really put a crimp in sales. However, I do believe you could have a spurt of sales at $5,000-$6,000, but people would demand a lot at that price. Maybe more than could be easily delivered. I really believe some sort of breakthrough or refactoring of "how" AI is done will be needed to lower the compute needed. I have many times mentioned this company called XnorAI because they got really outstanding results from low-compute microcontrollers. I also think that the training should be limited to visual avoidance of obstacles for walking/coordination, voice recognition/speech, and some sort of method for the robowaifu to be trained by voice with set keywords. The LLMs they are building now are training on everything they can get and, I suspect, this is driving up the compute needed far above what is actually needed for our purposes. I talked about XnorAI here, in case anyone can come up with a way to use this: >>18651 >>18652 >>18777 >>18778 >>18818 >>19341 >>21033 >>28405 I mention this hoping someone smarter than me can find a way to make use of it.
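As I understand it, the trick behind those XnorAI results is the XNOR-Net style of binary network: quantize weights and activations to ±1, pack the sign bits into machine words, and replace the dot product with XOR + popcount, which even a microcontroller can do fast. A tiny pure-Python sketch of the arithmetic identity involved:

```python
def binarize(v):
    """Pack the signs of a real vector into an int: bit 1 for >= 0, 0 for < 0."""
    bits = 0
    for x in v:
        bits = (bits << 1) | (1 if x >= 0 else 0)
    return bits

def xnor_dot(a_bits, b_bits, n):
    """Dot product of two ±1 vectors via XOR + popcount:
    dot = (#matching signs) - (#differing signs) = n - 2 * popcount(a ^ b)."""
    return n - 2 * bin(a_bits ^ b_bits).count("1")

a = [0.3, -1.2, 0.7, -0.1]   # binarizes to +1, -1, +1, -1
b = [1.0, -0.5, -2.0, -0.4]  # binarizes to +1, -1, -1, -1
dot_full = sum((1 if x >= 0 else -1) * (1 if y >= 0 else -1)
               for x, y in zip(a, b))
dot_fast = xnor_dot(binarize(a), binarize(b), len(a))
print(dot_full, dot_fast)  # 2 2
```

On real hardware the inner loop becomes one XOR and one popcount instruction per 32 or 64 weights, which is why binary networks can run vision tasks on tiny chips; the open question is how much model accuracy a robowaifu's task set can tolerate losing to get there.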
>>35476 The chatbot aspect can be done with current AI, so can image recognition, etc. Build it piece by piece with the prompt, algorithms, etc. o3 high-compute or o4 is for the GPT to build the robot from start to finish. >doesn't that mean it could also make weapons and horrors beyond our comprehension yes
Open file (1.23 MB 286x300 621-1477916246.gif)
>>35477 it just werks
>>35481 my years of completing captchas has finally paid off they grow up so fast
A 3090 isn't needed for local AI unless you want to run huge models and the latest zero-shot voice cloning. This same program can get under-5-second responses using a cheap card if you run a 3B LLM, Whisper Tiny, and a canned TTS voice like Backyard/PiperTTS. $2-3k sounds reasonable for a basic bot, which I think some dolls already qualify for that have a Metabox AI and can move a little (head, hips, breathing). From there, it's just how much you want to spend for more features. Hopefully they'll be modular, and they kind of already are, as you can always upgrade the head on a doll. I just have a very low bar for my minimal viable robowaifu is all, and I basically already have my version 1. Currently waiting for an outfit: https://www.amazon.com/s?k=blue+french+maid+outfit
>>35387 >The majority of prospective robowaifu customers don't care about the nerdy backend stuff, they just care about how it acts. That's true, but WE should care about the backend. If you wish to drag the poorly-packed 10MB Python interpreter + all modules onto your robowaifu, be my guest, but I'd rather compile a single static binary with LuaJIT and minimal dependencies.
>>35506 I've really gotta start proofreading my posts, fucking hell.
>>35441 >>35506 My main issue with using C++ or other languages is that everything is in Python. Here's a C++ version of Whisper, and then you'd have to port TTS too. https://github.com/ggerganov/whisper.cpp
>>35512 Looks like Bark TTS has a C++ version: https://github.com/PABannier/bark.cpp Any other options? All beyond what I can do, but fun to mess with.
>>35512 >>35513 Great! If you're going to work with a real repo of code, then (presumably) it should be minimally-functional already, Barf. I'm personally confident that Gerganov's Whisper port is close to SOTA, so yeah, great choice for use on smol onboard SBCs suited to installation inside robowaifus. Good work, Anon. If you need help building or anything, just ask here. Cheers. :^)
