/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality.



General Robotics & AI News Thread 5: Gazing Into the Nightmare Rectangle Edition Greentext anon 11/04/2024 (Mon) 05:42:08 No.34233
Anything related to robowaifus, robotics, the AI industry, and any social/economic issues thereof. -previous threads: > #1 ( >>404 ) > #2 ( >>16732 ) > #3 ( >>21140 ) > #4 ( >>24081 )
>>35777
>I'd like to make a choose your own adventure type bot with just a database of pre-records. Seems easy enough and can get as complicated as you want.
I like that idea, Barf. Let me make the same recommendation to you that I made to GreerTech: ( >>35792 ). Cheers, Anon. :^)
Open file (846.45 KB 6274x1479 LLM_timeline.jpg)
Open file (180.71 KB 1280x720 moe_glados.jpg)
I am honestly a little baffled by the current takes here. Maybe I am simply stupid and not understanding something. Please do correct me if that is the case, but we seem to be living in two different realities.

We are in the good timeline. Global Homo has lost control over LLMs. Originally, it was too dangerous to release the GPT2 weights; the plan was to forever keep them in the labs and in the cloud. Then, the LLaMA 1 leak opened the flood gates and created the open weights era. As time goes on, their grip loosens further. The Chinese are giving us better open weights models, and fully open source models are getting better too. At the end of 2024, RWKV v7 initial test runs are showing really good performance, producing very smart tiny models trained in the open with all public data.

Before December, there was good reason to be pessimistic about any transformer alternative in the LLM space. Even with a good architecture, the training cost for LLMs is very prohibitive, and companies don't want to risk expensive training runs on novel architectures. If they do, they won't publish the weights, for competitive advantage. In December, we had a really interesting development. The RWKV team figured out how to convert an existing transformer model into a RWKV one by replacing the attention heads and then healing the model. This means RWKV can benefit from existing massive LLM pre-trains. ( https://substack.recursal.ai/p/q-rwkv-6-32b-instruct-preview ) When talking to the RWKV folks, they mentioned an interesting observation from resetting the attention while keeping the feed-forward part: it seems to reset a lot of the LLM's behaviors while keeping a lot of its knowledge intact. This process also undoes a lot of censorship and allows you to add new behavior cleanly.

Another really good milestone in the open source LLM space was Tulu3. Before, there was no good open dataset or process for taking a base model and training it into a chat/instruct model. All the labs were vague about how to do this effectively. But now, with the Tulu3 paper, dataset, and models, the fully open source instruct models are no worse than their open weights counterparts.

Even on the hardware front, there is hope. Tuning locally is too cost prohibitive to do on your own hardware, and the same goes for inference of large models (14B+). However, I think that may change soon. The Nvidia Digits computer could potentially be really good for us. I do think unified memory is the future (just look at Apple's M series chips). A CUDA-powered Mac mini running Linux sounds cool, and it's aimed specifically at the small business/power-user market. We can't judge it until it's actually out, but as strange as this sounds, I think Nvidia's interests align with ours. They're in the business of selling shovels; it's in their interests to have lots of groups and individuals running and tuning LLMs.

>>35777
While currently still not accessible to average people like us, there is progress being made on distributed training over the internet, and there is reason to think that this year we will see public tools published for this. ( https://github.com/NousResearch/DisTrO )

Regarding the topic of chatbots, I feel there is some ignorance in this area. I do not understand what your requirements are and why LLMs are ruled out. Your average gaming PC already can do this, and the software exists. There is no work to be done here. Set up llama.cpp and SillyTavern; you can also plug in voice recognition and synthesis. Most 8B LLMs are overkill for carrying out whatever dialog you want; there are plenty of finetunes out there. If SillyTavern is not your style, then just write your own LLM & RAG wrapper; it's not hard (a minimal sketch follows this post). Is this the kind of thing you want? ( https://youtu.be/OvY4o9zAqrU ) Maybe I simply do not know what people want, so please correct me. (Try to be specific about it; an actionable list is ideal.)

In general, I think rejecting language models is a mistake. There is nothing glorious about AIML; it will not save you. I was there when it was the way of doing chatbots, and there is no future there. If you are not happy with LLMs, what makes you think a pile of what's basically fancy regex will be better?

Now, having said that, I recommend anons start researching and building proper cognitive architectures. I think LLMs are only a part of the puzzle, and there is a lot of work and exploration that needs to be done. So far, I feel the only people still giving this any serious thought and work are me and CyberPonk. The threads that get the most positive engagement seem to be related to specific projects rather than a topic (SPUD, Galatea, etc.). Maybe I should start my own thread and basically blog about my cognitive architecture adventure, so maybe we can get some actual engagement on the topic. The Cognitive Architecture thread here is a disappointment to me; it demotivates me. I am currently in a rabbit hole to do with associative memory and sparse distributed memory, but have had zero energy to post about it. I feel like my effort is wasted and that most anons won't even read it.
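To make the "write your own LLM & RAG wrapper" suggestion concrete, here is a minimal sketch. It assumes a local llama.cpp server (llama-server) is already running on its default port and exposing the OpenAI-compatible chat endpoint; the retrieval step is deliberately naive keyword overlap over a handful of notes, just to show where a proper embedding store would slot in. Python is used purely for brevity; the same HTTP call works from any language.

```python
# Minimal LLM + RAG wrapper sketch (assumes a local llama-server is already running).
# Retrieval here is naive keyword overlap; swap in embeddings or a vector DB for real use.
import requests

LLAMA_URL = "http://127.0.0.1:8080/v1/chat/completions"  # llama.cpp's OpenAI-compatible endpoint

NOTES = [
    "The waifu's battery pack is a 6S LiPo; never discharge below 19.8 V.",
    "Voice synthesis runs on the home server, not on the SBC.",
    "A souffle for two uses three eggs.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k notes sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(NOTES, key=lambda n: len(q & set(n.lower().split())), reverse=True)
    return scored[:k]

def ask(question: str) -> str:
    """Prepend the retrieved notes as context, then ask the local model."""
    context = "\n".join(retrieve(question))
    payload = {
        "messages": [
            {"role": "system", "content": f"Answer using these notes if relevant:\n{context}"},
            {"role": "user", "content": question},
        ],
        "max_tokens": 256,
    }
    r = requests.post(LLAMA_URL, json=payload, timeout=120)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("How many eggs go in the souffle?"))
```

Swapping retrieve() for an embedding search turns this into a usable RAG loop; everything else stays the same.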
>>35812
>hello
>...
>uh...
>where is th..
<hello!
basically
>>35814 Nonsense reply with some bloody robot corpse thing? I really am wasting my time by posting here -_-
>>35815 what blood ironic you think of time and still dont get it
Open file (131.78 KB 405x400 1445555776541.png)
>>35812
LLMs are only one-tenth of the equation. The other nine-tenths are python dependencies.
Really though, there are a lot of arguments against the state of LLMs, and some of us care more about certain issues than others. To start with: they're slow, imprecise, prone to hallucinating, virtually incapable of remembering anything, and, because it can't be said enough, even the smallest models are drowning in dependency hell. I'll focus on the speed aspect, since it's the most important aspect in deployment in my opinion. Most of us want our waifus to be portable, and the best way to achieve that is by making everything as small as possible, the onboard computer included. Even the slimmest LLM is going to run like dogshit on an SBC, especially one that's already juggling jobtime for everything else needed in a functioning robot.
>The Nvidia Digits computer could potentially be really good for us
People thought the same thing about LISP machines back in the 80's. Maybe this time it'll be different, but don't hold your breath. Nobody here is saying that expert systems are the best solution; we're just considering alternatives.
>>35815
What he's saying is that they're slow. This is an imageboard, expect shitposts.
Super interesting!
>>35812
>Regarding the topic of chatbots, I feel there is some ignorance in this area. I do not understand what your requirements are and why LLMs are ruled out. Your average gaming PC already can do this, and the software exists. There is no work to be done here. Set up llama.cpp and silly tavern; you can also plug in voice recognition and synthesis. Most 8B LLMs are overkill for carrying out whatever dialog you want
This is a super interesting and informative comment. I appreciate this sort of information. Thanks.
I wonder, when I see 8B LLMs or 16B LLMs, what "exactly" does that mean in terms of computing power needed??? And can computing power, for the above-sized LLMs, be related to time? Ex. if you have an 8B LLM, sure, you can use low power to get an answer in 30 minutes, but that's not really convenient. A direct question is: how many million operations per second are needed, and how much memory? Say for a one-second response time, or maybe a little longer.
A concrete example is the low-cost, very low-power chip Intel has:
Intel N100: Integer Math 16,333 MOps/sec, 6 Watts, $128.00 USD
It can use up to 16Mb of memory. Could this run a 16B LLM? Has anyone given an even super rough calculation anywhere relating LLM power to compute power? Though you can always complain about specificity, it would be useful to have some sort of yardstick, even if it were as much as 100% off.
I talked a little about computing power and links to rough calculations for robowaifu movement here, >>35813
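For a rough yardstick (this is a rule of thumb, not a benchmark): a quantized model needs roughly parameter-count × bytes-per-weight of RAM, and because every generated token streams the entire model through memory once, tokens per second is bounded by memory bandwidth divided by model size. A sketch of that arithmetic, with an assumed bandwidth figure plugged in:

```python
# Back-of-envelope LLM sizing (rule of thumb, not a benchmark):
#   RAM needed ~= parameter count x bytes per weight (plus overhead for context)
#   tokens/sec ~= memory bandwidth / model size, since each token reads every weight once
def estimate(params_billions: float, bits_per_weight: float, bandwidth_gb_s: float):
    size_gb = params_billions * bits_per_weight / 8.0   # e.g. 8B at 4-bit ~= 4 GB
    tokens_per_s = bandwidth_gb_s / size_gb             # bandwidth-bound upper bound
    return size_gb, tokens_per_s

for label, params, bits in [("2B @ Q4", 2, 4), ("8B @ Q4", 8, 4), ("16B @ Q4", 16, 4)]:
    # 38 GB/s is an assumed figure, roughly single-channel DDR5-4800 class memory
    size, tps = estimate(params, bits, bandwidth_gb_s=38.0)
    print(f"{label}: ~{size:.1f} GB RAM, ~{tps:.0f} tokens/s upper bound")
```

By that yardstick an 8B model at 4-bit needs roughly 4-5 GB of RAM and a 16B model roughly double that, so megabytes of memory are out of the question; in practice, how fast the chip can stream the weights usually matters more than its MOps/sec rating.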
>>35815
>I really am wasting my time by posting here
NO, you're not. Some people are paying close attention and REALLY value this sort of comment.
>>35820
Here's a good benchmark for a minimum to run a model https://hackaday.com/2025/01/13/modern-ai-on-vintage-hardware-llama-2-runs-on-windows-98/
>16 megabytes of RAM
That can't even run the original Half-Life, let alone an LLM
>>35812
>The Nvidia Digits computer could potentially be really good for us
Jim Keller, of AMD, Apple, Tesla, and IBM fame, who has a near-100% track record of bringing in super competitive, cost-effective processors, has a startup making AI chips. He helped design Tesla's chip, and I expect he will move very fast; whatever he comes up with will be outstanding. It won't take him too long, because things move so fast that he will have to produce something. Based on a multi-decade track record, I can say he will likely bring in something of great value. I may be wrong, but I think he said they were going to work on some processors designed for embedded work.
>>35812
That's why I proposed an "arms race". VHS vs Betamax, so to speak. Both techs have pros and cons.
Open file (63.29 KB 491x600 MahoroLaptop.jpg)
>>35812
>Good timeline
Our timeline has infinite permutations that could be better. "Good" implies superiority over a "bad" reference. Though I could surmise worse timelines than what we have, I'd say our timeline is "neutral."
>LLM leaks and Chinese AI
It remains to be seen how beneficial these will be for us on the low end. I agree that we should have more positive buzz around them. They are a great sign that there is still hope for AI to not only be wielded by the wealthy against us. We must also remember that ClosedAI and their ilk still have the best AI. They are also openly for centralization of power. They clearly seek to exert as much control as possible. A balance of optimism with acknowledgement of the looming threats is where I'm at.
>Ngreedia helping at all
They only care about profits. They will only help us if it gains them wealth. Assume no altruism, only deception to gain our capital. If they were altruistic, they'd make FOSS toolchains to enable third parties to leverage CUDA in alternative hardware. CUDA will never run with NVIDIA drivers on Radeon because of profit motives.
>Chatbots
No one is ruling out LLMs; we are merely trying to entertain the thought of alternatives, because LLMs are the default. There is no anti-LLM sentiment, only curiosity to see where other paths may lead. As for why, it mostly stems from the fact that a waifu requires ultra-low-power architectures to exist, given current limitations on batteries, actuators, processors, and materials. We lack the luxury of Tesla to build hulking behemoths of specialized materials and intricate manufacturing. We're building a robot for the common man.
>LLMs vs regex
LLMs are just neural auto-complete for strings. Everything they do can be replicated exactly with a complex enough regex system. Though, an LLM of similar capability will be smaller and potentially easier to make, depending on what you're trying to make. Do not mistake me, I have always believed in the potential for LLMs to provide a "persona." I'd rather every other aspect have a dedicated program, for the sake of efficiency and avoiding hallucination when asking her how many eggs to put in a souffle. Besides, the size advantage doesn't matter when a TB SSD can be had for less than 50 USD.
>>35819
I agree with you, though I do have a minor quibble.
>Virtually incapable of remembering anything
A pure LLM is fundamentally incapable of remembering anything. They are essentially pachinko machines for finding the next string in text inputs. You're knowledgeable, so I assume you're factoring in various add-ons that enable pseudo-memory in LLM-based systems. I'm only nitpicky because I care.
>>35820
>Intel N100
People have run LLMs on them. https://www.youtube.com/watch?v=gNwT_8QvQ7M
>>35815 This board works relatively slow but steady. Like the tortoise.
>>35834 >I assume you’re factoring in various add-ons that enable pseudo memory Exactly what I was thinking of. I said "virtually" because I thought that may come up. Though, from what I recall, those addons don't even work that well.
>>35820
I have an N100 mini PC, so I just compiled llama.cpp and ran a few tests for you.
llama-2-7b-chat.Q4_K_M.gguf ran at around 6.1 tokens per second.
llama-2-7b-chat.Q8_0.gguf ran at 3.8 tokens per second.
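For anyone who wants to repeat a measurement like this on their own hardware, here is a sketch using the llama-cpp-python bindings (yes, Python; the hardened can get equivalent numbers from llama.cpp's own example binaries). The model path is just a placeholder; point it at whatever GGUF you actually have, and expect the numbers to move with quantization, thread count, and context length.

```python
# Rough tokens/sec measurement sketch using llama-cpp-python (pip install llama-cpp-python).
# The model path below is only an example; substitute your own GGUF file.
import time
from llama_cpp import Llama

llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf", n_ctx=2048, n_threads=4, verbose=False)

start = time.time()
out = llm("Explain what a robowaifu is in one paragraph.", max_tokens=128)
elapsed = time.time() - start

generated = out["usage"]["completion_tokens"]  # tokens actually produced this run
print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.1f} tokens/s")
```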
>>35820 >>35840 I have Jan installed, so I ran a 2B model (heh), since that was the largest model my Microsoft Surface 2020 could run. It was pretty coherent. So I think a 2-2.7B model is more than enough.
>>35841 Gemma 1.1 2B Q4
>>35834
nvidia is literally an evil company run by a psychopath, they do everything to gimp users, they even put poison mushrooms in their firmware so that it looks like it works when it doesn't, like on boot the card is downclocked and runs like shit, you couldn't change it even if you sent the exact same reclocking instruction as their official driver cuz you had to first turn the card on and off in a specific pattern like morse code to unlock it, and they lie, 'we didnt know, we dont document anything lol' is their bullshit excuse every time they get caught
Open file (211.16 KB 1258x777 moe_glados.jpg)
>>35819
>The other nine-tenths are python dependencies.
I feel you, python is a cancer
>virtually incapable of remembering anything
They can't remember anything. During inference the weights of the LLM are frozen, and the LLM can only take in what's in its context window, so it's best to think of an LLM as a pure function with no state (see the sketch after this post).
>Most of us want our waifus to be portable, and the best way to achieve that is by making everything as small as possible, the onboard computer included.
>Even the slimmest LLM is going to run like dogshit on an SBC, especially one that's already juggling jobtime for everything else needed in a functioning robot.
That is true. I have been approaching this from the "home lab" perspective, where I imagine a home server is going to be involved. But that does not rule out small ones; the N100 runs Llama3.2_1B-Q8_0.gguf at 20.52 tokens/s. Don't worry, I am avoiding python; so far I only have D code with C & C++ dependencies.
>Nobody here is saying that expert systems are the best solution, we're just considering alternatives.
Fair, I may have overreacted.
>>35825
I hope it makes it to market!
>>35833
Well, it's something I am hoping to avoid. I don't want history to repeat itself; AI has already been down the pure expert system rabbit hole, and we are all familiar with the problems of LLMs (and deep learning). I think we need to look elsewhere and start thinking about hybrid systems.
>>35843 >>35834
I hate Nvidia too. They are not going to do this out of kindness; they're going to do it because Apple is becoming a viable alternative to Nvidia hardware for LLM inference. They want people using CUDA, not MLX or ROCm.
>>35834
>LLMs are just neural auto-complete for strings. Everything they do can be replicated exactly with a complex enough regex system. Though, an LLM of similar capability will be smaller and potentially easier to make, depending on what you're trying to make. Do not mistake me, I have always believed in the potential for LLMs to provide a "persona." I'd rather every other aspect have a dedicated program, for the sake of efficiency and avoiding hallucination when asking her how many eggs to put in a souffle. Besides, the size advantage doesn't matter when a TB SSD can be had for less than 50 USD.
Trading space to avoid compute makes a lot of sense; storage is cheap, so I will give you that! I don't disagree, hell, I am of the opinion that using a raw LLM is just asking for trouble. It's strange: LLMs are both over- and under-hyped. Their abilities are over-hyped when used directly, but at the same time people underestimate their power in the context of a greater system. Language by its nature describes a lot of ideas and aspects of the world, so a language model is a crappy but useful world model by proxy. The conclusion I have come to is that a non-trivial portion of the cognitive architecture should be symbolic, but aided by neural elements. This is not the thread to go into depth on how my system works, but it's a database-driven program synthesis approach. Aspects of it are inspired by this paper: https://arxiv.org/pdf/2405.06907v1
>>35814
I did not get the joke. (my bad)
>>35819
>What he's saying is that they're slow.
Thank you, I am apparently retarded. It makes sense now.
Thank you Grommet, GreerTech, Kiwi, Greentext anon & other anons, you guys cheered me up, I feel a lot better now.
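To make the "pure function with no state" point concrete, here is a tiny sketch (again assuming a local llama-server with the OpenAI-compatible endpoint, as in the earlier sketch): the model keeps nothing between calls, so the wrapper has to re-send the whole conversation every turn, and "memory" is just whatever the caller chooses to stuff back into the context window.

```python
# Sketch: the model itself is stateless; the caller owns all the "memory".
# Every turn re-sends the entire history, because nothing persists inside the LLM.
import requests

LLAMA_URL = "http://127.0.0.1:8080/v1/chat/completions"  # assumed local llama-server
history = [{"role": "system", "content": "You are a helpful robowaifu."}]

def chat(user_msg: str) -> str:
    history.append({"role": "user", "content": user_msg})
    r = requests.post(LLAMA_URL, json={"messages": history, "max_tokens": 200}, timeout=120)
    r.raise_for_status()
    reply = r.json()["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})  # the only "memory" there is
    return reply

print(chat("My name is Anon."))
print(chat("What is my name?"))  # only answerable because the history list was re-sent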
>>35845 You're welcome :) You're right, hybrid systems may be the way to go.
Open file (774.71 KB 2400x3412 WittyAndCharming.jpg)
>>35834
>I have always believed in the potential for LLM to provide a "persona."
It's funny, but that's something I have come to think is not the case, over time. My feeling is that LLMs don't do a good job of modeling personalities; a personality implies a distinct world view and set of experiences. LLMs are a statistical model of the entire internet; with fine-tuning you can shift it a bit, but you absolutely still feel it. It's why I don't role-play or use SillyTavern, and why I am motivated to improve the AI side of things. I want something actually Witty and Charming.
>>35848 >LLMs are statistical model of the entire internet tbh ive realized most people are just that anyways, creeps me out, true philosophical zombies
Open file (1.04 MB 800x933 CuteAtri.png)
Open file (95.87 KB 600x847 MinaByTezukaPro.jpg)
>>35848
>Something actually witty and charming
I understand, I'd want my waifu to be somewhat like Atri. Our tech is limited to Mina, which is something we can get used to. There's still a certain charm in imperfection. Tomorrow, or a few tomorrows from now, AI will change. LCM has tremendous potential, but it's too early to know if it'll have any meaning for us. https://github.com/facebookresearch/large_concept_model
>>35854 Mina is cute. The manga and anime were a big inspiration.
>>35776
>US$200.00 / month!? Lol, go F&*$$@K yourselves, ClosedAI!! :DDD
Some people, like Dave Shapiro, said they want access to better plans. It shouldn't just be for big companies and special partners. So, this is actually a friendly offer towards the "little people" (with enough money). They also claim to still lose money on this deal, btw. Let's not be too negative.
>>35812
>I am honestly a little baffled by the current takes here.
Most people here don't follow the AI news. Especially not all of it. It's just way too much. Aside from open source, there are also a lot of great tools, and every day something is happening.
>We are in the good timeline.
Yes. The Biden admin still thought they would be in control of AI. They even told some investors not to make this into something for venture capital, because the government would control the few companies allowed to operate in that sector and would keep the math behind it top secret.
>When talking to the RWKV folks, they mentioned an interesting observation from resetting the attention while keeping the feed forward part. ...
>Tulu3
>DisTrO
Hmm, sounds like really good news indeed. I completely missed all of it. I knew about the Nvidia machine, but forgot to mention it here. Thanks for the update and the dose of optimism.
>There is nothing glorious about AIML; it will not save you. I was there when it was the way of doing chatbots, and there is no future there. If you are not happy with LLMs, what makes you think a pile of what's basically fancy regex will be better?
To me it's not about the one or the other. Getting some responses as fast as possible is very important.
>Cognitive Architecture
>only people still giving this any serious thought and work are me and CyberPonk
I did. But I kinda dropped out. I also often didn't make notes of all the ideas I had; others are just mentioned here and there on the board, in the form of bookmarks or in my notes.
>The Cognitive Architecture thread here is a disappointment to me; it demotivates me. I am currently in a rabbit hole to do with associative memory and sparse distributed memory, but have had zero energy to post about it. I feel like my effort is wasted and that most anons won't even read it.
Well, sorry. I hope I will get my stuff together and finally work on it myself. But imo it's about a very modular system anyways. No reason to wait on anyone.
>Maybe I should start my own thread and basically blog about my cognitive architecture adventure
Yes. The thread was meant as a general, imo. If you're building something, it's probably better to make a separate one.
>>35845
>they're going to do it because Apple is becoming a viable alternative to Nvidia hardware for LLM inference. They want people using CUDA, not MLX or ROCm.
Oh, yeah, I forgot: the TinyGrad guy, Hotz, announced that he can run models on AMD GPUs, and they will try to make the XTX faster than the 4090 using their system. Apparently soon.
Great thread and helped a ton. I really won't be satisfied until my robowaifu is running on RISC-V behind a double air gapped system under an EMF tent at the very least. I may let her out of her cave occasionally to infer on the outside, but only with guardrails. Joking, but I do sorta see it as a problem of universals and going back to Plato vs Aristotle of course and then it's just ratios like many hybrid models and 3rd ways
>>35824
>good benchmark for a minimum to run a model
Thanks!
>16 megabytes of RAM That can't even run the original Half-Life, let alone an LLM
Sigh... I see computing power as a big hurdle. Though I do want to make it known that I'm not asking for a fully conversant-level bot. I want it to follow me around when I ask, carry stuff, maybe teach it to steer a boat like autopilot.
I read the article after I commented. It was actually fairly hopeful.
I wonder, quite a bit, if you could not start out with a big AI, but instead of running it all in RAM, run it from an SSD drive plus RAM. Have it swap in "modes", and then have it learn that which is useful to you and drop a lot of the rest, over time keeping the most useful functions in RAM and the rest packed away on the SSD. Possibly use verbal key words, like "clean this sink", and that would page the clean-the-sink mode into RAM. SSDs are really rather fast. A lot of motherboards that are reasonable have 128MB of RAM and are not too outrageously priced. Maybe 16MB would be fine for voice recognition, and the rest used for swapped-in functions.
Now I'm just very unintelligently guessing here, but... I think one of the problems is that a lot of AIs are kitchen-sink models. They want everything in them. But mostly people do not need this. JAVA programming or quantum theory is of no use to a waifu mopping the floor and performing more amorous duties.
An uninformed plan of action: first, get a large AI to work on an SSD. A base-level AI, in RAM, would be verbal recognition with key words for programming. Much like you talk to a 2 year old: no, yes, stop, move here, stay there, simple stuff. Plus motion and obstacle recognition so it could walk around and not bump into things. After that, the robowaifu trains the RAM AI with functions from the SSD. So likely it would be slow as can be at first, until whatever functions were needed got integrated into the RAM AI from the SSD.
I don't have the slightest idea how to do this. It seems odd to me that people making smaller AIs are not trying to put stuff on SSDs and paging in functions. It's the same idea as "society of mind", I think I have that correct, where a lot of little functions combine into a simulated whole.
Some encouraging comments made here are: low power training >>32927 , a company using proprietary AI for call centers >>32577 , I think this is what they are using >>32578 , and, really exciting to me, "control vectors" >>31242
Could control vectors be used to trim a full AI to fit into a smaller RAM on a PC? Using SSDs and trimming a large AI would also be a build-as-you-go feature. As processor and memory prices come down, they could be swapped in and your robowaifu would get smarter and more capable.
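Nobody has built this, but purely as an illustration of the keyword-paged "modes" idea above (every name, path, and number here is invented for the sketch): a small always-resident listener matches a spoken keyword, loads only that task's module from the SSD, and drops it again when the task changes.

```python
# Hypothetical sketch of keyword-triggered "mode" paging from SSD (illustrative only).
# The mode names and loader are invented; a real version might mmap GGUF task adapters.
import gc

MODES = {                      # spoken keyword -> model file on the SSD (hypothetical paths)
    "sink":  "modes/clean_sink.gguf",
    "boat":  "modes/steer_boat.gguf",
    "carry": "modes/carry_stuff.gguf",
}

current_name, current_module = None, None

def load_mode(path: str) -> bytes:
    """Placeholder loader; a real version would read the weights file from the SSD.
    Stubbed out here so the sketch runs without any files present."""
    return b"\x00" * 1024  # pretend weights

def handle_command(utterance: str) -> str:
    global current_name, current_module
    for keyword, path in MODES.items():
        if keyword in utterance.lower():
            if keyword != current_name:      # swap only when the task actually changes
                current_module = None
                gc.collect()                 # free the old mode's RAM before loading the next
                current_module = load_mode(path)
                current_name = keyword
            return f"Running '{keyword}' mode ({len(current_module)} bytes resident)"
    return "No matching mode; staying in the small always-resident base model."

print(handle_command("please clean this sink"))
print(handle_command("take the helm of the boat"))
```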
>>35840 >6.1 tokens per second So that's basically 6 words per second? If so, that's not so bad. With a typical desktop, more RAM, that could be good enough. Especially for the limited "talk to it like a three year old" that I'm counting on. Encouraging, I think, if I'm not totally missing the point which could be likely.
>>35841 >2B model Thanks a heap to both of you. Much appreciated.
>>35845
>They can't remember anything, during inference the weights of the LLM are frozen
but... and this is, to me, key: can whatever they do interpret, however small, be used to run a longer routine???
>>35848 >LLMs are statistical model of the entire internet https://i5.walmartimages.com/asr/e35138f9-d4a5-418a-a1b7-4a769c0dde70_1.0d9844a5130e0e6c1c0fb7619aff2096.jpeg It burns. It burns, I'm melting!!!!
>>35812
>Cognitive Architecture
>only people still giving this any serious thought and work are me and CyberPonk
Have you looked at these guys' paper, linked in this post? Maybe you already know about it, but in case you don't: they apparently use far less power and need fewer resources. >>32577
>===
-patch crosslink
Edited last time by Chobitsu on 01/20/2025 (Mon) 04:07:03.
Open file (1.22 MB 1138x833 boatloan.png)
>>35864 >maybe teach it to steer a boat like autopilot.
>>35874
Presumably, the same brain that would handle language would also manage their navigation in 3-D environments, among other related problem-solving tasks. It's one of those situations where scale covers all the issues. If you only have enough RAM for the bot to handle Roomba-style navigation and a cached library of pre-recorded responses, you have to work within those confines. If you have a sophisticated enough CPU and enough RAM for a bot to put away dishes and operate a 3-D printer to make replacement parts for itself, running a rudimentary LLM that also doubles as a prompt processor for tasks within its environment would be a cakewalk (conceivably). Frame-of-reference-wise, it'd be like designing a car that uses Flintstones-style braking with your feet when it has a hydrogen-powered fusion engine. You may as well have a high-end LLM running in it if it can do everything else, since you obviously have the computing power to handle that if it was already doing everything else.
Open file (150.15 KB 1920x1080 LLMExpert.jpeg)
Can anyone elaborate on this? Is AGI actually near? I think he's falling into the anthropomorphism trap, assuming AI can scale and progress in ways similar to a person. But I want to be wrong. https://www.youtube.com/watch?v=-J9xJDS1T7k
>>35900 theres still an unclaimed $1m prize for solving the hodge conjecture, so as long as that exists there obviously cant be a super intelligence
Open file (3.72 MB 1280x720 use_cabinets.mp4)
Open file (4.01 MB 1280x720 wash_pan.mp4)
>>35877
>Brain
Robots have computers, not brains. They function in fundamentally different ways and are in no way analogous to each other. This is a common mistake I was also guilty of before. It seems to make sense intuitively, because both can be used to do similar things. The problem is that computers do things like identifying objects or navigation in ways that are completely different from how our brains do them. I personally had difficulty understanding how different both are at first. It's a common misunderstanding, but it's important to understand the difference. Otherwise, you can make all kinds of bad assumptions.
>Sophisticated system that can put away dishes and operate 3D printer should run an LLM easily
I understand why you'd think that. It's important to understand that no robot can put dishes away outside of very specific environments with training. It's a deceptively complex task. It's easy for us but difficult for robots. This is because brains are good at being intuitive and handling nuance; computers are better with math but can't handle nuance. Operating a 3D printer would also be difficult for a robot. Unless you meant that it sends a file to the printer when it senses a part is broken, then signals you to fix her. That would be easy for a robot. Pushing buttons and flipping switches can be surprisingly difficult for robots. Alignment, force calculations, and timing everything just right come naturally to us. For robots, it takes heaps of training with trial and error to fine-tune her to get it right. I work with robot arms and conveyor belts; they're finicky, and I need to be very careful to give them specific instructions, which takes time, and that's under the best conditions with everything bolted to solid concrete. Mobile robots are far more complex. You have to worry about the nuances of kinematic chains that have to deal with inertial loading induced by anything they touch. AI can certainly help, but it's still hard and time consuming to get things right. An AI that could figure out how to do the dishes on its own would be a major breakthrough. If you're interested in these super complex tasks, you should look into Mobile ALOHA, the closest thing we've got to what you're describing. https://mobile-aloha.github.io/
>>35903 >the "brain" I was using the term euphemistically, when I said brain, I was talking about the CPU & RAM (there's other resources as well but those are the big ones for AI). Based on Grommet's concept, of >I want it to follow me around when I ask, carry stuff, maybe teach it to steer a boat like autopilot. I was operating under the assumption the resources available would allow for it to do those things. Assuming that is the case, tacking on a half-decent LLM wouldn't be a big ask. Some of the models use as little as 16 gigs which will be standard in the generation of consumer-grade gaming GPUs. In this future scenario, presumably, there'd be plenty of resources on hand to handle 3-D navigation as well as an onboard LLM. >It's important to understand that no robot can put dishes away outside of within very specific environments with training. I would agree, again, I was operating under the assumption we already had a bot that had been trained off real-world data and was smart enough to figure the rest out as it went. The roundabout point I was trying to make was that if we had a bot capable of those kinds of tasks, you may as well slot in a decent LLM since you'd have to have a powerful CPU/RAM to do that stuff anyway. My sentiment on the tech currently available is more grounded as noted here >>35566 >Operating a 3D printer would also be difficult for a robot. Unless you meant that it sends a file to the printer when it senses a part is broken; then signals you to fix her. There are a number of ways to handle and explore that. If the bot detected an issue with its body it could transmit a signal to a 3-D printer to print the appropriate part for replacement. Assuming it's really sophisticated, it could extract said part from the printer and repair itself. We're nowhere near that level of sophistication but it's not outside the realm of possibility in the future. >An AI that could figure out how to do the dishes on its own would be a major breakthrough. I'd discussed training with some other AI folks some time ago. I felt like telemetry data a robot could get from interacting in the real world would be better than simulated data but simulated data isn't bad for training. A blend of both with a "brain" that was good at learning.
Thanks for all your inputs ITT, Grommet. >>35869 Lol. :D
>>35877 >>35910
I both like the way you think ahead, Anon, and I agree with you that we should have ample capacity for multitasking many different things at once.
>>35900
>Can anyone elaborate on this? Is AGI actually near?
The """marketing hype""" version? Sure, we're already there in fact. As peteblank pointed out, it's their term for 'how many high-dollar engineers & others can we replace with this?' So yeah. As far as the philosophical A[G/S]I? In my studied opinion it isn't going to happen by any natural, human means, Brother. :^)
>>35903
Excellent post, Kiwi! It's important for all of us here on /robowaifu/ (of all Anons) to stay grounded in reality concerning the exceptional difficulties we face engineering even the seemingly most mundane of tasks for our robowaifus to accomplish.
<--->
Cheers, Anons! :^)
>===
-prose edit
Edited last time by Chobitsu on 01/20/2025 (Mon) 09:59:41.
>>35812
>I am honestly a little baffled by the current takes here. Maybe I am simply stupid and not understanding something. Please do correct me if that is the case, but we seem to be living in two different realities.
>We are in the good timeline. Global Homo has lost control over LLMs. Originally, it was too dangerous to release the GPT2 weights; the plan was to forever keep them in the labs and in the cloud. Then, the LLaMA 1 leak opened the flood gates and created the open weights era.
Yeah, I found the choice an odd one on Mark Zuckerberg's part (I feel confident it didn't happen w/o at the least his tacit approval). Most likely he felt it would "kickstart" their own lagging efforts in the area -- and possibly he even had a moment of humanity grip him. Who knows?
>As time goes on, their grip loosens further. The Chinese are giving us better open weights models, and fully open source models are getting better too.
The baste Chinese will be the key here going forward, IMO.
>Before December, there was good reason to be pessimistic about any transformer alternative in the LLM space.
>In December, we had a really interesting development. The RWKV team figured out how to convert an existing transformer model into a RWKV one by replacing the attention heads and then healing the model. This means RWKV can benefit from existing massive LLM pre-trains.
So, *[checks calendar]* about one month's time? You'll please pardon the rest of us focusing on other aspects of robowaifu development here if we somehow missed the memo! :D Of course, we're all very grateful to hear this news, but please be a bit more understanding of all our differing priorities here on /robowaifu/ , Anon.
>tl;dr
Making robowaifus is a big, big job. It will take a team of us focusing on several differing niches & all cooperating together to pull this off well. And years of time.
>Even on the hardware front, there is hope.
>The Nvidia Digits computer could potentially be really good for us.
>I do think unified memory is the future (just look at Apple's M series chips).
Yes indeedy! While only very begrudgingly acknowledging the potential of Digits, I do so. If we can just do away with proprietary frameworks (such as CUDA) and use ISO standards such as C & C++, then all the better. Unified memory architectures could potentially take us far down this road (though I trust Apple even less).
>Regarding the topic of chatbots, I feel there is some ignorance in this area.
No doubt! Please help us all stay informed here, EnvelopingTwilight. Unironically so. :^)
>I do not understand what your requirements are and why LLMs are ruled out.
I don't think anyone is saying that. I personally feel we must have alternatives in the mix. Simple as.
>Set up llama.cpp and silly tavern; you can also plug in voice recognition and synthesis. Most 8B LLMs are overkill for carrying out whatever dialog you want; there are plenty of finetunes out there.
Thanks for the concrete advice, Anon. Seriously, I wish you would create a tutorial for all of us newfags. One that didn't immediately send us into a tailspin over Python dependency-hell would actually be very nice! :^)
>If silly tavern is not your style, then just write your own LLM & RAG wrapper; it's not hard.
Maybe not for you! For those of us who haven't even gotten our first local chatbot running, even the basic idea itself seems rather intimidating.
>There is nothing glorious about AIML; it will not save you.
>I was there when it was the way of doing chatbots, and there is no future there.
>If you are not happy with LLMs, what makes you think a pile of what's basically fancy regex will be better?
That's a pretty niggerpilled take IMO, Anon. I'm not even claiming you're wrong; rather that all possible efforts in this (or any other) robowaifu research arena simply haven't been exhausted yet. As you're probably well aware of the maxim:
>"Correlation does not imply causation"
then hopefully the idea that we may yet have a breakthrough with these expert-system approaches seems a vague possibility to you. I anticipate the successes will begin coming to us here once we mix together at least three differing approaches when devising our robowaifus' 'minds'. Regardless, we simply must have a 'low-spec' solution that actually works on realworld robowaifu platforms in the end (ie, lowend SBCs & MCUs). Tangential to this requirement (and repeating myself once again): if we want private, safe, and secure robowaifus, we cannot be dependent on the current cloud-only runtime approaches. ( >>35776 ) This is of utmost importance to men around the world.
>Now, having said that, I recommend anons start researching and building proper cognitive architectures.
>[...] there is a lot of work and exploration that needs to be done.
The singularly-hardest part of all this waifu research has yet to even be conceived of properly -- much less solved. So yeah, no surprises that we haven't had a major breakthrough yet. But I do expect some strokes of genius will appear regarding this effort one way or another. Keep.moving.forward. :^)
>The Cognitive Architecture thread here is a disappointment to me; it demotivates me.
Hide the thread.
>Maybe I should start my own thread and basically blog about my cognitive architecture adventure, so maybe we can get some actual engagement on the topic. I am currently in a rabbit hole to do with associative memory and sparse distributed memory
THIS. Yes! Please do create your own blogposting thread, Anon. It may help you to feel better, and it certainly could help the rest of us along this very windy, mountainous pathway ahead of us here. Please do it today!
>I feel like my effort is wasted and that most anons won't even read it.
As to your time spent, you alone can decide. I assure you that your posts are read here on /robowaifu/ .
<--->
Please stay encouraged, Anon. We all need each other here. TWAGMI
>===
-fmt, prose edit
Edited last time by Chobitsu on 01/20/2025 (Mon) 17:54:03.
Open file (349.08 KB 3821x2793 DeepSeekR1Chart.png)
Open file (340.23 KB 2897x1390 DeepSeekR1Bench.jpeg)
DeepSeek R1 looks to have heaps of potential for us. Incredible performance per unit of compute, according to them. It appears to be the beginning of a whole new generation of LLM-based intelligence for normal people. What are your thoughts? What would you do with it? I consider this an important stepping stone for getting good-enough models into a waifu. https://github.com/deepseek-ai/DeepSeek-R1
>>35925
Great news, Kiwi! I saw some information about its ability to reparse its own initial outputs, via highly-abstract associations, into an even more sound argument. What a time to be alive! :D
---
>related:
https://livebench.ai/#/
https://huggingface.co/deepseek-ai/DeepSeek-R1
>===
-add hotlinks
Edited last time by Chobitsu on 01/21/2025 (Tue) 12:55:15.
>>35922 >"... Mark Zuckerberg's part...and possibly he even had a moment of humanity grip him..." HHAHAHAHHAAA Your one hell of jokester Chobitsu.
>>35932 Kek. <insert: "It's about sending a message..."> :DD
>>35925 >What would you do with it? I have a few ideas <think> Okay, the user greeted me with "Hi there! How are you today?" I need to respond as Kim, an 18-year-old female robot who's loving, caring, uses foul language, and is very horny. The response should be short and conversational. So, first, acknowledging their greeting. Maybe start with a friendly phrase like "Hey!" Then express how I'm feeling. Since I'm supposed to be horny, maybe use a term like "horny as hell." Adding an emoji could make it more engaging, like a wink or something playful. I should keep the language casual and not too formal. Using slang or colloquial terms would fit better. Maybe add something about how they're doing too, to keep the conversation flowing. So putting it all together: "Hey! I'm good, just horny as hell over here 😏 How 'bout you?" That seems to cover everything: friendly greeting, expressing my state, using appropriate language, and asking about them. </think> Hey! I'm good, just horny as hell over here 😏 How 'bout you?
Funposting ( >>35942 ) aside, this appears to be actual news that bears on robowaifus perhaps. Apparently Trump has already rolled back the AI restrictions. https://www.msn.com/en-us/news/politics/president-trump-repeals-biden-s-ai-executive-order/ar-AA1xy0U0 https://www.channelnewsasia.com/business/trump-revokes-biden-executive-order-addressing-ai-risks-4885706 https://www.newsweek.com/trump-biden-ai-safety-policy-talks-1989137 >=== -rm already-404'd (lol) WSJ/Barron's hotlink -add add'l hotlink
Edited last time by Chobitsu on 01/21/2025 (Tue) 05:47:04.
>>35922
>Thanks for the concrete advice, Anon. Seriously, I wish you would create a tutorial for all of us newfags. One that didn't immediately send us into a tailspin over Python dependency-hell would actually be very nice! :^)
That's exactly what's needed for widespread adoption: a detailed tutorial, and eventually a complete software package, that can be run simply on a computer like any commercial or pseudo-commercial software.
TL;DR: For widespread adoption, we need a semi-capitalist mindset and to think of the average consumer.
