/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality.



General Robotics/A.I. News & Commentary #2 Robowaifu Technician 06/17/2022 (Fri) 19:03:55 No.16732
Anything in general related to the Robotics or A.I. industries, and any social or economic issues surrounding it (especially of robowaifus). === -note: I'll plan to update this OP text at some point to improve things a bit. -previous threads: > #1 (>>404)
>>16752 Don't waste your time on watching this. It was hard: https://www.youtube.com/watch?v=NAWKhmr2VYE
>>16753 ew, absolutely horrifying post-wall face!
If nobody is going to post about it, I will. Two very impressive papers came out. The first one is DeepMind's breakthrough work on RL exploration via a novel, simple and powerful self-supervised learning objective, which finally conquered Montezuma's Revenge (!) and most DM-HARD-8 tasks. The second one is an academic tour-de-force devising a novel scheme for training a CLIP-like contrastive semantic model as a sufficient surrogate reward for training an agent which passably executes some tasks in the Minecraft environment. This is a way forward for training from human-generated YouTube tutorials. Both of these works are significant and can be applied to our cause, although they require moderately large compute (large by the standards of an amateur, moderate by the standards of a good US org). At the very least, agents trained via these objectives could be used as dataset generators for our would-be agent. If we are to use these innovations for our projects, we need to start a semi-closed community to test approaches to distributed computation and to guide the effort of recruiting volunteers into the computation graph. 1. BYOL-Explore https://www.semanticscholar.org/paper/BYOL-Explore%3A-Exploration-by-Bootstrapped-Guo-Thakoor/54d1fcc284166e7bbd5d66675b80da19714f22b4 >We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually-complex environments. BYOL-Explore learns a world representation, the world dynamics, and an exploration policy altogether by optimizing a single prediction loss in the latent space with no additional auxiliary objective. We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark with visually-rich 3-D environments. On this benchmark, we solve the majority of the tasks purely through augmenting the extrinsic reward with BYOL-Explore’s intrinsic reward, whereas prior work could only get off the ground with human demonstrations. 
As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler design than other competitive agents. 2. MineDojo https://www.semanticscholar.org/paper/MineDojo%3A-Building-Open-Ended-Embodied-Agents-with-Fan-Wang/eb3f08476215ee730d44606b96d1e24d14f05c1d >Autonomous agents have made great strides in specialist domains like Atari games and Go. However, they typically learn tabula rasa in isolated environments with limited and manually conceived objectives, thus failing to generalize across a wide spectrum of tasks and capabilities. Inspired by how humans continually learn and adapt in the open world, we advocate a trinity of ingredients for building generalist agents: 1) an environment that supports a multitude of tasks and goals, 2) a large-scale database of multimodal knowledge, and 3) a flexible and scalable agent architecture. We introduce MINEDOJO, a new framework built on the popular Minecraft game that features a simulation suite with thousands of diverse open-ended tasks and an internet-scale knowledge base with Minecraft videos, tutorials, wiki pages, and forum discussions. Using MINEDOJO’s data, we propose a novel agent learning algorithm that leverages large pre-trained video-language models as a learned reward function. Our agent is able to solve a variety of open-ended tasks specified in free-form language without any manually designed dense shaping reward. We open-source the simulation suite and knowledge bases (https://minedojo.org) to promote research towards the goal of generally capable embodied agents.
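The core of the curiosity-driven objective described in those abstracts can be sketched in a few lines: the world model predicts the next latent state, and its squared prediction error becomes an intrinsic reward added to the task reward. This is only a loose conceptual sketch of the idea, not DeepMind's actual implementation (function names and the `beta` scaling are my own):

```python
def intrinsic_reward(predicted_latent, observed_latent):
    """Curiosity bonus: the world model's squared prediction error
    in latent space. Novel states are poorly predicted, so they
    yield a larger bonus and attract the exploration policy."""
    return sum((p - o) ** 2 for p, o in zip(predicted_latent, observed_latent))

def shaped_reward(extrinsic, predicted_latent, observed_latent, beta=0.1):
    """Total reward = task reward + scaled curiosity bonus,
    the 'augmenting the extrinsic reward' scheme the abstract describes."""
    return extrinsic + beta * intrinsic_reward(predicted_latent, observed_latent)

# A familiar state (well predicted) earns a smaller bonus than a novel one.
familiar = shaped_reward(1.0, [0.5, 0.5], [0.52, 0.48])
novel = shaped_reward(1.0, [0.5, 0.5], [0.9, -0.3])
```

In a real agent the latents come from the learned representation and the bonus is annealed, but the shaping structure is the same.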
Open file (298.44 KB 1377x515 Screenshot_4.png)
Many of you may have noticed a week of (((OpenAI's))) GPT-3 being shilled on 4chan's /pol/. Today, 22.06.2022, the neural network started giving pilpul, i.e. passive-aggressive mental gymnastics, avoiding facts, etc. > in two words - another nn was neutered Expect an article from OpenAI about "how evil racists tried to ruin GPT-3"
By this point it should be obvious that large generative multimodal models are here to stay. The experiment shows us that 20 billion parameters is enough for implementing quite fine, abstract artistic ability. 3 billion is enough for less abstract prompting. You could likely run this model on an RTX 3090, if you optimized it for inference. Of course they won't give you the weights, which is why a group of people needs either to pool funds and train their own model, or to train it in a distributed manner, which is harder.
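Whether a 20B-parameter model fits on an RTX 3090 is simple arithmetic: 24 GB of VRAM against parameter count times bytes per parameter. A quick back-of-the-envelope check (weights only; activations and caches need extra room, so this is an optimistic floor):

```python
def model_vram_gb(n_params, bytes_per_param):
    """Rough weight-only memory footprint in GB. Real inference needs
    additional space for activations, so treat this as a lower bound."""
    return n_params * bytes_per_param / 1e9

# 20B parameters at fp16 vs. int8-quantized:
fp16 = model_vram_gb(20e9, 2)   # 40.0 GB -- does not fit on a 24 GB card
int8 = model_vram_gb(20e9, 1)   # 20.0 GB -- fits, with headroom to spare
```

So "optimized for inference" realistically means quantization (or offloading layers to CPU RAM), not just loading the checkpoint as-is.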
>>16775 >>16779 This is very good to see. I'm glad we're seeing all of this progress, and we might be able to implement some of it in our future robowaifus, so they can create interesting dishes and even imagine their own stories or become hobby artists in their free time. >>16775 > If we are to use these innovations for our projects, we need to start a semi-closed community to test approaches to distributed computation and to guide the effort of recruiting volunteers into the computation graph. I generally think it's a good idea for sub-projects of the bigger robowaifu project to look for people outside of this small group here. Our project seems to only appeal to a minority for now. One could look for an angle for how a part of it could be used for something else, pitch it to people interested in that, and then come back with the result.
>>16737 No you're fine. It was my fault Meta Ronin.
Open file (370.25 KB 1089x871 Screenshot_4.png)
> yandex released YaLM-100B a RU/ENG Language Model > trained on russian/english languages on ru supercomputers > The model leverages 100 billion parameters. It took 65 days to train the model on a cluster of 800 A100 graphics cards and 1.7 TB of online texts, books, and countless other sources in both English and Russian. It's opensourced! https://github.com/yandex/YaLM-100B
>>16779 This guy here talks about AGI and how it's not a thing: https://www.youtube.com/watch?v=kWsHS7tXjSU >Blake Richards is an Assistant Professor in the Montreal Neurological Institute and the School of Computer Science at McGill University and a Core Faculty Member at MiLA. He thinks that AGI is not a coherent concept, which is why he ended up on a recent AGI political compass meme. When people asked on Twitter who was the edgiest person at MiLA, his name actually got more likes than Ethan, so hopefully, this podcast will help re-establish the truth. I discovered the term HLAI recently, with the distinction from AGI being that AGI would be one system doing everything humans could do, while HLAI would be more like a human-like AI. I think it's an interesting distinction. I also like the podcast "The Inside View", where this guy was invited. It seems to try to give an understandable overview of the different ideas and anticipations regarding AI in the near future. https://www.youtube.com/c/TheInsideView
Maybe a bit OT, but just in case someone cares about "the Metaverse", maybe for virtual waifus or so: Neal Stephenson wants to create his own version: https://decrypt.co/102646/snow-crash-author-neal-stephenson-is-building-a-free-metaverse-called-lamina1 https://youtu.be/Rf0N1a5g-ko >Nearly 30 years before Facebook became Meta, there was “the metaverse.” The author Neal Stephenson coined the term in his cyberpunk novel Snow Crash in 1992 to describe an online, VR-ish world where the inhabitants of humankind could interact and escape the dystopian unpleasantness of meatspace. https://en.m.wikipedia.org/wiki/Snow_Crash Most here (including myself) might not really like his political tendencies, but he's at least not in favour of big corporations.
Open file (82.98 KB 1200x799 put shoe on head.jpg)
Mycroft AI released Mimic 3, a TTS engine that can run on-device (even a Raspberry Pi 4) with some decent results. FOSS. https://mycroft.ai/blog/introducing-mimic-3/ https://mycroft.ai/mimic-3/ (has demos, the English US vctk_low voices seem much better than the default preview)
>>16833 Thanks, I saw that. Might actually be pretty useful (I don't mean that hat).
>>16837 I suppose particularly for people who value privacy/data-security, DIY hacking, slow/no internet or low-cost. For someone whose only concern is speed and quality then a cloud/commercial solution might look ideal, but that wouldn't fly for me.
Maybe we should also build a Tachikoma (spider robot from Ghost in the Shell), since they're kinda cute. Oh... https://youtu.be/yGekn_74EHM
Kibo-chan is back: https://youtu.be/HpUuvt8yoDE
>>16866 With a new body, including legs: https://youtu.be/XGvb9Nb1K6k
>>16866 >>16867 Dear little Kibo-chan is an inspiration to us all Anon! :^)
>>16850 This idea has some merit. It was proposed as one of the mobility platform alternatives for the board's MaidCom project, so yea.
>>16871 I don't really think that kind of body would work well indoors. Anyway, this here >>16835 looks more interesting, if you add wheels to the legs and dress, and maybe make the parts of the dress removable in case she wants to sit or lie down.
Open file (157.67 KB 1200x912 this_might_be_big.jpg)
There's a new personal voice assistant for Linux now: Carola. It's for Fedora, though, which might mean it's going to be optimized for their Gnome desktop (or maybe not, since it's not from Red Hat). However, it might have or get some capabilities which could become handy for building a robowaifu with skills to be an assistant. It uses Google to create its voice, which is of course not an option for us, but this can surely be replaced by alternative software, if not already then at some point. I have no time to test it right now, just wanted to drop it in here. Article: https://fedoramagazine.org/your-personal-voice-assistant-on-fedora-linux/ Github: https://github.com/Cyborgscode/Personal-Voice-Assistent
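Swapping out the Google dependency is easiest when the speech component sits behind a small interface. A minimal sketch of that design (names are hypothetical, not Carola's actual API): the assistant depends only on the interface, so a cloud backend can be replaced by a local engine like Mimic 3 without touching the rest of the code.

```python
class TTSBackend:
    """Anything that can turn text into audio bytes."""
    def speak(self, text: str) -> bytes:
        raise NotImplementedError

class LocalTTS(TTSBackend):
    """Stand-in for a local engine (e.g. Mimic 3 via its CLI or
    Python API); here it just fakes a RIFF/WAV header for illustration."""
    def speak(self, text: str) -> bytes:
        return b"RIFF" + text.encode("utf-8")  # placeholder, not real audio

def assistant_say(backend: TTSBackend, text: str) -> bytes:
    # The assistant never hard-codes a vendor, only the interface.
    return backend.speak(text)
```

Usage: `assistant_say(LocalTTS(), "Hello anon")`. The same shape works for the STT side.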
>PLATO stands for Physics Learning through Auto-encoding and Tracking Objects, and it was trained through a series of coded videos designed to represent the same basic knowledge that babies have in their first few months of life. ... >However, PLATO isn't quite up to the level of a three-month-old baby yet. There was less AI surprise when it was shown scenarios that didn't involve any objects, or when the testing and training models were similar. >What's more, the videos PLATO was trained on included extra data to help it recognize the objects and their movement in three dimensions. >It seems that some built-in knowledge is still required to get the full picture – and that 'nature vs nurture' question is something developmental scientists are still wondering about in infants. The research could give us a better understanding of the human mind, as well as helping us build a better AI representation of it. >"Our modelling work provides a proof-of-concept demonstration that at least some central concepts in intuitive physics can be acquired through visual learning," write the researchers. https://www.msn.com/en-au/news/techandscience/scientists-have-created-an-ai-that-can-think-like-a-human-baby/ar-AAZsgdN
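The "AI surprise" being measured there is a violation-of-expectation signal: the model predicts the next object state, and a large prediction error on a physically impossible event registers as surprise. A toy sketch of that test logic (my own illustration, not the paper's code; the threshold is arbitrary):

```python
def surprise(predicted, observed):
    """Prediction error of the learned intuitive-physics model."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def is_surprising(predicted, observed, threshold=0.5):
    """Violation-of-expectation: flag events the model finds implausible."""
    return surprise(predicted, observed) > threshold

# A ball continuing its trajectory vs. one teleporting through a wall:
plausible = is_surprising([1.0, 2.0], [1.05, 2.02])   # False
impossible = is_surprising([1.0, 2.0], [5.0, -1.0])   # True
```

The paper's finding that PLATO shows less surprise without objects, or on scenes close to training, maps directly onto this error signal shrinking.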
BLOOM - BigScience Large Open-science Open-access Multilingual Language Model https://huggingface.co/bigscience/bloom > 176 billion parameters > 70 layers, 112 attention heads > Hidden layers are 14336-dimensional
Open file (48.64 KB 688x715 Screenshot_1.png)
>>16886 I'm surprised they even allowed that into the public domain.
>>16886 >"...As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans." Oh, except the teensy-tiny little fact that the programs written by humans, actually work well, most of the time heh :^). We are a long way from so-called 'AI' that can write coherent and effective software. Frankly, I've become so skeptical of any organization or study promoting B*g D*ta, that at this point I feel comfortable assuming, a priori, that it's simply the works of a den of thieves and liars. We anons here & elsewhere will eventually manage to create pleasing, private & safe robowaifus together. But it's plain that we aren't going to get there either by conforming with, nor capitulating to the Globohomo Big-Tech/Gov's machinations--and ultimately the evil they have planned for us all. Thanks though, Anon. At the least it's somewhat encouraging in a small way to see some kind of a pushback against the pozz apparently happening with this one. >>16887 So, pic-related when? :^) >=== -add the single word 'pleasing'
Edited last time by Chobitsu on 07/12/2022 (Tue) 21:36:17.
>>16732 https://www.youtube.com/watch?v=7_06t5FUn0Y This time, not an AI thing, but artificial muscles. > materials scientists and colleagues at the nonprofit scientific research institute SRI International have developed a new material and manufacturing process for creating artificial muscles that are stronger and more flexible than their biological counterparts https://phys.org/news/2022-07-scientists-durable-material-flexible-artificial.html
>>16908 Thanks. Yeah, and it's by a non-profit. Here's the article: https://phys.org/news/2022-07-scientists-durable-material-flexible-artificial.html And related: https://phys.org/news/2022-03-unimorph-nanocomposite-dielectric-elastomer-large-scale.html The paper seems to be behind a paywall, and the Sci-Hub app didn't work for me (like most times). Will probably be moved or we need a crosslink to >>12810
>>16908 >>16913 Legitimate. Indirectly robotics-related at the very least. >>16913 Agreed, thanks for pointing that out Anon. :^)
>>16732 It's kind of sad; imagine the zogbots they'll make. https://blog.google/technology/research/our-new-quantum-virtual-machine-will-accelerate-research-and-help-people-learn-quantum-computing/ Like today's (((OpenAI))) GPT-3. Remember the shitton of threads on /pol/ with GPT-3 greentexts? The company itself spammed those threads back then, and now we see the fruits: hardcoded politically correct crap :/
>>16954 >Now we see the fruits, the company itself spammed these threads then, in result - hardcoded politically correct crap You/we are free to take GPT-J or BLOOM (176B params, mind you, performance directly comparable to GPT-3) and finetune it on whatever dataset we like.
>>16955 yeah, I know, but if we are talking about future robots, the best solutions will use pozzed-as-fuck neural nets :/ On the software side, they will obviously encrypt it all, so that for a simple user it will be a kind of iOS: a closed and annoying system, a perfect basis for AD's shilling right in ur room!
>>16956 >basis for AD's shilling right in ur room! What's 'AD' ?
>>16956 This will likely happen, and we should make any and all efforts not to lose the war on general purpose computing (and robotics) to have a possibility of having it our own way.
>>16957 ads: it will shill you *insert random corporation here* with tons of diversity shit that you can't skip. YouTube is already trying to implement ads embedded straight into the stream (same as Twitch). Or it will control everything you say; if you do manage to say something #LEBAD, this thing will change your content in real time (see Voicemod's AI voices, processed in real time)
Please remember we have a robowaifu privacy, safety, & security thread general, anons (>>10000). These are all great issues to discuss, but it will help everyone here if we can keep them all together in one place, I think. >=== -reflect new crosspost's subject edit
Edited last time by Chobitsu on 07/21/2022 (Thu) 21:40:38.
>>16732 > https://www.tomshardware.com/news/mit-protonic-resistors-analog > Bringing analog "tunes" to the world of digital chips - with increased performance. > A team of researchers with the Massachusetts Institute of Technology (MIT) have been working on a new hardware resistor design for the next era of electronics scaling - particularly in AI processing tasks such as machine learning and neural networks. We'll see the ultimate botnet in our lifetime!
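For context on why a programmable resistor matters for AI hardware: an analog crossbar array computes a vector-matrix product in one shot via Ohm's and Kirchhoff's laws, with each resistor's conductance acting as a weight. A digital sketch of what the analog array does physically (illustration only, obviously not how the MIT device is programmed):

```python
def crossbar_output(voltages, conductances):
    """Each output line collects current I_j = sum_i V_i * G[i][j]:
    the matrix-vector multiply at the heart of a neural-network layer,
    performed by the physics of the array rather than by a MAC unit."""
    n_out = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(n_out)]

# Two input voltages, two output lines; currents are the weighted sums.
currents = crossbar_output([1.0, 2.0], [[0.5, 1.0],
                                        [1.5, 0.25]])  # [3.5, 1.5]
```

This is why "variable resistance" is the whole point of such devices: tuning each conductance reprograms the weight matrix in place.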
>>17147 >protonic resistor you mean an alkaline, as in just a normal alkaline battery. isn't Brønsted–Lowry the norm in highschool-level chemistry? do they not teach you why the measurement for acidity is called pH? making tiny batteries isn't impressive, and neither is using batteries as resistors. it's funny how they say the variable voltage is some amazing benefit, lmao, this is literally an unwanted property of chemical batteries. that's why batteries are marked with a tilde in front of the voltage, and why anything powered by batteries needs a bunch of capacitors just to keep the damn voltage stable. but using something with variable voltage (i.e. variable resistance) as a resistor, come on now. classic thesis project though, prof's desperate for tenure while everyone else just wants to graduate and plays along. no idea what they're talking about with processors, it sounds like fantasy; processors are almost entirely made out of transistors, you know, the thing that flips from 1 to 0 and vice versa. resistors are irrelevant in a processor
Open file (4.90 MB 4096x768 androids.png)
Open file (1.84 MB 2048x2048 sd_universe_687.jpg)
Open file (705.47 KB 499x658 fl13.png)
Open file (539.60 KB 512x640 l4.png)
Open file (461.61 KB 512x640 exs2.png)
I wonder how long people will cope and deny the simple truth that A(G)I is a spectrum, the lower bounds of which we are already experiencing in current-gen systems, and that even the currently available, relatively humble DL model scale is enough to compete with human beings in quite broad skill domains where we simply didn't live through enough evolutionary time to truly excel ... such as the relatively new skill of painting pictures given a meaningful textual description. These pictures were made by yours truly from a few witty prompts with software anyone can run on a 10-year-old CPU-only PC with 12 gigs of RAM, in a few minutes per 512x512 sample. The software is mostly a wrapper around a deep neural network with ~1 billion parameters total, a convolutional attention-enabled UNet trained to reverse the process of adding random gaussian noise to an image, given a textual description of the image content as a small vector embedding, at the scale of 100 terabytes of general internet data. As the by-now-obvious experiments of myself and thousands of beta testers show, the NN learned to imitate every conceivable popular imaging style and hundreds of themes and variations thereof, often rivaling human artists: not the best of us, for now, but surely the average ones (and they rage about it on twitter already). Nextgen models will follow, as will new tools to integrate these abilities deeply into current and new creative workflows. What you see right now is just a v1 tech demo of something that will become widely known under various names, including "dreamstudio". Multiple implications follow: once again https://www.gwern.net/Scaling-hypothesis holds; the fall of creative scarcity is imminent; creativity will not be the same, but a lot of people will get newfound freedom to express themselves (will they? do we have enough imagination to apply this power to some lasting positive effect?). Some people will lose their profits and professional pride. 
You can continue this long list on your own. It is a taste of things to come this decade. I'm stating here that instead of following the obvious impulse of moving the goalposts ever further into esoteric vitalist A(G)I denial (It doesn't do art! It doesn't do logic! It doesn't learn! This is photoshop! This is creepy! This is fake! It will never do XYZ!), instead of enveloping ourselves in comfy elaborate copes, we should go forth and take the technology for what it is and co-evolve with it, molding it to our taste. What has now been done for creativity will tomorrow be done for limited, and then for more general and even embodied, agency; our goal of robot companions will be naturally interwoven with the increasing naturalness and sophistication of DL technology ... or we could again gloss over the obvious tech breakthrough, sneer, deny, seethe, cope, dilate and bite the dust while the usual silicon valley suspects tame and productize the hell out of this tech only to sell it to us through their gatekeeping machinery. See you on the other side of whatever is coming. ------------------------------------------------------------------------------------------ If you are interested in experimenting with this technology, the code, guide and leaked NN weights are available via these links: https://rentry.org/retardsguide https://github.com/CompVis/stable-diffusion https://sweet-hall-e72.notion.site/A-Traveler-s-Guide-to-the-Latent-Space-85efba7e5e6a40e5bd3cae980f30235f https://github.com/Maks-s/sd-akashic We could really use a separate thread for design experiments with this class of tools.
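The "trained to reverse the addition of gaussian noise" setup mentioned above can be sketched numerically: the forward process blends an image with noise at a given timestep, and the network's whole job is to predict that noise. A conceptual toy (real samplers add noise schedules, a UNet, and text conditioning):

```python
import math
import random

def add_noise(x0, alpha_bar, eps):
    """Forward diffusion step: x_t = sqrt(a)*x0 + sqrt(1-a)*eps,
    where alpha_bar in [0, 1] controls how much signal survives."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]

def denoising_loss(predicted_eps, eps):
    """The training objective: mean squared error on the injected noise."""
    return sum((p - e) ** 2 for p, e in zip(predicted_eps, eps)) / len(eps)

pixels = [0.2, 0.8, 0.5]
eps = [random.gauss(0, 1) for _ in pixels]
noisy = add_noise(pixels, alpha_bar=0.9, eps=eps)
# A perfect noise predictor would drive the loss to zero:
assert denoising_loss(eps, eps) == 0.0
```

Sampling runs this in reverse: start from pure noise and repeatedly subtract the predicted noise, guided by the text embedding.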
>>16775 More on self-supervised learning; Self-taught 'AI' shows Similarities to how the Human Brain works; https://www.bibliotecapleyades.net/ciencia3/ciencia_artificialhumans158.htm Semi-related article about fMRI; https://www.extremetech.com/extreme/339085-mind-reading-technology-can-turn-brain-activity-into-images
>>17397 Thx for the reply, tbh I thought the board was dead.
>>16775 I think Montezuma's Revenge was originally beaten by Uber's Go-Explore algorithm. It looks like DM's algorithm is more general though. Both papers look pretty cool. I'll take a look.
Open file (497.28 KB 448x640 gynoid_853.png)
>>17397 Aren't you the DL-kun I had pleasure to converse with on the topic of retrieval-augmented models? Would be cool to have a more permanent contact to talk to you about DL now and then! See the second link from >>17003 in that case.
I previously made posts in this thread about general-task neural networks or algorithms, so here's another one: https://peract.github.io/ > Instead of using object-detectors, instance-segmentors, or pose-estimators to represent a scene and then learning a policy, PerAct directly learns perceptual representations of actions conditioned on language goals. This action-centric approach with a unified observation and action space makes PerAct applicable to a broad range of tasks involving articulated objects, deformable objects, granular media, and even some non-prehensile interactions with tools. The code / weights are promised to be freely available.
>>17403 Interesting, I like the general language-conditioning very much, though their use of full voxelspace-context looks heavy-handed to me. I also like this newer synthetic dataset: https://github.com/eric-ai-lab/VLMbench
>>17399 I think that's someone else. I'm the math anon. >retrieval-augmented models If you haven't seen them yet, I highly recommend checking out external attention models. https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens >>17403 >>17406 There's also this one from Google: https://ai.googleblog.com/2022/02/can-robots-follow-instructions-for-new.html They try to get a robo to generalize to new tasks by: - Training it on a hundred tasks associated with task descriptions, - Then passing the descriptions through a language model before giving it to the robo.
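The external-attention / retrieval idea linked above boils down to: embed the query, fetch the nearest stored chunks, and let the model attend over them during generation. A toy nearest-neighbor retrieval sketch (cosine similarity over hand-made vectors standing in for a real embedding model; names are my own):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, store, k=1):
    """Return the k chunks whose embeddings best match the query;
    a RETRO-style model would condition on these while generating."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]),
                    reverse=True)
    return [text for _, text in ranked[:k]]

store = [([1.0, 0.0], "servo wiring notes"),
         ([0.0, 1.0], "TTS configuration"),
         ([0.9, 0.1], "stepper motor datasheet")]
```

The appeal for us: the "knowledge" lives in a swappable database instead of the weights, so a small model plus a big local corpus can punch above its parameter count.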
I see it isn't posted here, so here's some more stable diffusion stuff. - The code & model were posted here >>17259 - Textual Inversion for creating reference tokens usable with stable diffusion: https://github.com/rinongal/textual_inversion - A community-built repo of reference tokens: https://huggingface.co/sd-concepts-library - Some people are also doing prompt weighting with stable diffusion, which was previously used with vqgan: https://github.com/tnwei/vqgan-clip-app/blob/main/docs/tips-n-tricks.md - ... This supports negative weight prompts, which let you tell the model that you want X and not Y. Plus a bonus blog post on AI progress: https://astralcodexten.substack.com/p/i-won-my-three-year-ai-progress-bet The main takeaway is that, 3 months ago, the leading text-to-image model was approximately 3 years ahead of what even optimistic experts believed, and that was after accounting for DALL-E 2.
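Prompt weighting, including those negative weights, amounts to taking a weighted sum of the individual prompt embeddings before conditioning the model; a negative weight pushes the result away from that concept. A toy version over plain vectors (real implementations weight CLIP embeddings or guidance scores; `weighted_prompt` is my own illustrative name):

```python
def weighted_prompt(prompts):
    """Combine (embedding, weight) pairs into one conditioning vector;
    a negative weight subtracts a concept, i.e. 'X and not Y'."""
    dim = len(prompts[0][0])
    out = [0.0] * dim
    for vec, w in prompts:
        for i, x in enumerate(vec):
            out[i] += w * x
    return out

cat = [1.0, 0.0]
dog = [0.0, 1.0]
# "a cat, definitely not a dog":
cond = weighted_prompt([(cat, 1.0), (dog, -0.5)])  # [1.0, -0.5]
```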
It starts with humans. > an "Atom Touch", the first artificial prosthetic arm capable of near-full human range of motion, a basic sense of touch, and mind control https://atomlimbs.com/touch/preview Nothing prevents it from being used in robotics.
>>17438 I like how you think.
New framework for simulation that works with Unity, Blender, and Godot: https://github.com/huggingface/simulate New Q&A tool that's very easy to use: https://twitter.com/osanseviero/status/1572332963378958338 Stable Diffusion prompt generator for creating good prompts: https://huggingface.co/spaces/Gustavosta/MagicPrompt-Stable-Diffusion
