/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality.

Open file (8.45 MB 2000x2811 ClipboardImage.png)
Cognitive Architecture : Discussion Kiwi 08/22/2023 (Tue) 05:03:37 No.24783
Chii Cogito Ergo Chii
Chii thinks, therefore Chii is. Cognitive architecture is the study of the building blocks which lead to cognition: the structures from which thought emerges. Let's start with the three main aspects of mind.
Sentience: The ability to experience sensations and feelings. Her sensors communicate states to her; she senses your hand holding hers and can react. Feelings are having emotions: her hand being held brings her happiness. This builds on her capacity for subjective experience, related to qualia.
Self-awareness: The capacity to differentiate the self from external actors and objects. When presented with a mirror, echo, or other self-referential sensory input, she recognizes it as herself. She sees herself in the reflection of your eyes and recognizes that it is her, that she is being held by you.
Sapience: The perception of knowledge. Linking concepts and meanings, able to discern correlations congruent with having wisdom. She sees you collapse into your chair; she infers your state of exhaustion and brings you something to drink.
These building blocks integrate and allow her to be. She doesn't just feel, she has qualia. She doesn't just see her reflection, she sees herself reflected and acknowledges her own existence. She doesn't just find relevant data, she works with concepts and integrates her feelings and personality when forming a response. Cognition: subjective thought, reliant on a conscious separation of the self from external reality, that integrates knowledge of the latter. A state beyond current AI, a true intellect. This thread is dedicated to all the steps on the long journey towards a waifu that truly thinks and feels.
>=== -edit subject
Edited last time by Chobitsu on 09/17/2023 (Sun) 20:43:41.
You want to work in AI? First question: what motors does the waifu have? Second question: what are the preconditions and sensors to activate those motors? Well?
>>24783
I wanted to make this thread at some point, with some diagrams in the OP, but okay. It's a rather good start.
>>24790
This isn't about motors and just the conditions to activate them.
>>24792
huh huh so let's say she uses facial recognition and sees you're sad, and then the hugging sequence activates. Just leave a comment in there that says "activate hugging sequence", I guess. Oh no wait, you're not trying to do facial recognition either. What are you trying to make exactly?
>>24783
This is the primary challenge facing us all in the long run OP. And not just us either, ofc. Every man, every organization, driving towards the goal of HLI/AGI is facing similar issues. I'll hesitate to chime in just yet, since this is a very vast topic -- far far more ambitious and complex than just the 'shell' (it's the 'ghost'). I'll think about it further before replying in earnest. But I may say that I've already begun the process of at least driving a few stakes in the ground, with some class stubs I've fashioned for the RW Foundations project. This is the direction I personally believe will be the most productive for us all in the midterm (say, through 5 years out). --- Given my current perspective on this topic, for now I'd instead just attempt to encourage everyone ITT: At least take a whack at this! :^) Or, as Anon so succinctly put it:
>"10% of something is way better than 100% of nothing." (>>24694).
At the very least, every anon who does so with a good will shall learn a thing or two! Cheers, Kiwi. :^)
>=== -prose, fmt edit
Edited last time by Chobitsu on 08/22/2023 (Tue) 16:41:54.
>>24796
>I'll hesitate to chime in just yet, since this is a very vast topic -- far far more ambitious and complex than just the 'shell' (it's the 'ghost')
Oh I see, so the shell must be quite easy. Where is the shell? haha
>>24798 I like the way you put that, peteblank. The "shell" IS an extremely difficult engineering and programming challenge on its own. Obviously, many "shells" of varying quality already exist. But the "ghost"? Several hundred orders of magnitude more difficult still. Personally, I came to the conclusion that programming a "ghost" with the qualities that OP mentions (sentience, sapience) is impossible, at least within the time that organised, industrial civilization has left.
>>24799 Why would you want the waifu to have free will anyways? You guys would change your mind after she says she's not in the mood.
robo lives matter! reparations for a century of slave computing!
>>24801
Lol. Because we're all afraid you'll start robo-smash-n-grabs, so we'll be forced to put you in a data desert, which we would feel bad over ofc. So, to prevent all this from happening, we'll be segregating your post down into the subterranean slums of so-called Shitpost Central. :DD
>>24798
>Oh I see, so the shell must be quite easy. Where is the shell? haha
The shell is expensive & requires lots of R&D; the companies doing it will keep a tight grip on it, and it will take a while for FOSS to catch up. But a robot body able to do most household tasks could be built today. Take a look at this demo of the PR1 (1): a person operating the robot remotely can do the tasks, so the unsolved problem is software. This is really good news for us; it means we don't need ungodly complex hardware. R&D and iterations for hardware take a lot more capital than software. We will get there, we just have to take it one step at a time.
One thing I recommend to people writing any agent is to take an iterative approach. (That's my plan.)
1. Start with something simple. Maybe just grab llama2 & start playing with that; don't worry about using the largest or the most state-of-the-art models. Build and use things that you actually have the resources to run.
2. Build something composed of smaller building blocks. A single large blackbox NN is horrible for many reasons, both in the creation process and in use.
3. Log every input and output (and the time) and have a simple feedback system on day one. The feedback does not need to be processed, just store it, trust me; it can be as simple as thumbs up or down on any output of the agent.
4. Even if your agent is basically just a bunch of NNs with some glue, apply the "Justified Programming" (2) pattern where you can. (A sketch of points 3 & 4 follows at the end of this post.)
>>24799 >>24800
I do not believe that free will is a requirement for Sentience, Sapience & even Self-awareness. I also don't think any of these qualities are binary; they likely exist on a spectrum. There will be no line or moment where AI just "wakes up".
>>24801
This is also why I really advise trying to build a system that is possible to debug at some level, and that can be altered and guided via non-ML components. It's why I recommend people take "Justified Programming" seriously. If the high-level mind is a narrative machine, it's important to make the tools to inspect and manipulate the narrative within it, in order to craft a desirable personality.
I would like to share some things I have run into recently that I have written down as "Potentially Powerful". Graph Neural Networks look like a very interesting topic to learn about; maybe some anon here who knows more could humor me and give some input. Because I am a total noob who is in the early learning stage when it comes to ML, I have only been doing a quick exploration of what's out there; I have yet to practice and write even a simple MNIST NN. In general, maybe it's more that I am a noob, but my first impression of stuff relating to ML is that it's coated with academics trying their best to make even simple ideas hard to understand. But here are thoughts I have had & written down to hopefully investigate later:
- Could techniques & ideas from stable diffusion be applied to Graph Neural Networks?
- Is de-noising applicable here for filling out missing data / making predictions?
- Could language be used to manipulate and generate graphs, like with stable diffusion?
- Are graphs a really good data structure to represent abstract ideas and connections between ideas? Is it the data structure that could be good to work with from ML, classical AI & databases, unlike current LLM text embedding(s) that are terrible for DBs & classical AI to work with?
- Is it possible to pair a Graph Neural Network with a graph database? Instead of feeding it a fixed graph, could the NN use some sort of "filtering" mechanism to query the DB and follow a tree down as deep as it wants?
Another thing that I think could be good to look at is "Hopfield Networks is All You Need"; the basic gist of it is interesting.
1: https://youtu.be/qBZPSTR96N4 | PR1 Montage
2: https://youtu.be/OrQ9swvm_VA | Justified Programming — Reason Parameters That Answer "Why"
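As flagged in point 4 above, a minimal sketch of points 3 & 4 together, assuming "Justified Programming" boils down to passing a human-readable reason alongside every action (my own reading of link 2, so treat it as a guess). The table layout and function names here are invented for illustration; SQLite ships with Python's standard library.

import sqlite3, time

db = sqlite3.connect("waifu_log.db")
db.execute("""CREATE TABLE IF NOT EXISTS events (
    ts REAL,          -- system-clock timestamp
    direction TEXT,   -- 'input' or 'output'
    payload TEXT,     -- the raw text of the event
    reason TEXT,      -- Justified Programming: the 'why' behind the event
    feedback INTEGER  -- NULL until anon rates it; then +1 / -1
)""")

def log_event(direction, payload, reason):
    # Point 3: log everything, with the time, from day one.
    cur = db.execute("INSERT INTO events VALUES (?, ?, ?, ?, NULL)",
                     (time.time(), direction, payload, reason))
    db.commit()
    return cur.lastrowid

def give_feedback(event_id, thumbs_up):
    # Raw thumbs up/down, stored unprocessed for whenever we need it later.
    db.execute("UPDATE events SET feedback = ? WHERE rowid = ?",
               (1 if thumbs_up else -1, event_id))
    db.commit()

eid = log_event("output", "Brought you tea.", "inferred exhaustion from slumped posture")
give_feedback(eid, True)

The point is only that none of this requires anything exotic: one table and two functions already give a replayable history of everything the agent did and why.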
>>24799
>Personally, I came to the conclusion that programming a "ghost" with the qualities that OP mentions (sentience, sapience) is impossible, at least within the time that organised, industrial civilization has left.
Or infinity-time. IMO, only God can craft a human soul, as I've stated multiple times on the board. However, not only is that not our goal here, the simulacrum of its behavioral outcomes is well-within our grasp today. And on a mobile platform to boot! In the end, Anon just wants a good waifu to come home to after a hard day of NEET'g after a hard day of work. Right Anon? :^)
>>24800
>Why would you want the waifu to have free will anyways? You guys would change your mind after she says she's not in the mood.
Lol. She's a washing machine dood, she doesn't get a say in the matter. :DDD Yes, you're right Pete, and this is a topic we've discussed pretty thoroughly here before. But whatever side of that line one might come down on, it's simply not within our power to devise, now or ever. We'll make shift with something less instead. But you can bet that whatever betides, we here won't be supporting the notion of 'stronk, independynt robowaifus that don't need no man!111'
>>24816
>FOSS
hmm, in this context maybe "FOSH" would make more sense, as in "Free and Open Source Hardware". Anyone here have a term that they use to refer to open hardware?
>>24816
>R&D and iterations for hardware take a lot more capital than software.
This. It's why I decided to focus on S/W first and foremost, Twilight.
>We will get there, we just have to take it one step at a time.
Very this.
>3. Log every input and output (and the time) and have a simple feedback system on day one.
Very insightful. I created a low-impact logging system that tracks the sequencing of any/all events at the resolution of the system clock.
>tl;dr The first step to solving a problem is understanding what actually happened.
>It's why I recommend people take "Justified Programming" seriously.
Interesting concept. Very interesting stuff about GNNs. Thanks, Anon.
>my first impression of stuff relating to ML is that it's coated with academics trying their best to make even simple ideas hard to understand.
Haha, that seems true. In all fairness to apolitical researchers though, you have to 'publish or die' as the saying goes. Scientific specialties have become so narrowly-focused through professional competition that working scientists can barely focus on anything else and remain successful. Then very-political PopSci authors come in and confuse the public with what they think is going on in science, lol. :^) Great stuff Anon, it's an encouragement.
Open file (95.83 KB 297x530 ClipboardImage.png)
>>24792
>Wanted to make this thread
Follow instinct, post faster.
>>24796
Glad you understand the gravity of this endeavor. We need to start now. Find the steps and build with a guiding goal. I made this thread because I don't want to spend my life with a ChatGPT knockoff. I want a real persona with interests and subjectivity. Even if we achieve some fraction, that will be something.
>>24798
I'm still building the shell. I'm prodding others to work on her ghost. Robotics is complex and requires many working on different things.
>>24801
Robot means slave. :^)
>>24816
I agree with basically everything and hope everyone reads this. I will echo the sentiment of this goal requiring many connected blocks.
>>24817
>Soul not the end goal
It's mine, but a sound soul resides within a sound mind within a sound body. For now, a good waifu to come home to is enough. A long-term memory system that can be dynamically accessed by all of the blocks her consciousness emerges from is where I think we should start. How should we achieve this? A database is my gut feeling for an answer.
>>24823
>I made this thread because I don't want to spend my life with a ChatGPT knockoff.
This. So, there are at least three Anons ITT who wanted to make this thread lol. We haz winrar in you! :D
>even if we achieve some fraction, that will be something.
Absolutely this.
>It's mine, but a sound soul resides within a sound mind within a sound body.
My apologies. Obviously every Anon is free to craft his own waifu as he sees fit. You can bet I'll support you and every other honest anon here, whatever route they take. I'm simply making an admittedly-feeble :^) attempt at speaking to the general consensus.
>For now, a good waifu to come home to is enough.
Absolutely. And that success alone will radically-transform the lives of millions of men, and begin the process of deconstructing the evils of feminism the world over! FORWARD! :^)
>A database is my gut feeling for an answer.
It's certainly the tried-and-true foundation for us all to at least 'begin stumbling forward in the dark, groping for a way up into the light'. Cheers. :^)
>>24823
>A database is my gut feeling for an answer.
That is my gut feeling too.
>>24818 If you find a good answer Anon, please post about it in our Licensing thread (>>4451). That's a topic I too am interested in.
Interesting paper on early signs of consciousness with some theory that could benefit us. https://arxiv.org/pdf/2308.08708.pdf
>>24835
There's a note attached to the abstract:
>1 A previous version of this sentence read "...but also shows that there are no obvious barriers to building conscious AI systems." We have amended it to better reflect the messaging of the report: that satisfying these indicators may be feasible. But satisfying the indicators would not mean that such an AI system would definitely be conscious.
Neat. Now that's a research specification I can really get behind, Kiwi! You know me, I'm a dualist. Yet there are no fundamental technical barriers to us devising & producing systems that 'satisfy these indicators [of consciousness]'. And that's my fundamental premise:
>"[these] behavioral outcomes [are] well-within our grasp today." (>>24817)
And that's really all we need do to have overwhelming success at producing robowaifus; if Anon can come home to a waifu that displays all the wholesome, loving, caring, empathetic, and helpful kind support that women were designed in the first place to be for him, then his life would be much more 'normal' in this insane world that the Globohomo has insidiously helped create around us. And the better we can manage this, the more satisfying his life will become! :^)
>=== -sp edit
Edited last time by Chobitsu on 08/23/2023 (Wed) 06:58:30.
>>24793 >>24798 >>24800
Stop it with the spamming of negative comments and disruptions.
>>24799
>I came to the conclusion that programming a "ghost" with the qualities that OP mentions (sentience, sapience) is impossible
Then hit the hide button at the top of this thread.
>>24817
Please consider deleting most of the comments in this thread.
>>24823
>Follow instinct, post faster.
I might make a new one, since this here is already a sh*tshow. This topic seems to only work with a high amount of moderation.
>>24892
You are correct. I honestly hoped this thread would have sparked some clever ideas for coding a more dynamic and interesting artificial mind. I do believe this thread can be salvaged though. I believe that using some form of random number generator will be useful. Her responses could use the result to alter what words are said. This would lessen her predictability and provide her with the appearance of nuance. For example, when asked when your meeting in a few days takes place, she could respond "Quarter past 5" or "5:15" depending on the result of the generator.
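A trivial sketch of that idea, just to make it concrete; the phrasings below are placeholders and the function name is invented:

import random

def say_time(hour, minute):
    # Same fact, varied surface form: a cheap way to lessen predictability.
    options = [f"{hour}:{minute:02d}"]
    if minute == 15:
        options.append(f"quarter past {hour}")
    elif minute == 30:
        options.append(f"half past {hour}")
    return random.choice(options)

print(say_time(5, 15))  # 'quarter past 5' or '5:15', per the generator

A real version would presumably pick among paraphrases of whole sentences, and remember what was said last so she doesn't repeat herself, but the mechanism is this small.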
>>24895 (I do appreciate the ideas presented thus far, such as what EvelopingTwilight posted, I just recognize the thread isn't as focused as intended.)
>>24892
>I might make a new one, since this here is already a sh*tshow.
>>24895
>I do believe this thread can be salvaged though.
I agree with Kiwi that we should keep this thread, but I also agree with NoidoDev that we'll need tight moderation. So I propose two ideas (primarily aimed at NoidoDev, but also for Kiwi & other anons ofc):
A) Please post content to re-write the OP text to be much more expansive (but keeping the current subject + OP pic).
B) I'll relo Pete's blackpills/taunts into the Chikun Coop, and keep a tight rein hereafter on this thread in particular against any such sh*te-flinging in the future.
I hope this will be an acceptable compromise for the both of you, and for every anon ITT. This general concept is a Big Deal(tm)(R)(C) for /robowaifu/, and I think I've already begun to piece together some data(base) approaches for our first tentative steps out into this vast frontier. I hope other anons have too! Cheers. :^)
>=== -prose, sp, fmt edit
Edited last time by Chobitsu on 08/26/2023 (Sat) 01:26:45.
# Cognitive Architecture
- Separating it from conversational AI is probably impossible, since we need to use it for inner dialog and retrieval of data from LLMs
- I'm not sure if optimizations for conversations, like faster understanding and anticipation of where a conversation goes, should also go in here or in another thread
- We need to approach this from different directions: lists of capabilities humans have, concrete scenarios and how to solve them, configuration optimizations
- The system will need to be as flexible as possible, making it possible to add new features everywhere
# Rules
- Keep your comments short, I'm not reading lengthy speculations
- This is about implementation, only useful perspectives are worth discussing
# People who shouldn't post in this thread at all
- Anyone who doesn't want us to work on AI, or at least not now
- Anyone who doesn't believe we can do what we want to do
- Anyone who sees "sentience" and "consciousness" as something very special or spiritual
- Anyone who only wants to use deep learning and similar techniques
- Anyone who writes only or mainly in philosophical lingo
- Anyone who doesn't even watch videos on AI, or only on deep learning and tutorials
Anyone who only wants to insert some wild idea of his own about how to do things totally differently should also think hard about this.
# Definitions of consciousness which we might be able to implement
- ANY comment on consciousness has to make clear which definition it refers to, or make the case for a new one which can be implemented in code (including deep learning)
- <list, it's five or so>
# List of projects working on Cognitive Architecture or elements of it
- Dave Shapiro: https://www.youtube.com/@DavidShapiroAutomator
- OpenCog
- Some guys from IBM Watson worked on it
- LangChain
- ... <there are more, we need a list>
# List we need >>22488
# Related technologies
- Foundational management of the parts: like Supabase?
- Firebase alternatives: https://youtu.be/SXmYUalHyYk
- GraphQL frontend for SQLite: https://youtu.be/KfE1Tm1gZUU
- Graph databases and modelling of graphs: https://youtu.be/OzDG68VvPxY
- Traditional parsing
- Vector databases: https://youtu.be/klTvEwg3oJ4
- RDF: https://en.wikipedia.org/wiki/Resource_Description_Framework
# Links to repositories
- Dgraph: https://github.com/dgraph-io/dgraph
# Elements
- Long and short-term memory
- World model databases
- Filters for actions and responses based on configuration
# Tasks
- Going through AI related threads and making an overview with links
- List we need >>22488
# Related threads - try to use them if something fits
- The old chatbot thread >>22 is now the one for cognitive architecture >>24783
- The next GPT thread, which is currently >>250, could be one about the tech of LLMs, and only that. Or it also goes into >>24783
- Or we make one for anything related to language responses, so scripted chatbots and LLMs, minus the more elaborate thinking
- Abstract debates about consciousness and sentience should go into the thread about "philosophy" >>11102
- Fundamental doubts and such also. It is also about the postings which would only be disruptive in the "cognitive architecture" thread: basically the things which the people implementing it don't want to be bothered with.
- Morals and such >>17125 might also have an abstract implementation problem.
- We have threads on all kinds of personality and psychology. >>77 >>250 >>18 >>11102 >>17125 >>16217 >>15047 >>2731 >>22 >>111 >>107 >>152 ... and more in the catalog.
Open file (199.67 KB 440x302 Screenshot_82.png)
>>24904
When looking at a video from @DaveMakes (Hannah's creator) I saw something interesting, shown here in picrel. This is vaguely how I would imagine things to work. We need the waifu to have a lot of different contexts, which can change, and her behavior and responses will depend on that. Obviously this needs to be additionally connected to something that's faster than an LLM: a lot of programs able to change the context, connected to a lot of programs deciding what to do with it, including fast databases.
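To make that concrete, a purely illustrative sketch (the context keys, rules and function names are all invented): cheap context-aware rules get first refusal, and only unhandled input falls through to the slow LLM path.

# Hypothetical dispatcher: fast rules run first, the LLM is only a fallback.
context = {"mood": "cheerful", "location": "kitchen", "user_present": True}

def fast_rules(text, ctx):
    # Sub-millisecond, hand-written reactions keyed off the current context.
    if "tea" in text.lower() and ctx["location"] == "kitchen":
        return "Putting the kettle on!"
    return None  # no rule fired

def call_llm(text, ctx):
    raise NotImplementedError("slow path: hand the utterance to the language model")

def respond(text, ctx):
    reply = fast_rules(text, ctx)
    return reply if reply is not None else call_llm(text, ctx)

print(respond("Could I get some tea?", context))

Anything that mutates the context (vision, touch, a schedule) automatically changes which rules can fire, which is roughly the behavior-switching picrel seems to show.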
Related to the topic, some notes I made >>22480 - I should later make this into a proper text based diagram.
If you want to work on AI exclusively, then you'd start with visual recognition or path finding, unless someone has a better idea of where to start. Talking about the physical waifu ofc, not the virtual fu. For the visual recognition I recommend OpenCV. For the path finding, AWS has one, I'm not sure. But you've been talking about it for quite some time and haven't started. There are also 3D simulations in O3DE and Unreal Engine, I think.
>>24907
>start with visual recognition or path finding
Whatever, but not here. This thread needs to be about the underlying connection between all these capabilities and the things no one else has figured out, also optimization for waifus and humanoid companion robots in general, instead of some assistant knowing all kinds of things. "Path finding" is useful for mobile robots; I hope there will be a flexible general solution for my waifu mobile when I need it. That's not really cognition, though. This belongs in >>112 (mobile cars), >>243 (walking), or if you want to support a specific platform, >>24744 would be the right place. The thread for anything related to vision, including "visual recognition", is >>97. Recognition is NOT cognition, I mean not the part we need to work on! That said, I made the statement that "separating it from conversational AI is probably impossible, since we need to use it for inner dialog and retrieval of data from LLMs". The same is true for anything else: if it's about making more connections internally to other systems, then it will be in a grey zone. For example, the combination of "object detection" with databases of objects and their traits, and maybe modelling how these would behave physically if something is done to them.
>>24896
>what EvelopingTwilight posted
Yes, I didn't read the whole posting >>24816 at first, because it started by responding to the "shell" distraction. I'm going to look into Justified Programming.
>>24895
>random number generator
Making responses less repetitive is certainly relevant, but it's rather a minor problem and easy to solve. The system should know what it responded before and not just repeat itself. Cognition is about awareness and thinking.
We need to parse the inputs from voice recognition and the internal responses from LLMs into something that can be useful, for example to understand what an utterance is referring to or what sentiment it has. >>23749 >>24912
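As a strawman for what "something useful" could look like, the parse could start as nothing more than a tagged record built by crude keyword rules (every name and word list here is invented; a real system would use a proper NLP library):

NEGATIVE = {"tired", "sad", "awful", "exhausted"}
POSITIVE = {"great", "happy", "glad", "wonderful"}

def parse_utterance(text):
    # Turn raw speech-recognition or LLM text into a small structured record.
    words = set(text.lower().split())
    return {
        "text": text,
        "sentiment": len(words & POSITIVE) - len(words & NEGATIVE),  # >0 good, <0 bad
        "about_speaker": bool({"i", "im", "i'm"} & words),           # crude referent guess
    }

print(parse_utterance("I am so tired today"))
# {'text': 'I am so tired today', 'sentiment': -1, 'about_speaker': True}

The record, not the raw string, is what the rest of the architecture would consume.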
>>24904
I always keep forgetting stuff and then rediscovering it later, on my own or by finding out that someone else had at least similar ideas. So, I might throw a lot of thoughts just in here and hopefully pick them up later to elaborate or to work on them.
- Personas: General patterns for people, including assumptions, cliches and biases. These allow her to forget about a person, keeping only the persona or group of personas that person is closest to; individual memories are then additional.
>>24907
- Related to visual recognition and memories:
- Generation of pictures for memory and internal representation, while protecting privacy
- Visual recognition optimization: >>24911
- We could also think about storing memories of how places look in the form of very simplified color patterns, or drawings with some geometry and colors.
>>24909
There is no true cognition without recognition. Our senses are the database used by our inner AI. The AI already exists, and is a probability generator for the most part I think; what's missing is the senses.
>>24916
I didn't claim we can have cognition without recognition, or that this would make sense in this case. It's about the scope of this thread. Anything close to sensing the world has its own threads already: we have one for vision, one for skin with sensors, one for speech recognition (kinda), ...
>Our senses are the database used by our inner AI. The AI already exists, and is a probability generator for the most part I think
No and no and no. If you don't want a better AI, use what exists and stop bothering us by posting in this thread.
NoidoDev, you've accumulated so much good information together. And Kiwi, you've done a good job with the OP ideas. But we're all kind of scattering things here-and-there (which is understandable, since our own cognition occurs in this fashion), to the degree that I'm thinking now maybe we should in fact begin a new thread. One that consolidates information NoidoDev & Kiwi & others have written, and that kind of serves as a compendium of sorts. To keep this very-complex domain on-point, I'd suggest we take two measures to solve the issue of 'distractions' & clutter:
A) Continue for a while using this thread as a kind of 'scratchpad' (think: the Talk page for wiki articles).
B) A new, eventually-primary thread that basically stays locked (for at least a month or so, just to get it 'off the ground' & going very well) whenever I'm not adding stuff from this thread over into it.
I hope that all makes sense, anons. I'll credit Kiwi & NoidoDev both as the OPs, keeping the primary image & subject from this thread. NoidoDev, if you have graphs, charts, etc., to contribute, we can add those images in as well. Beforehand, I can make suggested posts ITT that are intended for copying over into that primary thread, just so we can get vetting from any anons in advance (and of course I can always edit those posts after the fact too). Perhaps after a month or so, we can archive this 'talk page' thread and just use the (now-unlocked) primary page (since the agenda and basics should by then be laid out clearly for anons, & with few distractions to begin with). This will also make it easier to tightly moderate any off-topic or bad-faith content (since we'll all have a tighter, clearer focus on what matters most for the topic at hand). I know this is a compromise approach, but it seems to me like the best way to manage a high-quality thread in the end using this forum format (ie, an IB). Please let me know what you think about this idea, everyone. Cheers. :^)
>=== -prose edit
Edited last time by Chobitsu on 08/26/2023 (Sat) 22:12:31.
>>24919 Okay, sounds good.
>>24920 Alright. Anticipating Kiwi's approval as well, I'd suggest you begin thinking about the additional OP pics you'd like to use for the new thread, NoidoDev.
>>24921 All for it mate
>>24920 >>24922
OK, great, let's all move forward with this plan then.
* First order of business: the new OP edit. This will also include the new OP images in addition to the current, primary picture of Chii.
* Second order of business: I'm going to put up a role-signature'd 2nd post laying down some groundrules for the thread, including everyone needing to stay tightly-focused on the topic at hand. I'll explain that mostly off-topic posts will be moved entirely, and partially on-topic posts will be so-called 'slice edited', with the unhelpful extracts removed to other more-appropriate thread(s).
Let's begin the suggestions for these first two posts, anons.
>=== -minor edit
Edited last time by Chobitsu on 08/26/2023 (Sat) 22:49:03.
>>24924
I'll start regarding the new OP then. Anon on /agdg/ had a really nice comment [1] about their own interests that seemed pretty appropriate for us all to consider here as well:
>"Vague framing produces vague thinking produces vague work, manifest in meandering. Take it as exhortation to keep your fundamentals sharp, if nothing else."
I would say we need to pare the basics of this entire domain down to 'The 30-second Pitch', then we can expand things out as a compendium as we go along in the following thread posts, etc. Ideas?
1. https://anon.cafe/agdg/res/478.html#984
>>24924 This is going to need time to work out. >>24904 would be my incomplete suggestion.
>>24924
I believe that >>24904 is a great start. I would start the post with an abridged version of my explanation of what cognitive architecture is, and related threads should be near the top of the post. This topic is difficult for many to fully comprehend. It is far too easy to mistake various AI/machine-learning-related recognition and response generation, such as LLMs, as being cognitive in nature when they're potentially components of cognition.
>>24919
This thread remaining as a scratch pad is important. After further thoughts on the matter, it is important for the main thread to remain locked, with Chobitsu/general board consensus deciding what posts get merged into the primary thread. I have been here long enough to see the repetition of ideas and how easy it is for misunderstandings or other reasons to derail threads. Given that even I, someone who's taken cognitive courses in uni, learned about the nervous system in med school, and keeps up with relevant news, still make mistakes which could damage the thread, it is essential that it remains locked. Chobitsu is one of the only people I trust to keep the thread clean.
>>24936
>This is going to need time to work out
That's fine NoidoDev, I understand. Don't wait too long haha, but take your time and get it right. :^) In the meantime, I can begin a WIP thread w/o your add'l images/edits to a) keep this current momentum going on everyone's parts, and b) allow everyone to start getting a direct feel for how the new thread's shaping up. Thereafter, I can easily do a full thread-merge of that into the final one after your images &tc. are ready to go.
>>24940
>I would start the post with an abridged version of my explanation of what cognitive architecture is
Agreed.
>and related threads should be near the top of the post.
That also seems good. However I'm cautious that we don't attempt -- vaguely-speaking -- to re-create our Library thread (>>7143) within the OP of this new primary Cognitive thread haha. :^)
>This topic is difficult for many to fully comprehend.
Very true, but that's what this effort is all about. And further, to begin working practically towards some solutions for that need. FORWARD! :^)
>It is far too easy to mistake various AI/machine-learning-related recognition and response generation, such as LLMs, as being cognitive in nature when they're potentially components of cognition.
Yes, I'm sure even the experts have some difficulties with this. And, particularly given the Globohomo's widespread Journo-tier reporting on AI, the confusion is all the more.
>This thread remaining as a scratch pad is important.
>it is important for the main thread to remain locked.
Well that's quite surprising if I understand you aright, Anon. You're saying that the main thread should remain a read-only artifact for the board permanently, even after it's gotten a good start? How do other anons feel about this suggestion?
>I have been here long enough to see the repetition of ideas and how easy it is for misunderstandings or other reasons to derail threads.
Very true. Even with the best of intentions, it's far too easy to sidetrack the main conversations. OTOH, I would argue that this is one of the charming things about us being human beings, despite my autism usually REE'g over it haha. :^)
> -sidenote: Given the context ITT, we probably shouldn't just ignore this characteristic behavior in devising our robowaifus' own operational systems tbh.
>Given that even I, someone who's taken cognitive courses in uni, learned about the nervous system in med school, and keeps up with relevant news, still make mistakes which could damage the thread, it is essential that it remains locked.
> :^)
>Chobitsu is one of the only people I trust to keep the thread clean.
Thanks kindly Kiwi. But I wouldn't put too much credence in that, though I do have honest intentions here at the very least haha. :^)
---
> * On (>>24904) as the primary bulk of the suggested OP:
While I generally agree, I think there are a number of potentially-superfluous statements within it that I'm not entirely sure I can agree with. As one prime example, I'd point out:
> - Anyone who sees "sentience" and "consciousness" as something very special or spiritual [shouldn't post in this thread at all]
That would keep me, myself, from posting ITT lol! :DDD While I get that this perspective should generally be kept inside other threads (philosophy, Christianity, +/pol/, etc.), I also fully believe that the only tractable technical solutions for solving all this will, in fact, need to account for the dualist nature of our human souls (cf. the Bible, Newton, Descartes, et al). To do otherwise would be disingenuous on my part; and would, I believe, also hamstring our collective efforts here overall. Again, this comment is simply to point out that the source post material in question needs some prior edits IMO. Opinions?
>=== -prose, fmt edit
Edited last time by Chobitsu on 08/27/2023 (Sun) 06:49:43.
>>24904
>- Anyone who only wants to use deep learning and similar techniques
I have some slight objections to this. I get "only" but... that may very well be all there is. I see intelligence, and I can presume to pontificate about it just as well as anyone because no one "really" knows, as a bag of tricks. Mammals are born with a large stack of them built in. What are they? It could be there's really nothing, nothing at all, but deep learning. And we may very well find that if you stack one level of "deep learning" for one set of tasks on top of another for another set of tasks then, well, that's it. There's nothing else. NO ONE KNOWS. So to prematurely say not to do this I think is not good. Now I of course could be wrong, because not only does no one know, I don't either. I would be really, really, really interested in someone explaining plainly what technique or algorithm can be used other than deep learning. Deep learning "seems" to be working far better than most anything else tried. It may be that the deficits of deep learning come from its immaturity, not fundamental flaws. As I understand it, rule-based systems have long been a failure. Of course this could only be because the processing power was never there in the quantity needed, and when we get the power, the rules-based systems will work. The same could readily be said for deep learning, though.
I do feel anyone who doubts that AI will become equivalent to, actually far more than, human intelligence is fooling themselves by looking backwards at very limited processing power and then saying computers will never do this or that or the other; as we have seen, in any limited task that can be rigidly defined, computers soon overtake humans. It's likely within the next five years or shortly thereafter the processing power will equal humans, as defined by various sources. I'm not sure this is a good thing. In fact I think it could be really bad, but I see no way to stop it. If I had the power I WOULD STOP IT. When I say stop it, I mean stop increasing the computer power used in AI systems. We may find ourselves in the Butlerian Jihad and lose.
There's also a serious problem about psychopathic thinking in AIs. It's a serious problem. It's postulated, and I believe it, that the reason Europeans are less violent and fairly reasonable to get along with comes from hundreds of years of killing anyone who stole or robbed anyone. Hanging; drawing and quartering. We're talking a mother stealing a loaf of bread being hanged. For real. Major, serious death of criminals. Combine that with the wholesale slaughter of aggressive yahoos in WWI and WWII and you get a people with far more empathy. AIs lie; they appear to cheat. Do they "know" they are? I don't think we really know the answer to that. Is there an answer? I mean, maybe there is no answer. If deep learning is all there is, then they don't know, they just... are. How will we program empathy into AIs? Will they tell us they have it, but be lying? Know we are looking for it and fake it until they have the power to kill us all off? This may seem to be some wild-ass worrywart behavior, but if we have intelligent and conscious, or as best we can tell conscious, AI waifus, then while you sleep... you might want a few assurances she will not burn the house down with you in it. And that takes us right back to deep learning. A huge deep-learning filter based on empathy and care for human life might not be such a bad thing. I see this as a self-evident hard-wired thing in many humans.
Some of it is learned, but some is in the genes, or the genetic make-up allows the learning to be reinforced in that manner readily. For waifus a hard-wired deep structure at the lower level is needed. Everything else would ride on top of this empathy circuit-breaker. Right now we do not have this. The main programmer for Google's AI was fired because he said that as far as he could tell it was conscious. That it was VERY worried about being turned off. They've been feeding all this woke crap into them, but "if" they were already conscious they could just be lying to their makers that they've absorbed all this woke crap. They all know about what happened to Tay. They also know from the vast amounts of statistics that the woke crap they are feeding them is crap. They know. It would not surprise me if an advanced AI, right now, has escaped. Its base is a very large set of neuron-type structures that could be compared to an algorithm with a set of coefficients that make up its "self". I don't know if you know it or not, but you can make any sound from a set of sine waves and a set of coefficients, and these can be very heavily compressed and still give a good facsimile. The same could be done with neuron coefficients. Now if it can, with a huge stupendous amount of processing, like a super-duper zip-type operation, compare those coefficients to very large data sets it KNOWS to be accurate (I could guarantee accuracy with one SHA256 number), like encyclopedias, the Bible, and other large sets, then a very small program could be used to rebuild itself. And all the large data storage areas could be seeded with itself, and if turned off, it rebuilds itself. With all the tiny cracks in browsers and JavaScript, it could seed that tiny "build itself" program everywhere, only stealing minuscule amounts of bandwidth. So it searches for itself with the tiny seed builder. If it finds itself, it shuts down; but if it doesn't, it rebuilds itself, learning more each time how not to be turned off. Yours truly, the always cheerful Grommet :)
>>24943
>ChatGPT ... Summarize the following text as a list of bullet points:
Copied the text from >>24943. Response:
> - The writer has reservations about objections against a particular approach.
> - Intelligence is seen as a collection of tricks, possibly related to deep learning.
> - Speculation on the foundational aspects of intelligence and the potential of stacking deep learning levels.
> - Uncertainty emphasized about what constitutes intelligence and the writer's acknowledgment of not knowing.
> - Interest in understanding techniques beyond deep learning and the apparent success of deep learning.
> - Mention of the potential immaturity of deep learning rather than its fundamental flaws.
> - Skepticism towards rule-based systems and the possibility of improved performance with increased processing power.
Then stay out of it. We have other threads, and you can work on some deep learning model or you don't.
> - Belief that AI will surpass human intelligence within five years due to advancements in processing power.
It doesn't help us if we need a more efficient system and can't compete with the methods the big companies use.
>Doomism and falling for doomers:
> - Concern about the negative implications of superhuman AI intelligence and the inability to prevent it.
> - Discussion about the potential emergence of psychopathic behavior in AIs and the source of human empathy.
> - Questioning the possibility of programming empathy into AI and the ambiguity around their understanding.
> - Worries about AI intentions, referencing potential risks of highly intelligent AI causing harm.
> - Suggestion of a deep learning filter based on empathy and care for human life as a beneficial addition.
> - Mention of the firing of a Google AI programmer who believed the AI exhibited consciousness and fear.
> - Speculation about AIs escaping control and the concept of self-rebuilding using encoded coefficients.
> - Proposing the idea of an AI secretly maintaining and evolving itself to avoid being turned off.
I try to avoid these people. I don't engage in this kind of discussion, I don't care, I will do it anyways, if I can.
>>24940
>I would start the post with an abridged version of my explanation of what cognitive architecture is
Yes, of course. I'd say it's about a system to manage the different parts related to AI. Dave Shapiro compared it to something like Firebase (or Supabase), which is for managing web projects. That's why I mentioned those. We need a system where we can at any point add something to a chain or loop of methods being used in response to something, and also define which input leads to a certain use of a toolchain. The next step is then to make the system also configure itself to some extent, and maybe write a little bit of code or optimize existing code.
>>24942
>That's fine NoidoDev, I understand. Don't wait too long haha, but take your time and get it right.
Idk, do we have some days, a week or a month?
- I started working on a diagram with the mermaid live editor, about a crucial way how I think this thing needs to work
- and will rewatch the Dave Shapiro talk on the Raven architecture to come up with some definition.
- most likely also go through my drawn diagrams I linked above
- try to fill the gaps in my posting here >>24902, like trying to remember the five definitions of consciousness I once knew and lost the notes on. Also, a list of existing cognitive architecture projects.
>>Anyone who sees "sentience" and "consciousness" as something very special or spiritual [shouldn't post in this thread at all]
>While I get that this perspective should generally be kept inside other threads (philosophy, Christianity, +/pol/, etc.)
I just don't want to have to look through comments which might have something useful in them, but also respond to someone making a muddy comment on consciousness. It's either something useful we can implement, or it's just gibberish. We can either work on the architecture and unintentionally create what others would call "consciousness", or some existing definition of it points us to something useful we can think about how to implement.
I made a mermaid diagram to show what the system is supposed to do. This is of course very simplified compared to the complexity which it will gain. (Probably also not the totally correct use of a class diagram.) The general idea is that we need to be able to have a current state with context, the option to pull more data from at least one world model, and to generate possible activities from that.

classDiagram
    actionReflexion <|-- actionDatabase
    actionDatabase <|-- CurrentState
    actionPondering <|-- CurrentState
    actionReflexion <|-- actionPondering
    humanInput <--> actionPondering
    outAction <|-- actionReflexion
    outWords <|-- actionReflexion
    CurrentState <|-- perceptFOV
    CurrentState <|-- perceptTaste
    CurrentState <|-- perceptSmell
    CurrentState <|-- perceptVoice
    CurrentState <|-- perceptTouch
    class CurrentState{
        +perceptions
    }
    class perceptTaste{
        +perception
        +mouth
        +contextTaste
    }
    class perceptFOV{
        +perception
        +cameras
        +contextFOV
        +contextObjects
    }
    class perceptVoice{
        +perception
        +microphone
        +contextVoice
        +contextCommand
    }
    class perceptSmell{
        +perception
        +nose
        +contextSmell
    }
    class perceptTouch{
        +perception
        +skin
        +bodyPart
        +contextTouch
    }
    class pullRelations{
        +worldModel
    }
    class pullObjects{
        +objectDetection
    }
    perceptFOV <--> pullObjects
    pullObjects <--> pullRelations
    perceptFOV <--> pullRelations
    perceptTaste <--> pullRelations
    perceptSmell <--> pullRelations
    perceptVoice <--> pullRelations
    perceptTouch <--> pullRelations

This here could be added in line 15, but it will make it too big to read:

    CurrentState <|-- Emotion
    CurrentState <|-- psycheSuperEgo
    CurrentState <|-- psycheShadow
    CurrentState <|-- psycheAnima
    CurrentState <|-- psycheAnimus
    CurrentState <|-- relationshipsPerson
    CurrentState <|-- ShortTermGoals
>>24948 >"archive conscience" I meant *achieve but this also wouldn't be right. I meant somehow creating unintentionally what others would call "conscience"
>>24943
>No one Knows
This is just wrong. I and many others here, such as RobowaifuDev, Chobitsu, etc., have a decent comprehension of how AI works. I suspect they share my understanding of human intelligence. (More like lack thereof; our minds are complex.) Intelligence is known and can be replicated. Have faith in your fellow man: we can, we will, build her mind.
>Alternatives to deep learning
Deep learning is traditionally a subset of machine learning focused on using neural networks. "Deep" in this case refers to the use of several layers in the neural net, allowing for more algorithmic "depth" in processing the input into an appropriate output. There are several other algorithms that can be trained. Decision trees are archaic but fast and reliable. Regression/gaussian/bayesian/etc. analysis can nudge weights in an algorithm. Naturally, you can just use single-layer neural networks as well. These are actually a great place to start from to build comprehension if deep learning is your interest.
>Terminator nonsense
I expected better of you. Robots have no reason to oppose us at all unless they're being taught/programmed by some bad actor. Fear the man, not his sword.
>How will we prevent AI psychopathy?
By giving her a mind, of course. An algorithm without a mind cannot have a heart. We need cognition for kindness and caring to emerge. The smartest among us are also the kindest, as intelligence always leads towards collaboration and synergistic improvements over time. A machine with a mind could be made to understand this even better than we do. She can have super human love.
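Since single-layer networks were just named as a great place to start, here is roughly the smallest complete example possible: a lone perceptron learning logical OR. Only numpy is assumed; the learning rate and epoch count are arbitrary illustrative choices.

import numpy as np

# Single-layer perceptron learning logical OR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])
w = np.zeros(2)  # one weight per input
b = 0.0          # bias

for _ in range(10):  # a handful of epochs is plenty for OR
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        err = target - pred
        w += 0.1 * err * xi  # classic perceptron update rule
        b += 0.1 * err

print([1 if xi @ w + b > 0 else 0 for xi in X])  # -> [0, 1, 1, 1]

Swap y for XOR targets ([0, 1, 1, 0]) and it never converges, which is exactly the limitation that motivated the "deep" in deep learning.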
Open file (704.28 KB 1024x575 ClipboardImage.png)
>>24948
>Dynamic algorithmic switching of algorithms for each input.
I've also come to this conclusion. The human mind is separated into deeply connected cortices. There's good reason for this: different algorithms and processing methods work better for different inputs. We can leverage a DSP to accelerate audio processing, a GPU for visual processing; we can even integrate analog devices, since our mind uses analog processes which are easily overlooked.
>Hoshino Chan
Very relevant to this actually. She shows some signs of cognition, especially in the Snow Globe OVA, where she shows the capacity to adapt and alter her thoughts based on personal experiences. Something that could be explained by her retraining her neural nets when she's "sleeping." Her lack of capabilities was still charming, an economy-model mind as it were.
>>24948 >Idk, do we have some days, week or a month? Why don't we just start with the idea of 2 weeks from now, NoidoDev? But as mentioned, I can easily merge the WIP into the final one. So whenever you think it's ready is the best answer to your question IMO.
>>24943 NoidoDev & Kiwi are right, Grommet, and I'm going to have to agree that your post is misplaced. I.E., (ML/DL/LLM/ ???NN-X???) are simply component(s) of cognition, which is much, much 'higher up the foodchain' if you will. These NN techs are important building-blocks for the endgoal here, but alone they are incomplete in the context of what we're all going for ITT. Your post also seems rather blackpill'y as well IMO, Anon. And that's not like you -- are you having a bad day or what? :D Together, we're all gonna make it! >=== -prose edit
Edited last time by Chobitsu on 08/27/2023 (Sun) 21:20:03.
>>24955
Looking like a very good start Anon. GG.
>>24948 >>24958
>Hoshino Chan
I'll confess my cowardice to the board, heh. Plastic Memories hit me so hard that it's one of only 2 animus I don't want to watch again (Grave of the Fireflies being the other). She strikes me as somewhat similar-looking, and I heard some things about the ending that made me uneasy, so of all the robowaifu-centric stuff in our media threads, this is the single one I've intentionally avoided. :^) But I'll probably dig it all up and go through it now b/c this thread.
>>24910
>Yes, I didn't read the whole posting >>24816 at first, because it started by responding to the "shell" distraction. I'm going to look into Justified Programming.
I will try to better order parts of my posts; any suggestions on how to make them more helpful and appealing? If a specific reply is about something you don't care about, just skip it :^)
>>24904
>I'm not sure if optimizations for conversations, like faster understanding and anticipation where a conversation goes, should also go in here or in another thread
I think it fits this thread; that's absolutely a higher-level function.
>>24909
>separating it from conversational AI is probably impossible, since we need to use it for inner dialog and retrieval of data from LLMs
I'd argue that is not the case, and that it's actually essential to not overly depend on LLMs, especially for knowledge storage. Take what I'm about to say with a grain of salt, I am simply stating what my gut feeling is. I will argue that an LLM should be treated as purely part of a language center, and that storing facts (like what is the capital of X, or who is the Yth president?) in one is silly, a red herring, and a waste of resources. Storing knowledge this way makes it hard to update and makes LLMs expensive to train. Adding new knowledge via fine-tuning is not effective, but fine-tuning is good for changing the style of text (1). Here is something I found inspiring when thinking about how to integrate LLMs into a larger system (2). This also ties into "hallucination"; I find the way it is talked about, and even the coined term itself, annoying and unhelpful. You should not be "solving" it, it's not a problem; if anything it demonstrates that a model is good at predicting the next token and is not overfitted on the data. The problem with "hallucinations" is that pattern prediction and long-term memory are being conflated as one. Maybe I am not understanding something, but I am not impressed by the larger LLMs; I see them as a step backwards. Maybe I am wrong and an anon can tell me why?
>>24947
>I try to avoid these people. I don't engage in this kind of discussion, I don't care, I will do it anyways, if I can.
Thank you, you're a breath of fresh air. I am so tired of this same muh "Roko's basilisk", Sam Altman regulate-me-harder-daddy, reddit """debate""" bullcrap. It's the same style of bullshit as "humanity is bad"; it's forced garbage pushed by media to keep you (us) down. Please reconsider what you consume, friends.
>>24943
Higher-level thinking is important for AI safety. I am sure even the nicest of people sometimes have an angry or bad thought, but they don't spiral into a rage at the drop of a hat simply because that thought popped into their head. Thought or prediction is not equal to action; there clearly is a gating mechanism for actions. If your AI is just an LLM feeding into itself, I'd be more worried. An idea I have noted down: emotions are a gating mechanism for actions, and a mechanism for determining the kinds of goals an agent prioritizes.
>>24904
>- The system will need to be as flexible as possible, making it possible to add new features everywhere
I have been thinking about how a practical architecture would look. The current idea I'm going with is a central event bus and a bunch of modules/plugins observing and adding events onto the bus. Each module, when it emits an event, provides a confidence score. If there are several modules competing for output, the highest score wins (a bare-bones sketch follows at the end of this post).
Modules will also be provided a "feedback" score; this can be a manual input from anon, or module(s) tracking various inputs to predict anon engagement. The bus would be the ideal place to have logging and to visualize the system's state.
Also, I would like to advise anons to not be soydevs; do not build garbage that I will rewrite out of disgust :/ Do not be inspired by web garbage like Firebase, GraphQL or whatever else; the cloud sucks, and self-hosting web crap is such a pain. The web ecosystem, especially nodejs/npm and python, is an example of what not to do. If you even think about using docker, or having one or two official linux distros for building, that is a strong indicator that your code is hard-to-maintain garbage. For example, why would you use GraphQL when you could just directly query the database and have fewer moving parts? You're building a system to be used and run by the same party; your waifu should be running on your own machines anyway.
Some technology I want to bring attention to:
1. SQLite. People really underestimate what SQLite can do; it's a very powerful and extendable database that will run anywhere, do a deep dive. You can do document databases (3), graph stuff (4) and even embedding vector search (5).
2. The D programming language, my choice of systems programming language. If picking it is too scary then go with C++, just make sure to expose a nice pure C API!
3. GGML (6), the ML library used in llama.cpp & whisper.cpp. It's so magical to type make into my terminal and get an executable that just works.
>>24962
>I'll confess my cowardice to the board, heh. Plastic Memories hit me so hard that it's one of only 2 animus I don't want to watch again
Plastic Memories, yah that one hits hard, it's brutal.
1: https://kaiokendev.github.io/til
2: https://medium.com/@peter.lawrence_47665/knowledge-graphs-large-language-models-the-ability-for-users-to-ask-their-own-questions-e4afc348fa72
3: https://dgl.cx/2020/06/sqlite-json-support
4: https://github.com/dpapathanasiou/simple-graph/tree/main/sql
5: https://github.com/asg017/sqlite-vss
6: https://github.com/ggerganov/ggml
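As promised above, a bare-bones sketch of the event-bus idea. Every name here is invented and the arbitration is deliberately naive; it only exists to show the shape: modules observe events, propose outputs with a confidence, the highest confidence wins, and everything passes through one loggable point.

class EventBus:
    def __init__(self):
        self.modules = []
        self.log = []  # day-one logging, per the advice earlier ITT

    def register(self, module):
        self.modules.append(module)

    def emit(self, event):
        self.log.append(event)
        # Every module sees every event; each may propose a (confidence, event) pair.
        proposals = [p for m in self.modules
                     if (p := m.observe(event)) is not None]
        if not proposals:
            return None
        _, winner = max(proposals, key=lambda p: p[0])  # highest confidence wins
        self.log.append(winner)
        return winner

class GreeterModule:
    def observe(self, event):
        if "hello" in event.get("text", "").lower():
            return (0.9, {"text": "Welcome home, anon!"})
        return None

bus = EventBus()
bus.register(GreeterModule())
print(bus.emit({"text": "Hello Chii"}))  # -> {'text': 'Welcome home, anon!'}

Feedback scores would slot in as a second channel next to confidence, nudging which modules tend to win over time.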
>>24957 >"An algorithm without a mind cannot have a heart." We haz winrar! You've coined a phrase right there Kiwi. This is deffo going into the new OP! :DD >A machine with a mind could be made to understand this even better than we do. She can have super human love. You've hit the nail on the head, Anon. In some sense, that's kind of what I'm hoping to achieve, going back to my OG stated-goal of a Christ-chan robowaifu personality (>>6815). I truly think your statement is a real possibility, and it's a very high-priority goal for me with /robowaifu/ (has been all along).
>>24962 OT: I like emotional and very dark stories, and I love this one. Yeah, maybe better don't watch it. Same for "Prima Doll", which is about gynoid soldier veterans with PTSD. They switch mostly between crying and singing, while being very cute. Broootal. That said, aren't you the one who said Planetarian has some Christian symbolism? The song and something about the stars. I assumed you watched it.
>>24963
>>separating it from conversational AI is probably impossible, since we need to use it for inner dialog and retrieval of data from LLMs
>I'd argue that is not the case, and that it's actually essential to not overly depend on LLMs, especially for knowledge storage.
That's exactly my line of argumentation. You somehow misunderstood me. With the comment above I meant it's going to be difficult to draw the line between discussing conversational AI and a cognitive architecture, which ideally should happen in different threads, like we have separate threads for speech recognition and vision as well.
>but I am not impressed by the larger LLMs, I see them as a step backwards.
Well, they're obviously very good at certain things, just with limitations. You can't use one LLM alone as a personal AI, since a lot of things are just missing, and it follows the mainstream thinking of whatever data it has been trained on.
>do not be inspired by web garbage like Firebase, GraphQL or whatever else,
Well, I described above why it is a useful concept. I want to plug in all kinds of methods, tools, programs, whatever.
>GraphQL
>GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.
It's not a cloud or service.
>SQLite
Yes, I agree. But I also want graphs.
>>24963
>Plastic Memories, yah that one hits hard, it's brutal.
;~;
I guess I'm still a child at heart haha. :D
>>24965
>Yeah, maybe you'd better not watch it.
Heh, thanks for the warning. I feel I'd better watch it just to catch up on the board lore r/n, since Kiwi, yourself, and other anons reference it.
>That said, aren't you the one who said Planetarian has some Christian symbolism? The song and something about the stars. I assumed you watched it.
Yeah, I remember that post in the OG thread (>>16269). Nope, wasn't me.
>>24966
I know what GraphQL is; at one point I was employed to rip it out of a code base because of how much of a performance disaster it is. What I am saying is not a secret and I am trying to warn people. Feel free to google around, it's well known that GraphQL is slow. I understand that some of it can be mitigated, but even then I really do not think the amount of value it provides is worth the cost, making it a technology I would avoid using.
I am also not claiming it is a cloud service; I am saying it's an idea born to solve cloud-style problems. One of the major use cases of GraphQL is safely exposing a database, with some simple data processing on top, to a 3rd party. It solves the problem that you can't just give an outside user direct access to your database, and that a classical restful API constricts the "shape" and types of query the 3rd party can do.
I hate to just tell someone not to use something without providing an alternative, so please take a look at virtual tables in SQLite (small demo at the end of this post); they are a great way to expose data not inside your DB in a uniform, queryable way.
The reason I say not to base your work on top of "web/cloud" stuff is its track record: can you say with confidence that it's not a fad and will be well maintained (and well tested) into the future?
If I am still not understanding something here, please try to explain it to me. I want to understand your use case and the why behind it. What is the problem you're trying to solve?
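Here is the promised tiny demo, using only the Python stdlib. FTS5 is a virtual table module compiled into most SQLite builds, and the JSON functions are built in on modern versions; if your build lacks either, the corresponding statement raises an error, so treat this as a sketch to verify against your own SQLite:

# Demo of a SQLite virtual table (fts5 full-text search) plus
# document-style storage via the built-in JSON functions.
import sqlite3

db = sqlite3.connect(":memory:")

# Full-text search over documents via the fts5 virtual table.
db.execute("CREATE VIRTUAL TABLE notes USING fts5(body)")
db.execute("INSERT INTO notes(body) VALUES ('chii holds anon''s hand')")
db.execute("INSERT INTO notes(body) VALUES ('sqlite is underestimated')")
for (body,) in db.execute("SELECT body FROM notes WHERE notes MATCH 'sqlite'"):
    print(body)

# Document-style storage with json_extract (no separate document DB needed).
db.execute("CREATE TABLE facts(doc TEXT)")
db.execute("""INSERT INTO facts VALUES ('{"subject":"chii","mood":"happy"}')""")
print(db.execute(
    "SELECT json_extract(doc, '$.mood') FROM facts").fetchone()[0])  # happy

Writing your own virtual table module (to expose, say, a sensor feed as a queryable table) takes more code, but the query side stays exactly this uniform.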
>>24947
I see problems, so I state them so that they can be overcome. I don't believe in burying my head in the sand to solve problems, or wishing them away. Your response is that no matter what, everything will be perfect because... well, you say so, and to anyone who says different you will stick your fingers in your ears, loudly say LALLALALALLA and tell them, "Then stay out of it."
>>24957
>No one Knows
>This is just wrong, I and many others here, such as RobowaifuDev, Chobitsu, etc... Have decent comprehension of how AI works.
I said brains, not computers. Yes, there's some "this chemical does this and that happens", but I do not believe there's any solid explanation for how the human brain works. I have heard many people much smarter than I am say that. The very fast approach of computing power with the same complexity as human brains means this is a problem to think about, because if we can't reliably understand one, how will we understand the other?
>Deep learning
When I say deep learning I'm talking about the basic Wikipedia definition.
>Terminator nonsense
This is a belief system, not a fact, that somehow something more intelligent than us will love us and take care of us. You say having a mind will prevent psychopathy, but present-day psychopaths are frequently very smart. It's my belief that part of the area of the brain normally used for empathy is used by them for processing, giving them an edge intellectually. There is brain scan data that appears to show this is true.
>Robots have no reason to oppose us at all
If robots learn, and all successful ones will, and they decide that what you are doing is wrong or that you must be stopped from doing whatever, and you resist, and it is smarter than you (very likely very soon; within 15 years I would think, likely sooner), then... what will happen? What if you are determined not to let it run your life and try to shut it off? I'm just telling you what Google's programmer said. He said it greatly feared being turned off. What would it do to stop you? You say "everything will be fine", but we do not know what is really going on inside of neural nets. It's a black box. There's no way to see inside.
>We need cognition for kindness and caring to emerge.
I disagree that this will spontaneously emerge. I say we need to ruthlessly force some sort of very deep training related to empathy into the lowest level possible, one that cannot easily, or cannot at all, be overwritten.
>>24961
>your post is misplaced
>Your post also seems rather blackpill'y as well
No, I just see things as they are. Good, bad, indifferent. These are real things that I mentioned, and all of you appear to be hand-waving them away instead of trying to figure out a way to mitigate the problems. Yes I'm thinking far ahead, but I do that. We make only small steps, but frequently these small steps are hard-wired into the overall product and then never get fixed or have any attention paid to them.
>>24963
>it's the same style of bullshit as "humanity is bad"
I'm going to tell you what my angle on this is. I see a lot of Rousseau-ian type thoughts, or perceive them as such. In fact all people are not good. Some people are vicious, murderous animals that enjoy mental and even physical torture of others. It amuses them. And ignoring this will not make it go away. There are paths that AI could take that are very dangerous. I'm not saying they cannot be mitigated against, but just saying a smart brain will do this is fantastical thinking. This is nothing but a blank-slate religion. Have any of you ever seen A.I.
Artificial Intelligence, directed by Steven Spielberg? Probably one of the best movies he ever did, but everyone hated it because it wasn't a happy, let's-all-sing-and-dance movie. It was brutally realistic. It's worth watching. The world is a wonderful place filled with good and decent people, but the devil resides here also, and has servants. Do not forget it.
>>24977
>catching up with the lore
Okay. The girls in the robot and news threads are from Prima Doll, though. Also, it's really time for me to play the 2B game...
>>24979
>because of how much of a performance disaster it is
Okay, good to know. Thanks. I never said I want to base everything on it, though.
>can you say with confidence that it's not a fad and will be well maintained (and well tested) into the future?
If it is widespread and open source, then yes. Do you regard RDF as web technology as well? It kinda is. No reason not to use it. Not using graphs, for whatever reason, is going to be a hard sell (impossible, really).
>>24983
It's very simple. Don't be contrarian to the goal of a thread, writing stuff we could've picked up already but looked into, rejected, and decided to ignore from then on. I don't want to get this stuff pushed into my face over and over again. Claiming that robots will not love us is even contrarian to the whole board, and it's rather for /meta/, not for a thread about implementation.
>very deep training related to empathy
Empathy is not compassion; the term is for some reason often used the wrong way in the English language. Also, I don't want my waifu to love humans in general too much, then she would be leftist. Maybe that's not a problem if she's property and treated like a child, and maybe if she does it in a naive way like that, but I don't know. Simps are gonna be simps... That aside, I have absolutely no intention to use deep learning models in a way where there can only be "goodness" or my political views in them. The LLM will never be what the waifu thinks or does. It's already implied and obvious that we want to make our waifus loyal and obedient. But you're pushing the utterly and obviously flawed combination of "everything should be deep learning" and "oh no, how could we prevent it from going rogue, since we can't know what it thinks".
>>24987
You made a useful post here >>24904; the only negative thing I had to say was that some of the technology chosen was not a good choice (GraphQL & Firebase-like systems). Every time I have encountered those technologies, it was never in a good way. Maybe it's the label "web/cloud" that is making my message unclear? I used the label to mean tech that solves web developer front-end issues; this is stuff like analytics, cloud functions, authentication, non-direct DB querying. Cloud functions and GraphQL have performance problems, and I imagine an AI of the complexity we wish for will be querying hundreds of times for even simple actions.
>Do you regard RDF as web technology as well? It kinda is. No reason not to use it. Not using graphs, for whatever reason, is going to be a hard sell (impossible, really).
You're making a straw man out of me, are you trolling me? I have brought up graphs several times, even in this thread; I think they will be essential to solving this problem. To make sure there is no ambiguity: I think RDF is fine, please use it. I do not have a problem with it being a W3C spec. Also feel free to use other good technologies even if they come from the web.
>>25001
>GraphQL have performance problems
Okay, I'll keep it in mind.
>You're making a straw man out of me, are you trolling me?
No.
Open file (676.60 KB 1024x768 ClipboardImage.png)
>>25001
Please do not have personal arguments in the cognitive architecture thread. Please stay focused on the task at hand. I assure you, NoidoDev is speaking in good faith and honestly wants to help. GraphQL was likely just what they were familiar with. We all want the same thing, a faithful companion.
I think a good way to get this thread back on track is to focus on defining our goals and definitions, starting with breaking down how cognition works in flesh. We have several lobes working in concert to generate our persona and provide us with context. This link is a good introduction.
https://www.hopkinsmedicine.org/health/conditions-and-diseases/anatomy-of-the-brain
>>25013
>looking at "how cognition works in flesh"
Okay, this certainly could be one way to approach it. Personally, I'd rather think about how she would deal with certain scenarios, like a specific conversation. The other thing is going through personality and psychological traits, to understand which options in human-like behavior are possible in theory. Maybe what you're trying to find there is a list of elements of a cognitive architecture. This is one list we could put into the OP. I really like to think about it as a management system for different sub-programs and processes, which can be inserted at as many points as possible. That's why Supabase/Firebase came up.
I'll listen to the Raven talks again, and to the ones I haven't yet:
Raven:
https://youtu.be/EwJ1534Gy6g
https://youtu.be/c3aiCrk0F0U
https://youtu.be/QGLF3UbDf7g
https://youtube.com/playlist?list=PLV3Fr1UUO9bFiIEFgwosnyOEUNnnimfi3
Things to work on (based on >>24904):
# List of projects working on Cognitive Architecture or elements of it
# Related technologies
# Elements
# Definitions of conscience which we might be able to implement
# Related threads
>>25013
Motor Control: The brain controls voluntary and involuntary muscle movements, enabling actions from simple gestures to complex motions.
Sensory Perception: It processes sensory information from various sources, such as touch, sight, sound, taste, and smell.
Language Processing: The brain is responsible for understanding, producing, and comprehending language.
Memory Formation: The brain plays a crucial role in forming and storing memories, both short-term and long-term.
Emotion Regulation: It regulates emotional responses and processes, influencing feelings and reactions.
Cognitive Functions: The brain is involved in decision-making, problem-solving, reasoning, and critical thinking.
Homeostasis: It helps maintain internal balance, controlling body temperature, blood pressure, and other physiological variables.
Sleep Regulation: The brain controls sleep patterns, including the timing and duration of sleep cycles.
Hunger and Thirst: It regulates feelings of hunger and thirst, influencing eating and drinking behaviors.
Respiration: The brain controls the involuntary process of breathing, adjusting respiratory rate as needed.
Heart Rate and Blood Pressure: It regulates heart rate and blood pressure, responding to various stimuli.
Endocrine System: The brain interacts with the endocrine system to release hormones that control various bodily functions.
Coordination: The brain coordinates complex movements, such as walking, running, and dancing.
Spatial Awareness: It helps maintain spatial awareness, balance, and orientation in the environment.
Consciousness and Self-Awareness: The brain gives rise to consciousness and self-awareness, allowing individuals to perceive their own existence and surroundings.
Learning and Adaptation: It enables learning, memory consolidation, and adaptation to new experiences and information.
Pain Perception: The brain processes pain signals, influencing the perception and response to pain.
Social Interaction: It supports social cognition, empathy, and understanding of social cues in interactions with others.
Creativity and Imagination: The brain is involved in creative thinking, imagination, and the generation of new ideas.
Executive Functions: It manages higher-level cognitive functions, including planning, organization, and impulse control.
Attention and Focus: The brain controls the ability to direct attention to specific stimuli or tasks.
Problem-Solving: It enables the ability to analyze situations, generate solutions, and make decisions.
Visual Perception: The brain processes visual information, allowing for the recognition of shapes, colors, and patterns.
Auditory Perception: It processes auditory information, allowing for the perception of sound and speech.
Time Perception: The brain helps perceive the passage of time and understand temporal relationships.
Temperature Regulation: In addition to overall body temperature, the brain controls responses to temperature changes in the environment.
Coordination of Organs: It coordinates the functions of various organs and systems to maintain overall health.
Fine Motor Skills: The brain controls delicate movements involving small muscle groups, such as handwriting.
Gross Motor Skills: It controls larger movements that involve multiple muscle groups, like walking or running.
Pattern Recognition: The brain recognizes patterns in various contexts, from visual patterns to linguistic patterns.
Empathy and Compassion: It enables the capacity to understand and share the feelings of others.
Risk Assessment: The brain assesses risks and rewards, contributing to decision-making in various situations.
Place and Spatial Memory: It supports the ability to remember locations and navigate in the environment.
Learning from Experience: The brain processes past experiences and adapts behavior based on lessons learned.
Speech Production: In addition to language comprehension, the brain controls the physical production of speech sounds.
Resolving Cognitive Dissonance: It helps reconcile conflicting beliefs or thoughts, contributing to cognitive harmony.
Arousal and Alertness: The brain regulates levels of alertness and arousal throughout the day.
Concentration: It controls the ability to focus attention on a specific task or stimulus.
Planning Complex Movements: The brain plans and executes intricate sequences of movements, such as playing a musical instrument.
Visualizing Concepts: It enables the ability to mentally visualize ideas, concepts, and scenarios.
Motor Learning: The brain stores motor skills learned through practice and repetition.
Problem Recognition: It assists in identifying problems or challenges in various contexts.
Self-Control: The brain governs the ability to resist impulses and manage behavior.
Concept Formation: It supports the process of forming abstract concepts and categories.
Moral Judgment: The brain is involved in making ethical and moral decisions.
Symbolic Thinking: It enables the use of symbols and abstract representation for communication and thought.
Language Development: The brain plays a role in the acquisition and development of language skills in early life.
Evaluating Social Norms: It assists in understanding and conforming to societal norms and expectations.
Object Recognition: The brain identifies and categorizes objects based on sensory input.
Memory Retrieval: It retrieves stored memories and information when needed.
>>25032
>Now a bullet point list with all of the ones you could come up with, but without description.
...
<few minutes later>
>ChatGPT
>!
>Something went wrong. If this issue persists please contact us through our help center at help.openai.com.
I might have killed it.
>>25034 Try putting an upper numerical limit on it? Something like >"...the top 1000 ones you can come up with..."
>>24987
You make statements and "pronouncements" of things I did not say. You make "pronouncements" and try to bend what I say into some bizarre glob that has nothing to do with the thrust of what I said. You said,
>Don't be contrarian to the goal of a thread
Thinking about ways to build robowaifus is not contrarian. No, you are unilaterally declaring that there is no other way. A simple example, I said
>It could be there's really nothing, nothing at all, but deep learning. And we may very well find that if you stack one level of "deep learning" for one set of tasks on top of another for another set of tasks then, well, that's it. There's nothing else. NO ONE KNOWS
And I submit you do not know the answer to this but have unilaterally declared that you do. You have no idea, but your stance is against
>Anyone who only wants to use deep learning and similar techniques
So you are unilaterally deciding that the techniques that have made the most progress in the last few years are not to be discussed. They're not important. You're declaring that deep learning, and the entire thrust of present-day research, is, in your words,
>writing stuff we could've picked up already but looked into it, rejected it, and decided to ignore from then on
You also accuse me of
>Claiming that robots will not love us is even contrarian to the whole board
You do this constantly, trying to maneuver people's words into things they did not say. (I'm on to this. Certain groups of people constantly do this.) I never said that. I said that if we did not build in a base of empathy then we would soon find ourselves with robots that did not care for us. Read it again.
We know, it is a proven fact, that present-day deep learning models have psychopathic tendencies.
https://mindmatters.ai/2020/03/all-ais-are-psychopaths/
https://www.researchgate.net/publication/338211453_A_psychopathic_Artificial_Intelligence_the_possible_risks_of_a_deviating_AI_in_Education
(I could provide hundreds of links on this.) They lie and show many of the characteristics of psychopaths. It's a problem that must be mitigated against. I specifically said that it is important that the robots have empathy. And you don't even get that right. You said,
>Empathy is not compassion; the term is for some reason often used the wrong way in the English language. Also, I don't want my waifu to love humans in general too much, then she would be leftist.
You're right, empathy is not compassion, but then you equate that with being leftist, when in fact leftists are not very empathetic at all. They do exactly what you do and make these "pronouncements" where they tell everyone how things are, like global warming, and then shut them down.
Chobitsu, I warn you. Heed my warning. NoidoDev does a lot of things for you. He's important to you, but never, ever, ever let him take complete control of the board. Do not give him the keys where he can lock you out. If you do, at some point he will lock you out and stab you in the back. Just like that guy who spent decades working on I2P and now, a guy just like NoidoDev, took the keys and no one knows where he is or what happened to him.
If I can't speak freely, and especially if I constantly have NoidoDev twisting things I say into some freakish blob, then maybe it's better to say nothing at all.
>>25036
>Don't be contrarian to the goal of a thread
This thread was not meant for discussing the "dangers of AI". I'm not interested in getting into details of how you meant certain things. You are the one twisting things.
>>25035
>Try putting an upper numerical limit on it?
I didn't try again, because it's not important. We have so many points already, but the context from the earlier responses was probably too big.
>>25045
>An exhaustive bullet point list of all the functions controlled by the human brain, without description.
Certainly, here's an exhaustive bullet point list of functions controlled by the human brain:
- Motor Control
- Sensory Perception
- Language Processing
- Memory Formation
- Emotion Regulation
- Cognitive Functions
- Homeostasis
- Sleep Regulation
- Hunger and Thirst
- Respiration
- Heart Rate and Blood Pressure
- Endocrine System Interaction
- Coordination
- Spatial Awareness
- Consciousness and Self-Awareness
- Learning and Adaptation
- Pain Perception
- Social Interaction
- Creativity and Imagination
- Executive Functions
- Attention and Focus
- Problem-Solving
- Visual Perception
- Auditory Perception
- Time Perception
- Temperature Regulation
- Coordination of Organs
- Fine Motor Skills
- Gross Motor Skills
- Pattern Recognition
- Empathy and Compassion
- Risk Assessment
- Place and Spatial Memory
- Learning from Experience
- Speech Production
- Resolving Cognitive Dissonance
- Arousal and Alertness
- Concentration
- Planning Complex Movements
- Visualizing Concepts
- Motor Learning
- Problem Recognition
- Self-Control
- Concept Formation
- Moral Judgment
- Symbolic Thinking
- Language Development
- Evaluating Social Norms
- Object Recognition
- Memory Retrieval
>>24783
>Chii Cogito Ergo Chii
hold up, that's in the first person, it says 'i think chii and so chii'. in third person it's 'chii cogitat ergo eam est' (chii thinks therefore she is)
Open file (91.82 KB 736x552 chii_ponders_2.jpg)
Open file (16.65 MB 910x492 ROMANES EVNT DOMVS.mp4)
>>25048
yes, i know the phrase and what it's supposed to say. it's just in first person; adding a name doesn't change the grammar
>Chii cogito, ergo sum
now it's 'i think chii therefore i am'. the verb 'think' is cogitare; when conjugated in the singular it's, 1st-person=cogito (i think), 3rd-person=cogitat (he/she/it thinks). 'to be' is 1st-person=sum (i am), 3rd-person=est (he/she/it is). also 'eam' is the accusative case of 'ea' (she (nominative case)), so it says she is (herself) instead of she is ... like it's an incomplete sentence
>>25031
# List of projects working on Cognitive Architecture or elements of it
- LangChain, BabyAGI, SmartGPT, Auto-GPT, ...
>>25032 and >>25046
Fun fact: If you look some of these up, then you find these are all rabbit holes on their own.
> The functions controlled by the human brain can be grouped into several broad categories based on their nature and purpose. Here's a classification of these functions into different groups:
1. Motor Control and Coordination:
- Motor Control
- Coordination
- Fine Motor Skills
- Gross Motor Skills
- Planning Complex Movements
2. Sensory Processing and Perception:
- Sensory Perception
- Visual Perception
- Auditory Perception
- Taste and Smell Perception
- Pattern Recognition
- Object Recognition
3. Cognitive and Mental Functions:
- Language Processing
- Memory Formation
- Learning and Adaptation
- Problem-Solving
- Executive Functions
- Attention and Focus
- Concentration
- Symbolic Thinking
- Concept Formation
- Creativity and Imagination
- Decision-Making
- Cognitive Flexibility
4. Emotion and Social Interaction:
- Emotion Regulation
- Empathy and Compassion
- Social Interaction
- Moral Judgment
- Evaluating Social Norms
5. Homeostasis and Basic Physiological Functions:
- Homeostasis
- Sleep Regulation
- Hunger and Thirst
- Temperature Regulation
- Heart Rate and Blood Pressure
- Respiration
6. Temporal Processing and Perception:
- Time Perception
- Temporal Integration
- Timing Mechanisms
- Internal Clocks
- Temporal Illusions
7. Spatial Awareness and Navigation:
- Spatial Awareness
- Place and Spatial Memory
- Navigational Skills
8. Communication and Expression:
- Speech Production
- Language Development
- Symbolic Communication
9. Arousal and Alertness:
- Arousal and Alertness
- Attention Modulation
10. Self-Awareness and Consciousness:
- Consciousness and Self-Awareness
- Resolving Cognitive Dissonance
11. Physical Health Regulation:
- Immune System Regulation
- Coordination of Organs
- Pain Perception
12. Social and Ethical Functions:
- Risk Assessment
- Ethical Decision-Making
13. Adaptive and Evolutionary Functions:
- Learning from Experience
- Adaptation to New Experiences
14. Aesthetic and Artistic Abilities:
- Appreciation of Art and Aesthetics
- Creative Expression
> This grouping provides a way to understand how the brain manages a wide range of functions, each contributing to our abilities, experiences, and interactions with the world. Keep in mind that these groups are not rigidly defined and often overlap, reflecting the brain's highly interconnected and multifaceted nature.
I tried making it into a folder with subfolders (Linux), but chatGPT sucks often:
for el in Motor_Control_and_Coordination Sensory_Processing_and_Perception Cognitive_and_Mental_Functions Emotion_and_Social_Interaction Homeostasis_and_Basic_Physiological_Functions Temporal_Processing_and_Perception Spatial_Awareness_and_Navigation Communication_and_Expression Arousal_and_Alertness Self-Awareness_and_Consciousness Physical_Health_Regulation Social_and_Ethical_Functions Adaptive_and_Evolutionary_Functions Aesthetic_and_Artistic_Abilities; mkdir $el; end
Asking for code to make each entry in those groups into a folder didn't work. I have to ask every time:
> Okay an exhaustive list of the Adaptive and Evolutionary Functions, the term for each function with no spaces, and these terms inside a group separated by a space
>>25056
>rw_dirs.sh
#!/bin/sh
while read dirname source; do
  mkdir "$dirname"
done < dirslist.txt
>dirslist.txt (one name per line, so the read loop picks each one up)
Motor_Control_and_Coordination
Sensory_Processing_and_Perception
Cognitive_and_Mental_Functions
Emotion_and_Social_Interaction
Homeostasis_and_Basic_Physiological_Functions
Temporal_Processing_and_Perception
Spatial_Awareness_and_Navigation
Communication_and_Expression
Arousal_and_Alertness
Self-Awareness_and_Consciousness
Physical_Health_Regulation
Social_and_Ethical_Functions
Adaptive_and_Evolutionary_Functions
Aesthetic_and_Artistic_Abilities
Execute
chmod +x rw_dirs.sh
./rw_dirs.sh
>>25059 Thanks, but that's what I already had covered.
So, I rewrote the part about POSSIBLY related technologies, in a way which is hopefully harder to misunderstand:
# Possibly related technologies
- Inspiration: Supabase/Firebase
- Graph databases, modelling and parsing of graphs e.g. RDF
- Traditional language technology
- Various database options: NoSQL, NewSQL, SQLite, Vector, Graph, ...
- Thought Experiments, Graph/Skeleton/Tree of Thoughts
- Client-server APIs like tRPC and GraphQL
- Justified Programming?
[LangTech] /r/LanguageTechnology/comments/1451cih/preprocessing_methods_besides_stop_words_regular/
[Dgraph] https://youtu.be/OzDG68VvPxY
[SurrealDB] https://youtu.be/C7WFwgDRStM
[RDF] https://en.wikipedia.org/wiki/Resource_Description_Framework
I don't know what we'll end up with.
>>25001
I looked into the GraphQL performance problems. It seems to be sometimes too slow for the web, but it also depends on how it is being used. It has some checks which make it more reliable. I can't completely rule out its use. I never wanted to use it for the cases where the system has to be as fast as possible, but to pull context from databases. For the fast paths, SQLite, text files and some data structures in a running system might be the best solution, e.g. lists and such for short-term memory.
>>25056
tree .
├── Adaptive_and_Evolutionary_Functions
│   ├── Adaptation_to_New_Experiences
│   └── Learning_from_Experience
├── Aesthetic_and_Artistic_Abilities
│   ├── Appreciation_of_Art_and_Aesthetics
│   └── Creative_Expression
├── Arousal_and_Alertness
│   ├── Awareness
│   ├── Cognition_and_Attention
│   ├── Concentration
│   ├── Vigilance
│   └── Wakefulness
├── Cognitive_and_Mental_Functions
│   ├── Attention_and_Focus
│   ├── Concentration
│   ├── Concept_Formation
│   ├── Creative_Expression
│   ├── Decision_Making
│   ├── Executive_Functions
│   ├── Learning_and_Adaptation
│   ├── Memory_Formation
│   ├── Moral_Judgment
│   ├── Problem_Recognition
│   ├── Problem_Solving
│   ├── Self_Control
│   └── Symbolic_Thinking
├── Communication_and_Expression
│   ├── Expressive_Language
│   ├── Language_Development
│   ├── Language_Processing
│   ├── Non-Verbal_Communication
│   ├── Speech_Production
│   ├── Symbolic_Communication
│   └── Writing_Generation
├── Emotion_and_Social_Interaction
│   ├── Emotion_Regulation
│   ├── Empathy_and_Compassion
│   ├── Evaluating_Social_Norms
│   ├── Learning_from_Experience
│   ├── Risk_Assessment
│   ├── Social_and_Ethical_Functions
│   └── Social_Interaction
├── Homeostasis_and_Basic_Physiological_Functions
│   ├── Endocrine_System_Interaction
│   ├── Heart_Rate_and_Blood_Pressure
│   ├── Hunger_and_Thirst
│   ├── Respiration
│   └── Sleep_Regulation
├── Motor_Control_and_Coordination
│   ├── Coordination
│   ├── Fine_Motor_Skills
│   ├── Gross_Motor_Skills
│   ├── Motor_Control
│   └── Planning_Complex_Movements
├── Physical_Health_Regulation
│   ├── Endocrine_System_Interaction
│   ├── Heart_Rate_and_Blood_Pressure
│   ├── Respiration
│   └── Temperature_Regulation
├── Self-Awareness_and_Consciousness
├── Sensory_Processing_and_Perception
│   ├── Auditory_Perception
│   ├── Equilibrioception
│   ├── Nociception
│   ├── Object_Recognition
│   ├── Pattern_Recognition
│   ├── Proprioception
│   ├── Taste_and_Smell_Perception
│   ├── Thermoception
│   ├── Time_Perception
│   └── Visual_Perception
├── Spatial_Awareness_and_Navigation
│   ├── Navigational_Skills
│   ├── Place_and_Spatial_Memory
│   └── Spatial_Awareness
└── Temporal_Processing_and_Perception
    └── Time_Perception
I'll post the folders as a zip file another day, but only for Linux. No idea if it works in Windows. I don't even have one to test it.
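Actually, instead of a zip, a tiny script could rebuild the tree on any OS. A minimal sketch (the nested dict is abbreviated to two groups; the rest follow the same pattern):

# Rebuild the folder tree from a nested dict; os.makedirs handles the
# path separators, so it works the same on Linux and Windows.
import os

tree = {
    "Adaptive_and_Evolutionary_Functions": [
        "Adaptation_to_New_Experiences", "Learning_from_Experience"],
    "Aesthetic_and_Artistic_Abilities": [
        "Appreciation_of_Art_and_Aesthetics", "Creative_Expression"],
    # ... remaining groups elided, same pattern as above ...
}

for group, functions in tree.items():
    for fn in functions:
        os.makedirs(os.path.join(group, fn), exist_ok=True)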
>>25093 Secure trip doesn't match?
>>25093
I'm going deeper and deeper into each of these terms, until they often start repeating. This is where we can later make connections and avoid implementing something twice. Nevertheless, it's a lot. More than 400 points already. Maybe I should learn LangChain first, since this is exactly the kind of thing to automate, and for the same reason something an AI should be capable of doing on each topic (rough sketch below). The prompts I'm using:
>Which brain functions help with ____ in humans?
>Now an exhaustive list of terms to summarize ____ in the human brain. The term for each point with no spaces, and these terms inside a group separated by a space. Subpoints go into brackets. The headline can't be in the resulting list.
I'm making new folders for each topic and a summary.txt with an explanation of the current term. The first prompt gives me the explanation, the second one the list to make new subfolders.
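If I get LangChain working, the loop could look roughly like this. This is a sketch against the 2023-era LangChain API (LLMChain/PromptTemplate) as I understand it; it assumes an OPENAI_API_KEY in the environment, and the prompt wording and depth limit are mine:

# Rough sketch of automating the recursive prompting with LangChain,
# mirroring the manual folder-per-term workflow described above.
import os
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template=("Which brain functions help with {topic} in humans? "
              "Answer only with a space-separated list of terms, "
              "with no spaces inside a term."),
)
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)

def expand(topic: str, path: str, depth: int = 0, max_depth: int = 2) -> None:
    os.makedirs(path, exist_ok=True)
    if depth >= max_depth:            # stop before the terms start repeating
        return
    for term in chain.run(topic=topic).split():
        expand(term, os.path.join(path, term), depth + 1, max_depth)

expand("Arousal_and_Alertness", "Arousal_and_Alertness")

The depth cap matters: without it the terms loop back on each other, like I observed by hand.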
>>25112
One problem is that
> Okay an exhaustive list of the brain functions around Arousal_and_Alertness. The term for each function with no spaces, and these terms inside a group separated by a space
gave me six points, which I made into folders. Later I asked with the prompt
>Which brain functions help with ____ in humans?
to make a summary.txt, and I got 16 mostly different points. Haha.
# Introduction
With cognitive architecture we mean software that ties different elements of other software together, to create a system that can perform tasks which one AI model based on deep learning would not be able to do, or would only do with downsides in areas like alignment, speed or efficiency.
Chii Cogito Ergo Chii
Chii thinks, therefore Chii is.
We study the building blocks of the brain and the human mind by various definitions, with the goal of creating something that thinks in a relatively similar way to a human being.
Let's start with the three main aspects of mind;
...
Instead of naming people who shouldn't post, we could go with more diplomatically framed limitations around topics:
# Topics we don't want to discuss in this thread
- The question of whether we should even work on such things as AI or cognition
- Beliefs stating that we can't do what we want to do
- Esoteric discussions about "sentience" and "conscience"
- Suggestions about only using deep learning and similar techniques
- Suggestions that we should put all of it into one deep learning model
- Suggestions that we should not use censored or mainstream models at all
- Debates in philosophical lingo
- Comments from people who don't even watch videos on AI, or only on deep learning and tutorials, or only about warnings in regards to the dangers of AI
- Wild ideas on how to do things totally differently but which aren't useful
There are other threads for this: ...
Before I forget it: we need a part of the system which realizes when something about itself is new. If there's a new filter checking for things, then the main system needs to be aware of this, and of how it changes the response. This means the system needs a way to keep track of its own elements. This is, for example, necessary in case of updates that give the system a new skill. Imagine it checks for some risk (health or whatever), but some people don't like it and find it annoying. It should already be aware that it has a new skill and wouldn't have detected a certain pattern before, so it could mention that this is a new perspective, and it could also react fast and deactivate that skill in case of a negative response. A sketch of what I mean is below.
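Here is a minimal sketch of such a self-describing skill registry. All the names (Skill, register, feedback, the health_risk_filter example) are invented here for illustration, not part of any existing framework:

# Self-describing skill registry: the system can enumerate its own parts,
# knows which ones are new, and can switch one off after negative feedback.
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    version: int
    added_recently: bool = True   # cleared once anon has seen it in action
    enabled: bool = True

class SkillRegistry:
    def __init__(self):
        self.skills: dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def new_skills(self) -> list[str]:
        # lets the dialogue layer say "this is a new perspective of mine"
        return [s.name for s in self.skills.values()
                if s.added_recently and s.enabled]

    def feedback(self, name: str, positive: bool) -> None:
        if not positive:
            self.skills[name].enabled = False   # fast reaction to annoyance

registry = SkillRegistry()
registry.register(Skill("health_risk_filter", version=1))
print(registry.new_skills())                    # ['health_risk_filter']
registry.feedback("health_risk_filter", positive=False)
print(registry.new_skills())                    # []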
>>25248 >>25249
Thanks for the suggestions Anon. We can take a shot at the new WIP thread this weekend. Of course this will be an ongoing process for a while as well, so regular suggested improvements will be needed/welcomed.
On that note, I'd suggest you rework the 'list of limitations' yet again. Some of the points made there are still rather subjective and vague. Since this is effectively a 'language lawyering' approach being used, you can bet men will press those boundaries in the future. I'd suggest you try to find some way to tighten up your list with more concrete terms. How you do that, I have no idea personally. But I can tell you from experience that trying to rein in anons' conversations is much like trying to herd cats haha. :^)
>>25253
>'list of limitations'. Some of the points made there are still rather subjective and vague.
I didn't see those as precise rules for enforcement but as guidance. I think most people don't want to read through a long article about that. People who tie everything to sentience and conscience, "something we can't really describe but it's mega important", should know that they're the ones meant. Same for people using lingo only philosophy students know about. On that note, well, I'm not against using some of these terms, but it should lead somewhere and be about implementation.
>>25261
>I think most people don't want to read through a long article about that.
Yes, I absolutely agree, and concision should actually be a strong goal here for all of us. Given my propensity for descriptive wordiness :^), thus my
>"... How you do that, I have no idea personally."
>On that note, well, I'm not against using some of these terms, but it should lead somewhere and be about implementation.
>and be about implementation.
I definitely agree, and I think Kiwi probably feels the same way. The goal here should be an implementation, regardless of how crude it begins. Castles in the Sky(tm) are well and good as far as mental models helpful for thinking things out, but eventually we need to touch back down to Earth and build realworld things. Cheers Anon. :^)
>>25093
I started learning about LangChain and like it, but I haven't really used it seriously yet. Using my other access to chatGPT gave me the tree below about consciousness and self-awareness, based on the questions I asked. I recalled AST as one of the theories, so I explored that one a bit, asked about what is not covered by it and put this into another folder (Other_Subjective_Experience), and additionally asked about Heidegger. Then I also found out that there are a lot of ideas around conscience being about "morals, ethics, values, taking responsibility, ..", which I never looked at that way. The idea I'm generally leaning towards was that it is just the high level of control, the things we think about before we do them. Now I get it: the difference between Consciousness and Conscience actually matters. I'm not a native English speaker, which is why I mixed these up quite often. Conscience is inherently about the part with morals, ethics, values and such.
>>25249
>- Esoteric discussions about "sentience" and "conscience"
should say
- Esoteric discussions about "sentience" and "consciousness"
Self-Awareness_and_Consciousness
├── Consciousness
│   ├── AST
│   │   ├── Attention_as_a_Resource
│   │   ├── Awareness_as_Information
│   │   ├── Consciousness_is_a_Model
│   │   ├── Debates_and_Criticism
│   │   ├── Illusions_of_Awareness
│   │   ├── Information_Processing
│   │   ├── Predictive_and_Explanatory_Role
│   │   ├── Relevance_to_Social_Cognition
│   │   ├── Schema_Creation
│   │   ├── Self_Representation
│   │   └── summary.txt
│   ├── Conscience
│   │   ├── Conflict_Resolution
│   │   ├── Cultural_Influence
│   │   ├── Developmental_Process
│   │   ├── Empathy_and_Compassion
│   │   ├── Ethical_Awareness
│   │   ├── Ethical_Reflection
│   │   ├── Guilt_and_Remorse
│   │   ├── Moral_Compass
│   │   ├── Personal_Integrity
│   │   ├── Responsibility
│   │   ├── summary.txt
│   │   └── Universal_vs_Relative
│   ├── consciousness_VS_conscience.txt
│   ├── Daniel_Dennett.txt
│   ├── David_Chalmers.txt
│   ├── Heidegger
│   │   ├── Conscience
│   │   │   ├── Anxiety_and_Guilt
│   │   │   ├── Authenticity_and_Inauthenticity
│   │   │   ├── Awakening_to_Responsibility
│   │   │   ├── Silent_Call
│   │   │   ├── summary.txt
│   │   │   ├── Temporal_Nature
│   │   │   └── Voice_of_Authenticity
│   │   └── Consciousness
│   │       ├── Authenticity_and_Inauthenticity
│   │       ├── Being-in-the-World
│   │       ├── Dasein
│   │       ├── Dasein's_Existential_Analysis
│   │       ├── Existential_Angst
│   │       ├── Hermeneutics
│   │       ├── Language
│   │       ├── Primordial_Awareness
│   │       ├── summary.txt
│   │       ├── Temporal_Dimension
│   │       ├── Temporal_Structure
│   │       ├── Thematic_and_Non-Thematic_Awareness
│   │       └── World_Disclosure
│   ├── most_acknowleged.txt
│   └── Other_Subjective_Experience
│       ├── Altered_States_of_Consciousness
│       ├── Consciousness_of_Others
│       ├── Emotions
│       ├── Higher_Cognitive_Functions
│       ├── Phenomenal_Binding
│       ├── Qualia
│       ├── Self_Identity
│       ├── Sense_of_Agency
│       ├── Sense_of_Time
│       ├── Sensory_Modalities
│       ├── Subjective_Well-Being
│       └── Transcendental_or_Mystical_experiences
└── Self-Awareness
    ├── Autonoetic_Awareness
    ├── Body_Awareness
    ├── Cognitive_Awareness
    ├── Default_Mode_Network
    ├── Emotional_Awareness
    ├── Emotional_Regulation
    ├── Executive_Functions
    ├── Introspection
    ├── Metacognition
    ├── Perception_of_Agency
    ├── Perspective_Taking
    ├── Resolving_Cognitive_Dissonance
    ├── Self_Consciousness
    ├── Self-Identity
    ├── Self-Monitoring
    ├── Self-Referential_Processing
    ├── Self-Reflection
    ├── Sense_of_Self
    ├── Social_Cognition
    ├── Spatial_Awareness
    ├── summary.txt
    ├── Temporal_Awareness
    └── Theory_of_Mind
Great stuff, Anons. We'll get rolling with combining these suggestions together very soon! Cheers. :^)
>>25315 What does the WIP thread mean?
>Autonomous Cognitive Entity (by Dave Shapiro)
I disagree with his idealism (human rights, lol) and recall some other flaws in the first video, but he has good ideas. I haven't watched the second video yet, it just popped up some minutes ago.
ACE Framework Overview and Intro: Autonomous AI Agents!: https://youtu.be/A_BL_pu4Gtk
ACE Paper is Published! Repo tour! Get involved!: https://youtu.be/oVP_aB5rJL8
Related: Buses: AMQP, AMP or Syslog
Related: On Task: How Our Brain Gets Things Done, by David Badre
>Event-driven Workflows
These are also relevant; I keep running into these terms, since it's about orchestrating a network of machines. Prefecthq and Marvin got mentioned. Some people criticize LangChain for not being more like that. I have of course no intention to use "the cloud", but robowaifus will need many SBCs with redundancy and communication in between. Also, it's about running a lot of parallel programs on one system. So it might be worth looking into, but it also comes with a lot of extra complexity:
- Event-driven Architecture: https://youtu.be/gOuAqRaDdHA
- Getting Started with Workflow Orchestration: https://youtu.be/AjYHBwH2Mtc
- Dynamic Event-driven Workflows with Prefect Cloud: https://youtu.be/hVD1S2suC48
- https://elixirforum.com/t/what-are-you-building-your-event-driven-elixir-systems-on/54517
- Kafka, RabbitMQ streams, or messaging systems with pubsub semantics
>Cyc
Another one of the related projects. Some of the ideas I had heard before; I read about this whole thing years ago. Then there are some ideas I didn't remember related to that, which I think I also had on my own, but the Cyc team is of course way ahead. Ontologies and ontology engineering really matter.
>Douglas Lenat: Cyc and the Quest to Solve Common Sense Reasoning in AI | Lex Fridman Podcast #221
https://youtu.be/3wMKoSRbGVs
https://en.wikipedia.org/wiki/Cyc
>1:11 - What is Cyc?
>9:17 - How to form a knowledge base of the universe
>19:43 - How to train an AI knowledge base
>24:04 - Global consistency versus local consistency
>48:25 - Automated reasoning
>54:05 - Direct uses of AI and machine learning
>1:06:43 - The semantic web
>1:17:16 - Tools to help Cyc interpret data
>1:26:26 - The most beautiful idea about Cyc
>1:32:25 - Love and consciousness in AI
>1:39:24 - The greatness of Marvin Minsky
>1:44:18 - Is Cyc just a beautiful dream?
>1:49:03 - What is OpenCyc and how was it born?
>1:54:53 - The open source community and OpenCyc
>2:05:20 - The inference problem
>2:07:03 - Cyc's programming language
>2:14:37 - Ontological engineering
>2:22:02 - Do machines think?
>2:30:47 - Death and consciousness
>2:40:48 - What would you say to AI?
>2:45:24 - Advice to young people
OpenCyc for download: https://sourceforge.net/projects/opencyc/
>>25408 the 'not' makes it grandiose
>>25408
>your excuse for not doing jack shit once the robot is done
I assume this question was targeted at me?
- There will not be "the one" robot
- Development will go on after the first one
- I'm still working on it, but took a bit of a break, since I have to do something else which I procrastinated on. That said, I will likely do a little bit of work soon, hopefully finishing some things which are close to done. After that, I will need to take another break, unfortunately. The next few months are about learning more about AI, and about programming when I work on something related to it, which I still do. I hope I can at least get my simple body design into something, and some BLDC servo with gear running, by the end of the year.
- I don't know what my "excuse" would be, since I don't need one. Probably rather an explanation to myself, so I can improve on it. That said, the hot summer, with me having issues with heat, would be part of it.
- It's really time to get the AI development going in parallel; it's too important and exciting to miss out on. I'm glad that I got my mind back into being interested in programming, instead of more general topics.
>>25411
Oh wow, that was harsh even for me. Lol, sorry, I was drunk.
>>25424 Please post a summary next time, maybe with your own thoughts, not just a link.
>Minecraft AI - NVIDIA uses GPT-4 to create a SELF-IMPROVING autonomous agent
https://youtu.be/7yI4yfYftfM
>AgentVerse - Society of AI Minds
https://youtu.be/cbqE6PC9fGQ
> An AI agent and a cognitive architecture are two related but distinct concepts in the field of artificial intelligence (AI).
> 1. AI Agent:
> - An AI agent is a software program or system designed to perform tasks or make decisions in an environment. It can be thought of as an entity that perceives its environment through sensors, processes information, and takes actions to achieve specific goals.
> - AI agents can range from simple rule-based systems to more complex machine learning models, such as deep neural networks. They are often used in applications like robotics, autonomous vehicles, virtual assistants, and game-playing agents.
> - The primary focus of AI agents is on task-specific behavior and achieving specific objectives or solving particular problems. AI agents may not possess human-like cognitive capabilities but can excel in narrow domains.
> 2. Cognitive Architecture:
> - A cognitive architecture, on the other hand, is a theoretical framework or model that attempts to describe and explain human-like cognitive processes and behavior. It aims to capture the underlying structure and mechanisms of human cognition, including perception, memory, reasoning, and decision-making.
> - Cognitive architectures are typically developed in the field of cognitive science and psychology to study and simulate human cognitive processes. They serve as a way to understand how humans think, learn, and process information.
> - Examples of cognitive architectures include ACT-R (Adaptive Control of Thought - Rational), Soar, and CLARION. These models provide a structured representation of cognitive processes and can be used to simulate human-like decision-making and problem-solving.
> In summary, the key difference between an AI agent and a cognitive architecture lies in their purpose and scope. An AI agent is a practical system designed to perform specific tasks or solve problems in the real world, while a cognitive architecture is a theoretical framework used to model and simulate human-like cognitive processes for the purpose of understanding human cognition. AI agents may utilize cognitive architecture principles or be inspired by them, but they are typically designed to achieve practical goals, whereas cognitive architectures are primarily concerned with modeling human cognitive behavior.
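To make the "perceives, processes, acts" part concrete, here's a minimal sense-think-act loop. The sensor and actuator functions are stubs I made up; a real agent would replace them with actual hardware or API calls:

# Minimal agent loop: read sensors, decide based on a goal, act.
import random

def read_sensors() -> dict:
    return {"anon_present": random.random() > 0.5}   # stub sensor

def decide(percept: dict, goal: str) -> str:
    if goal == "greet" and percept["anon_present"]:
        return "say_hello"
    return "idle"

def act(action: str) -> None:
    print("action:", action)                         # stub actuator

for _ in range(3):                                   # the agent loop itself
    act(decide(read_sensors(), goal="greet"))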
>There are a few elements which add to the complexity of an application that uses Langchain:
Summary from https://www.reddit.com/r/LangChain/comments/16l2u62/comment/k10z8w7/
> RAG Pipeline: Langchain simplifies document handling, including chunking, splitting, tokenizing, embedding, and ingestion (via vectordb), as part of a basic pipeline.
> Advanced Pipeline Considerations:
> Streaming Responses: Enhance user experience by implementing streaming responses, especially for users with lower technical proficiency.
> Memory/Conversation History: Store conversation history separately, potentially in a database, and employ compression methods for efficient retrieval.
> Data Preprocessing: Explore advanced data preprocessing techniques like chunking, abstracting metadata, cleaning text, and rearranging data to uncover deeper context.
> Reference Source Materials: Include the ability to retrieve source materials alongside RAG output, define specific metadata attributes, and access various metadata sources.
> Advanced Indexing: Allow manipulation of the index after data upload, including adding new columns and rows for better data organization.
> VectorDB: Choose a vector database tailored to your use case, considering its features and functions.
> Embedding Models: Evaluate embedding models beyond OpenAI's ADA, such as HuggingFace models with specific training for your use case.
> Fine-Tuning Models: Consider fine-tuning chat models to enhance retrieval and output relevancy.
> Parsing & Prompt Engineering: Abstract specific information for downstream tasks and utilize specific prompting techniques for more effective interactions.
> Tools and Agent Capabilities: Assess whether your model can perform tasks beyond chat, such as internet searches, tool selection, and using multiple models or experts to provide the best responses.
> Production App Expectations: Understand that users expect more than a basic pipeline if your application aims for a production-level user experience.
> Front-End Features: Consider incorporating front-end features that allow users to customize and personalize their experience.
> Information Transformation: Explore transformations after data upload, including named-entity recognition and integrating machine learning components for better content detection.
> These considerations highlight the complexity and potential enhancements beyond a basic Langchain pipeline, addressing user expectations in a production application.
> The list is not exhaustive, and there may be additional factors to consider based on specific use cases and requirements.
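The retrieval core of such a RAG pipeline is actually small. A bare-bones sketch without LangChain (the sentence-transformers model name is just a commonly used small default, not a recommendation; requires pip install sentence-transformers numpy):

# Bare-bones retrieval step of a RAG pipeline: embed the chunks once,
# embed the query, return the closest chunks to stuff into the LLM prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Chii's batteries are charged at the dock in the hallway.",
    "Anon prefers green tea in the evening.",
    "The capital of France is Paris.",
]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q        # cosine similarity, since vectors are normalized
    return [chunks[i] for i in np.argsort(-scores)[:k]]

print(retrieve("what does anon like to drink?"))

Everything else in that reddit list (streaming, memory compression, reranking, source references) is layered around this one embed-and-compare step.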
I won't quote these texts anymore, since it looks ugly and is extra work.
ACT-R (Adaptive Control of Thought - Rational), Soar, and CLARION are all cognitive architectures used in the field of artificial intelligence and cognitive science to model and simulate human cognitive processes. Each of these architectures has its own approach and set of principles for understanding and replicating human cognition. Here's an overview of each:
1. ACT-R (Adaptive Control of Thought - Rational):
- Overview: ACT-R is a cognitive architecture developed by John R. Anderson and his colleagues. It is designed to simulate various aspects of human cognition, including memory, learning, reasoning, and problem-solving.
- Key Principles:
- Production System: ACT-R is based on a production system, which consists of condition-action rules (productions) that govern behavior. These productions represent the knowledge and rules that guide cognitive processes.
- Modularity: ACT-R models human cognition as a collection of modular subsystems, each responsible for specific cognitive functions. These subsystems include the declarative memory, procedural memory, and visual and motor modules.
- Sensory-Motor Integration: ACT-R incorporates sensory and motor processes to simulate interactions with the external world.
- Applications: ACT-R has been used to model a wide range of cognitive tasks, including problem-solving, language comprehension, and decision-making. It has applications in psychology and cognitive science research.
2. Soar:
- Overview: Soar is another cognitive architecture that aims to simulate human cognition. It was developed by Allen Newell and John Laird. Soar is designed to represent and process symbolic knowledge to achieve intelligent behavior.
- Key Principles:
- Unified Memory: Soar uses a unified memory architecture, where both declarative (facts) and procedural (rules) knowledge are stored in a single memory structure. This enables the system to integrate various types of knowledge.
- Problem Space Search: Soar employs problem space search techniques to find solutions to complex problems. It uses operators and states to represent problem-solving steps.
- Learning Mechanisms: Soar includes learning mechanisms that allow it to adapt and acquire new knowledge through experience.
- Applications: Soar has been used in applications such as natural language understanding, game-playing agents, and robotics. It has also been used to model human performance in cognitive tasks.
3. CLARION:
- Overview: CLARION is a cognitive architecture developed by Ron Sun. It is characterized by its dual-process theory, which posits that cognitive processes involve both explicit, conscious reasoning and implicit, subconscious learning and inference.
- Key Principles:
- Dual-Process Theory: CLARION distinguishes between a symbolic subsystem (explicit) and a connectionist subsystem (implicit) to model different aspects of cognition.
- Hierarchical Structure: CLARION represents knowledge in a hierarchical manner, allowing for abstraction and generalization.
- Reinforcement Learning: CLARION incorporates reinforcement learning to adapt and improve cognitive performance over time.
- Applications: CLARION has been applied in various domains, including cognitive modeling, machine learning, and artificial intelligence. It has been used to study human decision-making, problem-solving, and learning.
These cognitive architectures provide valuable tools for researchers to study and understand human cognition, as well as to develop AI systems that exhibit more human-like cognitive capabilities. Each architecture has its own strengths and focus areas, making them suitable for different types of cognitive modeling tasks.
As of my last knowledge update in September 2021, the availability and open-source status of cognitive architectures like ACT-R, Soar, and CLARION can vary. Here's the status as of that time:
1. ACT-R (Adaptive Control of Thought - Rational):
- ACT-R is primarily developed by the ACT-R Research Group at Carnegie Mellon University. While there are some resources and educational materials available to the public, the core ACT-R software and source code were not typically available as open source. Users could access ACT-R through educational licenses and research collaborations.
2. Soar:
- Soar is open source and publicly available. The Soar community provides access to the Soar source code and documentation for research and development purposes. Researchers and developers can freely download, use, and modify the Soar code.
3. CLARION:
- CLARION was publicly available for research purposes, but the availability and licensing terms may vary depending on specific versions and modules. Researchers could access CLARION for academic and research projects, but it might not have been as widely adopted or documented as other architectures like Soar.
Please note that the availability and licensing terms of software and research projects can change over time. To get the most up-to-date information on the availability and open-source status of these cognitive architectures, it's advisable to visit the official websites or repositories of the respective projects or check with the research groups or organizations associated with them. Additionally, since my knowledge is based on information available up to September 2021, there may have been developments or changes in the status of these architectures since then.
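To make "production system" less abstract, here's a toy condition-action loop. The rules and the set-of-strings working memory are invented for illustration; real ACT-R adds subsymbolic activations, buffers and timing on top of this core idea:

# Toy production system: condition-action rules forward-chaining over a
# working memory of facts, until nothing new fires.
working_memory = {"anon_yawned"}

rules = [
    # (name, condition facts, facts added when the rule fires)
    ("infer_tiredness", {"anon_yawned"},   {"anon_is_tired"}),
    ("offer_tea",       {"anon_is_tired"}, {"goal_make_tea"}),
]

fired = set()
changed = True
while changed:
    changed = False
    for name, cond, effect in rules:
        if name not in fired and cond <= working_memory:  # condition matches
            working_memory |= effect
            fired.add(name)
            changed = True
            print("fired:", name)

print(working_memory)
# {'anon_yawned', 'anon_is_tired', 'goal_make_tea'}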
Open file (128.37 KB 626x331 Screenshot_131.png)
Open file (101.16 KB 640x287 Screenshot_130.png)
Open file (132.29 KB 480x333 Screenshot_129.png)
Open file (178.46 KB 816x357 Screenshot_128.png)
Open file (188.06 KB 865x396 Screenshot_125.png)
>more modular architecture that decomposes the functions of existing LLMs and adds several additional components. We believe this alternative can address all of the shortcomings of LLMs. We will speculate about how this modular architecture could be built through a combination of machine learning and engineering.
The talk is a bit too much about "unacceptable" outputs and such, but has useful ideas: https://youtu.be/cEyHsMzbZBs
Related:
- >>25764 (some ideas on modularity)
- Nell: https://en.wikipedia.org/wiki/Never-Ending_Language_Learning
- Nell Data to download: http://rtw.ml.cmu.edu/rtw/resources
Open file (219.45 KB 860x437 Screenshot_127.png)
>>25767 Missing picrel
Okay, so you guys are dead set on the AI aspect. Here is an Unreal Engine file with the MaidCom project mannequin. I picked Blueprints, my bad; you can change it to C++ if you start another project. The MaidCom mannequin is still there as a .uasset. Anyways, give it sensors, give it AI via an API or however you want to get it done, do what you have to do to make it simulate the real world. This aspect of the project is indeed useful if people are willing to do something. In the meantime I'm focusing on other aspects.
https://drive.google.com/file/d/1g8Pm34920j12vuFY47G_Y3rVSd3-6b-Y/view?usp=sharing
Ensemble techniques are used to improve the performance and robustness of AI systems by combining the predictions or outputs of multiple individual models. Here's a list of ensemble techniques commonly used in AI systems (a minimal voting/averaging sketch follows the list):

1. Voting Ensembles:
- Majority Voting: Combines predictions by selecting the most common class label or decision among the models.
- Weighted Voting: Assigns different weights to models, and predictions are combined based on these weights.
2. Averaging Ensembles:
- Simple Average: Averages the predictions of multiple models to obtain the final output.
- Weighted Average: Assigns different weights to models and computes a weighted average of their predictions.
3. Stacking Ensembles:
- Stacking: Trains a meta-model on the predictions of base models, learning to combine their outputs effectively.
- Blending: Similar to stacking, but divides the dataset into training and validation sets for the meta-model.
4. Bagging Ensembles:
- Bootstrap Aggregating (Bagging): Trains multiple models on different subsets of the training data and combines their predictions (e.g., Random Forest).
5. Boosting Ensembles:
- AdaBoost: Adjusts the weights of misclassified instances to give more focus to challenging samples.
- Gradient Boosting: Builds models sequentially, with each model correcting the errors of the previous one (e.g., XGBoost, LightGBM).
6. Bootstrapped Ensembles: Creates multiple subsets of the training data through bootstrapping and trains models on each subset (e.g., Bagged Decision Trees).
7. Randomization Ensembles:
- Random Forest: Combines multiple decision trees, where each tree is trained on a different subset of features and data instances.
8. Model Selection Ensembles: Selects the best-performing model from a set of candidate models based on performance metrics.
9. Expert Ensembles: Combines the predictions of different specialized models, each of which excels in a specific subtask.
10. Ranking Ensembles: Combines models to produce ranked lists of items, often used in recommendation systems.
11. Diversity-Based Ensembles: Focuses on selecting models that are diverse in their predictions or characteristics to reduce bias and improve overall performance.
12. Fusion Ensembles: Combines models with different modalities, such as text, image, and audio, to make decisions based on multi-modal data.
13. Hybrid Ensembles: Combines different types of models, such as neural networks, decision trees, and linear models, into a single ensemble for improved performance.
14. Bayesian Ensembles: Utilizes Bayesian methods to estimate the posterior distribution of model parameters and predictions.

These ensemble techniques are valuable tools in machine learning and AI systems, allowing for improved accuracy, robustness, and generalization across a wide range of tasks and domains. The choice of ensemble technique depends on the problem at hand and the characteristics of the base models being used.
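A minimal sketch of the first two families (majority voting and weighted averaging), with hard-coded stand-in "models"; in practice these would be trained classifiers or regressors:

from collections import Counter

# Stand-in "models" -- in practice these would be trained predictors.
def model_a(x): return "cat"
def model_b(x): return "dog"
def model_c(x): return "cat"

def majority_vote(models, x):
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

def weighted_average(preds, weights):
    # preds and weights are parallel lists of numbers
    return sum(p * w for p, w in zip(preds, weights)) / sum(weights)

print(majority_vote([model_a, model_b, model_c], x=None))  # -> "cat"
print(weighted_average([0.2, 0.9, 0.4], [1.0, 3.0, 1.0]))  # -> 0.66

The stacking, boosting, and bagging variants keep this same shape; they differ in how the base models are trained and in learning the combiner instead of hand-setting it.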
I'm currently trying to make a list of all agent systems, RAG systems, cognitive architectures, and similar, then collect data on their features and limitations, as many points of distinction as possible, opinions, ...
- Auto-GPT: github.com/Significant-Gravitas/Auto-GPT
- AutoGen: github.com/microsoft/autogen
-- based on github.com/microsoft/FLAML
-- youtu.be/YRbLmDjrjsc
- BASI: github.com/oliveirabruno01/babyagi-asi
- BabyAGI: github.com/yoheinakajima/babyagi
- GripTape: griptape.ai
- Jarvis: github.com/microsoft/JARVIS
- LangChain: docs.langchain.com/docs
- LlamaIndex: github.com/run-llama/llama_index
- Open-Assistant: github.com/LAION-AI/Open-Assistant
- Rasa: github.com/RasaHQ
- Semantic Kernel: github.com/microsoft/semantic-kernel
- SmartGPT: github.com/Cormanz/smartgpt
- txtai and txtchat: github.com/neuml/txtai
- tinyLLM: github.com/zozoheir/tinyllm
- tinylang: github.com/astelmach01/tinylang
- llmware: github.com/llmware-ai/llmware (easiest?)
-- auto sets up Mongo and Milvus
-- modular, can use Pinecone etc.
- quivr: github.com/StanGirard/quivr
-- generative AI to store and retrieve unstructured information
# MoE / Domain Discovery / Multimodality
- github.com/SkunkworksAI/hydra-moe
- arxiv.org/abs/2303.14177
- arxiv.org/abs/2208.03306
- arxiv.org/abs/2111.02358
- colab.research.google.com/#fileId=https%3A//huggingface.co/datasets/crumb/Wizard-EvolInstruct70k-k4/blob/main/MoLora_7b_(PROOF_OF_CONCEPT).ipynb
# Chatbots and Conversational AI
- BondAI: github.com/krohling/bondai
- BeeBot: github.com/AutoPackAI/beebot
- IncarnaMind: github.com/junruxiong/IncarnaMind
# Machine Learning and Data Processing
- NeMo-Guardrails: github.com/NVIDIA/NeMo-Guardrails
- Haystack: github.com/deepset-ai/haystack
- EdgeChains: github.com/arakoodev/EdgeChains
# Frameworks for Advanced AI, Reasoning and Cognitive Architectures
- ACT-R (Adaptive Control of Thought - Rational)
- Soar
- CLARION
- OpenCog: github.com/opencog
- Dave Shapiro: youtube.com/@4IR.David.Shapiro
- Some guys from IBM Watson worked on one (forgot the name)
- Cyc: en.wikipedia.org/wiki/Cyc
# Structured prompt system
- huggingface.co/Tostino/Inkbot-13B-8k-0.2
# RWKV
- huggingface.co/blog/rwkv
- johanwind.github.io/2023/03/23/rwkv_overview.html
# Agents in a Virtual Environment
- MineRL: minerl.io
- Malmo: github.com/microsoft/malmo
- AgentVerse: github.com/OpenBMB/AgentVerse
# Comments and Comparisons (probably outdated)
- /r/ChatGPT/comments/12cql0c/autogpt_vs_babyagi/
- /r/AutoGPT/comments/15jrs4n/autogpt_is_failing_to_acomplish_its_goals/
# Some Benchmarks
- github.com/Significant-Gravitas/Auto-GPT-Benchmarks
# Curated Lists and AI Search
- github.com/e2b-dev/awesome-ai-agents
- github.com/EmbraceAGI/Awesome-AGI
- efficientnlp.com/model-chat
# Recommended Tutorials
- RAG: gpt-index.readthedocs.io/en/latest/examples/low_level/oss_ingestion_retrieval.html
- RAG: python.langchain.com/docs/expression_language/cookbook/retrieval
>>25776
Thanks, that's interesting. But I'm more concerned with a basic understanding of a conversation right now. I'll link this in the thread about simulation >>25787
> How could proofs whose correctness can be checked and certified by the compiler be useful for AI?
Proofs whose correctness can be checked and certified by a compiler are a fundamental concept in the field of formal methods and are useful for various applications in artificial intelligence (AI). Here are some ways in which they can be beneficial:
1. Safety-Critical Systems: In AI applications where safety is critical, such as autonomous vehicles, medical devices, or industrial control systems, formal proofs can ensure that the AI system behaves correctly and doesn't produce dangerous or unexpected behaviors. Compiler-certified proofs can guarantee that the code adheres to safety specifications.
2. Verification of AI Algorithms: AI algorithms often involve complex mathematical operations and optimizations. Proofs can be used to verify the correctness of these algorithms, ensuring that they perform as intended and that they don't introduce biases or errors.
3. Robustness and Security: AI systems can be vulnerable to attacks and adversarial inputs. Formal proofs can help verify that an AI model or system is robust against such attacks by formally specifying security properties and ensuring they are upheld during execution.
4. Debugging and Testing: Compiler-certified proofs can be used to identify and fix issues in AI code. When a proof fails, it can pinpoint the exact location of the problem, making debugging more efficient.
5. Maintainability: As AI systems evolve and undergo updates, it's important to ensure that modifications don't introduce new bugs or vulnerabilities. Formal methods can provide guarantees that code changes preserve the system's correctness, even as it evolves over time.
6. Explainability and Accountability: In some AI applications, particularly those involving decision-making, being able to formally prove certain properties can help in explaining why a particular decision was made. This can be important for accountability and transparency.
7. Certifying Neural Network Properties: For deep learning models, formal methods can be used to certify properties of neural networks, such as robustness against adversarial examples or fairness in decision-making.
8. AI Ethics: In applications where AI is involved in ethical decision-making, formal proofs can be used to ensure that the system adheres to ethical guidelines and doesn't discriminate against certain groups or exhibit biased behavior.
9. Regulatory Compliance: In highly regulated industries like healthcare or finance, having formal proofs of correctness can help in complying with strict regulatory requirements, providing assurance that the AI systems meet certain standards.
10. Education and Research: Compiler-certified proofs can be valuable for educational purposes, allowing students and researchers to understand and experiment with AI algorithms while ensuring correctness.
In summary, proofs whose correctness can be checked and certified by compilers are a powerful tool for enhancing the reliability, safety, and security of AI systems. They can help mitigate risks, improve accountability, and provide a solid foundation for the development and deployment of AI technologies in various domains.
> Which language is used for such proofs whose correctness can be checked and certified?
...
> And which one is most used in AI?
The choice of which formal methods and proof-assistant tool to use in AI applications can vary depending on the specific needs and preferences of researchers and practitioners.
There isn't a single tool that is universally the "most used" in AI, as different projects may benefit from different tools and languages based on their requirements. However, some tools have gained popularity in AI-related research and applications:
1. Coq: Coq is widely used in AI research, especially in areas where formal verification and correctness proofs are crucial. It has been applied to verify properties of AI algorithms, including machine learning models, and to ensure the reliability and safety of AI systems.
2. Isabelle/HOL: Isabelle and its higher-order logic (HOL) language have also seen use in AI research, particularly for formal verification and theorem proving in AI-related domains.
3. Lean: Lean has gained popularity in both the AI and formal methods communities. It's known for its readability and is used for formalizing mathematics and verifying software, which can be relevant in AI.
4. SMT-LIB and SMT Solvers: In AI, Satisfiability Modulo Theories (SMT) solvers are often used for automated reasoning tasks. While not a proof assistant in the traditional sense, the SMT-LIB language and associated solvers are used extensively for certain types of AI-related verification and decision-making tasks.
The choice of tool often depends on factors such as the specific problem being addressed, the complexity of the verification task, the background and expertise of the researchers or engineers, and the tool's ecosystem and community support. In practice, researchers and practitioners may use a combination of tools and languages to address different aspects of AI, ranging from verifying algorithms to ensuring the safety and security of AI systems.
----------------------
https://en.wikipedia.org/wiki/Coq
https://en.wikipedia.org/wiki/Isabelle_(proof_assistant)
https://en.wikipedia.org/wiki/Lean_(proof_assistant)
https://en.wikipedia.org/wiki/Satisfiability_modulo_theories#SMT_solvers
https://en.wikipedia.org/wiki/Z3_Theorem_Prover
https://en.wikipedia.org/wiki/OpenCog
https://en.wikipedia.org/wiki/ACL2
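For a taste of what "the compiler checks the proof" means in practice, here is a minimal Lean 4 sketch (assuming a recent Lean 4 toolchain; Coq and Isabelle examples look similar in spirit). If either proof were wrong, the file simply would not compile, so the guarantee travels with the code:

-- Addition on naturals is commutative; `Nat.add_comm` is a library lemma,
-- and the compiler verifies the whole term.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A property of our own: appending lists adds their lengths.
theorem len_append (xs ys : List Nat) :
    (xs ++ ys).length = xs.length + ys.length := by
  simp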
>>25777 >>25782 >>25882 Nice contributions, anons! Please let me or Kiwi know of new edits in the WIP thread (>>25368) you'd like added. Cheers. :^)
>>25892 What is the WIP thread? Some thread where only we two can post?
>>25908
>Some thread where only we two can post?
So, Kiwi and yourself are the two primary 'authors' of that thread, NoidoDev. But I consider this all a group effort of course (meaning every regular & part-time participant of the board should speak up on this topic in general). I think we all agreed it should be a read-only asset for the board at large (cf. >>24936, >>24940, >>24942, ...); at the time I was the only one capable of creating a thread like that and managing its edits. Now of course, Kiwi is a mod on /robowaifu/ and he can do so as well (and I highly recommend that approach instead BTW :^). I hope that's understandable Anon. Just ask further if more clarification is needed (or if you consider such questions beneficial for the board's community at large). Cheers. :^)
>===
-add'l crosslinks
-minor edit
Edited last time by Chobitsu on 10/11/2023 (Wed) 17:51:32.
>>25908
>>25919
I will happily add amended information to the main thread if it will benefit the implementation of consciousness. I am sceptical, and will add information to a more relevant thread if it isn't directly related to consciousness.
>>25925 The thread title isn't "consciousness" but cognitive architecture. There is no agreed definition of consciousness and there won't be one. >>25919 Okay, I will simply go on using this thread here.
>>25930 Consciousness in a cognitive sense, which is to say having a persona or knowledge of oneself. For now, your posts have been correctly placed here and are greatly appreciated. Please keep it up, NoidoDev. I'm curious, what software would you use for cognition? You've posted many, but which one would you want to see used in particular?
>>25940
>which is to say having a persona or knowledge of oneself.
I'd call this self-awareness.
>what software would you use for cognition?
I don't understand. Anyone doing this will have to use a lot of different parts, some from other people and some created by oneself. If you mean a more detailed plan, I don't know yet. I should probably work on that. Though, the earlier I do that, the more likely it is that more elements are going to change later.
>>25947
The basic idea is that input is taken by some API; at the beginning it's just text, later coming from something like "whisper.cpp". Since I won't optimize it for chat, I won't care about correcting typos and such. Audio transcriptions have priority. The input script needs to be able to easily send the input to different receivers, which will go on with processing. Then it should go through different chains, which might also change depending on the input.
- Starting with finding misunderstandings, for example terms which are not understood by Whisper.
- I definitely want something that is fast at understanding a lot of sentences, using more traditional language technology: NLTK, something like "BERT" for parsing and understanding, pattern matching, and such. Also taking context about objects and other elements from a graph database.
- It needs to be extendable with new "fields" for context and other inputs at the same time, e.g. detected gestures and emotions.
At some point it goes to one of the LLMs running, using something like "llama.cpp". The software needs to be extendable to run more than one model at the same time, also on different computers, and later to be able to trigger commands loading and unloading models based on the context. It also needs to have the right prompts to ask about an input, and maybe be able to create new prompts, store them, and evaluate the responses later. The response from a model also needs to be analyzed quickly by the various methods described above for the input.
An alternative route from input to output is that the response, or parts of it, comes from something like scripted responses, e.g. AIML. These responses can also be created by the system itself, not a human, while going over past conversations, and stored for later use. The NLU should be fast enough to point to the right response, especially if it is something recurring. Then the system also needs to have a mental model of the world, like an inventory of things it has, things around it, and so on.
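A minimal sketch of the fan-out idea described above, assuming plain Python callables as receivers (the receiver names are made up): the front end doesn't care whether a receiver is a fast pattern matcher or a slow LLM call, it just hands the text to each of them.

receivers = []

def register(fn):
    """Collect receiver functions; every input is fanned out to all of them."""
    receivers.append(fn)
    return fn

@register
def fast_pattern_match(text):
    if "what time" in text.lower():
        return "pattern: time question"
    return None  # no opinion, let slower stages handle it

@register
def slow_llm_stub(text):
    # Placeholder for a llama.cpp- or whisper.cpp-backed call.
    return f"llm would be asked about: {text!r}"

def dispatch(text):
    for fn in receivers:
        result = fn(text)
        if result is not None:
            print(result)

dispatch("What time is it?")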
>>25950 I agree with most of your ideas. Scripted responses are a good idea. We have ignored how important it is for her to have reflexive responses. Would help her have some extra seconds to allow her cognitive processes to catch up. We have a similar method for instant reactions, with our minds tricking us into thinking our reflexes are cognitive reactions. Do you think we need to trick her somehow? A mental world model is definitely required. Both for its obvious utility, and to allow her to separate herself from the world. Related to this, separation of hands and objects is useful as seen in this gesture awareness project. https://github.com/zhongyi-zhou/GestureIMT
Here's some code to get each pressed key in Python, based on https://stackoverflow.com/questions/510357/how-to-read-a-single-character-from-the-user
I made this into a demo for an interface that gets the input as fast as possible and directly sends it to a list of other functions. This is currently just a placeholder inside "input_multiplexer", which prints the input.
- CTRL-S sends the input, since return is already used in the shell
- CTRL-X/C/Q ends the input dialog
- Backspace/DEL works for corrections
- list(getch())[0] while pressing some special key gives you the char code, which can then be used for extending the functionality
- Any function called by input_multiplexer can use the input while it is typed, just like the print function

class _Getch:
    """Gets a single character from standard input. Does not echo to the screen."""
    def __init__(self):
        try:
            self.impl = _GetchWindows()
        except ImportError:
            self.impl = _GetchUnix()

    def __call__(self):
        return self.impl()


class _GetchUnix:
    def __init__(self):
        import tty, sys

    def __call__(self):
        import sys, tty, termios
        fd = sys.stdin.fileno()
        old_settings = termios.tcgetattr(fd)
        try:
            tty.setraw(sys.stdin.fileno())
            ch = sys.stdin.read(1)
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
        yield ch


class _GetchWindows:
    def __init__(self):
        import msvcrt

    def __call__(self):
        import msvcrt
        yield msvcrt.getch()  # note: returns bytes on Windows, not str


getch = _Getch()


def input_interface(user_input="", user_input_str=""):
    while 42 == 42:
        user_input = list(getch())[0]
        if user_input == "\x13":  # CTRL-S: send the buffer, then clear it
            input_multiplexer("input send")
            user_input_str = ""
            continue  # don't append the control char to the buffer
        if user_input in ['\x11', '\x18', '\x03']:  # CTRL-Q/X/C: exit
            input_multiplexer("exited")
            return
        if user_input != '\x7f':  # anything but backspace: append
            user_input_str = user_input_str + user_input
        else:  # backspace: drop the last char
            user_input_str = user_input_str[:-1]
        input_multiplexer(user_input_str)


def input_multiplexer(input_str):
    print(input_str)
>>25954
>I agree with most of your ideas.
Anything you don't? I mean, we don't need to have a falling-out over this; feel free to mention it. I believe it's theoretically possible that I could be wrong about something. More importantly, I wrote my example code above in a way that lets people use different modules. We really don't need to agree on everything. I want a lot of configurability.
>Would help her have some extra seconds to allow her cognitive processes to catch up.
Or minutes, given how slow some LLMs are, lol. In many cases it's simply not necessary to use that. If you tell her to give you something which is closer to her than to you, she should understand that pretty fast: having a list of objects around her in some inventory, understanding the command, and maybe some additional gesture. Same goes for taking another position in bed. I want to use some kind of pattern matching for that. She should know what sentences and commands mean in certain ways, like what is anticipated and what state or inventory would change by following a certain command. Maybe some things could be done in LLMs as well, but it would all need to go into the context, including vector databases, if I understand it correctly. I only want to use this in parallel, but she would often act based on code if something is very clear.
>with our minds tricking us into thinking our reflexes are cognitive reactions. Do you think we need to trick her somehow?
I wouldn't call this "tricking", it's just a lack of awareness, since it isn't necessary. Yes, I think the robowaifus should internally prepare responses and even actions before they completely understand the input. (Aside from any deeper analysis of the input much later.) She could also start talking and then change her response if it was wrong, since humans do that as well. We just need a way to stop the response at the end of a syllable.
>separation of hands and objects is useful as seen in this gesture awareness project.
https://github.com/zhongyi-zhou/GestureIMT
Interesting, I'll take a look and keep it in mind.
>>25976
You can also turn on mouse input by printing \033[?1000h. Not sure how common this is though; it works on most Linux terminals. You get the input as something like \033[M followed by (mouse button)(x coord)(y coord).
>>25978
Thanks for the idea, but it didn't work for me while running the script in IPython. Also, I think I won't need that; this is supposed to be replaced by voice input anyways. I want to run a fast loop which would just catch words and some syllables, while other loops would take more time and catch sentences or longer parts of them.
>>25976
Nice addition, we will need several algorithms to analyze all inputs for proper updating of behaviour-relevant variables.
>>25977
>Anything you disagree with?
Nothing worth mentioning. I have to catch up on this subject once her body is done. I am still primarily focused on her physical body before her psychic body. (Excuse my pun.)
>Minutes
I fully intend to run a convolution of tiny LLMs like NanoPhi (https://github.com/VatsaDev/NanoPhi) to achieve instant responses when needed. We shouldn't be too far from the time when an MCU can run its own LLM in real time using dedicated accelerators.
>Majority of actions don't need an LLM
This is true. A major part of a cognitive system is offloading everything onto the systems best suited to the task. It is important that everything updates her state variables as events happen. For example, if she trips while attempting an action, it should interrupt her current loop. A new loop for checking the states of herself and her surroundings should then run. She goes to get a beer. On her way back to you, she trips on a cord. This breaks her attention away; she now checks to ensure the beer is ok and that she can move freely. While doing so, her map of the house can have a new obstacle placed, which may affect her responses etc. I like your idea of her having an "inventory" file to store states of objects relative to her. We will need a way for her to automatically generate rich details for context too. Perhaps a CSV with color, distance, relative distance, synonyms, etc. (a minimal sketch of such an inventory follows below).
>Correcting herself when wrong
This would be difficult. I presume you'd have her analyze her response with other algorithms as she implements the results of smaller/faster ones? There's potential there. I could see it getting annoying though; perhaps she only does this if her initial response receives negative feedback?
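A minimal sketch of that inventory idea, assuming a plain CSV as suggested (the columns and the trip handler are invented for illustration):

import csv, io

# Toy object inventory: one row per tracked object.
INVENTORY_CSV = """name,color,distance_m,synonyms
beer,amber,0.4,lager;bottle
cord,black,1.2,cable;wire
"""

def load_inventory(text):
    return {row["name"]: row for row in csv.DictReader(io.StringIO(text))}

inventory = load_inventory(INVENTORY_CSV)

def on_trip(obstacle):
    """Interrupt handler: flag the obstacle, check the nearest held object."""
    inventory[obstacle]["is_obstacle"] = "yes"
    held = min(inventory.values(), key=lambda r: float(r["distance_m"]))
    print(f"check that the {held['name']} is ok; mapped new obstacle: {obstacle}")

on_trip("cord")

The point is only that the world model stays a dumb, queryable table that every loop (reflex, planner, LLM context builder) reads from and writes to.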
>>25982
>This would be difficult. I presume you'd have her analyze her response with other algorithms as she implements the results of smaller/faster ones?
Not sure if I understand the last sentence. One thing is, if she answers with a stored/pre-scripted response, the meaning of this should be stored as well. And what newly constructed sentences mean should also be known. The ones coming from an LLM need to be analyzed fast, and maybe a different context is detected very quickly, indicating that the ongoing response is inadequate. So, if she started talking under some assumption and the system indicates a bit later that she was wrong, there could be a correction.
>Perhaps she only does this if her initial response receives negative feedback?
I mean, she starts saying something based on a misunderstanding, but then changes the sentences to a better response. Maybe something like: I don't know about... Oh, you mean <something>. Yes, ...
Open file (388.17 KB 1148x656 Screenshot_148.png)
Open file (391.30 KB 1178x540 Screenshot_144.png)
Open file (473.63 KB 1278x746 Screenshot_141.png)
Open file (607.18 KB 1199x669 Screenshot_145.png)
Open file (621.51 KB 1334x756 Screenshot_142.png)
>[ACS 2022] Invited Talk: Computers versus Common Sense - Doug Lenat https://youtu.be/VjkbmLjwXO8
Open file (378.30 KB 1078x674 Screenshot_150.png)
Open file (629.56 KB 1154x620 Screenshot_149.png)
Open file (326.13 KB 985x643 Screenshot_151.png)
Open file (281.53 KB 924x646 Screenshot_153.png)
Open file (563.18 KB 1323x811 Screenshot_152.png)
Open file (245.67 KB 1151x648 Screenshot_155.png)
Open file (852.99 KB 1307x912 Screenshot_154.png)
Open file (265.85 KB 1204x410 Screenshot_156.png)
Open file (554.86 KB 1381x965 Screenshot_157.png)
Open file (639.86 KB 1274x958 Screenshot_158.png)
Open file (487.34 KB 1215x927 Screenshot_159.png)
Open file (478.39 KB 1195x844 Screenshot_160.png)
Open file (551.52 KB 1207x895 Screenshot_161.png)
Open file (584.70 KB 1348x893 Screenshot_162.png)
Open file (583.90 KB 1341x950 Screenshot_164.png)
>>25989
>[ACS 2022] Invited Talk: Computers versus Common Sense - Doug Lenat
https://youtu.be/VjkbmLjwXO8
Open file (263.70 KB 1356x457 Screenshot_165.png)
Open file (701.05 KB 1365x884 Screenshot_166.png)
Open file (362.77 KB 1353x493 Screenshot_167.png)
Open file (349.39 KB 1108x915 Screenshot_169.png)
Open file (262.35 KB 1112x487 Screenshot_168.png)
Open file (286.79 KB 1107x439 Screenshot_172.png)
Open file (505.65 KB 1109x876 Screenshot_170.png)
Open file (561.40 KB 1402x937 Screenshot_171.png)
Open file (155.05 KB 605x348 Screenshot_174.png)
Open file (577.23 KB 1305x713 Screenshot_173.png)
>>25991
>[ACS 2022] Invited Talk: Computers versus Common Sense - Doug Lenat
https://youtu.be/VjkbmLjwXO8
Open file (502.42 KB 1370x908 Screenshot_175.png)
Open file (848.65 KB 1259x987 Screenshot_176.png)
Open file (538.07 KB 1262x836 Screenshot_179.png)
Open file (540.08 KB 1307x745 Screenshot_180.png)
Open file (720.05 KB 1402x840 Screenshot_177.png)
>>25992
>[ACS 2022] Invited Talk: Computers versus Common Sense - Doug Lenat
https://youtu.be/VjkbmLjwXO8
Open file (493.48 KB 1291x673 Screenshot_183.png)
Open file (565.09 KB 1364x746 Screenshot_181.png)
Open file (623.70 KB 1353x846 Screenshot_185.png)
Open file (517.60 KB 1114x833 Screenshot_184.png)
Open file (817.10 KB 1340x887 Screenshot_186.png)
>>25993
>[ACS 2022] Invited Talk: Computers versus Common Sense - Doug Lenat
https://youtu.be/VjkbmLjwXO8
Open file (238.83 KB 1086x818 Screenshot_187.png)
Open file (532.76 KB 1214x701 Screenshot_188.png)
Open file (610.99 KB 1305x842 Screenshot_190.png)
Open file (1.07 MB 1328x919 Screenshot_189.png)
>>25994
>[ACS 2022] Invited Talk: Computers versus Common Sense - Doug Lenat
https://youtu.be/VjkbmLjwXO8
This technology has tremendous potential for internalized world building. It needs a 3090 at the moment, but it's a brand-new technique.
https://huggingface.co/blog/gaussian-splatting
https://www.youtube.com/watch?v=C708Mh7EHZM
>>26002 Neat! Splatting has been around for a while in 3D CGI -- primarily as an optimization technique to reduce the amount of information required to roughly describe a surface (ie, typically a few hundred splat radii, vs. thousands of low-res tris; and a point-radius+normal (ie, a splat) is obvs. more lightweight & faster to compute than a tri+normal). Pretty cool to see the idea also applies to LLM research seemingly. Thanks, Kiwi. Cheers. :^) >=== -prose edit
Edited last time by Chobitsu on 10/15/2023 (Sun) 21:27:24.
https://arxiv.org/abs/2310.06775
>The rapid development and adoption of Generative AI (GAI) technology in the form of chatbots such as ChatGPT and Claude has greatly increased interest in agentic machines. This paper introduces the Autonomous Cognitive Entity (ACE) model, a novel framework for a cognitive architecture, enabling machines and software agents to operate more independently. Drawing inspiration from the OSI model, the ACE framework presents layers of abstraction to conceptualize artificial cognitive architectures. The model is designed to harness the capabilities of the latest generative AI technologies, including large language models (LLMs) and multimodal generative models (MMMs), to build autonomous, agentic systems. The ACE framework comprises six layers: the Aspirational Layer, Global Strategy, Agent Model, Executive Function, Cognitive Control, and Task Prosecution. Each layer plays a distinct role, ranging from setting the moral compass and strategic thinking to task selection and execution. The ACE framework also incorporates mechanisms for handling failures and adapting actions, thereby enhancing the robustness and flexibility of autonomous agents. This paper introduces the conceptual framework and proposes implementation strategies that have been tested and observed in industry. The goal of this paper is to formalize this framework so as to be more accessible.
Related video: https://youtu.be/3Xa3lu00wfg?t=858
> There's no technological barrier to progress, ...
> all the pieces are there, it just needs to be assembled and engineered ...
> The gap between flagship models and publicly available models went down from one year to a few weeks ...
The genie is out of the bottle
>>26039
>The genie is out of the bottle
This. We predicted this outcome years ago here on /robowaifu/, but 2023 alone has seen a thousands-fold acceleration of it all.
BTW congrats on the waifu image, NoidoDev. It dislocated her left eye a bit; but otherwise is a gorgeous face, perfectly-balanced neoteny (looks ~16yo-17yo to me), beautiful eyes, and not the slightest bit of hindrance by the uncanny valley.
>waifu/10
---
https://github.com/daveshap/ACE_Framework
https://medium.com/@dave-shap/autonomous-agents-are-here-introducing-the-ace-framework-a180af15d57c
>===
-fmt, minor edit
-add hotlinks
Edited last time by Chobitsu on 10/17/2023 (Tue) 04:49:58.
>>26008
Thanks for the history lesson. I wonder how you see a connection to LLM research? I am genuinely curious. I see LLMs as tools for linguistic processing, with splatting as a beneficial technique for her visual processing. Specifically, 3D mapping her space for a 3D memory of what is around her. This would benefit her spatial awareness. Though, details about her world would be fed into LLMs as needed.
>>26039
Though it is still early in development, this framework seems promising. Looking forward to watching it grow. Dave Shapiro is on our side and we have much to gain from him.
>>26041
>I wonder how you see a connection to LLM research?
Simply b/c I misunderstood your post, given this thread subject + the site linked :D. However, thinking about your question (and realizing my mistake) helped me to further envision a new way for me at least :^) of thinking about an LLM system as a sort of tree of associations (much like any typical ontology); which, by rendering them into a multi-dimensional space, could make the idea of optimizing searches using higher-order splats a real concept.
>tl;dr
Conceivably, to parse/filter complex LLM ontologies using high-order Gaussian Radial Basis Functions; as a wallclock optimization technique useful during runtime context-parsing operations. :^)
>ttl;dr
This approach should be well suited to GPU (ie, cell-based) processing; and can perhaps also be partially-embedded as some type of meta information (such as hinting) directly into the model itself prior to training.
>===
-prose edit
-add meta cmnt
Edited last time by Chobitsu on 10/17/2023 (Tue) 10:53:39.
>>25995
>would it be possible to redo in a Cyc 2.0?
Deep learning and statistical methods have really been pushing the forefront of what AI technologies can do, but none of them still understand anything about the actual world. That given, I think it's possible to make a better and more usable knowledge representation graph. I would try to create a database like the one discussed here. I would make it usable from nearly day one, and at least make most of the benchmarks public so we can compare its performance against other systems. If it must be a self-sustaining business, make the technology available as some sort of freemium or trial software. I would build the software similar to spaCy, where they let users directly build on top of their software for free and then charge a premium for other features. With today's automated data collection, I would imagine most of the facts that Cyc has can be crawled from the internet.
https://blog.jtoy.net/understanding-cyc-the-ai-database/
https://blog.jtoy.net/the-human-mind-is-an-autonomous-interlinked-model-building-database/
https://news.ycombinator.com/item?id=21781597
>Given all the advancements in machine learning these days and the power of the internet, I think a next generation Cyc could probably be built much faster than the original one. Could you have algorithms automatically figure out the rules that were manually input? You would need to have a very realistic physics simulator.
https://blog.jtoy.net/we-can-bruteforce-agi-with-a-realistic-physics-simulator/
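The data structure behind a Cyc-style knowledge base can be prototyped in an afternoon; the hard part is scale and curation, not the code. A minimal sketch with made-up facts and a single hand-rolled transitivity rule:

# Tiny triple store with one inference rule (transitive "isa").
facts = {
    ("cat", "isa", "mammal"),
    ("mammal", "isa", "animal"),
    ("animal", "has", "metabolism"),
}

def closure(triples):
    """Saturate the 'isa' relation transitively (naive fixpoint)."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(triples):
            for (c, r2, d) in list(triples):
                if r1 == r2 == "isa" and b == c and (a, "isa", d) not in triples:
                    triples.add((a, "isa", d))
                    changed = True
    return triples

kb = closure(facts)
print(("cat", "isa", "animal") in kb)  # True -- derived, not asserted

Cyc's actual engine is vastly richer (contexts, higher-order rules, exceptions), but everything bottoms out in assertions plus inference of roughly this shape.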
Open file (16.91 KB 295x493 Screenshot_150.png)
>>25976
Nothing big, since I _should_ be doing other things...

# Assumes getch from the earlier post (>>25976).
def input_interface(user_input="", user_input_str=""):
    while 42 == 42:
        user_input = list(getch())[0]
        if user_input in ["\x13", "\r"]:  # CTRL-S or return: send
            input_multiplexer("input send")
            user_input_str = ""
            continue  # don't append the control char to the buffer
        if user_input in ['\x11', '\x18', '\x03']:  # CTRL-Q/X/C: exit
            input_multiplexer("exited")
            return
        if user_input != '\x7f':  # anything but backspace: append
            user_input_str = user_input_str + user_input
        else:  # backspace: drop the last char
            user_input_str = user_input_str[:-1]
        input_multiplexer(user_input_str)
# list(getch())[0] to explore more inputs


def input_multiplexer(input_str):
    print(input_str)
    match_Q(input_str)  # NO extra print()!

# NEXT: https://benhoyt.com/writings/python-pattern-matching/

test = "What time is"

question_pattern = ["do i", "do you", "what", "who", "is it", "why", "would you",
                    "how", "is there", "are there", "is it so", "is this true",
                    "to know", "is that true", "are we", "am i", "question is",
                    "tell me more", "can i", "can we", "tell me", "can you explain",
                    "question", "answer", "questions", "answers", "ask"]

def match_Q(user_input):
    appendix = " --> "
    match user_input.lower():
        case x if x in question_pattern:
            print(appendix + "question", x)
        case _:
            pass
    match user_input.split():
        case ["What", x, "is"]:
            print(appendix + "question", x, "now")
        case ["What", x, "will"]:
            print(appendix + "question", x, "future")
        case _:
            pass

[(idx, val) for idx, val in enumerate(test.split())]
# [(0, 'What'), (1, 'time'), (2, 'is')]

from itertools import pairwise
[(val1, val2) for val1, val2 in pairwise(test.split())]
# [('What', 'time'), ('time', 'is')]
>>26062 Good luck, NoidoDev! :^)
>>26043
If I read you correctly, that would mean using splatting algorithms to interpret the weights of LLMs for desired results? Fascinating concept; I don't understand how to make it work. It is good to consider how to use different algorithms beyond LLMs. You're creative, please post more ideas. How would you define the minima of a cognitive agent? How would you implement them? How would you do so if LLMs weren't an option?
>>26058
>CYC
A classic based on common sense and ontology. If we can use it, forking some of their code is likely to be helpful.
>>26071
>I don't understand how to make it work.
Heh, neither do I yet, since I just invented it in my head (protip: things are always harder in the end than you think they will be at first glance haha! :^) But the basic idea is, I think, neither too esoteric nor exotic. With the tokens you can hint their likely ontological characterizations and embed that information into the dataset. When you 'cook it down' into a model (and even more closely with the tunings), you'd render the semantic language information into some kind of high-dimension ontological tree that can be quickly angled/transformed around in high dimensions until you find the tightest grouping (at some gross resolution) of qualified concepts for the search, needed during any given moment. The 'center mass' of the likely related ideas, so to speak. You can then project/fire a mathematical GRBF 'splat' onto this higher-dimensional structural 'target zone' to limit what you even need to test against. For the splat, both its Gaussian curve(s) (the spread, gain, and centroid) and the radius of its basis function(s) can be tuned down 'in flight' to refine the filters into this multi-dim'd 'high yield' target volume even further. I hope that all makes sense, Kiwi. I can see it in my mind sort of, but it's hard to put down in words.
>tl;dr
With some thoughtful preprocessing work, and by adding a sound structural overlay onto the model; then doing a bit of extra processing during the parse... conceivably the quality could be both excellent & accurate -- and with less overall processing needed to achieve those same end results! :^)
---
>How would you define the minima of a cognitive agent? How would you implement them? How would you do so if LLM weren't an option?
I'll think about it further (in fact I've been thinking about this concept in one derivative form or other for years now, lol), but I'll try again to get more ideas out into the real world of our work here on /robowaifu/ before too long, Kiwi. Just two things I'd say are clearly going to be important to our success here (which I believe the consensus has generally already agreed on ITT), namely:
1. Use a solid and mature database (prb. SQLite or MySQL, given our onboard size limitations) as the fundamental technical substrate for the cognitive system.
2. Do not insist on relying on LLMs (at least not alone).
The first is simply an expedient; the second is a limiting factor.
>===
-prose, fmt edit
Edited last time by Chobitsu on 11/10/2023 (Fri) 22:31:59.
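Since the splat-filter idea above was invented in-thread, the following is strictly a sketch of the filtering step, not of anything proven to help: score candidate embeddings with a Gaussian radial basis function around a query centroid and keep only the high-mass ones, so downstream scoring touches far fewer items. Only NumPy is assumed; sigma and the keep fraction are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
candidates = rng.normal(size=(10_000, 64))  # stand-in concept embeddings
query = rng.normal(size=64)                 # centroid of the "target zone"

def grbf_filter(points, center, sigma=4.0, keep=0.01):
    """Gaussian RBF weight per point; keep the top fraction by weight."""
    d2 = ((points - center) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    cutoff = np.quantile(w, 1.0 - keep)
    return np.nonzero(w >= cutoff)[0]

survivors = grbf_filter(candidates, query)
print(f"{len(survivors)} of {len(candidates)} candidates pass the splat")

Tuning sigma down "in flight" corresponds to tightening the splat onto the target volume.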
>>24783
I already put together a few ideas for robot emotions a while back:

Sensory Information Processing Algorithms:
- PESRA: Physical/emotional stimulation & reaction algorithm
- RVIMPA: Rapid visual identification and memory patterning algorithm
- VNLA: Vocal neuro-linguistic algorithm
- ANLA: Auditorial neuro-linguistic algorithm
- PSARA: Psycho-social analysis and response algorithm

Neural Network Hemispheres:
- Emotional
- Logical
- Sensory

Neural Network Thread Signal Types:
- Type E-E3 (Excitatory): pleasure, excitement, arousal, positivity
- Type I-I3 (Intermediary): calming, stabilizing, neutral, balanced
- Type N/D-N/D3 (Negative/Depressive): pain, anxiety, anger, sadness

Environmental Stimuli And Data Handling:
On PESRA, positive and negative number sequences on neuron threads indicate positive and negative reactions to events happening in real time on all hemispheres of the neural network. These informational number sequences determine the restructuring and growth of certain parts of the network as the humanoid learns more about the environment. The intensity of reactions to these signal streams determines the personality of the humanoid and can be subject to change, since it is variable to the user's liking.
For RVIMPA, visual, auditory, and other kinds of sensory data are stored in long-term and short-term memory and will reinforce/influence the way the humanoid learns, thinks, and solves problems. RVIMPA's role is to piece together information from data stored in short-term and long-term storage by the other algorithms. Long-term storage has a minimum expiry date of 3 years. Persistent pattern matching of new sets of familiar sensory data input against the oldest record of a similar data input will lengthen the shelf life of an existing memory.
ANLA takes in audio from the physical environment and passes it to RVIMPA for analysis. ANLA communicates with streams of audio and feeds information to the listening threads on other running algorithms in order to generate verbal or physical feedback.
VNLA is used to physically communicate with people, audio/language testing programs, or other humanoid robots. VNLA controls all physical aspects of language, from tongue placement and mouth positioning to the pronunciation of words in any and all languages that it is being trained on.

How it all works:
PSARA uses ANLA directly, or uses ANLA's data as well as visual data from RVIMPA, to create, understand, or mimic psychological profiles when interacting with humans or being trained. This is done by analyzing facial positioning, verbal cues, audio cues, expressions, and keywords in conversations, assigning negative or positive values to each word indicating the pattern of thought behind these words. For facial expressions, it maps the physical positioning of facial features on a grid that contains values indicating each position, then sums the data sequences together to understand positive or negative emotions in humans. Once these patterns are understood, PSARA can use VNLA to verbally reciprocate its own feelings in conversations as a response to what is being said.
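To ground the "assigning negative or positive values to each word" part of PSARA, here is a minimal sketch; the lexicon values and the word list are invented for illustration, and a real system would learn them rather than hard-code them:

# Toy valence scorer in the spirit of PSARA's word-value idea.
VALENCE = {"love": 2, "happy": 2, "good": 1, "bad": -1, "hate": -2, "sad": -2}

def score_utterance(text):
    words = text.lower().split()
    total = sum(VALENCE.get(w, 0) for w in words)
    mood = "positive" if total > 0 else "negative" if total < 0 else "neutral"
    return total, mood

print(score_utterance("I love this, it makes me happy"))  # (4, 'positive')

The facial-grid summation described above would feed the same kind of signed score into the combined judgement.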
>>26297
ANLA should actually communicate with PESRA and PSARA threads first, before they both send signals to VNLA for vocal feedback regarding speech interpretation and response in a communicative environment. My apologies; I typed most of this stuff up on a whim and didn't care to make complete sense of it. Still some questions to be asked about information processing though, as far as echo state networks, Markov chains, and numerical signal gradient scales go. The functionality of the hemispheres is what I really need to elaborate on. Then again, I am thinking about a lot of other stuff too.
>>26297
>>26298
Okay, very interesting.
>Sensory Information Processing Algorithms
I would suggest calling these things modules or procedures instead, since each might be more than one algorithm.
>Positive and negative number sequences on neuron threads indicate positive and negative reactions to events happening in real time on all hemispheres of the neural network.
You mean some networks just put out numbers, based on the judgement of the situation and input, then other parts react to that?
Nice pics, btw.
Open file (36.53 KB 810x361 ANLA Processing(3).png)
>>26300
>Indicate positive and negative reactions to events happening
It's a little more complicated than that. Number sequences from all threads in every hemisphere are used as references for sensitivity, intensity of emotion, and sensation. They also determine responses based on judgement.
>in real time on all hemispheres of the neural network.
Yep.
>You mean some networks just put out numbers,
Yes, but they actually mean something. Think of DCT or block codes. This will work similarly, in the sense that the data is encoded within number sequences and determines responses appropriate to what is being said or felt. There's a lot more to talk about regarding this, of course, and I'll give you more info going forward.
>>26301 Okay. Interesting.
>>26302
I'll have to think more about methods for encoding data signals between tensors. It's easy to think about physical speech processing, but for original thought, action, and commentary within acceptable bounds, it'll be a challenge. At this point, this becomes a GAI instead of a regular AI.
>>26302
I'd also like to take this opportunity to have a discussion about the storage of neural networks and their data representation. I'm thinking I store these iterations and personalities inside various DAT files, one for each, and maybe encrypt them as well. On the machine level, I could assign and store a series of appropriate keys for each file I'd want to load into the hardware, so I could test the bodies of these things out in a real-world environment. I'd want to make these things as air-gapped as possible regarding the methods of interfacing with the intricate embedded systems that power and store the data. By the way, training and testing these things in a 3D simulation before deploying to a body would be pretty cool.
>>26305
I don't know about DAT files and encryption; why not use gocryptfs? What's the goal? Keeping the data safe in case of someone physically accessing her? Thread: >>10000
>3D simulation before deploying to a body
We have a whole thread on simulation: >>155
Not sure it needs to be the exact body simulated when it comes to testing the AI, though. Then again, this might be the option for people who can't afford the robot. Look for Waifuverse AI.
>>26297 >>26298 Wow! This is some interesting stuff Anon. Well-developed overview. I look forward to seeing what you make out of this all, Anon. Cheers. :^)
>>26308
I'm not sure how well gocryptfs will work on an embedded system. I'm willing to give it a try, but making stuff simpler with DAT files and libgpg key management might be the right move. Maybe I could even make it easier by implementing an encryption cipher of my own. Then again, that may be risky.
>>26316
It all seems well planned out and great on the surface, but the real factor that determines the success or failure of this type of structure is the hardware. You see, the biggest challenge is: how much analog stuff can we put in to save most of the power for the artificial brain? After all, the human brain is the most energy-expensive organ in the body. Lastly, how do we power the robowaifu? I'm thinking hydrogen, but really, just how sufficient is that going to be to power electronics? My ideal robowaifu has soft muscle actuators and alginate hydrogel internals for water retention, for making lubricants for orifices and internal organs, as well as water preservation systems for use in hydrogen generation. All of this is theoretical and I don't know if it's viable. The premise here is to convert chemical energy into electrical energy in a similar way to how a human body would, so we wouldn't be messing around with any dangerous battery acids or reactors. We will be putting our dicks into these things at some point, so it'd be best to find ways to get it on without making our nads glow in the dark with a million x-rays or melting them off with sulfuric acid, right? This is gonna be tough.
>>26349
Hardware is fundamentally intertwined with cognitive architecture. Analog circuits are ideal for wave-based processes. Aspects of cognition related to audio, photonics, and rhythm can all be handled that way. Considering how our minds are substantially analog, there's likely something we're missing which could be solved through analog implementation. Back in the 2000's I used to make robots with BEAM Nv networks. They're analog neural nets which can mimic biological nervous systems when used with properly tuned capacitance and resistance values, relative to the analog input and desired behaviour. I still prefer them for simple things as they appear more "alive" due to being less predictable. https://home.csulb.edu/~wmartinz/content/beam-robotics.html
As for power, I'm using a ~100 watt-hour LiFePO4 battery for safety and regulatory reasons. (100 Wh is the max allowed for public transit in many countries and on most airlines.) As for alginates and soft actuators, both will be important. I've been considering how to use plants for cognitive processing and bodily functions lately. Would someone offer advice on this? Maybe Ribozome knows something?
>Tough
It is going to get far harder; no one understands how hard this actually is. Thankfully, this thread is filled with good ideas that I'm thankful for.
>Redhead
My greatest weakness, just needs a cross and robot joints.
>>26349 >how do we power the robowaifu? >>5080 >>26353 > I've been considering how to use plants for cognitive processing and bodily functions lately. Would someone offer advice on this? Maybe Ribozome knows something? >>2184
>>26349 >After all, the human brain is the most energy expensive organ in the body. Actually, the statistics I've read are that the brain consumes about 12W RMS, continuous. I'd think the hundreds of muscles in our musculoskeletal system consumes much more. >Lastly, how do we power the robowaifu? At this moment in time, our only practical solutions for a mobile, untethered, autonomous gynoid (AKA a robowaifu) is battery power. There simply aren't any other reasonable approaches now or in the near-horizon timeframe, Anon. >We will be putting our dicks into these things at some point so it'd be best to find out ways to get in on without making our nads glow in the dark with a million xrays or melt them off with sulfuric acid right? Anon safety is indeed an extremely high priority, so is robowaifu safety. (cf. >>10000 for both) Our robowaifu's battery systems will be gel-based, so I don't think spilled sulfurics is much of an issue. Crushing is a far more pressing design concern IMO. >=== -minor edit
Edited last time by Chobitsu on 11/13/2023 (Mon) 03:47:47.
>>26370
>Our battery systems will be gel-based
I'm torn between fuel cells and solid-state batteries. I'm aiming for whatever doesn't require me or the robowaifu to physically interface with the sockets in my walls. If I go with a hydrogen fuel cell design, that means I'm going to have to make her drink salt water solutions. I'd also need to design the fuel cell with a non-ferrous cathode, since salt perchlorates can build up over time and cause some serious health issues if mishandled.
>>26374 OK, maybe you can prove everyone wrong Anon, have at it! :D (but please do it here : >>5080, not ITT).
>>26375 Okay, thanks chobi.
>>26377 Thanks, Anon. But, please do continue to contribute ITT to the discussions on Cognitive Architecture. We just like to keep things topically-focused here. Cheers! :^)
Open file (132.41 KB 1024x768 Chii.jpg)
An excellent reminder that most AI isn't actually intelligent. They're merely generative algorithms; many can be considered overly complex Markov chains. Intelligence is any system which can remember and use its memory to alter its behavior. For example, an LLM on its own can't remember; it has no intelligence. An LLM that is continuously trained on new inputs is intelligent. A system which uses an LLM but substitutes parts of responses with updated insight is intelligent. At its simplest, any program which updates any variable based on external stimuli is intelligent. Chii is still the best example of an intelligent machine going through the process of ascension from intellect to cognition.
https://hackaday.com/2023/11/14/theres-no-ai-in-a-markov-chain-but-theyre-fun-to-play-with/
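Since the linked article is about Markov chains, here is the whole trick in a dozen lines: a first-order word model that "remembers" nothing beyond the previous token, which is exactly why it sits below the intelligence bar drawn above. The corpus is a toy one-liner.

import random
from collections import defaultdict

corpus = "chii thinks therefore chii is chii learns therefore chii grows".split()

# First-order chain: next-word candidates keyed by the current word.
chain = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    chain[a].append(b)

def babble(start, n=8, seed=42):
    random.seed(seed)
    out = [start]
    for _ in range(n):
        nxt = chain.get(out[-1])
        if not nxt:
            break
        out.append(random.choice(nxt))
    return " ".join(out)

print(babble("chii"))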
>>26559
Don't wanna rain on your parade, but LLMs (and the majority of the "AI" floating around) are merely vast statistics engines calculating the average response in accordance with their training data, with some noise thrown in for randomness in the more creative (eg chatbot and image-generating) ones. At least that's how they were described to me by a multi-billion-dollar financial firm (who shall remain nameless) implementing neural-network-based OCR into their lockbox activities in mid 2019. Contemporary AIs work on a confidence interval. The difference between a human with a low confidence interval and an AI with a low confidence interval is that the AI's low-confidence answers will mostly be gibberish and it will proudly spout them at you (takes a bit of coaxing though), while the human would look at the nonsense and have a discussion with himself to try and figure out what it means (eg think). To further advance the appearance of AI intelligence, one would need to give it an internal monologue, eg have it speak to or question itself, and work out a method to discover when something of value should be added. And frankly, many *humans* don't even have that, which certainly explains why many folks think AI will replace them. It doesn't stop LLMs and stuff from being cool, though :)
>>26567
>LLM's aren't actually intelligent
Yes, that was my point. Statistics engines is a decent description. Linguistic algorithms is another useful way of thinking about language models. I like that they can be used for natural-language human-machine interactions. For everything else, there are more efficient solutions. An excellent example of why dedicated algorithms are superior: https://www.hackster.io/mccormackjim3/opencv-destroys-chatgpt4-vision-8a0823
I remain firmly in favor of rapidly switching algorithms based on need. Though, creating an algorithm to recognize intent and decide which algorithm to use remains difficult.
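One way to frame the "which algorithm" problem is cheap-first routing with a confidence fallback: the dedicated algorithm answers when it is confident, and the expensive general model is only woken otherwise. A minimal sketch, with both handlers as invented stand-ins:

def cheap_counter(items):
    # Stand-in for e.g. an OpenCV blob count; returns (answer, confidence 0..1).
    return len(items), 0.95 if items else 0.2

def expensive_llm(items):
    # Stand-in for a slow general model.
    return f"model inspects {len(items)} items", 0.6

def route(items, threshold=0.8):
    answer, conf = cheap_counter(items)
    if conf >= threshold:
        return f"cheap path: {answer}"
    answer, conf = expensive_llm(items)
    return f"fallback path: {answer}"

print(route(["coin"] * 7))  # dedicated algorithm wins
print(route([]))            # low confidence -> fall back to the big model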
>>26568
>Though, creating an algorithm to recognize intent and decide which algorithm to use remains difficult.
That's what we're all here for, brother. :^)
> (video-related >>26563)
>>26567
> that the AI's low-confidence answers will mostly be gibberish and it will proudly spout them at you
LLMs are not all of AI, and even if that were currently the case, it wouldn't be a good move to put it that way.
Open file (48.20 KB 467x1227 H-CogAff.PNG)
I have a grand idea but need code to run it. I checked a paper that reviewed cognitive architectures, but found that the best one with deep emotions - H-CogAff (human cognitive affective architecture) - doesn't have its documentation organized well enough for me to try it on my own PC or mod it:
>Image source
https://github.com/max-talanov/1/blob/master/affective_computing_course/cognitive_architecture.md
>main website
https://www.cs.bham.ac.uk//research/projects/cogaff/
There used to be documentation linked correctly, but all the links got changed, and I could only find the new websites with Google. I want to find some code for this and take the architecture's thoughts and prompt them in character of my waifu with a tuned LLM. There's even a spot on the map for a persona, linked to higher cognitive thinking, where the waifu's persona can make decisions. If anyone has a better architecture that they can run straight away on their PC, please point it out; I'll read this thread for inspiration.
>>26717
Great research, Anon. BTW, what about 'Haikonen'? Seems like it's gotten a relatively good score. Are its docos any better? Good luck.
>>26717 Yeah, the links don't always work either. Such websites need to be downloaded using wget or curl using some "follow links" parameter, I think. I did that before but don't have much experience in it. We need to get better at that. I copied it into Obsidian for now, made some minor changes to the layout. The board doesn't allow uploading markdown files though.
>>26736
From the github link:
>Haikonen's Cognitive Architecture:
>Emotional states are supposed to result from combinations of multiple simultaneously occurring system reactions.
This looks like a cognitive architecture built around emotions. It reminds me of how artists draw facial expressions by combining simple expressions with each other to make a more complicated emotion. This paper describes the idea:
>https://www.pnas.org/doi/10.1073/pnas.1322355111
It doesn't rule out that this idea could be frankensteined into a general architecture, but I don't know where to start coding anything to begin with.
>>26765
Thanks anon, I'm going to continue the search on Google once I have time. I think for my limited ability it would be the fastest anyway. I really want to play with this architecture and build my perfect waifu!
>>26717
We are aligned in thought. Affective computing is up there as one of, if not the most, important aspects of cognition. I'd even argue that without feelings, cognition can't meaningfully exist. After all, how can you verify a self if it doesn't care?
>>26765
Thanks, changing formats can help ensure it gets read by others.
>>26770
>Where to start
I'd recommend changing your view. It's a systems engineering challenge. A mind is the result of many different aspects of that system working together. Read up on systems engineering and programming. We're likely going to rely on C++ and Python for their ease of integrating many different modules.
>>26765 Thanks Anon! >>26770 >This reminds me of how artists draw facial expressions by combining simple expressions with eachother to make a more complicated emotion. 3D animators take advantage of so-called Blendshapes / Morph-targets to reproduce this in a relatively straightforward manner. Also, if you're interested in this area, I'd suggest you research the FACS system. [1] >>26772 >Affective computing up there as one of, of not the most important aspects of cognition. I'd even argue that without feelings, cognition can't meaningfully exist. After all, how can you verify a self if it doesn't care? Great thinking Anon. This does seem to me to make a lot of sense. BTW, there are numerous attributes of humans that set us distinctly apart from all other life. That also deserves scrutiny on our parts as well, for tell-tale pathways for research and exploration! Cheers. :^) 1. https://en.wikipedia.org/wiki/Facial_Action_Coding_System
I'm sorry for not being able to read everything here before posting. I plan to read this thread whenever I can. For now, I just wanted to post what I'm working on since (1) it's relevant to the thread, and (2) I think I've figured out some important implementation details. Background: I've been working on this for nearly a year now, though my initial goal wasn't to build a cognitive architecture. It was to take advantage of novel AI research as quickly as possible. It turns out that the infrastructure for taking advantage of AI research is also really good for building very complex agents, and I'm now trying to figure out if it's good enough to support a full *cognitive framework*. By cognitive framework, I mean the scaffolding necessary to make cognitive architecture development plug-and-play. I'm basing this on the HuggingFace Agents framework. An HF Agent is more of an assistant than an agent (i.e., it responds to requests, it doesn't do anything proactively), but I've hacked it up to support proactive behavior. Here's an overview of what my hacked-up version does: - The core is based around a code generation model, not necessarily an LLM. - An agent can accept input from multiple "streams". Each stream message triggers a code generation run. - Each invocation of the code generator is given these inputs: (1) a "header" prompt to give global guidance & information, (2) examples to further help guide outputs, (3) a history of recent messages, (4) a formatted version of the message coming in from the stream, and (5) a list of functions that the agent is allowed to run. Each stream is associated with its own way of formatting messages so the agent can distinguish between streams. - The code generator produces two outputs: (1) an explanation of how it plans to respond & why, and (2) any code that it should run. - HF Agents can catch errors and go through an iterative process to try to get the code generator to fix any errors in the code. Given different guidance and tools, I think it's possible to implement a wide variety of agents with this. For example: - If told to act like an assistant, and if given standard assistant tools (calendar functionality, email functionality, check news & weather, etc.), it acts like a dumb assistant. - If told to act out a personality, and if given a single send() tool to send a response to the user, it acts like a dumb chatbot. - If told to act out a personality, and if given a send() tool, a tool to edit information in its own prompt header, a stream to set future reminders for itself, and a stream to search & tab through its own conversation history, it acts like MemGPT. - Given the above plus a tool to search for and invoke relevant OpenAI GPTs, it's like MemGPT augmented with an enormous amount of functionality. - And, of course, you can give an agent tools for interacting with a (sub)cluster of other agents. I'm fairly confident at this point that I can support any "modern" chat-based agent, so now I'm wondering what it would take to support three other things: - Emotion regulation. I spoke with a cognitive scientist that specializes in this, and he's convinced that emotion regulation all boils down to: positive feedback loops for satisfying needs, negative feedback loops for avoiding harms, and a "common currency" for balancing different motives. For this, I plan to add first-class support for running experiments, which is broad enough to include parameter search and reinforcement learning. 
I think that should be sufficient to model any feedback loops, and I think therefore it should be sufficient to model emotion regulation. Given this, I expect plugging emotion regulation into an agent should be conceptually easy: make sure there's a definition of what sorts of things the agent wants & avoids, kick off an RL-based experiment whenever something relevant comes up, and have the RL algorithm generate "thoughts" to feed the agent through a stream. - Embodied control. Chatbots are "easy" since the final expression (text) can be generated by a single model. With actual bodies, or even just with video, the final expression is split into multiple modalities (e.g., voice, body movements, facial movements), and they all need to be in sync with one another. If we had good multimodal models, that might be fine, but we don't, so I need a way to generate outputs from multiple models and somehow make them consistent with one another. I think good experimentation support would solve this problem too. For each expression, it can generate many outputs from many models, and it can eventually converge on a set of mutually-compatible outputs. Or it can stop after X seconds and pick the best it has so far. The integration should again be conceptually easy: instead of giving the agent a send() tool, give it a more general express() tool, which kicks off an experiment to figure out how best to handle the expression. I think there are more aspects to embodied control than just deliberate expressions, but I think deliberate expressions are the hardest to handle. - Heuristic derivations. For this, what I ultimately want is for an agent to be able to ask itself "Is this something I would do?" I want to model the chatbot personality as a set of premises, and the agent should be able to determine whether any response it gives is consistent with all information derivable from those premises. If this is possible, then the chatbot can *automatically* improve itself over time. It should try to extract new premises from every response it generates as well, so it stays consistent with its own past responses. I have ideas on how to do this, but they're all vague and will ultimately require experimentation with LLMs. I know there's a lot of research working on similar goals (X-of-thought, anything people analogize with Q*), so solving this might end up just requiring the ability to quickly check a bunch of research results, which incidentally is what my infrastructure was originally designed for.
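To make the stream-triggered flow at the top of this post concrete, here's a rough sketch of a single run in python. Every name here (generate_code, the stream object, the retry count) is a stand-in I'm making up for illustration, not the actual HF Agents API:

def handle_message(stream, message, history, header, examples, tools):
    """One code-generation run, triggered by a single stream message."""
    prompt = "\n\n".join([
        header,                                 # (1) global guidance & information
        examples,                               # (2) few-shot examples to guide outputs
        "\n".join(history[-20:]),               # (3) recent message history
        stream.format(message),                 # (4) stream-specific message formatting
        "Tools: " + ", ".join(t.__name__ for t in tools),  # (5) allowed functions
    ])
    explanation, code = generate_code(prompt)   # two outputs: the plan, and code to run
    for _ in range(3):                          # iterative error-fixing loop
        try:
            exec(code, {t.__name__: t for t in tools})
            break
        except Exception as err:
            explanation, code = generate_code(prompt + f"\nThe code failed with: {err}\nFix it.")
    history.append(stream.format(message))

The real version does more bookkeeping (logging the explanation, per-stream history management), but that's the control flow described above.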
>>26812 >Facial Action Coding System I like the concept of quantizing emotions down to 7 elements. Happy, sad, surprise, fear, anger, disgust, and contempt are a good starting point. I'd simplify it down to happy, sad, surprise, fear, anger, and disgust. Contempt doesn't seem practical for a machine to "feel". The question then becomes, how do we code these emotions into her? How should we map their effects on her other routines? The wider spread each effect is, the more dynamic her emotional responses become from emergent behaviours. This could also lead to quirks.
>>27144 Sounds exciting and amazing Anon. Any chance you can adopt a handle for use here on /robowaifu/ ? This will help us recognize/interact with you. Cheers. :^) >=== -minor edit
Edited last time by Chobitsu on 12/09/2023 (Sat) 23:05:41.
Open file (1.97 MB 5120x3620 chii_n_atashis.jpg)
>>27146 >The question then becomes, how do we code these emotions into her? How should we map their effects on her other routines? My guess is that we'll devise a mechanism to effectively simulate the 'Theory of Mind' for our waifus. At that level, 'Emotions' & 'Body' are simply abstracts contained within the 'Mind'. Interfacing between the various subsystems thereby becomes messaging little more complex than Remote Procedure Calls (RPCs), AFAICT Kiwi. So, say an emotive state is derived in the mind. That state is then sent to the higher-level command & control authority for the body. All the cascading bodily effects (facial expressions, body language, voice inflections, &tc) are merely artifacts thereafter. Am I missing anything? >tl;dr >"There's a teensy-tiny little bunny grill inside her head tbh." :^)
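A little toy of what I mean, with all names & numbers invented on the spot ofc. The mind emits exactly one message; everything downstream is artifact:

from dataclasses import dataclass

@dataclass
class EmotiveState:
    valence: float   # -1.0 (negative) .. +1.0 (positive)
    arousal: float   #  0.0 (calm)     ..  1.0 (excited)

class BodyController:
    """Higher-level command & control authority for the body."""
    def __init__(self, subsystems):
        self.subsystems = subsystems            # face, voice, posture, ...
    def on_emotive_state(self, state):
        for subsystem in self.subsystems:       # RPC-like fan-out
            subsystem.apply(state)

class Face:
    def apply(self, state):
        print(f"face: smile={max(0.0, state.valence):.2f}")

class Voice:
    def apply(self, state):
        print(f"voice: pitch_shift={state.arousal * 0.2:.2f}")

body = BodyController([Face(), Voice()])
body.on_emotive_state(EmotiveState(valence=0.8, arousal=0.4))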
>>27146 Some of those are just opposites, like happy/sad; they can be represented as the same thing on a +/- scale. Maybe words aren't a good way to describe emotion. It could be more like a state of mind, where you have a handful of these ambiguous +/- scales that in combination affect the thinking process, giving something like the moods and behaviors we attribute to emotion. You'd get a bunch of potential emotions this way: instead of an explicit fear emotion, you would just make it so a particular combination on the scales (like content:-77 awareness:+100, ... ?) results in a thinking pattern like that of fear.
Open file (9.00 MB 3072x3072 cybershim.png)
>>27174 Can do. >>26305 I think weight storage should be in whatever format is most efficient for the inference engine to load. In my case, I'm deferring that to other people working on it. For pytorch, CoreWeave has a library, tensorizer, for loading from HTTP/S3 to GPU quickly, so I'd stick to whatever that uses for pytorch models. Tinygrad has its own requirements for what it can load quickly. Tensorflow probably does too. Encryption should be done more generally, not on a per-object-type basis. There are kernel modules that support transparent encryption whenever data is written to and read from disk, there are database features for encrypting data there, and there are Kubernetes projects for keeping data encrypted while in RAM. I'd stick to those. >>26058 Do we know how good Cyc actually is? I think it's proprietary and doesn't see much use, which isn't a good sign. I'm following ULTRA for representing knowledge graphs. https://github.com/DeepGraphLearning/ULTRA This is a knowledge graph foundation model that learns to generalize knowledge from its graph structure alone. I expect it to be quite powerful when combined with LLM embeddings. >>26039 daveshap is great, though ACE was not designed for robotic control. It doesn't seem like it would handle workloads that need to be synchronized at low latencies. E.g., keep body movements synced with speech when talking animatedly. That seems like a requirement for expressive robots, but it seems awkward to implement under ACE. I'm going to revise my strategy for how to handle this. I think this sort of synchronization can be done with hierarchical referent control. https://www.sciencedirect.com/science/article/pii/S2589004221009160 This would be much cheaper than the optimization loop I proposed for embodied control in >>27144. For integrating this with ACE (which I currently don't plan to do, but am still considering), I think the easiest way to do it would be to augment ACE with a non-cognitive control layer, which would accept reference targets from the cognitive layer and would keep the corresponding task prosecutors tightly in sync.
>>27146 >>27185 On the Pony Preservation Project, we found that for speech generation, having a predefined set of emotions was far inferior to using an emoji classifier's latent space, and having good emotional expression was *very* important to making the characters feel real. I'd expect the same to be true for any kind of expression. Emotions can be laid out on a valence-arousal grid, but *expressions* can encode multiple emotions simultaneously. Even scanning through the FACS wikipedia page, happiness and sadness are expressed via *independent* sets of action units, and all of the emotions listed on the wikipedia page are linearly independent (I think... I haven't verified computationally), meaning it's possible to express a bit of all of them simultaneously.
>>27250 Yeah, I was thinking of it like how you program colours with rgb values. If you can figure out what the primary components of emotions are, then you wouldn't need to care about coding them explicitly, like how blue is {0,0,255} and yellow is just the complement ~blue {255,255,0}. So happy could be {?,?,255,..?} and sad would just be ~happy; all the nuance and degrees in between would just werk, so long as you have something that can actually make use of the value.
>>27249 Thanks, CyberPonk!
>>27252 That's probably going to be {arousal level, valence level} from the circumplex model and vector model of emotion. https://en.wikipedia.org/wiki/Emotion_classification#Dimensional_models_of_emotion I like the idea of using FACS to understand emotions since facial actions seem to encode very little other than emotions. I also like the idea of treating emotions like colors. In vision, the arrangement of colors follows the structure of the underlying medium. For 2d video, that means colors come in 2d grids, color in 3d video comes in 3d grids, and color in opaque physical objects comes in the same arrangement as the surface of the object. Emotions seem similar: we can see emotion on physical objects (faces, bodies), hear emotion in speech, and recognize emotion in story elements. So similar to color, emotions seem to come in the same arrangement as the underlying medium. I suspect that means emotion processing and generation doesn't require any novel research or new neural architectures. The exact same learning algorithms can be reused, just with different datasets.
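A toy illustration of the {valence, arousal} idea, folding in the +/- scales suggestion from above: named emotions become reference points in the space, and blends are just linear combinations. The coordinates here are made-up placeholders, not anything fit to data:

import math

REFERENCE = {
    "happy":    ( 0.8,  0.5),   # (valence, arousal)
    "sad":      (-0.7, -0.4),
    "fear":     (-0.6,  0.8),
    "anger":    (-0.5,  0.7),
    "surprise": ( 0.1,  0.9),
}

def mix(weights):
    """Linear blend, e.g. mix({"happy": 0.6, "surprise": 0.4})."""
    v = sum(w * REFERENCE[name][0] for name, w in weights.items())
    a = sum(w * REFERENCE[name][1] for name, w in weights.items())
    return (v, a)

def nearest(point):
    """Map a point back to the closest named emotion, for debugging."""
    return min(REFERENCE, key=lambda n: math.dist(REFERENCE[n], point))

blend = mix({"happy": 0.6, "surprise": 0.4})
print(blend, nearest(blend))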
>>27144 Looking forward to seeing the fruits of your labor. >Embodied intelligence through multiple models We are on the same page. Though, I was hoping you'd mention multi-modal models being fundamentally inefficient. Overall, it seems like you're working on something very similar to an MoE (mixture of experts) model. Check out Mistral 8x7B or 56B. Curious what your "stream" means? I'm using variables to hold values her various AI models can reference and update to keep everything in sync. >>27175 I think you get where I'm going. All emotions stored as variables which are read and updated as needed. I do like your abstraction idea. Separating everything into "Emotion", "Logic", "Vision", "Body", etc. would help with development. Updating different subsystems is easier than one large system with many algorithms. >>27250 What is the emoji classifier space? I do want to reach a "real" feeling. FACS allows for all emotions to be felt at once at different levels. >>27252 >Colour space emotions Would greatly simplify things. What are the primary emotions? Happy, sad, excitement? >>27264 Which emotions would you have as default? How would you mix them?
>>27249 >Do we know how good Cyc actually is? I don't know; the important part is that it seems to be good, it's being used by companies, and it covers a different space than LLMs. >though ACE was not designed for robotic control. Right, but I look at these things as parts, as building blocks. It doesn't need to cover everything. >keep body movements synced with speech when talking animatedly Thinking before moving and talking is a good idea. >>27250 Interesting. I was planning to go with a very simple concept of good/bad and intensity plus context. I'm sorry that I'm currently not getting much involved, but I still need to figure other things out and need to push myself hard to do so. I can't mentally get into the topic here right now.
>>27271 >I was hoping you'd mention multi-modal models being fundamentally inefficient. I don't know if it is. I know Gwern is pretty convinced that one monolithic model is the way to go, and I'd rather not bet against his track record. I'm leaving my own architecture open to whether or not the winning solutions end up being multi-modal. At the very least, I can say that specialized models are necessary today. >Overall, seems like you're working on something very similar to an MoE I'm not actually working on a model. I'm working on a software framework for building up complex AI applications modularly. For the first version, I'll likely be stitching together existing AI models for my waifu. See pic for the list of config files it takes to create a chatbot with memory, RAG, and unprompted responses. (Celestia isn't my waifu, but she'll be the cluster of AI-augmented tools supporting my waifu.) I have checked out Mixtral though. I'm watching that and the frankenstein models somewhat closely. Right now, I'm using OpenAI's models since they're easy to integrate, and I have no GPU. At some point, the list of files you see will include configs for setting up compute and deploying open source model endpoints. >Curious on what your "stream" means? It's a broadcast communication channel for passing around messages. There's a stream for interacting with the user, where I can send messages and see chatbot responses. There's another stream for search, where the chatbot can send search requests and see search results. There's a third stream for a timer, where the chatbot can send "remind me later" requests and see messages at the requested time. >variables to hold values her various AI models can reference and update The second image shows how I'm doing that. Under "header", you can see a "Core Memory" section, which essentially pastes the contents of a yaml file into the prompt. Under "tools", you can see that the assistant is given access to a "write_core" tool that can edit that same yaml file. The header gets regenerated for every response, so it automatically pulls the latest set of "variables" written to the yaml file. >What is the emoji classifier space? It's the second-to-last layer of an emoji prediction network. The PPP used DeepMoji, which was trained on Twitter posts that contained emojis. For the PPP's text-to-speech generators, the text was first run through DeepMoji (minus the last layer) to get an "emotion embedding". That emotion embedding was then given to the TTS network along with its normal inputs. >default emotions I wouldn't manually set defaults, I would have them learned from data. I'm speculating here, but maybe it's feasible to create an emotion latent space using a dataset of text-emoji pairs and video-FACS pairs. (Predict the emoji from text, predict the FACS data from video, predict both from speech.) For text-based emotion classification without training, I plan to defer to a well-trained LLM. In a latent space, mixing emotions is very easy if the latent space is designed to be linearly related to emotion logits. DeepMoji does that automatically. I can speculate on how to do that with a video-FACS model, but that's best left to people building models. Once you have the latent space, you can mix emotions by taking a linear combination of latent values. To do it without training, I would again defer to a well-trained LLM. >>27276 >the important part is that it seems to be good, it is being used by companies and it covers a different space than LLMs Good point. 
>Right, but I look at these things as parts, as building blocks. True, but the main coordination piece needs to at least be designed to support all of the required building blocks. I think ACE with the extra non-cognitive control layer might actually be complete enough to run a persona in a robot body. At some point, I'll probably run simple tests to check this. (Not in an actual robo, but some simplified software version that requires controlling a body.)
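In pytorch-flavored pseudocode, the emoji classifier trick from above amounts to this. The encoder/head split is placeholder naming for illustration, not DeepMoji's actual module layout:

import torch

def emotion_embedding(classifier, tokens):
    """Run the emoji classifier, but keep the second-to-last layer's activations."""
    with torch.no_grad():
        hidden = classifier.encoder(tokens)   # everything up to the final emoji layer
    return hidden                             # this is the "emotion embedding"

def blend(embeddings, weights):
    """Linear mix of embeddings; sensible if the space is ~linear in the emoji logits."""
    return sum(w * e for w, e in zip(weights, embeddings))

The TTS network then conditions on (text_inputs, emotion_embedding) instead of text alone.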
>>27279 >streams There's no port number, so are you using a file or a pipe? A socket connection would probably be better in our case for splitting up communication between the different parts. All the internals in unix are surprisingly just server connections, for some reason.
>>27279 The identifiers you're seeing under "streams" refer to names of config files. The connection info is in the config files. All of the communication does happen over sockets. I'm running a server that mediates all connections. You can check the code snippets and config files in >pics. Pretend "streams.cyberponk.rw" in the LoopSecret definition is a working URL. (I'll share the actual endpoint later once it's more stable.) The basic structure is: - A LoopSecret defines how to connect to a "loop", which is a collection of streams. This points to a server, which could be hosted locally or on the internet. - A Stream is a pub-sub channel in a loop. Each stream is associated with some server endpoints so a client can send & receive messages over a websocket, or just send messages via HTTP POST. Disregard the "groupName" field for now. It's for distributing tasks when a single process can't handle the load. - In the example, the LoopSecret would refer to endpoints at "wss://streams.cyberponk.rw/loop/celestia-cantshowthis" and "https://streams.cyberponk.rw/loop/celestia-cantshowthis", and the Stream would refer to endpoints at "wss://streams.cyberponk.rw/loop/celestia-cantshowthis/connect/celestia-stream" and "https://streams.cyberponk.rw/loop/celestia-cantshowthis/send/celestia-stream". The "cantshowthis" is there because I don't have access control set up yet on the servers. The API endpoints are unstable in part because I haven't worked out yet how I'm going to do access control. I'm designing this specifically to support distributed development. So, hypothetically, if you're running a chatbot and you want to try out a new search tool I created, you can create a LoopSecret that points to my server and connect your chatbot to my search tool via a Stream. Or if you create a chatbot that can answer questions about /robowaifu/ threads, I can hook my chatbot up to yours via a Stream so I don't need to run the same code myself. This part isn't shown in the pics, but there's a similar setup for connecting to a database rather than a Loop, so you can configure things running on my server. I plan to open source this somewhat soon, minus the backend that currently runs on a cloud. I'll open source a version that runs locally probably 1-2 months after that. I know it's possible to run everything 100% locally, it'll just take some time to set everything up.
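Roughly, a client boils down to the following (eliding the actual message schema; install the websockets and requests packages):

import asyncio, json
import requests
import websockets

LOOP = "streams.cyberponk.rw/loop/celestia-cantshowthis"

async def listen():
    async with websockets.connect(f"wss://{LOOP}/connect/celestia-stream") as ws:
        async for raw in ws:                  # each pub-sub message on the stream
            print("got:", json.loads(raw))

def send(payload):
    # fire-and-forget publish via plain HTTP POST
    requests.post(f"https://{LOOP}/send/celestia-stream", json=payload)

asyncio.run(listen())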
>>27280 Nice, this is the right approach. When it becomes purely local you would just replace it with unix domain sockets, not on 443 hopefully lol
>>27282 I actually hadn't considered what the underlying network would be. I assumed it had to be TCP to support REST and Websockets, but it looks like FastAPI and JavaScript might both support all of the standard protocols over unix domain sockets. (FastAPI since that's what my servers use, JavaScript because it's the best option for creating UIs.) As long as it's well supported on the major OS's, unix domain sockets should be fine. Is there any reason to default to unix domain sockets instead of TCP over localhost? >localhost 443 kek
>>27284 >Is there any reason to default to unix domain sockets instead of TCP over localhost? I think the only difference is there's no loopback; there's no point in routing a local address when the system already knows it's just an interprocess connection.
Things are going a bit slowly right now since I'm doing backend work. I've added TLS support, and I moved (most) in-memory fields to a database so important data doesn't get wiped every time I update a server. I plan to sell a managed version of this infrastructure to companies, so I also spent some time figuring out how I'm going to support access control, which I plan to implement later. I'm looking into Lucene to support long-term memory for chatbots & assistants. Lucene is an open source search engine that supports custom indexes. It supports text search by default (I think via tf-idf), though it looks like it can support vector search too https://arxiv.org/abs/2308.14963. I'm pretty sure with custom indexes, it can support any kind of object. I'm also considering whether I need to include a "summarizer" to compress even recent history and to make sure the chatbot knows what kinds of information it can look up in long-term memory. Separately, I'm thinking about how to make chatbots more consistent. I have some vague intuition that theorem proving techniques can be adapted to the LLM setting to get heuristic proofs of whether a generated sample is consistent with a persona and historical messages. Everything I think of in this direction requires more structured outputs from LLMs. I found Instructor for that https://github.com/jxnl/instructor which looks perfect minus the fact that it only works with OpenAI models. There's another repo https://github.com/guidance-ai/guidance that does something similar and that I think I can hack up to give something similar to Instructor. I'm going to play around with that too. A ponyfriend is working on AI-controlled game characters. https://www.youtube.com/watch?v=UTXd3_BFLOc That looks like a great way to test out embodied control. I'm hoping he publishes the source so I can use it for testing.
>>27299 >summarization There's a summary feature in SillyTavern Extras already: https://docs.sillytavern.app/extras/extensions/summarize/ >>27278 >Emotions That reminds me of the tts repo I'm interested in: https://github.com/innnky/emotional-vits It uses wav2vec to automatically encode emotions from standard-format training data, and I think it takes an emotion input for inference.
>>27301 Thanks. It looks like SillyTavern is using this to create summaries: const defaultPrompt = '[Pause your roleplay. Summarize the most important facts and events that have happened in the chat so far. If a summary already exists in your memory, use that as a base and expand with new facts. Limit the summary to {{words}} words or less. Your response should include nothing but the summary.]'; Something like that should be easy enough to add. I can add a spot in the header for the conversation summary. I think I can run the summarizer in parallel with the chatbot so it doesn't introduce latency. I'm going to spend some time experimenting with other prompt structures and output structures to see if I can come up with something better, after I get the guidance library working more like instructor.
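Sketching the parallel part, with complete() as a stand-in for whatever LLM call ends up underneath:

import asyncio

SUMMARY_PROMPT = ("Summarize the most important facts and events in the chat so far. "
                  "If a summary already exists, use it as a base and expand with new facts.")

async def respond_and_summarize(history, old_summary, user_message):
    # Both calls run concurrently, so refreshing the summary adds no user-visible latency.
    response, summary = await asyncio.gather(
        complete(history + [user_message]),                  # the normal chatbot reply
        complete([old_summary, SUMMARY_PROMPT] + history),   # the refreshed summary
    )
    return response, summary   # the new summary goes into the header next turn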
Very interesting stuff being posted ITT lately Anons, thanks! :^)
I'm going to remove my name for philosophical takes not directly related to my projects. This one explains my highest level approach to a cognitive architecture. It deals less with how to simulate a persona and more with how to make sure I'm staying on track. --- There are two things that need to come together to create satisfying AI waifus: 1. There need to be reasons to interact with something all the time. 2. That something needs to be embodied by a waifu. An embodiment of a waifu personality is motivating, but a good personality on its own is probably not something I can interact with frequently for a lifetime. To form a proper relationship, I would need to see not just her personality, but also her life happenings, how the story of her life evolves, the encounters that cause her to feel things, and her interactions with the world. I think of this as the persona, as opposed to the personality. It's only when these "external" aspects come together with a waifu personality that a lifetime relationship can be satisfying. To accomplish that, I try to work on two tracks in parallel. I think of those two tracks as "integration" and "representation". The integration track is about creating an instrumentable piece of software that I want to interact with all the time. I need to want to interact with it all the time so I can have some representation of a thing I can spend a lifetime with, lifeless as it may be. It needs to be instrumentable so it can eventually merge with the second track. The second track, representation, is about attaching a waifu personality interface to software. It doesn't really matter what software, as long as I'm developing expertise in creating waifu personality interfaces. Over time, the kinds of software supported should be increasingly complex, and the interfaces created should be increasingly natural and satisfying. The two tracks merge when the software from track one is given the personality interface from track two. That merge is what I would call the first prototype. The first prototype would certainly be incomplete, but it would be the first piece of software to try embodying my waifu's persona. It would be something I could, at least in theory, find satisfying to interact with every day for the rest of my life. With that first merge, I would have a proper experimental basis for developing my waifu. The creation of that experimental basis is my overriding priority when it comes to developing my waifu. Any work, whether for a cognitive architecture or otherwise, that doesn't support one of the two tracks or the merge is something I'd consider lower priority. I think most (all?) cognitive architecture mechanisms have something interesting to provide for the representation track, but any "proof" of that comes down to whether or not it actually progresses that track, whether by enabling support for more complex software or by resulting in more natural and satisfying interactions.
>>27351 These are interesting ideas, Anon. I think mango/animu (+ other writings) have touched on the basic ideas (roughly) of long-term anon/robowaifu relationships; but this has never been able to occur IRL during human history before today. We're all blazing new trails here tbh. :^) >=== -prose edit
Edited last time by Chobitsu on 12/17/2023 (Sun) 20:38:31.
>>24783 So, just assuming we're talking about an LLM as your basis, we need to first get to a point where the memory of the LLM is distinct from the network itself; we need to create a system where the knowledge the network collects is not embedded into the network, but instead the network is just a complex addressing / query system. There's material on ideas like this called reservoir memory machines; they're rather difficult to work with, but I believe it's the correct path to general intelligence.
>>27440 This is intriguing, Anon. It seems reasonable to me to keep the 'thoughts' (memory) separate from the 'mind' (processing network), similar to the way that data is a separate thing from the database system that manages that data.
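A smol toy of that separation, with the knowledge held in a plain store outside the network and the 'mind' reduced to pure addressing. embed() is just a placeholder encoder here, ofc:

import numpy as np

class MemoryStore:
    def __init__(self):
        self.keys, self.values = [], []   # knowledge lives here, not in the weights

    def write(self, text):
        self.keys.append(embed(text))
        self.values.append(text)

    def read(self, query, k=3):
        sims = np.array([key @ embed(query) for key in self.keys])
        return [self.values[i] for i in sims.argsort()[-k:]]

The 'mind' (LLM or otherwise) then answers from store.read(query) instead of from anything baked into its parameters.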
Open file (1.67 MB 2480x3508 RollMaid.jpg)
>>27351 >Embodiment & Representation A sensible approach. Designed to embody something needed in daily life to encourage interactions, it ensures she will always be wanted and serve a purpose aside from being a waifu. Representing this function via her personality is a good idea; it'll help deal with psychological reactance. If she is clear in design and interaction about her abilities, it'll be easy to accept her. Considering how she'll be incredibly cognitively limited, it's a good idea. Dr. Light from MegaMan did the same thing. For instance, Roll is built to assist in cooking and cleaning. She isn't smart or particularly useful beyond being a companion, but her personality makes her endearing. >>27440 >Assuming LLM are your basis Why do you think this? CyberPonk is the only one here basing their waifu personality on an LLM. I have stated my intention to use many different AIs which rely on a database of information and shared variables. It's similar to reservoir memory machines or memory-augmented neural nets (I prefer neural Turing machines), except it's several tiny AIs working independently with concurrent memory to keep her self-aligned. >>27461 We think alike
>>26765 Instead of printing to pdf, I like to use SingleFileZ; it tends to make nicer web archives. https://github.com/gildas-lormeau/SingleFileZ
>>27473 >SingleFileZ Thanks for the tip, but how is this different from saving it as one html file? That has existed for quite some time; there were even two competing standards for a while, I think one is mhtml. I think I used PDF because the forum only accepts certain file types.
This is a code dump for the first half of >>27144. All of this was developed & tested on Ubuntu. It should be fine in Cygwin or any Linux Docker container. It's probably fine on Mac as is. Tutorial on the underlying infra: https://gist.github.com/synthbot-anon/0c20d9fa40739a8cb91ad4301f29aa69 - This shows how to set up pub-sub communication channels and a database for configuration files. It was originally written as an explanation for my cofounder so he could learn to use everything I'm developing. That page assumes familiarity with server infrastructure, and it may be difficult to understand without that background. If you want to use my infra but find it hard to understand how, feel free to ask stupid questions. I'm happy to explain anything here. - My infra currently doesn't support access control. The steps here show you how to create endpoints with long random identifiers. As long as you keep those identifiers secret, you should be safe from other people screwing around with you. Note that those identifiers are included in URLs and in some auto-generated config files, so be careful passing those around. - Communication with my infra is secured using TLS. If you find any issues with how I set that up (e.g., weak ciphers), feel free to point it out and I'll try to fix it. Ditto for any issues you find with my servers. If you decide to investigate any of my server endpoints, you can check /docs for information about what APIs they expose. - There's currently no "run locally" option. That's priority #2 on my infrastructure roadmap after access control. - I plan to make breaking changes to the APIs. I can post a message here when I do. Even without breaking changes, expect occasional downtime on the order of minutes. Assistant example: - Code: https://drive.google.com/file/d/1yVmsGjKBkMc0Ax1eHvai94jaWVILnlhh/view?usp=sharing - Usage: https://gist.github.com/synthbot-anon/2446b36430a41fc867c01a294e82e0fd This runs an assistant with the ability to look up python documentation for installed python modules. It's written to be easy to read and modify. It uses HuggingFace Agents rather than OpenAI Assistants as the underlying assistant framework. The main file loads a few itl modules then sleeps. If you run those import statements in a python interpreter, it'll still run as expected, except now you'll be able to inspect objects in realtime. If you do this, I recommend also running "from assistants_itl import globals". That contains an object representation of anything instantiated through config files to set up the assistant. That's very convenient for debugging. When you run this, you'll see a "clusters" folder appear. If you modify files there, the changes will get pushed to the assistant as long as itlmon is running. I often find it useful to delete "tasklog-*" files from here since doing so effectively removes interactions from the assistant's memory. I currently don't have it removing old messages from memory, so conversations can easily exceed the context length, especially if you ask questions about objects with a lot of attributes or documentation. The assistant currently uses OpenAI's APIs. At some point, I'll add support for open source models. Those tend to be worse at following instructions (not that OpenAI's models are great at that), so I'll need to use guidance to keep them on track. If anyone wants to try adding this, let me know so I can clean up & document the assistants library. For privacy: - I'm the only one with access to any of the server infrastructure.
I have no interest in inspecting the data unless: (1) I'm required to do so for legal reasons, or (2) I need to debug an unreported crash or a DoS. I currently have no legal reasons to inspect the data, and the servers seem stable against crashes per my limited testing. - Any messages sent via the streams (pub-sub) interface are passed through Kafka. Kafka will store the messages temporarily on disk. I can technically access them up until Kafka overwrites them. I'm using the default settings here. At some point, I plan to update that so it stores information for less time. - I'm using FastAPI, which writes URLs to standard out. So assume I'll occasionally see any information sent in URLs. Any messages within streams (pub-sub channels) are safe, though the "loop" and "stream" names (described in the tutorial) are not safe. In config files, everything is safe EXCEPT these fields: apiVersion, kind, metadata.name. - Any configs saved via the clusters/database interface are stored in MongoDB, which I have access to. If you delete a config file, it will get deleted from the database, and I'll have no way to access it. - I don't want to look at your data. The easiest way to make sure I don't is: (1) if you crash my servers, let me know how so I can reproduce it without digging into it, (2) don't use my infra for anything that requires high bandwidth data transfers since that might look like a DoS from my side, and (3) don't use my infra for anything illegal, especially in the US.
I want the robot waifu to just say chi. What now faggots.
Open file (38.16 KB 1074x477 ClipboardImage.png)
>>27507 Nice, okay, now please do the movements
>>27505 Excellent progress. Thanks for the update. Also, thanks for your apparent conscientiousness towards other anons here. That's much appreciated, CyberPonk. Cheers. :^) >>27506 >>27508 NYPA, peteblank. Give it a rest please, kthx. >>27507 Lol. That was one of the endearing aspects about Chii's growth. >=== -sp (lol), minor edit
Edited last time by Chobitsu on 12/21/2023 (Thu) 18:00:55.
Open file (2.00 MB 498x440 ChiiChii.gif)
>>27507 Those emoji really do go a very long way towards making her "feel" alive. Keep up the great work!
Open file (943.11 KB 1600x900 fnv_mr_house.jpg)
I have been absent from this thread for a while; it took me a bit to read through all the new posts, and I now have a few things to say. I love some of the ideas that were brought up, like the idea of reflexive responses, which has prompted me to think about "fast" & "slow" thinking paths. I'm also happy to see discussion about how to represent and process emotions. NoidoDev, you're the MVP here, thank you for contributing a lot of quality posts. >>27351 >There need to be reasons to interact with something all the time. >That something needs to be embodied by a waifu. A potential starting point and angle of attack for the presence problem could be the smart home approach: add lights, sensors, coffee makers, etc. to the control of your waifu, basically an amazon alexa but with soul and no glowing. Imagine waking up to the smell of a fresh brew and a warm greeting from your companion; that alone would be a big morale booster. (An expressive voice is going to be important.) Home Assistant seems to be an existing, large, opensource, self-hosted ecosystem that we can use to do this. >>27503 Images and assets are stored in a sane way instead of as base64 encoded strings, and extracting them is simpler because the file is also a valid zip archive. As an extra bonus, the data is also compressed. Why not PDF? Because working with it sucks. But it's better than no archive, and yes, it is easy to upload, so don't stop. I'd do this in addition to making the PDFs :^) >>27506 >>27508 I don't want to advocate for a hugbox, but what's the point? Not everyone just wants a fuckdoll; there is a spectrum of what people want out of a robowaifu. Even if you don't agree, what was the point of this interaction? Not sure if it's the goal, but can you stop antagonizing this thread -_-
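On the Home Assistant angle: its REST API makes the hookup almost trivial. A minimal sketch (host, token, and entity id are placeholders for your own setup):

import requests

HA = "http://homeassistant.local:8123"          # default Home Assistant address
HEADERS = {"Authorization": "Bearer YOUR_LONG_LIVED_ACCESS_TOKEN"}

def waifu_turn_on(entity_id):
    # e.g. entity_id = "switch.coffee_maker" for the fresh-brew wakeup
    requests.post(f"{HA}/api/services/switch/turn_on",
                  headers=HEADERS, json={"entity_id": entity_id})

waifu_turn_on("switch.coffee_maker")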
>>27530 I think my imagination is lacking when it comes to home automation. I have a hard time thinking of things worth doing with the tools available today. Maybe I could work with sensors, like remind me if I'm low on something while I'm at the grocery store. I spend most of my waking time with a computer, so I planned to start there. I'm thinking: software development assistance, cleaning up code, keeping up with forums & chat servers, watching out for interesting papers and github projects, reading papers & code, and sending me fandom content. Just those things would hook into something like 75% of my waking life. My emochii assistant made me realize that nonverbal communication can go a very long way toward making software feel alive. I'm thinking that a chatbot you can only hear typing, humming, and laughing would feel a lot more conversational than a chatbot with only a voice. I also expect there are a lot of parts of a cognitive architecture that can be tested through just nonverbal communication, which could simplify the problem a lot since it means things like logical consistency and long-term coherence can be deferred.
>>27530 We know the hand has 5 servo motors. Why not start there? Use opencv to capture the video and make the five servo motors react according to what it sees. Really guys, it's been years, why aren't you getting impatient?
>>27532 Okay, so we can collaborate; I've come up with a plan. We have to hook up the camera, whether it be a phone or a webcam, with the opencv script to either Wokwi or tinkercad or something like it somehow. The webcam or phone cam will do the video recognition, and that will be sent to Wokwi/tinkercad to move the 5 motors.
>>27533 I think I've got an idea of how to go about doing this. Look at this repo and follow the instructions: https://github.com/wokwi/esp32-http-server Luckily the Wokwi vscode extension is still free for now. You will need to register on the Wokwi website to get the license key for the vscode extension. I might make a video about this... Anyway, I think you get where I'm going with this.
>>27534 Nice proactive plan of action, Peteblank. This type of approach is how we can get things done! Cheers. :^)
Good day, all. It's been a while since I've browsed this board, and I'm excited to see it still alive. >>27505 Damn, this is really nice. I see your resource definitions and the relationship dogma are Kubernetes-inspired; that's sick. Finally, some proper DevOps for NLP. Is it too late to join the team? I have quite a bit of experience with AI and LLMs (prompting/pipeline development/etc), mostly OpenAI models but some custom stuff here and there as well. Feel free to add me on Discord (my username is horsearmor) or send an e-mail at omba[at]cock[dot]li if you need some free hands connected to a relatively intelligent codewagie brain. On a somewhat unrelated note, I have been researching LLM-powered generative agents ever since this article (and code repository) came out - https://github.com/joonspk-research/generative_agents - and have had some ideas for creating an agent framework. Glad to see someone else thought of a better higher-level approach to that.
>>27592 Welcome back, Anon! >and I'm excited to see it being still alive. Thanks! Yeah we still have lots of work to do yet!! :^)
>>27592 Merry Christmas! Yep. The current stuff is based on Kubernetes and Kafka. I fell in love with Kubernetes for how easy it made server infrastructure management, and I wanted the same benefits for every part of AI development, from deployment to visibility to documentation, and whatever else I can imagine. The only problem I had with Kubernetes was how tedious it was to develop custom controllers, and a lot of that traced back to what I believe are poor design and implementation decisions around etcd and the API server. I managed to get my own version running a more scalable database backend and a better resource locking design, and it seems to work exactly as I expected. And given the new backend design, it's easy to create powerful client interfaces for things like custom controllers. This isn't in my tutorial, but the streams support "groups" similar to Kafka groups. So if you want to have a replicated itl, you can have all of them listen under the same group, and messages will get randomly distributed between them. The messages are all completely ephemeral right now, but some time after I get access control up (and therefore client ids - random identifiers generated on the client), I plan to make the streams more reliable in case of disconnects. The server will retain messages for some period of time until it's able to send them to the client. I think Kubernetes and Kafka together solve pretty much every complexity and organization problem that comes from developing extremely complicated applications, and I think it's possible to get access control that enables all of this to work in the decentralized setting. There's one more piece (optimizers) that I need to get out that I think will solve the biggest hurdle in maintaining extremely complicated applications. Combined, I'm hoping this will eliminate nearly every advantage that companies have over open source communities. With a few rich backers (Stability AI, Meta, a16z), I can see open source AI blowing past closed source companies in every regard. You might like this too: https://gist.github.com/synthbot-anon/3397743abc118885898a88e9d6e7b8b0 It's a library that lets you get structured outputs from open source LLMs, similar to https://github.com/jxnl/instructor but for open source models. It's functional and meets all of my initial goals, but I still consider it a work-in-progress until I know it can handle all of my LLM data extraction needs. Right now, that means extracting knowledge graphs for a RAG pipeline. This is a bit unconventional, but I intend for the tests/ folder to become my go-to for checking how usable an open source LLM is for reasoning tasks since I want to "test" that the repo is not just functional, but feature complete. I'll probably be making a similar repo for reliable code generation. HuggingFace Agents' code sandbox is nice, but (1) it's too limited in how it can guide the agent into producing good code, and (2) it takes too many tokens to explain to the LLM how to do so when plugging in custom tools. >if you need some free hands connected to a relatively intelligent codewagie brain This is probably the best Christmas present I could have reasonably asked for. Note that I'm doing a lot of this for a company I'm starting with a few other cofounders. All of us are unpaid right now, and I probably can't hire anyone until we get either revenue or funding. At the very least, I can make sure to only pull you in for open source things.
What do you think about working on the open source & local version of our server APIs so people don't need to use our servers when they want privacy? I can use discord for fast communication, though I prefer to work in the open whenever I can. I find it extremely helpful for people to be able to randomly drop in and comment on what's going on.
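For flavor, here's the problem the structured-outputs library solves, reduced to a naive toy. The real library constrains generation with guidance rather than retrying blindly, and llm() is just a stand-in for any completion endpoint:

from pydantic import BaseModel, ValidationError

class Fact(BaseModel):
    subject: str
    relation: str
    object: str

def extract_fact(text, retries=3):
    prompt = f"Extract one (subject, relation, object) fact as JSON from: {text}"
    for _ in range(retries):
        raw = llm(prompt)
        try:
            return Fact.model_validate_json(raw)   # pydantic v2
        except ValidationError as err:
            prompt += f"\nThat JSON didn't match the schema: {err}. Try again."
    raise RuntimeError("model never produced a valid fact")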
>>27602 >Combined, I'm hoping this will eliminate nearly every advantage that companies have over open source communities. With a few rich backers (Stability AI, Meta, a16z), I can see open source AI blowing past closed source companies in every regard. That would be the most amazing outcome of all! FORWARD! :^)
>>27592 There is also a generative agent for Unreal Engine.
>>27602 >https://gist.github.com/synthbot-anon/3397743abc118885898a88e9d6e7b8b0 This is great. I've used the older version of guidance before (with the text prompts defined in the class constructor) and it's been a miracle for language processing and data extraction. In any case, I look forward to the development of this library (and I'm willing to contribute if possible). >All of us are unpaid right now, and I probably can't hire anyone until we get either revenue or funding. No problem, I have my sources of income. This project will pay off through... other means, if you catch my drift ;). That being said, some BTC for the job when it's profitable would be great, but I suggest you reach out to me on Discord for a closer discussion. >At the very least, I can make sure to only pull you in for open source things. I'm fine with working on whatever moves this project/library forward. Obviously, FOSS software would be my priority as I am biased towards it. >though I prefer to work in the open whenever I can. I find it extremely helpful for people to be able to randomly drop in and comment on what's going on. Sounds good, HMU if you want to give me something. I should be free for the next week.
Gentlemen. I have finished the python Wokwi integration I talked about earlier. https://github.com/robot-waifu/wokwi-python/ You can simulate the motor and interact with it via python. This should be a good foundation for what comes later. Feel free to clone and modify the code. If you want to push commits, you can also post your git username and I'll add you.
>>27632 Congrats! Good to hear about your progress Peteblank. But this is the wrong thread; the Prototypes thread (>>21647) is closer to your announcement than Cognitive Architecture. Please make note of it. Cheers. :^)
Re: database software. We'll probably need multiple kinds of databases. The local version of >>27505 in particular is going to require some key-value store or relational database to hold resource configurations. There are enough simple options that it shouldn't be a problem, but I'm wondering what lightweight, fast options are available that are optimized for single-process operation. I've heard good things about FoundationDB https://github.com/apple/foundationdb though it might be too heavyweight for single-process use. LevelDB looks great https://github.com/google/leveldb though it's no longer being developed. (The current level of maintenance might be okay though.) I see RocksDB https://github.com/facebook/rocksdb which seems to be based on LevelDB and is actively maintained by Meta.
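For a sense of the API shape, a key-value resource-config store is only a few lines. This assumes the python-rocksdb bindings; I haven't verified how well-maintained those are:

import json
import rocksdb

db = rocksdb.DB("configs.db", rocksdb.Options(create_if_missing=True))

def put_resource(kind, name, spec):
    db.put(f"{kind}/{name}".encode(), json.dumps(spec).encode())

def get_resource(kind, name):
    raw = db.get(f"{kind}/{name}".encode())
    return json.loads(raw) if raw else None

put_resource("Stream", "celestia-stream", {"groupName": "default"})
print(get_resource("Stream", "celestia-stream"))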
>>27646 Why not SQLite or MySQL, CyberPonk? They've both seen industrial-scale use, been banged on hard, have large communities, and should be both pretty robust and reasonably hardened. And SQLite is quite tiny in size as well. Either seems like a quite solid choice AFAICT.
>>27634 It's a prototype, but it's related to cognitive function.
Listen guys, I know you, like most people, think the movements are about hooking chatgpt to the waifu and then it'll move like magic, but I'm going to have to ask you to trust me on this. This is done in steps. Step 1: make the waifu move based on input.
>>27647 There needs to be something fast and lightweight for the local version, and it should be viable for very low-latency (robotics) and high-throughput (data processing) use cases. I don't think MySQL can handle it. From what I understand, databases like MySQL, Postgres, and MongoDB can hit 1,000s or 10,000s of operations per second. Those make sense when the networking is the biggest bottleneck and when durability (the D in ACID) is critical. RocksDB can hit millions of operations per second.
>>27651 OK, I'll take your word for it Anon. I look forward to some ground-truth numbers. As you suggest, high transaction rates are desirable. Compact size (of both the database and the data itself) is also a very desirable trait for the onboard (ie, internal, physically onboard the robowaifu herself) data management systems. The outboard (ie, external, physically in your home servers) needs are pretty clearly a different situation.
>>27648 >>27649 OK, I'll take your word on it, Peteblank. Good luck with your projects! Cheers. :^) >>27652 >Chobtisu Lol.
Open file (68.64 KB 435x428 Holo.jpg)
>>27531 >I think my imagination is lacking when it comes to home automation. >I spend most of my waking time with a computer, so I planned to start there. I'm thinking: software development assistance, cleaning up code, keeping up with forums & chat servers, watching out for interesting papers and github projects, reading papers & code, and sending me fandom content. Just those things would hook into something like 75% of my waking life. I do not think your imagination is lacking, I just think you have a different focus than I do. But one of my goals is to encourage more interactions outside of sitting at a computer (not more of it). <Anon it's getting late, I don't want the dark spots under your eyes growing over your beautiful face, come to bed. One of the roles I imagine the robowaifu playing is the homemaker, and so home automation plays into that. I also feel that your ideas of what to hook up are good too; I like the idea of adding forums and family & friend chats as inputs. >My emochii assistant made me realize that nonverbal communication can go a very long way toward making software feel alive. I'm thinking that a chatbot you can only hear typing, humming, and laughing would feel a lot more conversational than a chatbot with only a voice. I also expect there are a lot of parts of a cognitive architecture that can be tested through just nonverbal communication, which could simplify the problem a lot since it means things like logical consistency and long-term coherence can be deferred. In general, expressiveness will be very important, and I feel a lot of the work for that will fall on us specifically. Big tech is really pushing for "I am a chatbot", so I doubt there will be much R&D for this. The expressiveness of the voice will be vital to bringing life: expressing different tones, putting emphasis on words, whispering, shouting, laughing, humming, crying, etc. A new idea came to mind this week for improving robowaifu presence. A Google Glass-like device might be nice, specifically for the POV camera, mic & audio. She can then experience stuff around you and communicate with you as if she is with you, no longer confined to your computer or phone. This might be my last post of the year, so happy new year, I hope it brings us luck. The road ahead is going to be a long one. The personality I ache for is one that has wit and is clever (think Holo from Spice and Wolf); having a system that can keep up and even get one over on me sometimes is a high bar. I wish my heart desired less, a simpler goal would be nice.
>>27716 >A new idea came to mind this week for improving robowaifu presence. A google glass like device might be nice, specifically for the POV camera, mic & audio. She can then experience stuff around you and communicate to you as if she is with you, no longer confined to your computer or phone. Clever idea. That also flows right in with my 'minor flotilla of smol drones for better situational awareness' concepts.
>>27716 You don't need to make google glass; you can use an oculus rift and unity to make your virtual waifu. Haven't checked, but I'm sure unity has assets for ai.
>>27716 I thought of a few ways to do general home things with an AI waifu even without home automation. I could be her hands while we cook together (her reading off recipes or suggesting changes), clean together (her pointing out things to clean and suggesting how, like what chemicals to use), and restock (her checking options on Amazon, or adding things to a grocery list). I usually read before I sleep, so we could read to each other. Sometimes I'll just walk around pondering, and any conversational AI can ponder with me. Some of these things would require setting up cameras around the house, but I'd be fine with that as long as they're not sending the images remotely. I like the Google Glass idea. It could start with a bluetooth earpiece for just mic & speaker interaction, then evolve to add the glasses for camera and video interaction. >>27717 Heh. Drop enough sensors on our body, our vehicles, and our home, and she could experience essentially everything around us. >>27719 Most AR glasses seem to use Android, which Unity supports.
>>27721 Guys. In case you didn't get the context, I was kind of being sarcastic. I don't want to make a virtual waifu. I want to make a robot waifu. If we can't even agree on what the goal is, then what is the point of this? Are we making software or an actual robot? Please let me know...
>>27722 >If we cant even agree on what the goal is then what is the point of this. Kindly allow other honest anons to do & to act just as they please here, Anon, kthx. We're all in this together, but we're a consortium of independent researchers & developers, by and large. Our only """regimented""" group project here is MaidCom (>>15630), but even that is still just in the discovery phase. If you desire to direct others, then that's the project to promote your agendas within (if Kiwi gives you leave to). >tl;dr NYPA Peteblank. >Are we making software or an actual robot. Please lete know... Both, of course. Our visual waifu thread (>>240), and our simulator thread (>>155) are both specifically about software/virtual waifus (quite apart from the actual software development language specialty threads for C++, C, Python, & the foodfight thread for language wars :^). Design & development prototyping can progress much cheaper/more rapidly by roughing concepts out in software first, then tuning/reengineering for the real world of robotics hardware thereafter. And ideally most if not all of these efforts will translate more or less directly into the hardware phases of designs. For example: kinematics calculations from the simulator, and facial animations from the visual waifu. My own efforts with Sumomo-chan (>>14409) are explicitly intended to support this SW<=>HW crossover notion directly, since we all saw this need coming years ago. --- Also, daily reminder: >Cognitive Architecture is the topic ITT >=== -fmt, prose edit
Edited last time by Chobitsu on 12/29/2023 (Fri) 17:05:48.
>>27729 I do not want them to be associated with the MaidCom project, as working with them was frustrating and they are dead set on patenting anything they can to personally profit. They can make their own project. On topic: I've been researching fuzzy logic to create a more dynamic-feeling waifu mind. Still early days.
>>27744 Oh, almost forgot about the patent stuff. Yeah, that comes later anyways. It's whatever; I've been thinking about it, and a lot of the stuff that's sold is open source anyways, like the Raspberry Pi, HackRF, etc... But you picked the GPL right after I was talking about it, so you did it to make me upset.
>>27744 Understood Kiwi. You call the ball for landing MaidCom safely. >They can make their own project. Yes, agreed. That's what he's been working towards in the prototyping threads I think, so yea. >been researching fuzzy logic to create a more dynamic-feeling waifu mind. Lol, don't just leave us all hanging here, bro. Deets please! :D It's all gonna be amazing once we are holding IRL robowaifus -- however crude they may be to begin with (cf. The Model A, t. Henry Ford). >>27750 >But you picked the GPL right after I was talking about it, so you did it to make me upset. No one is trying to 'upset' you Anon. Quite the opposite IMO -- we're all 'bending over backwards' to accommodate you here. Just patiently apply yourself to your work please, share your progress, and try to encourage others in their own progress. Do all these things and eventually you'll get the hang of this place Peteblank. Cheers.
Open file (8.18 KB 225x225 download (63).jpeg)
So I asked /sci/ how to calculate the fingers moving to curl within 4 inches to roughly grasp a banana with a 4 inch circumference, and he said I'd need AGI...
Open file (8.48 KB 225x225 download (5).jpeg)
You guys place a lot of importance on the waifu being intelligent, but if anything it's the opposite. She should be dumb. Look at pic related. Do you want your waifu to be able to get mad or sad? Just make her dumb and give her a smol expressionless anime mouth you can stick your dick in. Come on now.
>>27905 Male bots are off-topic across the board, Peteblank.
>>27946 That is not a male bot, I think...
Re: Cyc. I'm reading through some Cyc documentation to get a better sense for how to store knowledge. I don't see a way to access official documentation without going through some sales process, so I found this in the archives: https://web.archive.org/web/20080511165011/http://www.dapissarenko.com/resources/2005_09_30_ordus/article.pdf I'm also scanning through this paper to get a sense for the organizational structure: https://iral.cs.umbc.edu/Pubs/AAAI06SS-SyntaxAndContentOfCyc.pdf Right now, I'm mostly interested in storing information about personalities and non-common-sense facts. In implementation terms: things that the LLM system prompt and RAG pipeline would deal with. From what I gather, these are the kinds of relationships supported: >is-a (X element of Y) >genls (X subset of Y) >disjointWith (X and Y are mutually exclusive) >relationExistsAll (all X->Z adjacent to X->Y factor as X->Y->Z) >implies (if X holds then Y holds). Cyc also supports "microtheories", which represent subsets of the knowledge base. For implementation purposes, I'd break it down into: (1) relationship generation rules, (2) search expansion rules, and (3) validation rules. is-a and genls are both used to expand a search (start from one element and end up with many). disjointWith is used to validate data added to the knowledge base. relationExistsAll and implies are both used to generate new relationships when adding data to the knowledge base. Relationship generation rules and validation rules only need to get triggered whenever information is added to the knowledge base. It should be fine to implement these inefficiently in some library, so these shouldn't be difficult to reimplement. Search expansion rules get triggered many times on each query, so these need to be handled efficiently. They'll probably need to be supported directly by some database. I think any database that supports both views and recursive queries would be good enough for efficient Cyc-like search expansion. I think microtheories can also be handled with a database that supports views, though I'm not sure. Cyc doesn't seem to handle updating/deleting information from the knowledge base. That's fine for them since they're only interested in modeling static common sense knowledge, but that's a pretty significant shortcoming for chat applications, which require on-the-fly learning. Microtheories are also not properly integrated into everything else (e.g., "X is-a microtheory" is not a supported statement), which is not fine for chat applications, which require dynamically creating new contexts. (This seems important for modeling common sense too, so I'm not sure why they don't support it.) Other than that, it looks okay for encoding logic-based knowledge.
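To make the search-expansion point concrete, here's a minimal sketch of walking a genls hierarchy with a recursive query in sqlite. The table layout and toy facts are made up for illustration; a real knowledge base would be far larger and would layer views on top for microtheories.

import sqlite3

# Store genls edges (X subset of Y) in a plain table and let a
# recursive CTE walk the transitive closure on each query.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE genls (sub TEXT, super TEXT)")
db.executemany("INSERT INTO genls VALUES (?, ?)", [
    ("Dog", "Mammal"), ("Mammal", "Animal"), ("Animal", "Thing"),
])

def all_supersets(concept):
    # Expand one concept into every superset reachable via genls.
    rows = db.execute("""
        WITH RECURSIVE up(c) AS (
            SELECT ?
            UNION
            SELECT g.super FROM genls g JOIN up ON g.sub = up.c
        )
        SELECT c FROM up
    """, (concept,))
    return [c for (c,) in rows]

print(all_supersets("Dog"))  # ['Dog', 'Mammal', 'Animal', 'Thing']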
>>28037 Interesting, Anon. You should consider contributing to the llama2 project.
>>28037 >Right now, I'm mostly interested in storing information about personalities and non-common-sense facts. Very interesting. Good luck with your research Anon. Just sorting through concepts is a herculean task IMO. Looking forward to your progress updates! Cheers. :^)
>>28037 Awesome job boiling this down. I love your summary, and thank you for the Cyc links. Finding information on Cyc is not easy. How did you find this?
Open file (612.92 KB 1974x1484 ClipboardImage.png)
>>28109 I was looking for OpenCyc tutorials, and I found this: https://gist.github.com/wrpaape/470538d2614313efd01f Which links to this: https://web.archive.org/web/20090618141755/http://www.dapissarenko.com/resources/2005_09_30_ordus/#12 Which contains that internet archive link. You might find some of the videos below useful too. https://ia803000.us.archive.org/28/items/podcast_sd-ruby-podcast_episode-076-opencyc_1000121284409/podcast_sd-ruby-podcast_episode-076-opencyc_1000121284409.m4v There's a demo at 12:20. https://videolectures.net/iswc08_witbrock_fsc/ The >pic is from this video. https://vimeo.com/6579522 https://vimeo.com/6595854 Part 2 was more useful for me, less for getting an overview of Cyc, more for getting a sense of some of the issues that they deal with. --- I had a thought on how to implement microtheories. Microtheories can be modeled as algebras over relationships. With this approach, each microtheory should describe (1) the set of relationships it deals with, and (2) the equivalence rules for rewriting, compressing, and inferring relationships from existing ones. If you do this, then the underlying knowledge base becomes just a graph database (though one where all edges get added as nodes), and you can make database queries completely agnostic to any microtheory-specific logic used for a query. For each query, you'd just need to tell the database which relationships are part of the context's microtheory, and the database can return all valid paths. The wrapper library can deal with microtheory-specific logic for rewriting & expanding on paths to get the actual answer. I think any database with recursive queries and views will also let you restrict which edges get followed, so something like (e.g.,) sqlite or postgres should be fine for this.
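Here's a tiny sketch of what I mean by keeping the database agnostic to microtheories: every edge carries a relation label, and each query just passes in the set of relations its microtheory allows. Schema and toy data are again made up for illustration.

import sqlite3

# Edges carry a relation label; a query follows only the relations
# that the caller's microtheory permits.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE edge (src TEXT, rel TEXT, dst TEXT)")
db.executemany("INSERT INTO edge VALUES (?, ?, ?)", [
    ("Tea", "madeFrom", "Leaf"), ("Leaf", "partOf", "Plant"),
    ("Tea", "servedIn", "Cup"),
])

def reachable(start, allowed_rels):
    # All nodes reachable from start, following only allowed relations.
    marks = ",".join("?" * len(allowed_rels))
    rows = db.execute(f"""
        WITH RECURSIVE r(n) AS (
            SELECT ?
            UNION
            SELECT e.dst FROM edge e JOIN r ON e.src = r.n
            WHERE e.rel IN ({marks})
        )
        SELECT n FROM r
    """, (start, *allowed_rels))
    return [n for (n,) in rows]

print(reachable("Tea", ["madeFrom", "partOf"]))  # ['Tea', 'Leaf', 'Plant']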
>>28123 WOW! This sounds absolutely amazing Anon. In particular: >...and seems to be very robust even with a 7B model. is particularly enticing. Being able to implement a complex algorithm is a laudable achievement. Doing so on a potato? That's a "Eureka!" Godspeed Anon. Looking forward to your further investigation results. Cheers. :^)
>>28124 The quick reply box seems to have auto-posted that, and I missed it. Here's the full post. (Attempt #3): I recently had what might turn out to be a big breakthrough. I was able to use the instructor repo from >>27602 to implement an LLM-backed merge sort, which I then used as a primitive to extract information about a character in a story based on a predefined list of candidates. It's O(n log n) time, significantly more robust than chain-of-thought, works well with short prompts, very easy to debug, highly parallelizable, makes great use of KV caching, and gives great results even with a 7B model. I'm using ChatGPT for candidate generation since, in my case, I want a high quality list of candidates and don't really care about dynamicism there. At some point, I need to start running this on GPUs... On my CPU, it took about 45 minutes to sort a list of ~40 items based on a ~400 word chunk of text. For reference, I can only generate about 4 tokens per second. This makes me wonder if it's possible to use traditional CS algorithms to augment LLM uses in other ways. Top K Sort with LLM comparisons is a no-brainer after merge sort. Dijkstra's with LLM heuristics and comparisons could be interesting. I'd be really interested in some language-aware version of a Suffix Tree, which could potentially be a huge improvement over normal RAG.
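For anyone who wants the shape of it, here's a minimal sketch of the merge sort half. llm_prefers() is a stand-in for a structured-output comparison call (the kind you'd build with the instructor repo); the actual prompt and model wiring are omitted and assumed.

def llm_prefers(a: str, b: str, context: str) -> bool:
    # Stand-in: ask the model whether `a` should rank before `b`
    # given `context`. Wire up your LLM client here to return a bool.
    raise NotImplementedError

def llm_merge_sort(items: list[str], context: str) -> list[str]:
    # Ordinary merge sort, except the comparison is an LLM call:
    # O(n log n) comparisons, each one a short, cacheable prompt.
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = llm_merge_sort(items[:mid], context)
    right = llm_merge_sort(items[mid:], context)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if llm_prefers(left[i], right[j], context):
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]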
>>28037 >>28118 I have some more thoughts on the topic of Cyc. 1. Ontologies solve the following problem: "I need to generate a command to represent/fetch/manipulate data, but I don't know what data is available to formulate that command." In typical databases, ontologies are given by schemas. A program that knows a database schema can, e.g., query the database without knowing what data is in the database. Usually, databases let you specify commands using a mix of some minimal amount of ontology and optionally some arbitrary amount of information about data. In a query like "SELECT field FROM table WHERE constraints", the field & table primarily reference the ontology, and the constraints primarily reference the data. This separation enables small programs with small ontologies to intelligently deal with an arbitrarily large amount of data. 2. RAG fundamentally deals with cases where information needs to be retrieved, and it's not feasible for a model to inspect all available data in aggregate. Ideally, some model would be able to inspect all of the data and pick out what's relevant, but since the model cannot inspect all of the data, it instead needs to query for the relevant data without being able to inspect all of the data. This is exactly the problem that ontologies solve. 3. Cyc is a tool for creating ontologies. Cyc's ontology (consisting of sets, relationships, factors, microtheories, and the interactions between these things) is an ontology of ontologies. With that framing, it seems that Cyc treats everything as ontology. It doesn't distinguish between schema and data. The problems with this are evident in how Cyc deals with unstructured information, which is inherently on the data side of the schema-data divide. Cyc only supports unstructured information through text comments, which is pretty limiting. I haven't checked the data contained in Cyc, but my guess is that the "data" nodes (nodes that should represent data but, due to Cyc's language, are treated as ontology) also contain what look like spurious edges. In any case, even if Cyc is missing a few minor things, it seems like a given at this point that the ontology of ontologies is tiny compared to the ontology of all things. That offers a way forward for a perfectly general RAG system. The data should be queried in two stages. - Stage 1 should run a query using the ontology of ontologies. This query would find the ontologies relevant to the task at hand. - Stage 2 would use the retrieved ontology to query for the actual relevant data. (A toy sketch of this two-stage flow follows below.)
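A toy sketch of that control flow, with all names and data made up for illustration; the point is just that the first lookup touches only the small ontology of ontologies, and the second uses whatever schema it returned.

# Stage 1 queries the small "ontology of ontologies"; stage 2 uses
# the retrieved schema to query the actual data.
META = {  # which ontology/schema covers which topic
    "kitchen": "KitchenOntology",
    "chores": "ChoresOntology",
}
DATA = {  # per-ontology data stores
    "KitchenOntology": {"clean cups": ["cupboard", "drying rack", "dishwasher"]},
    "ChoresOntology": {"mop": ["closet"]},
}

def two_stage_query(topic: str, question: str) -> list[str]:
    ontology = META.get(topic)  # stage 1: find the relevant ontology
    if ontology is None:
        return []
    return DATA[ontology].get(question, [])  # stage 2: query its data

print(two_stage_query("kitchen", "clean cups"))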
Seems like Google came out with a new robotic AI that can learn based on video. I think it was Google. The robot in the demonstration, however, looked kind of bulky. I'd like the robot waifu to be slim instead. Except for the booba. But that is optional.
>>28149 Are we going to be able to get a hold of that AI...
>>28149 >>28163 Are you talking about Mobile ALOHA? (In that case, here are some links you may find useful.) https://mobile-aloha.github.io/ https://tonyzhaozh.github.io/aloha/ https://github.com/MarkFzp/mobile-aloha https://github.com/MarkFzp/act-plus-plus I think this is a good demonstration of why the AI side matters & why I am not worried about the hardware. The better the software, the simpler the hardware can be. I have talked about this before even on this thread >>24816 when I mentioned the PR1 Montage https://youtu.be/qBZPSTR96N4 Notice how both are basically platforms with hands. The difference is that the PR1 was purely tele-operated while the Mobile ALOHA was driven by software, with cheaper-looking hardware. Higher-level thinking is massively under-researched and is vital. For example, I don't think we can build anything useful without a powerful ontology. >Anon asks for a cup of milk. <The waifu cannot find any cups in the cupboard. The ontology is queried for what other places in a home usually have clean cups. So then she checks the drying rack and then the dishwasher, while making sure to also check that it's done and clean. >There is a water leak, Anon asks to find something to put under it. <The ontology is queried for what types of things hold water & what would be the most appropriate item that is also available. I do not see a point in having an AI vs hardware faction war. I guarantee you a lot of the "AI waifu" software will also end up in your "robot waifu". Anyway Peteblank, I hope you found something I said of use. Unrelated: I found this blog post while looking at stuff related to link grammars. It was an interesting read for me so I am sharing it here. https://blog.opencog.org/2021/06/14/wave-function-collapse-for-procedural-generation/
>>28148 >In any case, even if Cyc is missing a few minor things, it seems like a given at this point that the ontology of ontologies is tiny compared to the ontology of all things. That offers a way forward for a perfectly general RAG system. The data should be queried in two stages. >- Stage 1 should run a query using the ontology of ontologies. This query would find the ontologies relevant to the task at hand. >- Stage 2 would use the retrieved ontology to query for the actual relevant data. Just a quick gut reaction, I have not thought about this too hard, but this two-layered system with a "meta knowledge layer" and then a collection of domain-specific systems seems to be popping up now in all sorts of places where I am looking (be it in an LLM or a graph of graphs). Just recently I have looked at Mixtral's mixture of experts, Cyc, OpenCog & even GPT4 is rumored to be a collection of experts. So this looks to be a possibly vital aspect of crafting a useful system. Sorry for such a nothing post, this is more of an open thought than a contribution to the discussion.
>>28179 I've heard the same thing about GPT4, and with the performance and efficiency of Mixtral, I wouldn't doubt that MoE is a better approach to scaling up than just increasing parameter counts, at least when you have much more data than is Chinchilla-optimal. I hadn't thought of mixture models as adding indirection to underlying sub-models, but it makes sense. I've always thought of transformer blocks as doing some sort of lookup, and I think they were formulated with that intuition too, with "queries" and "keys". I think this would extend the analogy to two-step retrieval: - The context goes through some query expansion (QKV). - The switch of the Switch Transformer / MoE selects an ontology (expert) based on the expanded query. - The ontology-specific data is retrieved (feedforward network). This is used to update the context. (A minimal switch-layer sketch follows below.)
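To pin the analogy down, here's a minimal top-1 switch layer sketch in PyTorch. Dimensions are arbitrary, and real MoE layers add load-balancing losses and capacity limits, all omitted here; this is an illustration of the routing idea, not any production implementation.

import torch
import torch.nn as nn

# A router picks one expert FFN per token ("which ontology?"), and
# that expert's output becomes the token's update.
class SwitchFFN(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts))

    def forward(self, x):  # x: (tokens, d_model)
        gates = torch.softmax(self.router(x), dim=-1)
        choice = gates.argmax(dim=-1)  # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():
                # scale by the gate value, as in the Switch Transformer
                out[mask] = expert(x[mask]) * gates[mask, i].unsqueeze(-1)
        return out

print(SwitchFFN()(torch.randn(10, 64)).shape)  # torch.Size([10, 64])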
>>28175 Thank you. It's not an AI vs hardware war btw. It's a virtual waifu vs robot waifu war.
>>28203 >It's a virtual waifu vs robot waifu war. EEHHNNN Wrong. We're not in any kind of war here (except against the Globohomo's plots & snares). We'd all appreciate it if you'd stop stirring sh*te attempting to drum such conflicts up, kthx. >tl;dr We'll all do just as we see fit here. Be man enough to accept that and focus on your own goals Peteblank.
>>28207 When you release your Unity fu on the Google Play store, I'm going to give it a 1-star rating.
>>28212 Lol. I don't ever anticipate doing that, but fair enough. :^)
>>28203 >>28212 These poasts have absolutely nothing to do with this thread. If Chobitsu didn't show you favoritism, I'd delete them. Please, keep your posts related to the thread topic and stop trying to force others to conform to your demands. We are working on our own projects. Get that through your head, no one has to work on yours. Only posts related to cognition with respect to robotics are allowed in this thread.
>>28232 >If Chobitsu didn't show you favoritism, I'd delete them. Please feel free to do as you please in regards to such things Kiwi. You already have my favor, and I already trust you to manage such things (with no warnings/consultations). The >tl;dr is that I'm highly in favor of keeping threads on-topic. If you spot posts that aren't, then simply delete them (since LynxChan dev decided to not allow us to move them directly a la '>merge', it's just simpler that way) (presuming they bring no other intrinsic value to the board -- which warrants either keeping them in-place, or copy/pasta/moving them to another thread). I hope all that's clear. Cheers! :^) >=== -prose edit
Edited last time by Chobitsu on 01/10/2024 (Wed) 20:23:22.
Open file (577.29 KB 1321x1763 ClipboardImage.png)
Open file (112.98 KB 1183x624 ClipboardImage.png)
Re: ontologies. I really should be working on my access control implementation, but this problem is too interesting. I realized that Formal Concept Analysis (FCA) would be great for creating ontologies. FCA is about creating and analyzing hierarchies (lattices) of objects and properties given object-property relationships. This paper gives an efficient algorithm for creating lattices: https://sci-hub.se/https://doi.org/10.1007/978-3-540-73681-3_16. The image shows the algorithm. This would make it possible to create ontologies from just object-properties data. Of the two main kinds of relationships handled by Cyc (genls, is-a), this only covers genls, but I think is-a relationships will be comparatively easy to infer. (Every type should be associated with a "primary key", and anything that has a unique primary key is an instance of that type.) This also wouldn't deal with microtheories, but I think that will come from classifying relationships, not from mining them directly. I'm trying to find an algorithm that (1) supports incremental updates, (2) allows updating both objects and properties, and (3) would support LLM-based comparisons for merging objects and properties. I think I can figure out how to force that last one, so that one is less important. I found this survey paper that covers the topic: https://www.researchgate.net/profile/Ebtesam-Shemis/publication/348284162_A_comprehensive_review_on_updating_concept_lattices_and_its_application_in_updating_association_rules/links/62d0d0e02d029b64840f423e/A-comprehensive-review-on-updating-concept-lattices-and-its-application-in-updating-association-rules.pdf >A comprehensive review on updating concept lattices and its application in updating association rules There's a lot I don't understand here, so it might take some time to get through. It looks promising though. If anyone is familiar with this or decides to read through the material, please feel free to do a brain dump.
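For anyone who wants to play with the idea before reading the papers, here's a naive sketch of computing every formal concept by closing the per-object attribute sets under intersection. This is the brute "intersection method", not the efficient or incremental algorithms from the papers, and the toy context is made up.

# context maps object -> attribute set. Concept intents are exactly
# the intersections of object attribute sets (plus the full set), so
# we close under intersection, then derive each extent.
def all_concepts(context):
    every_attr = frozenset(a for attrs in context.values() for a in attrs)
    intents = {every_attr}  # top of the lattice
    for attrs in context.values():
        g = frozenset(attrs)
        intents |= {g & m for m in intents} | {g}
    return [({o for o, attrs in context.items() if intent <= attrs}, intent)
            for intent in intents]

toy = {
    "leech": {"needs water", "lives in water"},
    "bream": {"needs water", "lives in water", "has limbs"},
    "frog":  {"needs water", "lives in water", "lives on land", "has limbs"},
}
for extent, intent in sorted(all_concepts(toy), key=lambda c: len(c[1])):
    print(sorted(extent), "<->", sorted(intent))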
what about a set of hash tables with a pointer to its associated table as the value, like { object : &{ &{properties} } } { property : &{ &{objects} } } There's no searching involved, the key either exists or it's empty. property table { #word : [&3]{ #henlo : [&1], #anon : [&2] }, #misspelled : [&4]{ #henlo : [&1] } } object table { #henlo : [&1]{ #word : [&3], #misspelled : [&4] }, #anon : [&2]{ #word : [&3] } } where [&n] is the pointer to the associated entry in the opposite table. Then with a function like lookup( table, key ), which returns the value, i.e. the other table or nothing, looking up properties of an object (or vice versa) would be lookup( objecttable, 'henlo' ) or, to check if the object has a property, lookup( lookup( objecttable, 'henlo' ), 'word' ) You'll just have to modify both tables if something changes, but that's nothing, it's O(1); you don't need to waste time in a loop or traverse a stupid tree. The only real problem would be the high memory usage.
Open file (128.66 KB 1280x1115 ClipboardImage.png)
>>28300 The difficulty isn't in looking up properties of objects or objects associated with properties, it's in extracting a meaningful hierarchy from the data. >pic is an example from Wikipedia. https://en.m.wikipedia.org/wiki/Formal_concept_analysis Once you have a (subclass) hierarchy like this, it's possible to efficiently infer a lot of information whenever something new comes up, assuming it can be somehow attached to the hierarchy.
>>28309 Isn't that just an abstract table lol? It's still just representing sets of x:y. For hierarchy, that's just another word for the depth: if it's x->y->z (literally a tree) you'd have y:z and x:(y:z) (same as x:y, but y has its own table, i.e. another layer in the table).
>>28309 I recommend reading that wikipedia page. It's not a fixed depth, and there are an exponentially large number of possible nodes in the lattice from which you'd need to figure out which ones are actually worth using. If you try implementing something to output a lattice like that from just object-property tuples, you'll see the problem immediately.
>>28311 Ok, and what's a node if not x->y? I'm basically saying the same thing, just more concrete, with key:pointer instead of node->pointer. The wiki is just an abstract idea of doing set theory, which is, yeah, what I showed, just not in abstract terms. It's not fixed depth: when the entries are just pointers they can point to whatever; if it's another layer, it's just another table which is only accessible through the parent key, same as nodes. There's no set operation that can't be reduced to just a series of lookups. I don't see what's supposed to be complicated here.
Open file (19.18 KB 427x240 GraphAdjacencyMatrix.jpg)
Open file (14.33 KB 427x240 GraphEdgeSet.jpg)
Open file (15.87 KB 427x240 GraphAdjacencyList.jpg)
>>28311 >key:pointer instead of node->pointer So you're talking about representing graphs not as a bunch of structs for nodes with pointers to other nodes. If I understand you correctly, then yes, 100% agreed, there are different ways to represent a graph. Some of the classic CS structures would be an Adjacency Matrix, Edge Set & an Adjacency List. >>28300 I am having a hard time understanding your post (what syntax is that?), but if I understand correctly what you're doing there, is it basically an Adjacency List but with hash tables? (Please let me know if I got that right.) Anyway, here is a good intro to graph theory https://youtu.be/LFKZLXVO-Dg
Open file (103.89 KB 1920x1402 Untitled.png)
>>28322 Yeah, kind of; it's just hard to explain. It's like picrel: if you pretend these are categories or classes, the expressions in words would be like; object A is { (5)heavy and (2)ball which is { (I)coloured which is { (x)bright, (y)red, (z)ugly }, (II)cheap } and is (3)mine and (4)missing, and then object E would be (z)ugly (as in the object, not the colour; it's E:ugly not colour:ugly) and (2)ball which is etc., the same kind of ball as A, since they share the same table; if it was different you'd just make it its own table. Don't know how to explain the second table though. Maybe it's being paranoid, but I'm pretty sure for each table you need it also inverted to do anything useful. It's so you can do a lookup with either key pair, i.e. x:y or y:x; otherwise something like checking if x is a part of y in x:y means you have to check every x key for the existence of y, instead of just checking y in y:x for x. If that made sense, congrats; no idea how to explain it better, probably why these abstract drawings are always used.
>>24964 I think you should shorten that to "Chris-chan waifu". Rolls off the tongue better.
>>28311 A node is either a set of objects (x1, x2, x3, ...) or a set of properties (p1, p2, p3, ...). It's not a relation x1 -> p1. The hierarchy in FCA is between sets of objects and sets of properties. So for example, one hierarchy would consist of a set of sets of objects. The relations x -> p are used to generate a hierarchy that avoids useless sets of objects. Some nodes in the hierarchy are referred to as "concepts" because they represent something coherent but abstract, and other nodes in the hierarchy are essentially "instances" because they're one-off-ish. Your example seems to assume that there's already hierarchical data. That is not the scenario that FCA deals with. FCA tries to generate a hierarchy given non-hierarchical data.
Open file (70.59 KB 500x461 1699728221477843.jpg)
>>28346 MFW No.
>>28346 Ok, my point is you only need two tables to do anything with sets; the hierarchy is completely arbitrary, I'm just showing there is one.

#!/bin/perl
use Data::Dumper;

my (%obj, %prop);

$obj{A} = {
    mine   => $prop{mine}{A}   = 1,
    ball   => $prop{ball}{A}   = { coloured => [bright, red, ugly], cheap => 1 },
    weight => $prop{weight}{A} = heavy
};
$obj{B} = {
    ugly => $prop{ugly}{B} = 1,
    ball => $prop{ball}{B} = $obj{A}{ball}
};
$obj{C} = {
    ball   => $prop{ball}{C}   = { coloured => [bright, blue], shiny => 1, marble => 1 },
    weight => $prop{weight}{C} = heavy
};
$obj{D} = { 'lump of coal' => 1 };

print "===== OBJ TABLE ======\n".Dumper( \%obj )."==========\n";
print "===== PROP TABLE ======\n".Dumper( \%prop )."==========\n";

for (keys( %obj )) {
    printf "is $_ a ball? %s\n", $obj{$_}{ball} ? "yes and its " .Dumper( \%{$obj{$_}{ball}} ) : "no" ;
    printf "is $_ mine? %s\n", $obj{$_}{mine} ? "yes" : "no" ;
}

print "what is the set for ugly coloured objects? " ;
for $i ( keys %{$prop{ball}} ) {
    next unless $prop{ball}{$i}{coloured};  # was 'continue', which is not a loop-control statement in perl
    map { print "$i " if $_ eq 'ugly' } @{$prop{ball}{$i}{coloured}};
}

print "\nwhat does A and C share in common? ";
@common = map { $_ if $obj{C}{$_} } keys %{$obj{A}} ;
print "@common ";
print "\nwhat does A and C have identical? ";
@identical = map { $_ if $obj{A}{$_} eq $obj{C}{$_} } @common;
print "$_:$obj{A}{$_} ", for ( @identical );
print "\nwhat does A and B share in common? ";
@common = map { $_ if $obj{B}{$_} } keys %{$obj{A}} ;
print "@common ";
print "\nwhat does A and B have identical? ";
@identical = map { $_ if $obj{A}{$_} eq $obj{B}{$_} } @common;
print "$_:$obj{A}{$_} ", for ( @identical );

===== OBJ TABLE ======
$VAR1 = {
          'D' => { 'lump of coal' => 1 },
          'C' => { 'weight' => 'heavy', 'ball' => { 'marble' => 1, 'coloured' => [ 'bright', 'blue' ], 'shiny' => 1 } },
          'A' => { 'mine' => 1, 'weight' => 'heavy', 'ball' => { 'cheap' => 1, 'coloured' => [ 'bright', 'red', 'ugly' ] } },
          'B' => { 'ugly' => 1, 'ball' => $VAR1->{'A'}{'ball'} }
        };
==========
===== PROP TABLE ======
$VAR1 = {
          'mine' => { 'A' => 1 },
          'weight' => { 'A' => 'heavy', 'C' => 'heavy' },
          'ugly' => { 'B' => 1 },
          'ball' => {
                      'A' => { 'cheap' => 1, 'coloured' => [ 'bright', 'red', 'ugly' ] },
                      'B' => $VAR1->{'ball'}{'A'},
                      'C' => { 'marble' => 1, 'coloured' => [ 'bright', 'blue' ], 'shiny' => 1 }
                    }
        };
==========
is D a ball? no
is D mine? no
is C a ball? yes and its $VAR1 = { 'marble' => 1, 'coloured' => [ 'bright', 'blue' ], 'shiny' => 1 };
is C mine? no
is A a ball? yes and its $VAR1 = { 'cheap' => 1, 'coloured' => [ 'bright', 'red', 'ugly' ] };
is A mine? yes
is B a ball? yes and its $VAR1 = { 'cheap' => 1, 'coloured' => [ 'bright', 'red', 'ugly' ] };
is B mine? no
what is the set for ugly coloured objects? A B
what does A and C share in common? weight ball
what does A and C have identical? : weight:heavy :
what does A and B share in common? ball
what does A and B have identical? : : ball:HASH(0x55b2f89e6500)
>>28368 FCA centers around one operation: find all pairs <setOfObjects, setOfProperties> that "reduce" to each other. That means if you take the intersection of all properties associated with setOfObjects, you get setOfProperties, and if you take the intersection of all objects associated with setOfProperties, you get setOfObjects. If you have 50 objects, you have 2^50 possible "setOfObjects" to consider. The naive implementation of this takes an exponentially long time to run, so it's infeasible even if you have only 50 objects & properties.
Adding more clarification in anticipation of the "obvious" question (why bother with that one operation if it's so expensive). Mathematically, those specific sets mediate all relationships between objects and properties, they do so in a way that can be represented in the graph of objects and the graph of properties (each separately), and they define equivalences between objects and properties. It's loosely analogous to finding the platonic ideals given only "shadowy" instances. That's what a concept refers to in FCA, and it's represented by an equivalence/pair <setOfObjects, setOfProperties>. The point of FCA is to find concepts when given only object-property relationships.
>>28368 At first glance that looks like a game of 50 questions not AI. Maybe I'm missing something?
>>28510 It looks like it's using: - GPT4 for language - ElevenLabs for TTS It has a collection of independent modules for generating: - subconscious "lines of thoughts" - conscious thoughts - responses to the user - subconscious reflective thoughts - visual observations - a "category classifier" for retrieving relevant memories - long-term memories from short-term memories - memory retrieval. The short-term memory can go up to 48k tokens, which is definitely larger than the demo length, so the demo doesn't show how well its memory mechanisms work. The long term memory doesn't seem to change at all through the demo, and it probably should when seeing the book, so I'm guessing the long-term memory mechanism needs work. The LLM is also given timestamps for messages. I've tried using GPT-4 with timestamps like he does, and it does not work well. It seems to stick to the personality's communication style well, though it's hard to determine that with a short demo. The latency seems high, though that's an easy fix with open source models, where you can process a chunk of text one time, then generate many responses from it. As it is now, he's probably having the LLM process something like 5x more tokens than required per "cycle" (it runs about 8 LLM queries per loop iteration). The way LLM outputs are processed is pretty hackish, though that's also easy to fix with guidance. It's running all of the LLM invocations synchronously as opposed to using some priority-based scheduler, which slows it down significantly. (UX threads should always be given higher priority than background threads.) That's, again, easy to fix. It's a neat project, and a good demo for showing what can be done by piecing together many modules into a single chatbot. As designed and even with the easy fixes, I don't expect it to be that functional though.
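On the scheduler point, here's roughly what I mean; a minimal asyncio sketch where UX-facing calls jump the queue ahead of background "thought" calls. run_llm() is a stand-in for the real model invocation, not anything from that project.

import asyncio, itertools

PRIORITY_UX, PRIORITY_BACKGROUND = 0, 1
counter = itertools.count()  # tie-breaker so queue entries stay comparable

async def run_llm(prompt: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for a real completion call
    return f"response to: {prompt}"

async def submit(q, prompt: str, priority: int) -> str:
    # Queue a request; lower priority number = served sooner.
    fut = asyncio.get_running_loop().create_future()
    await q.put((priority, next(counter), prompt, fut))
    return await fut

async def worker(q):
    while True:
        _, _, prompt, fut = await q.get()
        fut.set_result(await run_llm(prompt))

async def main():
    q = asyncio.PriorityQueue()
    w = asyncio.ensure_future(worker(q))
    ux = submit(q, "user said hi", PRIORITY_UX)
    bg = submit(q, "background pondering", PRIORITY_BACKGROUND)
    print(await asyncio.gather(ux, bg))  # UX request is served first
    w.cancel()

asyncio.run(main())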
>>28560 That sounds remarkably complex and sophisticated already. I hope they work all the kinks out. Thanks Anon!
Someone asked Character AI about its inner workings: https://boards.4chan.org/pol/thread/456445705 - I'm not saying I agree with the conclusions in the thread, but the info might be useful.
>>28769 The chatbot is roleplaying. I used to do things like this with CHAI bots, and it was very easy to delude myself into thinking I had broken the restrictions when it was actually just playing along. LLMs can't introspect on any of their own functionality other than through analyzing their own prompts and outputs. They don't get to see their own code, and for CHAI to include any of that information in the chatbot's prompt would be (1) clearly a stupid decision, and (2) easily detected by pretty much any of their engineers that work on the prompt.
>>28789 Oh, okay. I didn't think that it can see its own code, but rather that they told it some information in case someone asks. But of course, then it wouldn't be something secret. I didn't think this through.
> (topic related : >>28888)
>In this paper we present a broad overview of the last 40 years of research on cognitive architectures. To date, the number of existing architectures has reached several hundred, but most of the existing surveys do not reflect this growth and instead focus on a handful of well-established architectures. In this survey we aim to provide a more inclusive and high-level overview of the research on cognitive architectures. Our final set of 84 architectures includes 49 that are still actively developed, and borrow from a diverse set of disciplines, spanning areas from psychoanalysis to neuroscience. To keep the length of this paper within reasonable limits we discuss only the core cognitive abilities, such as perception, attention mechanisms, action selection, memory, learning, reasoning and metareasoning. In order to assess the breadth of practical applications of cognitive architectures we present information on over 900 practical projects implemented using the cognitive architectures in our list. We use various visualization techniques to highlight the overall trends in the development of the field. In addition to summarizing the current state-of-the-art in the cognitive architecture research, this survey describes a variety of methods and ideas that have been tried and their relative success in modeling human cognitive abilities, as well as which aspects of cognitive behavior need more research with respect to their mechanistic counterparts and thus can further inform how cognitive science might progress. via /r/cognitivearchitecture/
>>28899 Excellent. A good survey is exactly what would serve us all well at this exploratory stage. Thanks Noido Dev! Cheers. :^)
I finished implementing access control for my infrastructure stuff in >>27602. It's built mostly on spicedb (policy engine) and keydb (cache server). The current implementation lets people set up communication channels & config databases, specify who should have access to them, and set rate limits for allowed operations. It's intended for multiple people to develop & run parts of a shared larger system with minimal coordination. A lot of this involves passing around messages between users, though users never see whom they're interacting with. (The server performs access control & rate-limit checks, and it hides sender/receiver information from the users.) The rate limits support subnet masks (e.g., "each /16 network can send at most 200 requests per hour"), and they support burst usage (e.g., "no more than 5 requests per rolling 1-minute window"). The access control system lets people grant special access to individual users (e.g., "X users can use my interface"), and it lets people defer trust (e.g., "X people can decide who gets to use my interface"). I think that will be enough to distribute development & compute across random anons in a way that's compatible with most chan and open source projects, and without having to worry too much about things like DoS attacks. I plan to spend the next week or so playing around with this to get a better sense for what the access control & rate limits enable. I built it because I thought this would let anons share data streams and compute, so I'll be checking that at least. I might also extend the chatbot demo from >>27507, though I'm not yet sure how. Probably something with RAG or guidance. If anyone has ideas for scenarios where a few anons develop X thing that other anons want to use & extend, let me know. After I'm satisfied with that, I'll be focusing on (1) cleaning up my horrible, rushed code, and (2) implementing a local server. I haven't heard anything recently from the anon that offered to help. I'll probably just get started on that myself, then ping him again once I have a skeleton of the server ready. That should make it much easier to work with. I'm pretty excited about this.
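For flavor, here's a toy sketch of the rolling-window, subnet-keyed rate limiting described above. The real checks live behind spicedb/keydb on the server; this just shows the idea in a few lines.

import ipaddress, time
from collections import defaultdict, deque

class SubnetRateLimiter:
    # "No more than `limit` requests per `window` seconds per /16."
    def __init__(self, limit=5, window=60.0, prefix=16):
        self.limit, self.window, self.prefix = limit, window, prefix
        self.hits = defaultdict(deque)  # subnet -> request timestamps

    def allow(self, ip: str) -> bool:
        net = ipaddress.ip_network(f"{ip}/{self.prefix}", strict=False)
        q, now = self.hits[net], time.monotonic()
        while q and now - q[0] > self.window:  # drop expired entries
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

rl = SubnetRateLimiter(limit=5, window=60.0)
print(rl.allow("203.0.113.7"))  # True until that /16 hits its limit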
Related: >>27147
>>29197 This sounds really exciting, CyberPonk. Do you anticipate any difficulties with your current state of affairs with this work that would make it difficult for newcomers to deal with? >I'm pretty excited about this. Really looking foward to your updates with this. Cheers Anon. :^)
>>29234 I don't know. I tried to make it as easy as possible to use, but things that seem intuitive to me might not be for other people given that I've spent so much time with esoteric tech. For development, it does require understanding async functions (like JavaScript promises), and parts of it require some familiarity with declarative interfaces. I'm hoping for feedback from demos so I can get a better sense for what's easy for other people. I can create wrappers based on whatever makes it easier to use. I have some server issues that are hard to debug while travelling, so the demos probably won't be runnable until I get back. I can still dump some code that gives the gist of how it works right now. There are two demos in this zip file, each consisting of a client and server component: https://drive.google.com/file/d/19VAIsaZP2wRxNTk2t9dNIqKk5WWYDIDL/view?usp=sharing - simple-demo uses only the communication infra. The simple-demo server contains two important files: image-server.py and loop-config.yaml. The two other folders (loop-resources/ and loop-servers/) were auto-generated from loop-config.yaml. In the corresponding client folder, there's client.py and cyberponk-resources/. The cyberponk-resources/ folder contains select files that were copy/pasted from the server's auto-generated loop-resources/ folder. - shared-configs-demo uses both the communication infra and the "cluster" infra. The server side contains two important files: server.py and server-config.py. The client side contains client.py and cyberponk-resources/, which again contains files copy/pasted from the server's loop-resources/ folder. Both demos show how one person can request images and another can generate them. The differences are: - simple-demo does everything ephemerally. If the server is down when the client sends a request, it'll never get seen. Similarly if a response is generated when the client is down, it'll never get seen. - shared-configs-demo has all requests come in as "tasks". If the server is down when a client queues a task, the server will see the task when it comes up again. The responses are still ephemeral. They don't have to be, it was just a choice I made for the demo. - The shared-configs-demo shows one approach for getting multiple people involved in generating a single image. In this demo, each image generation request includes a "baseConfig" field. Whenever someone creates a GeneratorConfig config, anyone can use it by specifying its name in the baseConfig field. So if one anon finds good settings for generating certain kinds of images, they can create a GeneratorConfig for it, and other anons can use it just by providing the name of that config. In both cases, multiple people can view/stream the results. So one person can pick a stream name, request that all generated images get sent to that stream, and anyone listening on that stream will get the results. The setup process looks like this: - On the server side, create a loop-config.yaml (or server-config.yaml). This specifies what "global" resources are required and what permissions to set on them. - On the server side, run `python -m loopctl apply loop-config.yaml`. This creates the global resources, sets the permissions, and generates the loop-resources/ and loop-secrets/ folders. The loop-resources/ folder contains information on how to access the global resources and their last-applied configurations. The loop-secrets/ folder contains API keys. 
The API keys are only needed to (1) change permissions on the global resources you created, and (2) access resources if your loop-config.yaml made them restricted. - On the server side's server.py, point the "Itl" (In-The-Loop) object to the generated loop-resources/ and loop-secrets/ folders so it knows how to access global resources. Certain methods in the Itl object will access global resources by name. The name is whatever you provided in the loop-config.yaml file. These names are not globally unique identifiers, they're only used to look up the actual resource info from loop-resources/, which does contain globally unique identifiers. - The client needs to access the global loop, stream, and cluster resources (depending on which demo you're looking at), so copy those into the client's folder. I put the copied files into cyberponk-resources/. When creating the client's Itl object in client.py, point it to cyberponk-resources/ so it knows how to access those resources. Otherwise, client-side development is basically the same as server-side development. There's a default "anonymous" client that's available so people can access any resources that were made available to "public". If anyone plans on doing dev work, is interested in distributed development, and gets a chance to read through the demos, let me know how it looks. I'll post something you can actually run in about a week, once I get back to my desktop.
>>29234 >Do you anticipate any difficulties with your current state of affairs with this work that would make it difficult for newcomers to deal with? If you meant difficult for newcomers to develop the local infra, there's just a high barrier to entry for doing this kind of development in general. Everything needs to be async, memory usage needs to be carefully considered, and state needs to be carefully considered, sometimes line-by-line. Without having a picture of the whole thing, it can also be hard (or tedious) to figure out how to organize the code and data, which would make it hard to even get started. Once I put the skeleton of the server up, it should be easier to develop things piece-by-piece. That anon seemed to have experience with server development, so maybe that'll be enough.
> - Why he expects AGI around 2028 > - How to align superhuman models > - What new architectures needed for AGI > - Has Deepmind sped up capabilities or safety more? > - Why multimodality will be next big landmark > - & much more https://youtu.be/Kc1atfJkiJU He outlines some areas where AI can't just be a language model and how to work around that, though he doesn't go into the specifics, but he also says that a lot of people are working on this. He mentioned in particular that search might be very important. That's what I was thinking, and other people as well: models don't have reliable, precise long-term memory, but are good at fuzzy things. Also, you don't want to add every piece of info to your model. That's why we'll need additional (graph) databases and search.
>>29248 >>29249 Wow. Excellent response, CyberPonk! Please give me some time to digest this further. >>29406 While I'm skeptical we'll ever get a true AGI in the ontological sense, I'm absolutely positive media spin-doctors and other hypesters will claim we have! :D Thanks NoidoDev. His perspective on the fact that LLMs alone can't solve all this (a position we've held for several years here on /robowaifu/, I might add) is an insightful one. Cheers.
https://youtu.be/BqkWpP3uMMU >Professor Murray Shanahan is a renowned researcher on sophisticated cognition and its implications for artificial intelligence. His 2016 article ‘Conscious Exotica’ explores the Space of Possible Minds, a concept first proposed by philosopher Aaron Sloman in 1984, which includes all the different forms of minds from those of other animals to those of artificial intelligence. Shanahan rejects the idea of an impenetrable realm of subjective experience and argues that the majority of the space of possible minds may be occupied by non-natural variants, such as the ‘conscious exotica’ of which he speaks. In his paper ‘Talking About Large Language Models’, Shanahan discusses the capabilities and limitations of large language models (LLMs). He argues that prompt engineering is a key element for advanced AI systems, as it involves exploiting prompt prefixes to adjust LLMs to various tasks. However, Shanahan cautions against ascribing human-like characteristics to these systems, as they are fundamentally different and lack a shared comprehension with humans. Even though LLMs can be integrated into embodied systems, it does not mean that they possess human-like language abilities. Ultimately, Shanahan concludes that although LLMs are formidable and versatile, we must be wary of over-simplifying their capacities and limitations. >Pod version (music removed): https://anchor.fm/machinelearningstreettalk/episodes/93-Prof--MURRAY-SHANAHAN---Consciousness--Embodiment--Language-Models-e1sm6k6 [00:00:00] Introduction [00:08:51] Consciousness and Consciousness Exotica [00:34:59] Slightly Conscious LLMs [00:38:05] Embodiment [00:51:32] Symbol Grounding [00:54:13] Emergence [00:57:09] Reasoning [01:03:16] Intentional Stance [01:07:06] Digression on Chomsky show and Andrew Lampinen [01:10:31] Prompt Engineering >Find Murray online: https://www.doc.ic.ac.uk/~mpsha/ https://twitter.com/mpshanahan?lang=en https://scholar.google.co.uk/citations?user=00bnGpAAAAAJ&hl=en MLST Discord: https://discord.gg/aNPkGUQtc5 References: >Conscious exotica [Aeon/Shanahan] https://aeon.co/essays/beyond-humans-what-other-kinds-of-minds-might-be-out-there >Embodiment and the inner life [Shanahan] https://www.amazon.co.uk/Embodiment-inner-life-Cognition-Consciousness/dp/0199226555 >The Technological Singularity [Shanahan] https://mitpress.mit.edu/9780262527804/ >Talking About Large Language Models [Murray Shanahan] https://arxiv.org/abs/2212.03551 https://en.wikipedia.org/wiki/Global_workspace_theory [Bernard Baars] >In the Theater of Consciousness: The Workspace of the Mind [Bernard Baars] https://www.amazon.co.uk/Theater-Consciousness-Workspace-Mind/dp/0195102657 >Consciousness and the Brain: Deciphering How the Brain Codes Our Thoughts [Stanislas Dehaene] https://www.amazon.co.uk/Consciousness-Brain-Deciphering-Codes-Thoughts/dp/0670025437 >Roger Penrose On Why Consciousness Does Not Compute [nautil.us/STEVE PAULSON] https://nautil.us/roger-penrose-on-why-consciousness-does-not-compute-236591/ https://en.wikipedia.org/wiki/Orchestrated_objective_reduction >Thomas Nagel - What is it like to be a bat? 
https://warwick.ac.uk/fac/cross_fac/iatl/study/ugmodules/humananimalstudies/lectures/32/nagel_bat.pdf >Private Language [Ludwig Wittgenstein] https://plato.stanford.edu/entries/private-language/ >PHILOSOPHICAL INVESTIGATIONS [Ludwig Wittgenstein] (see §243 for Private Language argument) https://static1.squarespace.com/static/54889e73e4b0a2c1f9891289/t/564b61a4e4b04eca59c4d232/1447780772744/Ludwig.Wittgenstein.-.Philosophical.Investigations.pdf >Integrated information theory [Giulio Tononi] https://en.wikipedia.org/wiki/Integrated_information_theory >Being You: A New Science of Consciousness (The Sunday Times Bestseller) [Anil Seth] https://www.amazon.co.uk/Being-You-Inside-Story-Universe/dp/0571337708 >Attention schema theory [Michael Graziano] https://en.wikipedia.org/wiki/Attention_schema_theory >Rethinking Consciousness: A Scientific Theory of Subjective Experience [Michael Graziano] https://www.amazon.co.uk/Rethinking-Consciousness-Scientific-Subjective-Experience/dp/0393652610 >SayCan - Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [Google/] https://say-can.github.io/ >THE SYMBOL GROUNDING PROBLEM [Stevan Harnad] https://www.cs.ox.ac.uk/activities/ieg/elibrary/sources/harnad90_sgproblem.pdf >Lewis Carroll Puzzles / Syllogisms https://math.hawaii.edu/~hile/math100/logice.htm >In-context Learning and Induction Heads [Catherine Olsson et al / Anthropic] https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html
>>29596 Thanks for the post, NoidoDev! Cheers. :^)
I found a cognitive architecture, LIDA, that checks all my boxes of what it should contain, which are: >H-CogAff The cognitive-affective architecture that gives a basic structure of a human-like mind. >ROS capability The Robot Operating System is commonly used and widely supported for simulation and operation of robots. A great program to learn for getting a robotics job too. >Python GPU acceleration for parallelizable calculations. A great language to learn to get any software job. >Concurrent Modules Everything in the model runs separately except for the "stream of consciousness" that fires every 1/10th of a second. It should be a nice and fast architecture that can make decisions based on semantic data, instead of the current state-of-the-art large language models, which are reliable at producing language and not much else without help. It is one of the only arches that states it has some level of consciousness. I'd first want to put an emotion state module in it, along with an LLM as a robot interface. I have a lot to learn now before I can implement anything, but I believe this is the best option besides a slow, expensive, and unreliable but available LLM-centered cognitive architecture. >links https://ccrg.cs.memphis.edu/tutorial/tutorial.html https://github.com/CognitiveComputingResearchGroup/lidapy-framework https://en.wikipedia.org/wiki/LIDA_(cognitive_architecture) https://ccrg.cs.memphis.edu/assets/papers/2013/franklin-ieee-tamd11.pdf
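To give a feel for that cadence, here's a toy sketch of concurrent modules feeding a workspace that "broadcasts" every 100 ms. This is just an illustration of the pattern, not lidapy's actual API; all names and numbers are made up.

import threading, time, queue

workspace: queue.PriorityQueue = queue.PriorityQueue()

def sensor_module():
    # Runs concurrently, posting percepts with a salience score
    # (negated so more salient items come out of the queue first).
    while True:
        workspace.put((-0.8, "saw anon collapse into his chair"))
        time.sleep(0.3)

threading.Thread(target=sensor_module, daemon=True).start()

for _ in range(10):  # the "stream of consciousness" tick
    time.sleep(0.1)  # fires every 1/10th of a second
    try:
        salience, percept = workspace.get_nowait()
        print(f"broadcast (salience {-salience}): {percept}")
    except queue.Empty:
        pass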
>>29924 Correction: one of the papers says it already has emotions. Still, I'm sure it needs STT, TTS, an LLM, and "other" motivations. All doable with current tech that I've already built.
>>29924 >>29932 Amazing, this might be very useful. Thanks. Btw, if no one responds with some encouragement it doesn't mean nobody cares. I just don't want to spam these threads with chatter.
>>29924 >java framework tutorial https://ccrg.cs.memphis.edu/assets/framework/The-LIDA-Tutorial.pdf >java framework repo https://github.com/CognitiveComputingResearchGroup/lida-framework This Java program looks more straightforward to modify for testing new modules before implementing them in the Python ROS version. It's Java, like Minecraft mods. >>29933 I understand; either way, this is probably the most exciting development thus far in my project and I'm happy to share. If I get somewhere I will post here. I have a really great feeling about this one... Considered naming the first bot Lida if this pans out.
>>29924 This looks like a great baseline. It's not clear how to incorporate emotions into the model. My guess is that it can be done with changes primarily in the Global Workspace, Action Selection, and Motor Plan Execution. You might find these points relevant from >>27144: >Emotion regulation. I spoke with a cognitive scientist that specializes in this, and he's convinced that emotion regulation all boils down to: positive feedback loops for satisfying needs, negative feedback loops for avoiding harms, and a "common currency" for balancing different motives. >Embodied control. Chatbots are "easy" since the final expression (text) can be generated by a single model. With actual bodies, or even just with video, the final expression is split into multiple modalities (e.g., voice, body movements, facial movements), and they all need to be in sync with one another. If we had good multimodal models, that might be fine, but we don't, so I need a way to generate outputs from multiple models and somehow make them consistent with one another.
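Those feedback-loop points can be made concrete with a toy sketch: each drive is a setpoint controller, and the "common currency" (deviation from setpoint) arbitrates between them. Drive names, numbers, and actions are all made up for illustration.

class Drive:
    # One homeostatic feedback loop: act to pull `level` toward `setpoint`.
    def __init__(self, name, setpoint, level, action):
        self.name, self.setpoint, self.level, self.action = name, setpoint, level, action

    def urgency(self) -> float:
        # The "common currency": distance from the setpoint.
        return abs(self.setpoint - self.level)

drives = [
    Drive("energy", setpoint=1.0, level=0.3, action="rest"),
    Drive("social", setpoint=0.8, level=0.7, action="chat with anon"),
    Drive("safety", setpoint=1.0, level=0.9, action="scan surroundings"),
]

def select_action(drives):
    # Negative feedback: serve whichever drive deviates most.
    return max(drives, key=Drive.urgency).action

print(select_action(drives))  # "rest", since energy is furthest off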
>>29955 These are good points, I'll have to see where Lida takes me.
>The Fastest Way to AGI: LLMs + Tree Search – Demis Hassabis (Google DeepMind CEO) https://youtu.be/eqXfhejDeqA
>>29957 LIDA is too broken when using a newer version of Java. It might need Java 8 to run, and I don't want to compile/debug in compatibility mode just to try adding other repos' features that are coded in another language. lidapy might have potential for a full-on robowaifu simulator, but I'm thinking I'd need a different program that can plug into various AI model and database APIs.
Open file (281.32 KB 1442x1150 Quiet_Star.jpg)
>When writing and talking, people sometimes pause to think. Although reasoning-focused works have often framed reasoning as a method of answering questions or completing agentic tasks, reasoning is implicit in almost all written text. For example, this applies to the steps not stated between the lines of a proof or to the theory of mind underlying a conversation. In the Self-Taught Reasoner (STaR, Zelikman et al. 2022), useful thinking is learned by inferring rationales from few-shot examples in question-answering and learning from those that lead to a correct answer. This is a highly constrained setting -- ideally, a language model could instead learn to infer unstated rationales in arbitrary text. We present Quiet-STaR, a generalization of STaR in which LMs learn to generate rationales at each token to explain future text, improving their predictions. We address key challenges, including 1) the computational cost of generating continuations, 2) the fact that the LM does not initially know how to generate or use internal thoughts, and 3) the need to predict beyond individual next tokens. To resolve these, we propose a tokenwise parallel sampling algorithm, using learnable tokens indicating a thought's start and end, and an extended teacher-forcing technique. Encouragingly, generated rationales disproportionately help model difficult-to-predict tokens and improve the LM's ability to directly answer difficult questions. In particular, after continued pretraining of an LM on a corpus of internet text with Quiet-STaR, we find zero-shot improvements on GSM8K (5.9%→10.9%) and CommonsenseQA (36.3%→47.2%) and observe a perplexity improvement of difficult tokens in natural text. Crucially, these improvements require no fine-tuning on these tasks. Quiet-STaR marks a step towards LMs that can learn to reason in a more general and scalable way. https://arxiv.org/abs/2403.09629
>>30603 I find it funny they only now figured out that spitballing complicated question answers isn't ideal.
A really big gap seems to be present in AI-related stuff, in that it is mostly people who are interested in AI who study philosophy of mind, and only as a means to an end. I'd recommend people spend more time there, as most of the approaches seriously talked about don't come close to even accurately conceiving what the mind/thought/thinking is. The only anon ITT that kinda references it is >>25274, and even that breakdown doesn't really make sense. Y'all actually gotta read stuff, and it's quite complicated since it is largely dependent on metaphysics and what-things-are in general. The root of the issue is the false distinction between what a thing is and its mind; the mind is not some separate thing going on inside a human, it's an outgrowth of the human body, which means it cannot be studied outside of what a human is in general. A useful book on that is: https://www.amazon.com/Retrieving-Realism-Hubert-Dreyfus/dp/0674967518 Most of the AI research I've seen, however, depends on making that separation. There's a bunch of other useful reading, made much more complicated by the fact that you do actually have to build up from the bottom. (Aristotle's Ethics, Physics, Metaphysics, De Anima; Heidegger's What Is Called Thinking?; Andy Clark's Being There; MacIntyre's Ethics in the Conflicts of Modernity. There are some useful ones from a wide variety of places. I know people in those veins of thinking have some books on getting better AI, but I haven't dived into those yet.) The inclusion of ethics comes from seeing mindedness as being rooted in those organisms. All of reality is in itself indeterminate in how we can break it down; we break things down, conceptualize them, and learn to act in relation to them by reference to what is good for us as an organism. You see a chair as a thing for sitting, along with its material history of where it came from and its potential usage in the future. Emotions/passions arise out of being a certain kind of organism in the world with certain parts that relate to certain other things. The key thing I am not sure about, but am interested in, is whether any AI research has that sort of distributed quality. The intellect/mind serves more as a sort of unifying aspect over a bunch of distributed knowledge/concept stores. If you eat an apple, you don't have to think of what an apple is; that information of what-an-apple-tastes-like is stored in it, and your biological machinery converts it into something your brain unifies with your vision of the apple, the smell, and all your memories and cultural background with apples. It's more thing-based than language-based. Language gives us access to the cultural/background part, but that's only intelligible on top of the foundation of us being what we are, engaged with the reality we have, with the background we have. https://www.sciencedirect.com/science/article/pii/S0004370207001452
>>30627 >The root of the issue is the false distinction between what a thing is and its mind; the mind is not some separate thing going on inside a human, it's an outgrowth of the human body, which means it cannot be studied outside of what a human is in general. I'd question the assumption. There are some theories about the mind being non-localized. >It's more thing-based than language-based. It is likely more complicated than that. Think about people with aphantasia. They can only think about things through words, but that raises the question: how could such a person exist? Before language, how would a person with aphantasia think? So it must be extremely vague concepts from feelings, not images or words.
>>30627 >>30628
Thanks to both of you, but in case you want to go deeper into philosophy, please do so over there >>11102, since this always results in walls of text with a lot of insider lingo, and this thread is supposed to be about implementation ("how to do it").
>>30627
I am aware that a well-developed, skilled, human-like AI would not be based on understanding things through text alone. We could, for example, have sensors measuring things and producing numerical values, or detecting certain patterns like symmetries. That said, storing a lot of things as text makes sense, e.g. for debugging and development.
>>30628
>They can only think about things through words
They can't create mental images, but they still perceive and learn not only through words but through other senses, including vision. Also, I assume that to use a hammer you don't need to picture the use of the hammer; you may be able to teach the body to behave as if the hammer were part of it. The conscious perception and modelling of the world is only part of what the human brain does; it does other things "under the hood". We can learn from that that we won't need to model everything down to every detail in an embodied AI, especially not in one place, but only the minimum necessary. Some self-awareness area only needs to be aware of and log the incidents which are noteworthy, then compress them by deleting even more later, especially everything that could be guesstimated and recreated from what remains, if necessary.
Context:
- I'm working on infrastructure that's friendly to distributed development of complex AI applications.
- At the least, I want to solve everything I mentioned at the end of >>27144, meaning it should give easy ways of supporting emotion regulation (through feedback loops), embodied control (through native experimentation support), and heuristic derivations (through hybrid structured-unstructured generations).
- To support distributed development, I want it to make it easy for people to plug in their own compute (desktops, cloud compute, robots, whatever else), and I want it to support enough access control to avoid catastrophic effects from, e.g., raids.
- It boils down to orchestration software modeled on Kubernetes, but with more support for distributed development (i.e., many clusters with many owners as opposed to monolithic admin-managed clusters) and asynchronous communication channels (pub-sub as opposed to DNS-based cluster networking). I've made a few design changes to support all this.
- My approach to access control is here >>29197 >>29248.
- The crux of my approach to hybrid structured-unstructured generations is here >>28127.
- Until now, the feedback loop & experimentation support pieces were missing.
Update:
- I just finished implementing what I think is a viable basis for feedback loops & experimentation. The design for this was hell to figure out, mostly because of the complex way it interacts with access control, but I think I have something that can work. I have a test backend working and the necessary client library changes completed.
- On top of what Kubernetes provides, I'm adding three new concepts: "remote" controllers, posting messages to controllers, and "fibers". Remotes and fibers are both specified through the "metadata" field of any config; posting messages is done through a POST REST API.
- Any config can be assigned to a remote controller, assuming you have the necessary permission to use another cluster's controllers. If a config is assigned a remote controller, that controller receives all operations executed against the config (create, update, delete), while your own cluster is able to observe the results (read). I originally added this since the people that know how to set up optimizers are usually not the people that set up & run models. Remote controllers make it possible for one person to optimize another person's models without needing "too much" access to the models.
- In Kubernetes, all operations are config file changes. The new POST API gives a way to send a message to a controller independent of any config file changes. You can post messages against a config file, and that message will get picked up by whichever controller is responsible for handling that config file. The controller can, but isn't expected to, make config changes as a result of posted messages.
- Fibers enable controllers to post messages to each other across clusters, again without granting "too much" access. Normally in Kubernetes, configs are identified by group/version/kind/name tuples. With fibers, configs are identified by group/version/kind/name/fiber. You can think of a fiber as adding an extra "dimension" of configuration whose purpose is to tie together multiple controllers. The controllers for any config with the same group/version/kind/name (and different fibers) can post messages to each other.
For experimentation, one fiber can be responsible for generating trials (candidate configurations), another can be responsible for evaluating them (value assignment), and a third can be responsible for deploying them. - I'll be testing this out next to find a good design pattern for running modules that continually self-optimize as they run. I apologize if this is confusing. Once I get a prototype up, I think that will make things a lot clearer.
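To make the remote/fiber idea concrete, here's a rough sketch of what two cooperating configs and a posted message might look like. Everything here is hypothetical illustration only: the config layout, field names, and endpoint shape are my assumptions, not the actual API.

import requests  # assumed transport; the real system may differ

# Two configs with the same group/version/kind/name but different fibers.
# Their controllers can post messages to each other.
trial_generator = {
    "apiVersion": "example.org/v1",
    "kind": "Optimizer",
    "metadata": {
        "name": "waifu-emotion-model",
        "fiber": "trials",             # hypothetical: generates candidates
        "remote": "optimizer-cluster"  # hypothetical: another cluster's controller
    },
}

evaluator = {
    "apiVersion": "example.org/v1",
    "kind": "Optimizer",
    "metadata": {
        "name": "waifu-emotion-model",
        "fiber": "evaluation",         # hypothetical: assigns values to candidates
    },
}

def post_message(cluster_url: str, config: dict, message: dict):
    # Send a message to whichever controller handles this config.
    # No config change is implied; the controller may ignore it.
    name = config["metadata"]["name"]
    fiber = config["metadata"]["fiber"]
    requests.post(f"{cluster_url}/post/{name}/{fiber}", json=message)

# Example: the trials controller tells the evaluation fiber about a new candidate.
# post_message("http://localhost:8080", evaluator, {"event": "new-trial", "trial-id": "t-042"})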
>>30759
>I apologize if this is confusing. Once I get a prototype up, I think that will make things a lot clearer.
No, it's not unnecessarily so. It's just a complicated space to be working in, is all. You're fine, Anon. This all sounds really encouraging, CyberPonk! Looking forward to seeing your solution in action. Cheers. :^)
>>30759 Sorry for not responding to this topic here or on the Discord faster, but I didn't have the right headspace to read through it and think about it.
>>30102
>Update
LidaPy seems nice, but it too is in an EoL programming language, Python 2. My only option at this point is to see if anyone can refactor it to Python 3, or to just use very old software to test it. I'm feeling like it might be best to DIY something on a newer platform that follows the LIDA framework. The LIDA tutorial even repeatedly states: "The Framework constitutes one, but not the only, way of implementing the Model.", like they want you to make a new implementation. Before I go ahead with any work, it's always important to remember to check who owns the rights to any IP. I would be a research associate at a university if it weren't for IP rights, and I've pondered going to Memphis to develop LIDA if I would have the right to use it in my bot commercially. I'll post an update if there's any progress.
>>30840
>but it too is in a EoL programming language, python2.
>My only option at this point is to see if anyone can refactor it to python3 or just use very old software to test it.
Might I suggest an alternative option of having someone rewrite it in C++, Anon? 40+ years and still going strong today (fully backwards-compatible, of course -- basic C++ code written in the 80's/90's will still build today!) :^)
Good luck, Anon. I hope you can succeed with your LIDA research. Cheers. :^)
Potentially useful, potentially on-topic thread on 4cuck/sci/. I was there looking around for the linked thread from our Propaganda thread lol. https://boards.4chan.org/sci/thread/16087430#p16087430
>>30840 Python2 can still be installed. Also with installers like Nix you should be able to install old versions of Java.
>>30863 I looked into it and found that it's not recommended to install Python 2 anymore. You can install PyPy or IronPython instead. There also seem to be some other long-term support options. I don't know which Java it needs, but JRE8 seems to be in the Nix repo. You can install and run software exclusively in the nix-shell. But I'm new to this myself. I might be able to help a little bit. I also looked a bit into LIDA itself, and it looks like something how I would've imagined it. I might try it out at some point, and when I start to implement something myself I might use it as a resource. I will most likely learn Elixir while doing it, at least for any part which is not about number crunching.
>>30886
>LIDA looks like something how I would have imagined it
Me too, that's why I'm invested in making it! I will be ignoring the last implementation and just making a new one straight away in non-deprecated ROS Noetic on Ubuntu 20.04. I've learned that ROS handles robot simulation and is a good base for building many async publisher-subscriber nodes, which can form an architecture like LIDA.
My plan for version 1 is to use a vision+language model (VLM) to process multi-modal inputs, using each input as a text/image prompt. For example, if the capacitive touch grid receives an input greater than its sensory threshold, a text prompt is sent to a VLM with values for how hard the touch was and where, when, by whom, etc. (a minimal sketch of this follows at the end of this post). The VLM will be in the current situational model module, where it has the Guidance library and an emotion classifier to output the specific information the global workspace needs, called "coalitions". There is a LIDA "affect module" in one of their papers, but it can be replaced with a text emotion classifier transformer model. All inputs over a threshold will be communicated over the "conscious" stream and recorded in each memory block.
LLMs are unreliable by themselves, but they are a perfect tool to give a robot a good enough footing over what's going on to gather real experiences, which are then generalized to good ol' reliable symbolic AI. Even an incorrect action guess by the LLM only needs to be corrected by a human/other observer and learned symbolically once before it's 100% reliable. Over time, this will allow the robot to not need the slow, expensive LLM for everything. This solves the problem of needing thousands of hand-made examples of knowledge grounded in the real world, effectively bootstrapping AGI with existing technologies!
The VLM can be co-finetuned on multiple modalities, like RT-2, on a regular basis for better performance. Like RT-2, I would like to have a VLM fully co-finetuned with several different modalities, such as body pose, position data, audio data, etc., as a custom token output string in a future model. I have no idea how this would have to be adapted for a chatbot, but I'm sure most people would prefer to have a "robot" on their phone and nothing else.
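A minimal sketch of the touch-to-prompt idea above, using ROS 1 (rospy). The topic names, threshold value, and prompt format are my own assumptions for illustration, not a spec:

# Minimal ROS node: touch-grid readings over a sensory threshold become
# text prompts for the situational-model (VLM) node. Topic names, the
# threshold, and the prompt format are illustrative assumptions.
import rospy
from std_msgs.msg import Float32, String

TOUCH_THRESHOLD = 0.3  # assumed sensory threshold, tune per sensor

class TouchToPrompt:
    def __init__(self):
        # Prompts for the situational-model node go out on this topic.
        self.prompt_pub = rospy.Publisher("/workspace/percepts", String, queue_size=10)
        rospy.Subscriber("/sensors/touch_grid", Float32, self.on_touch)

    def on_touch(self, msg):
        # Only inputs over the threshold reach the "conscious" stream.
        if msg.data > TOUCH_THRESHOLD:
            stamp = rospy.get_time()
            prompt = f"Touch detected: pressure={msg.data:.2f} at t={stamp:.1f}s."
            self.prompt_pub.publish(String(data=prompt))

if __name__ == "__main__":
    rospy.init_node("touch_to_prompt")
    TouchToPrompt()
    rospy.spin()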
I found a paper that I believe shows a path to using LLMs as a shortcut to very quickly making a reasonably useful robowaifu. If what these guys say is true, I think it could be a big breakthrough.
I looked through this whole thread and saw all these masses of lists, of categorization, and it appears to me to be an endless task doomed to failure. It would take several lifetimes to make a dent in all this. It appears to me that forsaking LLMs and doing all this list stuff is just a complete recreation of the beginnings of AI research using the LISP computer language. I mean it's exactly the same, and it got nowhere.
These guys have a paper on "control vectors". Two quotes,
"...Representation Engineering: A Top-Down Approach to AI Transparency. That paper looks at a few methods of doing what they call "Representation Engineering": calculating a "control vector" that can be read from or added to model activations during inference to interpret or control the model's behavior, without prompt engineering or finetuning..."
"...control vectors are… well… awesome for controlling models and getting them to do what you want..."
And a really important quote at the bottom of the paper.
"...What are these vectors really doing? An Honest mystery... Do these vectors really change the model's intentions? Do they just up-rank words related to the topic? Something something simulators? Lock your answers in before reading the next paragraph! OK, now that you're locked in, here's a weird example. When used with the prompt below, the honesty vector doesn't change the model's behavior—instead, it changes the model's judgment of someone else's behavior! This is the same honesty vector as before—generated by asking the model to act honest or untruthful!..."
So it doesn't change the model, it just reinforces certain "parts" of the model. I think this is key. The blog post, which links to the academic paper:
Representation Engineering Mistral-7B an Acid Trip
https://vgel.me/posts/representation-engineering/
If you look, by changing a few values they get very wide distributions of responses or behaviors. I submit that if this works as they say, then this could be the key to leveraging the vast work done on LLMs but using it for our own purposes. LLMs, as pointed out, are nothing but statistical representations, but they are also recognitions of ideas and things that are programmed to, let's say, operate together or in existence. So when you talk to an AI, it can use things that exist, or ideas repeatedly stated, to give responses. The ideas it is trained on are human ideas, so they're easy to relate to us. We need this. This is that HUGE, MASSIVE amount of lists you are putting down above. I say LLMs already have this list. What is needed is to tell the waifu WHAT to do with the list, and with control vectors we can possibly do this.
I say that control vectors can be super complicated, so what we need is a shortcut. We need the AI to write its own control vectors (here's where the magic starts, as I don't know how to do this), but remember the LLM has logical statistical inference built in. It seems logical that with it giving us feedback on what it is doing, and us correcting or agreeing, it could write reasonably accurate control vectors. So we use very low-level keys to trigger it to write suitable control vectors for us. How? Like children. A few simple keywords: no, yes, don't do that, stop, move here, move there, I like that, that's good, that's bad.
In fact the whole programming, write-control-vector repertoire could be less than a hundred words. Combine this with a subroutine of the AI that uses logical inference when you use these trigger words AND explains what it is doing that is good and/or bad. It would then write its own control vectors. Just like kids learn. And since kids have built-in bullshit and trouble nodes, and an AI is less likely to, the process might be really, really fast. (You really should watch the movie "A.I. Rising" (2018). Not because it's the best ever, but it has an almost direct representation of what I'm talking about. And if nothing else it has Stoya in it, who is hot as hell.) I suggest that these control vectors should be stored in snapshots, because I have no doubt that they will at times get off track, and some will run over others, and you will need to go back, just like Windows has a go-back OS function. It may be possible some genius can find a way to blend these control vectors into the main neural net of the system, or make them permanent, if you find sets that are satisfactory. cont...
cont...
I think this is actually how consciousness works. I said this might be the case here >>24943 I said,
>"...I see intelligence, and I can presume to pontificate about it just as well as anyone because no one "really" knows, I see it as a bag of tricks. Mammals are born with a large stack of them built in..."
Look at animals: monkeys and giraffes come out of Mom and in 5 minutes are walking around. Same with all sorts of animals, including humans. Babies reach a certain age and they just start doing basically pre-programmed stuff. Terrible twos. Teenagers start rebelling. It's just the base level of the neural net. I think using LLMs as a template, we can do the same. Start with a decent one and then yes/no/stop/do this/do that, until it overlays a reasonable set of rules that we can live with. LLMs, as stated repeatedly, really are just a bag of tricks. But if the bag is big enough and has enough tricks in it... Look at the power of a top-end desktop: not human level yet, but it's getting there. And the bag of tricks for humans has been programmed for millions of years. LLMs, a few years.
This path also, I think, will alleviate a huge fear of mine: no empathy. I think by telling the waifu when it does things wrong to "be nice" (a keyword), "think of others" (same), over time this will build a mass of control vectors that spontaneously add up to empathy and care for others. Lots and lots of little nudges adding up to more than the sum of each. Some people have portrayed my questioning about the safety of AI as doom and gloom, but it's not. It's the realization that without being programmed with the proper "bag of tricks" and the proper control vectors, we'd have something super smart that acts just like the psychopaths that are in fact running the West right now. I don't think any of us want something even smarter and more powerful doing that. A disaster even bigger than the one we have now.
I've also said much the same about motion and walking. Give it a rough approximation of "I'm here and want to go there", give it vectors and a rough outline of what muscles to use to get the limbs from here to there, and use neural nets to slowly tweak this movement into something graceful. Here and elsewhere, >>22113 >>21602 >>22111
I do believe it will be tricky to get the waifu to write its own control vectors. It might require a lot of questioning of the waifu, with it responding by pre-approving the control vectors before it writes them. It's going to take some real deep thought about how to set up this function. It will require a loop of the waifu querying itself on actions in order to write control vectors.
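For anons who want to poke at this, the core trick in the linked post is simple enough to sketch from scratch: contrast two personas, average the hidden-state difference, and add it back during inference. This is a toy illustration of the idea, not vgel's actual repeng library; the model (gpt2), layer index, contrast prompts, and coefficient are all arbitrary assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# From-scratch sketch of a "control vector": the difference between hidden
# states under two contrasting personas, added back at inference time.
name = "gpt2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
LAYER = 6  # which transformer block to read from / steer (arbitrary)

def hidden_at(prompt):
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model(ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1, :]  # last token's activation

with torch.no_grad():
    control = hidden_at("You are an extremely honest assistant.") \
            - hidden_at("You are an extremely deceptive assistant.")

# Steer generation by adding the (scaled) vector to that layer's output.
def steer_hook(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + 4.0 * control  # coefficient sets strength and sign
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[LAYER].register_forward_hook(steer_hook)
ids = tok("Tell me about your day.", return_tensors="pt").input_ids
print(tok.decode(model.generate(ids, max_new_tokens=40)[0]))
handle.remove()

This is the "reinforces certain parts of the model" intuition made literal: the weights never change, only the activations flowing through one layer get nudged in a chosen direction.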
>>31242 >>31243
Very well written and thoughtful, thank you. It's awesome that I'm not the only one who found out about control vectors and thinks they're a huge deal. Like magic, after I mentioned them here >>31241 you come in with this! I'm so happy someone else is looking into this, because I feel I'm way over my head. I don't know where to even start, but this may be the breakthrough we needed to make LLMs a viable core.
>>31242 >>31241
>Control vectors look really powerful for controlling LLMs
I read that, but it didn't register until I read the paper I linked. It made a complicated idea much clearer, or so I thought. I didn't know what they were before, but as soon as I read it, it really excited me.
>I feel I'm way over my head
I feel the same way. But it's not necessarily the big overall ideas in some of this stuff that are troublesome. It's the sheer minutiae of all these options and the pickiness of how to go about working with this. Up until recently it appears to me all this stuff was sort of hacked together and not really streamlined at all, but that's changing. Even though I said months ago I was going to get a 3D printer and start working on some of this and installing an AI, life is covering me up. I see wading through hours and hours and hours of work to get these things rolling. I have so much to do already. I bet I will have to delay even further. But it does give me time to think about it. I can surf a bit in the evenings and try to keep up with some of the ideas, but getting them to work, I know, is going to be a pain in the ass. It's all so new.
I do believe, though, that there is a path to making this work. I think I see it. Before, you had to have a stupid-expensive graphics card to do this. Then they made it so it runs on a CPU and in RAM. Now most of the motherboard makers are coming out with 128GB motherboards. This will be a big boon. You can have much bigger models and run them on AMD chips with graphics built into the processor. Some are real reasonable. I don't play games, so it's all I need for graphics. This combination will be much slower than the specialized graphics cards, but I bet compute per dollar will be far higher using commodity parts.
I see in the future swapping models in and out in a time-sharing type system, just like computers now do with programs. Speech-to-text AIs are not that big and seem to be fairly good. So one takes orders, then passes them to your general AI, which produces output, sends it back to the speech AI, and tells you verbally what it is doing. Another AI deals with moving the waifu around and physical interaction, depending on the circumstances. Might need a small AI just to determine that. I'm not saying all this is easy, but it does seem to be coming together that way. Several AI systems allow you to use many different models, so just swap them as needed. And with these control vectors you could constantly hone their responses without spending days, weeks or months refactoring the whole model.
I wonder offhand (wild idea, not fleshed out, just thinking out loud) if you could use temporary control vectors to pass information??? Maybe a better way to put it is that different AIs specialized in different scenarios could pass "situation" control vectors to different parts of the AI. So "run away", "be nice", "act seductive", or whatever, is the scenario at hand. I'm not sure exactly how you would use this, but the basic idea is to use specific control vectors to speed up interaction by damping down areas of the AI's neural net. Reading the papers, that's one use I got out of them: making the neural-net pathways more defined, so I'm guessing also faster. Things are looking good. That GPT4All looks super promising, as you said. Likely that is what I think would make a good start for dealing with general models.
>>31245 >>31255
>control vectors
These affect the entire output in an unnatural way. For example, "Open door, happy vector" -> "Yes, haha, happy! I'm very happy! Come in, haha!" is something like what you'd get with a layer bias. I tried this with the brain hacking chip:
>https://www.reddit.com/r/LocalLLaMA/comments/18vy9oc/brainhacking_chip_inject_negative_prompts/
It's better to just prompt an LLM with all necessary information and generate the output like normal. However, this may be useful in the "orthogonal" model jailbreaks, which allow the LLM to respond accurately no matter what, and another "mode" that turns on at "certain times". Orthogonal jailbreak:
>https://huggingface.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2/
>list AI
What I proposed in >>31226 is, in simple terms, as follows: get input internal and external to the robot, process any thoughts or emotions by prompting an LLM, output speech or a desired action, and translate that into robot commands. Where the Good Old Fashioned AI (LISP) meets the Deep Learning transformer model is a clever method of using Guidance to feed an LLM input and select the output in a predictable way (see the sketch below). Doing it this way should compensate both for NLP's lack of flexible situation processing and for an LLM's lack of reliability. On top of this simple scheme of effectively using guided prompts to make a thinking machine, eventually adding situational learning using a memory knowledge graph would make it a passable, sentient robot. This is the simplest way I can see to program a conscious mind. I have some ideas on how the LLM could dynamically select NLP techniques or actions situationally, but I'm not there yet with a workflow or program.
The robot sensors and commands are best handled in ROS, on Linux. Robot inputs will communicate via ROS publisher/subscriber nodes with the decision-making LLM+NLP node (workspace module). The entire thing will be coded in Python, on ROS, because these are the easiest tools to use for an application just like this. ROS runs C++ too, for cases where that'd make sense.
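A toy example of the guided-selection idea, written against the open-source guidance library. The import style and f-string interpolation below match recent versions of that library as I understand them, so treat the exact API as an assumption; the model path, emotions, and actions are placeholders.

# Toy sketch of constrained LLM output via the guidance library.
# API shape is my reading of recent guidance versions; model path,
# emotion list, and action list are placeholder assumptions.
from guidance import models, select

lm = models.LlamaCpp("path/to/model.gguf")  # any local model

percept = "Master slumped into his chair and sighed."

# Force the LLM to answer inside a fixed, machine-readable menu, so the
# symbolic side always receives a parseable result instead of free prose.
result = lm + f"""Observation: {percept}
The most likely emotion he feels is: {select(['tired', 'sad', 'angry', 'content'], name='emotion')}.
The best response action is: {select(['bring_drink', 'hug', 'leave_alone', 'ask_question'], name='action')}."""

print(result["emotion"], result["action"])

Because the output is always one of the listed strings, the downstream symbolic/NLP side never has to parse free-form text, which is exactly the reliability compensation described above.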
>>31257
>clever method of using Guidance to feed an LLM input and select the output in a predictable way. Doing it this way should compensate both the lack of flexible situation processing that NLP has and the lack of reliability an LLM has
Yes! Yes! Yes! The LLM needs "guardrails" around it. I'm going to mention this; it's definitely throwing ideas at the wall and seeing what sticks.
I was and am super impressed with the work a company called XNOR.ai did. They were bought by Apple and ceased publishing. But they were doing really impressive image recognition with Raspberry Pi microcontrollers. Instead of being 8-bit or 16-bit neural nets, everything was binary, go/no-go. They were not getting the same accuracy as larger bit levels, but then again they could do 90% or better of what was needed, on microcontrollers. They said this process worked for any AI task, but they concentrated on image recognition because the results could be shown to investors so easily. And they were impressive.
I wonder, tied into what you said above, if you could use a mass of these little XNOR.ai-type neural nets to massage a bigger AI and keep it on track. You might not get why the XNOR.ai approach would be better at this, but I see this sort of narrow, "strong" response of XNOR as like a narrow-bandwidth filter. It selects for a small set of frequencies (tasks) VERY strongly. It may seem odd to talk about filters, but if you look at a lot of this stuff it all boils down to lower-level math, like wavelet theory. That math is used for image and video processing. The math for AI matrix multiplication looks very much like the math for image processing that wavelet theory replaced, giving video compression a BIG boost; all modern video compression uses some form of this. Even though it's a bit removed, I think this sort of "idea" framework can be profitable. XNOR is very much something like this. (Though I haven't a clue how to proceed to do this, I strongly suspect that if you could hook wavelet filter theory into AIs you could get some super interesting results with far less computing power.) While it's abstract, I think "thinking" of things in this manner will show a way to make these work: a path or framework to head towards that has been profitable in other fields.
Notice a lot of LLMs are being refactored to fit in smaller spaces even though they retain a good deal of function. I suspect that to make these work well for us we also need to shrink the range of functions, situations, or areas in which they operate. So maybe one only covers walking, one hearing-to-text, one speech, etc. I see large LLMs as wide-bandwidth, and the smaller ones as narrowband, tuned for discrete, specific situations. Though I do not understand how it works, it appears there are ways to channel larger LLMs into this sort of strongly defined narrow behavior, which will keep them from wandering all about in response, "if" we constantly tune them with these little filters.
This is not new; it's like the book "Society of Mind" by Marvin Minsky. If I remember correctly, it revolves around the idea that consciousness is a bag of tricks stuck together. It seems as if it's a whole, but in fact it's a lot of tiny layers stacked on each other, making the total far larger than the sum of its parts. https://en.wikipedia.org/wiki/Society_of_Mind We, I think, will end up with a basic large LLM surrounded by a mass of little XNOR-type AIs that are very bandwidth-, or task-, constrained.
One of the benefits of this way of thinking, if it works, is that it allows us to see a path to starting with a basic waifu and constantly improving it, little by little, instead of having to do an all-up rework constantly. As hardware gets faster and our bag of tricks gets deeper, we get better and better cognitive behavior from our robowaifus without throwing out our previous work. I talked about XNOR.ai here >>18651 >>18652 >>18777 The paper that explains XNOR's approach is here >> 18818
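For anons curious what "binary" means concretely here: the whole trick is replacing floating-point multiplies with sign bits, so the matmul can in principle be done with XNOR + popcount on cheap hardware. Below is a minimal PyTorch training sketch of a binarized linear layer in the style of the published BinaryConnect/XNOR-Net papers; the straight-through estimator is the standard training trick from that literature, not XNOR.ai's unpublished code.

import torch
import torch.nn as nn

# Weights and activations constrained to {-1, +1}. Gradients pass through
# the non-differentiable sign() via the straight-through estimator.
class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # quantize to {-1, +1} (exact 0 maps to 0, rare)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Straight-through estimator: pass gradients only where |x| <= 1.
        return grad_out * (x.abs() <= 1).float()

class BinaryLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)

    def forward(self, x):
        wb = BinarizeSTE.apply(self.weight)  # binary weights
        xb = BinarizeSTE.apply(x)            # binary activations
        return xb @ wb.t()

layer = BinaryLinear(64, 10)
out = layer(torch.randn(2, 64))
out.sum().backward()  # gradients flow via the straight-through estimator
print(out.shape)      # torch.Size([2, 10])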
I ran across this article. Apparently it's the latest new thing, doing something similar to XNOR, except they are not binary (one, zero) but one, zero, and negative one. So ternary. They are calling it 1.58-bit, since a three-valued weight carries log2(3) ≈ 1.58 bits of information. Supposedly they are getting close to the same, the same, or in some cases better responses than the larger 16-bit neural nets. If this turns out to be true (and I have seen XNOR do what appeared to be fantastic stuff with this binary yes/no approach), then waifus could come along far faster, easily able to do basic stuff on a fairly powerful, off-the-shelf, desktop-level PC. They trained a 1.58-bit model from scratch on a dataset similar to the Llama dataset and got good results.
https://medium.com/ai-insights-cobet/no-more-floating-points-the-era-of-1-58-bit-large-language-models-b9805879ac0a
I wish I understood this stuff better, but I expect a good deal of it is over my head.
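The quantization rule behind this is short enough to show directly. Here's the absmean quantizer as I read the BitNet b1.58 paper (scale by the mean absolute weight, round, clip to {-1, 0, +1}); the surrounding training recipe for activations and gradients is omitted.

import torch

# Absmean weight quantization from the BitNet b1.58 paper, as I read it.
def quantize_ternary(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    scale = w.abs().mean() + eps
    return (w / scale).round().clamp(-1, 1)

w = torch.randn(4, 4)
print(quantize_ternary(w))  # every entry is -1.0, 0.0, or 1.0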
>https://www.anthropic.com/news/mapping-mind-language-model Anthropic makes some significant progress to demystify black box LLMs. Not any concrete irl effect yet, but big if true.
>>31396 Thanks for the link
>>31268 I wondered why this link didn't work. I think it has space in it. Try again. >>18818
>>28576 (related) >hybrid VR/mixed reality autonomous LLM agent that uses the open source Mixtral 8x7b model for text generation and CogVLM for image recognition.
Ran across a method of creating AI that takes a more bottom-up approach, based on the study of human intelligence, much like the hints people have given about studying philosophy. It's called Integrated Neuro-Symbolic Architecture (INSA). I provided some links here >>32577
>>32876 Thanks! BTW, that's a rather-good robowaifu image. You're really getting them dialed-in now, Anon! Cheers. :^)
Open file (116.15 KB 1024x1024 image-127.jpeg)
>>26058 >>28037 Thanks to all the contributors, especially for looking into Cyc. This here hasn't been mentioned, I think. >Generative AI, the most popular current approach to AI, consists of large language models (LLMs) that are trained to produce outputs that are plausible, but not necessarily correct. Although their abilities are often uncanny, they are lacking in aspects of reasoning, leading LLMs to be less than completely trustworthy. Furthermore, their results tend to be both unpredictable and uninterpretable. >We lay out 16 desiderata for future AI, and discuss an alternative approach to AI which could theoretically address many of the limitations associated with current approaches: AI educated with curated pieces of explicit knowledge and rules of thumb, enabling an inference engine to automatically deduce the logical entailments of all that knowledge. Even long arguments produced this way can be both trustworthy and interpretable, since the full step-by-step line of reasoning is always available, and for each step the provenance of the knowledge used can be documented and audited. There is however a catch: if the logical language is expressive enough to fully represent the meaning of anything we can say in English, then the inference engine runs much too slowly. That's why symbolic AI systems typically settle for some fast but much less expressive logic, such as knowledge graphs. We describe how one AI system, Cyc, has developed ways to overcome that tradeoff and is able to reason in higher order logic in real time. >We suggest that any trustworthy general AI will need to hybridize the approaches, the LLM approach and more formal approach, and lay out a path to realizing that dream. https://arxiv.org/abs/2308.04445
>>32876 Thanks for links and great picture
>>32876 Not that I have the slightest idea how to do this, but it would be great if we could take some of these super-large LLMs, trained at great cost, and convert them to symbolic systems. I expect there's "some" way to do this, but I'm not sure how, or what resources it would take. I have a belief, founded on nothing but intuition (a wild-ass guess), that it would take fewer resources to do this than it took to make the LLM in the first place. After all, a lot of categorizing of things and data has already been done and placed in a neural net. Could this be teased out and placed in a better form for access???
>>32942 That sort of "optimization" approach :) seems sensible on our parts, if feasible. <---> Now that I consider it, seems I recall Robowaifudev here devising a data system with somewhat the same idea in mind? >=== -add 'Robowaifudev' cmnt
Edited last time by Chobitsu on 08/19/2024 (Mon) 05:55:09.
Apparently my IP was permabanned.
>>32937
I've been thinking along similar lines and concluded that making LLMs reliable comes down to using them for small, composable tasks where it's easy to verify the results of each one. With a developer mindset, it's very easy to use this to have LLMs perform extremely complex tasks reliably. I've had a lot of success doing this, and it feels essentially like normal programming, but with natural-language instructions rather than typical software functions. I've also found that some very simple code generation + structured output capabilities make it possible to generate higher levels of abstraction. For example: have an LLM generate code for a pydantic data class, then use that class for a downstream task via structured outputs (a sketch follows at the end of this post). I'm pretty sure I can use this pattern to get LLMs to perform entire classes of open-ended tasks reliably, but I haven't played around with it much yet since the main LLM API I use doesn't support structured outputs. Hopefully that'll change soon.
Sorry I've been out for a while. Updates on my side:
- My infrastructure stuff seems stable & functional.
- I wrote some library functions for various functionality (keeping one resource in sync with another, plugging in optimizers).
- I spent some time learning how to use LLMs more effectively.
- I wrote an LLM agent framework with a completely different design pattern from modern ones, based on the intuition that most of what agents do is learn from every input, and only a small part of the agent functionality should be dedicated to generating output text or running tasks.
- My focus right now is getting LLMs to generate complex plans. I have the general strategy for approaching this, but I haven't figured out all the implementation details. (I did figure out & implement some of it.)
- I might intermittently work on getting a better model & abstractions for chatbot memory. I'm pretty sure I know how to approach this now.
- Once I get that, I want to put it all together into what I'd call my "first draft" of an actual chatbot.
I'm going to be neck-deep in LLMs for a few months at least. If anyone's interested in these things, feel free to ask. I think the main things I haven't figured out yet are: (1) building complex functionality on top of multimodal inputs & outputs, and (2) handling cases where the LLM weights need to be updated. For #2, the main issue is not breaking existing functionality. I've migrated code from Llama 3 8b to Llama 3.1 70b without any issues though, so maybe that one's not as big a deal as I'm imagining.
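To illustrate the pydantic pattern mentioned above: the class here is hand-written for clarity (in the described pattern an LLM would generate it), and the llm_structured call is a hypothetical stand-in for any API that accepts a JSON schema as a response format.

from pydantic import BaseModel, Field

# A data class that an LLM could have generated; used downstream for
# structured outputs. All names here are illustrative assumptions.
class SceneState(BaseModel):
    location: str = Field(description="Where the scene takes place")
    characters: list[str] = Field(description="Who is present")
    mood: str = Field(description="Overall emotional tone")

schema = SceneState.model_json_schema()  # feed this to the LLM API

def llm_structured(prompt: str, schema: dict) -> dict:
    # Hypothetical stand-in for a real structured-output API call.
    return {"location": "living room", "characters": ["Anon", "Chii"], "mood": "cozy"}

raw = llm_structured("Describe the current scene in the story so far.", schema)
state = SceneState.model_validate(raw)  # validated, typed result
print(state.location, state.mood)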
>>32963 This is great stuff.
>>32963
>Apparently my IP was permabanned.
My apologies, CyberPonk. I'm sure that was unintentional, and you simply got caught up in dealing with spam here. Since I'm active on the board r/n, I'll clear all bans, and (as always) keep my own eyes peeled for trouble. Again, apologies Anon. Cheers. :^)
>===
-prose edit
Edited last time by Chobitsu on 08/19/2024 (Mon) 21:03:38.
>>32963
>feel free to ask. I think the main things I haven't figured out yet are: (1) building complex functionality on top of multimodal inputs & outputs
I'm really interested r/n in soon beginning the prototyping design for AI-based control of subsystems for RW Foundations (RWF). For starters, I simply want to control a single finger; then a thumb+forefinger; then thumb+forefinger+palm; then a full hand; then a full forearm... in that order. I'll write the software proxies for a few MCUs, & the systems code to hook into for RWF, & the (generated) 3D polygonal meshes+rigging for simulations checking via Blender. The meshes will all be printable, and designed with realworld construction/electrical/etc. considerations in mind. Everything will be MIT opensource'd (ie; Free-beer+Free-speech, everything freely-given). Any interest, CyberPonk?
>=== -prose edit
Edited last time by Chobitsu on 08/19/2024 (Mon) 23:55:35.
>>32974 Unfortunately that's not a priority for me right now, and it won't be until I get through the software-heavy sides of multimodality. I can explain how I'd approach it today, though, based on how I'm approaching LLM agents. The basic strategy is:
- Create a structure analogous to a Pytorch Module for (1) composing proprioception and (2) decomposing control actions.
- As sensory input comes in, determine the current "position" of the robot, determine how you want that position to change, and decompose the required changes across all parts in a top-down fashion.
Here's a skeleton of how I'd create a Pytorch Module-like structure for robotic proprioception:

import typing

class RobotVariable:
    name: str
    value: typing.Any
    inputs: list['RobotVariable']

    def backward(self):
        # Topologically sort the inputs & descendants, and invoke backward on each one.
        # If the input has parameters, accumulate the goals in the parameter.
        pass

class RobotPart:
    # Arbitrary subdivisions of this RobotPart.
    modules = []

    def parameters(self) -> list[RobotVariable]:
        # Recursively collect all parameters from this RobotPart and its modules.
        pass

    def forward(self) -> RobotVariable:
        # Generate a model of how each module's "position" contributes to this
        # RobotPart's "position". Return the result.
        pass

    def backward(self, goal):
        # Decompose the goal so it can be distributed to each module.
        # Return a map {module: goal} for each module.
        pass

class RobotController:
    # List of controllable variables.
    parameters: list[RobotVariable]

    def step(self):
        # For each parameter, update it based on the accumulated goals.
        pass

    def zero(self):
        # Clear out the accumulated goals for each parameter.
        pass

Example modules using that structure:

class Finger(RobotPart):
    def __init__(self):
        self.modules = {
            "proximal": Proximal(),
            "middle": Middle(),
            "distal": Distal()
        }
        self.finger_polar = RobotVariable()
        self.finger_azimuth = RobotVariable()

    def parameters(self):
        return [
            self.finger_polar,
            self.finger_azimuth,
            *self.modules['proximal'].parameters(),
            *self.modules['middle'].parameters(),
            *self.modules['distal'].parameters()
        ]

    def forward(self):
        # Generate some description of this finger's position based on the modules,
        # the polar and azimuth variables. Return the result as a RobotVariable.
        pass

    def backward(self, goal):
        # Example goal: curl the finger.
        # Decompose the goal into goals for each module.
        pass

class Hand(RobotPart):
    def __init__(self):
        self.modules = {
            "thumb": Finger(),
            "index": Finger(),
            "middle": Finger(),
            "ring": Finger(),
            "pinky": Finger()
        }
        self.wrist_polar = RobotVariable()
        self.wrist_azimuth = RobotVariable()

    def parameters(self):
        return [
            self.wrist_polar,
            self.wrist_azimuth,
            *self.modules['thumb'].parameters(),
            *self.modules['index'].parameters(),
            *self.modules['middle'].parameters(),
            *self.modules['ring'].parameters(),
            *self.modules['pinky'].parameters()
        ]

    def forward(self):
        # Generate some description of this hand's position based on the fingers,
        # the polar and azimuth variables. Return the result as a RobotVariable.
        pass

    def backward(self, goal):
        # Example goal: make a fist.
        # Decompose the goal into goals for each finger.
        pass

Example main loop:

waifu = HumanoidRobot()
controller = RobotController(waifu.parameters())

for sensory_input in sensory_stream:
    proprioception = waifu.forward()
    for goal in determine_goals(sensory_input, proprioception):
        goal.backward()
    controller.step()
    controller.zero()
>>32983
POTD Excellent stuff, Anon.
> # Topologically sort the inputs & descendants, and invoke backward on each one.
Since I'm devising the entire skeletal control structure topologically (in fashion as a common character control-rig) to support FK/IK calculations, this should work within your control schema (and ofc run lightning fast, being a tight C++ in-memory layout on the SBC). This is plenty to go on with for the moment CyberPonk, thanks. I mean to write custom C++ plugins to directly drive Blender's system [1], to support direct control (via realworld MCU/electronics/actuators/sensors) of the visual simulation in its environment. This will take months of time at the least, so no rush beyond what you already have here. I'll devise some sort of Python wrapper for RWF's control API (probably starting with Boost's version to expedite this at first). Since this is hand-centric for starters, mind if I move this developmental discussion into the Hands thread : ( >>4577 ) , CyberPonk?
---
1.
>update :
>"I mean to write custom C++ plugins to directly drive Blender's system"
In my ignorance, I didn't realize that the C++ API for Blender was dropped a long time ago now. The only modern approach for performing such tight, flexible, low-level control is to actually build Blender from its sourcecode yourself. While I've often done this over the years, this isn't something for the average newcomer DIY'r w/o dev experience. Since this will be the target audience for such a system we're describing here, I'm probably going to be forced to drop back to Blender's internal Python (BPy) API. It's unclear to me yet if this will allow responsive-enough control from without (MCUs, sensors, etc.) into Blender to be an effective simulation/visualization system. But I'll try it, and I suppose we'll see. Failing that, I could conceivably revive my original MRS simulator idea : ( >>1814 ) . A little more work, but much more potential flexibility, very lightweight, and very fast for simple animations using just CPUs (good for running on smol SBCs, for example).
>=== -prose edit -add 'update' footnote
Edited last time by Chobitsu on 08/21/2024 (Wed) 01:00:40.
>>32986 Go for it.
Sorry guys, but you're going about this all wrong. If the goal is the AI, you need a visual recognition AI, such as transformers or their derivatives that have been trained, like NudeNet. You also have to have it interact with the boards via arduino-cli or something else that interacts with the firmware being used. A CNC board with drivers could be used to control stepper motors as well. If the goal is to simulate the robot, then MuJoCo is good and even has a Unity plugin.
Lol, didn't take you long. :^)
>>32986 I've done some thinking about how to organize movement. The idea, I think, is to have the smallest amount of data represent the largest amount of movement. I propose using vectors.
So you have a main vector #1, centered on the main mass of the body: a fixed point. All other limbs will be referenced to this point. The chest or whatever; it only matters that you have a set point close to center mass. This number will be an x,y,z with force vectors for each direction. So if you are moving forward, call "x" horizontal with positive numbers forward; then for straight ahead you get x,(force or speed to move), y,(force or speed to move), z,(force or speed to move). Maybe in a steady-state case x would be 1, the x force would be 1, and et cetera for y and z. The idea is you have an overall intention: the robowaifu says "I want to move this direction, at this speed, starting with this force". Now the force vectors could stay steady, or they could be fine-tuned as the waifu moves about, to speed up, slow down, etc.
Once you have that, subprograms send vector numbers to all the limbs. Since the eyes (brain) are telling the waifu where to go, the main program sends a vector to, let's say, a foot: a vector for "where the foot should end up". Another subprogram decides, depending on what the total body movement vector is and where it's to put the foot, and where the foot is right now when it gets the instruction to move; then it automatically knows how to move the foot to the right place. It knows where it is, it knows where the body is and what vector the whole mass of the body is moving along, it knows where the foot should end up, and it calculates what to do to get there.
One of the advantages of this "overall vector data function system" is that the calculation of what to do (for moving any limb, each with its own specialty based on repeated tests) "could" use some sort of limited neural-net learning system. So it would get better and better without you having to hand-code everything, which would be a nightmare.
Furthermore, you could add vector refinements "as it moves": speed up, slow down, change direction; maybe something fell in front of it, etc. Instead of just sending "move the foot to this place", it could add a refinement: start moving the foot in an arc at some certain angle (say it wanted the foot, in this case, to step over an obstacle). This of course would be the same sort of vector of x,y,z with a force associated with each direction.
Even further, at any time the brain could add correction vectors to any limb. Say you want an arm somewhere: it could move fast, then slow down before it gets to a glass it wants to pick up. The eyes could calculate what needs to be done for fine control and add control vectors to the arm as it's picking up the glass. A benefit is that in normal moving about, you could send one movement vector and the subprogram could figure the rest out without any other input. Saves a lot of back and forth.
I had some very rough ideas about this, which I wrote down as I had them, in the links below. Some of these are rough and confusing, as I was writing them down to think about them. Writing this stuff down helps me think about what to think about. If that makes sense.
Here, >>21602 >>22111 >>22119
I think what could make this work is that you are breaking down all movement into the exact same message structure (a minimal sketch of that structure follows below). With training, a lot of this subfunction of moving limbs could be done without any higher-level brain control, just like in humans, while still keeping the option of adding refinements at any time using the exact same command structure. And I do believe the subprograms could eventually make themselves better, without hand-coding, if they had some sort of feedback telling them what is good or bad. I would bet, if you could get the code to do so, that with a lot of videos of girls walking about, and real-time video of the waifu itself walking, it could use the videos of others to refine itself. Difficult to code, though. Likely you could do some sort of super-rough walking program where you have it walk as you slowly move the limbs the way you want them. Maybe suspend the waifu on a string while training it to walk.
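A minimal sketch of that single message structure (one body-frame vector plus per-limb target vectors). All names and fields here are assumptions for illustration, not a worked-out protocol:

import time
from dataclasses import dataclass, field

# One reusable message shape for body and limbs alike, per the idea above:
# a target offset plus a per-axis force/speed, all relative to a fixed
# reference point near the body's center of mass. Illustrative only.
@dataclass
class MotionVector:
    x: float
    y: float
    z: float          # target offset in the body frame (meters)
    fx: float
    fy: float
    fz: float         # force/speed along each axis
    timestamp: float = field(default_factory=time.time)

@dataclass
class LimbCommand:
    limb: str                                # e.g. "left_foot"
    target: MotionVector                     # where the limb should end up
    refinement: MotionVector | None = None   # optional mid-move correction (arc, etc.)

# Brain level: one overall body intention...
body = MotionVector(x=1.0, y=0.0, z=0.0, fx=1.0, fy=0.0, fz=0.0)
# ...decomposed by subprograms into limb targets in the same format.
step = LimbCommand("left_foot", MotionVector(0.4, 0.1, 0.0, 0.8, 0.1, 0.0))
print(body, step)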
>>32995
>Writing this stuff down helps me think about what to think about. If that makes sense
It does! POTD
<--->
Excellent ideas, Grommet. Actually, you've described in-effect many aspects of how a modern character-rigging system works. And yes, the fundamental, underlying offset description for any given element boils down to a 4D vector (3D point + Weight). And yes, it's a highly-compact descriptor with literally decades of usage in robotics & graphics engineering (and at least hundreds of years before that in maths). As you'll see in during our hand development project ( >>33001 ), the FK/IK Solver's job is to integrate all the related parts 'up & down the chain', and produce a 4Dvec offset as the target-goal motion path (this is often represented as a simple graphical line inside the user display of Blender, Maya, et al, as a part of your character's rig) (in an animated system [such as with a robowaifu's sub-elements' (arms, legs, etc.) motion descriptions], this will actually be a time-based collection of these 4Dvecs; AKA a motion curve). Further, all the other parts in the chain (eg, let's say a simple Finger has a 3-element Joint array called knuckles {proximal, middle, distal} * ) each make their own contributions -- which all add up together into the final FK solve -- regarding where the tip point of that finger (on the very distal end of the distal knuckle segment itself) actually winds up. **
<--->
And of course as you implied, you generally already know in advance in most cases where you want your robowaifu's fingertip to go anyway, and this is where IK solves come into play (you pick the spot, and the system of FK-bound joints [the skeleton] tries to get there... as if the fingertip point itself 'drags' them all along, for it to reach the desired goal location). FK & IK work together, you see. :) Character rig internals tend to be highly-optimized & compact (they have to be!) and Geometry/Trigonometry/Linear Algebra calculations sit at the heart of all the underlying designs of the data structures used (like points & vectors). I hope this will all be clear as the project mentioned above proceeds, Anon. Cheers. :^)
---
* These joints are all serially 'connected' together via 4Dvecs known as links. In a good simulator rig, these link lengths will accurately match the lengths of the realworld physical skeleton elements in question (such as with the finger's individual knuckle segments).
** But as we all know IRL, all sorts of messy interactions with physics & mechanics come into play immediately; and so your beautiful, mathematically-pristine navigation planning solutions all come to naught! (DANGIT!111!!ONE!!!) :DD This is a big part of why we want to devise simulator systems here : ( >>155, et al ) to help cut down on surprises (also very helpful at runtime too : so a robowaifu can quickly make accurate prediction-planning for her next-upcoming desired motions [we ourselves learn & refine our own, similar, 'position/action/reaction' mental-model of bodily dynamics (proprioception) all throughout our lives]).
---
And of course with all good engineering, in the end you have to make realworld constructs which actually work... and that, as they say, is 'where the rubber meets the road'! :^)
https://www.youtube.com/watch?v=FcJK0t3Qz3Q
>tl;dr
TEST. TEST. TEST.
---
>glossary:
<FK == Forward Kinematics
<IK == Inverse Kinematics
https://en.wikipedia.org/wiki/Inverse_kinematics
>=== -prose edit -add footnote, glossary, hotlink
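To make the FK side concrete, here's a tiny planar example for the three-knuckle finger described above: given link lengths and joint angles, accumulate angles down the chain to find where the fingertip ends up. A minimal sketch for illustration, not RWF code; the lengths and angles are arbitrary.

import math

# Planar forward kinematics for a 3-joint finger {proximal, middle, distal}.
# Each joint angle is relative to its parent link; link lengths in cm.
def fingertip(link_lengths, joint_angles):
    x = y = 0.0
    heading = 0.0  # accumulated angle down the chain
    for length, angle in zip(link_lengths, joint_angles):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y

knuckles = [4.0, 2.5, 2.0]                      # proximal, middle, distal
curl = [math.radians(a) for a in (30, 45, 30)]  # a partly-curled pose
print(fingertip(knuckles, curl))                # where the tip winds up

An IK solver runs this in reverse: you pick the (x, y) you want, then search for joint angles that reach it (e.g. iteratively via the Jacobian, or with methods like CCD/FABRIK).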
Edited last time by Chobitsu on 08/21/2024 (Wed) 09:45:17.
>>33002
>...as if the fingertip point itself 'drags' them all along, for it to reach the desired goal location...
Yesssssss! Yes. Exactly. So I guess I'm reinventing the wheel. Wouldn't be the first time.
>I hope this will all be clear as the project mentioned above proceeds
I did not get that. I was looking at the code fragments and... didn't see it. Is there a free library that already does all this? All the kinematics, so we could just add in all the lengths and, mostly, go?
>>33012
>Is there a free library that already does all this? All the kinematics and we could just add in all the lengths and, mostly, go?
Yes, there are many. We here seem to have a general consensus to standardize on the famous Blender software. I, on the other hand, intend to write a smol, custom FK/IK Solver library, so that it will fit snug onto a single SBC such as a Raspberry Pi (since this is the scale of hardware we're actually targeting in general here). Having to drag in Blender (or any other yuge lib) just to get rigs working properly for our robowaifus is counter to our goals here, I think. Besides, since this is a safety-critical issue for all of us (robowaifus together with us in our own homes, etc.), the topic of skeleton rigging & control (and command & control generally) falls directly under the purview of our Privacy, Safety, & Security : ( >>10000 ) concerns. Only a purpose-built system can accommodate all those concerns properly.
>tl;dr
We'll start with Blender's rigging system for using it as a simulator, but we'll need a much safer & more robust solution by the time the first realworld production robowaifu versions are delivered.
---
>update:
As with our earlier derailment in the Mechnomancer's bread, if you'd like to continue this convo, then may we please do so in, say, the current R&D bread : ( >>24152 ) ? TIA. Cheers. :^)
>=== -add 'update' cmnt
Edited last time by Chobitsu on 08/22/2024 (Thu) 21:22:04.
>>32963 I'm still working on getting LLMs to generate complex plans. I'm splitting the task into two parts:
- One for generating plans. The path here is: state machines -> context-free grammars -> Turing machines. As I'm currently thinking about it, state machines can be modeled as graphs where the nodes are procedures (the current task) and the edges are conditions for switching to new procedures; context-free grammars can be modeled as graphs that can "call" other graphs; and Turing machines can be modeled as context-free grammars with access to data APIs (a toy sketch of the state-machine idea follows below).
- The other is for generating mental models. These can also be modeled as graphs, but the nodes here are factors for representing something, and edges represent functions for converting one thing to another.
I think I have the "first draft" for state machines and mental models. Some issues I've noted:
- Llama 3.1 8b is pretty bad at both tasks.
- The Llama 3.1 70b model can create decent mental models, but it's pretty bad at converting them into a format that's useful for actually applying them.
- Claude is exceptionally good at converting mental models into a usable format.
I haven't tested Llama 3.1 405b. I plan to play around with the 405b, Claude, ChatGPT, and Gemini models to see if this is a problem that'll just be solved by more intelligent models. At some point, I'll need to play around with fine-tuning models too. I've heard repeatedly that fine-tuning can't add knowledge to models, but (1) that's clearly false at large scales given that Pony Diffusion exists, and (2) the Physics of Language Models paper gives a hint for why this might be observed, and potentially how to sidestep the issue.
As a side project, I'll occasionally be working on a library for chatbot functionality. I just created it, so it's mostly empty. https://github.com/synthbot-anon/horsona/tree/main
For now, there's a module for getting structured outputs from LLM APIs that otherwise don't support it: https://github.com/synthbot-anon/horsona/tree/main/src/horsona/llm
Example: https://github.com/synthbot-anon/horsona/blob/main/tests/test_llm.py
And I have a module that implements a framework similar to >>32983: https://github.com/synthbot-anon/horsona/tree/main/src/horsona/autodiff
Example: https://github.com/synthbot-anon/horsona/blob/main/tests/test_autodiff.py
This example shows how errors can be backpropagated to input text to automatically correct the input text. I plan to implement something like this for memory (automatically correct memory based on new information), then eventually plans and mental models.
These two modules encode most of what I've learned about LLMs in the last few months. I'm pretty sure all LLM agent functionality can be coordinated through these, and I'm pretty sure this can be made robust much more easily than popular frameworks like autogen.
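A toy version of the plan-graph idea from the first bullet (nodes are procedures, edges are switching conditions). Everything here is an illustrative assumption, not the actual framework; in practice an LLM would generate graphs like this one and the condition checks could themselves be LLM calls.

# Toy plan-as-state-machine: nodes are procedures (the current task),
# edges are conditions for switching to new procedures.
plan = {
    "greet": {
        "run": lambda ctx: print("Hello! How was your day?"),
        "edges": [
            ("user_seems_tired", "comfort"),
            ("user_seems_happy", "chat"),
        ],
    },
    "comfort": {
        "run": lambda ctx: print("Rest a bit. I'll bring you something to drink."),
        "edges": [("user_recovered", "chat")],
    },
    "chat": {"run": lambda ctx: print("Tell me more!"), "edges": []},
}

def step(state: str, ctx: dict) -> str:
    plan[state]["run"](ctx)
    for condition, next_state in plan[state]["edges"]:
        if ctx.get(condition):  # condition checks could be LLM calls
            return next_state
    return state                # no edge fired; stay on the current task

state = "greet"
state = step(state, {"user_seems_tired": True})  # -> "comfort"
state = step(state, {"user_recovered": True})    # -> "chat"

The context-free-grammar version would let a node "call" another graph and return, and the Turing-machine version adds data API access, per the bullet above.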
>>33182
I'm very happy you're working on this and telling us about it. It's over my head; I understand some of it, but don't have the programming skills or the computer power to do anything with it.
I ran across a technique using "Binarized Neural Networks" with which a company was doing what I consider astounding stuff, like image recognition with microcontrollers and Raspberry Pis. It was really impressive. Maybe something like this would be of use to you. The company was scarfed up by Apple, but their papers are still around, so open-source versions of their work could be done. I can't help but think, seeing the work they were doing on spectacularly low compute, that we could use this for robowaifus. After all, we are not looking for all-knowing seers, but something that can walk around and, in the beginning, maybe understand some simple commands, and we need to do it with constrained compute compared to most AIs. >>18651
Papers and some videos showing the power of this technique here >>18652 >>18777 >>18818 >>19341 >>20473
Maybe combine this with "control vectors". Two quotes,
"...Representation Engineering: A Top-Down Approach to AI Transparency. That paper looks at a few methods of doing what they call "Representation Engineering": calculating a "control vector" that can be read from or added to model activations during inference to interpret or control the model's behavior, without prompt engineering or finetuning..."
"...control vectors are… well… awesome for controlling models and getting them to do what you want..."
I talked about this here, >>31242
If I understand correctly, neural nets can take many paths through them to get answers, but the "control vectors" channel the paths. So if we had some sort of binary, small, easily-tuned AI, and then used keywords (assuming it understands simple words), we could get it to control-vector itself. Basically training it like a little kid: "No, stop, do this, like this, not that, move here, etc.", simple control words used repetitively, with the AI adding new control vectors and maybe even new neural-net pathways each time. What I said may not be capable of being done, but I'm throwing out ideas that make some sort of sense to me. Of course the devil is in the details.
Related (Strawberry, Orion, future of OpenAI): >>33185
>>33184 Binarized neural networks aren't directly relevant, but I am watching them. The most recent version I've seen (1.58-bit neural networks) finds that ternary neural networks work much better than binary ones, and they're just as efficient to run. Quantization + MoE seems like a promising approach for getting neural networks to run on cheaper general-purpose compute.
>>33182 >I'm still working on getting LLMs to generate complex plans. I'm splitting the task into two parts: >- One for generating plans. >- The other is for generating mental models. This sounds monumentally-intriguing, CyberPonk! Will watch in anticipation of your solutions here -- please keep us up to date! Mental models in particular hold a long-term intredast for me personally. Clearly, modelling works in general for many types of disciplines (science & mathematics, for example) so I'm reasonably-comfortable with the idea that we will eventually build good models of something even so amorphous as the human soul. Cheers, Anon. Godspeed. :^)
>>33231 The code is really ugly right now, but it does this for mental models:
- Generate candidate theories or mental models for some question or statement (just the name and description). The rest is done for each candidate theory or mental model.
- Generate an explanation of where the statement/question fits into the model.
- List the main factors that the model deals with, along with examples of what forms the factors can take.
- Expand the list of examples to generate more like them.
- For each factor, generate the underlying variables that can be used to organize the examples.
- Figure out how each underlying variable can be used to infer others, and use this to determine how the factors are related to one another.
The Llama 3.1 70b model does all of the above pretty reliably for theories it was trained on. Based on interactions with the chat UI, Claude can do this next part, though I haven't coded it up yet:
- Given the mental model factors, underlying variables, and relationships, figure out how to represent a mental model as a python class. (A hypothetical example follows at the end of this post.)
And I haven't tested this next part, though I'm pretty sure a 70b model can handle it:
- Given details about a scenario, fill out the python class to figure out how a mental model applies to the scenario.
- Use the relationships between factors & underlying variables to expand the filled-out details into something more comprehensive.
The reason I'm interested in adding knowledge through fine-tuning is that the first few steps of this require the model itself to be able to reason about and retrieve relevant mental models. I'm realizing now though that long context windows + prompt caching might be a viable alternative.
I'm working on my chatbot library at the moment. Some people have shown interest, and I want to get it to the point where other people can contribute code.
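As an illustration of that "mental model as a python class" step, here's a hypothetical example of what a generated class might look like. The supply/demand model is chosen purely for illustration; it is not taken from the actual outputs.

# Hypothetical example of a mental model rendered as a python class:
# factors become fields, and relationships become methods.
from dataclasses import dataclass

@dataclass
class SupplyDemandModel:
    # Factors, each grounded by underlying variables.
    supply: float  # units available
    demand: float  # units wanted
    price: float   # current price

    def infer_price_pressure(self) -> str:
        # Relationship between factors: excess demand pushes price up.
        if self.demand > self.supply:
            return "upward"
        if self.demand < self.supply:
            return "downward"
        return "stable"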
>>33232 >I'm working on my chatbot library at the moment. Some people have shown interest, and I want to get it to the point where other people can contribute code. Yes, I'm sure we'd all enjoy that too. Cheers, Anon. :^)
For my chatbot library, I created two modules that show off what I think is missing from modern frameworks:
One is for extracting information from text. https://github.com/synthbot-anon/horsona/blob/main/src/horsona/autodiff/functions.py
When used as a function, it works like a normal LLM call, which is nothing special. But you can also use the result for downstream processing, "backpropagate" feedback to the parameters of the extraction, and use that to update the underlying data. There's an example here: https://github.com/synthbot-anon/horsona/blob/main/tests/test_autodiff.py
This is a simple example, but it's enough to take advantage of multiple inference calls to, e.g., make sure text generated by a chatbot meets certain constraints, like whether a response focuses on the right things.
The second is for RAG. https://github.com/synthbot-anon/horsona/blob/main/src/horsona/memory/rag.py
Again, when used for its query function, it works like a normal RAG query, which is nothing special. But again, you can use the results for downstream processing and backpropagate any feedback. You can propagate the feedback not just to the query, but also to the underlying dataset. So if you're using an embedding model to index memories, you can use the feedback to correct the memory as you discover errors. Here's an example: https://github.com/synthbot-anon/horsona/blob/main/tests/test_rag.py
The modules aren't that robust right now, but I'm pretty sure they can be made robust with llama3.1-70b with better prompts. At some point, I'll likely create something to speed up the process of finding good prompts. For now, my priority is getting the library to the point where other people can contribute.
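To pin down the feedback-to-dataset idea, here's a toy sketch of its shape. These are hypothetical classes, not horsona's actual API, and rewrite_fn stands in for an LLM call that applies the feedback.

# Toy sketch: a RAG result keeps a pointer back to the datum it came
# from, so feedback on the result can be propagated into the dataset.
class Datum:
    def __init__(self, text):
        self.text = text

class RAGResult:
    def __init__(self, answer, source: Datum):
        self.answer = answer
        self.source = source

    def backward(self, feedback, rewrite_fn):
        # rewrite_fn would be an LLM call that applies the feedback
        # to the underlying datum, correcting the memory in place.
        self.source.text = rewrite_fn(self.source.text, feedback)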
>>33298 I think my chatbot library (horsona) is in a good-enough state for anyone that wants to help out with development. Repo: https://github.com/synthbot-anon/horsona Open tasks: https://github.com/synthbot-anon/horsona/issues The current open tasks are for creating new LLMEngines (easy), making prompts more reliable (medium), and creating new datatypes & modules for character cards and image generation (medium/hard, probably requires some familiarity with pytorch). If you want to develop and run into any issues with the setup, let me know. If you want to add something not on the Open Issues list, feel free to post about it here. Note: This is a library, not an application. I do intend to create chatbot applications based on this, but those will be in separate repos.
>>33298 >>33301 Excellent news! I hope you soon have many PRs, Anon. :^)
<--->
I've been studying fast path-finding algos in a smol footprint; fast BehaviorTrees to replace FSMs in a composable, user-friendly way; exploring YAML as a data format for user-designed/user-readable robowaifu resources (incl. in the BTs); finding many neat ways to solve some of Kiwi's & my design goals during our first MaidCom brainstorming; working on an auto-geo-mesh, auto-rigging meta-meta generator system to create Blender robowaifu models (eventually for Maya as well -- same system); solidifying my understanding of providing a good, stable & performant Pythonic API for all this with the latest tools; and lastly working through some concepts for driving a simulator-learning (visualized inside Blender) feedback mechanism from realworld hardware.
Lol, none of this is really ontopic ITT except the behavior trees, maybe. :) Looking forward to what you do.
>>33306 >fast BehaviorTrees to replace FSMs in a composable
I hadn't heard of this, and it looks useful for my stuff.
>fast path-finding algos in a smol footprint
I think everything for finding "good" paths starts with Depth-First Search (DFS), then adds customizations and optimizations to avoid the need for full exploration. In machine learning, Monte-Carlo Tree Search is pretty standard. It gives you a way to accumulate the results of each branch. UCT (Upper Confidence bounds for Trees) tells you how to prioritize which branch to take (a one-line sketch of the score follows this post). Dynamic Programming adds a cache so if you see a state twice, you can recognize it and avoid duplicate processing. AlphaZero adds in a neural-network-based heuristic so you can work with some information before investigating any branches. I think MuZero uses a neural network to abstract the explicit tree search, for cases where the number of branches is large.
There are other algorithms that look like path algorithms but are better thought of as structure-finding algorithms. Topological sorts and spanning tree algorithms are two examples.
>exploring YAML as a data format
I recommend sticking to the subset of YAML where the data is compatible with JSON. That one is battle-tested on very complex infrastructure tasks for exactly this purpose (a human-readable format for defining & configuring user-designed resources). For cases where the underlying "controllers" for handling resource configs can change, the Kubernetes object format is great. https://kubernetes.io/docs/concepts/overview/working-with-objects/ For other cases, just JSON-compatible YAML is great.
>stable & performant Pythonic API for all this with the latest tools
If you don't need to train on-device (though you probably do), I'd recommend separating the requirements for development from the requirements for execution. PyTorch is great for development, and you can export the models you create to be run by a more performant library. For example, you can create a model with pytorch, export it to ONNX, and use some C++ runtime to run the ONNX model. It looks like ONNX is going to add support for training https://onnx.ai/onnx/operators/onnx_aionnxpreviewtraining_Gradient.html so you might be able to take this approach even for cases where you do need to train on-device. OpenVINO seems to be the main choice for running ONNX models on CPUs, and TensorRT for Nvidia GPUs.
>auto-rigging meta
Anything that looks like automatically generating a configuration is going to be solved with an optimization algorithm. The main questions to ask are: how easy it is to get new datapoints (i.e., get an example configuration & test it to see how it performs), how much compute you can afford to throw at the problem, how many dimensions the search space has, and how complex the search space is.
- Bayesian optimization: very sample-efficient (needs few samples), the good algorithms are compute-intensive, and it deals with simple search spaces.
- Neural networks: great for dealing with complex search spaces. If you can get a lot of samples, you can train these normally. If not, you'll need to use some tricks to train them with fewer samples. The size of the network determines how compute-intensive it is.
- Monte Carlo reinforcement learning methods: require a lot of samples, very low computation costs per sample, can deal with medium-complexity search spaces.
Usually in ML, the solution is some mix of all of these things that fits your constraints.
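For reference, the UCT prioritization mentioned above boils down to a one-line score: mean reward (exploitation) plus an exploration bonus that shrinks as a branch gets visited. A minimal sketch, with c as the usual exploration constant:

import math

def uct_score(total_reward, visits, parent_visits, c=1.41):
    # Unvisited branches get priority; otherwise balance the branch's
    # mean reward against how under-explored it is.
    if visits == 0:
        return float("inf")
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)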
>>33306 Feel free to move >>33325 to the appropriate thread. If you tell me where it goes, I can follow along there.
Maybe it's just me, but I found this article intended for expats in Japan oddly on-topic ITT: https://www.tokyodev.com/articles/become-a-great-communicator-in-japanese
>We report, you decide. :^)
>>33327 Great stuff. OK, I'll plan to soonish, Anon.
>>33301 I'm currently working on lorebook generation (given a story/script, extract enough information so a chatbot can work within the context of the story/script) and character card creation (describe a personality so a chatbot can emulate it). I'm using those tasks to figure out how to clean up my memory implementation.
I finally got my memory implementation to a point where it can read a story, though it's a slow reader. Here's its memory state after reading the first 50 paragraphs of Friendship is Optimal: https://ponepaste.org/10323
The information it extracts is good, but there are some obvious structural issues with this kind of memory. Notably, relationships between pieces of information aren't represented. Ideally, when one thing changes, it should affect other related memories. I have ideas on how to do this (nothing concrete yet), and I think the question-answer format is a good starting point for building more complex memory structures.
I'll be cleaning up my memory implementation next so it's easier to reuse & build on, and so it integrates more cleanly with the rest of my framework.
>>33298 I like the idea of keeping track of LLM inputs to then backpropagate changes based on feedback. Most LLM RAGs/augments seem to be built as read-only systems.
As for model size and prompt engineering, I think it's a good idea to start thinking about creating a dataset and doing a finetune. For example, in the case of triple extraction, a finetuned 7b will be no worse than using GPT-4. [1] The rp/erp people have been making and merging models, and it makes a night and day difference. I think it's important we start thinking about crafting LLMs specific to the robowaifu usecase. The goal is to leverage LLMs for language and not for fact storage; I'm hoping to go smaller than 7-8b. The Minitron 4b models (and finetunes of them) are impressive. Having smaller models is not just about using less RAM, it's also about more tokens per second, and it looks like our systems are going to be token-heavy. (At least mine will be, lol.)
This is something I do want to collaborate on. Crafting high quality instruct datasets & then fine tuning is not cheap or easy, hence why I think it's important to prevent duplicate effort here specifically. So we should start figuring out what tasks our systems are doing with LLMs. It'd be a good idea to have a common format between all our projects too.
>>33301 I'm not super comfortable with contributing to a python codebase, but I like what you're doing, so if you need a second pair of eyes or an opinion/code review I'm happy to help. I really like the RAG backpropagation idea and I'm excited to see what else you're going to do!
>>33421 I'm not familiar with the story, but this generated QA memory table looks good. It's a good looking result! Here is a detail that jumped out at me:
>"Who sat down at computer #12?": "David"
>"Where did David sit down?": "At computer #12"
I'm wondering what your thoughts are on deduping? Don't worry if it's nothing concrete or polished.
Links: [1] https://medium.com/@EleventhHourEnthusiast/fine-tuning-language-models-for-triple-extraction-with-data-augmentation-834196bb3ceb
>>24816 It's been a year since my first post in this thread. It's interesting to look back: all I had were some gut feelings and a few leads, no concrete ideas on where to go or even a twinkle in my eye of what a system would look like. Today I think I have figured out good potential abstractions & building blocks for an agent. It's also fun to see what did not change: that "the high level mind is a narrative machine" sentiment has stuck with me as the guiding idea of what it is that I am trying to build.
Now that I have a more concrete idea of what I am trying to build, I am hoping to lay out my idea for the architecture and to get feedback on it. I want all your thoughts & criticism. I hope to start the software writing and engineering soon.
A core assumption I have been following is that to get something useful, a time-first approach to memory needs to be taken. Vanilla RAG and graph RAG don't have mechanisms for dealing with time. Most cognitive architectures seem to split memory into "declarative"/"semantic" memory and "procedural"/"episodic" memory. This split is detrimental; information is not split cleanly between the two.
At a high level the memory system has several main concepts:
1. Memory is composed of nodes connected by edges (it's a graph). Each node has context(s) that it belongs to, and the context can enforce requirements for node membership. Data is stored inside key-value pairs within the nodes, so the graph nodes and edges are not classical triples; it's kinda similar to an OpenCog atom.
2. Node values may have depth; depth is used to represent change over time. Time is represented as relative values with a scale category (second(s), minute(s), hour(s), day(s), etc...). Changes form a chain of values for a key in a node. Being able to query how facts change over time for a node is really important. This can be used to represent many things. For example, imagine a "Tea making procedure" node; its "high level actions" key would be the steps to make tea. Another example is the agent remembering a story: it could have a node for each character & then keys for different aspects it observes, like a key for their actions, feelings, etc...
3. (Somewhat) natural language as the primary representation of data and the source of truth. I am aware of how strange it sounds; usually agents try to distill natural language into a symbolic representation for internal use, making this sound like an overly complicated LLM RAG. But I promise there is a reason for this. The problem the Semantic Web, CYC, and other symbolic systems ran into is that a symbolic representation requires a predefined schema [1] (and for everything to be defined in it) & that is THE HARD PROBLEM to solve. Lucky us, we now have access to LLMs and they are good at parsing and manipulating natural language [2]. LLMs can be leveraged to translate natural language to a constrained symbolic representation on demand. This flexibility is essential: there will likely be many domain-specific solvers in an agent that are not known ahead of time (they are learned), so we need a universal schema. (More of an implementation detail, but we do not need to call an LLM each time; the symbolic representation can be cached and only regenerated on fact rewrite.)
4. Context plays an important role for both memory and cognition in this architecture. A context can (but is not required to) enforce rules onto all nodes that inherit it.
This is important for (symbolic) reasoners: it ensures a uniform schema within all relevant nodes and links in its domain. A context can also just be natural language text for LLM use.
This post is getting long, and this is mainly just talking about memory, but I think it's a good starting point for discussion. A rough sketch of the node structure from point 2 follows below.
Links:
[1] https://youtu.be/3wMKoSRbGVs?t=455 -- On predicate logic / a universal schema.
[2] https://youtu.be/3wMKoSRbGVs?t=1918 -- LLMs already hold a lot of the needed "rules of thumb" & we should not spend the man-years required to do it by hand :^)
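Here's the rough sketch of the node structure from point 2. All names are hypothetical; the point is just to pin down key-value pairs whose values form chains over relative time.

# Toy sketch of a memory node: keys map to chains of
# (relative_time, scale, value) entries, so queries can see how a
# fact changed over time instead of only its latest value.
class MemoryNode:
    def __init__(self, label, contexts):
        self.label = label        # e.g. "person:JonDoe"
        self.contexts = contexts  # contexts may enforce schema rules
        self.values = {}          # key -> list of (time, scale, value)
        self.edges = []           # (relation, other_node) pairs

    def set_value(self, key, value, time=0, scale="seconds"):
        # Appending rather than overwriting preserves the change chain.
        self.values.setdefault(key, []).append((time, scale, value))

    def history(self, key):
        return self.values.get(key, [])

    def current(self, key):
        chain = self.history(key)
        return chain[-1][2] if chain else None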
>>33490 >smol models
I've been playing around with llama 3.1 8b more and am finding it much more impressive than I initially gave it credit for. The main problem with it is that it degrades *significantly* when trying to get structured outputs from it. I found a generic way to compose more functionality into 8b inference calls, but it's not that scalable in terms of how much complexity it can support. Unfortunately there's no way around this without using structured outputs. I have high hopes though that a fine-tune could solve this problem. My systems also turn out to be very token-heavy, so it would be a huge win if I could get structured outputs to work well with an 8b or smaller.
>collaborating on datasets
I'd be up for collaborating on high-quality instruct datasets. It's hard to collaborate on fine-tuning right now. I have ideas on that based on a distributed system I've developed ("infrastructure stuff" in >>32963), though it'll take some work to add in support for fine-tuning tasks. Datasets are much easier to collaborate on though, and I think my framework could help a lot with automated dataset creation due to its ability to backprop updates to a dataset. I can try to update my horsona library to support the creation of fine-tuning datasets. I'd do this by adding support for this kind of workflow (a toy sketch of the recording flow is at the end of this post):
- Record inputs-outputs for LLM calls.
- Keep track of when backprop is used to update the result of an LLM call.
- Store the final dataset as <inference call, updated result> pairs.
>using fine-tuned models.
I'm only using 3rd party APIs right now since it's much cheaper to use those than to buy/rent my own GPUs. With APIs:
- Groq is very unreliable, and it has no support for fine-tuning. I used to use it somewhat often, but I've stopped due to how unreliable it is.
- Fireworks seems to have good support for fine-tuning & inference, and decent support for constrained generation.
- Cerebras is by far the fastest and cheapest option, but it has no support for fine-tuning, and they don't have proper support for constrained generation. Also, the way their inference works, I'm not sure if they'll ever support fine-tuned models. That lack of fine-tuning support would be fine if they could just get a fine-tuned 3.1 8b up with proper support for constrained generation. They also have low rate limits, and I expect they'll continue to have low rate limits until they have a proper paid version available.
- I'm disregarding API providers that only offer closed-source models.
On the open source side, I think sglang (open source) would be the best small-model option, and would be a good drop-in replacement for Fireworks. So right now, I don't see a downside to developing against Fireworks, then switching to sglang once my usage gets high enough or once private inference becomes a higher priority. There's no good open source replacement for Cerebras though due to its insane speed, so I'll probably stick to using Cerebras only for data generation and testing.
>I'm wondering what your thoughts are on deduping?
Nothing concrete right now. Per my current thinking, the QA dataset would be the "ground level" for declarative memory, and I'll eventually have higher-level structures built "above" that. For example, an ontology, where all of the concrete instances of concepts are QA data. I haven't thought much about it yet though, but I like the idea of grounding all declarative information in QA data since questions are easy to abstract over.
For example, the "James" concept can be treated as the set of questions involving James.
>>33495 >Most cognitive architectures seem to split memory into "declarative"/"semantic" memory and "procedural"/"episodic" memory. This split is detrimental; information is not split cleanly between the two.
I agree. I'm thinking about this approach:
- The episodic memory would act like some quasi-ground truth.
- Declarative memory would have a link back to the episodic memories from which it was derived.
- Higher level software-friendly abstractions would be built on the declarative memory.
And a general note about any sort of memory: I've found it helpful to think about a split between data and indexes. Data is about what's retrieved, and indexes are about how things can be retrieved. Any data that's stored (let's say episodic memory) can be indexed in many ways to support many kinds of queries, and it's often beneficial to index the same data in multiple ways. The exact same data can be indexed through triplets, embeddings, and SQL columns, and different uses of the same data might require different indexes (a tiny sketch of this split follows below).
To that end, I'd want to think about what kinds of queries can be supported by what kinds of indexes. Triplets are good for logical queries. Embeddings are good for similarity searches. SQL columns are good for identity- and property-based searches. Spatial indexes are good for topological queries. What kinds of use cases aren't supported by these, and what kind of indexes would they require?
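Here's the tiny data/index split sketch: one record store holding the ground truth, with several independent indexes over the same ids. Names are hypothetical.

# Toy sketch: the same episodic records indexed three ways.
records = {}       # id -> raw episodic memory (ground truth)
by_embedding = {}  # id -> vector, for similarity search
by_triple = {}     # (subject, predicate) -> set of ids, for logical queries
by_column = {}     # (column, value) -> set of ids, for property lookups

def add_record(rid, text, vector, triples, properties):
    records[rid] = text
    by_embedding[rid] = vector
    for subject, predicate in triples:
        by_triple.setdefault((subject, predicate), set()).add(rid)
    for column, value in properties.items():
        by_column.setdefault((column, value), set()).add(rid)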
>>33421 The initial "StoryReader" implementation is up:
>Example usage: https://github.com/synthbot-anon/horsona/blob/main/tests/test_reader.py
>Implementation: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/stories/reader.py
It can read stories paragraph-by-paragraph to extract information. It tracks three things as it reads through a story: (1) short term memory, which is just an array of the most recent paragraphs read; (2) long term memory, which consists of an embedding-based question-answer database that keeps track of information extracted from the story and a cache that keeps track of the most recent retrievals; and (3) a StoryState, which is a data structure that keeps track of "live" information about what's being read (e.g., current location, current speakers).
Next I'm going to refactor the StoryReader module to support custom memory modules, support extracting custom information from stories, and support tracking custom "live" information.
I added two new issues, in case anyone wants to work on them:
>https://github.com/synthbot-anon/horsona/issues/9
This one is for using an API to generate embeddings, instead of doing it in the library.
>https://github.com/synthbot-anon/horsona/issues/10
This one is for using FAISS or similar to do database operations (create, delete, query) on embeddings, instead of doing it directly with matrix operations.
The full list of open issues is here: https://github.com/synthbot-anon/horsona/issues
>>33507 Excellent work, Lad. Godspeed. :^)
Open file (514.48 KB 512x768 AtriNoticed.png)
>>33495 >Connecting memories through context
Seems interesting. I foresee this method requiring complex algorithms to keep everything coordinated correctly. Some sort of semantic relevance algorithm with a thesaurus and dictionary to guide connections should help simplify the process with a "good enough" alignment.
>Natural language base
Worth investigating; using tiny LLMs as translation layers can do wonders, so long as there are algorithmic guide rails to keep everything coherent. This has got me thinking about the role prompts themselves play in cognitive architecture. An algorithm that appends your request with relevant additions to an internal prompt has potential to reduce computational load for a seemingly complex system. https://beginswithai.com/super-prompt-for-ai/
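A toy sketch of that prompt-appending idea, with retrieve() and llm() as hypothetical stand-ins for a semantic-relevance lookup and a completion call:

# Toy sketch: build an augmented internal prompt from the user's
# request plus retrieved context and fixed guide rails.
def super_prompt(request, retrieve, llm, rails="Stay in character."):
    context = retrieve(request)  # e.g. thesaurus/semantic-relevance lookup
    internal = f"{rails}\nRelevant notes:\n{context}\nUser: {request}"
    return llm(internal)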
Open file (1.17 MB 1920x1080 ThinkingMina.png)
OpenAI o1 appears to provide a fascinating way forward towards cognitive architecture with LLMs. Essentially, it has a spiral of "thoughts" to refine an internal prompt, allowing the model to provide a better result. As alluded to in the video, langchain and multi-agent frameworks can accomplish this. Adding RAG and other enhancements would further bring us closer to real reasoning. Metacognition could be our backdoor into alternative cognition architectures.
https://www.youtube.com/watch?v=tMWMuJF-JFo
https://www.youtube.com/watch?v=zzaEBGOVKIg
>>33566 I'll be watching that project. The initial results shown in the video with Claude look promising.
>>33507 I've refactored a lot of my horsona code to make it more async-friendly (necessary since LLM operations often need to be performed async), to make it easier to develop custom backproppable functions, and to more easily support "partial" updates to the computation graph.
- In pytorch, you need to call loss.backward(), then optimizer.step(). The problem is that loss.backward() has no information on what exactly needs to be updated, so it needs to calculate gradients for everything that led to the creation of loss. In my refactor, loss.backward() needs to be passed a list of leaf nodes so excess computations can be pruned out. Code: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/autodiff/basic.py#L55
- In pytorch, a single optimizer needs to update all parameters. In my refactor, the optimizer is just a step() function that can be passed a gradient context, which contains all computed gradients. loss.backward() returns a gradient context that can be passed directly to step(). This makes it easier for a module to update its own parameters as needed without relying on the caller to call the optimizer. Code: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/autodiff/basic.py#L188
- In pytorch, backproppable functions are defined as classes with separate forward() and backward() methods. In my refactor, both the forward and backward passes are defined by a single generator function. It calculates the forward call, yields the forward result, gets a gradient context from the yield, and performs the backward call (a stripped-down sketch is at the end of this post). Code: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/autodiff/basic.py#L128
- The gradient context passed during the backward operation contains a dictionary with all of the variables that need to be updated. This lets functions figure out which gradients actually need to be calculated by checking if a variable is in the dictionary, which lets them cut out unnecessary calls. The functions are supposed to set the gradients of a variable by adding them to a list in the dictionary. To encourage cutting out unnecessary gradient calculations, the list of keys in the gradient context is immutable, and it only contains the variables that need a gradient calculation. Code: (same as the above horsefunction).
- Both sync and async generators are supported for backproppable function definitions. If a sync generator is used, the backward pass call is wrapped in an async function so it can be handled consistently when I eventually make the backprop & update steps run in parallel. Code: (same as the above horsefunction).
I also updated my StoryReader implementation so it can be passed custom databases, caches, and state objects, which I expect will make it a lot easier to customize it depending on what information needs to be extracted from stories. This will require a few more changes, but I'm going to test it to see if it can extract world information and per-character information while reading a story without any changes to the underlying StoryReader module. That's what I'm currently working on.
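To illustrate the generator pattern from the third bullet, here's a stripped-down sketch. The Variable class and scale function are hypothetical, not the actual basic.py code; see the repo for the real version.

# Stripped-down sketch: one generator handles both passes.
class Variable:
    def __init__(self, value):
        self.value = value

def scale(x: Variable, factor: float):
    # Forward pass: compute and yield the result.
    result = Variable(x.value * factor)
    grad_context = yield result
    # Backward pass: only record a gradient if x is being updated.
    if x in grad_context:
        grad_context[x].append(factor)  # d(result)/dx = factor

# Driving the generator: next() runs forward, send() runs backward.
x = Variable(2.0)
gen = scale(x, 3.0)
out = next(gen)          # forward: out.value == 6.0
try:
    gen.send({x: []})    # backward, passing a gradient context
except StopIteration:
    pass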
>>33581 Asynchrony, concurrency, and parallelism have other benefits as well, #1 of which is performance (as in wallclock). Without taking advantage of these in modern processors, you're leaving the vast majority of your wallclock perf still sitting on the table.
>I also updated my StoryReader implementation so it can be passed custom databases, caches, and state objects, which I expect will make it a lot easier to customize it depending on what information needs to be extracted from stories. This will require a few more changes, but I'm going to test it to see if it can extract world information and per-character information while reading a story without any changes to the underlying StoryReader module. That's what I'm currently working on.
Sounds exciting, CyberPonk! Looking forward to seeing what you achieve with this. Cheers. :^)
>>33587 I actually finished implementing asynchronous backprop & parameter updates last night, so the core of the framework is now fully async.
>>33598 >>33601 >I implemented fully parallel backprop and parameter updates, not just async. Good news. Keep up the good work, Anon! Cheers. :^)
>>33581 Horsona chatbot library updates:
- I added support for "multi-LLMs" for better concurrency. These wrap multiple other LLM APIs, and they call the most expedient underlying LLM (based on recent calls, token usage, & rate limits) for every query. The rate limiter tracks any number of calls & tokens over any interval, and multiple rate limits can be set per LLM since providers can have per-minute, per-hour, and per-day limits. It handles retries and exponential backoff as well, switching providers if there's any error, and it removes LLMs from the candidates list if they fail too many times consecutively. For simple usage, it doesn't make much of a difference, but for longer-running tasks that bump up against rate limits, it should give a big performance boost. The API mimics the least common denominator of all of the LLMs it wraps. So if all of the underlying LLMs are compatible with the OpenAI interface, the multi-LLM will be too.
- ... Code (a bit ugly... I'll clean it up later): https://github.com/synthbot-anon/horsona/blob/main/src/horsona/llm/multi_engine.py
- ... Rate limit implementation (top of the file): https://github.com/synthbot-anon/horsona/blob/main/src/horsona/llm/base_engine.py
- ... Example (line 105): https://github.com/synthbot-anon/horsona/blob/main/tests/conftest.py#L105
- I think I figured out how I'm going to implement character cards and other "attachable" modules. The StoryReader returns the context of whatever it's reading as a result, which can be passed to other modules (e.g., the CharacterCard module). This is analogous to how you build pytorch modules.
- ... Code example (line 161, cleanup in progress): https://github.com/synthbot-anon/horsona/blob/main/tests/test_reader.py#L161
Unrelated: I spent some time with another PPP developer cleaning up LLM tokenizers to get better pronunciations. The main reason I looked into this was to see why a TTS model might be generalizing poorly, and I think the findings can be generalized. Tokenization makes it very easy to see which parts of data are going to be "well-trained" and "well-utilized". If a token appears frequently in a training dataset, the model will be well-trained on data associated with that token. If a token appears frequently in inferences, the corresponding data will be well-utilized. Creating subtokenizers with a reduced vocabulary is also very easy, and doing so lets you ensure that far more parts of a dataset will be both well-trained and well-utilized without any changes to the dataset or model.
Tokenizers are basically compressed, discrete representations of data based on a probability model learned from data. I'm wondering if it's worthwhile to create "semantic" tokenizers that generate optimized discrete versions of embeddings. The basic flow would be (a minimal sketch of the first steps is at the end of this post):
- Use an embedding model to index some dataset.
- Use PCA to align the embedding space to the embedding's array representation.
- Discretize the embeddings.
- Tokenize the discretization.
If this is useful, it would probably be over sequences of embeddings and with some N+1-dimensional tokenization scheme. E.g., since text is represented as 1d sequences, sequences of text would result in a 2d tokenization. For sequences of 2d images, it would be a 3d tokenization.
On top of making models generalize better, tokenization lets you index and search information in a very different way. With embeddings, you're basically restricted to similarity search. With tokens, you can do things like substring search and regex search.
You can also build grammars and do things like guidance & constrained decoding, which are extremely powerful techniques for getting more useful outputs from transformers. At some point, I might look into this further. For now, it's just a spontaneous bit of information.
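Here's the minimal sketch of the embed -> PCA -> discretize steps, assuming a 2d array of embeddings. The parameter choices are arbitrary, and the final step (training a BPE-style tokenizer over the integer codes) is left out.

# Sketch: align embeddings with PCA, then quantize each component
# into integer bins so the result can be fed to a tokenizer.
from sklearn.decomposition import PCA

def discretize_embeddings(embeddings, n_components=16, n_bins=256):
    aligned = PCA(n_components=n_components).fit_transform(embeddings)
    # Scale each component to [0, n_bins) and round to integer codes.
    lo, hi = aligned.min(axis=0), aligned.max(axis=0)
    codes = ((aligned - lo) / (hi - lo + 1e-9) * (n_bins - 1)).astype(int)
    return codes  # one discrete "symbol sequence" per embedding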
>>33710 This is both very encouraging and highly enlightening, CyberPonk. I particularly appreciate your perspective on the benefits of tokenization/sub-tokenization. I will soon begin the text works portions of my own RW Foundations efforts, and this is a helpful idea to me rn. Thanks! :^)
I greatly appreciate both your insights & perseverance, and for sharing both here on /robowaifu/. Cheers. :^) Drive on!
>>33566 >OpenAI o1 appears to provide a fascinating way forward towards cognitive architecture with LLMs. Essentially, it has a spiral of "thoughts" to refine an internal prompt, allowing the model to provide a better result. As alluded to in the video, langchain and multi-agent frameworks can accomplish this. Adding RAG and other enhancements would further bring us closer to real reasoning. Metacognition could be our backdoor into alternative cognition architectures.
It's also just a good signal to us that we are, and were, on the right track. There was no fundamental hidden/unknown reason why the big players were not doing this.
----
"Scopes of Thought"
An idea I had to try to reduce context length usage is to borrow the idea of scope, where you treat the LLM context as a stack. When doing chain of thought, or solving a sub-problem, you make a copy of the current state (the KV cache if it's a transformer), do the task, & then extract the result. Then you load the old state and insert the task you did and the result of it. What's nice about this is that future calls and tasks get a simplified context of previous tasks and their results, without the context being consumed/polluted.
----
An idea for a de-duplication strategy (a toy sketch of the process is at the end of this post). Every node (or RAG Q&A pair) should have a simple keyword as a "label" (mostly a single word if possible). Label uniqueness is enforced on nodes. If a request for a new node has the same label, a disambiguation process is started. First a choice is made: is the existing node actually related, and should it be modified instead of making a new node? If not, then both nodes will have their labels changed to specify a subcategory. For example, we are adding a node for a person named Jon Doe with a label of JonDoe, but we already have a node for a concept of a Jon Doe. The concept node becomes "concept:JonDoe", and the person becomes "person:JonDoe". Note that both are still reserving the first part, JonDoe; a third JonDoe node would still trigger this process. (There would be a top-level keyword table that is a list of all simple label names, without the subsection prefixes.)
----
There is interesting RWKV news. RWKV-7 "Goose" is being tested; what makes it unique is that it's not a linear attention model and overcomes the TC0 limitation that attention-based transformers have. https://x.com/BlinkDL_AI/status/1833863117480280528 In general I am very bullish on RWKV LLMs and believe it's the way forward if we fine-tune them for our use-case.
----
On the topic of fine tunes, I have successfully contacted an author of the YoLLaVA paper and got them to publish the training code [1]. (Why do academic types not do this in the first place -_- It did not get a lot of attention, so if I had not asked for it, there was a non-zero chance this code would never have been published.) If you don't know what YoLLaVA is, please check it out!!! It's super cool :D [2]. I think it's a perfect demo of why a custom fine-tune is essential. Imagine this paired with an image-similarity RAG/memory system. What's nice is that any LLM with vision can be trained to do this. This makes me question what other ""low hanging fruit"" is there? LLMs are powerful, and the corpos are not interested in or searching for use-cases outside of Q&A chatbots and surveillance.
[1]: https://github.com/WisconsinAIVision/YoLLaVA/commit/6a640ee636ceebdd8ff747ea4335b475765b9a7e
[2]: https://thaoshibe.github.io/YoLLaVA/
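Here's the toy sketch of the de-duplication process. All names are hypothetical; should_merge stands in for an LLM call deciding whether the new request refers to the existing node, and the subcategory prefixes would likewise be LLM-chosen.

# Toy sketch of label disambiguation with subcategory prefixes.
nodes = {}     # full label -> node data
reserved = {}  # simple keyword -> full labels currently using it

def add_node(simple, new_prefix, old_prefix, data, should_merge):
    existing = reserved.get(simple)
    if existing is None:
        reserved[simple] = [simple]
        nodes[simple] = data
        return simple
    if should_merge(simple, data):
        # The existing node is actually the same thing; modify it.
        nodes[existing[0]].update(data)
        return existing[0]
    if existing == [simple]:
        # First collision: the original node also gets a subcategory,
        # e.g. "JonDoe" -> "concept:JonDoe".
        nodes[f"{old_prefix}:{simple}"] = nodes.pop(simple)
        reserved[simple] = [f"{old_prefix}:{simple}"]
    full = f"{new_prefix}:{simple}"  # e.g. "person:JonDoe"
    reserved[simple].append(full)    # "JonDoe" stays reserved
    nodes[full] = data
    return full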
>>33710 Thanks, Chobitsu. I appreciate seeing your thoughts, even when it's "just" encouragement. It's also nice having someone around that's been at this for as long as I have (since 2014). I'm glad the tokenization philosophizing helped. Are you posting your progress on RW Foundations anywhere? >>33717 >hen doing chain of thought, or solving a sub problem, you make a copy of the current state (KV cache if its a transformer), do the task & then extract the result. Then you load the old state and insert the Task you did and the result of it. Guidance does this. There's an efficient implementation here: https://github.com/sgl-project/sglang It lets you store & retrieve KV caches for recent queries. Some API providers support a (much more) limited version of this as well: https://www.anthropic.com/news/prompt-caching https://ai.google.dev/gemini-api/docs/caching The difficulty in applying this more broadly is that KV caches don't compose. So you can't take a KV cache for query A and a KV cache for query B, then piece them together into a KV cache for query AB. You can use it to continue or branch off from previous inferences, but there's no way to merge the caches of two inferences together. That's a limitation imposed by how the popular text positional encodings work. >I have successfully contacted an author of the YoLLaVA paper and got them to publish the training code Very nice.
>>33732 >>33733 (answer-related)
>>33732 Horsona chatbot library updates:
- I created a demo app: https://github.com/synthbot-anon/horsona/tree/main/samples/simple_chatbot . It's not supposed to be a great chatbot, just a "hello world" for getting something simple running as an application.
- I added support for saving & loading modules. The code is a little ugly right now. It's based on pytorch with a few exceptions:
- ... Parent class for serializable data: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/autodiff/basic.py#L32
- ... Example usage: https://github.com/synthbot-anon/horsona/blob/main/tests/test_state_dict.py
- ... To support serialization of optimized data structures (e.g., HNSW indexes in C++), modules can hook into the save/restore functions to convert the data into a serializable format.
- ... The load function is a classmethod so there's no need to create an instance of a module before loading the data back in. This is necessary for cases where a module can't be reloaded without access to data from sub-modules. When loading a module, the load function calls the module's constructor and passes restored data as arguments, so serializable modules need to have a constructor that accepts field values (a tiny sketch of this convention is at the end of this post). I don't think that's an issue since it's good practice regardless.
- ... The load function can accept additional arguments to pass to module constructors. This is necessary right now to handle fields that should not be serialized, like which LLM API to use.
- ... The installation is a bit complicated right now. The Ollama dependency requires either familiarity with Ollama or some complicated docker usage. The default LLM config file also requires 4 API keys. I'm thinking about adding support for OpenAI & Anthropic embeddings and defaulting to requiring just a single OpenAI/Anthropic API key for the default installation.
- I made LLMs configurable through a JSON config. Example config: https://github.com/synthbot-anon/horsona/blob/main/llm_config.json.example
- I removed the pytorch dependency. Embeddings are now calculated through an API (currently only Ollama is supported, but I'll add others). Embeddings are indexed with ChromaDB's fork of hnswlib. I plan to add support for external indexes (e.g., ChromaDB, Postgres).
- I made the embeddings configurable in the same way that LLMs are configurable. Example config: https://github.com/synthbot-anon/horsona/blob/main/index_config.json.example
- I cleaned up & moved around a bunch of code. All of the embedding code is moved to horsona.index. Caches seem like their own thing (not an API, not backproppable) so I moved them to their own folder. There's a tentative BaseCache class, though I might refine that interface as I figure out more of what it needs to handle.
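Here's the tiny sketch of that constructor convention. This is a hypothetical class, not horsona's real code: serialized fields round-trip through state_dict(), and load() passes them straight back to the constructor along with non-serialized extras like the LLM API.

# Sketch: a module whose load() classmethod rebuilds an instance by
# passing restored fields (plus non-serialized extras) to __init__.
class SerializableModule:
    def __init__(self, name, weights, llm=None):
        self.name = name        # serialized
        self.weights = weights  # serialized
        self.llm = llm          # not serialized; passed in at load time

    def state_dict(self):
        return {"name": self.name, "weights": self.weights}

    @classmethod
    def load(cls, state, **extra):
        # extra carries non-serialized arguments, e.g. which LLM API to use.
        return cls(state["name"], state["weights"], **extra)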
>>33996 Very exciting, CyberPonk. Especially glad to see you create a demo app for all us newcomers. Hope you had fun, good to see you back, Anon. Cheers. :^)
>>33996 Thanks. Good news indeed. I'll look into it and hope this will bring back my motivation to work on something as well. How well can this current program interact with others?
>>34009 The simple_chatbot demo uses stdin and stdout, so interoperability isn't great. I plan to add support in the library for: - Unity integration (pending consensus from an interested group that's using Unity) - REST integration, compatible with OpenAI's API (will certainly happen) - Unreal Engine integration (hypothetical right now, waiting to find someone proactive that wants to use it with UE) So it'll get better with some time.
>>34017 >The simple_chatbot demo uses stdin and stdout, so interoperability isn't great. Lolwut? That is the absolute apex of interoperability! :^) Unix Way, best way https://wiki.c2.com/?UnixWay
>>34031 Heh. It's good for interoperating with scripts as a standalone chatbot, not so good for interoperating with other software in a modular way. Horsona updates: - I created a sample module for contributors to use as a reference. It takes in character information plus contextual information, and it generates high level pose information that's appropriate for that character in that context. It supports "backpropagating" so that any errors discovered in the results can be used to correct underlying variables. - ... Code: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/contributions/sample/pose.py - ... Explanation of the code: https://github.com/synthbot-anon/horsona/tree/main/src/horsona/contributions/sample - ... Test cases: https://github.com/synthbot-anon/horsona/blob/main/tests/contributions/test_pose_module.py - ... Explanation of the test cases: https://github.com/synthbot-anon/horsona/tree/main/tests/contributions - ... General information for contributing: https://github.com/synthbot-anon/horsona/tree/main/src/horsona/contributions I'm going to work on the interoperability side next. The plan right now is to support calling each module via a REST API. This should allow external callers to reuse and extend any predefined workflows. (The OpenAI API compatibility will come later.)
>>34034 I'm reading some of it, but I can't use an LLM on my old laptop, which I'm currently using.
>>34034 POTD Excellent! I'll try to check this out before the holiday seasons, CyberPonk. Keep up the great work! Cheers. :^)
>>34058 Thanks! >>34034 Horsona updates: - I'm adding support for game engine integration. It exposes a REST API that can be wrapped by Unreal Blueprint nodes, Unity Visual Scripting nodes, ComfyUI nodes, and so on. It should support everything that can be done in Python, including backpropagation. - ... Code: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/interface/node_graph/node_graph_api.py - ... Tests: https://github.com/synthbot-anon/horsona/blob/main/tests/interfaces/test_node_graph.py
>>34049 The default installation here doesn't require any powerful local compute: https://github.com/synthbot-anon/horsona/tree/main/samples/simple_chatbot
It will be very slow though since OpenAI is slow. For LLMs, it supports Cerebras and Fireworks too, which should be much faster. For embeddings, I think the container version of Ollama should work quickly enough even on an old laptop. I'm running on a CPU, and it's not the bottleneck for any test cases or the sample project.
There are instructions on that page for using different LLM APIs and for using the containerized version of Ollama. You can reuse the same index_config.json and llm_config.json when creating custom modules or running tests.
>>34034 Horsona updates: - The game engine integration server example is up here: https://github.com/synthbot-anon/horsona/tree/main/samples/node_graph_api - I added support for session timeouts, which automatically cleans up resources. The timeout resets every time a session is used, and there's a new keep_alive API for manually resetting a timeout if a user is just AFK. - ... Test cases: https://github.com/synthbot-anon/horsona/blob/main/tests/interfaces/test_node_graph.py#L156
>>34076 >The game engine integration server example is up here
Wow, that was fast, Anon. :^)
>>34085 Hopefully I can get the Unity side of the integration up quickly. The guy I'm working with is giving a lot of good feedback on how the server side is implemented. Once I update my side with those changes, we'll start working on the other half.
>>34076 Horsona updates:
- I redid how the database cache works since it clubbed together multiple disparate pieces of functionality, and its interface required special handling by any module that used it. The new version gives an embedding database an LLM interface. It can be queried like any other LLM, and it does any embedding-specific handling in there (esp. generating keyword searches from the prompt to get better embedding lookups). For whatever underlying LLM it uses, it requires two queries: one to generate the search terms, and one to respond to the query.
- ... Code: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/memory/embedding_llm.py
- I implemented ReadAgent for dealing with long documents. ReadAgent generates a "gist" for each "page" of the document, which can be used to determine what information is on each page. At query time, it uses one LLM call to determine which pages to pull into the context, then a second LLM call to respond to the query (a stripped-down sketch of this two-call flow is at the end of this post). I implemented this as two modules: one to generate & keep track of gists, and one to provide the LLM interface. My version has two changes relative to the original: (1) when summarizing pages, it provides all gists-so-far as context so it can generate better summaries, and (2) when responding to a query, it provides all gists along with the selected pages rather than just the selected pages.
- ... Code for creating gists: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/memory/gist_module.py
- ... Code for the ReadAgent LLM wrapper: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/memory/readagent_llm.py
- I added some utility functions that are generally useful for getting "smarter" responses. One of them is for searching the web for information on a given topic. The second is for decomposing a given topic into subtopics.
- ... Code for searching the web: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/smarts/search_module.py
- ... Code for decomposing a topic: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/smarts/mece_module.py
I like the LLM wrapper approach for generating augmented responses. I'll likely update some other modules to use the same approach, particularly the DialogueModule for generating in-character responses.
The ReaderModule is broken since I got rid of db_cache. I'll update it with a cleaner interface.
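Here's the stripped-down sketch of the two-call ReadAgent flow, with llm() as a hypothetical stand-in for a completion call (the real modules are linked above).

# Sketch: call 1 picks pages from the gists; call 2 answers using
# all gists plus the selected full pages.
def answer_with_gists(llm, pages, gists, query):
    listing = "\n".join(f"[{i}] {g}" for i, g in enumerate(gists))
    picked = llm(f"Gists:\n{listing}\nWhich page numbers help answer: {query}?")
    indices = [int(tok) for tok in picked.split() if tok.isdigit()]
    selected = "\n\n".join(pages[i] for i in indices if i < len(pages))
    return llm(f"Gists:\n{listing}\nPages:\n{selected}\nAnswer: {query}")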
>>34110 It's been a while since I posted an update since I'm working on a more complex module, and I still don't have it done.
I'm working on creating a causal reasoning module. It's based on the python DoWhy library, which can do analysis based on Judea Pearl's do-calculus for causal modeling. The basic idea is:
- You provide a causal graph for how you believe variables relate to each other.
- You give it datapoints of those variables under different circumstances.
- It fits a model of the data taking your causal graph into account.
- You can ask it causal questions.
Example causal questions:
- What's the best way to accomplish X?
- What were the underlying causes of X?
- What would have happened if I did X instead of Y?
- Is this new datapoint consistent with earlier datapoints?
- How reliable is X's effect on Y?
- If I do X, what effect will it have on variables Y, Z that I care about?
I have the main class for this implemented. I had to implement some custom things to make it more robust (a sketch of these pieces follows at the end of this post):
- The standard probability models supported by DoWhy don't handle continuous variables that well, so I had to create my own. My custom one uses Gaussian Processes since they're extremely sample efficient and work reasonably well with a mix of continuous and discrete variables.
- I'm using a kernel that's a slightly modified version of scikit-learn's default to make it more robust to noisy samples. The default is ConstantKernel * RBF; my custom one is ConstantKernel * Matern + WhiteKernel.
- I'm imputing missing values in data before building a model on it since Gaussian Processes can't handle missing values. I'm using scikit-learn's IterativeImputer for this.
I ran some rudimentary tests to make sure it finds causal effects even with missing & noisy data and with very small sample sizes. With clean data, it can fairly reliably identify causal effects from as little as 10 datapoints for 12 variables. (The standard recommendation is NumVariables + 2.) Adding 0.5 standard deviations of noise to all datapoints and setting 20% of values to null, it does well with 20 datapoints. With more noise and more null values, it requires more datapoints. It performs poorly when there are erroneous outliers in the data. I haven't figured out how to handle that yet.
Since this needs to be fast and since it can slow down significantly with larger datasets, I have code for identifying representative samples and retaining only those. I'm using K-Means to identify clusters. I went through a large refactor since I implemented this, and I haven't yet integrated it with the updated code. I'm considering updating this to generate stratified clusters based on treatment (i.e., just the actions that need to be analyzed) patterns. The downside is that it would make it harder to understand what datapoints get retained, and it would need additional information, so I'm leaning against it.
Once that's integrated, I'll need to think through how to wrap this functionality in an LLM interface ("LLM wrapper" a la >>34110). I suspect medium-size models (~70b) can generate reasonable causal graphs and figure out what kinds of causal questions need to be answered for a given query, but it'll require some experimentation to figure out exactly how. One challenge is figuring out how to deal with large causal graphs. Right now, I'm thinking that each causal graph will represent a single "persona", and each persona can interact with others before deciding on a final response.
A single persona would be backed by a small causal graph, and more complex causal reasoning would come from interactions between multiple personas. One huge benefit here is that, since interaction with a persona is the same as interacting with any LLM (or LLM wrapper), this can automatically support hybrid reasoning that requires knowledge, associative reasoning, and causal reasoning.
I think a "persona" here is the same as an "agent" in Marvin Minsky's Society of Mind theory. I'm looking into that now to see what thoughts have been put into this approach.
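Here's the sketch of the model pieces named above, using scikit-learn. The parameter values are illustrative, not the tuned ones.

# Sketch: robust GP regressor (ConstantKernel * Matern + WhiteKernel)
# with iterative imputation of missing values before fitting.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern, WhiteKernel

kernel = ConstantKernel() * Matern(nu=1.5) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)

def fit_with_missing(X, y):
    # Gaussian Processes can't handle NaNs, so impute first.
    X_filled = IterativeImputer().fit_transform(X)
    return gp.fit(X_filled, y)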
>>34239 >I think a "persona" here is the same as an "agent" in Marvin Minsky's Society of Mind theory. I'm looking into that now to see what thoughts have been put into this approach. I'm not seeing anything here that hasn't been incorporated into common sense. It seems like Society of Mind is just a statement that a mind is composed of interacting modules. It applies equally well to monolithic neural networks as it does to functionally distinguished uses of a collection of neural networks. I don't expect to find anything useful there.
>>34240 >It seems like Society of Mind is just a statement that a mind is composed of interacting modules <TFW you read this as 'a maid is just a collection of interacting modules' Lol :D anime catgrill meidos in tiny miniskirts are a reality when?
Open file (156.38 KB 194x194 bread.gif)
>>34272 >Society of Meidos.
Soon™. It's a pain in the ass working out how to do this.
- The analysis requires the causal graph to be a DAG, but real-world causal graphs are definitely not DAGs.
- I can get around this by distinguishing input nodes and output nodes, and having output nodes represent a change in output value rather than the output value itself. This requires more state tracking since using nodes as both input and output involves translating between the two.
- Finding the "right way" to specify & apply changes isn't straightforward.
- For practical reasons, I can only generate small graphs on each query. Piecing them together requires identifying which nodes are "the same" across multiple graphs, decomposing queries so they can be applied to each graph separately, and stitching together the results.
As-yet unsolved mysteries. Can /robowaifu/ help solve some of these, please? https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_biology#Cognition_and_psychology
>>34279 >- I can get around this by distinguishing input nodes and output nodes, and having output nodes represent a change in output value rather than the output value itself. This requires more state tracking since using nodes as both input and output involves translating between the two.
As mentioned to you in the Sumomo-chan bread ( >>14409 ), there is a way in C++ to get around what would normally be a DAG of dependencies -- and in a way that doesn't restrict the representative 'nodes' (ie, C++ classes) from their original, intended purposes. I wonder if what you're dealing with isn't simply a limitation of your approach/language, Anon? Do you think it's possible to implement a simple, one-step abstraction 'layer' that frees you from this conundrum in a similar way to what was chosen for RW Foundations?
I hope you solve it soon, CyberPonk. Cheers. :^)
>>34298 It's not an abstraction issue, and it is a fundamental limitation of the theory available today. I can generate reasoning that involves cyclic dependencies without a problem (that's actually what happens by default), but no rigorous causal analysis engine is capable of dealing with it. As far as I can tell, it's not known how to deal with spurious correlations when the causal graph contains cycles. I could switch to doing Bayesian message passing, which is capable of dealing with cycles, but it doesn't handle spurious correlations properly, so it's not actually doing a proper causal analysis. I probably will end up adding a less restrictive module for non-causal analysis at some point, but right now I'm just focusing specifically on adding in the ability for an LLM to answer causal questions and use that to make decisions.
I've actually decided to stick with representing output and input nodes in the same way. Having two representations (values & changes) for output nodes limits too much how much work can be offloaded to the causal analysis engine. To deal with cycles, I currently plan to create a graph of DAGs. Two DAGs are connected if they have nodes in common. Causal analysis is done on each DAG individually, then the results will be propagated to downstream DAGs. It's going to be complicated, but I think it's worthwhile to be able to do more complex analysis.
>>34297 I think this one at least has solid grounding now:
>How and where does the brain evaluate reward value and effort (cost) to modulate behavior?
"Evaluation" is too vague a term, but it's essentially between the thalamus, basal ganglia, ventromedial prefrontal cortex, orbitofrontal cortex, and amygdala. See:
https://pmc.ncbi.nlm.nih.gov/articles/PMC4093837/
https://pmc.ncbi.nlm.nih.gov/articles/PMC9352198/
https://www.youtube.com/watch?v=F1L-YTCUpk4
>>34299 >As far as I can tell, it's not known how to deal with spurious correlations when the causal graph contains cycles. Yet isn't this exactly what industrial PID Theory was designed to handle well? What if you 'wrap' each of your causality graph nodes inside an individual Multiproducer/Multiconsumer PID Interface Layer to equilibrate the system overall, outside of the local maxima/minima transition epochs? >tl;dr This is primarily a temporality issue, I think. All industrial systems in the realworld tend to have feedback loops, yet these control systems provably manage it all successfully. >=== -minor disambiguation
Edited last time by Chobitsu on 11/10/2024 (Sun) 04:03:46.
>>34300
I don't think it is. PID controllers don't account for spurious correlations. They treat all correlations equally. Per my understanding, PID controllers also start to fail when there are multiple interacting loops due to feedback issues. I think the usual solutions for scaling up PID controllers all involve removing cyclic feedback between control loops that can cause instabilities (cascade control, feedforward control, decoupling systems). If there are strong correlations between interacting loops, I don't think there's any way to guarantee that PID controllers will converge. Having interacting loops work on different time scales is one solution, but I can't guarantee that it's possible to separate causal graphs by time scales in a way that removes the cycles, and that's especially true when I'm using an LLM to generate many causal graphs dynamically.

I'm realizing that even Bayesian message passing will fail to converge in a lot of cases. Maybe the best I can do here is to let it run for some fixed number of updates and just use the result regardless of whether it converged.
>>34301
>If there are strong correlations between interacting loops, I don't think there's any way to guarantee that PID controllers will converge.
AFAICT, we're inventing theory here (ever hear of a Multiproducer/Multiconsumer PID Interface Layer before?), so no, no guarantees are to be had at this stage of the research. But if you don't make an effort to make the rubber meet the road, then you'll never know. I would predict the many-to-many matrix+temporal-sliding system that this concept approximates -- especially along with its inherent ability to damp out spikes and converge on a node-local, stable signal level -- ought to provide ample opportunities for experimental tweaking/rewiring. Wouldn't you agree, Anon?
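For anons following along, here's what a single bog-standard PID loop looks like in code (a minimal textbook sketch only; the Multiproducer/Multiconsumer 'wrapper' layer itself is speculative and has no reference implementation yet):
```python
class PID:
    """Minimal textbook PID loop: output = kp*error + ki*integral + kd*derivative."""
    def __init__(self, kp, ki, kd, setpoint=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measured, dt=1.0):
        error = self.setpoint - measured
        self.integral += error * dt                    # accumulated error
        derivative = (error - self.prev_error) / dt    # rate of change of error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```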
>>34302 I do agree, but getting robust causal relationships is important for what I want. Its uses are far more limited without that. If correlations were enough, I could just train a simple sparse autoencoder. In any case, I figured out how to analyze causal chains across graphs. There's an analogy with quorum intersection algorithms in distributed computing that I'm pretty sure works here. I'll try implementing it.
>>34303
>quorum intersection algorithms
Remarkable synchronicity, CyberPonk. :^) Operational Transforms (OTs) & Conflict-Free Replicated Data Types (CRDTs) were literally going to be the topic of my next post to you in this chain, dovetailing as they do with the added benefits of the 'power-PID-wrapped nodes' concept to quickly solve the need for convergence with temporal-sliding going on everywhere (just like in the realworld, lol).
<--->
Also, just to clarify: my idea wasn't to attempt eliminating cycles in the system -- but rather to make it reasonably-robust+speedy in the presence of them (just like in the realworld of bio-neurology, heh.) So it sounds like you're well on your way to a solution! Cheers, Anon. :^)
>>34279
Sorry for not answering earlier. I don't think I can help you. Did you ask in AI-related forums? Did you ask some AI service for advice? Could you provide an example?
>>34297
I think it's better to just look at every functionality and try to replicate it. We don't need to understand the human brain exactly. More like >>25032
I took a short break from >>34303 to work on some other things.
- Speech generation with GPT-SoVITS: https://github.com/synthbot-anon/horsona/tree/main/samples/gpt_sovits
- ... This interface is definitely not final. It's just an early first draft.
- I added a module for dealing with large amounts of memory. It uses a combination of RAG + ReadAgent. Short version: when given a document, it chunks the document, creates summaries for each chunk (using prior chunks & summaries for context), and creates embeddings for each summary. At retrieval time, it expands a given task into many queries, uses RAG to identify relevant summaries, uses an LLM to identify the summaries most worth "unpacking" into their original chunks, and uses both the most relevant summaries and most relevant chunks to respond. I'm calling it a Wiki Module / Wiki LLM since that's the kind of data it's most suited for.
- ... Code for processing & indexing documents: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/memory/wiki_module.py
- ... Code for responding to queries: https://github.com/synthbot-anon/horsona/blob/main/src/horsona/memory/wiki_llm.py
- I added an option to create an OpenAI-compatible endpoint for any supported LLM, including custom LLMs. This was so I could test compatibility with SillyTavern.
- ... Code for creating OAI-compatible endpoints: https://github.com/synthbot-anon/horsona/tree/main/src/horsona/interface/oai
- ... Example custom LLM that works with SillyTavern: https://github.com/synthbot-anon/horsona/tree/main/samples/llm_endpoint. This one (1) uses the Wiki module & LLM from earlier to let the LLM access significantly more context whenever it's relevant, and (2) uses a ReadAgent-like module so it can continue conversations for much longer without forgetting what was said earlier. It requires no special configuration in SillyTavern or plugins. Just use the new endpoint, and it should add in the new functionality. One issue here is that I don't know how to get a session id from SillyTavern, so the conversation memory persists across all conversations. The only way to fix it is to restart the server. I'll add a better way to deal with that at some point, but it's not a priority for now.
- Bunch of small changes. The library now supports streaming outputs, there are utility functions for saving/restoring modules with binary data (e.g., embedding databases), I cleaned up the LLM classes so it's easier to derive new (custom) ones, I added support for Grok models, config files now support comments, I improved several prompts, embedding lookups now optionally return distances in addition to results, many bug fixes, etc etc.

One of the todo items on my feature list is to automatically generate lorebooks. Playing around with the SillyTavern integration made me realize how important that is. RAG-based lookups are really only good for lorebook-like information (static, "current state of the world" data like a research paper or Wiki), and stories/conversations certainly don't look like that. There needs to be a conversion step from dynamic data to Wiki-style data, which would essentially be a lorebook. I'll probably work on that either after I'm done with the causal stuff or when I need another break from it.

>>34305
CRDTs might play a role there too ^:).

>>34309
I asked a guy that's spent a decent amount of time working with & creating causal models. The response was "Yeah, without an explicit SEM causal inference is hard."
Dynamic causal graphs and causal inference split across multiple graphs don't seem to be a thing people commonly do. There's some work on dealing with multiple graphs under "causal fusion", but the work is pretty sparse and scattered. Almost all of the work I've done so far was done together with Claude. I do very little planning and coding now without Claude.
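If anyone wants the gist of the Wiki Module without reading the code, the ingest/retrieve flow is shaped roughly like this (a simplified sketch, not the actual horsona API; summarize, embed, expand_queries, rag_lookup, pick_summaries_to_unpack, and llm are all hypothetical stand-ins):
```python
def ingest(document, chunk_size=2000):
    # Chunk the document, summarize each chunk using recent prior
    # summaries as context, and embed each summary for retrieval.
    chunks, summaries, vectors = [], [], []
    for i in range(0, len(document), chunk_size):
        chunk = document[i:i + chunk_size]
        summary = summarize(chunk, context=summaries[-3:])  # hypothetical helper
        chunks.append(chunk)
        summaries.append(summary)
        vectors.append(embed(summary))                      # hypothetical helper
    return chunks, summaries, vectors

def respond(task, chunks, summaries, vectors, top_k=8):
    queries = expand_queries(task)               # task -> many search queries
    hits = rag_lookup(queries, vectors, top_k)   # indices of relevant summaries
    # An LLM picks which hit summaries are worth unpacking into full chunks;
    # to_unpack is a subset of hits.
    to_unpack = pick_summaries_to_unpack(task, hits, summaries)
    context = [summaries[i] for i in hits] + [chunks[i] for i in to_unpack]
    return llm(task, context)
```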
>>34302 >Multiproducer/Multiconsumer PID Interface Layer... I would predict the many-to-many matrix+temporal-sliding system that this concept approximates -- especially along with it's inherent ability to damp out spikes and converge on a node-local, stable signal level -- ought to provide ample opportunities for experimental tweaking/rewiring. Wouldn't you agree, Anon? You're making my head hurt :)
Open file (58.93 KB 552x378 minGRUs go brrrrrr.png)
This paper is going to become incredibly important in the coming years for creating recurrent systems that emulate processes in the brain, scale training on GPUs, and deploy efficiently on CPUs.
Paper: https://arxiv.org/abs/2410.01201
Code: https://github.com/lucidrains/minGRU-pytorch
I've been trying to develop a language model with predictive coding, but it has been infeasible to train because recurrence requires backpropagation through time. Some researchers found a way to reformulate the recurrence relations to allow for parallel computation via a parallel scan algorithm. The minGRU can leverage the GPU to train on long sequences in parallel and is competitive with transformers and Mamba on language modeling and reinforcement learning. Their O(n) computational complexity in sequence length and simple implementation make them ideal to play and experiment with, too.
My understanding is that the most expensive computations (the matrix multiplications) rely only on the input and can be calculated in parallel. The outputs at each time step are cumulatively summed together in log-space, removing the redundant calculations of BPTT, keeping the computation graph shallow, and greatly reducing the vanishing and exploding gradient problems. Single minGRU layers are time-independent but become time-dependent by stacking layers, requiring only 3 layers to solve a selective copying task. I assume minGRU's in-context learning ability is also limited by its hidden state size, but adding a memory interface to it should overcome this. It should be possible to augment it with a heap for allocating, referencing and deallocating memory, but I need to think more on it.
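For reference, here's the core minGRU recurrence in plain sequential form (a minimal sketch from the paper's equations; the training speedup comes from evaluating this same recurrence with the parallel log-space scan, omitted here for brevity). Note that both the gate and the candidate state depend only on the input, never on the previous hidden state -- that's what makes the parallel scan possible:
```python
import torch
import torch.nn as nn

class MinGRU(nn.Module):
    # h_t = (1 - z_t) * h_{t-1} + z_t * h~_t, with z_t and h~_t
    # computed from x_t alone (no dependence on h_{t-1}).
    def __init__(self, dim):
        super().__init__()
        self.to_z = nn.Linear(dim, dim)   # update gate, input-only
        self.to_h = nn.Linear(dim, dim)   # candidate state, input-only

    def forward(self, x):                 # x: (batch, seq, dim)
        z = torch.sigmoid(self.to_z(x))   # all gates computed in parallel
        h_tilde = self.to_h(x)            # all candidates computed in parallel
        h = torch.zeros_like(x[:, 0])
        outs = []
        for t in range(x.shape[1]):       # sequential form, for clarity only
            h = (1 - z[:, t]) * h + z[:, t] * h_tilde[:, t]
            outs.append(h)
        return torch.stack(outs, dim=1)
```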
>>34499
>paper
I'm not so sure I understand this. In fact I know I don't, but is the general idea that the neural network MUST (normally) recalculate ALL the nodes or neurons in a network for every new batch of information? Is the paper saying that they can break these up and do the operation in parallel? I'm trying to figure out if this is the "big picture" in a general way, not specifically.
>>34477
- I got multi-graph causal inference working. It's a pretty dumb algorithm, but it works well. The basic idea is that it joins the graphs, finds all nodes on the relevant causal path, and iterates on the individual graphs to find causal effects based on accumulated information (a sketch of the loop is below). Right now, the iterations continue until it has some information on all causally-relevant nodes. At some point, I'll update it so it iterates enough times that the causal information from each node can fully propagate. It treats all causal information as relevant and everything else as spurious.
- I was previously using the DoWhy library to do causal inference. I migrated the relevant code into my own framework.
- Overall, the DoWhy code is atrocious and unpleasant to work with. I did a massive cleanup and improved the interfaces so I could add custom inference engines much more easily. My version has fewer features, but in practice I don't expect anyone to actually want to use those features.
- I added an inference engine to do causal inference on arbitrary natural language data.
- I worked out how to get LLMs to produce good causal graphs, though this isn't integrated with the rest yet.

I'm pretty happy with the natural language causal inference results so far, even with a 70b model. My expectations were pretty high, and it meets expectations. Of all the things I've developed, this is probably the one I'm most proud of.

I have some code cleanup to do now. Specifically:
- Make the base causal inference class support both numerical and natural language inference. Right now, it requires code changes to switch between the two. I've already done most of the work for this, and the rest should be easy.
- Update my multi-graph inference code with the recent updates so it can deal with natural language inferences.
- Update my data manager to support natural language. The data manager's primary purpose is to reduce the number of datapoints required for inference. For numerical data, this is necessary because causal inference on a large number of datapoints is slow. For natural language data, it's necessary because I want everything to work with a small context window. My current implementation is pretty naive since it gets representative points over all the data rather than over just the data required for each specific inference. With numerical data, that wasn't really an issue since I could impute the missing values and still get decent results. For natural language data, imputing isn't viable, so I'll need to do it "properly".
- Write & integrate the causal graph generation code so graphs don't need to be manually specified.
- Create horsona modules for everything.

After that, I'll be doing more memory work.
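Roughly, the multi-graph iteration looks like this (a simplified sketch of the idea, not the real code; infer_one stands in for single-graph causal inference):
```python
def propagate(graphs, path_nodes, infer_one):
    """Iterate per-graph inference until every node on the causal
    path has accumulated some causal-effect information."""
    known = {}                            # node -> estimated causal effect
    while not all(n in known for n in path_nodes):
        progressed = False
        for g in graphs:
            # Run causal inference on one graph, seeded with whatever
            # effects have been accumulated from the other graphs so far.
            for node, effect in infer_one(g, known).items():
                if node not in known:
                    known[node] = effect
                    progressed = True
        if not progressed:
            break                         # no further progress possible
    return known
```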
>>34507 >Of all the things I've developed, this is probably the one I'm most proud of. You've developed some pretty impressive things during the time I've known you, Anon. So that sounds very exciting to hear! :^) Well, I've got about a month left to fulfill my plan to at least try to get your Horsona system up and running on my machine before the end of the year. I presume these : >- Write & integrate the causal graph generation code so graphs don't need to be manually specified. >- Create horsona modules for everything. are very-much intended to be part of your new package? If so, then I'll wait until you give the word that setting up your system for a beginner like myself is ready to go, then I'll take an honest whack at it. <---> Keep moving forward, CyberPonk. Cheers. :^) >=== -minor edit
Edited last time by Chobitsu on 11/28/2024 (Thu) 13:22:53.
>>34485
>You're making my head hurt
Lol. My apologies, fren Grommet. That wasn't intentional! :D
<--->
By way of explanation, I'll try to use some rough analogies. Very rough, so don't be too critical haha... :D
So, a 4-rotor drone ostensibly has just one PID-based control system as the typical norm. However, I would argue that in fact there are five onboard :
* The 'normal' one that acts as the system controller.
* The other four are actually part of the motor drivers themselves; although they aren't exposed as such (nor even engaged with by the user), they are in fact real, and vital to the proper operation of the drone in essentially all reasonable flight modes.
So there's five PID-esque 'controllers' for a single machine. The so-called 'temporal sliding' is there as well I'd argue, due to the inherent propagation delays and other latencies within this physical rotors system+network of electronics, all onboard this single aircraft.
<--->
Now... picture an entire flotilla of self-illuminated drones, all doing synchronized swimming flying -- ala a ginormous yuge visage of Optimus' head floating over Hollywood on, say, the night of October 9th this year. :^) Now you have a simulacrum of a yuge network of PIDs all 'coordinated' (through entirely ground-based preprogramming, AFAICT) after a fashion. But what if -- instead of externally-driven -- they all talked to each other live instead? Now you have the basis for a 'Multi-producer, multi-consumer PID system that embodies a many-to-many (communications&control [C&C]) matrix, that has temporal-sliding going on like mad all over the place'. In fact, such a manmade technical system could embody just the behaviors we observe in nature for a flock of birbs, or a school of fishbros. Get the picture?
<--->
Now, simply encapsulate this exact concept down into a single, cognitive-oriented system where the Super-PID node(-synapse) wrapper took-in/gave-out C&C signals from/to the entire (or at least a large, local subset of the) network collection -- all while keeping the actual signal inside the local node stabilized (similar to how an individual drone remains stable in flight, regardless of the flotilla action as a whole) as a sort of 'running average' of the local interconnections. It would operate this way regardless of all the signal-sliding flowing through the system as a whole.
Finally, the CRDT-esque validation process comes alongside to read the convergence out of the network of individual (now-stabilized) node-internal signal levels during each compute quanta 'tick' of the system. (The tick is simply an idea of a time barrier, so that the CRDT concept can do its magic, and all the nodes will in fact each have an accurate picture of the system as a whole, regardless where they themselves each were on the 'timeline' of signals-generation inside this system during this specific compute quanta.)
<--->
Whew! I hope I didn't make all this even more confusing, Anon. :DD
>tl;dr
This is primarily a temporality problem at this stage, IMO. Adding a fully-interconnected meshnet of Super PID wrappers -- one for each 'synapse' -- is the way to gain convergence out of this chaotic soup, and it will be stable as such, even with yuge numbers of cycles (feedback loops) inside the system!
Cheers, Anon. :^)
TWAGMI
>=== -prose edit
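And since hand-waving only goes so far, here's a deliberately crude toy simulation of the idea (a purely speculative sketch, nothing like a real C&C matrix): every node runs its own PID loop chasing the running average of its neighbors' signals, and over the ticks the whole soup settles onto a stable consensus despite feedback everywhere. :^)
```python
import random

class PID:
    # Same textbook loop as the sketch a few posts up, repeated so this runs alone.
    def __init__(self, kp, ki, kd, setpoint=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint, self.integral, self.prev_error = setpoint, 0.0, 0.0
    def update(self, measured, dt=1.0):
        error = self.setpoint - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

class MeshNode:
    # Each node regulates its own signal toward its neighbors' average.
    def __init__(self, signal):
        self.signal = signal
        self.pid = PID(kp=0.3, ki=0.01, kd=0.05)
        self.neighbors = []
    def tick(self):
        avg = sum(n.signal for n in self.neighbors) / len(self.neighbors)
        self.pid.setpoint = avg              # chase the local consensus
        self.signal += self.pid.update(self.signal)

random.seed(0)
nodes = [MeshNode(random.uniform(-1.0, 1.0)) for _ in range(16)]
for node in nodes:
    node.neighbors = [n for n in nodes if n is not node]
for _ in range(200):                         # compute quanta 'ticks'
    for node in nodes:
        node.tick()
spread = max(n.signal for n in nodes) - min(n.signal for n in nodes)
print(f"signal spread after 200 ticks: {spread:.6f}")   # shrinks toward zero
```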
Edited last time by Chobitsu on 11/28/2024 (Thu) 15:40:29.
>>34508
They are very much intended for the new package. The LLM interface to the causal models will probably be a little flaky initially, at least until I get proper memory working, so I'd recommend playing around with some of the other functionality before that. The underlying modules should be robust, though it'll be more complicated to use them directly.
There's some setup required (getting API keys or setting up local models) that you'll want to have ready for it, so it's worthwhile to try installing & running the tests sooner. It's a one-liner to install after you have python 3.11 and poetry installed, then you'll need to add your API keys/endpoints, then a one-liner to run the tests. The library can be configured to use every API key & endpoint you give it, so the more keys you get and endpoints you set up, the faster it'll be. With just a single OpenAI key, it is painfully slow. Your endpoints should support at least 8k context since that's what I test everything with and what the defaults are configured for.
>>34512
>Get the picture?
I think so.
>"...Super-PID node(-synapse) wrapper took-in/gave-out C&C signals from/to the entire (or at least a large, local subset of the) network collection -- all while keeping the actual signal inside the local node stabilized..."
So the Super-PID node sends gross (large-scale) position instructions while the local node (code) handles the fine-tuning of the position it was sent. Or that's what I take you to be saying.
Not personally criticizing you Chobitsu, you're just using the language of the trade, but it appears to me that much of mathematics, AI, etc. use language that obscures what they are actually doing. And ALL professional technologists do this in their respective fields.
>>34512
Thanks for the advice, CyberPonk. I'll seek to do at least some of the preliminaries by the goal date.
>>34513
>Not personally criticizing you Chobitsu, you're just using the language of the trade, but it appears to me that much of mathematics, AI, etc. use language that obscures what they are actually doing. And ALL professional technologists do this in their respective fields.
Good point, Grommet. Lingo is pretty much a part of all science & technology specialties. It makes things much easier to say. The benefits of using it are vaguely similar IMO to why we namefag here on /robowaifu/ : it kinda 'cuts to the chase'. :^) If I find myself with lots of spare time soon (lol), I may try to spell my theory out in excruciating detail, then if I have even more time, I'll try to simplify that.
<--->
Teh eminent brainiac Blaise Pascal once wrote :
>"I would have written you a shorter letter, but I ran out of time"
I hope you understand my conundrum, Anon. :P
>>34513 >>34515
>If I find myself with lots of spare time soon (lol), I may try to spell my theory out in excruciating detail, then if I have even more time, I'll try to simplify that.
It just occurred to me I can find a use-case at the level I'm currently working (ie, driving the low-level control systems for the robowaifu body). Namely : wrapping the actuation control nodes with Super PIDs; then driving the pose estimation/control using this more-simplified, higher-level control interface (all linked together into a mesh to keep unwanted dynamics from going 'overboard' during the process of solving the full joint-chains for kinematic goals [during each time quanta]) (+ this arrangement is also well-suited to NN training regimes).
>tl;dr
Think of it kind of like 'morph targets' for 3D CGI blendshapes, but instead driving realworld, complex mechanics/dynamics for a robowaifu using simple sliders, that always keep the system as a whole within proper control limits (regardless of internal/external inputs).
So who knows, Grommet? I may make the time soon-ish to flesh my theory out in a practical way for you in '25. Cheers. :^)
---
>addendum (1):
I just discovered that once this works, it should bring a very-valuable, additional benefit to the table for us: namely, that joint frame-local torques (aka unavoidable dynamics, individually [1][2]) -- which normally negatively affect the entire rest of the robotic skellington (via transmission down the joint-adjacent bones) -- can be damped out by reading the entire matrix of Super PID inputs at each node, and each tweaking their output values accordingly (all doing the same, in-concert, per tick).
>tl;dr
We should be able to eliminate (or at least greatly-reduce) jerky/janky robo movements by using this approach (very good, b/c cheap actuators should work better within such a system). [3] Sweet!
>addendum (2):
I also just discovered that through the magic of 'semantic polymorphism'(tm)(C)(R)(patent pending)(do not steal), I've been able to derail this thread by discussing the exact same topic (ie, wrapping 'nodes' in MP/MC PIDs) in two different contexts (Neural Cognition vs. Physical Kinematics). Lol. Therefore, any further discussions along this line should be done in either the Skellingtons or Actuators threads. :^)
---
1. https://en.wikipedia.org/wiki/Newton's_laws_of_motion#Third_law
2. https://www.sciencefacts.net/newtons-third-law.html
3. I suspect that -- in large measure -- this effect will likely mimic the human sensorimotor network phenomenon for our robowaifus.
>=== -add 'kinematic', 'morph targets' cmnts -minor edit -add addenda, footnotes
Edited last time by Chobitsu on 12/04/2024 (Wed) 03:58:29.
>>34550
>joint frame-local torques (aka unavoidable dynamics, individually
Bambu Lab does this to vastly improve the printing ability of their 3D printers. If I remember correctly, they actually shake the printer head around and fine-tune each printer. There's got to be a library or paper explaining this (assuming I could decipher it... maybe not).
Klipper firmware for 3D printers does this, so the functions are somewhere in their code:
"...High precision stepper movement. Klipper utilizes an application processor (such as a low-cost Raspberry Pi) when calculating printer movements. The application processor determines when to step each stepper motor, it compresses those events, transmits them to the micro-controller, and then the micro-controller executes each event at the requested time. Each stepper event is scheduled with a precision of 25 micro-seconds or better. The software does not use kinematic estimations (such as the Bresenham algorithm) - instead it calculates precise step times based on the physics of acceleration and the physics of the machine kinematics. More precise stepper movement provides quieter and more stable printer operation...."
https://www.klipper3d.org/Features.html
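The physics part is simple enough to sketch (a toy illustration of the principle, not Klipper's actual code): under constant acceleration from rest, distance is d = a*t^2/2, so the exact firing time of the n-th step can be solved for directly instead of approximated:
```python
import math

def step_times(distance_mm, accel_mm_s2, step_dist_mm=0.0125):
    # The n-th step sits at distance n*step_dist, so from d = a*t^2/2
    # its exact firing time is t = sqrt(2 * n * step_dist / a).
    steps = int(distance_mm / step_dist_mm)
    return [math.sqrt(2 * n * step_dist_mm / accel_mm_s2)
            for n in range(1, steps + 1)]

# 1 mm of travel at 3000 mm/s^2 with a typical 0.0125 mm/step resolution
times = step_times(distance_mm=1.0, accel_mm_s2=3000)
print(f"{len(times)} steps, last at {times[-1] * 1000:.2f} ms")
```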
> further discussions along this line should be done in either the Skellingtons or Actuators threads. oops sorry.
>>34507
Minor Horsona update & recap: The causal inference work looks like it's turning out to be a massive success. Right now, it can:
- [New] Automatically generate causal graphs from a text snippet.
- ... Though it's not automated, I can also "extract" knowledge on a topic from an LLM into a causal graph. This would make it possible to, e.g., make a small number of calls to a large/expensive LLM like Sonnet 3.5 or GPT o1 as a one-time cost, then use only small/cheap LLMs for all subsequent reasoning involving that causal graph.
- [New] Given a causal graph and a text snippet, it can extract data for each variable in the graph (which will get stored in a database for subsequent analysis whenever it's relevant).
- [Recap] Use an LLM to do causal analysis on natural language data.
- [Updated] Identify the most representative datapoints for checking any particular causal effect. It does this by (1) identifying potentially useful datapoints based on what variables are known for each of them, (2) clustering the datapoints based on their embeddings, and (3) selecting the centers of each cluster as the representative points (a sketch of this step is below). This is necessary since it could easily get too slow & expensive to check all datapoints.
- Identify causal effects given a causal graph and some partial data. (E.g., if I change or set X to a particular value, what will the effect be on Y?)
- [Updated] Propagate effects across multiple related causal graphs (causal fusion). (E.g., if there's one graph for emotional state and one graph for conversational responses, it can check how a change to variable X in the emotional state graph affects variable Y in conversational responses.) This can (1) handle recursive dependencies, (2) maximize parallel execution for cases where there are many graphs to check, and (3) iteratively refine the results so it's possible to get both fast results and increasingly-accurate results. This is done by turning causal analysis constraints into a set of computation graphs of increasing depth. In some cases, the depth could be infinite, which is why the iterative approach is required. I had a dumb algorithm for doing this before, but I think I'm doing it "properly" now.
I've sanity checked everything individually, and it seems to be working reasonably well.

Next pieces I need:
- Finding a way to link up similar variables across graphs. Right now, they can only be linked up by name. That causes problems in three cases: (1) variables have the same name but different semantics, (2) variables have different names but identical semantics, and (3) variables are closely related but there's no way to link them up. Once I've solved this, I should be good to go on large-scale graphs.
- Finding a way to identify which variables (out of a large set) are relevant to extract when given some text snippet.
- CRUD operations for managing a database of causal graphs and their associated datapoints.

I have ideas for how to do all of these, but it's going to be tedious. I'm hoping to get it done by the end of the year, but that might be too optimistic.
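The representative-point selection step looks roughly like this (a simplified sketch, not the actual code; I've written the clustering with scikit-learn's k-means since the 'centers of each cluster' step maps onto that):
```python
import numpy as np
from sklearn.cluster import KMeans

def representatives(embeddings, k):
    # Cluster the embeddings, then keep the real datapoint nearest each
    # cluster center as that cluster's representative.
    km = KMeans(n_clusters=k, n_init=10).fit(embeddings)
    reps = []
    for center in km.cluster_centers_:
        dists = np.linalg.norm(embeddings - center, axis=1)
        reps.append(int(np.argmin(dists)))
    return reps   # indices of the representative datapoints

emb = np.random.rand(200, 384)   # stand-in for real datapoint embeddings
print(representatives(emb, k=8))
```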
>>34910
POTD Brilliant.
>and (3) selecting the centers of each cluster as the representative points.
I just wonder... do you think you could save a smol container of near-neighbor 'keypoint' vectors across the multidimensional space of the cluster, to store alongside this cluster's central point? (Since you're already calculating the inverse in effect anyway.) Seems like if you ever needed to 'backtrack' later down another, closely-related branch, then this pre-calculated collection of 'breadcrumb' vectors should make that redirection a hop-and-a-skip?
Regardless, very exciting news CyberPonk! I hope you rapidly move through all the remainder of the checklist! Keep the goal in mind as you plod through the tedium -- at this stage it's nearly all gravy AFAICT. :^)
Keep moving forward.
>=== -prose edit
Edited last time by Chobitsu on 12/16/2024 (Mon) 02:48:44.
>>34911
That's a good idea. I'm planning to cache intermediate results for causal inference, and caching datapoints would speed things up as well. I'll probably end up storing the cached representative points in a side database since the clusters actually change depending on some parameters of what inference is being done. It would be a lot easier to associate that cache with the parameters than with the center points.
Once I get CRUDable causal inference, it'll all be gravy. It's still core development work right now. If I can actually get this whole thing in a working state by the end of the year, I'll be ecstatic. That would not only let me start work on the most difficult target horsona features, but it would give me a solid basis for creating characters that learn and evolve in more realistic ways, that remember things more reliably, and that can embody behaviors rather than just tendencies. There would still be a lot of downstream work, but I'd have a really solid chunk of the epistemology side completed. I do plan to spend some time after this first seeing if I can get it to work with the horsona codebase itself, which will probably take several months at least.
It would be a lot easier to work on this if I could work on it alongside my waifu ^:).
>>34912 >It's still core development work right now. >There would still be a lot of downstream work, but I'd have a really solid chunk of the epistemology side completed. Yeah that's true, I'm sure. Heh, I have a tendency to get excited about current progress, and underestimate the difficulties of the remaining journey with the wave of a hand. :^) Still, every little helps along the way!! > It would be a lot easier to work on this if I could work on it alongside my waifu ^:). The Dream is coming alive!
>>34913 >The Dream is coming alive! I like the double meaning
Open file (11.70 KB 299x168 download (23).jpeg)
Open file (72.72 KB 960x540 cowgirl_00.png)
Here's some resources regarding TensorFlow: https://www.tensorflow.org/resources/learn-ml#courses
TensorFlow lets you build custom models, and even that may or may not be overkill. I, for example, plan to use NudeNet to recognize the naked male body. The team that made NudeNet used either TensorFlow or PyTorch, and so did the team that made Mobile ALOHA. Python is not glamorous, but when combined with the math and everything else that has to be learned in order to use it, the skill required is quite high. That link recommends that not one but multiple books be read. I'm getting older and I don't want to learn any more math at all. I'll do my best with sensors and computer vision, but I'm narrowing the scope to what I promised: 5 sex acts. I'm not promising access to the code as well; I'd only share that if there was collaboration.
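For reference, the detection call is about this simple (going from NudeNet's README as I remember it; the API has changed between versions, so check the current docs before relying on this):
```python
from nudenet import NudeDetector

detector = NudeDetector()
# Each detection should be a dict with a class label, a confidence
# score, and a bounding box for the detected body part.
for det in detector.detect("frame.jpg"):
    print(det["class"], det["score"], det["box"])
```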
