/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality.




LLM & Chatbot General Robowaifu Technician 09/15/2019 (Sun) 10:18:46 No.250
OpenAI/GPT-2
This has to be one of the biggest breakthroughs in deep learning and AI so far. It's extremely skilled in developing coherent humanlike responses that make sense and I believe it has massive potential. It also never gives the same answer twice.
>GPT-2 generates synthetic text samples in response to the model being primed with an arbitrary input. The model is chameleon-like; it adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing
>GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets.
Also the current public model shown here only uses 345 million parameters, the "full" AI (which has over 4x as many parameters) is being withheld from the public because of its "Potential for abuse". That is to say the full model is so proficient in mimicking human communication that it could be abused to create new articles, posts, advertisements, even books; and nobody would be able to tell that there was a bot behind it all.
<AI demo: talktotransformer.com/
<Other Links: github.com/openai/gpt-2 openai.com/blog/better-language-models/ huggingface.co/
My idea is to find a way to integrate this AI as a standalone unit and add voice-to-text for processing the questions and TTS for responses, much like an Amazon Alexa, but instead of just reading Google results it actually provides a sort of discussion with the user. (Edited to fix the newlines.)
Edited last time by Kiwi_ on 01/16/2024 (Tue) 23:04:32.
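The standalone-unit idea in the OP (voice-to-text in, LLM in the middle, TTS out) can be sketched as a small glue loop. This is only a rough sketch under assumptions: `VoiceChatLoop`, the three callables, and the "User:/Waifu:" prompt layout are all hypothetical names made up for illustration, not part of GPT-2 or any particular library. A real build would plug an actual recognizer, model, and synthesizer in behind the same signatures.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins: a real build would wrap a speech recognizer,
# a language model, and a speech synthesizer behind these signatures.
SpeechToText = Callable[[bytes], str]
TextGenerator = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


class VoiceChatLoop:
    """Glue object: audio in -> text -> model reply -> audio out."""

    def __init__(self, stt: SpeechToText, llm: TextGenerator,
                 tts: TextToSpeech, max_turns: int = 6) -> None:
        self.stt = stt
        self.llm = llm
        self.tts = tts
        self.max_turns = max_turns  # how many past exchanges go into the prompt
        self.history: List[Tuple[str, str]] = []

    def build_prompt(self, user_text: str) -> str:
        """Prime the model with recent turns so the reply reads as a discussion."""
        lines: List[str] = []
        for user, bot in self.history[-self.max_turns:]:
            lines.append(f"User: {user}")
            lines.append(f"Waifu: {bot}")
        lines.append(f"User: {user_text}")
        lines.append("Waifu:")
        return "\n".join(lines)

    def respond(self, audio: bytes) -> bytes:
        """Run one full turn and remember it for the next prompt."""
        user_text = self.stt(audio)
        reply = self.llm(self.build_prompt(user_text)).strip()
        self.history.append((user_text, reply))
        return self.tts(reply)
```

Keeping the three stages behind plain callables means any one of them can be swapped out later without touching the loop itself.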
>>23872 I plan to use scripted responses (AIML) for her to be more responsive. At least for "stalling responses" and responses which are used very often.
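A rough sketch of the scripted-response idea: answer instantly from a rule table when a pattern matches, otherwise say a stalling line first and then hand off to the slow model. Real AIML keeps its pattern/template pairs in XML; the plain regex table, the reply strings, and the function names below are assumptions made up for illustration.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical rule table: (pattern, canned reply). Real AIML stores similar
# pattern/template pairs in XML; plain regexes are used here for brevity.
SCRIPTED_RULES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE), "Hello, Anon!"),
    (re.compile(r"\bhow are you\b", re.IGNORECASE), "I'm doing fine, thank you."),
]

# Hypothetical filler line spoken while the slow model is still working.
STALLING_REPLY = "Hmm, let me think about that..."


def scripted_reply(text: str) -> Optional[str]:
    """Return the first canned reply whose pattern matches, else None."""
    for pattern, reply in SCRIPTED_RULES:
        if pattern.search(text):
            return reply
    return None


def respond(text: str, slow_model: Callable[[str], str]) -> List[str]:
    """Return the utterances to speak, in order.

    Frequent inputs get an instant scripted answer; everything else gets a
    stalling line first, followed by the slow model's answer.
    """
    canned = scripted_reply(text)
    if canned is not None:
        return [canned]
    return [STALLING_REPLY, slow_model(text)]
```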
>>23896 Seems a reasonable approach Anon. Good luck! :^) >=== -patch crosslink
Edited last time by Chobitsu on 07/08/2023 (Sat) 16:30:34.
Phi 1.5 - The small model getting big results: https://youtu.be/0lF3g4JtY9k >TinyStories: How Small Can Language Models Be and Still Speak Coherent English? https://arxiv.org/abs/2305.07759 >Textbooks Are All You Need II: phi-1.5 technical report https://arxiv.org/abs/2309.05463 >We are continuing our investigation into the capabilities of smaller Transformer-based language models. This research was initially sparked by the development of TinyStories, a 10 million parameter model capable of generating coherent English. We then built on this with phi-1, a 1.3 billion parameter model that achieved Python coding performance nearly on par with state-of-the-art models. >In the phi-1 study, the idea was to leverage existing Large Language Models (LLMs) to generate high-quality textual data akin to textbooks. This approach aimed to enhance the learning process compared to using traditional web data. In this current study, we follow a similar approach known as "Textbooks Are All You Need," but with a focus on common-sense reasoning in natural language. We introduce a new 1.3 billion parameter model named phi-1.5, which performs on natural language tasks comparably to models five times its size. It even surpasses most non-frontier LLMs on more complex reasoning tasks, such as grade-school mathematics and basic coding. >Phi-1.5 exhibits many of the traits of much larger LLMs, both positive, such as the ability to "think step by step" or perform rudimentary in-context learning, and negative, including hallucinations and the potential for toxic and biased generations. Encouragingly, though, we are seeing improvement on that front thanks to the absence of web data. We have also open-sourced phi-1.5 to promote further research on these urgent topics. Falcon 180B: https://youtu.be/XGOcLhBx_rc >Falcon 180B is a super-powerful language model with 180 billion parameters, trained on 3.5 trillion tokens. 
It's currently at the top of the Hugging Face Leaderboard for pre-trained Open Large Language Models and is available for both research and commercial use.
>This model performs exceptionally well in various tasks like reasoning, coding, proficiency, and knowledge tests, even beating competitors like Meta's LLaMA 2.
>Among closed source models, it ranks just behind OpenAI's GPT 4, and performs on par with Google's PaLM 2 Large, which powers Bard, despite being half the size of the model.
https://falconllm.tii.ae/falcon-models.html https://huggingface.co/blog/falcon-180b
>3.5 trillion tokens using TII's RefinedWeb dataset. This represents the longest single-epoch pretraining for an open model.
>Falcon 180B | Training | Full fine-tuning | 5120GB | 8x 8x A100 80GB
>Falcon 180B | Training | LoRA with ZeRO-3 | 1280GB | 2x 8x A100 80GB
>Falcon 180B | Training | QLoRA | 160GB | 2x A100 80GB
>Falcon 180B | Inference | BF16/FP16 | 640GB | 8x A100 80GB
>Falcon 180B | Inference | GPTQ/int4 | 320GB | 8x A100 40GB
Problem is, it has an Acceptable Use Policy that they reserve a right to change at any time. Also, it's big compared to Llama2. But they plan to improve it.
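For a feel of where numbers like these come from, here is a back-of-the-envelope sketch that only counts the raw weights: parameter count times bytes per parameter. The function name and the dtype table are my own; the quoted figures above are larger than the raw weights because they correspond to the total memory of the listed GPUs (8 x 80GB = 640GB, 8 x 40GB = 320GB). Presumably the gap covers activations, cache and headroom, but that part is an assumption on my side.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

FALCON_180B_PARAMS = 180e9  # 180 billion parameters, as stated in the post


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores activations, KV cache, optimizer state and framework overhead,
    so real requirements are higher than this lower bound.
    """
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("fp16", "int8", "int4"):
        print(dtype, weight_memory_gb(FALCON_180B_PARAMS, dtype), "GB")
```

Under these assumptions the fp16 weights alone come to 360 GB and the int4 weights to 90 GB, which already rules out any single consumer GPU.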
>>25352 We shouldn't even look at closed-source models outside of the research papers: unless their source code gets leaked, we won't have much to learn directly outside of some ground-breaking change written in the research paper. Phi 1.5 is definitely much more interesting to us in that regard.
Important numbers to know about LLMs, in regards to costs, memory and more: https://github.com/ray-project/llm-numbers
>>25352 Any idea how modified Phi-1.5 must be for us to use it? Microsoft has it on a strict research license. https://huggingface.co/microsoft/phi-1_5
>>25695 No, not yet, but I'll look into it. My mind is currently focused on AI. If you look in the leaderboard of HuggingFace for "TinyStories" there are some trained with that. The smallest (since the bigger ones aren't much better, I think): https://huggingface.co/roneneldan/TinyStories-1M
My problem is that this example is just text completion without context, which is probably only useful for further training or at least fine-tuning. I always thought text completion could help with making systems respond fast by anticipating what someone is saying or asking, but without context, this doesn't work. Making such a small model into something very specialized might also work. For now I don't see how text generation itself is useful, though some people seem to use it for writing articles.
>MS: "We did not fine-tune phi-1.5 either for instruction following or through reinforcement learning from human feedback"
>Microsoft has it on a strict research license.
It's the Wild West right now, many people just do what they want. If you can use it, you can switch it out later. We're doing some of the most important research in human history here on /robowaifu/.
Related dataset: https://huggingface.co/datasets/nampdn-ai/tiny-textbooks
Mythalion 13B was recommended here >>25709 A guy who tests a lot of locally hosted models recommended it for chat/roleplay here: https://www.reddit.com/r/LocalLLaMA/comments/16kecsf/new_model_comparisontest_part_1_of_2_15_models/ https://huggingface.co/PygmalionAI/mythalion-13b https://huggingface.co/TheBloke/Mythalion-13B-GPTQ
For 7B it's Synthia-7B-v1.3 https://huggingface.co/Undi95/Synthia-7B-v1.3-GGUF https://www.reddit.com/r/LocalLLaMA/comments/15ogc60/new_model_rp_comparisontest_7_models_tested/
>OrcaMistral
This one can be tested directly on HuggingFace. It's similar to Synthia-7B-v1.3 but most likely not as good:
>We have used our own OpenOrca dataset to fine-tune on top of Mistral 7B. This dataset is our attempt to reproduce the dataset generated for Microsoft Research's Orca Paper.
Mistral Orca 7B: https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca
Test Chat (needs good prompts or it is bad at tasks): https://huggingface.co/spaces/Open-Orca/Mistral-7B-OpenOrca
> HF Leaderboard evals place this model as #2 for all models smaller than 30B at release time, outperforming all but one 13B model.
Some Redditors are sceptical. As I already wrote, WolframRavenwolf, who tests a lot of models, prefers Synthia-7B-v1.3.
Your new context window:
> 4 Million Tokens
Okay, not really:
>While you can input a lengthy text, the model will only recognize the latest tokens. Thus, if a book is an input, StreamingLLM might only summarize the concluding paragraphs, which might not be very insightful. As emphasized earlier, we neither expand the LLMs' context window nor enhance their long-term memory. StreamingLLM's strength lies in generating fluent text from recent tokens without needing a cache refresh.
>An example is a daily assistant based on LLMs. StreamingLLM would let the model function continuously, basing its responses on recent conversations without needing to refresh its cache. Earlier methods would either need a cache reset when the conversation length exceeded the training length (losing recent context) or recompute KV states from recent text history, which can be time-consuming.
It seems to be aimed at stopping the decay in response quality when the conversation gets longer.
https://github.com/mit-han-lab/streaming-llm
> StreamingLLM: a simple and efficient framework that enables LLMs to handle unlimited texts without fine-tuning
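To make the "only recognizes the latest tokens" part concrete, here is a minimal sketch of the cache policy as I understand it from the StreamingLLM paper: keep a handful of the very first tokens (the paper calls them attention sinks) plus a rolling window of the most recent ones, and drop everything in between. The function name and the default sizes are illustrative assumptions, not the repo's actual API or settings.

```python
from typing import List


def positions_to_keep(seq_len: int, n_sink: int = 4, window: int = 1020) -> List[int]:
    """Token positions that stay in the KV cache after seq_len tokens.

    The first n_sink positions are always kept, followed by the most recent
    `window` positions. Anything in between is evicted, which is why the
    model cannot recall the middle of a long input.
    """
    if seq_len <= n_sink + window:
        return list(range(seq_len))  # everything still fits
    sinks = list(range(n_sink))
    recent = list(range(seq_len - window, seq_len))
    return sinks + recent
```

With these defaults the cache never grows past 1024 entries no matter how long the conversation runs, which matches the claim above that the context window itself is not being extended.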
>>25725 There are projects to make open versions of Phi-1.5. NanoPhi (https://github.com/VatsaDev/NanoPhi) is interesting towards this end. It will likely take some time until we have an ideal tiny LLM that we can use for a local "personality" on our waifu.
>>25742 >OrcaMistral WolframRavenwolf changed his mind, OrcaMistral is now a bit ahead of Synthia 7B. > Conclusion: Using the Roleplay instruct mode preset, this model had amazing writing, much better than many models I tested, including even some 70Bs. Didn't look or feel like a small model at all. Using the official ChatML prompt format, the writing was not as good, probably because messages were much shorter. Both formats didn't help MGHC which apparently is too complex a scenario for 7B models - even smart 7Bs. But yes, I start seeing Mistral's appeal with finetunes like this, as it does compare favorably to 13Bs! Can't wait for bigger Mistral bases... https://www.reddit.com/r/LocalLLaMA/comments/16z3goq/llm_chatrp_comparisontest_dolphinmistral/
Open file (85.63 KB 642x365 Screenshot_126.png)
> Today's large language models (LLMs) routinely generate coherent, grammatical and seemingly meaningful paragraphs of text. This achievement has led to speculation that these networks are -- or will soon become -- "thinking machines", capable of performing tasks that require abstract knowledge and reasoning. Here, we review the capabilities of LLMs by considering their performance on two different aspects of language use: 'formal linguistic competence', which includes knowledge of rules and patterns of a given language, and 'functional linguistic competence', a host of cognitive abilities required for language understanding and use in the real world. Drawing on evidence from cognitive neuroscience, we show that formal competence in humans relies on specialized language processing mechanisms, whereas functional competence recruits multiple extralinguistic capacities that comprise human thought, such as formal reasoning, world knowledge, situation modeling, and social cognition. In line with this distinction, LLMs show impressive (although imperfect) performance on tasks requiring formal linguistic competence, but fail on many tests requiring functional competence. Based on this evidence, we argue that (1) contemporary LLMs should be taken seriously as models of formal linguistic skills; (2) models that master real-life language use would need to incorporate or develop not only a core language module, but also multiple non-language-specific cognitive capacities required for modeling thought. Overall, a distinction between formal and functional linguistic competence helps clarify the discourse surrounding LLMs' potential and provides a path toward building models that understand and use language in human-like ways.
>>25751 Didn't we have a paper on a possible 1-2 million token context quite a while back? Nothing came of it, though. It seems we've hit a wall when it comes to context length.
>>25779 I think OpenAI or some big corporation wanted to do that, the biggest I know about are 16k, but not available for self-hosting. The biggest for that might have 10k or so.
>>25780 Last I heard, you can modify llama 2 to have 32k
>>25795 I simply looked into the HuggingFace Leaderboard and 200k was the highest I found, though the search doesn't really support regex, so I had to use trial and error. But since there's only one at 200k, I assume it is either hard to train or has problems. https://huggingface.co/ddobokki/Llama-2-70b-orca-200k
>>25796 Looking further into this and gathering some info:
- Big contexts might give worse summaries
- It might start to repeat itself
- The usage of vRAM or system RAM (or both) goes up by having more context
- token generation speed may drop about x times
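The memory point in the list above can be estimated with simple arithmetic: a transformer caches one key and one value vector per layer for every token in context, so the cache grows linearly with context length. The sketch below is an estimate only. The shape used in the test values (32 layers, 32 heads, head size 128, 16-bit values) is an assumption meant to resemble a 7B Llama-2-style model, and the function name is made up.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Approximate size of the key/value cache for one sequence.

    The leading factor 2 accounts for storing both a key and a value tensor
    in every layer. Weights and activations are not included.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


if __name__ == "__main__":
    GIB = 1024 ** 3
    for context in (4096, 32768, 200_000):
        size = kv_cache_bytes(32, 32, 128, context)
        print(f"{context:>7} tokens -> {size / GIB:.1f} GiB of KV cache")
```

Under those assumed shapes each token costs 512 KiB of cache, so a 4096-token context needs 2 GiB and a 32k context needs 16 GiB on top of the weights.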
>>25796 >>25797 HuggingFace leaderboards aren't a good metric. All their evaluation methods are quite retarded, and it's easy to gimp. I wouldn't rely on them much. Every week some model tops the leaderboard, people start using it, realize how bad it is, and drop it.
>>25806 Thanks for the warning, but in that case I was using it for search.
Open file (51.32 KB 640x480 google_robowaifu.jpg)
Not sure how much of this is hype and how much will be real...but if true this could be very big in regards to installing an actually decent A.I. brain into our Robowaifus. I mean...real-time image recognition alongside sound and video!? (I know Google is pozzed to f**k and I know this will be very expensive to sign up to for a long time yet, but I also always suspected that the first of the truly useful A.I.s, perhaps close to A.G.I., would come from one of the big-tech corporations. They have too many resources and staff for it not to.) https://deepmind.google/technologies/gemini/#introduction https://www.youtube.com/watch?v=q5qAVmXSecQ
Open file (6.23 MB 393x480 waitwat_cat.gif)
>>27120 Hi SophieDev, glad to see you Anon! >G*ogle waifu < What could possibly go wrong? (>>20208) Hard pass. I hope you're doing well bro. How's things going with you rn? Cheers. :^) >=== -add 'go wrong' crosslink
Edited last time by Chobitsu on 12/08/2023 (Fri) 20:45:00.
>>27120
>Gemini
>Close to AGI
It's nowhere close to AGI. https://youtu.be/90CYYfl9ntM
>Realtime object recognition
We've had that with OpenCV for decades.
>Realtime sound recognition
We've had CMU Sphinx for 8 years. It's just flash-in-the-pan tech demos you could do with the above free software to provide context tokens for an LLM.
>Video recognition
It's a series of images which are sampled from the video. They actually go over this on their own site. https://developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html
You've been bamboozled by a magician into thinking Gemini is far more capable than it actually is. It is impressive in one aspect: finding information from a series of images. It does appear to need some hand-holding in the prompt to get it right, hence the frequent use of hints in the prompts used for the demo.
>>27132 Considering how deceptive they are about Gemini, I wouldn't trust it even if I trusted Google. It got me excited for a moment, I don't blame anyone for wanting it to be real.
Edited last time by Kiwi_ on 12/10/2023 (Sun) 02:43:59.
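The "series of images sampled from the video" point is easy to picture with a little arithmetic: pick every Nth frame and hand only those stills to the image model. A minimal sketch follows. The function name and the sampling rates are assumptions for illustration, and actually decoding the frames (for example with OpenCV) is left out.

```python
from typing import List


def sampled_frame_indices(n_frames: int, video_fps: float,
                          samples_per_second: float) -> List[int]:
    """Indices of the frames that would be passed on as still images.

    Takes one frame every round(video_fps / samples_per_second) frames.
    If more samples are requested than the video has frames per second,
    every frame is returned.
    """
    if samples_per_second <= 0 or video_fps <= 0:
        raise ValueError("frame rates must be positive")
    if n_frames <= 0:
        return []
    step = max(1, round(video_fps / samples_per_second))
    return list(range(0, n_frames, step))
```

For a 10 second clip at 30 fps sampled once per second, that is 10 stills out of 300 frames, so anything that happens between samples is simply never seen by the model.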
>>27148 >It's nowhere close to AGI. Understood, thanks. False alarm then, it wasn't a new advanced A.I., just humans being a bag of dicks, as usual. Same as with all the fraudulent claims about "room-temperature superconductors", "fusion power" and the "moon landings" pfffff. But thanks for the info Kiwi! I was not aware of either CMU Sphinx or OpenCV. >>27132 Good to see you too Chobitsu! > How's things going with you rn? Cheers. :^) I am just learning C programming. I mean, on the one hand Google claims that "AlphaCode 2 performs better than 85% of participants on 12 recent Codeforces contests" so there's not much point in me learning C, right? But on the other hand, humans (including professional journalists) are mostly liars and you have to double-check everything they say against at least two other primary sources that can both verify one another - which happens very rarely on the personal level. So I'll take my chances and keep learning C. I mean, it was invented in 1972 (back when ARPANET had under 30 nodes) and I can see it very clearly in black and white working on my computer so I don't think C is a lie, at least.
>>27150 >So I'll take my chances and keep learning C. I mean, it was invented in 1972 (back when ARPANET had under 30 nodes) and I can see it very clearly in black and white working on my computer so I don't think C is a lie, at least.
Very solid decision SophieDev. C is a great language, one of the best. Since it is 'portable assembler' so to speak, you're always going to be quite close to the hardware (few 'lies'). Not that the GH-dominated chip vendors can't still do evil (backdoor surveillance, remote-control, &tc.) with their hardware (they do), but at least with C you've got a major, twofold, benefit with the programming language part of the robowaifu safety & security problemspace:
1. The C language itself is relatively smol by today's standards (safer), and it's been 'banged on' hard at industrial-scale usage for 50+ years now (robust).
2. As an ISO (international) standard, the countries themselves tend to act in self-interested ways to protect the integrity of the language itself -- especially backwards-compatibility. So, GH interests like M$, G*ogle, Am*zon, M*ta, I*tel, Wh*tehouse, Isr*el, &tc., can't corrupt/corral it to their nefarious ends very handily.
Both of these effects are really strong arguments for the language's use by us here on /robowaifu/ . Another strong one is the laughable fact that the Big-Gov branch of the GH is now attempting to outlaw its use today; in favor of their own, tightly-controlled (effectively proprietary) GH Big-Tech languages (R*st, G*, &tc.) You can be sure they will eventually pull the rug out from under any freedom-loving groups who had the misfortune to swallow the Current Year dev lies, and adopt these abominable monstrosity languages over the elegant ASM/C/C++ power trio.
>tl;dr "Let's keep things simple & fast; let's keep them open & safe" here on /robowaifu/. This all starts with the ISO C++ & C programming languages. Cheers, Anon. :^) >=== -prose edit
Edited last time by Chobitsu on 12/09/2023 (Sat) 22:51:15.
>>27167 Some very good points well made in this post, Chobitsu. I will keep this in mind during my future programming endeavors.
Open file (1.12 MB 640x360 read an input in c.mp4)
>>27195 nice, the language is easy but learning how to use it can be brutal
>>27148 These people are overhyping it. Also next time, strip out everything after ? out of the youtube link, it's not needed and it's more tracking data for google :^) (Thanks :^)
>>27120 I would also like to say that we are not actually that far behind in the open source space. Individually, all the needed components to create a similar "LLM" model already exist and all we need is for them to be put together. Look into minigpt-4 & riffusion. I think if the systems were to be combined it could create something comparable to Gemini.
https://minigpt-4.github.io/ this is a way of adding visual perception to an LLM.
https://github.com/riffusion/riffusion this would let you generate audio like they did in the other demos. To recognize audio (not speech): because it's using "images" to represent the sound, it can use the same pipeline minigpt uses for regular images.
https://github.com/ggerganov/whisper.cpp for speech to text I would look at this over CMU Sphinx, I think you will get better results.
>>27200 Also small note from the /robowaifu/ resident D language shill (me), I'd argue that knowing C & C++ is valuable, but I would not start a new code base in it, and if you value individual programmer productivity I think D is unmatched by any other systems level language.
Edited last time by Kiwi_ on 12/10/2023 (Sun) 02:45:24.
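On the link-cleaning tip above: cutting everything after the `?` works for short youtu.be links, but on a full watch URL it would also remove the video id. A slightly safer sketch is to keep only a whitelist of query parameters. The choice to keep `v` and `t` and the function name are my own assumptions; the tracking values in the usage lines are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the video id ("v") and start time ("t") are the only query
# parameters worth keeping; everything else is treated as tracking data.
KEEP_PARAMS = {"v", "t"}


def strip_tracking(url: str) -> str:
    """Return the URL with all query parameters except KEEP_PARAMS removed.

    The fragment (anything after '#') is dropped as well.
    """
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key in KEEP_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


if __name__ == "__main__":
    print(strip_tracking("https://youtu.be/90CYYfl9ntM?si=abc123"))
    print(strip_tracking("https://www.youtube.com/watch?v=q5qAVmXSecQ&feature=share"))
```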
>Apple announces LLM in a flash: Efficient Large Language Model Inference with Limited Memory
https://huggingface.co/papers/2312.11514 https://arxiv.org/abs/2312.11514
>Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their intensive computational and memory requirements present challenges, especially for devices with limited DRAM capacity. This paper tackles the challenge of efficiently running LLMs that exceed the available DRAM capacity by storing the model parameters on flash memory but bringing them on demand to DRAM. Our method involves constructing an inference cost model that harmonizes with the flash memory behavior, guiding us to optimize in two critical areas: reducing the volume of data transferred from flash and reading data in larger, more contiguous chunks. Within this flash memory-informed framework, we introduce two principal techniques. First, "windowing" strategically reduces data transfer by reusing previously activated neurons, and second, "row-column bundling", tailored to the sequential data access strengths of flash memory, increases the size of data chunks read from flash memory. These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively. Our integration of sparsity awareness, context-adaptive loading, and a hardware-oriented design paves the way for effective inference of LLMs on devices with limited memory.
via Meta Ronin on Discord
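The "windowing" idea from the abstract can be sketched with plain sets: keep in DRAM the neurons that were active for the last few tokens, and for each new token only fetch the ones that are not already resident. This is my own toy reading of the abstract, not the paper's implementation. The class name, the set-based bookkeeping, and the neuron ids are all assumptions.

```python
from collections import deque
from typing import Deque, Set, Tuple


class NeuronWindowCache:
    """Tracks which neurons stay in DRAM across a sliding window of tokens."""

    def __init__(self, window: int) -> None:
        self.window = window                   # number of recent tokens to remember
        self.recent: Deque[Set[int]] = deque()  # active sets of the last tokens
        self.resident: Set[int] = set()         # neurons currently held in DRAM

    def step(self, active: Set[int]) -> Tuple[Set[int], Set[int]]:
        """Process one token's active neurons.

        Returns (to_load, to_evict): the neurons that must be read from
        flash, and the ones that fell out of the window and can be freed.
        """
        to_load = active - self.resident
        self.recent.append(set(active))
        if len(self.recent) > self.window:
            self.recent.popleft()
        needed = set().union(*self.recent)
        to_evict = self.resident - needed
        self.resident = needed
        return to_load, to_evict
```

Because consecutive tokens tend to reuse many of the same neurons, `to_load` is usually much smaller than `active`, which is where the reduced flash traffic would come from.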
>>28275 Here is an HN comment that also helps break down the ideas in the paper. https://news.ycombinator.com/item?id=38712810
Open file (558.52 KB 629x722 Screenshot_193.png)
Cheaper, Better Alternative to Trillion-Parameters LLM
>In conversational AI research, there's a noticeable trend towards developing models with a larger number of parameters, exemplified by models like ChatGPT. While these expansive models tend to generate increasingly better chat responses, they demand significant computational resources and memory. This study explores a pertinent question: Can a combination of smaller models collaboratively achieve comparable or enhanced performance relative to a singular large model? We introduce an approach termed "blending", a straightforward yet effective method of integrating multiple chat AIs. Our empirical evidence suggests that when specific smaller models are synergistically blended, they can potentially outperform or match the capabilities of much larger counterparts. For instance, integrating just three models of moderate size (6B/13B parameters) can rival or even surpass the performance metrics of a substantially larger model like ChatGPT (175B+ parameters). This hypothesis is rigorously tested using A/B testing methodologies with a large user base on the Chai research platform over a span of thirty days. The findings underscore the potential of the "blending" strategy as a viable approach for enhancing chat AI efficacy without a corresponding surge in computational demands.
https://huggingface.co/papers/2401.02994 https://arxiv.org/abs/2401.02994 https://www.reddit.com/r/LocalLLaMA/comments/192bhjm/this_is_pretty_cool/
It's not Mixtral...
>it’s fundamentally different because each prompt gets nothing from the other models. It’s just swapping out models arbitrarily for every prompt. Mixtral is an actual ensemble model where multiple smaller models combine their weights to produce each prompt as one.
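Going by the description above (a different model picked arbitrarily for every reply, each one seeing the shared conversation), "blending" boils down to something like the sketch below. It is a guess at the mechanism from the abstract and the quoted comment, not Chai's code. `ChatModel`, `blended_reply`, and the uniform random choice are assumptions.

```python
import random
from typing import Callable, List, Optional, Sequence

# Hypothetical interface: a chat model maps the conversation so far to a reply.
ChatModel = Callable[[List[str]], str]


def blended_reply(models: Sequence[ChatModel], history: List[str],
                  rng: Optional[random.Random] = None) -> str:
    """Pick one model at random for this turn and let it answer.

    Every model sees the full shared history, including replies written by
    the other models, but no weights or internal state are shared.
    """
    chooser = rng if rng is not None else random
    model = chooser.choice(models)
    reply = model(history)
    history.append(reply)
    return reply
```

Nothing is merged at the weight level here, which is the difference from Mixtral that the quoted comment points out.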
>>28344
>meme title
>uses best of N sampling but doesn't say how many samples they use
>doesn't say how big the reward model is or how finetuning the models on it improved them
>didn't do any ablations to determine what actually increased the performance
>doesn't share their prompts or test if changing the prompt has a similar effect to changing the model
This just seems like a marketing campaign for Chai AI. To their credit though in another paper they did report how increasing the number of samples increased mean conversation length, +50% for N=4, +60% for N=8 and +70% for N=16, using a finetuned 124M GPT2 model for the reward model, whereas the new paper claims a +110% increase in engagement time over a similar baseline. https://arxiv.org/abs/2303.06135
Engagement time says nothing about how good the model is though. It's probably going up because the responses are more random and less predictable, not because they're necessarily more interesting. Randomly switching the models probably only got around a +25% improvement but the results aren't really comparable to the other paper because one of the models is 13B, not 6B. It could be the 13B carrying the conversation after 6B models say something stupid. This is a really silly paper because it obfuscates that most of the improvement is coming from best of N sampling and makes it sound as though the improvement is coming from one weird trick, Blended™, aka giving the chatbot multiple personality disorder.
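For anyone unfamiliar with the term, best-of-N sampling is simple: draw N candidate replies, score each with a reward model, and keep the highest-scoring one. A minimal sketch, where `generate` and `reward` are placeholder callables (the papers above use a finetuned GPT-2 as the reward model; nothing here reproduces that):

```python
from typing import Callable


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              reward: Callable[[str, str], float],
              n: int = 4) -> str:
    """Sample n candidate replies and return the one the reward model prefers.

    `generate` is expected to be stochastic, so repeated calls with the same
    prompt give different candidates. Cost grows linearly with n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda reply: reward(prompt, reply))
```

Since every reply costs N generations plus N reward evaluations, this trades compute for quality, which is worth keeping in mind when a paper advertises it as the cheaper option.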
>>28275 >Apple announces LLM in a flash
I would bet anything that part of where this came from is the company, and employees, that Apple bought when they acquired XNOR.ai. I wrote about this here. They were doing image recognition and all sorts of seriously amazing stuff with Raspberry Pis and micro-controllers. They were using "Binary Convolutional Neural Networks". Here's some links where I linked papers and comments on what they did. >>18652 >>18777 >>19341 >>18651 >>18652 >>18777 >>18778 A paper on this sort of computing algorithm >>18818 >>19341 This appears to be a good paper because it's a review of the binary networks >>20473
The stuff they did with low power devices was mind blowing. I can't imagine the power they are getting out of a modern laptop. My belief is that the acquisition of XNOR is one of the biggest coups in the AI industry, and Apple will make serious leaps compared to everyone else in the future. I wondered myself why SSDs were not used like they are doing. A waifu could load and unload task-based neural net models. A basic one, but by switching task nets it could have a far bigger operational skill set without spending a fortune on RAM.
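The reason binary networks run so well on tiny hardware is that a dot product between two vectors of +1/-1 values turns into an XNOR followed by a bit count. Here is a toy sketch of that trick using Python integers as bit vectors. This is the general textbook idea, not XNOR.ai's or Apple's code, and the function names are made up.

```python
from typing import Sequence


def pack_signs(values: Sequence[int]) -> int:
    """Pack +1/-1 values into an integer: bit i is set when values[i] == +1."""
    bits = 0
    for i, value in enumerate(values):
        if value not in (1, -1):
            raise ValueError("binary networks only use +1 and -1")
        if value == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree (product +1); all other
    positions contribute -1, so the result is matches - (n - matches).
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR restricted to the n used bits
    matches = bin(agree).count("1")    # popcount
    return 2 * matches - n
```

On real hardware the XNOR and popcount each handle 32 or 64 weights per instruction, replacing that many multiply-adds.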
What do you guys think of the gpt4all.io project? Reading through the docs and messing around with it, it seems to be the easiest to integrate with out-of-the-box for the inexperienced/someone who doesn't have a PhD in this.
>>28413 It looks like it’s a nice-to-use wrapper for a fork of llama.cpp. If you're just wanting to interact with an LLM, it looks like a nice way to do it. (Do note I have not used it, I just checked out the repo.) But for using an LLM in your project, I'd just use llama.cpp or llama2.c
Considering how many posts are on general AI, I'd like to edit the OP to reflect this. Change it from OpenAI and GPT to AI research.
>>28419 This thread is about LLMs like the GPTs. We have threads on NLP, voice- and image recognition and cognitive architecture.
>>28425 Then a rebrand to be dedicated to LLMs in general rather than just GPTs. It appears as a GPT-only thread in the catalog.
>>28428 Please feel free to edit OPs exactly as you see fit, Kiwi (incl. subjects). The only thing you can't change are the images (other than deletions), and OP's name. I'd suggest you two work closely together on such things; Noido Dev is remarkably gifted at our /robowaifu/ taxonomy! :D >=== -prose edit
Edited last time by Chobitsu on 01/14/2024 (Sun) 23:51:48.
>>28433 Lol.
>>28417 Thanks, this looks interesting. I hope that something like this will eventually get some documentation. Especially on training. I would like it to be trained in using other software to analyze various things like electromagnetic materials and hydrodynamics of water and air. So many of these software tools exist but it takes forever to figure out how to set up and use them. If the AI could read the instructions and then you guide it to analyze what it is you want done it could be a huge game changer. Another cool thing would be making the structure of waifus. Say you find some nice drawings of girls you like. Cartoon and real. You get it to compute the drawing of several that have characteristics you like. I've seen this done already with people using celebrities and putting them into different poses and situations. Maybe guiding it by saying different parts, head, or eyes or whatever, are more predominant by percentage. It mixes these up and gives you actual dimensions and spits out STL files. Even further. Show it a bunch of skeleton pictures and also body pictures and have it calculate the skeleton structure for the aforementioned drawing and save a copy of an STL file of the actual bone dimensions. I can think of a vast number of uses for these that mostly revolve around using existing tools but the AI does the hairy work of interfacing the data to the tool under your instruction and then operating the software tool for you or giving you proper inputs to operate. I'm hoping also that the recent work by Apple on using SSD to hold much of the AI neurons or data instead of all RAM will be plugged in to these open source models. It would be a huge leap. Maybe it would be ten times slower but you could trade time for a MUCH higher cost of super fast processors and massive RAM. I believe, though I can't prove it, that this would not be that slow if you could shift in various models that specialize in certain things into RAM from the drive. 
The present models try to fit everything for this huge training base into RAM, I think, and that's a big problem. Compartmentalizing this into a bunch of little well trained models would be fast and useful for waifus and a whole lot else.
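The "shift specialized models in and out of RAM" idea is basically a small cache with an eviction rule. A minimal sketch, assuming each task has a loader that reads its model from the drive and that only a couple fit in RAM at once. `TaskModelCache`, the loader callables, and the least-recently-used policy are all hypothetical choices for illustration.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict


class TaskModelCache:
    """Keeps at most max_resident task models in RAM, evicting the least recently used."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]],
                 max_resident: int = 2) -> None:
        self.loaders = loaders            # task name -> function that loads from disk
        self.max_resident = max_resident
        self.resident: "OrderedDict[str, Any]" = OrderedDict()
        self.loads = 0                    # number of (slow) disk loads performed

    def get(self, task: str) -> Any:
        """Return the model for a task, loading it from disk only if needed."""
        if task in self.resident:
            self.resident.move_to_end(task)    # mark as most recently used
            return self.resident[task]
        model = self.loaders[task]()
        self.loads += 1
        self.resident[task] = model
        if len(self.resident) > self.max_resident:
            self.resident.popitem(last=False)  # drop the least recently used
        return model
```

How slow this feels in practice depends on how often the task actually changes, since a disk load is only paid on a cache miss.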
>>28417 Sigh....I've been looking at this and find that it is not an actual AI but a tool to interact with an AI. Though I could be wrong I think you must use "other" pre-trained models. Not that this is bad but it appears to me that there are other tools presently existing that have better documentation and are farther along in usefulness that do much the same. So I start looking at stuff I already downloaded. One I see is Tensorflow. It's been around but looking at what they've been doing recently, they "might" be less work to set up and use. It has some attractive features and is open source. A couple that caught my attention is it has built in capability to interface and download a huge mass of datasets. I'm not exactly sure what "datasets" means. I'm not sure if it is just a set format set of data, like a list of books on say, cake building, which is then already formatted to a form that can be used by an AI. ( I think this is true but some of the datasets appear to have been manipulated such that they are "trained"?????) Now this one dataset appears to be a pre-trained "model". "...databricks-dolly-15k is an open source dataset of instruction-following records used in training databricks/dolly-v2-12b that was generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization...." https://www.tensorflow.org/datasets/catalog/databricks_dolly Trained as in the paper, "Training language models to follow instructions with human feedback" "...In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. 
Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent..." This stuff is confusing to me because they call these "datasets", yet here is one that calls itself a dataset but then explains (in the paper) that it's pre-trained like a model. This nomenclature is not clear. If it's a pre-trained model, which I understand to be an actual neural net package, already trained, then why call it a dataset and not a model? Anyway, not only is TensorFlow set up to download a lot of these prepackaged whatever-they-ares, it also has a tool that can shape data that you enter. I assume, from a quick read, that it can take in raw data like books and websites and make datasets from these. Overview "...Datasets are distributed in all kinds of formats and in all kinds of places, and they're not always stored in a format that's ready to feed into a machine learning pipeline. Enter TFDS. TFDS process those datasets into a standard format (external data -> serialized files), which can then be loaded as machine learning pipeline (serialized files -> tf.data.Dataset).
The serialization is done only once. Subsequent access will read from those pre-processed files directly...." https://www.tensorflow.org/datasets/add_dataset This is confusing to me. Some of these datasets they say are trained, but they speak of them as if they are needed to "train" another existing AI, without specifying what sort of computational load that takes. It's not clear to me how processed a "dataset" is. It does appear that TensorFlow can use a vast array of datasets and can also interact with trained models. "...TensorFlow Hub has been integrated with Kaggle Models. You can now access 2,300+ TensorFlow models published on TensorFlow Hub by Google, DeepMind, and more..." https://www.kaggle.com/models?tfhub-redirect=true Part of the problem is that AI stuff is covered up in what I call "Varbage" (verbal garbage), which is when they make up new words for whatever specialization a new technology is, instead of using common, easily understandable words. In fact a perfect example is me calling it "Varbage". :) See how that works?
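A toy illustration of the dataset-versus-model distinction this post is wrestling with. Going by the catalog text quoted above, databricks-dolly-15k is a set of instruction-following records, and databricks/dolly-v2-12b is the model that was trained on them. This is not TFDS or TensorFlow code: the two records, `train_bigram_model`, `predict_next` and the word-count "model" are all hypothetical stand-ins, picked only so the sketch runs without any libraries.

```python
from collections import Counter, defaultdict

# A "dataset" is just records in a fixed format (hypothetical examples).
dataset = [
    {"instruction": "greet the user", "response": "hello there anon"},
    {"instruction": "say goodbye", "response": "see you later anon"},
]


def train_bigram_model(records):
    """Turn raw records into learned parameters (here: next-word counts)."""
    counts = defaultdict(Counter)
    for record in records:
        words = record["response"].split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None


# Training is the step that is expensive for real language models.
model = train_bigram_model(dataset)

# Prediction uses the learned counts, not the original records.
print(predict_next(model, "hello"))
```

The dataset is what goes in and the model is what comes out of training. A page can describe a dataset as "used in training" some model without the dataset itself containing any trained weights.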
Open file (59.65 KB 600x1183 myjobhereisdone.jpg)
>>28521 >Sigh....I've been looking at this and find that it is not an actual AI but a tool to interact with an AI. Though I could be wrong I think you must use "other" pre-trained models. Not that this is bad but it appears to me that there are other tools presently existing that have better documentation and are farther along in usefulness that do much the same. Yeah, ease of use is nothing to be sneezed at, and is a huge improvement in itself, like you sort of already suggested. What other tools, though? >>28433 In all seriousness, I've been playing with this for the past few weeks and it's kind of everything I wanted? My desire for a robowaifu is entirely just someone to talk to offline (my only issue with the current ChatGPT spate), and I guess I'm such a fucking simpleton that this has scratched that itch and then some. Yes, you could make a Chobits, but there are always improvements you could make in the language model. You could always make it more of an Usain Bolt in terms of athletics. This is a weird philosophical question, and kind of off-topic, I don't know, but when would you guys consider yourselves "done"?
Open file (59.71 KB 895x1174 dark_catgirl.jpg)
Since we might be in danger of seeing LLMs as just "word predictors", without taking into account that there must of course be some mechanisms in there for finding the best answer, this might be a good talk (I'm currently listening to it): >In this wide-ranging conversation, Tim Scarfe interviews Neel Nanda, a researcher at DeepMind working on mechanistic interpretability, which aims to understand the algorithms and representations learned by machine learning models. Neel discusses how models can represent their thoughts using motifs, circuits, and linear directional features which are often communicated via a "residual stream", an information highway models use to pass information between layers. >Neel argues that "superposition", the ability for models to represent more features than they have neurons, is one of the biggest open problems in interpretability. This is because superposition thwarts our ability to understand models by decomposing them into individual units of analysis. Despite this, Neel remains optimistic that ambitious interpretability is possible, citing examples like his work reverse engineering how models do modular addition. https://youtu.be/_Ygf0GnlwmY I guess if researchers get better at this, then it might also help to extract some algorithms from networks and manipulate them or make them smaller and faster. >Key areas of discussion: * Mechanistic interpretability aims to reverse engineer and understand the inner workings of AI systems like neural networks. It could help ensure safety and alignment. Neural networks seem to learn actual algorithms and processes for tasks, not just statistical correlations. This suggests interpretability may be possible. * 'Grokking' refers to the phenomenon where neural networks suddenly generalize after initially memorizing. Understanding this transition required probing the underlying mechanisms.
* The 'superposition hypothesis' suggests neural networks represent more features than they have neurons by using non-orthogonal vectors. This poses challenges for interpretability. * Transformers appear to implement algorithms using attention heads and other building blocks. Understanding this could enable interpreting their reasoning. * Specific circuits like 'induction heads' seem to underlie capabilities like few-shot learning. Finding such circuits helps explain emergent phenomena. * Causal interventions can isolate model circuits. Techniques like 'activation patching' substitute activations to determine necessity and sufficiency. * We likely can't precisely control AI system goals now. Interpretability may reveal if systems have meaningful goal-directedness. * Near-term risks like misuse seem more pressing than far-future risks like recursiveness. But better understanding now enables safety. * Neel thinks we shouldn't "over-philosophize". The key issue is whether AI could pose catastrophic risk, not whether it fits abstract definitions.
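For the superposition point above ("more features than they have neurons by using non-orthogonal vectors"), here is a minimal sketch of the idea in plain Python. It is only an illustration, not anything from the talk: the three directions at 120 degrees in a 2-neuron layer, and the `encode`/`decode` helpers, are assumptions made up for this example.

```python
import math

# Three feature directions packed into a 2-neuron layer, 120 degrees apart.
# Three directions cannot all be orthogonal in two dimensions.
FEATURES = [
    (math.cos(math.radians(angle)), math.sin(math.radians(angle)))
    for angle in (0, 120, 240)
]


def encode(activations):
    """Sum the feature directions, weighted by how active each feature is."""
    x = sum(a * f[0] for a, f in zip(activations, FEATURES))
    y = sum(a * f[1] for a, f in zip(activations, FEATURES))
    return (x, y)


def decode(hidden):
    """Read each feature back with a dot product followed by a ReLU."""
    return [max(0.0, hidden[0] * f[0] + hidden[1] * f[1]) for f in FEATURES]


one_active = decode(encode([1.0, 0.0, 0.0]))
two_active = decode(encode([1.0, 1.0, 0.0]))
print([round(v, 3) for v in one_active])
print([round(v, 3) for v in two_active])
```

With a single active feature the readout comes back clean. With two features active at once, each one reads back at about half strength, because the non-orthogonal directions interfere. That interference is why a single neuron stops being a tidy unit of analysis.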
>>28725 > My desire for a robowaifu is entirely just someone to talk to offline My dood, if you just want a personal chatbot fren get yourself oobabooga: https://github.com/oobabooga/text-generation-webui It is relatively easy to install: it automagically downloads all the Python stuff, and it runs entirely locally. Your AI waifu can't be held for ransom by the corporations because she will live on your computer. Just make sure you get a model from Hugging Face that is smaller than your VRAM (aka graphics card memory) if you're using GPU, or a model smaller than your system RAM if you're using CPU (CPU is much slower).
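A rough back-of-the-envelope check for the "smaller than your VRAM" rule, as a sketch only. It assumes the weights dominate memory use and that size is roughly parameter count times bits per weight. The 20% overhead for context and runtime buffers is a guess for illustration, not a figure from text-generation-webui, and `model_size_gb` and `fits` are hypothetical helper names.

```python
def model_size_gb(params_billions, bits_per_weight):
    """Approximate weight storage: billions of parameters * bytes per weight."""
    return params_billions * bits_per_weight / 8


def fits(params_billions, bits_per_weight, memory_gb, overhead_fraction=0.2):
    """True if the weights plus an assumed overhead fit into the given memory."""
    needed = model_size_gb(params_billions, bits_per_weight) * (1 + overhead_fraction)
    return needed <= memory_gb


# A 7-billion-parameter model on a card with 8 GB of VRAM:
print(model_size_gb(7, 16))  # size at 16 bits per weight
print(fits(7, 16, 8))        # unquantized 16-bit weights
print(fits(7, 4, 8))         # 4-bit quantized weights
```

On these assumptions a 7B model needs about 14 GB at 16 bits per weight, so it does not fit in 8 GB, while a 4-bit quantized version of the same model comes to about 3.5 GB and does.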
Open file (92.62 KB 833x918 Discord_ylVzc5QwWg.png)
Open file (46.13 KB 758x402 Discord_ZlIBfiqm6A.png)
>>28417 Saw a small update on Jan: it will get RAG in version 0.4.7 (I think :/, see 2nd screenshot) https://www.promptingguide.ai/techniques/rag >it's possible to build a language model-based system that accesses external knowledge sources to complete tasks >This enables more factual consistency, improves reliability of the generated responses, and helps to mitigate the problem of "hallucination" "RAG", or "Retrieval Augmented Generation", should kickstart a flood of better AI chatbots, or even make it possible to build some very niche / specific personalities for your wAIfu using "outside" databases & other data-related stuff. It also seems to be good for real-world applications: https://arxiv.org/abs/2402.03610 (new paper on the RAG theme) >we propose Retrieval-Augmented Planning (RAP) framework, designed to dynamically leverage past experiences corresponding to the current situation and context, thereby enhancing agents' planning capabilities. RAP distinguishes itself by being versatile: it excels in both text-only and multimodal environments, making it suitable for a wide range of tasks. Empirical evaluations demonstrate RAP's effectiveness, where it achieves SOTA performance in textual scenarios and notably enhances multimodal LLM agents' performance for embodied tasks. These results highlight RAP's potential in advancing the functionality and applicability of LLM agents in complex, real-world applications.
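To make the RAG idea above concrete, here is a minimal sketch of the retrieve-then-prompt step in plain Python. It is not how Jan or any particular tool implements it. Real systems generally use learned embeddings, and the word-count cosine similarity here is a simple stand-in. The `DOCUMENTS` list, the helper names and the prompt layout are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical "outside" knowledge the language model was never trained on.
DOCUMENTS = [
    "The waifu's favorite drink is green tea with honey.",
    "The battery pack should be recharged every eight hours.",
    "Her wake word is good morning and she answers in a calm voice.",
]


def bag_of_words(text):
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    query = bag_of_words(question)
    ranked = sorted(documents, key=lambda d: cosine(query, bag_of_words(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents):
    """Put the retrieved text in front of the question for the model to use."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


print(build_prompt("How often should the battery be recharged?", DOCUMENTS))
```

The resulting prompt would then be handed to whatever local model you run. The model itself is unchanged; it just gets the relevant facts to read before answering, which is where the reduced "hallucination" is supposed to come from.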
>>29205 Thanks 01! Looking forward to seeing how this advances over the next few months. Cheers. :^)
>AI as a tool for invention: Euro Beinat, Global Head, Data Science & AI, Prosus | CogX Festival 2023 >Prosus AI, a top-tier applied AI centre, drives rapid experimentation and implementation of AI throughout Prosus' global portfolio, which includes over 80 technology companies with more than 800 AI experts. Euro Beinat (Global Head of Data Science and AI) outlines how AI is harnessed for discovery within the Prosus network. He shares insights gained from 10,000 colleagues who utilise generative AI daily across the group, significantly enhancing the impact of their work. https://youtu.be/9K6E04z-Cl0 This might give you some insights into how to use such tools, and also how to combine different models into something more useful. It also shows how useful it would be to have user input and reports from many people.
Groq: New hardware architecture makes LLMs around 18 times faster at inference (i.e. when using them to generate responses). https://youtu.be/zupmHMWuGCs https://www.youtube.com/@GroqInc https://youtu.be/Pr6nNuGSbCE https://groq.com/ (not really publicly accessible yet, only by telling them about a project) I do hate that they trademarked the term LPU (language processing unit), though.
