>>17806 (part 2 of 2)
This year, I've gotten familiar with IPFS, which is basically a decentralized torrent protocol with some very nifty features. In particular, I've:
- Learned to create, manage, augment, and distribute a 10 TB dataset on Hetzner servers with some tooling built around IPFS. This was a proof-of-concept showing that it's possible to store large amounts of data and provide large amounts of bandwidth cheaply. Right now, I get only 8 Mbps per server due to CPU bottlenecks. I've tested some tricks for getting this up to ~30 Mbps per server, and I'm pretty sure I can get it higher if I spend more time on it.
- Learned to create what amounts to index files: files that store file organization without storing file content. I set up scrapers that store data in a format ideal for scraping, uploaded the data to seed servers that store the data in a format ideal for data transfer, and downloaded the data with a file organization that's ideal for normal usage. This was a proof-of-concept to show that it is in fact possible to decouple data scraping from data storage from data download.
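To make the index-file idea concrete, here's a toy sketch in Python. This is an illustration of the concept only, not the actual IPFS API or CID format; it fakes content addressing with plain SHA-256 so you can see how one set of blobs can back multiple file organizations.

```python
import hashlib

# Toy illustration of the index-file idea (NOT the real IPFS API or CID
# format): content is addressed by hash, and an index file records only
# the mapping from human-friendly paths to content hashes.

def cid(data: bytes) -> str:
    """Stand-in for an IPFS CID: a plain SHA-256 digest."""
    return hashlib.sha256(data).hexdigest()

# A blob store keyed by content hash -- roughly how seed servers hold
# raw data in a transfer-friendly form.
blobs = {}

def put(data: bytes) -> str:
    h = cid(data)
    blobs[h] = data
    return h

# Scraper-side layout: flat records, convenient for scraping.
scraped = {
    "scrape/0001.json": b'{"title": "Story A", "text": "..."}',
    "scrape/0002.json": b'{"title": "Story B", "text": "..."}',
}

# The index stores organization without content; any number of
# alternative layouts can point at the same blobs.
index_for_reading = {
    "fanfic/story-a.json": put(scraped["scrape/0001.json"]),
    "fanfic/story-b.json": put(scraped["scrape/0002.json"]),
}

# "Downloading" with the reader-friendly layout just resolves hashes.
downloaded = {path: blobs[h] for path, h in index_for_reading.items()}
```

The index dict is tiny compared to the blobs it references, which is exactly why scraping, seeding, and end-user layouts can each be their own index over the same content.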
- Tested adding bandwidth to existing files from an unaffiliated server. Again, this was a proof of concept showing that anyone can add bandwidth to anybody else's files, which enables data distribution to scale with popularity.
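The reason this works can be shown with a similar toy model (again plain hashes, not the real IPFS protocol): since requests are made by content hash, any server holding identical bytes is a valid source, and fakes can be rejected by re-hashing.

```python
import hashlib

# Toy model of why content addressing lets anyone add bandwidth
# (illustration only, not the real IPFS protocol).

def cid(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

data = b"some popular dataset shard"
h = cid(data)

original_server = {h: data}

# An unaffiliated operator needs no permission or coordination: fetch
# the bytes once, and because the hash matches, they can serve them too.
unaffiliated_server = {h: original_server[h]}

def fetch(h, providers):
    """Return content from the first provider whose copy matches the hash."""
    for p in providers:
        if h in p and cid(p[h]) == h:  # hash check: fakes are rejected
            return p[h]
    raise KeyError(h)

# Even if the original server drops out, the content stays available.
assert fetch(h, [unaffiliated_server]) == data
```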
- Learned to work with IPFS's JavaScript APIs, particularly to load file structures, create new file structures, and download data. This part will require upstreaming some bug patches to IPFS, but I believe I have enough here to demonstrate that enough end-user functionality can be handled from a browser interface, so distribution of these tools will not be a limiting factor.
- Developed some initial tooling for creating (fanfiction) text data subsets. I've learned a lot about organizing data, making interfaces that work for a broader audience, and making the necessary tooling & data available so people can create data subsets more easily.
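In this scheme, a subset can be just another index file. Here's a minimal sketch of what that tooling boils down to; all names, CIDs, and metadata fields below are made up for illustration.

```python
# Sketch of subset tooling (all names, CIDs, and metadata fields here are
# made up): a subset is just a new index selecting entries from a larger
# catalog by metadata, so defining one copies no file content.

catalog = [
    {"path": "fanfic/story-a.txt", "cid": "cid-a", "fandom": "naruto", "words": 12000},
    {"path": "fanfic/story-b.txt", "cid": "cid-b", "fandom": "naruto", "words": 800},
    {"path": "fanfic/story-c.txt", "cid": "cid-c", "fandom": "other", "words": 5000},
]

def make_subset(catalog, predicate):
    """Build a subset index: path -> CID for entries matching predicate."""
    return {e["path"]: e["cid"] for e in catalog if predicate(e)}

# e.g. "long Naruto fics" as a small, shareable subset definition
subset = make_subset(catalog, lambda e: e["fandom"] == "naruto" and e["words"] >= 1000)
```

Since the subset is only paths and CIDs, sharing it costs almost nothing, and anyone can materialize it later by fetching the referenced content.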
There's still a lot more to do here, but the exploratory work is largely done. All of this seems feasible, probably with only a few more months of effort.
Lastly, I made some smaller attempts to make higher math more accessible, both in the F=ma thread and elsewhere. I'm not really happy with my results on this front, and seeing how similar expository-posts-on-difficult-topics go, I think this strategy is not a good one. But seeing how much interest and expertise there is here, I'm convinced that more diffusion of knowledge would be extremely valuable. I'll toss in some stray thoughts on this. First, the problems.
- I suspect that text is mostly only good for labeling things and explaining the deductive glue of intuition. For math at least, many aspects of intuition are spatial or physical, and text explanations are only good there after people have already gotten the right picture in mind.
- A lot of explanations only work when people know which things they should be focusing on, and text is not good for placing explicit focus on things. This applies to both English text and equations. On a related note, natural language is often too imprecise, and equations are often too semantically detached. In both circumstances, it's hard to figure out what's worth focusing on.
- Explanations are often incomplete, and people can't "experiment" with an explanation to develop a more complete picture.
Here's my take on where a solution might come from.
- There's one person who's exceptionally good at explaining math: 3blue1brown. He combines exposition, some interaction (he offers problems to solve), and visuals. The software he uses to create visuals (manim) is open source. This is all good, but video explanations probably wouldn't work for us, and a significant chunk of his teaching strategy depends on the video format. Specifically, he can narrate over visuals while they're playing. That's not possible without a vocalized explanation. I can see vocalized explanations fitting into this board if we have waifus narrating things with TTS, but otherwise, it's probably not going to work. This would be very high-effort on the part of the person explaining, and very low-effort on the part of the people consuming.
- Chris Olah and distill.pub are exceptionally good at explaining difficult concepts, and they don't use audio. They accomplish something similar to "narrating over a video" through UI widgets the user can manipulate. Of course the user knows how they're manipulating the widget, and while they're doing that, they can observe what effect it has on the visuals. We can't use UI widgets on this board, but we can link to Colab notebooks that use manim.js or d3js alongside Jupyter widgets. This again would be high-effort on the part of the person explaining, low-effort for consuming. With some heroic effort to create better tooling, I think this can be medium-effort for creating explanations, with the caveat that it will probably only work for math. That's probably too limiting a caveat.
- For non-visual explanations, visuals with widgets don't work, but I suspect you can get a similar effect to "narrating over a video" with "commented code," at least for people that are comfortable reading code. I suspect that even simple, crappy pseudocode can get a point across much more clearly and concisely than exposition in cases where an explanation is about, e.g., functional relationships between things rather than deductive relationships. In cases where code is appropriate, I think this would actually be low-effort for the person explaining and low-effort for the people consuming. That's interesting enough to me to warrant some exploration and experimentation.
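To make that last claim concrete, here's roughly what such an explanation could look like. The topic (momentum in gradient descent) and the constants are just an example I'm supplying; the point is that the functional relationships live in two commented lines of code rather than paragraphs of prose.

```python
# "Commented code as explanation": momentum in gradient descent.
# Constants below are illustrative, not tuned values.

def sgd_momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    # velocity is a decaying sum of past gradients: old directions fade
    # by a factor of beta, and the newest gradient is mixed in.
    velocity = beta * velocity + grad
    # the weight moves along the smoothed direction, not the raw
    # gradient, which damps oscillation and reinforces consistent trends.
    w = w - lr * velocity
    return w, velocity

# Oscillating gradients mostly cancel inside the velocity term:
w, v = 0.0, 0.0
for g in [1.0, -1.0, 1.0, -1.0]:
    w, v = sgd_momentum_step(w, g, v)
```

Whether this beats prose obviously depends on the reader, but for anyone comfortable skimming code, the "what depends on what" structure is immediate in a way a paragraph can't match.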
Anyway. I look forward to the next year. Things are now moving at a breakneck pace, due largely to open source research & development, and the world is still speeding up. After nearly a decade going at it, I've finally hit the point where robowaifus seem to be entirely possible within a reasonable timeframe. There's still a lot of work to do, but the future is promising.