/robowaifu/ - C++ General


C++ General Robowaifu Technician 09/09/2019 (Mon) 02:49:55 No.12

C++ Resources general The C++ programming language is currently the primary AI-engine language in use. >browsable copy of the latest C++ standard draft: https://eel.is/c++draft/ >where to learn C++: ( >>35657 ) isocpp.org/get-started https://archive.is/hp4JR stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list https://archive.is/OHw9L en.cppreference.com/w/ https://archive.is/gt73H www.youtube.com/user/CppCon https://archive.is/QGynC BTW if you're new to C++ and you're stuck on Windows (either you can't or won't upgrade to Linux) then you can at least incorporate a good, open shell into your system to begin with so you can follow along. Start at this link, and if you have any questions just ask ITT: www.msys2.org/ https://archive.fo/p3EUc >=== -add standard draft hotlink -add 'learn' crosslink

Edited last time by Chobitsu on 01/15/2025 (Wed) 20:50:04.

Chobitsu 07/25/2024 (Thu) 17:38:08 No.32375

>>32361 Of course, output filestreams work in C++ as well. >time_is_now.cpp [1]

#include <fstream>
#include <iostream>
#include <string>

using namespace std;

// demo:  C++ output filestreams are just streams, like all the rest :
int main()
{
    ofstream ofs{"timing.txt"}; // this will create a file if it doesn't already exist
    string s{
        "Now is the time for all good men, with their robowaifus by their side, "
        "to come to the aid of their country."
    };

    ofs << s << '\n';
    ofs.close(); // flushes the text to disk so the read below sees it; IMHO, always a good idea to explicitly close output files when done writing to them

    //---

    ifstream ifs{"timing.txt"};
    string word;

    while (ifs >> word) // stream's operator>> parses on whitespace
        cout << word << '\n';
}

>output:

Now
is
the
time
//...
aid
of
their
country.

--- 1. https://coliru.stacked-crooked.com/a/4ef7677ea61a19a4 >=== -minor edit -add coliru hotlink

Edited last time by Chobitsu on 07/28/2024 (Sun) 02:58:11.

Chobitsu 07/28/2024 (Sun) 02:59:54 No.32423

>>32342 >But i also dont have a working machine to code on so just using an online compiler to play around with simple things. Ahh, so I added coliru hotlinks to my examples above so you could play with them directly without a local dev box. Cheers. :^)

Robowaifu Technician 07/29/2024 (Mon) 18:22:57 No.32436

>>32375 how do you change the delimiter for <<, it just stops reading after any whitespace it makes slurping a file a pain like with json files you use {:} as delims not whitespace

Chobitsu 07/29/2024 (Mon) 19:04:43 No.32437

>>32436 Good question, Anon. We actually had a class lesson in our textbook here for just that, for stream's operator>> : (>>21874, >>21886) . You'd simply do similar code to write your custom operator<< . Give it a shot and please post your results here. Feel free to ask any questions, and good luck Anon! Cheers. :^)

Chobitsu 08/06/2024 (Tue) 12:33:56 No.32610

'C++26 Senders' (the std::execution namespace) /comfy/ funposting-related : https://trashchan.xyz/comfy/thread/9418.html#10396 It already has a working implementation today, Anon. Check it out: :^) https://godbolt.org/z/3cseorf7M --- >note: The C++ Standards Committee of today strongly encourages members to create practical, working examples before submitting proposals. Eric Niebler [1] has done just that via this repo. 1. https://github.com/ericniebler

Chobitsu 08/07/2024 (Wed) 02:27:04 No.32632

>>32610 Decided to copypasta the Senders example code here too, since the Godbolt site can make things tight on a smol screen. >main.cpp

#include <cstdio>   // std::printf
#include <utility>  // std::move

#include <exec/static_thread_pool.hpp>
#include <stdexec/execution.hpp>

int main() {
  // Declare a pool of 3 worker threads:
  exec::static_thread_pool pool(3);

  // Get a handle to the thread pool:
  auto sched = pool.get_scheduler();

  // Describe some work:
  // Creates 3 sender pipelines that are executed concurrently by passing to
  // `when_all` Each sender is scheduled on `sched` using `starts_on` and starts
  // with `just(n)` that creates a Sender that just forwards `n` to the next
  // sender. After `just(n)`, we chain `then(fun)` which invokes `fun` using the
  // value provided from `just()` Note: No work actually happens here.
  // Everything is lazy and `work` is just an object that statically represents
  // the work to later be executed
  auto fun  = [](int i) { return i * i; };
  auto work = stdexec::when_all(
      stdexec::starts_on(sched, stdexec::just(0) | stdexec::then(fun)),
      stdexec::starts_on(sched, stdexec::just(2) | stdexec::then(fun)),
      stdexec::starts_on(sched, stdexec::just(4) | stdexec::then(fun)));

  // Launch the work and wait for the result
  auto [i, j, k] = stdexec::sync_wait(std::move(work)).value();

  // Print the results:
  std::printf("%d %d %d\n", i, j, k);
}

// sauce:  https://github.com/NVIDIA/stdexec
// also cf. CMake's missing Package Manager:  https://github.com/cpm-cmake/CPM.cmake

>output: 0 4 16 >CMakeLists.txt:

# Actually, dependencies further down the line in the STDEXEC project currently require CMake v3.25 or newer.
#   -if your CMake is older and your package manager doesn't have the needed level, here's how I did it on Ubuntu Jammy:
#      sudo apt purge --auto-remove cmake
#      wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
#      echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo tee /etc/apt/sources.list.d/kitware.list >/dev/null
#      sudo apt update
#      sudo apt install cmake
#
#   -cf.  https://askubuntu.com/questions/355565/how-do-i-install-the-latest-version-of-cmake-from-the-command-line
#         (2nd answer)
#
cmake_minimum_required(VERSION 3.14 FATAL_ERROR)

project(stdexecExample)

# Downloading CMake's missing Package Manager:
#
#   mkdir -p build/cmake
#   wget -O build/cmake/CPM.cmake https://github.com/cpm-cmake/CPM.cmake/releases/latest/download/get_cpm.cmake
#
#   -cf.  https://github.com/cpm-cmake/CPM.cmake
#   -also cf.  https://github.com/cpm-cmake/CPM.cmake/wiki/Downloading-CPM.cmake-in-CMake
include(build/cmake/CPM.cmake)

# NOTE: This step can take a while to DL; check your network connection(s) & be patient...
CPMAddPackage(
  NAME stdexec
  GITHUB_REPOSITORY NVIDIA/stdexec
  GIT_TAG main # This will always pull the latest code from the `main` branch. You may also use a specific release version or tag
)

add_executable(main main.cpp)

target_link_libraries(main STDEXEC::stdexec)

>=== -patch 'stdexec::starts_on()' statements to reflect the current C++26 draft standard -add 'output:' codeblock -minor tweak of async work params 'stdexec::just(n)' -add 'sauce', 'also' hotlinks to code -ren codefile name to 'main.cpp' -add, patch 'CMakeLists.txt' contents

Edited last time by Chobitsu on 10/06/2024 (Sun) 23:31:31.

Robowaifu Technician 09/14/2024 (Sat) 14:58:09 No.33549

> ( C++ safety -related : >>33548 )

HCSM 09/29/2024 (Sun) 18:01:38 No.33816

Gonna give it a try, any advice welcome. Also, any way to upload code without converting it to pdf?

Chobitsu Board owner 09/29/2024 (Sun) 18:56:48 No.33817

>>33816 Hello, HCSM. Welcome! >Also, any way to upload code without converting it to pdf? A) If it's a snippet less than 6144 chars long, simply post it here using the code block tags. (cf. the little help link in the page header bar above). B) If you do the file route, then don't 'convert' to a pdf file, Anon. Simply rename the cleartext codefile with a .pdf extension, then post it. I'm personally fond of the muh_foo_file.ren_ext_to.cpp.pdf naming form. >any advice welcome. Very likely just one, at this point in your C++ career. Namely PPP3 : (cf. >>31023 ). Good luck, Anon. Hope to hear more from you here on /robowaifu/ . Cheers. :^) >=== -minor edit

Edited last time by Chobitsu on 10/05/2024 (Sat) 02:46:41.

HCSM 09/30/2024 (Mon) 22:04:01 No.33829

>>33817 Thanks. Any one knows if using the gpu through OpenGL is worth it?

Chobitsu 09/30/2024 (Mon) 23:51:52 No.33831

>>33829 Y/w, Anon. <---> >Any one knows if using the gpu through OpenGL is worth it? Please define 'worth it' ? If you're interested in creating your own high-performance rendering system (say your own game, or your own Robowaifu Simulator [1][2] ), then learning OpenGL from scratch [3] is a very good thing. OTOH, if you just want to use OpenGL (et al) in such a project, then simply taking advantage of an already-existing framework is probably a smarter choice. There are many out there, but for a number of reasons I think we'd recommend Raylib [4] first & foremost here, Anon. <---> If instead you mean you want to use the GPU's processing power for non-graphical, general computing purposes (so-called GPGPU), then that's a whole other (and complex) discussion we can also have ITT. --- 1. ( >>155 ) 2. https://gitlab.com/Chobitsu/muh-robowaifu-simulator 3. https://learnopengl.com/ 4. https://www.raylib.com/ >=== -minor edit

Edited last time by Chobitsu on 09/30/2024 (Mon) 23:57:12.

Grommet 10/03/2024 (Thu) 14:10:50 No.33866

>>33831 https://gitlab.com/Chobitsu/muh-robowaifu-simulator I hadn't seen this before. Very nice. Thanks!

Chobitsu 10/04/2024 (Fri) 11:25:58 No.33873

>>33866 Y/w Grommet. >Very nice. Thanks! Heh, it mostly was just a learning experience for me at that time (though it did run really fast on my non-gpu potatoe-top back then). :D >=== -minor edit

Edited last time by Chobitsu on 10/04/2024 (Fri) 16:23:14.

Chobitsu 10/05/2024 (Sat) 13:26:12 No.33880

>>32632 Here's my own CMakeLists.txt file, w/o most comments, and also auto-DL'g CPM.cmake so I can just copypasta this into new projects. >CMakeLists.txt

cmake_minimum_required(VERSION 3.14 FATAL_ERROR)

project(stdexecExample)

# download CPM.cmake
file(
  DOWNLOAD
  https://github.com/cpm-cmake/CPM.cmake/releases/download/v0.40.2/CPM.cmake
  ${CMAKE_CURRENT_BINARY_DIR}/cmake/CPM.cmake
  SHOW_PROGRESS 
  EXPECTED_HASH SHA256=c8cdc32c03816538ce22781ed72964dc864b2a34a310d3b7104812a5ca2d835d
)
include(${CMAKE_CURRENT_BINARY_DIR}/cmake/CPM.cmake)

CPMAddPackage(
  NAME stdexec
  GITHUB_REPOSITORY NVIDIA/stdexec
  GIT_TAG main # This will always pull the latest code from the `main` branch.
)

add_executable(main main.cpp)

target_link_libraries(main STDEXEC::stdexec)

Chobitsu 10/07/2024 (Mon) 02:43:16 No.33899

>>32610 >>32632 >C++ Senders >current draft standard -related: https://en.cppreference.com/w/cpp/execution

Chobitsu 10/07/2024 (Mon) 03:38:23 No.33900

>>33899 I discovered tonight that this experimental Senders implementation includes a working implementation of this work-stealing queue within the static threadpool tree [1][2][3][4] (Apache 2 license) :

>"BWoS: Formally Verified Block-based Work Stealing for Parallel Processing"
>Abstract:
>"Work stealing is a widely-used scheduling technique for parallel processing on multicore. Each core owns a queue of tasks and avoids idling by stealing tasks from other queues. Prior work mostly focuses on balancing workload among cores, disregarding whether stealing may adversely impact the owner’s performance or hinder synchronization optimizations. Real-world industrial runtimes for parallel processing heavily rely on work-stealing queues for scalability, and such queues can become bottlenecks to their performance.
>"We present Block-based Work Stealing (BWoS), a novel and pragmatic design that splits per-core queues into multiple blocks. Thieves and owners rarely operate on the same blocks, greatly removing interferences and enabling aggressive optimizations on the owner’s synchronization with thieves. Furthermore, BWoS enables a novel probabilistic stealing policy that guarantees thieves steal from longer queues with higher probability. In our evaluation, using BWoS improves performance by up to 1.25x in the Renaissance macrobenchmark when applied to Java G1GC, provides an average 1.26x speedup in JSON processing when applied to Go runtime, and improves maximum throughput of Hyper HTTP server by 1.12x when applied to Rust Tokio runtime. In microbenchmarks, it provides 8-11x better performance than state-of-the-art designs. We have formally verified and optimized BWoS on weak memory models with a model-checking-based framework."

https://people.mpi-sws.org/~viktor/papers/osdi2023-bwos.pdf
https://www.usenix.org/conference/osdi23/presentation/wang-jiawei
https://www.youtube.com/watch?v=kQ3tRrM69UQ

---
1. ./build/_deps/stdexec-src/include/exec/__detail/__bwos_lifo_queue.hpp
   -(you will probably first need to query a library type declaration to DL the sauce files -- I did)
   -(ie, r-click a type, choose 'Go to declaration'. This triggers the file DLs in Juci++)
2. https://github.com/NVIDIA/stdexec/blob/main/include/exec/static_thread_pool.hpp
3. https://github.com/NVIDIA/stdexec/blob/main/include/exec/__detail/__bwos_lifo_queue.hpp
4. cf. exceptional ease of use in the final exec library call (ie, just werks fire-&-forget):

// Declare a pool of 3 worker threads:
exec::static_thread_pool pool(3);

( >>32632 ) >=== -add paper abstract -add/rm pixeldrain hotlink -minor edit -wrap hotlink in codeblock to patch L*nxchan's mangling of system lib file-naming within a hotlink :^)

Edited last time by Chobitsu on 10/07/2024 (Mon) 05:17:20.

Robowaifu Technician 10/08/2024 (Tue) 08:45:39 No.33909

made an ascii art generator for bmp files, couldnt figure out a spacing algorithm but tweaking the contrast or brightness was good enough for most images

#include <iostream>
#include <wchar.h>
#include <locale>
#include <stdlib.h>
#include <stdio.h>	// FILE, fopen, fread
#include <stdint.h>	// int32_t

#define XMAX 		80
#define YMAX 		50
#define BRIGHTNESS	30	// shading level
#define CONTRAST	20	// contour level

using namespace std;
struct bitmap
{
	int w, h;
	char *data;
};
class framebuffer
{	
	/* using braille unicode chars
	   dotmatrix: [4][2]
		  { 0 3
		    1 4
		    2 5
		    6 7	}	char = 16bit 
		    		value is 0x2800 + dotmatrix read as binary  */
	public:
	int width 	= XMAX;
	int hieght	= YMAX;
	int dotn[8] 	= { 0, 3, 1, 4, 2, 5, 6, 7 };
	wchar_t *buffer = (wchar_t*)malloc( width * hieght * sizeof(wchar_t) );

	void clear( void )
	{
		for ( int r=0; r < hieght; r++ )
			wmemset( &buffer[ r*width ], 0x2800, width );
	}
	void draw( void )
	{
		for ( int r=0; r < hieght; r++ )
			wprintf( L"%*.*ls\n", width,width, &buffer[ r*width ] );
	}
	void putpixel( unsigned int x, unsigned int y )
	{
		int posx = x/2;
		int posy = y/4;
		x 	 %= 2;	
		y	 %= 4;
		
		if ( posx >= width || posy >= hieght ) // out of bounds
			return;
			
		wchar_t pixel = buffer[ posy * width + posx ];
		pixel -= 0x2800;
		
		if ( pixel & ( 1 << dotn[ y*2 + x ] ) )	// already set
			return;
			
		pixel |= 1 << dotn[ y*2 + x ];
		pixel += 0x2800;
		buffer[ posy * width + posx ] = pixel;
	}
	void putbitmap( int x, int y, struct bitmap *bmp )
	{
		for ( int yc=0; yc < bmp->h; yc++ )
			for ( int xc=0; xc < bmp->w; xc++ )
				if ( bmp->data[ (yc * bmp->w) + xc ] )				
					putpixel( x + xc , y + yc );
	}
} FB;

struct bitmap *loadfile( const char *filename )
{
	FILE *fp = fopen( filename, "rb" );	// binary mode; .bmp is not text
	if ( !fp )
		wcout << L"! ERROR couldnt open " << filename << L'\n', exit(1);
	
	struct Header 
	{
		unsigned int size;
		unsigned int x;
		unsigned int y;
		unsigned short int planec;
		unsigned short int depth;
		unsigned int compression;
		unsigned int rez;
		unsigned int xbitc;
		unsigned int ybitc;
		unsigned int biClrUsed;
		unsigned int biClrImportant;  
	} __attribute__ ((packed)) Header;
	
	fseek( fp, 14, SEEK_SET );
	fread( (char*)&Header.size, 1, sizeof( struct Header ), fp );

	if ( Header.depth != 32 )
		wcout << L"! ERROR file must be a 32bit .bmp\n" << (wchar_t)0x1F620, exit(2);
		
	struct bitmap *bmp = (struct bitmap*) malloc( sizeof(struct bitmap) );
	
	// resize to framebuffer 
	float xscale	= ( Header.x > (XMAX*2) ) ? (float)Header.x / (float)(XMAX*2) : 1;
	float yscale	= ( Header.y > (YMAX*4) ) ? (float)Header.y / (float)(YMAX*4) : 1;
	bmp->w		= (float)Header.x / xscale;
	bmp->h		= (float)Header.y / yscale;
	
	bmp->data	= (char*) malloc( bmp->h * bmp->w + bmp->w );
	char *file 	= (char*) malloc( Header.x * Header.y * 4 );	// 32bit = 4 bytes per pixel
	fread( file, 1, Header.x * Header.y * 4, fp );
	int x=0, y=0;
	for ( float r=Header.y -1; r >= 0; r -= yscale, y++, x=0 )
		for ( float c=0; c < Header.x; c += xscale, x++ )
		{
			int check;
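			// check pixel for brightness
			// NOTE: cur is only 8 bits wide, so the two shifted masks below are
			// truncated away; effectively only the low byte of the pixel is tested
			// (the same applies to `near` further down)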
			int32_t pixle = ((int32_t*)file)[ (int)r * Header.x + (int)c ];
			unsigned char cur = 0;
			cur |= ( pixle & (0xff) );
			cur |= ( pixle & (0xff<<8) );
			cur |= ( pixle & (0xff<<16) );
			check = ( cur < BRIGHTNESS );
		
			 
			// check surounding pixels for contrast
			for ( int yd=-1; yd<2; yd++ ) 
			for ( int xd=-1; xd<2; xd++ ) 
			{
				int y	= (int)r + yd;
				int x 	= (int)c + xd;
				if 
				( 	( !yd && !xd ) 		   ||	// same pixel
					( y < 0 || y >= Header.y ) ||	// out of bounds
					( x < 0 || x >= Header.x )	// out of bounds
				)	continue;
				
				int32_t pixle = ((int32_t*)file)[ y * Header.x + x ];
				unsigned char near = 0;
				near |= ( pixle & (0xff) );
				near |= ( pixle & (0xff<<8) );
				near |= ( pixle & (0xff<<16) );
				if ( near > cur )
					check |= ( (near - cur) > CONTRAST );
				else
					check |= ( (cur - near) > CONTRAST );
					
			}
			
			bmp->data[ y*bmp->w + x ] = check;	
		}

	free( file );
	fclose( fp );
	return bmp;
}
int main( int argc, char **args )
{
	if ( argc < 2 ) exit(1);
	setlocale(LC_CTYPE, "");
	
	struct bitmap *pic = loadfile( args[1] );
	if ( pic )
	{
		FB.clear();
		FB.putbitmap( 0, 0, pic );
		FB.draw();
	}
	return 0;
}

Chobitsu 10/08/2024 (Tue) 14:33:14 No.33911

>>33909 POTD > saved LOL THIS IS GREAT ANON! :DD This whole arena of terminal vidya & media has been abandoned for the most part by newfags & scrubs. :^) BTW, I suggest you owe it to yourself to watch all this guy's stuff, Anon: 4'000fps+ FPS on the terminal!! https://www.youtube.com/watch?v=xW8skO7MFYw https://github.com/OneLoneCoder/CommandLineFPS >=== -fmt, minor edit -add code hotlink

Edited last time by Chobitsu on 10/08/2024 (Tue) 15:43:36.

Chobitsu 10/08/2024 (Tue) 15:52:06 No.33912

>>33909 >>33911 >related: https://pgetinker.com/ https://www.youtube.com/watch?v=AY99hF3kVH8&list=PLrOv9FMX8xJEEQFU_eAvYTsyudrewymNl https://pgetinker.com/s/OWNuhyU8IgX >=== -add'l hotlink

Edited last time by Chobitsu on 10/08/2024 (Tue) 15:56:10.

Robowaifu Technician 10/08/2024 (Tue) 22:50:05 No.33922

>>33911 nice, just realized theres already bitmap tools on linux that makes it easy to edit and make files you can just #include in your code, its actual bitmaps though, in my code is technically using bytemaps lol

Chobitsu 10/09/2024 (Wed) 02:25:53 No.33930

>>33922 >just realized theres already bitmap tools on linux Nice. Sauce, Anon?

Chobitsu 10/09/2024 (Wed) 02:30:02 No.33931

Slightly cleaned-up (suited to my tastes, anyway -- I find this easier to reason about) working version of the current nonworking example of std::execution::scheduler [1] >main.cpp

// orig. sauce (working) :  https://godbolt.org/z/146fY4Y91

#include <stdexec/execution.hpp>
// LOL, injecting into std:: namespaces...  :D
//
namespace std::execution {
using namespace stdexec;
}  // namespace std::execution

//---

#include <iostream>
#include <thread>

//---

using namespace std::execution;
using std::cout;
using std::jthread;

//---

class Single_thrd_cntx {
 public:
  Single_thrd_cntx() : thread_{[this] { loop_.run(); }} {}
  ~Single_thrd_cntx() { loop_.finish(); }

  auto get_scheduler() noexcept { return loop_.get_scheduler(); }

 private:
  Single_thrd_cntx(Single_thrd_cntx&&) = delete;  // no move ctor

  run_loop loop_{};  // DESIGN: this init needed??
  jthread  thread_;
};

int main() {
  Single_thrd_cntx ctx;

  // Note: No work actually happens here; the compute graph is simply enqueued
  auto sndr = schedule(ctx.get_scheduler()) | then([] {
                cout << "Hello world! Have an int.\n";
                return 12;
              }) |
              then([](int arg) { return arg + 42; });

  // work happens here
  auto [i] = sync_wait(sndr).value();  // what is this syntax ( `[i]` ) called?

  cout << "Back in the main thread, result is " << i << '\n';
}

>output:

Hello world! Have an int.
Back in the main thread, result is 54

>CMakeLists.txt

cmake_minimum_required(VERSION 3.25)

project(test_cppref_sched)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++20 -pthread \
       -O2 \
       -Wall -Wextra -pedantic \
       -Wno-unused-parameter"  # b/c getting this error in latest stdexec impl
)

# download CPM.cmake
file(
  DOWNLOAD
  https://github.com/cpm-cmake/CPM.cmake/releases/download/v0.40.2/CPM.cmake
  ${CMAKE_CURRENT_BINARY_DIR}/cmake/CPM.cmake
  SHOW_PROGRESS 
  EXPECTED_HASH SHA256=c8cdc32c03816538ce22781ed72964dc864b2a34a310d3b7104812a5ca2d835d
)
include(${CMAKE_CURRENT_BINARY_DIR}/cmake/CPM.cmake)

CPMAddPackage(
  NAME stdexec
  GITHUB_REPOSITORY NVIDIA/stdexec
  GIT_TAG main # This will always pull the latest code from the `main` branch.
)

add_executable(test_cppref_sched main.cpp)

target_link_libraries(test_cppref_sched STDEXEC::stdexec)

--- 1. https://en.cppreference.com/w/cpp/execution/scheduler >=== -minor edit -add output, cmakelists codeblocks

Edited last time by Chobitsu on 10/09/2024 (Wed) 03:21:44.

Robowaifu Technician 10/09/2024 (Wed) 02:42:51 No.33932

>>33930 https://linux.die.net/man/1/bmtoa its in the x11-apps package

Chobitsu 10/09/2024 (Wed) 02:56:49 No.33933

>>33932 Thanks! Frankly your power-level is greater IMO. :^)

Chobitsu 10/09/2024 (Wed) 16:32:41 No.33940

>>33931 I felt it might be helpful for some, to simplify the Scheduler statement, by moving the embedded lambdas out into named ones: >snippet:

  // note: no work actually happens here; the compute graph is simply enqueued:
  //
  auto hello_work = [] {
    cout << "Hello world! Have an int.\n";
    return 12;
  };
  auto add_work = [](int arg) { return arg + 42; };
  //
  // breaking out the two lambdas into named functions, this call is clearer IMO
  auto sndr = schedule(ctx.get_scheduler()) | then(hello_work) | then(add_work);

Other than breaking out this one compound statement into 3, everything else remains identical. Cheers. :^)

Chobitsu 10/10/2024 (Thu) 18:07:58 No.33953

>>33931 Lol, I answered my own question which was staring me in the face, if I'd just hovered over the `i` from within JuCi++ to see its deduced type. It's a C++17 structured binding. [1]

  // work happens here
  auto [i] = sync_wait(sndr).value();  // what is this syntax ( `[i]` ) called?
                                       //  -C++17 structured bindings? (ie [0th]
                                       //  position; tuple<int> ret from value()

--- 1. https://en.cppreference.com/w/cpp/language/structured_binding >=== -fmt, minor edit -add footnote/hotlink

Edited last time by Chobitsu on 10/10/2024 (Thu) 18:11:58.

Robowaifu Technician 10/11/2024 (Fri) 02:56:22 No.33965

>>33909 fixed a bunch of stuff, its doing outlines right now and its more usable, dont know if c++ has a better way of doing bit manipulation since it makes the code a mess

[code]
#include <iostream>
#include <wchar.h>
#include <locale>
#include <stdlib.h>
#include <stdio.h>	// FILE, fopen, fread, sscanf
#include <stdint.h>	// int32_t

using namespace std;
struct bitmap
{
	int w, h;
	unsigned char *bits;	// changed to actual bitmap, read using bitmasks
};
class framebuffer
{
	public:
	unsigned int width;
	unsigned int height;
	wchar_t *buffer;
	int dotn[8] 	= { 0, 3, 1, 4, 2, 5, 6, 7 };

	void init( unsigned int xmax, unsigned int ymax )
	{
		width  = xmax;
		height = ymax;
		buffer = (wchar_t*) malloc( width * height * sizeof(wchar_t) );
	}
	void clear( void )
	{
		wmemset( buffer, 0x2800, width * height );
	}
	void draw( void )
	{
		for ( unsigned int r=0; r < height; r++ )
			wprintf( L"%*.*ls\n", (int)width,(int)width, &buffer[ r*width ] );
	}
	void putpixel( unsigned int x, unsigned int y )
	{
		unsigned int posx = x/2;
		unsigned int posy = y/4;
		x 	 %= 2;
		y	 %= 4;

		if ( posx >= width || posy >= height ) // out of bounds
			return;

		wchar_t pixel = buffer[ posy * width + posx ];
		pixel -= 0x2800;

		if ( pixel & ( 1 << dotn[ y*2 + x ] ) )	// already set
			return;

		pixel |= 1 << dotn[ y*2 + x ];
		pixel += 0x2800;
		buffer[ posy * width + posx ] = pixel;
	}
	void putbitmap( int x, int y, struct bitmap *bmp )
	{
		for ( int yc=0; yc < bmp->h; yc++ )
			for ( int xc=0; xc < bmp->w/8; xc++ )
				for ( int bit=0; bit < 8; bit++ )
					if ( bmp->bits[ yc * (bmp->w/8) + xc ] & (1<<bit) )
						putpixel( x + xc*8 + bit, y + yc );
	}
} FB;

struct bitmap *loadfile( const char *filename, unsigned char brightness, unsigned char contrast, int invert )
{
	FILE *fp = fopen( filename, "rb" );	// binary mode; .bmp is not text
	if ( !fp )
		wcout << L"! ERROR couldnt open " << filename << L'\n', exit(1);

	struct Header
	{
		unsigned int size;
		unsigned int x;
		unsigned int y;
		unsigned short int planec;
		unsigned short int depth;
		unsigned int compression;
		unsigned int rez;
		unsigned int xbitc;
		unsigned int ybitc;
		unsigned int biClrUsed;
		unsigned int biClrImportant;
	} __attribute__ ((packed)) Header;

	fseek( fp, 14, SEEK_SET );
	fread( (char*)&Header.size, 1, sizeof( struct Header ), fp );
	fseek( fp, 32*4, SEEK_SET ); // XXX misalignment since my header def is wrong or something

	if ( Header.depth != 32 )
		wcout << L"! ERROR file must be a 32bit .bmp\n" << (wchar_t)0x1F620, exit(2);

	struct bitmap *bmp = (struct bitmap*) malloc( sizeof(struct bitmap) );

	// resize to framebuffer
	float xscale	= ( Header.x > (FB.width *2) ) ? (float)Header.x / (float)(FB.width*2) : 1;
	float yscale	= ( Header.y > (FB.height *4) ) ? (float)Header.y / (float)(FB.height*4) : 1;
	bmp->w		= (float)Header.x / xscale;
	bmp->h		= (float)Header.y / yscale;

	bmp->bits	= (unsigned char*) malloc( (bmp->h) * (1 + bmp->w/8) );
	char *file 	= (char*) malloc( Header.x * Header.y * 4 );	// 32bit = 4 bytes per pixel
	fread( file, 1, Header.x * Header.y * 4, fp );
	int32_t prev = 0;
	int x=0, y=0;
	for ( float r=Header.y -1; r >= 0; r -= yscale, y++, x=0 )
		for ( float c=0; c < Header.x; c += xscale, x++ )
		{
			int check = 0;
			// check pixel for brightness
			// NOTE: cur is only 8 bits wide, so the two shifted masks below are
			// truncated away; effectively only the low byte of the pixel is tested
			// (the same applies to `near` further down)
			int32_t pixel = ((int32_t*)file)[ (int)r * Header.x + (int)c ];
			unsigned char cur = 0;
			cur |= ( pixel & (0xff) );
			cur |= ( pixel & (0xff<<8) );
			cur |= ( pixel & (0xff<<16) );
			check = ( cur < brightness );
			check ^= invert;

			// check surounding pixels for contrast
			unsigned int check2 = 0;
			for ( int yd=-3; yd<4; yd++ ) // TODO test size should be based on resolution
			for ( int xd=-3; xd<4; xd++ )
			{
				int y	= (int)r + yd;
				int x 	= (int)c + xd;
				if
				( 	( !yd && !xd ) 		   ||	// same pixel
					( y < 0 || y >= Header.y ) ||	// out of bounds
					( x < 0 || x >= Header.x )	// out of bounds
				)	continue;

				int32_t pixel = ((int32_t*)file)[ y * Header.x + x ];
				unsigned char near = 0;
				near |= ( pixel & (0xff) );
				near |= ( pixel & (0xff<<8) );
				near |= ( pixel & (0xff<<16) );
				if ( near > cur )
					check2 += ( (near - cur) > contrast );
				else
					check2 += ( (cur - near) > contrast );
			}
			if ( check2 >= 16 ) // invert if in a cluster else outline
				check = 0;
			else
				check |= check2;

			int bit = x % 8;
			if ( !bit )
				bmp->bits[ y *( bmp->w/8 ) + x/8 ] = 0;
			if ( check )
				bmp->bits[ y *( bmp->w/8) + x/8 ] |= (1 << bit);
		}

	free( file );
	fclose( fp );
	return bmp;
}
int main( int argc, char **args )
{
	if ( argc < 2 ) exit(1);
	setlocale(LC_CTYPE, "");

	// defaults
	unsigned int xmax	= 40;		// terminal width
	unsigned int ymax	= 20;		// terminal height
	unsigned int brightness	= 30;		// shading level
	unsigned int contrast	= 20;		// contour level
	unsigned int xrez	= xmax * 2;	// pixel resolution
	unsigned int yrez	= ymax * 4;	// pixel resolution
	int invert		= 0;
	char *file		= NULL;

	for ( int i=1; i<argc; i++ )
	{
		char *arg = *(++args);
		if ( *arg == '-' )
			switch ( *(++arg) )
			{
				case 'r':	// set resolution in pixels
					args++; i++;
					sscanf( *(args), "%ux%u", &xrez, &yrez );
					xmax = xrez / 2;
					ymax = yrez / 4;
					break;
				case 's':	// set resolution in chars
					args++; i++;
					sscanf( *(args), "%ux%u", &xmax, &ymax );
					xrez = xmax * 2;
					yrez = ymax * 4;
					break;
				case 'b':	// set brightness level
					args++; i++;
					sscanf( *(args), "%u", &brightness );
					break;
				case 'c':	// set contrast level
					args++; i++;
					sscanf( *(args), "%u", &contrast );
					break;
				case 'i':	// invert shading
					invert = 1;
					break;
			}
		else
			file = arg;
	}
	if ( !file ) exit(1);	// no input file given

	FB.init( xmax, ymax );
	struct bitmap *pic = loadfile( file, brightness, contrast, invert );
	if ( pic )
	{
		FB.clear();
		wprintf( L"res: %ux%u\nsize: %ux%u\nb=%u c=%u\n", xrez, yrez, xmax, ymax, brightness, contrast );
		FB.putbitmap( 0, 0, pic );
		FB.draw();
	}
	return 0;
}
[/code]

Chobitsu 10/11/2024 (Fri) 04:14:46 No.33966

>>33965 Wow, that's really becoming sophisticated, Anon. Why don't you publish this on a repo now? >dont know if c++ has a better way of doing bit manipulation since it makes the code a mess Most of its original capability for this is directly derived from C, of course (including bitfields [1]). OTOH, the C++ standard library also provides std::bitset & vector<bool> [2][3], which replace hand-written masks with named operations (set, test, count, etc.); vector<bool> additionally gives you iterators and the standard algorithms. There is also a fair collection of bit manipulation functions in the numerics library as of C++20, too. [4] You might explore those spaces and see if they help you any. Regardless, very cool work, Anon! Grats. :^) --- 1. https://en.cppreference.com/w/cpp/language/bit_field 2. https://en.cppreference.com/w/cpp/utility/bitset 3. https://en.cppreference.com/w/cpp/container/vector_bool 4. https://en.cppreference.com/w/cpp/numeric#Bit_manipulation_.28since_C.2B.2B20.29 >=== -add'l footnotes -prose edit

Edited last time by Chobitsu on 10/11/2024 (Fri) 04:27:29.

muh .clang-format Chobitsu 10/15/2024 (Tue) 16:39:42 No.34006

Since Anonfiles was finally taken down a while back (and since that's where the standardized .clang-format file I use was stored to share with the Anons here on /robowaifu/ ), I'll just copypasta the thing directly here. If you want to use it too, then: A. Install clang-format (the tool) B. Copypasta this codeblock into a new file named .clang-format (don't forget the leading dot!) in the base dir where you store all your C++ / C development projects. Then just execute the command: clang-format against any files you want properly formatted to the standard. <---> update: Well, lol. My file won't fit into a single post. Probably why I didn't just post it here last year or so :D I'll find a spot to host it again and link that instead. >=== -add/rm codeblock -add 'update' msg

Edited last time by Chobitsu on 10/15/2024 (Tue) 16:50:01.

C++20 Modules Chobitsu 10/16/2024 (Wed) 16:50:52 No.34010

Figuring out how to work with C++20 Modules under GCC. This is preparatory to starting up the new C++ Learning Classroom based on PPP3 here at some point : ( >>TBD ; cf. >>19777 , >>31023 ). Since support for modules is still sketchy at best rn for the big three compilers, it can be a bit of a challenge (at least it has been for me, heh :D). But here's an example setup [1] that I got to work, and g++ auto-pre-compiled the so-called '.gcm' file, and placed it in the right spot in the build tree for me. [2][3] >main.cpp

import hello;

int main() {
  greeter("world");
}

//  https://gcc.gnu.org/wiki/cxx-modules

>hello_m.cpp

module;

#include <iostream>
#include <string_view>

export module hello;

export void greeter(std::string_view const& name) {
  std::cout << "Hello " << name << "!\n";
}

>CMakeLists.txt

cmake_minimum_required(VERSION 3.5)

project(test_gcc_wiki_modules)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++20 -Wall -Wextra -fmodules-ts")

add_executable(test_gcc_wiki_modules main.cpp hello_m.cpp)

<---> Next I'll try to get the four example files Stroustrup gives on his site for modules working with GCC. I'll post back here with any updates+caveats, etc. Cheers. :^) --- 1. https://gcc.gnu.org/wiki/cxx-modules 2. note: I had to build the project twice; the first time generates the .gcm (and gives errors about no .gcm lol), the second time finds the .gcm and builds the project correctly (and correctly thereafter too). The point being that the .gcm must be present before g++ will fully build successfully at all; but once it's there, then the process works OK. (Probably this is because main.cpp is listed ahead of hello_m.cpp, so the importer gets compiled before the module interface has produced its .gcm.) 3. update: Hmm, when compiling this example as shown here on another system with a higher version number of g++ (g++ (GCC) 14.2.1 20240910), this 'compile twice in juCi++ and things just werk' approach didn't pan out. Easy enough two-step fix, but still: A. Go to terminal in the root directory of the project, and run: g++ -std=c++20 -fmodules-ts main.cpp hello_m.cpp B. Then just move the resulting gcm.cache/ directory over into the build/ directory (instead of the project's root), and it works OK inside the IDE thereafter. TBH, this may just be an artifact of my systems. I haven't tested this stuff thoroughly yet. YMMV, Anon. Good luck. :^) >=== -minor edit -code patch -add add'l footnotes

Edited last time by Chobitsu on 10/17/2024 (Thu) 00:29:07.

Chobitsu Board owner 01/15/2025 (Wed) 20:44:30 No.35657

Well, I had hoped to start another C++ class this year to support the new PPP3 . But since I only post on Tor, I can't post files here on alogs.space at least ATM! :D. So, for the moment at least, I recommend using learncpp.com . Seems like this guy is really on top of his game with the site, it provides a pretty expansive purview of the language, and it's probably much better than I could manage on my limited time budget anyway. I hope it helps any Anons here on /robowaifu/ with learning this most-important of systems languages for our robowaifus. Cheers. :^) https://www.learncpp.com/cpp-tutorial/introduction-to-cplusplus/ >=== -minor edit

Edited last time by Chobitsu on 01/15/2025 (Wed) 20:47:55.

Chobitsu 01/15/2025 (Wed) 23:07:54 No.35665

>>35657 BTW after several years at this, I'm still using juCi++ as my daily-driver IDE, and I recommend it above any of the three the author of learncpp.com recommends (I've also used all 3 of those too, regularly). A few years back I did a bit of a tutorial on getting jucipp up and running : (cf. >>5270, ...) . Still the same basic process today. * --- * though slightly-dated. be sure to follow the jucipp repo's install instructions for your platform to pull its current build dependencies. >=== -add footnote -minor edit

Edited last time by Chobitsu on 02/13/2025 (Thu) 06:27:13.

khlor 01/30/2025 (Thu) 15:55:20 No.36279

c++ main here. please dont use this godawful language. its the least productive ive ever been in programming. about 2/3 of my time is spent on semantics that have nothing to do with what im writing. inb4 >just learn to read it knowing how to read it makes me hate it all the more, its unnecessary mental overhead better spent on other things ive written a custom dialect and template lib to replace the STL to be palatable, but it does nothing to help external code. move semantics are a nightmare, and can only be avoided by discarding half of the language. the average cmake build fills me with the same amount of disgust as seeing a butchered animal. ive had cmake work without troubleshooting maybe twice ever. >rust? eh. its a step in the right direction that doesnt go far enough, and replaces problems with new ones. and i subjectively find its syntax ugly >functional replacing a for with a map-reduce is not functional programming. even the lambda syntax is too verbose to be usable, i actually prefer x macros because their semantics are predictable, but they are still obscenely hacky the only good c++ library is a c library. if you need to bang bits, use c or zig if you want to actually finish anything useful and use high level libs like opencv, use python

Chobitsu Board owner 01/30/2025 (Thu) 20:10:59 No.36282

>>36279 >if you want to actually finish anything useful and use high level libs like opencv Ahh, but somebody has to write those "high-level libs", don't they Anon? That's a major part of our business here. This is where C++ shines (thus why OpenCV is written in it). <---> Anyway, welcome Anon. Have a look around while you're here. Cheers. :^)

Chobitsu 02/12/2025 (Wed) 20:27:08 No.36930

C++26 will incorporate officially-supported basic linear algebra (based on BLAS). https://en.cppreference.com/w/cpp/numeric/linalg This should enable standardized, fully-compatible LinAlg code across all hardware platforms going forward (with no external dependencies). Very good news for controlling robowaifu kinematics, of course! :^) >=== -add 'dependencies' cmnt
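As a rough illustration of the kind of kinematics math this would cover, here is a minimal hand-rolled sketch of forward kinematics for a planar 2-link arm. It does NOT use the C++26 <linalg> API; mat_vec(), rotation() and forward_kinematics() are hypothetical names made up for this example, standing in for the matrix-vector product a standard library routine would provide.

```cpp
#include <array>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// hand-rolled y = A * x ; hypothetical stand-in, not the std::linalg API
Vec2 mat_vec(Mat2 const& A, Vec2 const& x)
{
  return {A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1]};
}

// 2D rotation matrix for an angle given in radians
Mat2 rotation(double rad)
{
  return {{{std::cos(rad), -std::sin(rad)},
           {std::sin(rad),  std::cos(rad)}}};
}

// end-effector position of a planar 2-link arm (link lengths l1, l2;
// joint angles q1, q2 in radians, with q2 measured relative to link 1)
Vec2 forward_kinematics(double l1, double l2, double q1, double q2)
{
  Vec2 const p1 = mat_vec(rotation(q1), {l1, 0.0});       // elbow position
  Vec2 const p2 = mat_vec(rotation(q1 + q2), {l2, 0.0});  // elbow -> tip offset
  return {p1[0] + p2[0], p1[1] + p2[1]};
}
```

With both joints at zero the arm lies along the x axis, so forward_kinematics(1.0, 1.0, 0.0, 0.0) gives a tip at (2, 0).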

Edited last time by Chobitsu on 02/12/2025 (Wed) 20:47:30.

Robowaifu Technician 02/13/2025 (Thu) 06:05:43 No.36939

>>36930 >(discussion -related : >>36938, ...)

Robowaifu Technician 02/18/2025 (Tue) 22:57:17 No.37094

> (C++ compiler flags -related : >>37093 ) > (C++ safety design -related : >>37095 ) >=== -add'l crosslink

Edited last time by Chobitsu on 02/18/2025 (Tue) 23:27:48.

Chobitsu 02/19/2025 (Wed) 00:15:31 No.37096

Daily reminder Use idiomatic modern C++ for performance gains in many cases (b/c the optimizer). >update: Lol, apparently not so today (or maybe I'm just doing something silly in my tests). See these results from g++ 13 compiler : ( >>37138 ). <---> Here's a contrived example of this using find_if() , instead of old C-style loops+indexing: >find_if_optimizer.cpp

#include <algorithm>
#include <iostream>
#include <vector>

using std::cout;

struct Joint {
  bool   active = false;  // is this joint currently active?
  double angle  = 0.0;    // the current angle of this joint in degrees
};

int main()
{
  // 5-jointed limb (1st, 5th elements use designated inits [1]) [ -std=c++20 ]
  std::vector<Joint> joints{
      {.angle = 90.f}, {}, {}, {}, {.active = true, .angle = 45.f}};

  // do we have an active joint?
  //
  auto const iter =
      // idiomatic find_if; the optimizer can often make it as fast as a C-style loop
      std::find_if(joints.cbegin(), joints.cend(),
                   [](auto const& joint) { return joint.active; });
  //
  if (iter != joints.cend())  // if found, what's its angle?
    cout << "active joint found, angle: " << iter->angle << '\n';
  else
    cout << "active joint not found.\n";
}

// 1.
// https://en.cppreference.com/w/cpp/language/aggregate_initialization#Designated_initializers

>output: active joint found, angle: 45 <---> (cf. >>4402 ) >=== -fmt, minor edit -add 'update' msg+crosslink
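For comparison, here's a minimal sketch of the same search written both ways, so the two forms can be dropped into a benchmark side by side. The helper names first_active_loop() and first_active_find_if() are hypothetical (not from the post above); both return the index of the first active joint, or joints.size() if there is none.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Joint {
  bool   active = false;  // is this joint currently active?
  double angle  = 0.0;    // the current angle of this joint in degrees
};

// hypothetical helper: C-style loop + indexing
std::size_t first_active_loop(std::vector<Joint> const& joints)
{
  for (std::size_t i = 0; i < joints.size(); ++i)
    if (joints[i].active)
      return i;
  return joints.size();  // not found
}

// hypothetical helper: the same search, written with find_if
std::size_t first_active_find_if(std::vector<Joint> const& joints)
{
  auto const iter =
      std::find_if(joints.cbegin(), joints.cend(),
                   [](Joint const& joint) { return joint.active; });
  // iter == cend() when nothing matched, which maps to joints.size()
  return static_cast<std::size_t>(iter - joints.cbegin());
}
```

Both do one linear pass and stop at the first match, so any speed difference comes down to what the optimizer makes of each form on a given compiler and flag set.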

Edited last time by Chobitsu on 02/21/2025 (Fri) 23:03:17.

Chobitsu 02/19/2025 (Wed) 23:03:01 No.37116

>>37096 In this type of case, if this was a large container (say, >10'000 items), then we could also use the par_unseq execution policy tag [1] for no-fuss, optimized, native multi-threaded parallel execution against that container. Eg : >find_if_optimizer_v2.cpp snippets :

...
#include <execution>
using std::execution::par_unseq;  // [ -std=c++17 ]
...
  std::vector<Joint> joints_big(1'000'000);  // one million defaulted Joint 's
  // TODO: set an example, random joint active. eg :
  //     joints_big[897'128].active = true;
  //     joints_big[897'128].angle  = 187.3;
...
  auto const par_iter =
      std::find_if(par_unseq, joints_big.cbegin(), joints_big.cend(),
                   [](auto const& joint) { return joint.active; });
...
  if (par_iter != joints_big.cend())  // if found, what's its angle?
    cout << "active joint found, angle: " << par_iter->angle << '\n';
...

Notice that there's no fundamental change needed to the basic find_if() call: just slap the par_unseq tag inside the statement, and everything just automagically works thereafter (and can run much faster on a big container). The standard library does all the work for us. --- >note: With g++ (libstdc++), the parallel policies depend on Intel TBB, so have it installed and link with -ltbb; without TBB available the call may just run serially. >note: This doesn't work with Apple Clang! [2][3] Not sure why they still have not resolved this yet after all this time, but w/e. macOS isn't really a high-priority platform for me regardless. >tl;dr Just use g++, bro! :D (works fine on WSL too) --- 1. https://en.cppreference.com/w/cpp/algorithm/execution_policy_tag 2. https://stackoverflow.com/questions/60859395/unable-to-compile-c17-with-clang-on-mac-osx 3. https://en.cppreference.com/w/cpp/compiler_support#C.2B.2B20_library_features >=== -add'l footnote/hotlink -prose edit -add 'apple clang' note

Edited last time by Chobitsu on 02/20/2025 (Thu) 09:58:04.

Chobitsu 02/20/2025 (Thu) 17:55:50 No.37125

>Au: A C++14-compatible units library, by Aurora >A C++14-compatible physical units library with no dependencies and a single-file delivery option. Emphasis on safety, accessibility, performance, and developer experience. --- >Au (pronounced "ay yoo") is a C++ units library, by Aurora. What the <chrono> library did for time variables, Au does for all physical quantities (lengths, speeds, voltages, and so on). >Namely: >Catch unit errors at compile time, with no runtime penalty. >Make unit conversions effortless to get right. >Accelerate and improve your general developer experience. >In short: if your C++ programs handle physical quantities, Au will make you faster and more effective at your job. https://aurora-opensource.github.io/au/main/ https://github.com/aurora-opensource/au <---> >vid-related: https://www.youtube.com/watch?v=o0ck5eqpOLc >=== -fmt, prose edit -add add'l hotlink
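To show the basic idea behind "catch unit errors at compile time", here's a minimal hand-rolled sketch using two strong types for joint angles. This is NOT Au's API (see the Au docs linked above for that); Degrees, Radians, to_radians(), to_degrees() and servo_target() are all hypothetical names invented for this illustration.

```cpp
// NOTE: hand-rolled illustration only; this is NOT Au's API.
struct Degrees { double value; };
struct Radians { double value; };

constexpr double kPi = 3.14159265358979323846;

// the only ways to get from one unit to the other are these explicit functions
constexpr Radians to_radians(Degrees d) { return Radians{d.value * kPi / 180.0}; }
constexpr Degrees to_degrees(Radians r) { return Degrees{r.value * 180.0 / kPi}; }

// hypothetical servo call: accepts radians only
constexpr double servo_target(Radians r) { return r.value; }

// servo_target(Degrees{90.0});  // would not compile: no conversion Degrees -> Radians
// servo_target(90.0);           // would not compile: a bare double carries no unit
```

A call like servo_target(to_radians(Degrees{90.0})) compiles, while mixing up the units is rejected by the compiler. The wrappers hold a single double each, so there is no extra storage cost; a real units library adds arithmetic, many more units, and conversions on top of this same idea.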

Edited last time by Chobitsu on 02/20/2025 (Thu) 17:59:58.

Robowaifu Technician 02/20/2025 (Thu) 18:04:43 No.37126

>>37116 made no difference for me, i benchmarked it and a simple loop is fractionally better everytime with -O2, without optimization its like x10 worse than a loop lol, there must be a lot of overhead when using these iterator things and things you count on getting optimized away, with a loop the cpu already does the optimization for you

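#include <algorithm>
#include <cstdint>   // intptr_t
#include <cstdio>    // printf
#include <cstdlib>   // malloc, free, rand, srand
#include <ctime>     // clock, time
#include <execution>
using std::execution::par_unseq;  // [ -std=c++17 ]
#include <iostream>
#include <vector>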

using std::cout;

struct Joint {
  bool   active = false;  
  int index  = 0;
};
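struct Joint *loop_foind( struct Joint *J, int len )
{
	int i = 0;
	for ( ; i+1<len; i+=2 )	// stop before the pair would run past the end
		if ( J[i].active | J[i+1].active ) { // can check 2 in parallel 
			if ( J[i].active )
				return &J[i];
			else
				return &J[i+1];
		}
	if ( i<len && J[i].active )	// odd len: check the last element too
		return &J[i];
	return NULL;
}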

int flushcache( void )	// no cheating
{
	int doNotRemove = 0;	// must start initialized, it gets XOR'd below
	size_t len = 2048 * 1000;
	long *ptr = (long*)malloc( len * sizeof(long));
	long *ptr2 = (long*)malloc( len * sizeof(long));
	
	for ( int i =0; i<len; i++ )
		ptr[i] = rand();
	
	for ( int i =0; i<len; i++ )
		ptr2[i] = ptr[i];
		
	for ( int i =0; i<len; i++ )
		ptr[i] += ptr2[len-1-i];	// len-i would read one past the end at i=0
		
	for ( int i =0; i<len; i++ )
		doNotRemove ^= ptr[i];
		
	free( ptr );
	free( ptr2 );
	return doNotRemove;
}
int main()
{
	int ret = 11111;
	srand( time(NULL) );
	int testSize 	= rand() % 10000000;	
	int findMe 	= rand() % testSize;
	volatile clock_t start, end;	
	printf( "  testSize = %d findMe @ %d   depth = %f%%\n\n", 
		testSize, findMe, (1 - (double)(testSize-findMe)/testSize )*100 );

/* ----------
    TEST 1 
-------------*/
// _______________ init ______________
	std::vector<Joint> joints = {};
	for ( int i=0; i<testSize; i++ )
	if ( i == findMe )
  	  	joints.push_back({ .active=true, .index=i });
  	  else
  	  	joints.push_back({ .active=false, .index=i });
  	
  	ret ^= flushcache();
// _________ benchmark start _________
	start = clock();
	auto const iter =
	std::find_if(par_unseq, joints.begin(), joints.end(),
                   [](auto const& joint) {  return joint.active; });

	end = clock();
// _________ benchmark end  _________

	cout << "TEST 1: found at " << iter->index << " cycles= "  << end-start << '\n';
    
/* ----------
    TEST 2 
-------------*/
// _______________ init ______________
	int allign = sizeof(struct Joint) * testSize % 64;
	struct Joint *J = (struct Joint*) malloc( sizeof(struct Joint) * testSize + allign + 64 );

	for ( int i=0; i<testSize; i++ )
		if ( i == findMe )
			J[i] = { .active=true, .index=i };
		else
			J[i] = { .active=false, .index=i };

  	ret ^= flushcache();	
// _________ benchmark start _________
	start = clock();
	struct Joint *node = loop_foind( J, testSize );
	end = clock();
// _________ benchmark end  _________

	cout << "TEST 2: found at " << node->index << " cycles= "  << end-start << '\n';
	printf( "\n\n------- %p  %d\n", (void*)J, (int)((intptr_t)J % 64) );
	
	free( J );
	return ret;
}

it would be useful if it came with a distribution pattern to search with since a lot of times you know the distribution and its not completely random so like -->>|(like my loop_foind), -->|<--, <--|--> etc. i dont know if that made sense
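Here's a minimal sketch of one of those patterns, the <--|--> one: start from a guessed position and work outward in both directions. The name find_outward() and its signature are hypothetical, made up for this example; nothing like it ships in <algorithm>.

```cpp
#include <cstddef>
#include <vector>

// hypothetical helper: search outward from a guessed index ( <--|--> ),
// alternating right and left of the hint. returns the index of the match
// nearest the hint (ties go to the right), or items.size() if none matches.
template <typename T, typename Pred>
std::size_t find_outward(std::vector<T> const& items, std::size_t hint, Pred pred)
{
  std::size_t const n = items.size();
  if (n == 0) return n;
  if (hint >= n) hint = n - 1;  // clamp a bad guess into range

  for (std::size_t step = 0; step < n; ++step) {
    if (hint + step < n && pred(items[hint + step]))  // look right (and at the hint itself)
      return hint + step;
    if (step != 0 && step <= hint && pred(items[hint - step]))  // look left
      return hint - step;
  }
  return n;  // not found
}
```

If the guess is good this touches only a handful of elements; if the guess is bad it still degrades to one full pass. Note that jumping back and forth in memory is less prefetcher-friendly than a straight scan, so this would need measuring on real data before relying on it.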

Chobitsu 02/20/2025 (Thu) 18:18:20 No.37127

>>37125 >related: https://mpusz.github.io/mp-units/latest/ https://github.com/mpusz/mp-units --- >related: >How to Improve the Safety of C++ Code With a Quantities & Units Library https://www.youtube.com/watch?v=pPSdmrmMdjY&t=30s https://www.youtube.com/watch?v=7dExYGSOJzo https://www.youtube.com/watch?v=eUdz0WvOMm0 >=== -add related hotlinks

Edited last time by Chobitsu on 02/20/2025 (Thu) 22:32:31.

Chobitsu 02/20/2025 (Thu) 18:27:28 No.37128

>>37126 Great! Thanks for taking the time to check up on me here. It's nice to have other Anons here come forward to engage with C++ development for our robowaifus. Please contribute good C++ ideas for us all here. TWAGMI <---> I plan on contriving some actual testing benchmarks at quick-bench.com , utilizing & testing your approach given. (If you'd like to do so, please feel free.) Probably save a year or so's delay atp. :^) Cheers, Anon! :^)

Chobitsu 02/21/2025 (Fri) 21:49:55 No.37138

>>37126 So, I had some time today and wrote a slightly better example set (more realistic as a function call that validates the id, then provides a result), with 100 million elements. >tl;dr I'd say you're right, Anon. It really is kind of all over the map, and there's no clear distinction between : * C-loop style indexing * C++ find_if * C++ find_if parallel Hmm. I'm assuming my test is too simplistic really. The compiler is probably roughly-speaking optimizing all the examples to about the same machine code. Doing an honest-to-goodness set of test benchmarks seems in order at some point. Cheers. >find_get_timings.cpp

#include <algorithm>
#include <chrono>
#include <execution>
#include <iostream>
#include <optional>
#include <random>
#include <vector>

using Clock = std::chrono::steady_clock;

using std::cout;
using std::execution::par_unseq;

struct Widget {
  size_t id = 0;
  double value = 0.0;
};

//------------------------------------------------------------------------------
bool c_find_id_get_val(std::vector<Widget> const &widgets, size_t id,
                       double &value) {
  for (std::size_t i = 0; i < widgets.size(); ++i) {
    if (widgets[i].id == id) {
      value = widgets[i].value;
      return true;
    }
  }

  return false;
}

//------------------------------------------------------------------------------
std::optional<double> cpp_find_id_get_val(std::vector<Widget> const &widgets,
                                          size_t id) {
  auto const iter =
      std::find_if(widgets.cbegin(), widgets.cend(),
                   [&](auto const &widget) { return widget.id == id; });

  if (iter != widgets.cend())
    return iter->value;

  return std::nullopt;
}

//------------------------------------------------------------------------------
std::optional<double> cpp_find_id_get_val_par(
    std::vector<Widget> const &widgets, size_t id) {
  auto const iter =
      std::find_if(par_unseq, widgets.cbegin(), widgets.cend(),
                   [&](auto const &widget) { return widget.id == id; });

  if (iter != widgets.cend())
    return iter->value;

  return std::nullopt;
}

//------------------------------------------------------------------------------
int main() {
  cout << "initing...\n";
  cout.flush();

  // one hundred million defaulted Widget 's
  std::vector<Widget> widgets(100'000'000);

  // load up Widget id 's
  size_t set_id{0};
  std::for_each(widgets.begin(), widgets.end(),
                [&set_id](auto &widget) { widget.id = set_id++; });

  // set a rando Widget's value
  widgets[22'897'245].value = 777.77;

  // mix the container up
  std::shuffle(widgets.begin(), widgets.end(),
               std::mt19937{std::random_device{}()});

  //---

  double value{-1.f};
  size_t id{22'897'245};

  cout << "initing done\n";
  cout.flush();

  ////////////------------------------------------------------------------------
  // C-style loop :
  //

  /// clock this A1: (id is valid)
  auto start = Clock::now();
  c_find_id_get_val(widgets, id, value);
  auto elapsed = Clock::now() - start;
  ///
  cout << "c_find_id_get_val(" << id << ") returned : " << value << " in "
       << elapsed.count() << " ns\n";

  value = -1.f;
  id = 100'000'001;
  //
  /// clock this A2: (id is invalid)
  start = Clock::now();
  c_find_id_get_val(widgets, id, value);
  elapsed = Clock::now() - start;
  ///
  cout << "c_find_id_get_val(" << id << ") returned : " << value << " in "
       << elapsed.count() << " ns\n";

  ////////////------------------------------------------------------------------
  // C++ find_if, standard :
  //

  id = 22'897'245;
  /// clock this B1: (id is valid)
  start = Clock::now();
  value = cpp_find_id_get_val(widgets, id).value_or(-1.f);
  elapsed = Clock::now() - start;
  ///
  std::cout << "cpp_find_id_get_val(" << id << ") returned : " << value
            << " in " << elapsed.count() << " ns\n";

  id = 100'000'001;
  /// clock this B2: (id is invalid)
  start = Clock::now();
  value = cpp_find_id_get_val(widgets, id).value_or(-1.f);
  elapsed = Clock::now() - start;
  ///
  std::cout << "cpp_find_id_get_val(" << id << ") returned : " << value
            << " in " << elapsed.count() << " ns\n";

  ////////////------------------------------------------------------------------
  // C++ find_if, parallel :
  //

  id = 22'897'245;
  /// clock this C1: (id is valid)
  start = Clock::now();
  value = cpp_find_id_get_val_par(widgets, id).value_or(-1.f);
  elapsed = Clock::now() - start;
  ///
  std::cout << "cpp_find_id_get_val_par(" << id << ") returned : " << value
            << " in " << elapsed.count() << " ns\n";

  id = 100'000'001;
  /// clock this C2: (id is invalid)
  start = Clock::now();
  value = cpp_find_id_get_val_par(widgets, id).value_or(-1.f);
  elapsed = Clock::now() - start;
  ///
  std::cout << "cpp_find_id_get_val_par(" << id << ") returned : " << value
            << " in " << elapsed.count() << " ns\n";
}

>just one rando run output:

initing...
initing done
c_find_id_get_val(22897245) returned : 777.77 in 47429411 ns
c_find_id_get_val(100000001) returned : -1 in 95651640 ns
cpp_find_id_get_val(22897245) returned : 777.77 in 44374173 ns
cpp_find_id_get_val(100000001) returned : -1 in 83945474 ns
cpp_find_id_get_val_par(22897245) returned : 777.77 in 45866627 ns
cpp_find_id_get_val_par(100000001) returned : -1 in 87244314 ns

--- >update 2: After realizing I was calling the standard form of find_if for the parallel test block I patched that. Also, after several runs (in an informal context) I see that the standard find_if has a very slight edge, seemingly (at least on this particular machine/compiler config). This is compiled with -O3, BTW. >=== -add'l update msg
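Since single runs like the one above are noisy, here's a minimal sketch of a timing helper that repeats the call and reports the median. The name median_ns() is hypothetical, made up for this example. It also casts explicitly to nanoseconds, since the native tick period of steady_clock isn't guaranteed by the standard to be 1 ns.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <vector>

// hypothetical helper: run `fn` `runs` times and return the median
// wall-clock time in nanoseconds. the explicit duration_cast means the
// unit is ns regardless of steady_clock's native tick period.
template <typename Fn>
std::int64_t median_ns(Fn&& fn, int runs = 5)
{
  using Clock = std::chrono::steady_clock;
  if (runs < 1) runs = 1;  // always take at least one sample

  std::vector<std::int64_t> samples;
  for (int i = 0; i < runs; ++i) {
    auto const start = Clock::now();
    fn();
    auto const elapsed = Clock::now() - start;
    samples.push_back(
        std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count());
  }
  std::sort(samples.begin(), samples.end());
  return samples[samples.size() / 2];  // middle sample
}
```

Usage would look something like median_ns([&] { value = cpp_find_id_get_val(widgets, id).value_or(-1.f); }) with the result printed afterwards. Keep using whatever the timed call produces (as the listing above does by printing value), otherwise the optimizer is free to drop the work being measured.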

Edited last time by Chobitsu on 02/22/2025 (Sat) 02:12:54.

Chobitsu 02/22/2025 (Sat) 01:18:10 No.37141

>>37126 >i dont know if that made sense Oops, apologies for failing to respond before, Anon. Yes, that totally makes good sense. Bounding the search range to match the expected distribution (at least on a first pass) is a really good idea. Thanks! Cheers.

Robowaifu Technician 02/22/2025 (Sat) 02:44:56 No.37143

>>37138 >{size_t, double} aaaa you made it worse i think it just gets optimized as a loop anyway so there shouldnt be a difference, its not really a compiler or algorithm thing its the fact the cpu stalls waiting on ram cuz all youre really doing is reading from memory, the trick before was it was just {int16, int16} so two nodes are fetched in one read so you can do them in parallel, now its too big youre not clearing the cache in your test, everything after the first test has the advantage of having parts preloaded in the cache, change the order of the tests to see what i mean, just add the flushcache() i made in between the tests, and return the value otherwise the optimizer will just remove it, it probably needs to be bigger than i made it, check your l3 cache in lscpu and use double that
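To put rough numbers on the element-size point, here's a minimal sketch that works out how many elements of each struct fit in one cache line. The 64-byte line size is an assumption (common on x86-64; check lscpu or your CPU's docs), and per_cache_line() is a hypothetical helper made up for this example. The two structs are copied from the test programs earlier in the thread.

```cpp
#include <cstddef>

struct Joint  { bool active = false; int index = 0; };      // from the first benchmark
struct Widget { std::size_t id = 0; double value = 0.0; };  // from the second benchmark

// assumption: 64-byte cache lines (typical on x86-64; check your own CPU)
constexpr std::size_t kCacheLine = 64;

// how many whole T's fit in one cache line
template <typename T>
constexpr std::size_t per_cache_line() { return kCacheLine / sizeof(T); }
```

On a typical 64-bit Linux build sizeof(Joint) is 8 and sizeof(Widget) is 16, so a line would hold 8 Joints but only 4 Widgets, meaning a linear scan over Widgets has to pull in twice as many lines per element checked. Printing per_cache_line<Joint>() and per_cache_line<Widget>() on your own machine will confirm the actual figures.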

Chobitsu 02/22/2025 (Sat) 17:13:25 No.37151

>>37143 >aaaa you made it worse Haha, sorry Anon. :^) And actually, that was slightly-intentional, in an effort to 'complexify' the problemspace being tested by this simple harness. >its not really a compiler or algorithm thing its the fact the cpu stalls waiting on ram cuz all youre really doing is reading from memory Yeah, I can totally see that. Kinda validates my earlier claim that >"...my test is too simplistic really." >youre not clearing the cache in your test, everything after the first test has the advantage of having parts preloaded in the cache This would certainly be a valid concern in a rigorous test-harness. OTOH, I consider it a relatively negligible concern in this case. After all, the caches are quite smol in comparison to a 100M (8byte+8byte) data structure? (However, it probably does explain the 'very slight edge' mentioned earlier for the standard form of find_if [and, by extension, which doesn't occur for the more complex data-access strategy of the parallel version of it].) <---> Regardless, I think this simple testing here highlights the fact that for simple data firehose'g, the compiler will optimize away much of the distinctions between different architectural approaches possible. I don't see any need to test this further until a more-complex underlying process is involved. Cheers, Anon. >=== -prose edit

Edited last time by Chobitsu on 02/22/2025 (Sat) 17:27:02.

Robowaifu Technician 02/22/2025 (Sat) 18:17:41 No.37154

>>37151 >relatively negligible concern in this case it made a really big difference on my machine, its not just data, the instructions are also cached and theyre all the same after being optimized so its a big headstart after the first round also forgot to mention O3 doesnt really optimize it just messes up loops by going extreme with unrolling, no one uses it for that reason, its too much and has the opposite effect, declare the c function as

bool c_find_id_get_val(std::vector<Widget> const &widgets, size_t id,
                       double &value)__attribute__((optimize(2)));

if you have to use O3, when not messed up by the optimizer a loop should have less overhead just cuz theres no function calls like when calling an object

Chobitsu 02/22/2025 (Sat) 18:23:32 No.37155

>>37154 >the instructions are also cached and theyre all the same after being optimized Good point. >-O2 vs -O3 I simply went with the flag that produced the highest performance results on my machine. I tried both. But thanks for the further insights, Anon. Cheers.

GreerTech 05/30/2025 (Fri) 21:38:33 No.38844

C++ LLM usage >>38840 >>38841 >>38845 >=== -patch crosslink

Edited last time by Chobitsu on 05/30/2025 (Fri) 22:06:21.
