Code Style & Taste
A blog where I share thoughts on code and practices

The best C++ is std-less C++

There's nothing wrong with using the standard library. There are plenty of situations where you want a simple implementation and don't want to write it yourself. Not many people would like to write their own array container or implement a sort function that works correctly with strings. It can be a lot of work. You can, however, get a lot of mileage from writing your own standard library, and below I'll talk about some of the advantages.

First, let's cover having fewer headers in every file. Including a header can cost a noticeable fraction of a second. Let's take a moment to think about this: should each of the below take 10's of milliseconds, 100's, or 1000's to compile?

  1. Printing Hello World
  2. Including a few popular headers. vector, unordered_map, unordered_set and iostream
  3. A small project that has roughly 8k lines of code across 6 source files and a few headers

Take a moment, the results are below.

It was a trick question. I listed them from slowest to fastest.

$ time (echo -e '#include <print>\nint main() { std::print("Hello"); }' | clang++ -std=c++23 -g -x c++ - )

real    0m1.222s
user    0m1.126s
sys     0m0.088s

Yes, that's real. The print header takes over a second to process.

Here's the 4 header test

#include <vector>
#include <unordered_map>
#include <unordered_set>
#include <iostream>
int main() {}
$ time clang++ -std=c++23 -g headers.cpp

real    0m1.138s
user    0m1.048s
sys     0m0.082s

Those headers are barely faster. This site lists how long a header may take to include. At the moment it doesn't have <print>

Finally, the 8k lined project takes <800ms

real    0m0.772s
user    0m2.313s
sys     0m0.207s

It does not use the std lib. It does have a header that implements a dynamic array, hashmap, hashset, and print, which is why I chose those 4 headers. Before I explain the numbers I want to show a hello world program so you can see it doesn't lack features. The example prints "Hello " followed by a struct, which works once the struct has a printer function. The printer function works with the various print functions (print, printerr, sprint, and anything that inherits IStreamWriter). This example prints "Hello 31".

#include "smm.h" // Standard minus minus, get it?
using namespace smm;

struct MyStruct { int a; };

void smm_format_print(const MyStruct& s, IStreamWriter&w) {
	w.print(s.a + 30);
}

int main() {
	MyStruct s{1};
	print("Hello ", s, NL);
}

This takes 378 milliseconds to compile. That's roughly 750ms faster than the header test. The time difference might not be a big deal, but when every file takes <1s to compile and <1s to link it feels incredibly fast. If you do the math, 6 files at ~400ms is ~2.4s, which matches the 'user' time above.

If you have never seen the time command in bash it may be confusing. 'real' is wall-clock time, while 'user' and 'sys' are the CPU time spent in userspace and in the kernel, summed across every core. That's why 'user' can be larger than 'real'. Most C++ projects use more than one core to build, so I measured a multi-core build.

The 8k lined project being significantly faster than the hello world program is crazy to me. You're not wrong if you chose 10's of milliseconds. That's true for printf, but not for iostream.


$ time (echo -e '#include <cstdio>\nint main() { printf("Hello"); }' | clang++ -std=c++23 -g -x c++ - )

real    0m0.075s
user    0m0.047s
sys     0m0.028s

$ time (echo -e '#include <iostream>\nint main() { std::cout << "Hello World!"; }' | clang++ -std=c++23 -g -x c++ - )

real    0m1.041s
user    0m0.950s
sys     0m0.084s
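Going back to the smm hello world earlier: the article doesn't show how smm wires a printer function into print, so here is a minimal sketch of one way such a hook could work. Everything in it is an assumption made for illustration: the `sketch` namespace, the `sketch_format_print` hook name, the `write` method, and the `StringWriter` class are hypothetical and are not smm's API. The idea is that the writer makes an unqualified call to the hook, so overloads for user types are picked up through argument-dependent lookup, and only <cstdio>, <cstring> and <string> need to be parsed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

namespace sketch {

// Hypothetical sink interface. The article's IStreamWriter is not shown,
// so this shape is an assumption.
struct IStreamWriter {
    virtual ~IStreamWriter() = default;
    virtual void write(const char* data, std::size_t len) = 0;

    // Defined below, once the built-in printers are visible.
    template <class T>
    void print(const T& value);
};

// Built-in printers for a couple of basic types.
inline void sketch_format_print(const char* s, IStreamWriter& w) {
    w.write(s, std::strlen(s));
}
inline void sketch_format_print(int v, IStreamWriter& w) {
    char buf[16];
    int n = std::snprintf(buf, sizeof buf, "%d", v);
    if (n > 0) w.write(buf, static_cast<std::size_t>(n));
}

// Unqualified call: overloads for user types are found through
// argument-dependent lookup when the template is instantiated.
template <class T>
void IStreamWriter::print(const T& value) {
    sketch_format_print(value, *this);
}

// A writer that appends to a std::string, standing in for sprint.
struct StringWriter : IStreamWriter {
    std::string out;
    void write(const char* data, std::size_t len) override { out.append(data, len); }
};

template <class... Ts>
std::string sprint(const Ts&... values) {
    StringWriter w;
    (w.print(values), ...);  // fold expression: print each argument in order
    return w.out;
}

}  // namespace sketch

struct MyStruct { int a; };

// User-provided hook, found by ADL because MyStruct lives in this namespace.
void sketch_format_print(const MyStruct& s, sketch::IStreamWriter& w) {
    w.print(s.a + 30);
}
```

With these definitions, `sketch::sprint("Hello ", MyStruct{1}, "\n")` returns the string "Hello 31\n". A file or stderr writer would only need to override `write`.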

Less Copying

There's nothing wrong with the code below, but if you're using C++ for performance, do you want to copy arrays by accident?

struct SomeStruct {
	int flags;
	std::vector<int> data;
};
SomeStruct getStruct() { return {}; } // Not a copy
SomeStruct getStructV2(const SomeStruct&s) { return s; }

void func() {
	SomeStruct a = getStruct();
	SomeStruct b = getStructV2(a); // copies here
}

In my smm library, you can choose to disallow array copies (via a macro). Doing so causes a compile error in getStructV2 (no copy constructor or assignment operator). Copying by accident may make a workload significantly slower than expected. I typically only use C++ on projects I want to optimize, so I don't want accidental copies. In the smm library I can also disable copying on hashmap and hashset, and disable blank inserts.
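To make the idea concrete, here is a minimal sketch of a dynamic array whose copy operations can be switched off with a macro. The macro name `SKETCH_ARRAY_NO_COPY`, the `clone()` method, and the rest of the `Array` interface are assumptions for illustration; the article does not show how smm spells any of this. Moves stay cheap and implicit, while a deep copy has to be requested by name.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>
#include <utility>

// Hypothetical switch; the real smm macro name is not given in the article.
#define SKETCH_ARRAY_NO_COPY 1

// Minimal dynamic array, limited to trivially copyable elements to stay short.
template <class T>
class Array {
    static_assert(std::is_trivially_copyable_v<T>, "sketch only handles trivial types");
    T* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t cap_ = 0;

public:
    Array() = default;
    ~Array() { std::free(data_); }

    // Moves are always allowed: they just steal the buffer.
    Array(Array&& o) noexcept
        : data_(std::exchange(o.data_, nullptr)),
          size_(std::exchange(o.size_, 0)),
          cap_(std::exchange(o.cap_, 0)) {}
    Array& operator=(Array&& o) noexcept {
        if (this != &o) {
            std::free(data_);
            data_ = std::exchange(o.data_, nullptr);
            size_ = std::exchange(o.size_, 0);
            cap_ = std::exchange(o.cap_, 0);
        }
        return *this;
    }

#if SKETCH_ARRAY_NO_COPY
    Array(const Array&) = delete;             // accidental copies become compile errors
    Array& operator=(const Array&) = delete;
#else
    Array(const Array& o) : Array(o.clone()) {}
    Array& operator=(const Array& o) { return *this = o.clone(); }
#endif

    // Explicit deep copy that is easy to search for.
    Array clone() const {
        Array c;
        c.reserve(size_);
        if (size_) std::memcpy(c.data_, data_, size_ * sizeof(T));
        c.size_ = size_;
        return c;
    }

    void reserve(std::size_t n) {
        if (n <= cap_) return;
        void* p = std::realloc(data_, n * sizeof(T));
        if (!p) std::abort();
        data_ = static_cast<T*>(p);
        cap_ = n;
    }
    void push(const T& v) {
        if (size_ == cap_) reserve(cap_ ? cap_ * 2 : 8);
        data_[size_++] = v;
    }
    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const { return data_[i]; }
};

struct SomeStruct {
    int flags = 0;
    Array<int> data;
};

// "return s;" would not compile here because the copy constructor is deleted.
// The copy has to be spelled out instead.
SomeStruct getStructV2(const SomeStruct& s) {
    return {s.flags, s.data.clone()};
}

static_assert(!std::is_copy_constructible_v<SomeStruct>);
static_assert(std::is_move_constructible_v<SomeStruct>);
```

The design choice is that the expensive operation is visible at the call site. Setting the macro to 0 brings back ordinary copy semantics for code that wants them.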

Less Error Prone

In some places, I use string transforms that don't mutate the string, usually basename and dirname. Sometimes I mix up what a function returns, or I change it hours to days later. Maybe my getDefauleFilename originally returned a slice pointing to a string literal, then I changed it to return a string because a user config may override the default filepath. In the code below, the string that getDefauleFilename creates is destroyed before the text is printed. If you run it with asan you'll get a stack-use-after-scope report. Sanitizers only report problems at runtime, so if you never hit the problematic path you won't know there's a problem.

#include <string>
#include <iostream>
using namespace std;
string getDefauleFilename() { return "Hello"; } 
string_view firstFive(string_view s) { return s.substr(0, 5); }
int main() {
	string_view sz = firstFive(getDefauleFilename());
	cout << sz << endl;
}

Don't take this next implementation too seriously; a real implementation would likely store a length and have more functions. It's here to demonstrate that the above can become a compile error. For this article you can skip the code below, but you may want to read the two lines with a comment.

#include <cstdlib>  // free
#include <cstring>  // strdup, strlen
#include <iostream>
using namespace std;
struct StringSlice {
	const char*ptr;
	size_t len;
	StringSlice substr(int i, size_t n) { return {ptr+i, n}; }
};
class MyString {
	char*sz;
public:
	MyString(const char*sz) { this->sz = strdup(sz); }
	~MyString() { free(sz); }
	operator StringSlice() & // the & is where the magic happens
		{ return {sz, strlen(sz)}; }
	StringSlice reallyMakeASlice() { return {sz, strlen(sz)}; }
};

MyString getDefauleFilename() { return "Hello"; } 
StringSlice firstFive(StringSlice s) { return s.substr(0, 5); }
int main() {
	StringSlice sz = firstFive(getDefauleFilename()); // you can use .reallyMakeASlice()
	cout << sz.ptr << endl;
}

In my own lib I have an escape hatch because sometimes I want to print text and I know the reference won't be used after the object is destroyed. If you use reallyMakeASlice() in the main above, it will compile and asan will report an error. However, the below is fine because the temporary lives until the end of the full expression, which is after the text is printed:

cout << firstFive(getDefauleFilename().reallyMakeASlice()).ptr << endl;
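The effect of the & qualifier can also be checked at compile time. The sketch below re-declares trimmed-down versions of StringSlice and MyString so that it stands alone; the deleted copy operations and the malloc-based constructor are my additions to keep the sketch safe and portable, not part of the example above. It uses std::is_convertible_v from <type_traits> to ask whether an lvalue and an rvalue MyString convert implicitly to StringSlice.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

struct StringSlice {
    const char* ptr;
    std::size_t len;
};

class MyString {
    char* sz;

public:
    MyString(const char* s) {
        std::size_t n = std::strlen(s) + 1;  // include the terminating NUL
        sz = static_cast<char*>(std::malloc(n));
        if (!sz) std::abort();
        std::memcpy(sz, s, n);
    }
    // Added for this sketch so the owning pointer is never freed twice.
    MyString(const MyString&) = delete;
    MyString& operator=(const MyString&) = delete;
    ~MyString() { std::free(sz); }

    // Lvalue-only conversion: callable on named objects, not on temporaries.
    operator StringSlice() & { return {sz, std::strlen(sz)}; }
};

// A named (lvalue) MyString converts implicitly to a slice...
static_assert(std::is_convertible_v<MyString&, StringSlice>);
// ...but a temporary (rvalue) does not, which is what rejects
// firstFive(getDefauleFilename()) at compile time.
static_assert(!std::is_convertible_v<MyString, StringSlice>);
```

Because the conversion operator is also non-const, a const MyString will not convert either. A fuller implementation would probably add a `const &` overload.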

My Favorite Reason

Is that I can customize the lib as much as I want. There's something powerful about being able to customize the code you call all the time. Some things I did with my 'Array' class (std::vector alternative) were

It's nothing you couldn't do before, but it makes code easier to read.

Drawbacks

Sometimes you're in the flow and you want to do something that isn't implemented. Sometimes it's minor and won't break your flow. Other times it'll take hours to implement, and your choice is to implement it or change your code to use a standard object. It can be a real pain, but I found it happens less frequently as the project grows.

Another problem shows up whenever you do anything with templates. If you pass in a type that causes a problem, the error message can be long. I get annoyed when I need to scroll to read the message properly. Writing a requires clause can help. I usually write them, but writing them well is a separate skill.

Try it

You should try it by implementing no more than 3 classes at a time and using them in projects. You'll find yourself constantly tweaking, extending, or rewriting them. You may want to avoid templates unless you understand traits and requires clauses. I found that wrapping or inheriting from an existing class has a high annoyance-to-learning ratio, so I don't recommend wrapping anything. If you try to replace the standard library you'll need to write non-portable code, which might not be fun if you're not up for it.

Have Fun!