Sunday, February 15, 2015

Google C++ Style Guide

  • (Titus Winters at CppCon)
I am a big fan of this, mainly because I like to see side-effects clearly identified. There is an amazing amount of resistance. I don't think that the opposition can be moved, but maybe this reddit comment will at least convince managers of its value:

ericanderton 20 points  
My $0.02:
I've done a lot of work using this style guide religiously, largely because it was what my team came together over. It was also the most cogent guide we could find online that was more than merely prescriptive. It's exactness was the overall deciding factor.
I hated using it at first. Overall, the guide is very regressive, and chops the legs off of C++ such that it's not much more than "C with namespaces and classes." You could even go as far as to say that most arguments in the guide reduce to "this isn't a problem in Java or Python because you can't make this mistake, so don't use feature X here at all." Ultimately, it keeps you from doing anything that would allow bugs to creep in by mistake or by misinterpretation, by keeping it all nice and simple. And this is where this style guide actually helps.
The problem is that C++ provides almost too much leverage. Left unchecked, developers will likely use the language to its limits, which inevitably will confound other members of the team. While the program may be a masterpiece of template code, move semantics, and other concepts, it's now unmaintainable by anyone but the original author. Business wise, it's vastly preferable to have an uninspired piece of software if it means you can fix bugs while half your staff has the flu.
Also, consider that the goals of the Google C++ style guide align incredibly well with Go. To me, this is a very salient case against using C++ if that guide is at all a good fit for new development. Go has a very tight language spec, and a (IMO) superior concurrency model that is easier to construct, reason about, and debug. And it's still a compiled language.
Anyway, I'm proud to say that I have delivered excellent results, and relatively bug free code using this guide. If left to my own devices I probably would have used too much template magic and other mechanics that, while are all valid C++, would be harder to debug and understand by other members of my team. The resulting codebase is boring as all hell to read, but stable, reliable, and works incredibly well.
The downside is that compared to conventional languages, the result still takes a long time to compile, is twice as much code as is needed (header files), and relies heavily on smart pointers to manage memory (may as well use a GC). Again, this is why I mentioned using Go earlier.
One last thing: this guide does not stress the importance of "const correctness" in class construction. Add that to your work and you'll really have some solid code to rely upon.
tl;dr: For new development, either forgo this guide completely, or just use Go. Otherwise, you'll just piss off experienced C++ developers by using this thing.
Edit: I forgot to mention that the GSG has a massive blind spot for exception safety. Just because your code doesn't use exceptions, doesn't mean that the libraries you use don't throw. This includes the STL; the guide should steer you away from throwing variants of STL functions/methods, but it doesn't. So be on the lookout for coders not throwing try/catch around third party code, and refusing to use basic RAII (also not mentioned in the guide) to make things airtight. Either that, or just except that every exception-based runtime fault should halt the program, and that it's an acceptable failure mode (probably not).
For a longer discussion, see:


Wednesday, January 7, 2015

C++: STL Iterator confusion

Recently, folks at my last company got confused with STL iterators.
  1. They used boost::python to create an InputIterator, using stl_iterator.
  2. They used is_sorted() to check whether the underlying list was sorted (of course).
  3. They then looped over their iterator.
That was a mistake. An InputIterator is single-pass, so someone obviously forgot to read the documentation.

But this highlights a problem with STL. is_sorted() actually requires a ForwardIterator. At that link, you can see that a ForwardIterator might be "mutable", meaning that it can be written into. But is_sorted() takes a copy of an iterator, and it certainly does not write into it. There can be no issue of side-effects, right?

Wrong. You can pass an InputIterator if you want, at your own risk. is_sorted() might still return the correct result, but the original iterator (which was copied for the call to is_sorted()) might now be invalid.

So the question became whether it was safe to pass an Inputerator to is_sorted(). In general, the answer is "No". Look at representative source code:

template <class ForwardIterator>
  bool is_sorted (ForwardIterator first, ForwardIterator last)
  if (first==last) return true;
  ForwardIterator next = first;
  while (++next!=last) {
    if (*next<*first)     // or, if (comp(*next,*first)) for version (2)
      return false;
  return true;

That function requires a copy of the original "first" iterator. "first" and "next" must both be valid during the entire function, and that is not guaranteed for an InputIterator.

However, the boost::python InputIterator is actually a memoized iterator. If you increment it, then the next value for any copy will no longer be the value after what the copy points to. But the copy remains valid for reference. So is_sorted() is fine, as long as no further iteration is performed later.

The problem is that STL is missing a concept between ForwardIterator and InputIterator. So is_sorted() is given the more restrictive designation.

Another way to look at this is that C++ templates are insufficient for practical iterators. The following is closer to the function we really want:

template <class ForwardIterator>
  bool is_sorted (ForwardIterator first, ForwardIterator last)
  if (first==last) return true;
  ForwardIterator next = first;
  auto prev_value = *first;
  while (++next!=last) {
    auto next_value = *next;
    if (next_value<prev_value);
      return false;
    prev_value = next_value;;
  return true;

But that has problems too. Suddenly, the underlying type must be copy-constructible, and those copies could be expensive. We could use a pointer, but a pointer to what? If we don't make a copy, then there is nothing to point to.

We need an extra version of this algorithm which can safely accept InputIterators, with the implication that there will be copies of the underlying data. But it would need a different function name, and it would confuse everyone.

What we really need is the concept of "Iterator with invariant current element (unless underlying container is modified) but possibly variant subsequent Iterator". Lots of iterators actually work that way. But it would probably confuse people even more.

Tuesday, September 16, 2014

Reactive Programming

This is hard for me to explain yet, but here are helpful links:

Basically, state is passed down, and events go up. Component A creates Component B, passing parts of A into B as immutable properties, and setting call-backs on B to call into A. Most components are re-created repeatedly.

  • Data-flow is clear and consistent.
  • Mutable state is isloated and minimal.
  • The entire architecture is highly scalable.

Friday, September 12, 2014

Git: Rebase, then merge (--no-ff)

In case you think you should always either rebase or merge, check out this reddit post.

And this blog shows how to rebase a stale GitHub "pull request" (as opposed to using the "Merge Pull Request" button).

C++: copy elision

The C++ standard permits the compiler to skip (or "elide") a constructor under some circumstances, e.g. when a temporary is passed into a function by value. There is plenty of information on the web about this subject, but here is a concrete example.

What's interesting is that "copy elision" can actually alter semantics:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects.
Anyway, because of this feature, it can be wise to let a method accept an input by value rather than by const-reference, e.g.
table[key] = value;
The pattern should be considered whenever the method would have created an explicit copy, as for swapping.

Friday, August 15, 2014

Recursive memoization with Go

There are many odd solutions on the web to the "Web Crawler" exercise in the GoLang tutorial:

It's presented as a recursive function which requires memoization and parallelism. There are 2 main approaches:
  1. Synchronization primitives, or
  2. Callbacks
Both are needlessly complex. Here is my solution:
Mine is simple because I dropped the recursion. If you see a producer/consumer pattern with recursion, you probably need synchronization primitives. But if you can drop the recursion, then the consumer can know exactly how many producers he created.

An interesting sidebar is how to memoize in Go.  A closure is probably the best way, but I didn't bother with that in my solution, to minimize diffs.

Friday, July 4, 2014

Storing dotfiles in Git: symlink + worktree

I use 3 tricks:

  1. core.worktree = /Users/cdunn
  2. symlink .git -> dotfiles/.git/
  3. In .gitignore: *

The beauty of my approach compared to many others is that regular git commands work directly in my home-directory. It's easier to explain with an example:
To add new ones, pass the '-f' flag to 'git add'.