Friday, August 15, 2014

Recursive memoization with Go

There are many odd solutions on the web to the "Web Crawler" exercise in A Tour of Go:
    https://tour.golang.org/concurrency/9
    http://soniacodes.wordpress.com/2011/10/09/a-tour-of-go-69-exercise-web-crawler/#comment-176

It's presented as a recursive function to which you must add memoization and parallelism. There are 2 main approaches:
  1. Synchronization primitives, or
  2. Callbacks
Both are needlessly complex. Here is my solution:
    https://gist.github.com/cdunn2001/a0caf94ce6c5f1da002b
 
Mine is simple because I dropped the recursion. If you see a producer/consumer pattern with recursion, you probably need synchronization primitives. But if you can drop the recursion, then the consumer can know exactly how many producers it created, and therefore exactly how many results it still has to wait for.
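
To make the idea concrete, here is a minimal sketch of the non-recursive shape. It is not the code from the gist. The names crawl, result, and fakeLinks are hypothetical, the link graph is an in-memory stand-in for real fetches, and maxDepth here means "number of link hops to follow", which may not match the exercise's own depth convention.

```go
package main

import (
	"fmt"
	"sort"
)

// fakeLinks is a hypothetical in-memory link graph standing in for real HTTP fetches.
var fakeLinks = map[string][]string{
	"a": {"b", "c"},
	"b": {"a", "d"},
	"c": {"d"},
	"d": {},
}

// result carries the links found on one page and the depth of that page.
type result struct {
	links []string
	depth int
}

// crawl fetches every URL reachable from start within maxDepth hops.
// Only the calling goroutine touches seen and pending, so no mutex is needed.
func crawl(start string, maxDepth int, fetch func(string) []string) []string {
	results := make(chan result)
	seen := map[string]bool{start: true}
	pending := 0 // number of producers whose result has not been received yet

	// launch starts one producer and counts it.
	launch := func(url string, depth int) {
		pending++
		go func() {
			results <- result{links: fetch(url), depth: depth}
		}()
	}

	launch(start, 0)
	for pending > 0 {
		r := <-results
		pending--
		if r.depth >= maxDepth {
			continue // do not follow links beyond the depth limit
		}
		for _, u := range r.links {
			if !seen[u] {
				seen[u] = true // memoize before launching, so each URL is fetched once
				launch(u, r.depth+1)
			}
		}
	}

	urls := make([]string, 0, len(seen))
	for u := range seen {
		urls = append(urls, u)
	}
	sort.Strings(urls)
	return urls
}

func main() {
	fetch := func(u string) []string { return fakeLinks[u] }
	fmt.Println(crawl("a", 2, fetch))
}
```

The loop ends when pending reaches zero, which is the whole trick: the consumer started every producer itself, so it never needs a WaitGroup or a "done" signal.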

An interesting sidebar is how to memoize in Go. A closure is probably the best way, but I didn't bother with that in my solution, to keep the diff small.
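
For the curious, a closure-based memoizer might look roughly like this. It is only a sketch: memoize and slowFetch are hypothetical names, the URL is a placeholder, and the cache is an unguarded map, so this version is only safe when called from a single goroutine.

```go
package main

import "fmt"

// memoize wraps f in a closure that remembers the result for each argument.
// The cache lives in the closure, so callers cannot see or corrupt it.
// Not safe for concurrent use: the map is not protected by any lock.
func memoize(f func(string) []string) func(string) []string {
	cache := make(map[string][]string)
	return func(key string) []string {
		if v, ok := cache[key]; ok {
			return v // cache hit: skip the underlying call
		}
		v := f(key)
		cache[key] = v
		return v
	}
}

func main() {
	calls := 0
	// slowFetch is a hypothetical stand-in for an expensive fetch.
	slowFetch := func(url string) []string {
		calls++
		return []string{url + "/a", url + "/b"}
	}

	fetch := memoize(slowFetch)
	fetch("http://example.com")
	fetch("http://example.com") // served from the cache
	fmt.Println("underlying calls:", calls)
}
```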

For another angle on the deceptive simplicity of Go channels, read this:
    http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/

2 comments:

  1. Nice one, simplest solution I have seen so far.

  2. Very nice.

    I think you found a good balance between the synchronous and asynchronous part of the code.

    Using structured synchronous programming, instead of mutexes, to handle the timing of reads and writes to the result map makes it a lot easier to reason about.

    It also limits the asynchronous code to the "fetching" of the URLs, which may finish out of order when fetched over the network.

    Nice work :-)
