Advent of Code Retrospective
I’m happy to say I’ve finished Advent of Code 2020.
It was quite a battle and along the way I’ve learned a few new tricks and painful lessons.
Lazy evaluation
While lazy evaluation nicely solves some problems it is also a great source of confusion when inevitably bugs arise during development. What follows is a simple sample that demonstrates how this can happen:
(defn on-evaluate [x]
(println x)
(if (= x 5)
(throw (Exception. "fail"))
(+ x 5)
)
)
(defn demo-lazy-bug
[]
(let [call-lazy-bug (map on-evaluate (range 10))]
(println "it would be nice to see a crash here")
(try
(doall call-lazy-bug)
(catch Exception e
(println "but no")
)
)
)
)
Calling demo-lazy-bug
produces the following output.
it would be nice to see a crash here
0
1
2
3
4
5
but no
The reason is that map produces what is called a lazy sequence which gets evaluated when it is actually used.
This means if I wouldn’t have added a call-lazy-bug
it wouldn’t be called at all. It all makes sense but during development it does result in quite a bit of head scratching.
To force evaluation early use either doall or dorun
Sequences and vectors
There were quite a few problems in AOC where I struggled with speed of execution. Initially I stored everything in (lazy) sequences by default. Clojure standard library also has plenty of functions that return lazy sequences even when used on evaluated lists, maps etc. so it pays to be careful when composing functions.
While sequences are easy to use and feel idiomatic using them can easily result in severe performance bottlenecks
when used indiscriminately. Here’s a short demo where I just continuously split and merge a list of numbers.
Note that in sequence-shuffler
I use doall
to force evaluation. On any list of significant size omitting doall
causes a stack overflow error for this particular benchmark.
(defn sequence-shuffler [n iterations]
(loop [some-sequence (range n)
steps 0]
(if (>= steps iterations)
some-sequence
(recur (doall (concat (drop (/ n 4) some-sequence)
(take (/ n 4) some-sequence))) (inc steps))
)
)
)
(defn vector-shuffler [n iterations]
(loop [some-vector (into [] (range n))
steps 0]
(if (>= steps iterations)
some-vector
(recur (into (subvec some-vector (/ n 4) n)
(subvec some-vector 0 (/ n 4))) (inc steps))
)
)
)
(defn sequence-vector-benchmark []
(let [all-ns [1e1 1e2 1e3 1e4 1e5]
steps 1e3]
(loop [remaining-ns all-ns]
(when (not (empty? remaining-ns))
(println "n:" (first remaining-ns) "steps:" steps)
(time (sequence-shuffler (first remaining-ns) steps))
(time (vector-shuffler (first remaining-ns) steps))
(recur (rest remaining-ns))
)
)
)
)
For this particular benchmark vector-shuffler
works faster as soon as n
exceeds 100.
There are two reasons why performance is much better:
- subvec is fast (O(1))
- into works faster than concat for vectors
(from testing both ways of merging vectors on several aoc days)
n | steps | sequence-shuffler [ms] | vector-shuffler [ms] |
---|---|---|---|
10 | 1000 | 8.363016 | 7.38162 |
100 | 1000 | 21.158169 | 5.998396 |
1000 | 1000 | 208.560073 | 43.798472 |
10000 | 1000 | 1889.878375 | 349.64432 |
100000 | 1000 | 20178.565774 | 4021.776388 |
While this is only one particular case the real lesson here is to always use an appropriate data structure for the problem at hand.
Last but not least
- simplifying problem statement makes it easier to hit the right solution earlier - pen and paper helps here
- using loop/recur rather than recursion makes it less likely to encounter stack overflows
- extract functions early before code becomes a mess of round brackets