When debugging or profiling Haskell code, it’s common practice to pepper it with cost centre annotations, often called SCC (for set cost centre) annotations. If you compile a program using ghc -prof -auto-all, every top-level name in every module you compile is automatically annotated with a cost centre bearing its name. Those names show up in a profile file later on, when you run your program with +RTS -P -RTS.
However, since much Haskell code tends to consist of a top-level declaration with a thicket of smaller local functions and partial applications “inside”, merely seeing top-level names in profiling output is rarely enough to identify the source of a problem with much accuracy: you get roughly the right place to look, but not a precise enough picture to actually fix it.
For example, here’s a piece of code that has a space leak.
count lines = foldl (\m i -> Map.insertWith (+) i 1 m) Map.empty lines
Doing a bare-bones profiling run over a program containing a definition like this will readily identify the count function as leaky, but which part of it is leaky?
Rather than hoist a bunch of local functions up to the top level, or pull fragments of partially applied code out and give them top-level names, it’s much easier to manually add SCC annotations.
count lines = foldl (\m i -> {-# SCC "insert" #-} Map.insertWith (+) i 1 m) Map.empty lines
Rerunning the program gives us a new entry in the profile, named insert. GHC charges to it the time spent and memory allocated while evaluating the code to the right of the SCC annotation, up to the closing right parenthesis. For more details about SCC annotations, cost centres, and profiling, see the GHC User’s Guide on Profiling.
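To try this out, here is a minimal self-contained sketch of a program built around the annotated definition. The type signature, the main function, and the choice to read lines from standard input are my assumptions for illustration; only the body of count comes from the post. Without -prof, GHC simply ignores the SCC pragma, so the module compiles either way.

```haskell
module Main (main) where

import qualified Data.Map as Map

-- Count how often each line occurs. Deliberately left leaky, exactly as in
-- the post, so that the "insert" cost centre has something to report.
-- The parameter is renamed to xs (the post calls it lines) so that it does
-- not shadow Prelude's lines, which main uses below.
count :: [String] -> Map.Map String Int
count xs = foldl (\m i -> {-# SCC "insert" #-} Map.insertWith (+) i 1 m) Map.empty xs

-- Hypothetical driver: read standard input and print one (line, count)
-- pair per distinct line.
main :: IO ()
main = do
  input <- getContents
  mapM_ print (Map.toList (count (lines input)))
```

Built with profiling enabled and run with the RTS options mentioned at the top of the post, the resulting profile should contain an insert entry alongside the automatically generated ones.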
Adding and removing SCC annotations by hand is moderately annoying, so I’ve automated the process in Emacs. This function attempts to somewhat intelligently insert an SCC annotation wherever the cursor is. The name within the annotation is empty, but the cursor is centred on it, eliminating all boilerplate keyboarding, and letting you just type the name you want to use.
(defun insert-scc-at-point ()
  "Insert an SCC annotation at point."
  (interactive)
  (if (or (looking-at "\\b\\|[ \t]\\|$")
          (and (not (bolp))
               (save-excursion
                 (forward-char -1)
                 (looking-at "\\b\\|[ \t]"))))
      (let ((space-at-point (looking-at "[ \t]")))
        (unless (and (not (bolp))
                     (save-excursion
                       (forward-char -1)
                       (looking-at "[ \t]")))
          (insert " "))
        (insert "{-# SCC \"\" #-}")
        (unless space-at-point
          (insert " "))
        (forward-char (if space-at-point -5 -6)))
    (error "Not over an area of whitespace")))
Nuking an SCC annotation can be made just as convenient.
(defun kill-scc-at-point ()
  "Kill the SCC annotation at point."
  (interactive)
  (save-excursion
    (let ((old-point (point))
          (scc "\\({-#[ \t]*SCC \"[^\"]*\"[ \t]*#-}\\)[ \t]*"))
      (while (not (or (looking-at scc) (bolp)))
        (forward-char -1))
      (if (and (looking-at scc)
               (<= (match-beginning 1) old-point)
               (> (match-end 1) old-point))
          (kill-region (match-beginning 0) (match-end 0))
        (error "No SCC at point")))))
(By the way, the annotated code above has two space leaks. The first is due to the use of foldl instead of its strict counterpart foldl', while the second is due to the use of Data.Map’s insertWith instead of its strict counterpart, insertWith'.)
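Here is a rough sketch of what plugging both leaks might look like. It rests on one assumption that goes beyond the post: as far as I know, recent releases of the containers package no longer provide insertWith', so this version uses insertWith from Data.Map.Strict, which forces the combined value before storing it. The name countStrict, the type signature, and the sample input in main are made up for illustration.

```haskell
module Main (main) where

import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Same counting logic as before, with both leaks addressed:
--   * foldl' forces the accumulated map at each step, so no chain of
--     pending insertWith calls builds up;
--   * Data.Map.Strict.insertWith evaluates (+) immediately, so no chain of
--     unevaluated additions builds up inside each value.
-- The SCC annotation is kept so the two versions can be compared in a profile.
countStrict :: [String] -> Map.Map String Int
countStrict = foldl' (\m i -> {-# SCC "insert" #-} Map.insertWith (+) i 1 m) Map.empty

-- Hypothetical sample input, just to show the shape of the result.
main :: IO ()
main = mapM_ print (Map.toList (countStrict ["a", "b", "a"]))
-- prints:
--   ("a",2)
--   ("b",1)
```

On an older containers release that still exports insertWith', the fix described above should work unchanged with a plain Data.Map import.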
Update: Josef Sveningsson pointed out that “SCC” doesn’t mean “strongly connected component” in this context, but he didn’t know what it does mean. I asked Simon Marlow, who filled me in: it means “set cost centre”.
I hate to be a besserwisser but SCC does not stand for strongly connected component in this case. I don’t know what the ‘S’ stands for (possibly Stack) but the ‘CC’ stands for Cost Center. See more at:
http://www.haskell.org/ghc/docs/latest/html/users_guide/profiling.html
Josef, I corrected my error and found out the correct expansion from Simon Marlow. Thanks!