Over the past few months, the Sigma engineering team at Facebook has rolled out a major Haskell project: a rewrite of Sigma, an important weapon in our armory for fighting spam and malware.
Sigma has a mission-critical job, and it needs to scale: its growing workload currently sees it handling tens of millions of requests per minute.
The rewrite of Sigma in Haskell, using the Haxl library that Simon Marlow developed, has been a success. Throughput is higher than under its predecessor, and CPU usage is lower. Sweet!
Nevertheless, success brings with it surprises, and even though I haven’t worked on Sigma or Haxl, I’ve been implicated in one such surprise. To understand my accidental bit part in the show, let's begin by mentioning that Sigma uses JSON internally for various purposes. These days, the Haskell-powered Sigma uses aeson, the JSON library I wrote, to handle JSON data.
A few months ago, the Haxl rewrite of Sigma was going through an episode of crazytown, in which it would intermittently and unpredictably use huge amounts of CPU and memory. The culprit turned out to be JSON strings containing zillions of backslashes. (I have no idea why. If you’ve worked with large volumes of data for a long time, you won’t even bat an eyelash at the idea that a data store somewhere contains some really weird records.)
The team quickly mitigated the problem, and nudged me to look into it. On Sunday evening, with a glass of red wine in hand, I finally dove in to see what was wrong.
Since the Sigma developers had figured out what was causing these time and space explosions, I immediately had a test case to work with, and the results were grim: decoding a mere megabyte of continuous backslashes took over a second, consumed over a gigabyte of memory, and killed concurrency by causing the runtime system to spend almost 90% of its time in the garbage collector. Yikes!
Whatever was going on? If you look at the old implementation of aeson’s unescape function, it seems quite efficient and innocuous. It’s reasonably tightly optimized low-level Haskell.
Trouble is, unescape uses an API (a bytestring builder) that is intended for streaming a result incrementally. Unfortunately the unescape function can’t hand any data back to its caller until it has processed an entire string.
The result is as you’d expect: we build a huge chain of thunks. In this case, the thunks will eventually write data efficiently into buffers. Alas, the thunks have nobody demanding the evaluation of their contents. This chain consumes a lot (a lot!) of memory and incurs a huge amount of GC overhead (long chains of thunks are expensive). Sadness ensues.
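To make the shape of the problem concrete, here is a toy model of it. This is not aeson's actual code: the name unescapeWithBuilder, the accumulator loop, and the tiny set of escapes it handles (\\, \" and \n) are all assumptions chosen for illustration. The point is only that the Builder is appended to lazily and nothing runs it until the whole input has been walked.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Hypothetical, simplified unescaper built on a Builder accumulator.
-- It only understands backslash followed by one byte.
unescapeWithBuilder :: B.ByteString -> B.ByteString
unescapeWithBuilder = BL.toStrict . BB.toLazyByteString . go mempty . B.unpack
  where
    -- The accumulator is never forced inside the loop, so each step
    -- just stacks another unevaluated `mappend` on top of the last.
    go :: BB.Builder -> [Word8] -> BB.Builder
    go acc (92 : c : rest) = go (acc `mappend` BB.word8 (translate c)) rest
    go acc (c : rest)      = go (acc `mappend` BB.word8 c) rest
    go acc []              = acc  -- only now can the caller run the Builder

    -- 110 is 'n', 10 is newline; every other escaped byte maps to itself.
    translate :: Word8 -> Word8
    translate 110 = 10
    translate w   = w
```

On a small input this is harmless. On a megabyte of backslashes, the loop allocates one suspended append per escape before a single output byte gets written.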
The “old ways” in the title refer to the fix: in place of a fancy streaming API, I simply allocate a single big buffer and blast the bytes straight into it.
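A minimal sketch of the single-buffer idea follows. Again, this is an illustration under stated assumptions, not the code that went into aeson: the name unescapeSketch is made up, it handles only \\, \" and \n, and it relies on the fact that for these escapes the output is never longer than the input. It uses createAndTrim from Data.ByteString.Internal to allocate one buffer of the input's length, write into it directly, and trim it to the number of bytes actually written.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Internal as BI
import qualified Data.ByteString.Unsafe as BU
import Data.Word (Word8)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeByteOff)
import System.IO.Unsafe (unsafeDupablePerformIO)

-- Hypothetical, simplified unescaper that writes into one preallocated buffer.
unescapeSketch :: B.ByteString -> B.ByteString
unescapeSketch input =
    -- The IO action only fills a freshly allocated private buffer,
    -- so treating the result as pure is safe here.
    unsafeDupablePerformIO (BI.createAndTrim len (\dst -> go dst 0 0))
  where
    len = B.length input

    -- i is the read offset into the input, o the write offset into dst.
    -- The result is the number of bytes written.
    go :: Ptr Word8 -> Int -> Int -> IO Int
    go dst i o
      | i >= len = return o
      | c == 92 && i + 1 < len = do
          -- Backslash: emit the translated next byte, skip both.
          pokeByteOff dst o (translate (BU.unsafeIndex input (i + 1)))
          go dst (i + 2) (o + 1)
      | otherwise = do
          pokeByteOff dst o c
          go dst (i + 1) (o + 1)
      where
        c = BU.unsafeIndex input i

    -- 110 is 'n', 10 is newline; every other escaped byte maps to itself.
    translate :: Word8 -> Word8
    translate 110 = 10
    translate w   = w
```

Each byte is written the moment it is decoded, so there is nothing left suspended for the garbage collector to trace.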
For that pathological string with almost a megabyte of consecutive backslashes, the new implementation is 27x faster and uses 42x less memory, all for the cost of perhaps an hour of Sunday evening hacking (including a little enabling work that incidentally illustrates just how easy it is to work with monad transformers). Not bad!
Nice, Haskell FTW
who pissed in Asil’s cornflakes this morning?
Wow ! Impressive bug and an impressive fix 😀
Btw. who is this guy Asil? why so abusive? calm down man.
Nice write up, thanks. Reading about code optimization is interesting, and usually fun because I get to sit back and read instead of trying to unravel my own code for once.
Please please please write the 2nd edition of RWH.
Hi Bryan, what is your email address please? thanks vm! Ak
Mind blowing. Thank you for the post.
Nice write up, thanks.
A fraud company which hired me, saying it needed some basic typing work from me, is demanding a huge sum of money from me. It didn’t send me work at first; 10 days after hiring me it did, and its mail landed in my spam folder. The workload was for 10 days. I saw it on the 9th day. I could only complete 1 day's workload.
Now they’re saying I have caused the company a loss, so I must pay an amount greater than my due salary. Here’s why I think it is fraud: the work was to look at the text in an image and fill in the form below. It could easily be done by software, so why were they making me do it? Don’t they want to save money? I need your help. I don’t have a software degree, but if you give me in writing that it is indeed possible through software, then I will be saved and I would be grateful to you. I am a jobless person, not very qualified. And a fraud company is trying to scam me. Please help.
Well it makes sense
You have not written a word about rebase. Are you mad?
I mean your definitive guide book
no thanks
All the best
Stay Home
Perfect writing style, perfect research and attention to detail,
not only to events but to emotions felt. Thank you so much sir for your great post on the great blog.
Nice post
Amazing site
We will support you
Thank you
Thank you so much it is the best website.
Andrew is right, I think he was angry and now happy
We will never be remaining our site
Homie, never forget that old is GOLD.
I truly believe in this. Sometimes, modern ways and strategies are causing some complications.