My take on Haskell on Advent 2017 Maze challenge (response to /u/chrisdoner’s post)

My take on Haskell on Advent 2017 Maze challenge (response to /u/chrisdoner’s post)

Looking at Chris Done's implementation of the challenge, the main complaint was the sheer complexity and magic involved. Looking closer, it turns out none of that comes from the actual program logic; the loop is fairly simple and performant. All the unidiomatic stuff was in keeping the parse time down.

But the parser isn't even the slow part of the program! It's measured in microseconds, while the main loop is measured in milliseconds. If we sacrifice a few dozen microseconds, we can simplify the parser and make the code much more idiomatic. Data.ByteString.Char8 already has a readInt function that's plenty fast, and it's much easier to just use vector's unfoldr and thaw than to write a complicated custom loop.

reading file into content took 98.12 µs parsing 1057 lines took 73.93 µs 25608480, is the answer, it took 52.40 ms 

One small note, I'm using drop 1 to drop the \n character. I didn't test this, but I assume dropWhile (=='\n') would have been negligibly slower, and probably a bit more correct.

Another complaint was that the Rust code was reading into UTF8, which requires more overhead than simply reading a ByteString. We can do the same with Text in Haskell. The parse time becomes much worse, but again, it's still negligible compared to the main loop.

reading file into content took 86.94 µs parsing 1057 lines took 191.01 µs 25608480, is the answer, it took 46.52 ms 

Well if we're throwing parse time to the wind, what's the most idiomatic way to write fast parsers in Haskell? Parser combinator libraries! Particularly Attoparsec. All that parsing code can be replaced with 3 lines of idiomatic Attoparsec, with both Text and ByteString versions. It's more than 10x slower than the Rust parser, so I'd recommend the idiomatic Data.ByteString.Char8 parsing, but this is also using a parsing framework that's far more powerful (for instance, it handles line endings much more correctly).


reading file into content took 76.73 µs parsing 1057 lines took 430.43 µs 25608480, is the answer, it took 51.04 ms 


reading file into content took 126.07 µs parsing 1057 lines took 551.06 µs 25608480, is the answer, it took 52.21 ms 

Next are the GHC and RTS options.

  • All trials used -O2 -threaded -rtsopts.
  • Using the -N RTS option causes file reading to slow by about 20% (maybe some kind of file locking?), and has no effect on the non-attoparsec parsers. But for some reason, the attoparsec parsers get a whopping 100x slower with -N. I really have no idea why this would be, and I haven't really bothered to find out.
  • I've actually been keeping a secret: I've been using -fllvm for all of this. It has no effect on file reading and parsing, but makes the main loop almost twice as fast in all cases, usually surpassing the Rust code by a thin margin.

In conclusion, it's not hard to write fast Haskell that's also idiomatic! If parse time doesn't actually matter, don't break your back micro-optimizing it and making your code wildly unidiomatic. The fastest parser I presented is a few times slower than the Rust parser, but I think this is ok (especially since we more than make up for it in the faster file-reading time, which I assume is due to Haskell's crazy fast allocations). The main loop performs wonderfully. All in all, it performs quite well and I'd say it's all reasonably idiomatic Haskell. The only real "trick" is to use an unboxed mutable vector, but that's fairly commonplace in mutable algorithms.

P.S. Please please please test your benchmarks with LLVM! It's very often faster, and I think it'd be good to know if we should be striving to make it the default.

Submitted December 16, 2017 at 10:57AM by ElvishJerricco
via reddit


Parsing Haskell with Parsec

Parsing Haskell with Parsec

I'm wondering how difficult it would be to parse Haskell with Parsec, or whether anyone has already done it.. The use case is being able to interpret simple-ish Haskell from code running in the browser (compiled with ghcjs). 'Hint' won't work of course as it's not purely Haskell. I suppose the problem comes down to representing enough of Haskell's type system and precedence rules to get by?

Submitted December 15, 2017 at 06:30PM by yaxu
via reddit