Developing a set of helper functions for data exploration


Developing a set of helper functions for data exploration
http://ift.tt/2wJSBOx

Submitted September 25, 2017 at 12:14AM by theocean154
via reddit http://ift.tt/2fqHeDT

Advertisements

Newbie question: Speeding up code and parallelism


Newbie question: Speeding up code and parallelism

First, I understand that an absolute newbie asking about speeding up code and parallelism is more than little ironic. I am working through "Learn you a Haskell" while trying to decide whether Haskell is right for me. I am an academic economist, and I am trying to learn some Haskell to see if I could be more productive with it than with Fortran. Productive here means a combination of calculation time and debugging/coding, not just speed of computation.

To play with Haskell vs. Fortran I wrote a toy program to calculate primes. I am not interested in a faster algorithm to calculate primes, I want to keep them comparable in their naiveness towards the mathematical algorithm. I am asking if there is something in my Haskell programs that could dramatically make it faster. My Fortran program is a very simple combination of do-loops, not posting it here. To print out all primes among the first 1 million integers in my laptop the Fortran program executes in 0.7 seconds.

My first attempt in Haskell was the following:

isprime :: Integer -> Bool isprime n = [(n,y) | y<-[2..div n 2], mod n y == 0, y*y<n] == [] main = do let aout = [ x | x<-[2..10^6], isprime x] putStr . unlines $ map show aout 

This was terrible slow, took several minutes to complete.

My second attempt was significantly better:

primetest :: Integer-> Integer -> Bool primetest n m = if m*m>n then True else if mod n m == 0 then False else primetest n (m+1) isprime :: Integer -> Bool isprime n=primetest n 2 main = do let aout = [ x | x<-[2..10^6], isprime x] putStr . unlines $ map show aout 

This runs in 4.9 seconds, when compiled with -O2 option. So 5 times slower than Fortran.

So the questions:

1) Is there anything obvious that I could do in the second version without sacrificing readability to make it faster (still more 5 times slower than Fortran). I can actually live with five times slower, so this is a secondary concern.

2) How do I parallelize that the evaluation (in this case the "let aout" line)? Regardless of the language, I will be running my actual project on a supercomputer cluster with multiple CPUs, and so parallel processing is a must. In Fortran and openmp it is a simple matter of just marking the loop you want to parallelize. I have tried to read Haskell parallelizing documentation/webpages, and my hunch is that "parlist" might be what I need, but I really can't figure out it based on documentation. Does anyone have a simple example on how to parallelize a list construction, or another comparable strategy?

I hope posting a Newbie question is ok.

Submitted September 24, 2017 at 10:46PM by nongaussian
via reddit http://ift.tt/2y1wHH1