Erlang, Ruby and PHP battle it out!

Update: here’s the followup!

Yesterday I attended PHP Day, in the beautiful city of Verona, and while going there by train, together with my friend Federico Feroldi, I told him that in a test I made a week ago Erlang was much slower than ruby in finding pythagorean triplets.

That was a partial lie, since the ruby implementation was somewhat optimized, while the erlang implementation wasn’t at all. So we decided to rewrite both programs using the same optimized algorithm and then test their speeds.

This is the ruby implementation of a method that counts the number of pythagorean triplets up to N:

def pythag(n)
  result = []
  (2..n).each do |c| 
    (1...c).each do |b|
      a = Math.sqrt(c*c - b*b)
      result << [a.to_i, b, c] if a.to_i == a 
    end
  end
  result
end

Then we coded it in Erlang. The code I’m going to show you is more complex than the ruby code because we took advantage of the parallelization features of the Erlang language:

-module(pythag).
-export([bard/1, pbard/1]).
 
my_is_integer(X) ->
  round(X) == X.
 
pbard(N) ->
  pmap(fun(C) -> 
              [
               {C, B, trunc(A)} || 
                B <- lists:seq(C, N), 
                A <- [math:sqrt(C*C + B*B)], my_is_integer(A) 
              ]
            end, 
            lists:seq(2, N)).
 
bard(N) ->
  lists:map(fun(C) -> 
              [
               {C, B, trunc(A)} || 
                B <- lists:seq(C, N), 
                A <- [math:sqrt(C*C + B*B)], my_is_integer(A) 
              ]
            end, 
            lists:seq(2, N)).
 
pmap(F, L) -> 
  S = self(), 
  %% make_ref() returns a unique reference 
  %% we'll match on this later 
  Ref = erlang:make_ref(), 
  Pids = lists:map(fun(I) -> 
                    spawn(fun() -> do_f(S, Ref, F, I) end) 
             end, L), 
  %% gather the results 
  gather(Pids, Ref). 
 
do_f(Parent, Ref, F, I) -> 
  Parent ! {self(), Ref, (catch F(I))}. 
gather([Pid|T], Ref) -> 
  receive 
    {Pid, Ref, Ret} -> [Ret|gather(T, Ref)] 
  end; 
gather([], _) -> 
  [].

The pythag function has been named bard, as in The Bard :)
The pbard function fires up a thread for each iteration of the outer loop and then gathers all the result back. This allows for up to 50% performance increase when using dual core CPUs (like my Macbook Pro :) ).

Once we coded it in two languages we couldn’t stop, so here’s a PHP implementation, a Javascript one and, last, but not least, a C implementation, because you can never forget C when you need speed :)

PHP:

<?php
$i=0;
 
for ($c = 2; $c <= 5000; $c++) {
  for ($b = 1; $b < $c; $b++) {
    $a = sqrt($c*$c - $b*$b);
    if (is_int($a)) {
      $i++;
    }
  }
}
print("$i");
?>

JavaScript:

function pythag(n) {
  var a,b,c,i
  for (c = 2; c <= n; c++) {
    for (b = 1; b < c; b++) {
      a = Math.sqrt(c*c - b*b);
      if (Math.floor(a) == a) {
        i++;
      }
    }
  }
  return i;
}

C:

#include <stdio.h>
#include <math.h>
 
#define MAX 5000
 
int main() {
  double c, b;
  int i = 0;
  double a;
 
  for (c = 2; c <= MAX; c++) {
    for (b = 1; b < c; b++) {
      a = sqrt(c*c - b*b);
      if (trunc(a) == a) {
        i++;
      }
    }
  }
  printf("%d\n", i);
}

To test the various relative speeds I ran them from the shell using time (php ruby and c), or used the internal timer (erlang), or ran them in firefox (js only actually).

The results were interesting. At the first run, using 5000 as N in the pythag functions the order was:

  1. C (obviously): 0.40s
  2. Erlang (smp): 3.95s
  3. Erlang (non-smp): 5.66s
  4. Ruby: 15.31s
  5. PHP: 19s (or so)
  6. Javascript: couldn’t even finish running because firefox said it was taking too much time

Fullo passed by and asked what we were doing, I told him I was testing languages and that PHP was dog-slow. He said it couldn’t be and asked me what version of PHP I was using. I checked and it was PHP 4.x.x.

After downloading and compiling PHP5 I ran the tests again and the final result was this:

  1. C (obviously): 0.40s
  2. Erlang (smp): 3.95s
  3. Erlang (non-smp): 5.66s
  4. PHP5: 8.9s (or so)
  5. Ruby: 15.31s
  6. Javascript: couldn’t even finish running because firefox said it was taking too much time

Now the numbers make sense. Doing these comparations was a nice experience, because when you implement the same thing in different languages you always learn something about optimization and code beauty.

48 Responses to “Erlang, Ruby and PHP battle it out!”

  1. Erlang, Ruby and PHP battle it out! | the pix zone Says:

    [...] Yesterday I’ve been at the PHPDay 2007 in Verona with my Nimboo friend Giovanni Intini, during the travel to Verona on the train, we played with Erlang, Ruby and PHP writing small programs to compute Pythagorean triplets and comparing performances. I don’t want to tell you yet who’s the winner, you will find the results of our tests in Giovanni’s blog. [...]

  2. Lawrence Oluyede Says:

    You may want to bench the Python version too: http://pastie.textmate.org/private/ob7q5ikivlvwo5qm1u

  3. Stefan Scholl Says:

    How was the Erlang code compiled? Old style, or HIPE?

  4. Giovanni Intini Says:

    @Stefan: It was compiled old style
    @Lawrence: I’ll try the python one too, keep an eye on the post for the update :)

  5. Giovanni Intini Says:

    Lawrence I’m getting 14.839s on python 2.3.5, by doing ‘time python pythag.py’, is there a way to speed it up?

  6. Lawrence Oluyede Says:

    Did you try with Python 2.5.1? Python 2.3 is really old

  7. Giovanni Intini Says:

    Now don’t get all upgradey on me :)
    I’ll do it just for the sake of science :)

  8. Simon Willison Says:

    The Python interpreter is slow to start up – you will see very different results if you run your benchmark using the timeit module.

  9. Giovanni Intini Says:

    Simon can you post some code that uses the timeit module so I can run the python tests?

  10. Giovanni Intini Says:

    Upgrading python to 2.5, but still using time, lowered the execution time to 11.53s

  11. Lawrence Oluyede Says:

    The easiest way is issuing this on the command line: “python2.5 -m timeit -n 100 “from pythag import pythag; pythag(5000)”

    or you can also patch the module to do that:

    http://pastie.textmate.org/private/qsx4ahc9tirpsomigf

  12. Lawrence Oluyede Says:

    Maybe testing 300 times is too much. Anyway, here’s the doc: http://docs.python.org/lib/module-timeit.html

  13. Googol Says:

    I test python(2.5.1) and C in my pc, the result is:
    1 C: 0.39s
    2 Python(2.5.1): 14.83s

  14. Anon Says:

    I converted the C-example to Java (1.5) and got impressive results:

    http://pastebin.ca/496904

    Time: 0.11s

    To put the number into perspective, PHP 5 used 9.3s on the same machine (a shared host, using 3 Xeon 3.60GHz CPU’s)

  15. Anon Says:

    import java.lang.Math;

    public class test1 {

    private final static int MAX=5000;

    public static void main(String args[]) {
    double c, b;
    int i = 0;
    double a;
    long start = System.currentTimeMillis();
    for (c = 2; c

  16. jherber Says:

    you are not playing fair. the ruby bench is putting results into an array. it is also being called as a function. you should equalize your benchmarks. try this for your ruby benchmark:

    i = 0
    (2..5000).each do |c|
    (1..c).each do |b|
    a = Math.sqrt(c*c – b*b)
    if a.to_i == a then i+=1 end
    end
    end
    puts i

    let me know what you get on your hardware :)

  17. Giovanni Intini Says:

    Jherber don’t tell me I’m against Ruby, it’s actually my main-and-favourite language :)

    Anyway, I ran your code and got this result: 15.27s, so I guess we can say ruby is pretty efficient in calling methods :)

  18. Stuart Dootson Says:

    Thought I’d try this in Haskell –

    module Main where

    isPythag :: Int -> Int -> Bool
    isPythag c b =
    aSqr==diff
    where a = ratSqrt (diff)
    diff = cSqr-bSqr
    cSqr=c*c
    aSqr=a*a
    bSqr=b*b

    ratSqrt :: Int -> Int
    ratSqrt = truncate.sqrt.fromIntegral

    main = putStrLn (show (length [ 1 | c

  19. riffraff Says:

    ah, useless microbenchmarks, I love this :D

    first, I believe you should run with warm cache :)

    But more important, the ruby & erlang versions do collect pytagorean triples, all the rest do not, they just count (i.e. no arrays, no gc kicking in etc etc). This is especially important comparing with php because array handling is quite slow there due to arrays also being dictionaries, IIRC.

    Anyway, a probably slightly faster ruby version: in line with c,php and js:

    1. no funcall overhead, lol
    2. no need for var N
      include Math # no lookup overhead
      a=b=c=result=0 #no var creation/destruction overhead
      #if you use upto no overhead for range object creation
      2.upto(5000) do |c|
      1.upto© do |b|
      # don’t do to_i twice
      a = sqrt(c*c – b*b)
      result+=1 if a == a.to_i
      end
      end
      print result

      use yarv and you get huge speedup by using while instead of blocks, you can write fortran in any language, as they say :)

  20. Isaac Gouy Says:

    Did you try HiPE?

    ”... when you implement the same thing in different languages you always learn something about optimization and code beauty.”

    The Computer Language Benchmarks Game
    http://shootout.alioth.debian.org/gp4/

  21. Mike McNally Says:

    That’s not a very efficient Erlang implementation. Why build up all those lists when all you need is a count?

    Here’s some smaller, cleaner code, without all the list comprehensions that don’t really do you any good:

    -module(pyth).
    -export([find/1]).

    find(Max) -> find(2, Max, _Count = 0).

    find(Max, Max, Count) -> Count;
    find(N, Max, Count) ->
    find(N + 1, Max, Count + search_pairs(N * N, N – 1, 0)).

    search_pairs(C2, A, Count) ->
    case A * A of
    X when X Count;
    A2 -> search_pairs(C2, A – 1, case math:sqrt(C2 – A2) of
    N when trunc(N) == N -> Count + 1;
    _ -> Count
    end)
    end.

  22. Giovanni Intini Says:

    The reason why I am building lists (and arrays) in Erlang and Ruby is because In earlier versions I was printing all the results, then C came and I stopped printing results, but I didn’t bother changing the ruby and erlang code. After all it was only a game :)

  23. LudoA Says:

    You could run the Javascript version in Rhino (http://www.mozilla.org/rhino/), if you want JS stats too.
    I’m very surprised by the LOC needed for the Erlang version, even if it has SMP support. Also, I thought Erlang was parallel by default?

  24. Jay Anderson Says:

    The purpose of this experiment (as I understand it) was to compare languages but there is a much better way to generate all pythagorean triples. Pick two integers M and N such that M>N. The following formula generates a triple:
    A = M^2-N^2
    B = 2*M*N
    C = M^2+N^2

    Here’s an erlang function which enumerates them:

    pyth(X) -> pyth(X, 2, 1, []).
    pyth(X, M, N, L) when M*M + N*N > X, M == N+1 -> L;
    pyth(X, M, N, L) when M*M + N*N > X -> pyth(X, N+2, N+1, L);
    pyth(X, M, N, L) -> pyth(X, M+1, N, [{M*M-N*N, 2*M*N, M*M+N*N}|L]).

    Re-implementing this in the other languages will make them all quite a bit faster also.

  25. Giovanni Intini Says:

    @Jay: it all started as a contest to see how easy it was to implement a pythagorean triplets algorithm in erlang and ruby, I started from an example in the erlang book that used list comprehensions and I never put too much thought in using a better algorithm, but thanks I wondered if there was a bettere mathematical approach.

    @Ludoa: The parallelizing erlang code was taken verbatim from pragprog’s Erlang book, I don’t know if it can be done better and shorter. If anyone here can confirm I don’t need that parallelizing function I’ll be happier :)

  26. Stuart Dootson Says:

    Hmm – didn’t realise the text limit was so short on comments – anyway – using list comprehensions in Haskell gave code about 20% faster than your C example (both run on my iBook, compiled with maximum optimisation) – ain’t laziness (a la Haskell) wonderful :-)

  27. Jon Harrop Says:

    Here is an OCaml version that is shorter and faster than everything else including the C code:

    let n = 5000

    let () =
    let i = ref 0 in
    for c=2 to n do
    for b=1 to c-1 do
    let a = sqrt(float(c*c – b*b)) in
    if float(truncate a) = a then incr i
    done
    done;
    print_endline (string_of_int !i)

    I get 0.341s compared to 0.381s for C.

  28. Me Notyou Says:

    Perl easily beats Ruby and PHP on my hardware.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $i = 0;

    for my $c ( 2 .. 5000 ) {
    for my $b ( 1 .. $c ) {
    my $a = sqrt( $c * $c – $b * $b );
    if ( int $a == $a )
    {
    $i++;
    }
    }
    }

    print $i, ”\n”;

  29. Emre Sevinç Says:

    I wondered how Common Lisp would do so I tried the simplest form that comes to mind and timed id:

    CL-USER> (time (pythagorean-triplet))
    13413
    Evaluation took:
    5.015 seconds of real time
    3.87941 seconds of user run time
    0.035995 seconds of system run time

    Well, not that bad not that good.

    And then a little bit optimization (not that I’m good at it):

    CL-USER> (time (pythagorean-triplet-optimized-for-speed))
    13413
    Evaluation took:
    1.571 seconds of real time
    1.433782 seconds of user run time
    0.002999 seconds of system run time

    Hmm, definitely better.

    And Chris Russel from comp.lang.lisp provided a better version:

    CL-USER> (time (pythagorean-triplet-re-optimized-for-speed))
    13413
    Evaluation took:
    1.366 seconds of real time
    1.336084 seconds of user run time
    0.008001 seconds of system run time

    Probably it can be made better. This was just a quick check and having little fun with Emacs + SLIME + SBCL. It was good to see once again how much flexibility Common Lisp provided being a dynamic language which can optimized in various ways, providing type information, compilation settings for a function, etc. Oh, don’t be bothered for very long function names with code completion they are just a few keystrokes away and never a pain (and I won’t delve into fuzzy code completion capabilities of Emacs + SLIME).

    PS: Hope the links appeared fine! :)

  30. Mike McNally Says:

    Here’s my version (no lists) modified to do one process per outer loop:

    -module(pyth).
    -export([find/1]).

    find(Max) ->
    new_num(2, Max, self()),
    receive Answer -> Answer end.

    find(Max, Max, Who) -> Who ! 0;
    find(N, Max, Who) ->
    new_num(N + 1, Max, self()),
    Who ! (search_pairs(N * N, N – 1, 0) + receive SubTotal -> SubTotal end).

    search_pairs(C2, A, Count) ->
    case A * A of
    X when X Count;
    A2 -> search_pairs(C2, A – 1, case math:sqrt(C2 – A2) of
    N when trunc(N) == N -> Count + 1;
    _ -> Count
    end)
    end.

    new_num(N, Max, Who) -> spawn(fun() -> find(N, Max, Who) end).

  31. Alpha Chen Says:

    Just for kicks… this version should be almost as fast as the C version. ;.)

    require “inline”

    class Pythag
    inline do |builder|
    builder.c ’
    void pythag() {
    double c, b;
    int i = 0;
    double a;

    for (c = 2; c

  32. Federico Feroldi Says:

    Hey mate, I’m glad to see that your article is getting famous! :)
    We have to organize another train trip to think about a new puzzle! ;)

  33. Giovanni Intini Says:

    @Federico: next time let’s study more languages though, this is becoming really interesting.

    How can Ocaml be faster than C?

  34. Alpha Chen Says:

    D’oh! Didn’t escape the HTML in my last comment.

    require "inline"

    class Pythag
    inline do |builder|
    builder.define
    builder.c ’
    void pythag() {
    double c, b;
    int i = 0;
    double a;

    for (c = 2; c < = 5000; c++) {
    for (b = 1; b < c; b++) {
    a = sqrt(c*c – b*b);
    if (trunc(a) == a) {
    i++;
    }
    }
    }
    printf("%d\n", i);
    }’
    end
    end

    Pythag.new.pythag

  35. links for 2007-05-19 -- A Tempest of Thoughts Says:

    [...] Erlang, Ruby and PHP battle it out! — A Tempest of Thoughts See how they rank compared to each other. (tags: erlang javascript performance php ruby) Last Modified : May 20th, 2007 Filed under : Del.icio.us Navigate : Previous post / Share : [...]

  36. Jon Harrop Says:

    I only compiled the C code with -O2 before (when I said OCaml was faster). As it happens, compiler options make a big different for the C code. I managed to make it significantly faster just by choosing different options:

    0.379s: -O2
    0.350s: -03
    0.299s: -O3 -ffast-math

    0.341s: ocamlopt

    Very brave of someone to post such a slow Lisp implementation. ;-)

  37. Paul Says:

    The Lisp version has been furtherly optimized (with a few small changes, see c.l.l.) and is now already close to the C version.

    So I would vote for Lisp.

  38. Emre Sevinç Says:

    > Very brave of someone to post such a slow Lisp
    > implementation.

    So that I have fun and others have fun,too, and people learn more about Common Lisp optimization (including me) yielding results such as (approximately):

    C: 0.299s
    OCaml: 0.341s
    Lisp: 0.454s

    Not that this kind of microbenchmark provides a lot of information or brings vision but at least it helps people to see that they can have the flexibility of Common Lisp development environment (similar to but still better than Python and Ruby, only to be compared with Smalltalk) with the opportunity to optimize close to the speed of C programming language. Not that bad I guess ;-)

    Thus, I don’t think it is about bravery or timidity :)

    Thanks to the original blog entry author for providing yet another opportunity to learn more Lisp.

    Happy coding.

  39. Jon Harrop Says:

    Running the OCaml code through the F# compiler takes 0.864s on the same hardware (but 32-bit WinXP whereas my other timings were on 64-bit Debian Linux).

  40. Shall We Zen » links for 2007-05-20 Says:

    [...] Erlang, Ruby and PHP battle it out!—- A Tempest of Thoughts (tags: javascript erlang c performance php ruby programming) [...]

  41. andrew cooke Says:

    curiously, i can’t reproduce your erlang results on my machine (core 2 duo). your single and multithreaded versions take 3.6 and 3.1s respectively, which is nothing like as big a difference as you report.

    also, although i think i am repeating others, the following is faster still:

    pyth2(N) ->
    [{A,B,C} ||
    A

  42. fullo Says:

    about php, from the shell you should launch the script without invoking the php.ini parsing.

    If you try to run “php -n file.php” you should decrease the execution time by 2 seconds or so..

    ciuaz

  43. Giovanni Intini Says:

    @Andrew: you have to launch erlang with the smp switch to get the faster results: erl -smp +S 2

  44. andrew cooke Says:

    ah, thanks!

    (ignore the rest of my post – looks like the angle bracket got confused with html markup).

  45. FZ Blogs » Common Lisp Optimizasyon Dersleri: 5 saniyeden 0.16 saniyeye Says:

    [...] (microbenchmark) çok anlamlı olmasa da çok basit bir Pisagor üçlüsü bulma algoritması üzerinden ne tür Common Lisp optimizasyonları yapılabileceğine dair güzel örnekler dilin [...]

  46. J. Avatar Says:

    Perhaps someone could test all of the examples here (Ruby, Java, Javascript, Ocaml, Erlang, C, Python…) on a single machine and compare them directly?

  47. The battle of the languages part II -- A Tempest of Thoughts Says:

    [...] My previous post about an programming contest I had with my friends was wildly popular. Much much more than I even thought possible. [...]

  48. raja r Says:

    this gives the results:
    [0, 2, 2]
    [0, 3, 3]
    [0, 4, 4]