Erlang, Ruby and PHP battle it out!
Posted by Giovanni Intini | Filed under Erlang, Javascript, PHP, Programming, Ruby
Update: here’s the followup!
Yesterday I attended PHP Day, in the beautiful city of Verona, and while going there by train, together with my friend Federico Feroldi, I told him that in a test I made a week ago Erlang was much slower than ruby in finding pythagorean triplets.
That was a partial lie, since the ruby implementation was somewhat optimized, while the erlang implementation wasn’t at all. So we decided to rewrite both programs using the same optimized algorithm and then test their speeds.
This is the ruby implementation of a method that counts the number of pythagorean triplets up to N:
def pythag(n) result = [] (2..n).each do |c| (1...c).each do |b| a = Math.sqrt(c*c - b*b) result << [a.to_i, b, c] if a.to_i == a end end result end
Then we coded it in Erlang. The code I’m going to show you is more complex than the ruby code because we took advantage of the parallelization features of the Erlang language:
-module(pythag). -export([bard/1, pbard/1]). my_is_integer(X) -> round(X) == X. pbard(N) -> pmap(fun(C) -> [ {C, B, trunc(A)} || B <- lists:seq(C, N), A <- [math:sqrt(C*C + B*B)], my_is_integer(A) ] end, lists:seq(2, N)). bard(N) -> lists:map(fun(C) -> [ {C, B, trunc(A)} || B <- lists:seq(C, N), A <- [math:sqrt(C*C + B*B)], my_is_integer(A) ] end, lists:seq(2, N)). pmap(F, L) -> S = self(), %% make_ref() returns a unique reference %% we'll match on this later Ref = erlang:make_ref(), Pids = lists:map(fun(I) -> spawn(fun() -> do_f(S, Ref, F, I) end) end, L), %% gather the results gather(Pids, Ref). do_f(Parent, Ref, F, I) -> Parent ! {self(), Ref, (catch F(I))}. gather([Pid|T], Ref) -> receive {Pid, Ref, Ret} -> [Ret|gather(T, Ref)] end; gather([], _) -> [].
The pythag function has been named bard, as in The Bard
The pbard function fires up a thread for each iteration of the outer loop and then gathers all the result back. This allows for up to 50% performance increase when using dual core CPUs (like my Macbook Pro
).
Once we coded it in two languages we couldn’t stop, so here’s a PHP implementation, a Javascript one and, last, but not least, a C implementation, because you can never forget C when you need speed
PHP:
<?php $i=0; for ($c = 2; $c <= 5000; $c++) { for ($b = 1; $b < $c; $b++) { $a = sqrt($c*$c - $b*$b); if (is_int($a)) { $i++; } } } print("$i"); ?>
JavaScript:
function pythag(n) { var a,b,c,i for (c = 2; c <= n; c++) { for (b = 1; b < c; b++) { a = Math.sqrt(c*c - b*b); if (Math.floor(a) == a) { i++; } } } return i; }
C:
#include <stdio.h> #include <math.h> #define MAX 5000 int main() { double c, b; int i = 0; double a; for (c = 2; c <= MAX; c++) { for (b = 1; b < c; b++) { a = sqrt(c*c - b*b); if (trunc(a) == a) { i++; } } } printf("%d\n", i); }
To test the various relative speeds I ran them from the shell using time (php ruby and c), or used the internal timer (erlang), or ran them in firefox (js only actually).
The results were interesting. At the first run, using 5000 as N in the pythag functions the order was:
- C (obviously): 0.40s
- Erlang (smp): 3.95s
- Erlang (non-smp): 5.66s
- Ruby: 15.31s
- PHP: 19s (or so)
- Javascript: couldn’t even finish running because firefox said it was taking too much time
Fullo passed by and asked what we were doing, I told him I was testing languages and that PHP was dog-slow. He said it couldn’t be and asked me what version of PHP I was using. I checked and it was PHP 4.x.x.
After downloading and compiling PHP5 I ran the tests again and the final result was this:
- C (obviously): 0.40s
- Erlang (smp): 3.95s
- Erlang (non-smp): 5.66s
- PHP5: 8.9s (or so)
- Ruby: 15.31s
- Javascript: couldn’t even finish running because firefox said it was taking too much time
Now the numbers make sense. Doing these comparations was a nice experience, because when you implement the same thing in different languages you always learn something about optimization and code beauty.
May 19th, 2007 at 12:35 pm
[...] Yesterday I’ve been at the PHPDay 2007 in Verona with my Nimboo friend Giovanni Intini, during the travel to Verona on the train, we played with Erlang, Ruby and PHP writing small programs to compute Pythagorean triplets and comparing performances. I don’t want to tell you yet who’s the winner, you will find the results of our tests in Giovanni’s blog. [...]
May 19th, 2007 at 1:38 pm
You may want to bench the Python version too: http://pastie.textmate.org/private/ob7q5ikivlvwo5qm1u
May 19th, 2007 at 2:00 pm
How was the Erlang code compiled? Old style, or HIPE?
May 19th, 2007 at 2:09 pm
@Stefan: It was compiled old style
@Lawrence: I’ll try the python one too, keep an eye on the post for the update
May 19th, 2007 at 2:13 pm
Lawrence I’m getting 14.839s on python 2.3.5, by doing ‘time python pythag.py’, is there a way to speed it up?
May 19th, 2007 at 2:19 pm
Did you try with Python 2.5.1? Python 2.3 is really old
May 19th, 2007 at 2:22 pm
Now don’t get all upgradey on me
I’ll do it just for the sake of science
May 19th, 2007 at 2:26 pm
The Python interpreter is slow to start up – you will see very different results if you run your benchmark using the timeit module.
May 19th, 2007 at 2:29 pm
Simon can you post some code that uses the timeit module so I can run the python tests?
May 19th, 2007 at 2:30 pm
Upgrading python to 2.5, but still using time, lowered the execution time to 11.53s
May 19th, 2007 at 2:43 pm
The easiest way is issuing this on the command line: “python2.5 -m timeit -n 100 “from pythag import pythag; pythag(5000)”
or you can also patch the module to do that:
http://pastie.textmate.org/private/qsx4ahc9tirpsomigf
May 19th, 2007 at 2:46 pm
Maybe testing 300 times is too much. Anyway, here’s the doc: http://docs.python.org/lib/module-timeit.html
May 19th, 2007 at 4:11 pm
I test python(2.5.1) and C in my pc, the result is:
1 C: 0.39s
2 Python(2.5.1): 14.83s
May 19th, 2007 at 4:58 pm
I converted the C-example to Java (1.5) and got impressive results:
http://pastebin.ca/496904
Time: 0.11s
To put the number into perspective, PHP 5 used 9.3s on the same machine (a shared host, using 3 Xeon 3.60GHz CPU’s)
May 19th, 2007 at 5:01 pm
import java.lang.Math;
public class test1 {
private final static int MAX=5000;
public static void main(String args[]) {
double c, b;
int i = 0;
double a;
long start = System.currentTimeMillis();
for (c = 2; c
May 19th, 2007 at 5:49 pm
you are not playing fair. the ruby bench is putting results into an array. it is also being called as a function. you should equalize your benchmarks. try this for your ruby benchmark:
i = 0
(2..5000).each do |c|
(1..c).each do |b|
a = Math.sqrt(c*c – b*b)
if a.to_i == a then i+=1 end
end
end
puts i
let me know what you get on your hardware
May 19th, 2007 at 6:06 pm
Jherber don’t tell me I’m against Ruby, it’s actually my main-and-favourite language
Anyway, I ran your code and got this result: 15.27s, so I guess we can say ruby is pretty efficient in calling methods
May 19th, 2007 at 6:16 pm
Thought I’d try this in Haskell –
module Main where
isPythag :: Int -> Int -> Bool
isPythag c b =
aSqr==diff
where a = ratSqrt (diff)
diff = cSqr-bSqr
cSqr=c*c
aSqr=a*a
bSqr=b*b
ratSqrt :: Int -> Int
ratSqrt = truncate.sqrt.fromIntegral
main = putStrLn (show (length [ 1 | c
May 19th, 2007 at 6:18 pm
ah, useless microbenchmarks, I love this
first, I believe you should run with warm cache
But more important, the ruby & erlang versions do collect pytagorean triples, all the rest do not, they just count (i.e. no arrays, no gc kicking in etc etc). This is especially important comparing with php because array handling is quite slow there due to arrays also being dictionaries, IIRC.
Anyway, a probably slightly faster ruby version: in line with c,php and js:
include Math # no lookup overhead
a=b=c=result=0 #no var creation/destruction overhead
#if you use upto no overhead for range object creation
2.upto(5000) do |c|
1.upto© do |b|
# don’t do to_i twice
a = sqrt(c*c – b*b)
result+=1 if a == a.to_i
end
end
print result
use yarv and you get huge speedup by using while instead of blocks, you can write fortran in any language, as they say
May 19th, 2007 at 6:38 pm
Did you try HiPE?
”... when you implement the same thing in different languages you always learn something about optimization and code beauty.”
The Computer Language Benchmarks Game
http://shootout.alioth.debian.org/gp4/
May 19th, 2007 at 6:47 pm
That’s not a very efficient Erlang implementation. Why build up all those lists when all you need is a count?
Here’s some smaller, cleaner code, without all the list comprehensions that don’t really do you any good:
-module(pyth).
-export([find/1]).
find(Max) -> find(2, Max, _Count = 0).
find(Max, Max, Count) -> Count;
find(N, Max, Count) ->
find(N + 1, Max, Count + search_pairs(N * N, N – 1, 0)).
search_pairs(C2, A, Count) ->
case A * A of
X when X Count;
A2 -> search_pairs(C2, A – 1, case math:sqrt(C2 – A2) of
N when trunc(N) == N -> Count + 1;
_ -> Count
end)
end.
May 19th, 2007 at 6:56 pm
The reason why I am building lists (and arrays) in Erlang and Ruby is because In earlier versions I was printing all the results, then C came and I stopped printing results, but I didn’t bother changing the ruby and erlang code. After all it was only a game
May 19th, 2007 at 6:57 pm
You could run the Javascript version in Rhino (http://www.mozilla.org/rhino/), if you want JS stats too.
I’m very surprised by the LOC needed for the Erlang version, even if it has SMP support. Also, I thought Erlang was parallel by default?
May 19th, 2007 at 7:03 pm
The purpose of this experiment (as I understand it) was to compare languages but there is a much better way to generate all pythagorean triples. Pick two integers M and N such that M>N. The following formula generates a triple:
A = M^2-N^2
B = 2*M*N
C = M^2+N^2
Here’s an erlang function which enumerates them:
pyth(X) -> pyth(X, 2, 1, []).
pyth(X, M, N, L) when M*M + N*N > X, M == N+1 -> L;
pyth(X, M, N, L) when M*M + N*N > X -> pyth(X, N+2, N+1, L);
pyth(X, M, N, L) -> pyth(X, M+1, N, [{M*M-N*N, 2*M*N, M*M+N*N}|L]).
Re-implementing this in the other languages will make them all quite a bit faster also.
May 19th, 2007 at 7:17 pm
@Jay: it all started as a contest to see how easy it was to implement a pythagorean triplets algorithm in erlang and ruby, I started from an example in the erlang book that used list comprehensions and I never put too much thought in using a better algorithm, but thanks I wondered if there was a bettere mathematical approach.
@Ludoa: The parallelizing erlang code was taken verbatim from pragprog’s Erlang book, I don’t know if it can be done better and shorter. If anyone here can confirm I don’t need that parallelizing function I’ll be happier
May 19th, 2007 at 7:43 pm
Hmm – didn’t realise the text limit was so short on comments – anyway – using list comprehensions in Haskell gave code about 20% faster than your C example (both run on my iBook, compiled with maximum optimisation) – ain’t laziness (a la Haskell) wonderful
May 19th, 2007 at 8:06 pm
Here is an OCaml version that is shorter and faster than everything else including the C code:
let n = 5000
let () =
let i = ref 0 in
for c=2 to n do
for b=1 to c-1 do
let a = sqrt(float(c*c – b*b)) in
if float(truncate a) = a then incr i
done
done;
print_endline (string_of_int !i)
I get 0.341s compared to 0.381s for C.
May 19th, 2007 at 8:11 pm
Perl easily beats Ruby and PHP on my hardware.
#!/usr/bin/perl
use strict;
use warnings;
my $i = 0;
for my $c ( 2 .. 5000 ) {
for my $b ( 1 .. $c ) {
my $a = sqrt( $c * $c – $b * $b );
if ( int $a == $a )
{
$i++;
}
}
}
print $i, ”\n”;
May 19th, 2007 at 8:30 pm
I wondered how Common Lisp would do so I tried the simplest form that comes to mind and timed id:
CL-USER> (time (pythagorean-triplet))
13413
Evaluation took:
5.015 seconds of real time
3.87941 seconds of user run time
0.035995 seconds of system run time
Well, not that bad not that good.
And then a little bit optimization (not that I’m good at it):
CL-USER> (time (pythagorean-triplet-optimized-for-speed))
13413
Evaluation took:
1.571 seconds of real time
1.433782 seconds of user run time
0.002999 seconds of system run time
Hmm, definitely better.
And Chris Russel from comp.lang.lisp provided a better version:
CL-USER> (time (pythagorean-triplet-re-optimized-for-speed))
13413
Evaluation took:
1.366 seconds of real time
1.336084 seconds of user run time
0.008001 seconds of system run time
Probably it can be made better. This was just a quick check and having little fun with Emacs + SLIME + SBCL. It was good to see once again how much flexibility Common Lisp provided being a dynamic language which can optimized in various ways, providing type information, compilation settings for a function, etc. Oh, don’t be bothered for very long function names with code completion they are just a few keystrokes away and never a pain (and I won’t delve into fuzzy code completion capabilities of Emacs + SLIME).
PS: Hope the links appeared fine!
May 19th, 2007 at 10:33 pm
Here’s my version (no lists) modified to do one process per outer loop:
-module(pyth).
-export([find/1]).
find(Max) ->
new_num(2, Max, self()),
receive Answer -> Answer end.
find(Max, Max, Who) -> Who ! 0;
find(N, Max, Who) ->
new_num(N + 1, Max, self()),
Who ! (search_pairs(N * N, N – 1, 0) + receive SubTotal -> SubTotal end).
search_pairs(C2, A, Count) ->
case A * A of
X when X Count;
A2 -> search_pairs(C2, A – 1, case math:sqrt(C2 – A2) of
N when trunc(N) == N -> Count + 1;
_ -> Count
end)
end.
new_num(N, Max, Who) -> spawn(fun() -> find(N, Max, Who) end).
May 19th, 2007 at 10:55 pm
Just for kicks… this version should be almost as fast as the C version. ;.)
require “inline”
class Pythag
inline do |builder|
builder.c ’
void pythag() {
double c, b;
int i = 0;
double a;
for (c = 2; c
May 20th, 2007 at 12:16 am
Hey mate, I’m glad to see that your article is getting famous!
We have to organize another train trip to think about a new puzzle!
May 20th, 2007 at 12:39 am
@Federico: next time let’s study more languages though, this is becoming really interesting.
How can Ocaml be faster than C?
May 20th, 2007 at 12:47 am
D’oh! Didn’t escape the HTML in my last comment.
require "inline"
class Pythag
inline do |builder|
builder.define
builder.c ’
void pythag() {
double c, b;
int i = 0;
double a;
for (c = 2; c < = 5000; c++) {
for (b = 1; b < c; b++) {
a = sqrt(c*c – b*b);
if (trunc(a) == a) {
i++;
}
}
}
printf("%d\n", i);
}’
end
end
Pythag.new.pythag
May 20th, 2007 at 1:27 am
[...] Erlang, Ruby and PHP battle it out! — A Tempest of Thoughts See how they rank compared to each other. (tags: erlang javascript performance php ruby) Last Modified : May 20th, 2007 Filed under : Del.icio.us Navigate : Previous post / Share : [...]
May 20th, 2007 at 3:33 am
I only compiled the C code with -O2 before (when I said OCaml was faster). As it happens, compiler options make a big different for the C code. I managed to make it significantly faster just by choosing different options:
0.379s: -O2
0.350s: -03
0.299s: -O3 -ffast-math
0.341s: ocamlopt
Very brave of someone to post such a slow Lisp implementation.
May 20th, 2007 at 2:35 pm
The Lisp version has been furtherly optimized (with a few small changes, see c.l.l.) and is now already close to the C version.
So I would vote for Lisp.
May 20th, 2007 at 4:20 pm
> Very brave of someone to post such a slow Lisp
> implementation.
So that I have fun and others have fun,too, and people learn more about Common Lisp optimization (including me) yielding results such as (approximately):
C: 0.299s
OCaml: 0.341s
Lisp: 0.454s
Not that this kind of microbenchmark provides a lot of information or brings vision but at least it helps people to see that they can have the flexibility of Common Lisp development environment (similar to but still better than Python and Ruby, only to be compared with Smalltalk) with the opportunity to optimize close to the speed of C programming language. Not that bad I guess
Thus, I don’t think it is about bravery or timidity
Thanks to the original blog entry author for providing yet another opportunity to learn more Lisp.
Happy coding.
May 20th, 2007 at 6:24 pm
Running the OCaml code through the F# compiler takes 0.864s on the same hardware (but 32-bit WinXP whereas my other timings were on 64-bit Debian Linux).
May 20th, 2007 at 7:48 pm
[...] Erlang, Ruby and PHP battle it out!—- A Tempest of Thoughts (tags: javascript erlang c performance php ruby programming) [...]
May 20th, 2007 at 8:43 pm
curiously, i can’t reproduce your erlang results on my machine (core 2 duo). your single and multithreaded versions take 3.6 and 3.1s respectively, which is nothing like as big a difference as you report.
also, although i think i am repeating others, the following is faster still:
pyth2(N) ->
[{A,B,C} ||
A
May 21st, 2007 at 12:21 am
about php, from the shell you should launch the script without invoking the php.ini parsing.
If you try to run “php -n file.php” you should decrease the execution time by 2 seconds or so..
ciuaz
May 21st, 2007 at 1:48 am
@Andrew: you have to launch erlang with the smp switch to get the faster results: erl -smp +S 2
May 21st, 2007 at 2:00 am
ah, thanks!
(ignore the rest of my post – looks like the angle bracket got confused with html markup).
May 21st, 2007 at 4:15 pm
[...] (microbenchmark) çok anlamlı olmasa da çok basit bir Pisagor üçlüsü bulma algoritması üzerinden ne tür Common Lisp optimizasyonları yapılabileceğine dair güzel örnekler dilin [...]
May 22nd, 2007 at 6:53 pm
Perhaps someone could test all of the examples here (Ruby, Java, Javascript, Ocaml, Erlang, C, Python…) on a single machine and compare them directly?
May 23rd, 2007 at 5:41 pm
[...] My previous post about an programming contest I had with my friends was wildly popular. Much much more than I even thought possible. [...]
June 19th, 2007 at 9:29 pm
this gives the results:
[0, 2, 2]
[0, 3, 3]
[0, 4, 4]