
Case study: Is PHP embarrassingly slower than Java?

Written by Toomas Römer on August 4, 2008 – 11:50 am

IP2C is a small library that provides IP to country resolution. It uses the free ip-to-country database. IP2C takes the database CSV file, which is about 4 MB, converts it into a ~600 KB binary format and provides PHP and Java frontends to query the database.

The library is great: it is easy to convert an IP to a country, and with the country flags from its side project you can spice up your statistics with the country information. This is a lot faster than using a reverse DNS lookup.

The problem: the PHP implementation is a lot slower. Embarrassingly slower. Without any caching the Java version is able to do ~6000 queries per second, while the PHP counterpart can push through ~850. The implementations are the same. The stats provided by the author of the library are 8000 vs 1200, so about the same ratio as my measurements.

I like PHP, I don’t use it that much anymore but I still care when I see such embarrassing numbers. I took the implementation and started profiling it. Spent the night running different tests and trying to optimize.

The general outline of the algorithm is as follows. We take the dotted string IP and convert it to a 32-bit integer IPv4 address (e.g. 69.55.232.153 becomes 1161291929). The DB holds sorted ranges of these addresses. A binary search over these ranges gives us the country for the IP. Take a look at the implementation.
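The outline above can be sketched in a few lines of PHP. This is only an illustration of the idea, not the IP2C code: the range table, the placeholder country codes and the function name findCountry are made up, and it assumes 64-bit PHP 7.1 or later so that ip2long() always returns a non-negative integer.

```php
<?php
// Hypothetical range table: [first address, last address, country code],
// sorted by first address. The real IP2C file uses its own binary layout.
$ranges = [
    [ip2long('10.0.0.0'),    ip2long('10.255.255.255'),  'XA'],
    [ip2long('69.55.0.0'),   ip2long('69.55.255.255'),   'XB'],
    [ip2long('192.168.0.0'), ip2long('192.168.255.255'), 'XC'],
];

// Iterative binary search; returns the country code or null if no range matches.
function findCountry(array $ranges, string $dottedIp): ?string
{
    $ip = ip2long($dottedIp);
    if ($ip === false) {
        return null; // not a valid dotted IPv4 string
    }
    $lo = 0;
    $hi = count($ranges) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        [$start, $end, $country] = $ranges[$mid];
        if ($ip < $start) {
            $hi = $mid - 1;      // address lies before this range
        } elseif ($ip > $end) {
            $lo = $mid + 1;      // address lies after this range
        } else {
            return $country;     // address falls inside this range
        }
    }
    return null;
}

echo ip2long('69.55.232.153'), "\n";              // 1161291929
echo findCountry($ranges, '69.55.232.153'), "\n"; // XB
```

An address that falls between two ranges simply returns null, which matters for benchmarking because not every IP belongs to a known range.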


Let’s see where the vanilla version of IP2C spends its time. The results are based on 1000 iterations with Xdebug enabled, visualized with KCacheGrind. With profiling on, it processed about 210 IP addresses per second.

The IO part is surprisingly low. The internal fseek and fread account for 2% of the execution time. On the other hand, the user-level seek, which is just a wrapper around fseek, alone uses 5%. readShort and readInt take 20% of the execution time.

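PHP:
    // Read a 2-byte big-endian unsigned short from the current file position.
    function readShort() {
        $a = unpack('n', fread($this->m_file, 2));
        return $a[1];
    }

    // Read a 4-byte big-endian unsigned integer from the current file position.
    function readInt() {
        $a = unpack('N', fread($this->m_file, 4));
        return $a[1];
    }

    // Thin wrapper around fseek(); moves to an absolute offset.
    function seek($offset) {
        fseek($this->m_file, $offset);
    }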

Function calls are expensive. Let’s eliminate them. readInt, readShort and the seek wrapper are now inlined, and recursion is changed to iteration (about 14,000 fewer function calls). This version is able to process 400 queries per second compared to the previous 210.

The latest profiling results show twice as many fread and unpack calls as fseek calls. It seems that fseek is used to find the right position, and then two numbers are read and unpacked separately. The implementation confirms that. Luckily we can read just once (6 bytes in one go instead of 4 and then 2) and unpack once (both values in a single invocation).

PHP:
    // Before: two reads and two unpack calls per record.
    $a = unpack('N', fread($this->m_file, 4));
    $np['ip'] = $a[1];

    $a = unpack('n', fread($this->m_file, 2));
    $np['key'] = $a[1];

    // This can be changed to one 6-byte read and one unpack call with named fields.
    $np = unpack('Nip/nkey', fread($this->m_file, 6));
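To check that the combined format string really yields the same values as the two separate calls, here is a small round-trip sketch. The key value 42 is made up, and the sample bytes are built with pack() rather than read from the IP2C file; it assumes PHP 5.4 or later for the array syntax.

```php
<?php
// Six bytes as they might sit in the file: a 4-byte big-endian address
// followed by a 2-byte big-endian key (the key 42 is a made-up value).
$bytes = pack('Nn', 1161291929, 42);

// Two unpack calls, one per field.
$ip  = unpack('N', substr($bytes, 0, 4))[1];
$key = unpack('n', substr($bytes, 4, 2))[1];

// One unpack call: "N" named ip, then "n" named key, separated by "/".
$np = unpack('Nip/nkey', $bytes);

var_dump($np['ip'] === $ip);   // bool(true)
var_dump($np['key'] === $key); // bool(true)
```

The names after each format code become the array keys, so the result can be used directly where the old code filled in $np['ip'] and $np['key'] by hand.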

How does this version stack up against the Java version? Let’s disable profiling and run 100,000 iterations. The vanilla version processes ~850 IPs per second; with the functions inlined the number is around 1400. The Java version can still do 6000.

Let’s try caching. Peeking at the Java implementation shows that the Java caching version (a whopping 141,242 IPs per second, yup, 141k) just uses a byte[] array and does lookups from there instead of seeking and reading from the file. Easy, let’s do the same in PHP.

We read everything into a string and, instead of calling fread, we access the string at the current offset. Instead of calling fseek we just set the offset. We are using 600 KB more memory but can increase the throughput to ~2800 queries per second.

It seems I’ve just wasted a night; I should have checked the Computer Language Benchmarks first. In terms of execution speed, PHP is not comparable to Java.

The upside: we can still take the library, eliminate the recursion and the double unpacks, and add caching. A small gain is still a gain.
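A minimal sketch of the string-as-cache idea follows. The record layout (6 bytes: a 4-byte big-endian start address and a 2-byte big-endian key) and the function name lookupCached are assumptions for illustration; the real IP2C format has more to it. In practice $data would come from something like file_get_contents() on the binary database. It assumes 64-bit PHP 7.1 or later.

```php
<?php
// Stand-in for the cached database contents: three 6-byte records,
// sorted by start address (assumed layout, not the real IP2C format).
$data = pack('Nn', ip2long('10.0.0.0'), 1)
      . pack('Nn', ip2long('69.55.0.0'), 2)
      . pack('Nn', ip2long('192.168.0.0'), 3);

// Returns the key of the last record whose start address is <= $ip,
// or null if $ip is below the first record.
function lookupCached(string $data, int $ip): ?int
{
    $lo = 0;
    $hi = intdiv(strlen($data), 6) - 1;
    $found = null;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        // "Seeking" is just computing an offset; "reading" is a substr().
        $rec = unpack('Nip/nkey', substr($data, $mid * 6, 6));
        if ($rec['ip'] <= $ip) {
            $found = $rec['key'];
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }
    return $found;
}

var_dump(lookupCached($data, ip2long('69.55.232.153'))); // int(2)
var_dump(lookupCached($data, ip2long('1.2.3.4')));       // NULL
```

The loop still pays for one substr() and one unpack() per step, which is probably part of why the PHP version stays far behind a Java byte[] lookup even with the file IO gone.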


Posted in Featured, report, review | 34 Comments »

34 Comments to “Case study: Is PHP embarrassingly slower than Java?”

  1. FeepingCreature Says:

    For what it’s worth, I’ve whipped up a quick benchmark of country lookups in the free GeoIP database (which is 7M of CSV, should be about equivalent), using a binary search, in D.

    Source is here. http://paste.dprogramming.com/dplpew7f

    On my 1.6g box, it does about 500k lookups per second.

    This seems unrealistically and confusingly large, but I’m not aware of mistakes in my code. Any idea what’s going on here?

  2. Toomas Römer Says:

    Could you verify your results? See http://firestats.cc/browser/trunk/ip2c/php/ip2c_test.php

    I too got some pretty good results at some point, until I discovered I had mistakes and the lookups were blazingly fast because they finished in one iteration.

    Then again you are the closest to machine when comparing Java, PHP and D.

  3. FeepingCreature Says:

    Here’s a slightly optimized version: country names are only stored once, and lookups are done in steps of 16 to offset the impact of the sec() system call on the time. 1.2mio lookups per second. I’m fairly sure now that I’m doing something wrong, but it seems to work on the face of it :/

    http://paste.dprogramming.com/dpwhu5vy

  4. FeepingCreature Says:

    http://paste.dprogramming.com/dpfgp1j2

    Added some validation functionality. It seems most of the speed comes from the fact that I generate the IPs as uints to begin with. The average validation time of 0.29s for 103684 IPs suggests that when actually parsing IP strings, the throughput is a more realistic 300something k. Still pretty decent :)

  5. Toomas Römer Says:

    Yup, this seems okay as compared to the Java one. I’m not at the machine that I ran tests on but I’ll check the results of D at home later tonight. I hope the pastebin won’t have the programs deleted by then.

  6. FeepingCreature Says:

    http://paste.dprogramming.com/dpnrl010

    Ahh .. I figured out what’s wrong. Stupid of me to forget that not all IPs are inside valid ranges.

    The above example also adds a neat example of the lazy evaluation idiom, which probably contributes to its reduced speed of 170k/s.

  7. FeepingCreature Says:

    http://paste.dprogramming.com/dpua3y3y

    About a third of all attempted IPs fail. Is that normal?

    Toomas, thanks for looking at it :) Don’t worry, paste.dprogramming rarely if ever expires, as far as I know.

    By the way, the above version is back up to 3mio/s without IP parsing, and 0.28s for the validation, primarily because (as it turns out), Exceptions Are Slow.

    Yeah, big surprise there. Anyway, that’s probably my last attempt at this problem for now. Sorry for commentspamming.

  8. Behrang Says:

    No wonder. Caucho’s Quercus PHP implementation in Java outperforms Apache/PHP by a factor of 6x.

    Swing apps still startup slower compared to native apps and are more memory hungry as well, however, performance-wise, on the server side, Java is one of the best available options.

  9. Toomas Römer Says:

    This sample was slower with Quercus. The reason was probably memory wise as I ran it with the default options.

    In the previous post (http://dow.ngra.de/2008/08/01/php-and-microbenchmarks/) Quercus outperformed the PHP function_call vs inline competition. PHP ratio between calling a function or inlining was 8, with quercus it was 3.

  10. Am Leben Says:

    How are the figures if you use APC or a similar opcode cache? Did you take this into account while optimizing?

  11. Jacek Says:

    The only thing is – you’re comparing apples to oranges. Turn on APC and then tell us the results – only then will they be comparable.

  12. maht Says:

    Try this :

    http://www.php-accelerator.co.uk/

  13. Toomas Römer Says:

    @Am Leben, @Jacek
    I ran the tests with APC. The configurations I used were:

    apc.enabled=1
    apc.optimization=10
    apc.enable_cli=1

    No difference to the original results. Why? APC caches opcodes. So it eliminates the disk hit.

    As I’m just running the script once and I’m actually just timing the loop that finds the country codes I’m not saving anything. About the apc.optimization the manual says “Expect very modest speed improvements. This is experimental.” So my code does not get any faster.

    If I were making a large number of script executions it would make a difference, but not in these circumstances.

    @Maht
    The accelerator does not seem to do any optimizations either. Am I mistaken?

  14. Nick Says:

    both apc and ionCube are opcode cachers and as you pointed out you are only running the script once, so neither of those can really help

  15. Toomas Römer Says:

    @FeepingCreature
    I’m sorry but I was not able to run your samples. I was able to finally install a compiler. The problem with D for me was finding packages from Debian repository. The letter D stands for so many things :).

    Once I got the compiler (gdc) installed I was not able to get past the messages about the tools and std packages missing. Finding the libraries was probably the problem.

    If you happen to know the package names that I’m missing let me know.

  16. Huh Says:

    Where is a nice graph of comparison? Do you expect us to read everything?

  17. FeepingCreature Says:

    toomas, Sorry .. tools is a personal utility library I’m using. std belongs to phobos, the D standard library. It _ought_ to be installed by default.

    If you can find a package for dsss, the D shared software system (think D’s CPAN), it would make things easier (dsss net install scrapple-tools).

    Otherwise, you can always grab tools from svn (http://svn.dsource.org/projects/scrapple/trunk/tools/).

    Also, I recommend you use 4.1.x as the GCC version; I remember there were some serious issues with the experimental 4.2 patches.

  18. Vjekoslav Nesek Says:

    Interesting discussion. I’ve benchmarked it against homebrew IP to country Java lib developed some time ago using GeoIP CSV database. Run it against latest GeoIP lite database, for test IPs I’ve used http://iplists.com/misc.txt, almost all IPs resolve.

    It uses off-the-shelf java.util.TreeSet and keeps ranges in memory. For tests I’ve used MBP C2D 2.4GHz and looped 4 milion times.

    With apple JRE 1.5 got 293K lookups/s and 24M memory usage (32bit)

    With apple JRE 1.6 got 543K lookups/s with 31M memory used (64bit)

    Interesting acceleration for this benchmark when going from 1.5/32bit to 1.6/64bit. 85% better performance for 29% more memory.

  19. Toomas Römer Says:

    @Nesek
    The large numbers themselves come from a faster CPU. The change between 1.5/32b and 1.6/64b would require profiling.

    The weird finding I had was that the -server flag produced ~40% worse results :)

  20. Vjekoslav Nesek Says:

    @Thomas
    I wasn’t using IP2C but a Java library, let’s call it HSWW, that I coded a year or two ago. To compare IP2C Java against my lib I’ve benchmarked IP2C with the same set of test IPs on the same hardware.

    Here are the results (lib, JVM, K lookups/s, memory used), sorted by lookups/s

    HSWW 1.6-server 547.2 31M
    HSWW 1.6-client 538.6 31M
    IP2C 1.6-server 436.6 2M
    IP2C 1.6-client 401.5 1M
    IP2C 1.5-server 383.1 1M
    HSWW 1.5-server 291.6 24M
    HSWW 1.5-client 290.6 24M
    IP2C 1.5-client 255.2 1M

    So, if you run 1.6 and have memory to spare you can do more lookups/s with HSWW. Other than someone analyzing the access.log-s of a high-traffic web site, I can’t imagine who would benefit from this 25% performance increase.

    By checking out sources of IP2C vs HSWW my guess is that 1.6 is significantly faster than 1.5 in compare/binary search against TreeMap.

    If anyone cares to run code themself, let me know and I’ll post it.

  21. aspect Says:

    I’m not in a position to try and generate a meaningful benchmark, but I wonder how the below trivial adaptation to use sqlite performs, given it seems an obvious and trivial optimisation.

    ip2c.sql
    * $ sqlite ip2c.sqlite
    * > CREATE TABLE ip2c (min unsigned int not null primary key, max unsigned int not null, id2 char(2), id3 char(3), name varchar(16));
    * > .read ip2c.sql
    * > .exit
    *
    * .. for sqlite3 use:
    * $ sed -i -e ’s/”,”/|/g’ -e ’s/”//g’ ip-to-country.csv
    * > .import ip-to-country.csv ip2c
    */

    function get_country($ip) {
    $int_ip = ip2long($ip);
    $ipdb = sqlite_open(‘ip2c.sqlite’);

    $res = sqlite_array_query($ipdb, “SELECT * FROM ip2c WHERE min $int_ip”);
    sqlite_close($ipdb);
    return $res;
    }

    ?>

  22. aspect Says:

    .. of course it was too optimistic to expect wordpress to not destroy my code. Try:

    http://pastecode.com/?show=m255234d5

  23. Vjekoslav Nesek Says:

    I’ve been checking if there are some optimization opportunities left for so called HSWW version.

    First I looked at how to cut memory use. By intern()-ing country code strings, memory usage fell from 31M to 15M. Then I replaced TreeMap with an array of custom IPRange objects. Instead of doing map lookups it was a binary search on an array. Now memory usage fell to 4-5M, which is more acceptable.

    Performance fell from 547K lookups/s to 505K.

    Then I started looking at possible speed optimizations and replaced InetAddress.getByName() calls with a hand-coded IP to long key conversion function.

    With this change performance jumped to 1547K or 1.5 million lookups/s. That’s a 3.5x performance increase over IP2C, running on 1.6-server… wish someone would compare it to PHP and D :)

    Quick math gives that optimized PHP version by Thomas is around 50 times slower than IP2c and if we multiply that by HSWW/IP2C-java factor we can estimate that Java is almost 180x faster than PHP!

    Sources with test data for both HSWW and IP2C Java version can be found at:

    http://rapidshare.com/files/135017220/hsww-geoip.zip.html

  24. Mark Says:

    So, if some people are saying that you can’t use PHP-APC, maybe you should disable java hotspot as well.

    PHP to Java is completely apples to oranges. Java is way faster than PHP at processing bytes, reading from files, etc. Pack/unpack are really slow.

    I don’t think anyone should be surprised at this comparison at all. What’s surprising to me is that the real world performance of java is still so slow in comparison to PHP. In terms of system load, memory used to run a server, etc etc.

    If you need 1 process to quickly return ip2c results, write a server in Java and connect to it from PHP. But you’d better have an extra 256 megs of memory lying around to launch Tomcat, deploy a war, etc. etc.

  25. Toomas Römer Says:

    @Nesek – Cool optimizations. I ran your code. The difference is quite big compared to ip2c and memory consumption aint that large.

    @Mark – Give some links to PHP >> Java @ Real world.

  26. Toomas Römer Says:

    @aspect – There are some results to SQLite being used for the binary searching http://www.reddit.com/r/programming/comments/6usfd/case_study_is_php_embarrasingly_slower_than_java/

  27. Jon Gilkison Says:

    Are you kidding me?

    You’re comparing a JIT’d VM based language versus an interpreted dynamically typed scripting language? And you’re shocked the performance is so off?

    If they had compsci licenses I would ask for yours to be revoked because you obviously don’t get it.

    You choose PHP because it’s easy and fast to develop. You choose Java because you have a penchant for self mutilation and are a bit of a masochist.

    The only valid comparison you can make here is how quickly you could write and deploy it. Measuring performance is just pure idiocy.

  28. Toomas Römer Says:

    From the point of trying to optimize a script to get as close as possible to the other implementation this is all valid.

    Had I started it in the sense of lets compare PHP & Java, I would have done it differently.

    I’m revoking my compsci revocation.

  29. lomo Says:

    I have no compsci lic., but I tend to agree with Jon.

    If you are comparing performance between compiled Java and interpreted PHP, then it’s only fair that you also take a PHP optimizer like Zend, or one of the many others listed here, into consideration:
    http://en.wikipedia.org/wiki/PHP_accelerator

    Zend claims 25x performance, so does that make PHP faster than Java?

    Doesn’t matter how far you optimize a script, they will never be the same, and this comparison is just plain silly.

  30. Vjekoslav Nesek Says:

    @Jon
    Well I guess I’m a bit of masochist. Sticking with Java since I tend to get a job done much faster in every sense of the word.

    Regarding the apples vs. oranges stuff, I’ll agree. Comparing PHP to Java performance is a complete waste of time and irrelevant. Java is 10-1000x faster on every computation-bound task I can think of.

    I was mostly interested to see how much performance I could extract from my lib when Thomas posted his results. The 8-140K lookups/s for the Java version advertised on the IP2C site looked really slow to me for such a simple task, so I tried to optimize a bit to see how much further I could push it.

    I’ve learned something from it, for example that memory caching things can take 50% more RAM in 64bit JVM, that String interning can reduce it by half and that parsing String to InetAddress instances is a real performance killer.

    @Jon… This quick-to-deploy attitude can cost you a lot of money when you need a farm of computers to run what a single machine can.

    …and taking offense on a fact that PHP is slower than Java is a bit childish, wouldn’t you agree?

  31. Jon Gilkison Says:

    Let me get this right.

    If I can do the exact same thing you can, but I can deploy it 2x as fast as you, in 1/3rd the development time, in 1/4th the amount of code … that is costing me what? It isn’t costing me salary, which I can put into hardware and still come out way ahead.

    Java has its places, don’t get me wrong, but for most web apps I’ll have to give a resounding “meh” when far easier, faster solutions are out there.

  32. Indrek Altpere Says:

    Well, with javarebel being out now, it speeds up development time (by removing need of full package redeployment to apply code changes) thus making it almost feel as an interpreted-on-the-fly language but still remaining as compiled language.
    But then again, it comes at a price of performance overhead and thus is not advised to be used in production systems. So for production stuff, the undeploy/deploy phase will still remain there.

    Generally, there is place for php and then there’s place for java, depending on the purpose and amount of users for the site.
    For some number crunching programs that need to be as fast as possible, Java would be a better choice than PHP, but in many cases, to get better memory usage too, C would be best of all.

    As for web applications, it is not as clear, both languages have advantages and disadvantages.
    In java, you can have the server (tomcat, glassfish etc) host the site code, keeping all the main config data, translations, database entity classes etc in memory thus removing unnecessary reading of config files, language files, unneeded database hits/queries etc per each page generation.
    Whereas in php you have to load everything for each page generation. But there are things to ease that: memcached server and php plugin/class to use it to store data, then there are cachers/optimizers to speed up script itself and reduce time spent on parsing php files (APC, eAccelerator, zend optimizer, ioncube etc). Main downside is that although they make php faster, they don’t provide such thing as shared memory as in: have one database row referenced from two separately running php scripts. So if you have 10 scripts that need a row with id of 123, that row is queried from db 10 times (or from memcached 10 times) and after all that, it is taking up memory space in 10 different running scripts.
    But as I said, this is not a black and white thing. The fact that there is no shared memory or PHP-provided application scope also makes it less vulnerable to small/medium memory leaks because, as you all know, at the end of the PHP script lifecycle the process closes and releases all resources. So you can accidentally have a few circular references appearing on every invocation and still see the server running happily. Whereas in Java (and all other similar languages with long-running servers, such as ASP.NET), a few accidental circular references and unreleased resources may end in a webserver crash pretty soon, because memory runs out even with a small number of people visiting the site.

    Don’t get me wrong, I’m not saying it’s a good thing to be lazy and/or code badly, but what I have seen so far is that you can write WTF code and write buggy code in absolutely every coding language. Thus, because no one is flawless, it’s actually good that php releases all resources after script completes, making it less error prone. Meaning if for every pageload a 100KB object/memory leak occurs, in php it would be no critical problem to bring the whole site down, but in java (and other similar) it would bring the site down quite easily, causing lots of headaches and financial losses.

    So, all in all, both have negative and positive sides.
    Java would be useful for sites that expect lots of users to view some certain pages or do some similar action at the same time so that the count of different objects in server’s memory at one point in time would be optimal. Thus keeping the memory usage normal and site fast since every page load does not need to load up everything again and store same thing in many different places.
    But when there are not so much requests coming in all at once but rather in a row with few spikes, php is very good to use, since it has nice fault tolerance for a bit unoptimized code and because of the fast deploy time (and with good IDE, quite similar development time to java coded in IDE).
    The most positive thing about php I myself like, is the zero deploy time: when I find a piece of unoptimized code or need to do a little change in production (for example some function gets called twice instead once or some billing logic needs to be changed/fixed etc etc) I edit the file, change it, save it and be done with it.
    As with java, the process would go generally like this: change code, compile, build production deployment package, upload it to java webserver (java webapps tend to grow quite big because of different libraries used so it takes usually quite some time unless you have production server in your own LAN), undeploy old application, deploy new one, check if all is ok.
    And the worst part is, that unless you do some special setup or stuff so that sessions are stored somewhere between undeploy and deploy, all users on the website at that time will be logged out, all session data gets lost. AND depending on the system setup again, they will see some page reporting about site update being in progress or just some general webserver error page.

    In general Java is indeed much (depending on program, sometimes very much) faster than PHP but when you don’t have all the fancy frameworks and libraries to use in java, it is generally much easier to develop in php than in java. Not to mention the class autoloading feature in php with what you can divide complex codeflows to classes so that minimal number class definitions are loaded up at any point of time.

    When using all of those nice and fancy framework, the fastness of java gets a hit plus the memory usage tends to get out of hands also because of the complexity of the codeflow that it takes to actually get to the part to run your function when user does something. I have seen stacktraces that have 50 to 150 or more lines in them, not a nice thing to see, and generally only 2 or 3 of those lines are actually about my own written code.
    Plus the shiny and actually cool component-approach in java thanks to different libraries based on application scope (storing complex viewstate in memory between requests), is two-edge sword IMHO. Depending the complexity of the page, it takes some certain amount of requests (depending on page) without storing cookies, to fill up all the server memory with unused and useless viewstates/sessions.

    Anyhow, both languages have their own places and situations where they are most effective and useful, but comparing a language made for creating websites fast and efficiently to a universal compiled language in a number-crunching test is a lost cause from the start for the webpage-creating language :P But it does not automatically mean that “Java is the best language for everything” ;) I bet some assembler code would be even faster, but we don’t see too many applications written in assembler for some reason :P

  33. מחשבות, מחשבים, ושאר דברי בלע » Blog Archive » A user contributed significant performance improvements to IP2C Says:

    [...] the thing for a whole night and improved the performance of the PHP version by 150%. Toomas wrote an interesting post about the changes he made, and sent me the changes. [...]

  34. Pierre Says:

    For a “JavaRebel”-like ANSI C tool, have a look at:

    http://trustleap.ch/

    And, before you ask, yes, ANSI C scripts are (much) faster than Java.
