Archive for the ‘review’ Category
Case study: Is PHP embarrassingly slower than Java?
Written by Toomas Römer on August 4, 2008 – 11:50 am
IP2C is a small library that provides IP to country resolution. It uses the free ip-to-country database. IP2C takes the database CSV file, which is about 4 MB, converts it into a ~600 KB binary format and provides PHP and Java frontends to query the database.
The library is great: it makes it easy to convert an IP to a country, and with the country flags from its side project you can spice up your statistics with country information. This is a lot faster than using a reverse DNS lookup.
The problem: the PHP implementation is a lot slower. Embarrassingly slower. Without any caching the Java version is able to do ~6000 queries per second. The PHP counterpart can push through ~850 queries per second. The implementations are the same. The stats provided by the author of the library are 8000 vs 1200, so about the same ratio as my measurements.
I like PHP, I don’t use it that much anymore but I still care when I see such embarrassing numbers. I took the implementation and started profiling it. Spent the night running different tests and trying to optimize.
The general outline of the algorithm is as follows. We take the dotted string IP and convert it to an IPv4 Internet network address (e.g. 69.55.232.153 becomes 1161291929). The DB holds sorted ranges of these addresses. A binary search over these ranges finds the one containing the address, and that gives us the country for the IP. Take a look at the implementation.
Let’s see where the vanilla version of IP2C spends its time. The results are based on 1000 iterations with Xdebug enabled and visualized by KCacheGrind. It processed about 210 IP addresses during this time.
The I/O part is surprisingly low. The internal fseek and fread account for 2% of the execution time. On the other hand, the user-level fseek, which is just a wrapper, alone uses 5%. readShort and readInt take 20% of the execution time.
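The outline above can be sketched roughly as follows. This is an illustration in Python, not the library’s PHP or Java code: the function names, the in-memory list of tuples and the placeholder country codes are all assumptions made for the example, and the sample ranges are documentation-only address blocks rather than real database entries.

```python
def ip_to_int(ip):
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


# Hypothetical sample data: (range_start, range_end, country), sorted by start.
# "XA", "XB" and "XC" are placeholder codes, not real database content.
SAMPLE_RANGES = [
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "XA"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.255"), "XB"),
    (ip_to_int("203.0.113.0"), ip_to_int("203.0.113.255"), "XC"),
]


def find_country(ranges, ip):
    """Iterative binary search over sorted, non-overlapping ranges."""
    target = ip_to_int(ip)
    lo, hi = 0, len(ranges) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end, country = ranges[mid]
        if target < start:
            hi = mid - 1      # target lies before this range
        elif target > end:
            lo = mid + 1      # target lies after this range
        else:
            return country    # start <= target <= end
    return None               # no range contains the address
```

The search is written as a loop rather than recursively, so each lookup costs a handful of comparisons and no extra function calls.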
We see that the latest profiling results have twice as many freads and unpacks as fseeks. It seems that fseek is used to seek to the right position, and then two numbers are read and unpacked separately. The implementation confirms that. Luckily we can just read once (2 bytes more) and unpack once (2 unpackings with one invocation).
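To illustrate the difference between two read/unpack pairs and a single combined one, here is a small Python sketch. The record layout (a 4-byte big-endian integer followed by a 2-byte short) is an assumption made for the example and may not match the real IP2C binary format; the function names are hypothetical too.

```python
import io
import struct

# Assumed record layout: unsigned 32-bit int, then unsigned 16-bit short,
# both big-endian. The real IP2C format may differ.
RECORD = struct.Struct(">IH")


def read_separately(f, offset):
    """One seek, then two reads and two unpack calls."""
    f.seek(offset)
    (value,) = struct.unpack(">I", f.read(4))
    (code,) = struct.unpack(">H", f.read(2))
    return value, code


def read_combined(f, offset):
    """One seek, one 6-byte read and a single unpack call for both numbers."""
    f.seek(offset)
    return RECORD.unpack(f.read(RECORD.size))
```

Both functions return the same pair; the second one simply halves the number of read and unpack calls per record.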
How does this version stack up to the Java version? Let’s disable profiling and run 100 000 iterations. The vanilla version processes ~850 IPs per second; when functions are inlined the number is around 1400. The Java version can still do 6000.
Let’s try caching. Peeking at the Java implementation shows that the Java caching version (a whopping 141 242 IPs per second, yup, 141k) uses just a byte[] array and makes lookups from there instead of seeking and reading from the file. Easy, let’s do the same in PHP.
We read everything into a string, and instead of calling fread we access the string elements at the current offset. Instead of calling fseek we just set the offset. We are using 600 KB more memory but can increase the throughput to ~2800 IPs per second.
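A minimal sketch of this caching idea, again in Python rather than PHP: the whole file is held in memory as bytes, "seeking" is just an assignment to an offset, and "reading" is an unpack at that offset. The class and method names are hypothetical, and the big-endian 4-byte and 2-byte field sizes are assumptions for the example.

```python
import struct


class CachedDb:
    """In-memory stand-in for the seek/read calls on the database file."""

    def __init__(self, data):
        self.data = data      # entire file contents as bytes
        self.offset = 0       # current read position

    @classmethod
    def from_file(cls, path):
        # Read the whole binary database once, up front.
        with open(path, "rb") as f:
            return cls(f.read())

    def seek(self, offset):
        # Replaces fseek: no system call, just remember the position.
        self.offset = offset

    def read_int(self):
        # Replaces fread + unpack for an assumed 4-byte big-endian integer.
        (value,) = struct.unpack_from(">I", self.data, self.offset)
        self.offset += 4
        return value

    def read_short(self):
        # Replaces fread + unpack for an assumed 2-byte big-endian short.
        (value,) = struct.unpack_from(">H", self.data, self.offset)
        self.offset += 2
        return value
```

The trade-off is the one described above: the full file stays resident in memory in exchange for lookups that never touch the disk.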
It seems I’ve just wasted a night; I should have checked the Computer Language Benchmarks first. In terms of execution speed, PHP is not comparable to Java.
The upside, we can still take the library, eliminate recursions, double unpacks and add caching. A small gain is still a gain.
Posted in Featured, report, review | 34 Comments »
COBOL blog platform
Written by Jevgeni Kabanov on April 1, 2008 – 2:59 pm
Several weeks ago, while working on the JavaRebel AI Module, we accidentally gave it access to our web server. Before we found out, it had rewritten our entire blog platform in COBOL. We are not sure where it learned to program in that, but when we tried the new platform, it was excellent. Not only is it a SOA-based RIA, but it’s fully written using REST, JSON and CAPS.
In fact it is now our firm belief that with technologies like that COBOL will make a return and become the language of choice for web development. I mean, who needs local variables, recursion, dynamic memory allocation, or structured programming constructs when we have a language that reads like plain English? All real programmers know how important it is to have code that reads well, and our new blog platform provides twice the scalability of Java on half the hardware to boot.
Tags: cobol, java, javarebel
Posted in review | 3 Comments »
Mozilla Prism gets an overhaul
Written by Jevgeni Kabanov on March 22, 2008 – 2:11 pm
Although two weeks late, I finally noticed that Mozilla Prism has been updated. Mozilla Prism is a “One Site Browser”, which is to say a browser started from your desktop and tied to one particular web site. I have been using it since the first release, mainly to separate the Google Mail, Reader and Calendar windows from the rest of my browsing experience.
The new version is a significant reworking of Prism. First of all, you no longer have to install a 6.6 MB application in addition to Firefox. Now you can just download a 500 KB Firefox extension, which will start Prism as a particular Firefox profile. And you can create desktop shortcuts to your web site in one click using “Tools -> Convert Website to Application”.
Secondly, Prism will finally pick up the favicon that the website is using and use it both as the shortcut icon and (drumroll!) the application window icon! Before, you had to download the icons manually and it would still use the Prism icon in the taskbar, which made it much harder to distinguish the windows. Having the GMail icon in the taskbar is just what I’ve been waiting for. Now, if only it would change on new mail…
However, no matter the changes, I’m still stuck with a major annoyance: no Firefox shortcuts work. And since half the sites on the Internet do not optimize for 1680×1050, my first reaction in Firefox is often Ctrl+, to increase the font size. Well, hopefully they fix it in the next release. You hear that, Mozilla?
P.S. I’m making this post from my nifty dow.ngra.de admin browser app, now where do I find an icon for that?…
Tags: firefox, web
Posted in review | 1 Comment »
WordPress Plugin Format
Written by Toomas Römer on March 19, 2008 – 4:20 am
Google Summer of Code has announced this year’s OSS projects that have been accepted to the program. While going through the list I stumbled upon some features that I would really like to have in WordPress, like Plugin Update Notification, One-click installation of themes/plugins and this, that and this. They all have something in common: they all require a better plugin format!
There are tons of plugins out there and only a couple support these features. Why? Well, if you look at the Writing a Plugin documentation you see that a plugin can be packaged in absolutely any way.
Plugins can be zip, rar, gz, bz2 etc. archives. They can contain any number of subdirectories that the administrator has to copy to certain other folders, and the requirements for meta information are quite relaxed. Automating installation, upgrade, deletion and versioning for all plugins is impossible with such relaxed rules.
WP plugins should have a solid structure (as FF extensions, Eclipse plugins, Google Gadgets, WARs and others already do) so that the management of the plugins can be automated. One-click install, update notifications or automatic updates, public repositories and other cool features can be built upon this.
Besides all the other great ideas already posted, I’ve submitted one that tries to address these problems from the ground up.
Tags: php wordpress gsoc
Posted in opinion, review | No Comments »
Aptana Jaxer or sliced bread?
Written by Jevgeni Kabanov on February 19, 2008 – 8:45 am
Sometimes a technology appears that is just so damn cool you are amazed. More often than not the ideas behind it can be quite simple.
Aptana Jaxer is exactly such a technology. There is nothing new about having a server-side API. There is nothing new about building applications in HTML and JavaScript. The genius part is running the Mozilla engine on the server side and having access to full server resources from the browser via a controlled environment with no extra layers.
As far as I understand, the communication between server and client works as follows:
- DOM updates are seamlessly propagated from the server to the client
- Function calls can be proxied to the server. All marshalling is done automagically
This pretty much means that code runs seamlessly in a mixed, secure environment.
Although the “runat” attribute that controls whether code is run server-side or client-side can be a bit counterintuitive at first, the setup makes it possible to build powerful applications with only one (count it, one) technology: HTML and JavaScript. And this is the technology you have to use anyway to even deploy something on the web.
Of course “the one technology” suffers from being dynamically typed and generally known for its quirkiness, but with JavaScript 2.0 support on the way to Mozilla and good IDE support (which Aptana is in a good position to provide) this technology might yet give a fresh meaning to the term web application.
An interesting question I’d like to ask the Aptana developers is whether we can add a third environment to the mix: desktop applications. Having the same API to use on the server, in the browser and (with fewer restrictions) on the desktop could give Adobe AIR a run for its money.
Tags: ajax, javascript, web
Posted in review | 3 Comments »

