IP2C is a small library that provides IP to country resolution. It uses the free ip-to-country database. IP2C takes the database CSV file that is about 4mb and converts it into a ~600kb binary format and provides PHP and Java frontend to query the database.
The library is great, easy to convert an ip to a country and when using the country flags from it’s side project you could spice up your statistics with the country information. This a lot faster than using reverse DNS lookup.
The problem. The PHP implementation is a lot slower. Embarrassingly slower. Without any caching the Java version is able to do ~6000 queries per second. The PHP counterpart can push through ~850 queries. The implementations are the same. The stats provided by the author of the library are 8000 vs 1200. So about the same as my measurements.
I like PHP, I don’t use it that much anymore but I still care when I see such embarrassing numbers. I took the implementation and started profiling it. Spent the night running different tests and trying to optimize.
General outline of the algorithm is as follows. We take the dotted string IP and convert it to an IPv4 Internet network address (e.g. 69.55.232.153 becomes 1161291929). The DB holds sorted ranges of these addresses. A binary search will happen on these addresses and we have a country for the ip. Take a look at the implementation.
Lets see where the vanilla version of IP2C spends its time at. The results are based on 1000 iterations with
Xdebug enabled and visualized by
KCacheGrind. It processed about 210 IP addresses during this time.
IO part is surprisingly low. The internal fseek, fread constitute to 2% of the execution time. On the other hand the user level fseek which is just a wrapper alone uses 5%. readShort and readInt take 20% of the execution time.
function readShort() {
$a = unpack('n', fread($this->m_file, 2));
return $a[1];}
function readInt() {
$a =unpack('N', fread($this->m_file, 4));
return $a[1];}
function seek($offset){
fseek($this->m_file, $offset);}
Functions calls are expensive. Lets eliminate them. readInt, readShort, fseek are now inlined. Recursion changed to iteration (e.g. 14 000 less function calls). Able to process 400 queries per second compared to the previous 210.
We see that the latest profiling results have twice the number of freads and unpacks than fseeks. It seems that fseek is used to seek out the right position, read two numbers with unpacking them. The implementation confirms that. Luckily we could just read once (2 bytes more) and unpack once (2 unpackings with one invocation).
$a =unpack('N', fread($this->m_file, 4));
$np['ip'] = $a[1];
$a =unpack('n', fread($this->m_file, 2));
$np['key'] = $a[1];
// this can be changed to
$np =unpack('Nip/nkey', fread($this->m_file, 6));
How does this version stack up to the Java version? Lets disable profiling and run 100 000 iterations. Vanilla version processes ~850 IPs, when functions are inlined the number is around 1400. Java version can still do 6000.
Lets try caching. Peeking at the Java implementation shows that Java caching version (whopping 141 242 IPs per second – yup 141k) uses just a byte[] array and makes lookups from there instead of seeking and reading from file. Easy, lets do the same in PHP.
We read everything into a string and instead of fread with access the string elements with the offset. For fseek with just set the offset. We are using 600kb more memory but can increase the throughput to ~2800.
As it seems I’ve just wasted a night, I just should have checked the Computer Language Benchmarks. PHP in the sense of execution speed is uncomparable to Java.
The upside, we can still take the library, eliminate recursions, double unpacks and add caching. A small gain is still a gain.