PHP file_exists() Performance

For one of our projects we need a large amount of cheap storage, and an NFS-mounted NAS suits us. But we also want to load the most recently used files from the server’s local disk, so that the notoriously slow NFS protocol doesn’t get the better of us during peak times. We’ll simply run a cron job each hour to mirror our most recently used files to the local disk. Think of it as a very simplified reverse proxy cache.

Now, we could store the current location of each file in the database (e.g. file1 is on the NAS, file2 is on both the NAS and the local disk), but that would mean more work and an extra query on the database.

Databases are generally more expensive to scale than other website components, so we are simply going to check whether the file exists on the local disk. If it does, we serve it from there; if not, we load it from the NAS directory.
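As a rough illustration of that lookup, here is a minimal sketch. The function name `resolve_file_path` and the two directory parameters are hypothetical and not taken from our production code; it only assumes that the local cache and the NAS mount are two directories holding files under the same names.

```php
<?php
// Hypothetical helper: the function name and directory parameters are illustrative only.
function resolve_file_path(string $name, string $localDir, string $nasDir): string
{
    // basename() drops any directory components from the requested name.
    $safeName = basename($name);

    // Prefer the copy on the local disk if the hourly mirror has put one there.
    $localPath = rtrim($localDir, '/') . '/' . $safeName;
    if (file_exists($localPath)) {
        return $localPath;
    }

    // Otherwise fall back to the NFS-mounted NAS directory.
    return rtrim($nasDir, '/') . '/' . $safeName;
}
```

A caller would pass the requested file name plus the two directories and read the file from whichever path comes back. Note that the sketch does not check that the file exists on the NAS; it simply assumes the NAS holds everything.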

This raised the question of whether checking if the file exists on the local disk takes too long, especially when checking several million times per day. The answer is no. Using PHP to check whether 10 million different files exist on the local disk takes approximately 40 seconds. That’s about 4 millionths of a second per check, which is acceptable for our needs of 100,000 image loads per day.
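The arithmetic behind those figures can be spelled out in a few lines. This is only a back-of-the-envelope sketch using the rounded numbers quoted above (40 seconds, 10 million checks, 100,000 loads per day); the variable names are made up for illustration.

```php
<?php
// Rounded figures from the measurement described above.
$totalSeconds = 40;        // approximate time for all checks
$checks       = 10000000;  // number of file_exists() calls
$dailyLoads   = 100000;    // expected image loads per day

// Cost of a single check, in seconds.
$perCheck = $totalSeconds / $checks;

// Total time spent on existence checks per day at the expected load.
$dailyOverhead = $perCheck * $dailyLoads;

printf("%.1f microseconds per check\n", $perCheck * 1e6);
printf("%.1f seconds of checking per day\n", $dailyOverhead);
```

In other words, at 100,000 loads per day the existence checks add up to well under a second of work in total.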

Here’s the code we used to check:

<?php
// Baseline: time an empty loop so the loop overhead is known.
$x = 0;
$start_time = time();
while ($x < 10000000)
{
    $x++;
}
print "loop without checks took ";
echo time() - $start_time;
print " seconds <br /><br />";

// Same loop, with one file_exists() call per iteration.
$x = 0;
$start_time = time();
while ($x < 10000000)
{
    $name = $x . ".wmv";
    if (file_exists($name))
    {
        // do nothing
    }
    $x++;
}
print "loop with checks took ";
echo time() - $start_time;
print " seconds <br /><br />";
?>

And here was the output from the script:

loop without checks took 1 seconds

loop with checks took 41 seconds
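Since time() only has one-second resolution, anyone repeating the test may prefer a finer-grained timer. The following is a minimal sketch of that variant, not the script we ran: the helper name `time_existence_checks`, the smaller iteration count, and the use of the system temp directory are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: times $count file_exists() calls against "<n>.wmv" names in $dir.
function time_existence_checks(int $count, string $dir): float
{
    // microtime(true) returns the current time as a float with sub-second precision.
    $start = microtime(true);
    for ($x = 0; $x < $count; $x++) {
        file_exists($dir . '/' . $x . '.wmv');
    }
    return microtime(true) - $start;
}

// Assumption: 100,000 checks against the temp directory, purely as an example.
$count   = 100000;
$elapsed = time_existence_checks($count, sys_get_temp_dir());

printf(
    "%d checks took %.3f seconds (%.2f microseconds each)\n",
    $count,
    $elapsed,
    $elapsed / $count * 1e6
);
```

The numbers it prints will depend on the machine, the file system, and whether the files actually exist, so they are not directly comparable with the output above.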

Obviously the next step to increase performance would be to make the “local disk” that caches the files a RAM disk. However, we’re many millions of image loads per day away from needing to do that. For now we’ve solved our limited storage issues without taking a performance hit.

2 Responses to “PHP file_exists() Performance”

  1. Thiemo says:

    Did the 1Mio files all lie in the same folder?

  2. admin says:

    Yes, but feel free to try it with multiple folders, and let us know if you see any difference. I don’t believe you will find any significant difference in performance. The server was using the EXT3 file system.
