Put Your PHP App on Steroids – Optimizing with APC Cache

Nothing’s cooler than writing a bad-ass site or application and watching it gain popularity and a significant user base. By the same token, nothing’s more frustrating than watching your app fall on its face when it’s running under high load. If you’re like me, you know how disheartening this can be, as it usually means that it’s time for a hard dose of reality: your code probably isn’t as awesome as you thought it was.

Or is it? There’s a whole slew of things that a person could point a finger at when code runs slowly. The thing that gets called out most often is code with a lot of overhead (a ton of includes, excessive logic, and the like), and right behind that is a poorly designed database (unoptimized indexes, no indexes, ridiculous numbers of joins, blah, blah, blah). Well, let’s assume for a minute that you’ve got a fair amount of experience under your belt, so you know your code is pretty damn optimal, and you also have a DBA buddy that took a look at your database and helped you tweak it up a bit. What do you do if this isn’t enough? Before you say, “throw more / better hardware at that mo-fo”, why not take a moment and learn about APC: Alternative PHP Cache…

OK, as usual, let’s make sure we’re on the same page about a few things before we get into the goodness of APC. It is very important to understand that despite all your optimization work, you’re always going to run into two major hurdles when it comes to keeping your app running blazing fast: memory, and database bottlenecking. You’re always going to need more memory, and at some point, your database is going to be slowing your app down… that’s just the way it is. True, there are a lot of things I’m glossing over here, and there are even more points that could be debated… if you want to talk about it, feel free to contact me, I love talking code. For the sake of brevity and clarity, however, we’re going to steer clear of the nitty-gritty and look at things in a much broader scope. So, before you get ready to get that beefier server, or a separate database server, or anything else (as much fun as new toys can be), let’s take a look at what can be done with your existing hardware and code.

I think it’s fair to say that a lot of applications retrieve the same information from a database over and over again, and that this information tends to get updated at predictable intervals. Wouldn’t it be nice if we could store this data somewhere so we don’t have to keep pinging the database for the same thing? Wouldn’t it also be pretty cool if, when that data changes, we could update it not only in the database, but in the “storage area” as well? Good news kids, you can certainly do that… all with the magic of caching… APC caching!

I’m going to show you how APC can be leveraged to cache your application data, thus eliminating a ton of repetitive trips to the database, and thus removing your bottleneck! As an added bonus, I’m also going to do my best to explain some of the other goodness of APC and caching in general, as a lot of the articles floating around the web assume you simply understand what “opcode caching” and other stuff like that means (I certainly didn’t for a while).

Hopefully, if you’ve written an app that would benefit greatly from APC, you understand how PHP works. If you don’t, here’s a quick review… PHP is an interpreted language, which means that the code you write isn’t compiled ahead of time into a binary, the way a desktop application is. Instead, every time a script is requested, PHP parses it, compiles it into an intermediate form called opcodes, runs those opcodes, and then throws them away. Obviously, this is a really fast process, but it’s also wasted work whenever the script hasn’t changed between requests. Well, that’s why APC was developed (and some of the people who work on PHP itself help maintain it, in fact). APC keeps the compiled opcodes in shared memory, so the next request can skip the compile step. That’s all an opcode cache is: a cache of pre-compiled code. By simply installing APC and enabling it, you will probably see a noticeable speed improvement in your app. There’s a lot more we can do with APC to speed up an app, and we’ll get to that shortly, but first I need to answer a more pressing question: “How the hell do I get and install APC?”

It’s really easy, so don’t fret. APC is free, and it’s a PECL extension, so there’s no need to re-compile PHP. What’s more, the default settings are pretty darn optimal, so there’s really no need to tweak the configuration once you’ve got it up and running. Keep in mind, since you’re going to be caching everything in memory, you’re going to need some memory to spare. So, don’t install this on a server with 256 megs of RAM, but don’t go thinking you need to run out and get something with 4 gigs of RAM either. You can get by just fine with a gig or so (even 512 if you want), and that’s a pretty standard amount you get with most dedicated / VPS setups these days. Anyway, rather than going into the actual details of installation, just check out the PHP page for APC. It’s got everything you need, including links to installing PECL extensions: www.php.net/apc.
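If you want a quick sanity check once the install is done, something like the following sketch will do. It only leans on PHP’s built-in extension_loaded() and ini_get(); the apcStatus() name is just something I made up for illustration, and the values you get back will depend on your own php.ini:

```php
<?php
// Summarize whether the APC extension is loaded and how it is configured.
// apcStatus() is a hypothetical helper name, not part of APC itself.
function apcStatus()
{
    $loaded = extension_loaded('apc');

    return array(
        'loaded'   => $loaded,
        // ini_get() returns false for directives PHP does not know about,
        // so these two values are only meaningful when APC is loaded.
        'enabled'  => $loaded && (bool) ini_get('apc.enabled'),
        'shm_size' => $loaded ? ini_get('apc.shm_size') : null,
    );
}

print_r(apcStatus());
```

If 'loaded' comes back false, the extension line in your php.ini is the first place I’d look.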

Now, assuming you’ve got everything installed and set up, let’s get busy looking at how we can use APC in our own code to make things more gooder. Let’s pretend you have an application that fetches some news to be displayed on a home page, and this page gets a lot of traffic, but the news isn’t updated all that often (maybe once or twice a day). You’ve probably got something that looks like this to fetch all that:

function getNews()
{
     global $link; // assumes the MySQL connection is opened elsewhere

     $query_string = 'SELECT * FROM news ORDER BY date_created DESC limit 5';
     $result = mysql_query($query_string, $link);
     ....
     return $newsThatYouParsed;
}

Let’s take a look at what we could do to have APC store the rows from that query in memory so we don’t have to hit the database constantly for relatively static data. (Note that it’s the parsed rows we cache, not $result itself, since a database result resource can’t be stored.) A good way to think about caching is to treat it like an array. You’re going to store and retrieve data by using a key, the same as you would in an array. The only difference is that we’ve also got to decide how long we want to store this data in cache. This should all make a bit more sense once you take a look at the following code, which defines a class that we can use to interact with the cache (and we’ll later use in our getNews() example):

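class CacheManager
{
     // returns the cached value, or false if the key is missing or expired
     public static function fetch($key)
     {
          return apc_fetch($key);
     }

     // $ttl is in seconds; 0 means the value never expires on its own
     public static function store($key, $data, $ttl)
     {
          return apc_store($key, $data, $ttl);
     }

     // removes the key from the cache right away
     public static function delete($key)
     {
          return apc_delete($key);
     }
}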
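To make the “treat it like an array” idea concrete, here is a minimal sketch of the same fetch / store / delete shape backed by a plain PHP array instead of APC. ArrayCache is a hypothetical stand-in I’m using purely for illustration (it is not how APC works internally), and the extra $now argument is an assumption added so that expiry can be shown without actually waiting:

```php
<?php
// Hypothetical in-memory stand-in with the same method names as the
// CacheManager above, so the expiry behaviour can be seen without APC.
class ArrayCache
{
    private $items = array();

    // $now is injectable so expiry can be demonstrated without sleeping.
    public function store($key, $data, $ttl, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        // A lifetime of 0 means "no expiry" in this sketch.
        $expires = ($ttl > 0) ? $now + $ttl : null;
        $this->items[$key] = array('data' => $data, 'expires' => $expires);
        return true;
    }

    public function fetch($key, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        if (!isset($this->items[$key])) {
            return false; // miss: never stored, or already deleted
        }
        $item = $this->items[$key];
        if ($item['expires'] !== null && $now >= $item['expires']) {
            unset($this->items[$key]);
            return false; // miss: the lifetime has run out
        }
        return $item['data'];
    }

    public function delete($key)
    {
        $existed = isset($this->items[$key]);
        unset($this->items[$key]);
        return $existed;
    }
}

$cache = new ArrayCache();
$cache->store('greeting', 'hello', 60, 1000); // stored at t=1000, lives 60 seconds
var_dump($cache->fetch('greeting', 1030));    // string(5) "hello"
var_dump($cache->fetch('greeting', 1060));    // bool(false)
```

The real thing keeps its data in shared memory across requests, which a plain array obviously can’t do, but the key / value / lifetime contract is the same.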

Pretty straightforward, right? Everything should make sense to you, except perhaps the $ttl. That’s short for “time to live”: the number of seconds we’d like the value to stay in cache (or 0 to keep it until we explicitly remove it, or until the cache is cleared or the web server restarts). Anyway, let’s assume that you’ve included this class definition somewhere else in your app, and you now want to use it to speed up your news function. Well, we need to decide what key to use when we store this data, and this decision depends on whether or not other functions might be using this cached data as well. This key needs to be unique enough that you won’t accidentally overwrite it elsewhere in your code, and easy enough to rebuild that you can fetch it without doing a ton of processing. Think about it this way: you might have a query that gets run often (maybe fetching a user’s name from the database to display on every page), but it’s slightly different for each visitor (everything’s the same, except perhaps a user id, or date range). There’s a really easy way to deal with this issue, and rather than explain it, I’ll just implement it in my above getNews() example:

// make sure the cache class is included somewhere
function getNews()
{
     global $link; // assumes the MySQL connection is opened elsewhere

     $query_string = 'SELECT * FROM news ORDER BY date_created DESC limit 5';
     $key = md5($query_string);

     // see if this is cached first... (apc_fetch returns false on a miss)
     $result = CacheManager::fetch($key);

     if($result === false)
     {
          // not cached, so hit the database and parse the rows into an array
          $queryResult = mysql_query($query_string, $link);
          $result = array();

          while($line = mysql_fetch_object($queryResult))
          {
               $result[] = $line;
          }

          CacheManager::store($key, $result, 3600); // whatever TTL is right for you
     }

     ....
     return $newsThatYouParsed;
}
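The same trick stretches to queries that differ per visitor: fold whatever varies (a user id, a date range) into the string you hash. Here is a minimal sketch, assuming a hypothetical cacheKey() helper and a made-up users query; neither comes from APC or from the app above:

```php
<?php
// Build a cache key from a query template plus the values that vary per
// request. cacheKey() is a hypothetical helper, not part of APC.
function cacheKey($query, array $params = array())
{
    // serialize() turns the parameter list into a string, so different
    // values (or a different order) end up as different keys.
    return 'sql_' . md5($query . '|' . serialize($params));
}

// Illustrative query only; the table and column names are made up.
$query = 'SELECT name FROM users WHERE id = %d';

$keyForUser7  = cacheKey($query, array(7));
$keyForUser42 = cacheKey($query, array(42));

var_dump($keyForUser7 === cacheKey($query, array(7))); // bool(true)
var_dump($keyForUser7 === $keyForUser42);              // bool(false)
```

The 'sql_' prefix is just a habit of mine for spotting these keys among whatever else ends up in the cache.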

Pretty neat, huh? By simply using an MD5 hash of the query string, we’re pretty reliably guaranteed to have a unique key for that dataset, and we don’t need to write a ton of extra logic to remember that key in the future. But what do you do if the news is updated before the cache expires? Well, it depends on whether or not it needs to be super-fresh. Once the item expires (assuming the ttl wasn’t zero), the next request misses the cache, goes to the database, and stores a fresh copy. This is nice because, in our example, essentially only one request per hour ever hits the database for your news. Every other user sees the cached results. No more database bottleneck! If you need the cache to update whenever news is added, regardless of expiration, just add some code to your update functionality. Say you had an update function, here’s what you could do to make it update the cache and the database:

function updateNews($newsData)
{
     global $link; // assumes the MySQL connection is opened elsewhere

     $query_string = 'UPDATE news SET ...';
     mysql_query($query_string, $link);

     // this is really basic, and you should do something much more elegant
     // (the SELECT must match the one in getNews() exactly, or the keys won't match)
     $query_string = 'SELECT * FROM news ORDER BY date_created DESC limit 5';
     $result = mysql_query($query_string, $link);
     $resultsArray = array();

     while($line = mysql_fetch_object($result))
     {
          $resultsArray[] = $line;
     }

     CacheManager::store(md5($query_string), $resultsArray, 3600);
}

Why not just call “getNews()”? What if it had an echo statement in there? Wouldn’t look so good having your news appear randomly every time you updated it, would it?
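One way around that, sketched below under a few assumptions, is to split the work in two: one function that loads the rows and never prints anything, and another that only renders. The names loadNews() and renderNews() are hypothetical, $fetchRows stands in for the real database call, and $cache is a plain array standing in for APC so the sketch runs anywhere (closures need PHP 5.3 or later):

```php
<?php
// Loads the news rows, using the cache when possible. Never prints anything,
// so it is safe to call from an update function to refresh the cache.
function loadNews($fetchRows, array &$cache, $forceRefresh = false)
{
    $key = md5('latest_news');

    if (!$forceRefresh && isset($cache[$key])) {
        return $cache[$key]; // cache hit: no database work
    }

    $rows = call_user_func($fetchRows); // cache miss or forced refresh
    $cache[$key] = $rows;
    return $rows;
}

// Turns rows into markup and returns it; the caller decides when to echo.
function renderNews(array $rows)
{
    $html = '';
    foreach ($rows as $row) {
        $html .= '<li>' . htmlspecialchars($row['title']) . "</li>\n";
    }
    return $html;
}

$cache = array();
$dbCalls = 0;
$fetchRows = function () use (&$dbCalls) {
    $dbCalls++; // count how often the fake "database" is hit
    return array(array('title' => 'APC & you'));
};

echo renderNews(loadNews($fetchRows, $cache)); // first call hits the "database"
echo renderNews(loadNews($fetchRows, $cache)); // second call is served from cache
loadNews($fetchRows, $cache, true);            // silent refresh, as an update would do
```

With the loading and the displaying pulled apart like this, the update code can reuse the loader instead of repeating the query.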

Anyway, that’s a real quick look at how easy it is to super-charge your app with a little bit of extra code, and one cool PHP extension. In my personal experience, it does make a huge difference with high-load applications, but it does have one hitch: this is only good if your app runs on one server. If you have multiple servers, the cache can’t be shared… not with APC at least.

Which brings me to the closer…
Very soon, I’ll talk about what you can do if you need to have, and use, a shared cache across many servers (it’s actually the same technology the folks at Facebook use, and was invented to speed up LiveJournal). In the meantime, take a few minutes to play around with APC… I think you’ll come to the same conclusion that I have: It’s pretty damn bad-ass ;)

Others’ Thoughts   (10 so far...)


  • Christian Decker
    Apr 6 '08 at 8:54 pm

    Great article, I’m definitely adding APC capabilities into my Database abstraction layer from now on.
    I was wondering whether I can store arbitrary datasets in the cache, for example some data structures for complex routing calculations that I’d like not to repeat every time.


  • Ian
    Apr 7 '08 at 12:44 am

    @Christian,

    You can store anything that php can serialize in the cache… which is just about anything in php ;)

    Thanks for the comment!


  • Christian Decker
    Apr 8 '08 at 8:21 pm

    Thanks Ian for responding so quickly. I’m still having a small problem with the above example code as it saves a resource descriptor in the cache. Is apc_store intelligent enough to extract the result rows from the resource, or did I get something wrong with the serialization?


  • Ian
    Apr 8 '08 at 8:44 pm

    Christian,

    I actually missed a bit of my code in the examples… you have to parse the result set, then everything’s hunky-dory. I’ve updated the samples appropriately (I’m just used to having them pre-parsed by a database abstraction layer)


  • Jilles
    Apr 11 '08 at 2:08 am

    This is a good article. One note though: for large systems you’d want a cache that is local to each machine (APC can do this for you), but also a cache that is centralized (e.g. memcached). You’ll want this because as the number of servers grows, so does the number of cache misses (depending on the data and its access patterns).


  • Ian
    Apr 11 '08 at 3:07 am

    @Jilles

    You’re right, but I didn’t want to try and wrap them up into one post since there’s a bit more to getting memcached going than a simple PECL extension… I did mention right at the closer of the article that I’d hopefully be getting around to working with memcached (I’m just cooking something else up first that should be pretty cool ;)

  • IT-Republik-php
    Apr 11 '08 at 4:44 am

    Web apps on speed: an introduction to caching with APC…

    Once a web application becomes popular, the load can quickly become …


  • PHP cache performance
    Oct 12 '08 at 12:57 pm

    I was about to have a whinge about the needless, habitual reaching for OOP techniques in the simplest code examples – and then I was going to harp on about how you are aiming for better performance, but that OOP approaches don’t generally lend themselves to such ends.. and then I decided to run some experiments based on your code (http://www.spiration.co.uk/post/1392/PHP performance test – functional, or OOP comparison). Now I have to retract all of the above, since your code appears to perform distinctly better than the procedural equivalent. That’s gotten me scratching my head now :) ..


  • Radek Havelka
    Mar 5 '09 at 5:58 am

    Hi,
    first of all, thank you for the sharing of this code and article.
    Let me point out that the code in the article is not 100% correct: the class definition uses “fetch” and “store”, but the code that follows uses “get” and “set” methods, so if someone (like me :) ) blindly copies the code, it will not work :) Otherwise, the method is great and APC caching reduced my pure PHP server load by 50%, which is a very significant number. Thanks again.
    R.


  • Dominic Watson
    Jan 10 '11 at 6:18 am

    He’s not using an object oriented approach for performance reasons… I don’t think there’s any need to dabble in such a minimal performance gain :D
