The Gamer Corner
Don't let the name fool you, we don't actually play games

Adventures in Cache-Busting

Adventures in Cache-Busting – August 26, 2013 1:39 AM (edited 8/25/13 9:39 PM)
Talraen (2373 posts) Doesn't Play with Others
I ran into a bug in one of our features last night which I found particularly interesting, so I thought I would share. Fair warning: there is no guarantee that this will be nearly as interesting to you!

So, the problem: we have a graph widget which is powered by XML files that are constantly updating. The directory where the file resides has a cache time of 30 minutes, but the file is being updated every 30 seconds. Further, the XML file is referenced once by the page it is on, and then the next file to load in sequence is referenced in the XML file itself (sort of a linked list setup).

A primer on web cache busting
Now, before I continue, a brief summary of how query string cache busting works on the web, in case you're not aware. Simple caching will save a copy of a page based on its URL, so www.sample.com/index.html and www.sample.com/page.html are cached separately, but a second call to www.sample.com/index.html will get the same result as the first - as long as it is still cached.

A query string, which begins with ?, is part of the URL but doesn't necessarily make any difference on the page. If we requested www.sample.com/index.html?cache=busted, we would get a new copy of that page, effectively bypassing the cache. However, another request to www.sample.com/index.html?cache=busted would be cached, so it's important for each cache buster to be different (for this reason, extremely large random numbers are often used).
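A minimal Python sketch of the idea above, assuming a cache that keys purely on the full URL. The `add_cache_buster` and `fetch` helpers and the `cache` parameter name are hypothetical, made up for illustration; nothing here is the actual site code.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "cache") -> str:
    """Return url with a fresh random value appended to its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, secrets.token_hex(8)))  # 64 random bits as hex
    return urlunsplit(parts._replace(query=urlencode(query)))


# Toy cache keyed by the full URL, query string included.
cache: dict[str, str] = {}
fetch_count = 0  # how many requests actually reached the "server"


def fetch(url: str) -> str:
    """Serve from the cache when the exact URL was seen before."""
    global fetch_count
    if url not in cache:
        fetch_count += 1
        cache[url] = f"copy #{fetch_count}"
    return cache[url]


plain = "http://www.sample.com/index.html"
first = fetch(plain)
second = fetch(plain)                       # same URL: cached copy
busted_a = fetch(add_cache_buster(plain))   # new URL: fresh copy
busted_b = fetch(add_cache_buster(plain))   # different buster: fresh again
```

The two plain requests share one cached copy, while each busted request has a URL the cache has never seen and so falls through to the server.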

Now if we just kept requesting the same XML file URL, it wouldn't update for 30 minutes. The ingenious solution (so I thought!) was to have each generated XML page add its own cache buster string. So instead of listing "www.sample.com/data.xml" it would be "www.sample.com/data.xml?12345", with a different number on each update. As I mentioned before, the updates happen every 30 seconds. To minimize lead time, the XML is retrieved every 15 seconds.

There was one issue with the page early on: the original XML request was not cache busted, so it always started with the oldest XML (though it could update from there). However, even when that was fixed, the file would only update once, and not again. Can you see the logic flaw in our system?

Spoiler alert: I didn't!
One more hint. When I went to the network tab of a browser that had been running the page, I noticed that every request had the same cache buster. Why is this?

The Answer!
The problem is that we were assuming each cache-busted file would be new. This obviously isn't the case since we're polling more often than the file gets updated, but even if we weren't, there's still a chance of an issue if anything gets delayed. What happens is this (and I'll use sequential cache busting for simplicity):

First, we read data.xml?x=1. Inside that file is a reference to data.xml?x=2. If the XML has been updated by the time we request that URL, we get the new file, which points to data.xml?x=3. At this point (or already at x=2, if the XML hasn't been updated yet), we request a URL before the file behind it has changed, so the cache stores the old content under the new URL, and that content refers to itself. That's the problem here: if you read data.xml?x=3 and get the same content as data.xml?x=2, then it's going to refer right back to data.xml?x=3, and you will be caught in that loop until the cache clears.
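The loop can be reproduced with a small simulation. This is only a sketch: the 30-second update, 15-second poll, and 30-minute cache come from the post, while the `XmlFile`, `origin`, and `simulate` names and the sequential `x=` numbering are illustrative assumptions.

```python
from dataclasses import dataclass

UPDATE_EVERY = 30      # seconds between regenerations of data.xml
POLL_EVERY = 15        # seconds between widget polls
CACHE_TTL = 30 * 60    # cache lifetime in seconds


@dataclass(frozen=True)
class XmlFile:
    version: int   # which regeneration of the file this is
    next_url: str  # the "next file to load" link embedded in the XML


def origin(now: int) -> XmlFile:
    """Current file on the origin server; the query string is ignored."""
    version = now // UPDATE_EVERY + 1
    return XmlFile(version, f"data.xml?x={version + 1}")


def simulate(start: int, polls: int) -> list[tuple[int, str, int]]:
    """Follow the linked list through a URL-keyed cache.

    Returns one (time, url requested, version received) tuple per poll.
    """
    cache: dict[str, tuple[XmlFile, int]] = {}  # url -> (file, expiry time)
    url = f"data.xml?x={origin(start).version}"
    log = []
    for i in range(polls):
        now = start + i * POLL_EVERY
        entry = cache.get(url)
        if entry is None or entry[1] <= now:
            # Cache miss: whatever the origin has right now is stored
            # under this URL, even if the file has not changed yet.
            entry = (origin(now), now + CACHE_TTL)
            cache[url] = entry
        log.append((now, url, entry[0].version))
        url = entry[0].next_url  # follow the link inside the XML
    return log


log = simulate(start=20, polls=6)
for now, url, version in log:
    print(f"t={now:>3}s  {url}  -> version {version}")
```

Starting 20 seconds into an update cycle, the widget sees one update (version 1, then version 2) and then requests data.xml?x=3 before version 3 exists. From then on every poll hits the cached copy of version 2 under that URL, which matches the "updates once, and not again" symptom.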

As far as I can tell, there is no absolute solution to this that does not require the caching rules to be changed. (Doing that's a pain, which is why we didn't go that route initially.) Polling the file less frequently than it is updated could work, but if an update fails for some reason, you'll have the same issue - it's a bit too risky. I'm open to other ideas, but for the time being I'll just have to appreciate this as a tricky little bug.

--
There is no Mythril Sword in Elfheim
Re: Adventures in Cache-Busting – August 26, 2013 7:09 AM (edited 8/26/13 3:09 AM)
Debonair (259 posts) Lurker Extraordinaire
Do you control the graph widget, or is it 3rd party? You could always modify the widget to call blahblah.xml?<generateGuid> and have the GUID be generated on each refresh.
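A rough sketch of that suggestion in Python, assuming a widget whose request code can be edited. The `fresh_url` helper is hypothetical, and `blahblah.xml` is just the placeholder name from the post.

```python
import uuid


def fresh_url(base: str = "blahblah.xml") -> str:
    """Build the URL for one refresh with a newly generated GUID appended."""
    return f"{base}?{uuid.uuid4()}"


# Each refresh asks for a URL the cache has never seen, so the widget
# never depends on a link stored inside a (possibly stale) XML file.
url_one = fresh_url()
url_two = fresh_url()
```

Because the buster is generated by the client on every refresh, a stale file can't trap it in a loop. The trade-off is that every poll is a cache miss.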

Re: Adventures in Cache-Busting – August 26, 2013 12:21 PM (edited 8/26/13 8:21 AM)
Talraen (2373 posts) Doesn't Play with Others
Yeah that would be the best solution, but it is a third party widget.

--
There is no Mythril Sword in Elfheim
Re: Adventures in Cache-Busting – August 26, 2013 12:39 PM (edited 8/26/13 8:39 AM)
Cuzzdog (1522 posts) Head of Gamer Corner R&D
I'm not sure I understand web coding quite well enough to follow all of this, but couldn't you use PHP to grab $_GET['cachebuster'] and return an incremented integer that was changed server side? Or can you only grab $_GET if a form is submitted?

Alternatively, why not just turn off client-side caching for that page? Isn't there an HTML tag that lets you do that? Or is that not considered a reliable solution?

Re: Adventures in Cache-Busting – August 26, 2013 1:13 PM (edited 8/26/13 9:13 AM)
Talraen (2373 posts) Doesn't Play with Others
Those are all good ideas, but this isn't a PHP widget. It's Flash, and precompiled at that, so we can't (easily) change it.

As for the caching, client-side caching isn't actually the problem here. Well, I assume it isn't - I didn't actually set up the server, but it seems unlikely we would have put a long cache time on XML files. The problem is server-side caching, which - let me tell you - is a huge bitch at all times. Of course it also lets our pages work without the servers melting, so that's something. :)

--
There is no Mythril Sword in Elfheim
Re: Adventures in Cache-Busting – August 26, 2013 1:37 PM (edited 8/26/13 9:37 AM)
Cuzzdog (1522 posts) Head of Gamer Corner R&D
What if you changed the cachebuster to be epoch time instead of an incremental number?

Re: Adventures in Cache-Busting – August 26, 2013 1:40 PM (edited 8/26/13 9:40 AM)
Talraen (2373 posts) Doesn't Play with Others
You'd have the same problem, for the same reasons. The actual value is irrelevant, because once you read the same file for the second time, it's always going to refer back to itself.

--
There is no Mythril Sword in Elfheim
Re: Adventures in Cache-Busting – August 26, 2013 1:46 PM (edited 8/26/13 9:46 AM)
Cuzzdog (1522 posts) Head of Gamer Corner R&D
Could you possibly tackle this the other way since it's a server side issue? Can you change the process that is updating the xml file every 30 seconds to force seed the cache with the updated file as well?

Re: Adventures in Cache-Busting – August 26, 2013 1:48 PM (edited 8/26/13 9:48 AM)
Talraen (2373 posts) Doesn't Play with Others
Depending on how the server-side caching worked, that would be a viable option. I think we actually do have that option on our CDN, but it's expensive and not worth it for something this minor. (It's a cool feature, but not really a traffic driver or anything.)

--
There is no Mythril Sword in Elfheim
Re: Adventures in Cache-Busting – August 26, 2013 1:53 PM (edited 8/26/13 9:53 AM)
Cuzzdog (1522 posts) Head of Gamer Corner R&D
How expensive can it be? Isn't that what you're essentially trying to do but from a client call perspective? One way or the other, every 30 seconds you need to update the cache with the new xml file. Why not do that when you know the file has changed instead of "guessing" from client calls? And if the answer to that is "we don't expect this file to be hit frequently", then it sounds like it just shouldn't be cached at all.

Re: Adventures in Cache-Busting – August 26, 2013 2:20 PM (edited 8/26/13 10:20 AM)
Talraen (2373 posts) Doesn't Play with Others
This is all due to how a CDN, such as Akamai, works. This is not something installed on the server, but rather an entirely different server (actually a large collection of servers). Here's a basic breakdown:

The server where the content is actually generated is the "origin server." It is not given a domain that anyone links to (since linking directly to it would bypass caching).
The DNS for the domain you want to use is actually configured for the CDN server.

At this point, whenever the CDN server gets a request, it checks to see if it's in cache. If so, it serves it, and if not (or if cache has expired), it gets it from the origin server. The effect is that there are many, many fewer hits on the origin server, so it doesn't need to be nearly as beefy to withstand the load.
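The check-cache-then-origin behavior described above can be sketched as a small pull-through cache. This is a toy model under stated assumptions, not how Akamai is implemented: the `PullThroughCache` class, its `ttl` and `clock` parameters, and the fake clock are all made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class PullThroughCache:
    """Toy CDN edge: serve from cache while fresh, otherwise pull from origin."""

    def __init__(self, fetch_origin: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_origin = fetch_origin
        self._ttl = ttl
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # url -> (body, expiry)
        self.origin_hits = 0

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[1] > now:
            return entry[0]                 # fresh copy: origin is not contacted
        body = self._fetch_origin(url)      # missing or expired: go to the origin
        self.origin_hits += 1
        self._store[url] = (body, now + self._ttl)
        return body


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
cdn = PullThroughCache(lambda url: f"{url} as of t={fake_now[0]}", ttl=20,
                       clock=lambda: fake_now[0])
first = cdn.get("data.xml")
fake_now[0] = 10.0
second = cdn.get("data.xml")   # within the TTL: cached copy, even if the origin changed
fake_now[0] = 25.0
third = cdn.get("data.xml")    # TTL expired: pulled from the origin again
```

Three requests cost two origin hits here, and the cache never learns that the origin changed; it only asks again once its own copy has expired.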

Now as a result of all this, there's no real way for the CDN server to know when a file changes. It only checks when the cache is clear. I believe there is an option to set up a push for certain files, but we don't use it because it's impractical. Alternately, you can get some space on the CDN server to serve things directly - but they don't like to set a short cache on these files.

As for not caching it, let's say we have 20,000 viewers (which is fairly low). At one pull per 15 seconds, that's 1,333 origin hits per second on average. That's a lot of traffic - that's the upper end of some load tests. I don't know how many servers Akamai has, but even if it was 1,000, that's 50 hits per second (with a 20-second cache, which is what we're going with) - less than 1/26th as much traffic.
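The arithmetic above, worked through in Python. The 1,000-server figure is the post's own guess, not a real number for Akamai, and the variable names are illustrative.

```python
viewers = 20_000
poll_interval_s = 15    # each viewer pulls the XML once per 15 seconds
edge_servers = 1_000    # guessed CDN server count from the post
cache_ttl_s = 20        # each edge server refetches at most once per 20 seconds

# No caching: every viewer poll reaches the origin.
uncached_hits_per_s = viewers / poll_interval_s

# With caching: at worst, every edge server refetches once per TTL.
cached_hits_per_s = edge_servers / cache_ttl_s

ratio = uncached_hits_per_s / cached_hits_per_s

print(f"uncached: {uncached_hits_per_s:.0f} hits/s")
print(f"cached:   {cached_hits_per_s:.0f} hits/s")
print(f"reduction: about {ratio:.1f}x")
```

Note that the cached figure depends only on the number of edge servers and the cache time, so it stays flat as the audience grows.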

--
There is no Mythril Sword in Elfheim