This is just a quick entry for me to jot down some caveats that I encountered while making a simple instant messaging application for the browser using Asynchronous JavaScript, XML and PHP.
Properly Set the Content Type of the Returned Document
I stumbled around for a bit and couldn't figure out why the XML document returned by the PHP server script wasn't being parsed into walkable DOM objects on the client side by my JavaScript. Finally it dawned on me that the client must receive an XML content type in the response header. I used the following PHP code (before any output):
header("Content-Type: text/xml");
And suddenly everything started parsing from the client side!
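When responseXML comes back null, one quick client-side check is to look at the Content-Type header the server actually sent. The sketch below is only an illustration: the helper name isXmlContentType is made up for this example, and it assumes the browser only builds responseXML for XML MIME types (text/xml, application/xml, or a type ending in "+xml").

```javascript
// Hypothetical helper (not part of the original application).
// Returns true when a Content-Type header value names an XML MIME type:
// text/xml, application/xml, or anything ending in "+xml".
function isXmlContentType(headerValue) {
  if (!headerValue) {
    return false; // header missing: XML is not confirmed
  }
  // Drop parameters such as "; charset=utf-8", trim, and normalize case.
  var mime = headerValue.split(";")[0].replace(/^\s+|\s+$/g, "").toLowerCase();
  return mime === "text/xml" ||
         mime === "application/xml" ||
         /\+xml$/.test(mime);
}

// Possible use inside an onreadystatechange handler (req is the request object):
// if (req.readyState === 4 &&
//     !isXmlContentType(req.getResponseHeader("Content-Type"))) {
//   alert("Server did not send an XML content type; responseXML may be null.");
// }
```

If the check fails, the fix belongs on the server side, in whatever code emits the header.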
Preventing Browser Caching of the URL
When I finally got an XML document returned, I noticed that Internet Explorer would not retrieve a newer document on a subsequent request (though somehow Firefox was able to get a new XML document). Once I determined that different browsers were producing different results, I had an inkling that it was a caching problem on the client side. When doing a periodic "GET" via the XMLHttpRequest object, Internet Explorer was apparently checking its cache, noticing that the same URL had been fetched previously, and just using that stale document. IE had no way of knowing that the retrieved document could be dynamic content that might vary over time.
The way around this is to modify the URL slightly so that IE believes it is a new document. I did this using the following JavaScript code:
var req = new XMLHttpRequest(); // use ActiveXObject("Microsoft.XMLHTTP") if IE
req.open("GET", urlstr, true);
This inserts an (unused) URL variable into the string and fools the browser into thinking it is a new document (which, in fact, it is!). I have no idea if this should be considered a bug from the IE side or the FF side...
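As a rough sketch of the idea, the helper below appends a throwaway query parameter to a URL. The function name addCacheBuster and the parameter name nocache are assumptions chosen for illustration, not the names used in the original application; it defaults to a timestamp, but a random value would serve the same purpose.

```javascript
// Hypothetical helper: append an unused query parameter so the browser
// sees a URL it has not cached before.
// token is optional; when omitted, the current time in milliseconds is used.
function addCacheBuster(url, token) {
  var value = token !== undefined ? token : new Date().getTime();
  // Use "?" for the first parameter, "&" when the URL already has a query.
  var separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + "nocache=" + encodeURIComponent(value);
}

// Possible use with the request from the post (urlstr as in the post):
// req.open("GET", addCacheBuster(urlstr), true);
```

This sketch does not handle URLs that contain a "#" fragment; the parameter would need to be inserted before the fragment in that case.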
PHP File Locking
Since my little application uses the server's filesystem for its backend storage (and not a database) and transactions can occur in parallel and asynchronously, I took some care to make sure I locked the server-side files before accessing them. I used PHP's "flock" function:
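// Retry until the exclusive lock is acquired.
while( !flock($f, LOCK_EX) ) {
    usleep(50000); // wait 50 ms before trying again
}
$xmlstr = formatIntoXml(...);
fwrite($f, $xmlstr);
fwrite($f, "\r\n");
flock($f, LOCK_UN); // release the lock
fclose($f);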
Though I never experienced any odd conflict scenarios even before I implemented file locking (due to the relatively low network traffic that my app experiences), I'm convinced that this is a necessity for any real application (and a very persuasive argument for using a database as the backend storage mechanism).
Good pointers. I ran into the text/xml thing but I was ready for it (I just went through that with serving the wrong content-type for my SVG docs). As for refreshing the document, I think that might be controlled by the response headers too. Maybe Cache-Control, Expires, or Pragma: no-cache. Have a look at the response headers using the Information menu in the Firefox web developer plugin. Of course there’s nothing I know of that’s equivalent for IE, but what else is new. And for the last tip, a database is definitely the way to go. I’ve been doing a little MySQL in PHP and next to nothing with files, so that colours my view of course.
Yes, don’t forget the header, but by all means don’t make the mistake of thinking you have to use XML to transfer data. I’ll often be using the header
in order to squirt JavaScript object literals back to the script. In fact I have yet to run into a situation where XML was strictly necessary, much less the optimal format.
Chris,
Agreed, the content type need not be XML. In fact, I later reworked my online chat application to actually mangle the XML document before it’s sent over the wire. This means what gets sent across is plain text (text/plain) that looks to any sniffers or casual snooper like gibberish. I un-mangle the text after receiving it. My scheme is by no means foolproof, but it does illustrate your point.
Jeff
aha! i had just run into this (caching) problem on my new site and was wondering if this would work. thanks to google, and to you, for confirming that it will!
i also noticed, not sure this is relevant anymore, that IE has temporary internet file settings, and when you select the option ‘Every Time You Load The Page’ it will compare its cache against the new request, so you won’t run into this problem… but good luck getting all your visitors to fix their settings just to view your page!
Although very unlikely, the randomization in your code could produce a previously cached URL.
I solved this problem similarly (without randomization):
http://www.howtoadvice.com/StopCaching
Respectfully,
Lonnie Best
Agreed that it is unlikely, but I like your tip of using the timestamp instead of a random string…
Are you sure you can’t set another header on the PHP side to disable caching? Read the header docs at http://us2.php.net/header
“… I have no idea if this should be considered a bug from the IE side or the FF side…”
It shouldn’t be considered a bug at all. You’re telling the server to send a GET request. According to the W3 spec, GET requests are “idempotent”, meaning that the repeated requests with identical input yield identical output. This is why browsers and firewalls cache these types of requests.
You can get around problems entirely if you use POST instead of GET. Most of the time you really want POST anyway. Just change your code like so..
req.open("POST", urlstr, true); // Mission accomplished.
Yes, you can certainly use tricks to fool the browser into downloading a new copy for a GET request. However, you probably shouldn’t be doing this, depending on your intended goal.
I believe “idempotent” is supposed to mean that the side effects of repeated requests are identical to the side effects of one request. Note that this is in regards to “side effects”, not to “response” or “output”.
POST is supposed to be used to denote making a durable change in a resource of some kind.
The real issue is caching, and I think IE has never handled caching very well. (Note: I believe using SSL generally disables caching. Although it increases the bandwidth and computational requirements, it also keeps your transactions away from prying eyes.)
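The distinction drawn above between side effects and responses can be illustrated with a toy model. Everything in this sketch is hypothetical (an in-memory array standing in for the server's storage, and two made-up handler functions); it only shows that repeating a read leaves the state unchanged even though the response can differ over time, which is why caching such a response can return stale data.

```javascript
// Hypothetical in-memory stand-in for the chat server's storage.
var messages = [];

// GET-like handler: reads state and changes nothing, so calling it
// any number of times has the same side effects as calling it once (none).
function handleGet() {
  return messages.slice(); // return a copy of the current messages
}

// POST-like handler: makes a durable change, so each call adds a message.
function handlePost(text) {
  messages.push(text);
  return messages.length;
}

var first = handleGet();   // read before any message exists
var second = handleGet();  // repeated read: state is still unchanged
handlePost("hello");       // state changes here
var third = handleGet();   // same request as before, different response
```

Here first and second are empty while third contains the new message: the GET-like call never altered anything, yet its output changed because the underlying resource did.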
This is a technique I used to get around the caching problem. It is the most elegant solution I found.
ajaxRequest.open("GET", url, true);
// This ensures that the result is not cached.
ajaxRequest.setRequestHeader("If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT");
ajaxRequest.send(null);
I noticed this problem too, except that once IE on Windows had loaded content, it would never load it again. I got round this by adding a random number at the end of the URL after ‘?’. It’s a real cowboy workaround; coding for IE 5.5 and 6 is ever more difficult since it’s like 4 years behind everyone else.
I tried both the timestamp (or randomizing) and If-Modified-Since, and they both work in one environment. But in another development environment, IE fails to retrieve the newer version of the XML file. FF works just fine (without the timestamp or If-Modified-Since).
I use GET to retrieve the XML file as such
http.open("GET", "myfile.xml", true)
HTTP = getHTTPObject();
//var url = "news.xml" + "?ms=" + new Date().getTime();
var url = "news.xml";
HTTP.open("GET", url, true);
HTTP.setRequestHeader("If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT");
HTTP.onreadystatechange = Process;
HTTP.send(null);
On one server, Apache/1.2.33 (Unix) mod_ssl/2.8.22 OpenSSL/0.9.7e PHP/4.3.10 (as reported by alert(HTTP.getAllResponseHeaders())), it still loads the initial XML file from before the changes were made to it.
On the other server, Netscape-Enterprise/3.6 SP2, it loads the changed XML just fine as I’m making changes to the XML file.
Any ideas? FF works fine on either server.
Here’s some code you can put in a PHP file to prevent caching. It solved my problem using AJAX to hit a PHP page:
header("ETag: PUB" . time());
header("Last-Modified: " . gmdate("D, d M Y H:i:s", time() - 10) . " GMT");
header("Expires: " . gmdate("D, d M Y H:i:s", time() + 5) . " GMT");
header("Pragma: no-cache");
header("Cache-Control: post-check=0, pre-check=0, max-age=1, s-maxage=1, no-cache, must-revalidate");
session_cache_limiter("nocache");
Thank you, I’ve been searching in JavaScript manual and AJAX references for solutions, and nothing …
It was caching problem from the beginning, thank you guys, very much!
gg 🙂
Another gotcha – don’t leave any blank lines at the start of the XML file – I included a library someone else had written which did this before I started spitting out my Content-Type header and kept getting a null responseXML. As soon as I moved the Content-Type header to the top it was fine.
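A quick way to spot this from the client is to inspect responseText and see whether anything precedes the XML declaration, which must be the very first thing in the document. The helper below is a small sketch with a made-up name (xmlDeclarationOffset), not something from the original application.

```javascript
// Hypothetical diagnostic helper.
// Returns the position of the "<?xml" declaration in the response text:
//   0   -> the declaration is at the very start (what an XML parser expects)
//   > 0 -> that many characters of stray output (e.g. blank lines) precede it
//   -1  -> no XML declaration was found at all
function xmlDeclarationOffset(text) {
  return text.indexOf("<?xml");
}

// Possible use once the request has completed (req is the request object):
// var offset = xmlDeclarationOffset(req.responseText);
// if (offset > 0) {
//   alert(offset + " stray character(s) were sent before the XML declaration.");
// }
```

A positive offset usually points at whitespace emitted by an included file before the script's own output begins.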