A while back, I was running into hangs on Facebook with the HTML5 parser. I wanted to reproduce the problem locally and minimize it, but Facebook is pretty complicated and loads content using Ajax, so the traditional approach of saving the page didn't work. I needed a different approach, so I hacked up an HTTP proxy that can record and replay server interactions. It works by writing out the result of each GET request to a file named after the Request-URI. Once you're done recording an interaction with the server, you can switch to replay mode: instead of proxying, the recording is played back to the browser.
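The scheme above can be sketched in a few lines of Python. This is a hypothetical illustration, not the actual code from the repository: the filename mapping (`uri_to_filename`), the flat `recordings/` directory, and the record/replay mode switch are all assumptions about one way such a proxy could work.

```python
# Minimal record/replay HTTP proxy sketch (hypothetical; not the real
# http-recording-proxy code). In "record" mode, GET requests are
# forwarded upstream and each response body is saved to a file named
# after the Request-URI; in "replay" mode, the saved files are served
# back instead of contacting the server.
import os
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

RECORD_DIR = "recordings"  # assumed directory for saved responses
MODE = "record"            # or "replay"

def uri_to_filename(uri: str) -> str:
    # Percent-encode everything unsafe so each Request-URI maps to a
    # single flat filename (an assumption; the real proxy may differ).
    return urllib.parse.quote(uri, safe="")

class RecordReplayProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # For a proxy, self.path is the absolute Request-URI,
        # e.g. "http://example.com/index.html".
        path = os.path.join(RECORD_DIR, uri_to_filename(self.path))
        if MODE == "replay":
            try:
                with open(path, "rb") as f:
                    body = f.read()
            except FileNotFoundError:
                self.send_error(404, "no recording for this URI")
                return
        else:
            # Record mode: fetch from the real server, then save.
            with urllib.request.urlopen(self.path) as resp:
                body = resp.read()
            os.makedirs(RECORD_DIR, exist_ok=True)
            with open(path, "wb") as f:
                f.write(body)
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 8080) -> None:
    # Point the browser's HTTP proxy setting at 127.0.0.1:<port>.
    ThreadingHTTPServer(("127.0.0.1", port), RecordReplayProxy).serve_forever()
```

A real version would also need to preserve response headers and status codes, but this captures the basic record-then-replay idea.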
I should mention a couple of limitations to this approach. First, it assumes that the resource at a particular URL doesn't change over the recording session. Second, it assumes that the responses depend only on the requests recorded and not on the time of day or anything like that. Even with these limitations, the proxy seems to work well enough.
It's currently pretty hacky, but you can grab it from git://anongit.freedesktop.org/~jrmuizel/http-recording-proxy. There's a README in the repository explaining how to run it and replay recordings. I'd love to hear if this is of any use to anyone else, or if you have patches for improving it.