Wednesday 5 August 2009

Configuring web servers for HTML5 Ogg video and audio

When serving HTML5 Ogg <video> or <audio> from your web server, there are a number of things you can do to make videos load faster. This post outlines how to configure your web server to improve HTML5 video and audio playback performance.

1. Serve X-Content-Duration headers

The Ogg format doesn't encapsulate the duration of the media, so for the progress bar on the video controls to display the duration of the video, we need to determine it some other way. You can either support HTTP1.1 Byte-Range requests (see 2. below) or, better yet, serve an X-Content-Duration header for your Ogg videos. This provides the duration of the video in seconds (not HH:MM:SS format), as a floating point value. For example, for a video which is 1 minute and 32.6 seconds long, you'd serve the extra header: "X-Content-Duration: 92.6".

When Firefox requests Ogg media, you should serve up the X-Content-Duration header with the duration of the media. This means Firefox doesn't need to make any extra HTTP requests to seek to the end of the file to calculate the duration so it can display the progress bar.

You can get the duration using oggz-info, which comes with oggz-tools. oggz-info gives output like this:

 $ oggz-info /g/media/bruce_vs_ironman.ogv
 Content-Duration: 00:01:00.046

 Skeleton: serialno 1976223438
         4 packets in 3 pages, 1.3 packets/page, 27.508% Ogg overhead
         Presentation-Time: 0.000
         Basetime: 0.000

 Theora: serialno 0170995062
         1790 packets in 1068 pages, 1.7 packets/page, 1.049% Ogg overhead
         Video-Framerate: 29.983 fps
         Video-Width: 640
         Video-Height: 360

 Vorbis: serialno 0708996688
         4531 packets in 167 pages, 27.1 packets/page, 1.408% Ogg overhead
         Audio-Samplerate: 44100 Hz
         Audio-Channels: 2

Note that you can't just serve up the Content-Duration line that oggz-info outputs. It's in HH:MM:SS format, so you need to convert it to seconds only, and serve it as X-Content-Duration.
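The conversion can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes the Content-Duration line always has the HH:MM:SS.mmm shape shown in the output above.

```python
def content_duration_to_seconds(value: str) -> float:
    """Convert an 'HH:MM:SS.mmm' string (as printed by oggz-info) to seconds."""
    hours, minutes, seconds = value.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def x_content_duration_header(oggz_info_output: str) -> str:
    """Find the Content-Duration line in oggz-info output and build the header.

    Hypothetical helper: assumes the output contains exactly the
    'Content-Duration: HH:MM:SS.mmm' line format shown in this post.
    """
    for line in oggz_info_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name == "Content-Duration":
            seconds = content_duration_to_seconds(value)
            return f"X-Content-Duration: {seconds:.3f}"
    raise ValueError("no Content-Duration line found")
```

For the bruce_vs_ironman.ogv output above, this would give a header value of 60.046 seconds.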

Be warned, it looks like oggz-info makes one read pass of the media in order to calculate the duration, so it would be wise to store the duration value, and not to calculate it for every HTTP request of every Ogg video.
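One way to avoid recalculating is to cache the duration per file and only rerun oggz-info when the file changes. The sketch below is an assumption about how you might do that, not something any particular server does: DurationCache and run_oggz_info are hypothetical names, the cache lives in memory only, and it uses the file's modification time to decide when a cached value is stale.

```python
import os
import subprocess
from typing import Callable, Dict, Tuple


def run_oggz_info(path: str) -> str:
    """Run oggz-info on a file and return its text output (needs oggz-tools)."""
    result = subprocess.run(
        ["oggz-info", path], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_duration_seconds(oggz_info_output: str) -> float:
    """Extract 'Content-Duration: HH:MM:SS.mmm' and convert it to seconds."""
    for line in oggz_info_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name == "Content-Duration":
            hours, minutes, seconds = value.strip().split(":")
            return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    raise ValueError("no Content-Duration line found")


class DurationCache:
    """In-memory cache of media durations, keyed by path and modification time."""

    def __init__(self, probe: Callable[[str], str] = run_oggz_info):
        self._probe = probe
        self._cache: Dict[str, Tuple[float, float]] = {}  # path -> (mtime, seconds)

    def duration(self, path: str) -> float:
        mtime = os.path.getmtime(path)
        cached = self._cache.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # file unchanged, reuse the stored value
        seconds = parse_duration_seconds(self._probe(path))
        self._cache[path] = (mtime, seconds)
        return seconds
```

The probe function is passed in so you can swap oggz-info for another tool, or for a stub when testing.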

Also be aware that oggz-info does not calculate the duration of videos that start at a non-zero time correctly. oggz-info reports the duration as the time of the last frame, not the time of the last frame, minus the time of the first frame. Edit - 6 Aug 2009: Looks like this was only true for old versions of oggz-info, current versions use the presentation time from the skeleton track to calculate the duration correctly.

2. Handle HTTP1.1 byte range requests correctly

In order to seek to and play back regions of the media which aren't yet downloaded, Firefox uses HTTP1.1 Byte Range requests to retrieve the media from the seek target position. Also, if you don't serve X-Content-Duration, we use byte-range requests to seek to the end of the media (provided you're serving Content-Length) to determine the duration of the media.

Your server should serve the "Accept-Ranges: bytes" HTTP header if it can accept byte-range requests. It must return "206 Partial Content" to all byte range requests, else Firefox can't be sure you actually support byte range requests. Remember you must return "206 Partial Content" for requests for "Range: bytes=0-" as well.
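If you're writing your own handler, the core of it is parsing the Range header and building the matching response headers. Here's a minimal sketch in Python, assuming a single range per request. The function names are hypothetical, and a real server also needs to decide what to do with multi-range or unsatisfiable requests, which this sketch just reports as None.

```python
from typing import Dict, Optional, Tuple


def parse_byte_range(header_value: str, size: int) -> Optional[Tuple[int, int]]:
    """Parse a Range header value such as 'bytes=0-' or 'bytes=500-599'.

    Returns an inclusive (start, end) pair, or None if the value is
    malformed, asks for several ranges, or can't be satisfied.
    """
    unit, sep, spec = header_value.strip().partition("=")
    if not sep or unit.strip().lower() != "bytes" or "," in spec:
        return None
    first, dash, last = spec.strip().partition("-")
    if not dash:
        return None
    try:
        if first == "":
            # Suffix form 'bytes=-N': the last N bytes of the file.
            length = int(last)
            if length <= 0:
                return None
            start, end = max(size - length, 0), size - 1
        else:
            start = int(first)
            # Open-ended form 'bytes=N-' runs to the end of the file.
            end = min(int(last), size - 1) if last else size - 1
    except ValueError:
        return None
    if start < 0 or start > end or start >= size:
        return None
    return start, end


def partial_content_headers(start: int, end: int, size: int) -> Dict[str, str]:
    """Headers to send alongside a '206 Partial Content' status line."""
    return {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(end - start + 1),
    }
```

Note that "bytes=0-" parses to the whole file, and per the advice above it should still be answered with 206 rather than 200.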

If you're curious, see bug 502894 comment 1 for more details of the HTTP requests Firefox can make and why.

3. Include regular key frames

When we seek, we have to seek to the keyframe before the seek target, and then download and decode from there until we reach the actual target time. The further your keyframes are apart, the longer this takes, so include regular keyframes. ffmpeg2theora's default of one keyframe every 64 frames (or about every 2 seconds) seems to work ok, but be aware that the more keyframes you have, the larger your video file will be, so your mileage may vary.
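To see where the "about every 2 seconds" figure comes from, you can work out the keyframe spacing from the frame rate. The snippet below is just that arithmetic, with made-up function names. The worst case assumes the seek target is the last frame before the next keyframe.

```python
def keyframe_interval_seconds(frames_between_keyframes: int, fps: float) -> float:
    """Time between consecutive keyframes, in seconds."""
    return frames_between_keyframes / fps


def worst_case_frames_to_decode(frames_between_keyframes: int) -> int:
    """Frames decoded before display when the target is just before a keyframe."""
    return frames_between_keyframes - 1


# The example video above is 29.983 fps; with a keyframe every 64 frames
# that is a little over 2 seconds between keyframes.
interval = keyframe_interval_seconds(64, 29.983)
print(f"{interval:.2f} seconds between keyframes")
```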

4. Serve the correct mime-type

For *.ogg and *.ogv files containing video (possibly with an audio track as well), serve the video/ogg mime type. For *.oga and *.ogg files which contain only audio, serve the audio/ogg mime type. For *.ogg files with unknown contents, you can serve application/ogg, and we'll treat it as a video file. Most servers don't yet serve the correct mime-type for *.ogv and *.oga files.
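How you set this depends on your server. As one illustration, if your files are served from Python code that relies on the standard mimetypes module, you could register the extensions as below. This is a sketch: the OGG_TYPES table and register_ogg_types are names invented here, and mapping *.ogg to application/ogg is the "unknown contents" fallback, so use video/ogg or audio/ogg instead if you know what your *.ogg files contain.

```python
import mimetypes

# Extension-to-type table following the rules described above.
OGG_TYPES = {
    ".ogv": "video/ogg",
    ".oga": "audio/ogg",
    ".ogg": "application/ogg",  # fallback for unknown contents
}


def register_ogg_types() -> None:
    """Teach the mimetypes module the Ogg types, overriding any defaults."""
    for extension, mime_type in OGG_TYPES.items():
        mimetypes.add_type(mime_type, extension)


register_ogg_types()
content_type, _encoding = mimetypes.guess_type("bruce_vs_ironman.ogv")
print(content_type)
```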

5. Consider using autobuffer

If you have the autobuffer attribute set to true for your video or audio element, Firefox will attempt to download the entire media when the page loads. Otherwise, Firefox only downloads enough of the media to display the first video frame, and to determine the duration. autobuffer is off by default, so for a YouTube style video hosting site, your users may appreciate you setting autobuffer="true" for some video elements.


Anonymous said...

Hi Chris, I wonder about a buffer option that's part way between none (default) and everything (autobuffer=true).

Perhaps something that caches the first 10 seconds or so, so I'm a) not wasting too much bandwidth on something I may not watch, but also b) it starts immediately with no latency waiting for the connection and initial buffer.

Just thinking about this because it's something that always annoys me with flash video.

Sheppy said...

Would it be all right with you if I adapt this post into an article on MDC?

Chris Pearce said...

eythian: Buffering part of the resource doesn't really give you much benefit, as on slow connections, where you don't get enough data per second to play without stopping to buffer anyway, you'll still run out of data once you start playing. Maybe you want the autoplay attribute.

Chris Pearce said...

Sheppy: Please do! Thanks!

Sheppy said...

I used this post as the basis for a new article on MDC entitled "Configuring servers for Ogg media", located here:

I made a few style changes and the like, and added a few small tidbits here and there, so feel free to tweak it further if I screwed anything up.

Anonymous said...

Sorry, I meant "it starts immediately when you click play." This isn't so much to give a benefit to slow connections, which are going to be annoying anyway, but on fast ones it removes the few-second delay after hitting play: waiting for the connection, waiting for the initial buffer, and then finally, play.