Monday 11 January 2010

Indexing keyframes in Ogg videos for fast seeking

Seeking in Ogg videos in HTML5 <video> over the network can currently be very slow. This is because when we seek an Ogg/Theora video to a target time, we must perform a bisection search over the file in order to find the target Theora video frame. If this is an interframe (which records only what has changed since its preceding frame), we must then perform another bisection search to find the interframe's keyframe, and then decode forwards from that keyframe to the target frame in order to completely construct it.

A reasonable bisection search implementation may require half a dozen HTTP requests to complete, so if we need to do two bisection searches per seek (once for the target frame, and once for the target frame's keyframe), we actually need to do about a dozen HTTP requests per seek...
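To make the request count concrete, here is a minimal Python sketch of a single bisection search over byte offsets. It is an illustration only, not Firefox's implementation: `time_at` is a hypothetical stand-in for "make one HTTP range request at this offset and read the timestamp of the first page found there", and the file is simulated as constant bitrate so the example runs without a network.

```python
def bisect_seek(time_at, file_length, target_time, tolerance_bytes):
    """Narrow [lo, hi) until it is at most tolerance_bytes wide.

    time_at(offset) is a hypothetical callback standing in for one HTTP
    range request that returns the timestamp (in seconds) found at offset.
    Returns (offset, requests); offset is at or before the target.
    """
    lo, hi = 0, file_length
    requests = 0
    while hi - lo > tolerance_bytes:
        mid = (lo + hi) // 2
        requests += 1  # each probe would be one HTTP request
        if time_at(mid) <= target_time:
            lo = mid
        else:
            hi = mid
    return lo, requests


# Simulated file (all values are assumptions for the example):
# 8 MiB long, constant 64 KiB per second of media.
FILE_LENGTH = 8 * 1024 * 1024
BYTES_PER_SECOND = 64 * 1024


def simulated_time_at(offset):
    return offset / BYTES_PER_SECOND


offset, requests = bisect_seek(
    simulated_time_at, FILE_LENGTH, target_time=75.0,
    tolerance_bytes=128 * 1024)
print(offset, requests)
```

In this simulated case one search takes six probes, and a seek that lands on an interframe would need a second search of similar cost to locate its keyframe.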

If we knew in advance where the keyframes were, we wouldn't need to do any bisection searches; we could make just one HTTP request to the last keyframe preceding the target frame. Clearly making fewer HTTP requests is faster.
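As a rough sketch of that idea in Python, suppose the index has already been loaded into memory as a list of (time in milliseconds, byte offset) pairs sorted by time. That in-memory layout, the example values, and the function name are assumptions made for illustration; they are not the on-disk format of the index. Finding the keyframe then needs no network traffic at all, and the result can be used for a single HTTP range request.

```python
import bisect


def find_keyframe(index, target_time_ms):
    """Return the (time_ms, byte_offset) of the last keyframe at or
    before target_time_ms. index must be sorted by time_ms."""
    times = [time_ms for time_ms, _ in index]
    i = bisect.bisect_right(times, target_time_ms) - 1
    if i < 0:
        i = 0  # target precedes the first keyframe; start from it
    return index[i]


# Hypothetical index: a keyframe every two seconds.
example_index = [(0, 0), (2000, 150000), (4000, 310000), (6000, 480000)]

time_ms, byte_offset = find_keyframe(example_index, 4500)

# One request starting at the keyframe replaces both bisection searches.
range_header = {"Range": "bytes=%d-" % byte_offset}
print(time_ms, byte_offset, range_header)
```

The player would then decode forwards from the 4000 ms keyframe to reach the 4500 ms target.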

Enter the Skeleton 3.1 with Keyframe Index track. This extends the existing Skeleton 3.0 metadata track to provide a keyframe index for every Theora video and Vorbis audio track in an Ogg media file. This will enable players to make the optimal HTTP request when seeking in media files served over the internet, giving the fastest possible seeking when viewing online video.

It's nice if a video player's UI can display the playback duration of the Ogg media. Unfortunately the raw Ogg format does not store the duration, so it must be calculated, which requires additional HTTP requests and slows down video loading. The Skeleton 3.1 track also includes the playback duration of its containing Ogg file, to eliminate this overhead and speed up loading.

I developed the Skeleton 3.1 with Keyframe Index track in conjunction with the folks at Xiph.org. See the Xiph.org Skeleton 3.1 with Keyframe Index wiki page for the work-in-progress specification. Any comments on the specification would be much appreciated; please send them to the Theora mailing list.

I have developed a prototype Ogg indexer, OggIndex. Recent ffmpeg2theora nightlies will also encode keyframe indexes if you specify the command line option --seek-index. OggIndex and experimental indexing ffmpeg2theora nightlies are available for download.

To see how keyframe indexing improves network seeking performance of HTML5 Ogg/Theora <video>, you can download a development version of Firefox which can take advantage of indexes here:

http://pearce.org.nz/video/firefox-indexed-seek-linux.tar.bz2
http://pearce.org.nz/video/firefox-indexed-seek-macosx.dmg
http://pearce.org.nz/video/firefox-indexed-seek-win32.zip

If you already have a Firefox instance running, you'll need to either close it before starting the index-capable Firefox, or start the index-capable Firefox with the --no-remote command line parameter.

To compare the network performance of indexed versus non-indexed seeking, point the index-capable Firefox at the indexed seek demo page.

You should notice a clear speed difference when seeking to an unbuffered position in the indexed media.

The Skeleton 3.1 with Keyframe Index specification is still being developed, but we hope to lock it down soon. We are planning to ship support for keyframe index-assisted seeking in Firefox 3.7.