Monday 11 January 2010

Indexing keyframes in Ogg videos for fast seeking

Seeking in Ogg videos in HTML5 <video> over the network can currently be very slow. This is because when we seek an Ogg/Theora video to a target time, we must perform a bisection search over the file in order to find the target Theora video frame. If this is an interframe (which just records what has changed since the preceding frame), we must then perform another bisection search to find the interframe's keyframe, and then decode forwards to the target frame in order to completely construct it.

A reasonable bisection search implementation may require half a dozen HTTP requests to complete, so if we need to do two bisection searches per seek (once for the target frame, and once for the target frame's keyframe), we actually need to do about a dozen HTTP requests per seek...
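To make the request count concrete, here is a minimal sketch of a bisection seek over a simulated remote file. It is not the Firefox implementation: the `SimulatedOggFile` class, the constant-bitrate assumption, the 50 MiB file size, the 1 MiB read window, and the keyframe time of 295 seconds are all hypothetical values chosen for illustration.

```python
class SimulatedOggFile:
    """Hypothetical stand-in for a remote file: every probe costs one request."""

    def __init__(self, length_bytes, duration_s):
        self.length = length_bytes
        self.duration = duration_s
        self.requests = 0

    def time_at(self, offset):
        # Assumption: constant bitrate, so time maps linearly onto bytes.
        # A real player would read an Ogg page near `offset` instead.
        self.requests += 1
        return self.duration * offset / self.length


def bisect_seek(f, target_s, window_bytes=1024 * 1024):
    """Halve [lo, hi) until it fits in one read window; return its start."""
    lo, hi = 0, f.length
    while hi - lo > window_bytes:
        mid = (lo + hi) // 2
        if f.time_at(mid) <= target_s:
            lo = mid
        else:
            hi = mid
    return lo


f = SimulatedOggFile(50 * 1024 * 1024, 600.0)
bisect_seek(f, 300.0)  # first search: the target frame
bisect_seek(f, 295.0)  # second search: its keyframe (hypothetical time)
print(f.requests)      # 12 under these assumptions
```

Each search takes six probes with these numbers, so the two searches together cost a dozen requests before any decoding starts.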

If we knew in advance where the keyframes were, we wouldn't need to do any bisection searches; we could make just one HTTP request for the last keyframe preceding the target frame. Clearly making fewer HTTP requests is faster.
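As a rough illustration of that idea, the sketch below looks up a seek target in an in-memory keyframe table and builds the single HTTP Range header needed. The table layout, the sample entries, and the helper names `keyframe_for` and `range_header` are assumptions made for this example; they do not reflect the on-disk format of the index.

```python
import bisect

# Hypothetical index entries: (keyframe_time_seconds, byte_offset), sorted by time.
KEYFRAME_INDEX = [
    (0.0, 0),
    (4.0, 310_000),
    (8.0, 655_000),
    (12.0, 1_020_000),
]


def keyframe_for(index, target_s):
    """Return the (time, offset) of the last keyframe at or before target_s."""
    times = [t for t, _ in index]
    i = bisect.bisect_right(times, target_s) - 1
    if i < 0:
        raise ValueError("target precedes the first indexed keyframe")
    return index[i]


def range_header(offset):
    """Build an open-ended HTTP Range header starting at the keyframe."""
    return {"Range": f"bytes={offset}-"}


time_s, offset = keyframe_for(KEYFRAME_INDEX, 9.5)
print(time_s, range_header(offset))
```

The lookup happens entirely in memory, so the only network cost of the seek is the one ranged request that starts at the keyframe.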

Enter the Skeleton 3.1 with Keyframe Index track. This extends the existing Skeleton 3.0 metadata track to provide an index for every Theora video and Vorbis audio track in an Ogg media file. This will enable players to make the optimal HTTP request when seeking in media files served over the internet, making seeking in online video as fast as possible.

It's nice if a video player's UI can display the playback duration of the Ogg media. Unfortunately the raw Ogg format does not store the duration, so it must be calculated, which requires additional HTTP requests and slows down video loading. The Skeleton 3.1 track also includes the playback duration of its containing Ogg file, to eliminate this overhead and speed up loading.

I developed the Skeleton 3.1 with Keyframe Index track in conjunction with the folks at See the Skeleton 3.1 with Keyframe Index wiki page for the work-in-progress specification. Any comments on the specification would be much appreciated; please send them to the Theora mailing list.

I have developed a prototype Ogg indexer, OggIndex, and recent ffmpeg2theora nightlies will also encode keyframe indexes if you specify the command line option --seek-index. OggIndex and experimental indexing ffmpeg2theora nightlies are available for download.

To see how keyframe indexing improves network seeking performance of HTML5 Ogg/Theora <video>, you can download a development version of Firefox which can take advantage of indexes here:

If you already have a Firefox instance running, you'll need to either close your running Firefox instance before starting the index-capable Firefox, or start the index-capable Firefox with the --no-remote command line parameter.
To compare the network performance of indexed versus non-indexed seeking, point the index-capable Firefox to the indexed seek demo page.

You should notice a clear speed difference when seeking to an unbuffered position in the indexed media.

The Skeleton 3.1 with Keyframe Index specification is still being developed, but we hope to lock it down soon. We are planning to ship support for keyframe index-assisted seeking in Firefox 3.7.


Anonymous said...

Thanks for working on this, the poor seeking that exists now is annoying for users accustomed to the youtube/flash experience.

Any timeline for when this might hit FF3.7 nightlies (which I am already using for the OOPP)?

Chris Pearce said...

Hopefully "within a month". :)

We're reworking the Ogg decoder in bug 531340, so once that's done, I can rework the index-seek support and land that too. We need to land the new decoder first.

John said...

This is great news, it's one of the things that was needed to compete with other video solutions.

The other big issue that remains is, in my opinion, playback performance. The moment you resize the video at all (and fullscreen is the most extreme case of this), or the moment you paint anything over it (such as the controls themselves), performance drops through the floor.

This isn't a video problem, it's a problem with all graphics - incidentally, the Direct2D builds work wonderfully. However software rendering could be made a few times faster (and should, in order to compete with that pesky plugin). Is there any ongoing work on that?

Chris Pearce said...

Playback performance is totally implementation dependent, whereas adding a keyframe index to Ogg files is a file format thing.

Part of the reason we're in the process of rewriting our Ogg decoder backend is to improve playback performance. We're working on it!

Also Bas' work on the Direct2D cairo backend will accelerate scaling (on Windows at least) dramatically.

Mike said...

This is fantastic work! I'm very excited to see it integrated into Firefox builds.

Will the new decoder you're working on integrating have support for slow motion playback by any chance?

Chris Pearce said...

The new Ogg decoder will make it easier to implement variable playback rate, but the variable playback rate feature probably won't make 3.7.

Duncan said...

This is great stuff, and really valuable. At the moment I'm having trouble with the releases referenced in the post. I've a half-hour ogv file and whether I make the indexes with ffmpeg2theora or OggIndex it has trouble seeking in the firefox build you referenced (both mac and pc). When I try the slider it always jumps all the way to the end.

Chris Pearce said...

@Duncan: You probably picked up a build while I was changing stuff. I've just finalized a new version of the index, so if you download a new OggIndex nightly build and a new Firefox build, you should be able to index and seek in your files. We've not updated ffmpeg2theora for the new index yet; it should be updated over the next few days.

anarchafairy said...

Out of curiosity, does the Ogg Skeleton also provide information about the video duration, or does Firefox still have to search out to the end of the file to work this out?

Chris Pearce said...

The current Skeleton track does not store the duration, but the new Skeleton with keyframe indexes will provide the duration.