Thursday, 31 March 2011

HTML5 Video painting performance statistics in Firefox 5

I've landed video frame paint performance counters for HTML5 video onto mozilla-central. This should ship in Firefox 5, barring any disasters. This work was a combined effort by Chris Double and I. These are Mozilla specific fields which will only be available in Firefox.

The new statistics enable us to measure the performance of the video decoding and frame painting pipeline.

This adds the following fields to the HTMLVideoElement:
  • mozParsedFrames - A count of the number of video frames that have been demuxed/parsed from the media resource. If we were playing perfectly, we'd be able to paint this many frames.
  • mozDecodedFrames - A count of the number of deumxed/parsed video frames that have been decoded into Images. We skip decoding of parsed/demuxed frames if the decode is falling behind the playback position (this can happen if it takes a long time to decode a keyframe for example).
  • mozPresentedFrames - A count of the number of decoded frames that have been presented to the rendering pipeline for painting (set as the current Image on the video element's ImageContainer). We may not present decoded frames if the frame arrives for presentation late.
  • mozPaintedFrames - A count of the number of presented frames which were painted on screen. We may end up not painting presented frames if another frame is presented before the graphics pipeline has time to paint the previously presented frame, or if the video is off screen. 
  • mozFrameDelay - The time (as a floating point number in seconds) which the last painted video frame was rendered late by. This is the time duration between the decoder saying "paint frame X now", and the graphics pipeline physically getting frame X displayed on the screen. The value is accurate on desktop Firefox, but not on mobile. Improvements in the graphics pipeline, and the integration with the graphics pipeline, will show up as a decrease in this number.
Here's a demo of the video paint statistics in Firefox 5. You'll need a recent Firefox trunk nightly build for the demo to work.

Thursday, 17 February 2011

Firefox 4 video decoder architecture

To assist others coming up to speed on the architecture of the video decoder, I've put together a diagram of Firefox 4's video playback engine. We rewrote our video architecture for Firefox 4 in order to give us better control over the complete stack.

Click on the image for a larger diagram.

The key classes in our architecture are:
  • nsHTMLMediaElement - This manages the JavaScript/HTML accessible HTMLMediaElement interface, and implements the resource selection, load, and preload logic.
  • nsBuiltinDecoder - Manages a main thread accessible snapshot of the state of the underlying decoder. The decoders run on non-main threads, and we don't want to block the main thread to dig into the decoders when JS queries playback state, so we maintain a snapshot of the playback engine's state in this class. This inherits from nsMediaDecoder. You can also implement playback support for a new format by inheriting and implementing nsMediaDecoder. nsWaveDecoder is currently implemented this way, but we're in the process of reimplementing that as a sublcass of nsBuiltinDecoderReader.
  • nsBuiltinDecoderStateMachine - Manages the decode, state machine, audio-push threads, frame queueing, A/V sync, and buffering logic. This ensures that all the HTML5 events get dispatched at the appropriate time, and that behaviour is consistent and sane across different media types. Demuxing is handled abstractly by subclasses of nsBuiltinDecoderReader. This way all media types can share as much playback logic as possible, reducing our maintenance overhead.
  • nsOggReader/nsWebMReader - Demuxing and codec specific functionality is implemented by subclassing nsBuiltinDecoderReader. This reduces the amount of work required to implement and maintain support for new codecs. When a new codec is implemented as a nsBuiltinDecoderReader subclass, support for HTML events, buffering, and playback logic does not need to be reimplemented, since it already exists in nsBuiltinDecoderStateMachine. To add support for a new codec, it's easiest to implement support as a new nsBuiltinDecoderReader subclass.
  • nsAudioStream - Our cross platform audio API wrapper. It is based on libsydneyaudio, which operates on a push model rather than a (more commonly used) callback-based model, which has brought in a whole raft of headaches. Matthew Gregan is in the process of rewriting our audio layer to a more sane callback based model. We also provide a cross-process nsAudioStreamRemote, which proxies audio commands to an audio stream in another process. This is required on mobile.
  • ImageContainer - When it comes time to present a video frame, nsBuiltinDecoderStateMachine sets it as the "current image" of the video element's ImageContainer object. This then propagates through the Layers/2D scene rendering system, and it eventually gets rendered on the screen. The Layers compositing runs on the main thread, and ImageContainer provides a thread-safe wrapper. The images contained in the ImageContainer can be in OpenGL/D3D surfaces, so we can take advantage of hardware accelerated scaling, rendering, and YCbCr to RGB conversion.
  • nsVideoFrame - This resides in layout, and manages the dimensions/reflow of the video, as well as its poster image.
  • nsMediaStream - Our network code runs on the main thread, but the underlying libraries we use for media decoding (libvpx, libtheora, etc) assume synchronous reads. We can't afford to do blocking reads on the main thread, so we cache the media data downloaded into the nsMediaCache, and provide a thread-safe wrapper synchronous wrapper for reading in the nsMediaStream class. We use Necko for our networking, so we can take advantage of all the existing security and load-group functionality it implements.
The advantage of controlling the entire playback engine are many. We can easily control frame dropping, memory allocation, the threading model, what, when, and how we decode, and we can integrate more tightly with our network stack.

Wednesday, 12 January 2011

Google to remove H.264 support from Chrome

Google is removing H.264 support from Chrome in the coming months. This is good news, as it means we're closer to having a high quality patent unencumbered video format for use in HTML5 video that works across all browsers. Now we're just left with Safari, Apple's iDevices, and IE9 as natively supporting patent encumbered video formats. Microsoft has said that IE9 will play "VP8 video when the user has installed a VP8 codec on Windows", it will be interesting to see how that pans out.

Monday, 8 November 2010

How to stop a video or audio element downloading

Say you've started playback of an HTML5 audio or video element, and you decide you really want to cancel playback and downloading of the media resource. Stopping playback is easy, just call the pause() method. But the network connection won't be stopped until the media element gets garbage collected, and even if you release all references to the media element, it won't be destroyed until the browser decides to runs its garbage collector. How can you stop the download of the media resource in the meantime? Here's a quick hack to achieve this: just reset the element's src attribute to the empty string. This destroys the element's internal decoder, and stops the network download.

For example:

<audio id="audiocontrols>
  <source src="funky-music.ogg">
  <source src="funky-music.mp3">
</audio>
<!-- ... Some time later, we decide we should stop the audio element playing and downloading... -->
<script>
var audio = document.getElementById("audio");
audio.pause();
audio.src = ""; // Stops audio download.
audio.load(); // Initiate a new load, required in Firefox 3.x.
</script>


Be aware that this will destroy the media element's decoders, so the element won't be playable anymore, and it will be rendered as an "error cross" if it's in a document. Also in Firefox 3.x you need to call load() after changing the source, whereas in Firefox 4 the load is scheduled to run when you change the src attribute, and the extra load() call is not required.

Changes to HTML5 video/audio load() function in Firefox 4

I've updated the media load() implementation in Firefox 4 to match the current WHATWG media load algorithm specification. There are three main changes that web developers using media elements should be aware of.

Firstly, error reporting has slightly changed. When a media element fails to load from its <source> children, an "error" event is dispatched to every child element which failed to load. Previously in Firefox 3.x you'd receive only one "error" event dispatched to the media element once all of its child <source> elements had failed to load. Now you only receive "error" events in the child <source> elements, and not in the media element itself.

For example, suppose you have the following markup:

<video>
<source id="mp4_src"
        src="video.mp4"
        type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
</source>
<source id="3gp_src"
        src="video.3gp"
        type='video/3gpp; codecs="mp4v.20.8, samr"'>
</source>
<source id="ogg_src"
        src="video.ogv"
        type='video/ogg; codecs="theora, vorbis"'>
</source>
</video>

Firefox 4 does not support the playback of patent encumbered content, so you'll receive "error" events in the <source> elements with the MP4 and 3GP resources, before the Ogg resource is loaded. Note also that the <source> children are loaded in the order in which they appear in the markup, and if one <source> child successfully loads and is playable, the children after it won't be loaded.

To detect that all child <source> elements have failed to load, check the value of the networkState attribute of the media element; if its value is HTMLMediaElement.NETWORK_NO_SOURCE, you know all child <source> elements have failed to load.

If you add another child <source> element to a media element which is in networkState HTMLMediaElement.NETWORK_NO_SOURCE, it will attempt to load the resource specified by the newly added <source>.

Secondly, when the load begins has changed. When you set the src attribute of a media element, or add a <source> child element to a media element, a load will be scheduled to run asynchronously as soon as the current JavaScript context exits (basically the load starts the next time we return to the browser application's main event loop). So for example, suppose you have some timeout set such as:

<script>
setTimeout(
  function() {
    var v = document.createElement("video");
    v.src = "video.ogg";
    document.appendChild(v)
    // Do some other stuff...
  },
  1000);
</script>

When the function runs, the load for the video element won't begin until after the function returns, and control returns to the browser. This is important, because in Firefox 3.x, the load is started (and could even run to completion!) as soon as you set the media element's src attribute.

Lastly, the media element's events now no longer bubble. Previously they bubbled, and this was a bug in Firefox 3.x and in violation of the spec.

Sunday, 17 October 2010

Using pymake to build mozilla on Windows

A number of people, myself included, have had trouble getting pymake to build Mozilla on Windows. Roc discovered the trick to getting it to work: use relative paths in the mozconfig. For example, my mozconfig (which lives in $srcdir/mozconfig) for a debug build is:

. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/objdir
ac_add_options --enable-debug
ac_add_options --disable-optimize
ac_add_options --enable-tests

I then build with the commands:

cd $srcdir/objdir
export SHELL; time python -O ../build/pymake/make.py -f ../client.mk -j4

It takes about 26 minutes to build on my Core i7  M620. A normal build is about 50 minutes.

Saturday, 18 September 2010

How to get Tinderbox logs faster

David Baron pointed out at the "War on Orange" meeting today, you can get Tinderbox logs much faster if you avoid using the "showlog.cgi" program.

For example, if you have a build log URL:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1284748812.1284749884.15885.gz

You can remove "showlog.cgi?log=" from the URL, and you'll download the log directly, i.e.:

http://tinderbox.mozilla.org/Firefox/1284748812.1284749884.15885.gz

This will download much faster, and you can grep for the text you're interested in directly in the downloaded log.