Thursday, 28 July 2011

Reducing the memory overhead of thread stacks

I recently landed bug 664341 into mozilla-central, which adds an API to specify the amount of virtual address space reserved for the thread stacks of nsIThreads. The new API looks like this:

  extern NS_COM_GLUE NS_METHOD
  NS_NewThread(nsIThread **result,
               nsIRunnable *initialEvent = nsnull,
               PRUint32 stackSize = nsIThreadManager::DEFAULT_STACK_SIZE);

The default stack size is the default for whatever platform you're running on, so behaviour is unchanged for existing uses of NS_NewThread.

This new API is important, as x86 Linux by default reserves 8MB of virtual address space per thread stack. Windows and OSX use 1MB and 64KB respectively. If you have a lot of threads, their stacks can hog the virtual address space, and malloc will fail; we had at least one media mochitest that could fail in this way.

If you have code that creates threads, you should consider using this API. It's an easy way to reduce perceived memory usage. 

I've also recently concluded a major refactoring of the media playback engine in Firefox. This reduces the number of threads required to play <audio> and <video> elements by roughly one third. We now only require two threads per playing media element (plus one extra thread for sound playback on Linux at least until bug 623444 lands and we can refactor to take advantage of that). Media elements which are paused now shut down their threads where possible, resulting in lower overall memory usage. If you have a page with 100 <audio> elements on it, you no longer have 300 threads lying around using up virtual address space!
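
To put the figures above in perspective, here is a back-of-envelope calculation of how much virtual address space thread stacks reserve. It is only a sketch: the per-platform sizes are the defaults quoted in this post (treated as binary megabytes), the 300-thread count is the 100 <audio> element example, and the function name is made up for illustration.

```javascript
const KiB = 1024;
const MiB = 1024 * KiB;

// Default per-thread stack reservations as quoted in this post.
const DEFAULT_STACK_BYTES = {
  linuxX86: 8 * MiB,
  windows: 1 * MiB,
  osx: 64 * KiB,
};

// Hypothetical helper: address space (in MiB) reserved by `threadCount`
// threads that each reserve `stackBytes` for their stack.
function reservedStackMiB(threadCount, stackBytes) {
  return (threadCount * stackBytes) / MiB;
}

// 100 media elements at three threads each, versus two threads each.
const before = reservedStackMiB(300, DEFAULT_STACK_BYTES.linuxX86);
const after = reservedStackMiB(200, DEFAULT_STACK_BYTES.linuxX86);
console.log(before + " MiB before, " + after + " MiB after");
```

On x86 Linux the 300-thread case reserves 2400 MiB, which is more than half of the 4 GiB a 32-bit process can address at all. This is reserved address space, not memory that is actually in use.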

Wednesday, 8 June 2011

Impressions of China 2011

I have just returned from traveling with my wife and her parents and sister for two weeks in China and Hong Kong.

It was an interesting experience. It's easy to see why so many people say that this century will be China's century.

My impressions of China are below. No doubt some people will disagree with them. Constructive comments welcome.
  • The Chinese plan long term. They're building infrastructure that they'll need in 20 years. The leadership doesn't need to worry that long term projects will make it look to an electorate as if they're not achieving results. They don't need to borrow money in order to pay for the overly generous election promises required to get them elected. This seems to me to be one of the primary strengths of the Chinese communist system, and one of the failings of democracy. I am not implying that either system is necessarily superior.

  • All housing is leasehold. When you buy a house, you buy a 30 year lease for a residential city title, and longer leases are available for rural and commercial building (IIRC). If the government wants the land to build a road or whatever, they take the land back, the road gets built, and you go somewhere else.
  • As a corollary of my previous point, the Chinese get things done. They don't go through rounds of resource consent spanning years when they want to build something. Some engineer draws a line on a map, and it happens.
  • Communism works in China. Gone are the bad old days of the revolution and the madness contained therein. The Chinese have embraced a form of capitalism and made it work with their system. Numerous Chinese ex-pats have told me that people who don't rock the boat can live pretty free and happy lives. The people who rock the boat may not...
  • The top echelons of Chinese Government are engineers, and it shows. They build physical things and encourage the manufacture of physical products. They don't wrangle over IP laws designed to plug the leaks in dying business models. They build stuff. Then they sell it, and everyone benefits.

  • Intellectual property is not respected. They plagiarise and copy blatantly. I suspect this is part of their recent rapid rise; they didn't have to invent or start from scratch the same way the west did when it developed, the Chinese just copied what the west had done. Once they've caught up across the board, it will be interesting to see how lax intellectual property law/enforcement affects their economy and how they do business in future.
  • Everything is cheap. Food is cheap (and may be subject to government price controls now or in future to ensure it remains cheap). This means the base cost of living is low, so wages can also be low, and manufactured (exported) goods are cheap. The price of good quality food in China was easily one fifth of what you pay in New Zealand.
  • Perhaps as a corollary of my previous point, the quality of workmanship in China is in general very low, and they don't seem to put very much emphasis on streamlining many of their processes (store checkouts, even in department stores, are slow; why make it hard for customers to give you their money?).
  • There are [fun] police everywhere (at least in the tourist traps and the popular areas I visited). Plenty of stern faced young men in uniform patrol the streets with often-used whistles to keep people off the grass, to keep bikes off designated areas, to keep people from sitting on walls, and in general to keep people in line. Though one of their main duties seems to be giving directions to people.
  • They have electric bicycles where the battery charges while you pedal. They're everywhere, and very quiet - so they can sneak up on you. Seems a great and environmentally friendly way to get around flat cities.

  • The Chinese can be "a bit rough around the edges". Spitting on the ground is common practice (and may be a consequence of the bad air, and the prevalence of smoking). If they were kiwi, I'd describe them as "unashamedly blokey".
  • The Chinese do things at scale. When we crossed from Hong Kong to Shenzhen by bus, we crossed a long bridge over Shenzhen bay. There were rafts supporting an oyster farm spanning Shenzhen bay there, which stretched as far as the eye could see in either direction (the air was hazy/polluted, which reduced visibility to about 3km or so, but still). Impressive.
  • The "Maorish Village" in the "Wonders of the World" theme park in Shenzhen was hilariously inaccurate. The "Maori" people were not Maori (probably of south-western Chinese descent), and they performed a ramshackle show which was a fusion of a dozen different Pacific cultures. They repeatedly shouted "Aloha" (which is Hawaiian; they should at least say "kia ora" for a Maori greeting), and danced around in Cook Island costumes, claiming it was Maori. As a Pākehā, I'm offended on behalf of my Maori countrymen.

  • You couldn't see the sun most of the time in the big cities due to the pollution. The air tastes vile.
  • Traffic in Shanghai borders on being civilized. Other parts of China, less so.
  • White people are treated well, but prone to being overcharged. If you're a struggling dancer, move to China! White people are in demand in this area.
  • Mandatory kit for all white people in China should be a t-shirt that says "No buy DVD. No buy T-Shirt. No buy Bag." Bonus points if it's written in Chinese.
  • There are plenty of white people in Shanghai. Elsewhere, less so.
  • Shanghai is cool. There's lots of sci-fi-esque buildings all over the place. Starfleet headquarters should be built there.

  • They never miss an opportunity to ruin a perfectly good event/tour/attraction by trying to sell you stuff. We went on a day-trip guided tour, and after lunch we were taken into a fish oil factory and subjected to a 30 minute PowerPoint presentation trying to scare us into buying their products. I could barely believe it was happening!
  • We took the MagLev train in Shanghai to the airport. It went 434km/h. It was seriously cool. We should totally get one of those.
  • My wife's family is involved with an English school in Chongqing, China. If you're interested in teaching English in China, let me know, they're hiring. They're looking to hire English-speaking white people, English doesn't need to be your native language. Yes, I know many people of other ethnicities with excellent English, but the locals feel it's more prestigious to learn English from white skinned people.

Thursday, 31 March 2011

HTML5 Video painting performance statistics in Firefox 5

I've landed video frame paint performance counters for HTML5 video onto mozilla-central. This should ship in Firefox 5, barring any disasters. This work was a combined effort by Chris Double and me. These are Mozilla specific fields which will only be available in Firefox.

The new statistics enable us to measure the performance of the video decoding and frame painting pipeline.

This adds the following fields to the HTMLVideoElement:
  • mozParsedFrames - A count of the number of video frames that have been demuxed/parsed from the media resource. If we were playing perfectly, we'd be able to paint this many frames.
  • mozDecodedFrames - A count of the number of demuxed/parsed video frames that have been decoded into Images. We skip decoding of parsed/demuxed frames if the decode is falling behind the playback position (this can happen if it takes a long time to decode a keyframe for example).
  • mozPresentedFrames - A count of the number of decoded frames that have been presented to the rendering pipeline for painting (set as the current Image on the video element's ImageContainer). We may not present decoded frames if the frame arrives for presentation late.
  • mozPaintedFrames - A count of the number of presented frames which were painted on screen. We may end up not painting presented frames if another frame is presented before the graphics pipeline has time to paint the previously presented frame, or if the video is off screen. 
  • mozFrameDelay - The time (as a floating point number in seconds) which the last painted video frame was rendered late by. This is the time duration between the decoder saying "paint frame X now", and the graphics pipeline physically getting frame X displayed on the screen. The value is accurate on desktop Firefox, but not on mobile. Improvements in the graphics pipeline, and the integration with the graphics pipeline, will show up as a decrease in this number.
Here's a demo of the video paint statistics in Firefox 5. You'll need a recent Firefox trunk nightly build for the demo to work.
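
As a rough sketch of how a page might use these counters, the function below turns them into "frames lost at each stage" figures. The function name and the shape of the returned object are made up for illustration; only the moz* field names come from the list above, and they will be undefined in any browser that doesn't ship them.

```javascript
// Summarise the Firefox-specific paint counters described above.
// `video` is anything exposing the moz* fields, normally a <video> element.
// summarizePaintStats is a hypothetical helper, not part of any browser API.
function summarizePaintStats(video) {
  const parsed = video.mozParsedFrames;
  const decoded = video.mozDecodedFrames;
  const presented = video.mozPresentedFrames;
  const painted = video.mozPaintedFrames;
  return {
    // Parsed frames we skipped decoding (decode fell behind playback).
    skippedDecode: parsed - decoded,
    // Decoded frames never handed to the rendering pipeline (arrived late).
    notPresented: decoded - presented,
    // Presented frames that never reached the screen.
    notPainted: presented - painted,
    // Fraction of parsed frames that were actually painted.
    paintedFraction: parsed > 0 ? painted / parsed : 0,
    // How late the last painted frame was, in seconds.
    lastFrameDelaySeconds: video.mozFrameDelay,
  };
}
```

You could call this from a timer, for example summarizePaintStats(document.querySelector("video")), and watch which stage is dropping frames.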

Thursday, 17 February 2011

Firefox 4 video decoder architecture

To assist others coming up to speed on the architecture of the video decoder, I've put together a diagram of Firefox 4's video playback engine. We rewrote our video architecture for Firefox 4 in order to give us better control over the complete stack.

The key classes in our architecture are:
  • nsHTMLMediaElement - This manages the JavaScript/HTML accessible HTMLMediaElement interface, and implements the resource selection, load, and preload logic.
  • nsBuiltinDecoder - Manages a main thread accessible snapshot of the state of the underlying decoder. The decoders run on non-main threads, and we don't want to block the main thread to dig into the decoders when JS queries playback state, so we maintain a snapshot of the playback engine's state in this class. This inherits from nsMediaDecoder. You can also implement playback support for a new format by inheriting and implementing nsMediaDecoder. nsWaveDecoder is currently implemented this way, but we're in the process of reimplementing that as a subclass of nsBuiltinDecoderReader.
  • nsBuiltinDecoderStateMachine - Manages the decode, state machine, audio-push threads, frame queueing, A/V sync, and buffering logic. This ensures that all the HTML5 events get dispatched at the appropriate time, and that behaviour is consistent and sane across different media types. Demuxing is handled abstractly by subclasses of nsBuiltinDecoderReader. This way all media types can share as much playback logic as possible, reducing our maintenance overhead.
  • nsOggReader/nsWebMReader - Demuxing and codec specific functionality is implemented by subclassing nsBuiltinDecoderReader. This reduces the amount of work required to implement and maintain support for new codecs. When a new codec is implemented as a nsBuiltinDecoderReader subclass, support for HTML events, buffering, and playback logic does not need to be reimplemented, since it already exists in nsBuiltinDecoderStateMachine. To add support for a new codec, it's easiest to implement support as a new nsBuiltinDecoderReader subclass.
  • nsAudioStream - Our cross platform audio API wrapper. It is based on libsydneyaudio, which operates on a push model rather than a (more commonly used) callback-based model, which has brought in a whole raft of headaches. Matthew Gregan is in the process of rewriting our audio layer to a more sane callback based model. We also provide a cross-process nsAudioStreamRemote, which proxies audio commands to an audio stream in another process. This is required on mobile.
  • ImageContainer - When it comes time to present a video frame, nsBuiltinDecoderStateMachine sets it as the "current image" of the video element's ImageContainer object. This then propagates through the Layers/2D scene rendering system, and it eventually gets rendered on the screen. The Layers compositing runs on the main thread, and ImageContainer provides a thread-safe wrapper. The images contained in the ImageContainer can be in OpenGL/D3D surfaces, so we can take advantage of hardware accelerated scaling, rendering, and YCbCr to RGB conversion.
  • nsVideoFrame - This resides in layout, and manages the dimensions/reflow of the video, as well as its poster image.
  • nsMediaStream - Our network code runs on the main thread, but the underlying libraries we use for media decoding (libvpx, libtheora, etc) assume synchronous reads. We can't afford to do blocking reads on the main thread, so we cache the media data downloaded into the nsMediaCache, and provide a thread-safe synchronous wrapper for reading in the nsMediaStream class. We use Necko for our networking, so we can take advantage of all the existing security and load-group functionality it implements.
The advantages of controlling the entire playback engine are many. We can easily control frame dropping, memory allocation, the threading model, what, when, and how we decode, and we can integrate more tightly with our network stack.
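
The split between nsBuiltinDecoderStateMachine and the nsBuiltinDecoderReader subclasses can be illustrated with a toy sketch. This is not the real code (which is C++ and multi-threaded); every class and method name below is invented. It only shows the shape of the idea: format specific work lives in a reader subclass, and the shared playback logic only ever talks to the reader interface.

```javascript
// Toy illustration only: none of these names exist in Firefox.
class ToyReader {
  // Subclasses return the next decoded frame, or null at end of stream.
  readFrame() {
    throw new Error("readFrame() must be implemented by a subclass");
  }
}

// Stand-in for a format specific reader; it just replays a list of frames.
class ToyArrayReader extends ToyReader {
  constructor(frames) {
    super();
    this.frames = frames.slice();
  }
  readFrame() {
    return this.frames.length > 0 ? this.frames.shift() : null;
  }
}

// Shared playback logic, written once against the ToyReader interface.
class ToyStateMachine {
  constructor(reader) {
    this.reader = reader;
    this.events = [];
  }
  // Drains the reader, recording the events a real engine would dispatch.
  // Returns the number of frames presented.
  play() {
    this.events.push("playing");
    let presented = 0;
    while (this.reader.readFrame() !== null) {
      presented++;
    }
    this.events.push("ended");
    return presented;
  }
}
```

Supporting another format in this toy model means writing one more ToyReader subclass; ToyStateMachine and its event logic stay untouched.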

Wednesday, 12 January 2011

Google to remove H.264 support from Chrome

Google is removing H.264 support from Chrome in the coming months. This is good news, as it means we're closer to having a high quality patent unencumbered video format for use in HTML5 video that works across all browsers. Now we're just left with Safari, Apple's iDevices, and IE9 as natively supporting patent encumbered video formats. Microsoft has said that IE9 will play "VP8 video when the user has installed a VP8 codec on Windows"; it will be interesting to see how that pans out.

Monday, 8 November 2010

How to stop a video or audio element downloading

Say you've started playback of an HTML5 audio or video element, and you decide you really want to cancel playback and downloading of the media resource. Stopping playback is easy: just call the pause() method. But the network connection won't be stopped until the media element gets garbage collected, and even if you release all references to the media element, it won't be destroyed until the browser decides to run its garbage collector. How can you stop the download of the media resource in the meantime? Here's a quick hack to achieve this: just reset the element's src attribute to the empty string. This destroys the element's internal decoder, and stops the network download.

For example:

<audio id="audio" controls>
  <source src="funky-music.ogg">
  <source src="funky-music.mp3">
</audio>
<!-- ... Some time later, we decide we should stop the audio element playing and downloading... -->
<script>
var audio = document.getElementById("audio");
audio.pause();
audio.src = ""; // Stops audio download.
audio.load(); // Initiate a new load, required in Firefox 3.x.
</script>


Be aware that this will destroy the media element's decoders, so the element won't be playable anymore, and it will be rendered as an "error cross" if it's in a document. Also in Firefox 3.x you need to call load() after changing the source, whereas in Firefox 4 the load is scheduled to run when you change the src attribute, and the extra load() call is not required.
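
If you do this in more than one place, the three calls can be wrapped in a small helper. This is just the hack above packaged as a function; the name stopMediaDownload is made up, and the unconditional load() call assumes, as described above, that it is needed in Firefox 3.x and merely redundant in Firefox 4.

```javascript
// Hypothetical helper wrapping the steps described above.
// `media` is an <audio> or <video> element.
function stopMediaDownload(media) {
  media.pause();  // Stop playback.
  media.src = ""; // Destroys the decoder and stops the network download.
  media.load();   // Needed in Firefox 3.x; not required in Firefox 4.
}
```

Usage would be stopMediaDownload(document.getElementById("audio")). The same caveat applies: the element is not playable afterwards.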

Changes to HTML5 video/audio load() function in Firefox 4

I've updated the media load() implementation in Firefox 4 to match the current WHATWG media load algorithm specification. There are three main changes that web developers using media elements should be aware of.

Firstly, error reporting has slightly changed. When a media element fails to load from its <source> children, an "error" event is dispatched to every child element which failed to load. Previously in Firefox 3.x you'd receive only one "error" event dispatched to the media element once all of its child <source> elements had failed to load. Now you only receive "error" events in the child <source> elements, and not in the media element itself.

For example, suppose you have the following markup:

<video>
<source id="mp4_src"
        src="video.mp4"
        type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
<source id="3gp_src"
        src="video.3gp"
        type='video/3gpp; codecs="mp4v.20.8, samr"'>
<source id="ogg_src"
        src="video.ogv"
        type='video/ogg; codecs="theora, vorbis"'>
</video>

Firefox 4 does not support the playback of patent encumbered content, so you'll receive "error" events in the <source> elements with the MP4 and 3GP resources, before the Ogg resource is loaded. Note also that the <source> children are loaded in the order in which they appear in the markup, and if one <source> child successfully loads and is playable, the children after it won't be loaded.

To detect that all child <source> elements have failed to load, check the value of the networkState attribute of the media element; if its value is HTMLMediaElement.NETWORK_NO_SOURCE, you know all child <source> elements have failed to load.

If you add another child <source> element to a media element which is in networkState HTMLMediaElement.NETWORK_NO_SOURCE, it will attempt to load the resource specified by the newly added <source>.
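
Here is a minimal sketch of both halves of that: listening for "error" on each <source> child, and checking networkState on the media element. The function names are made up for illustration. I haven't verified whether networkState has already changed by the time the last "error" event fires, so the sketch keeps the two checks separate rather than assuming an ordering.

```javascript
// True if every <source> child of `media` has failed to load.
// The NETWORK_NO_SOURCE constant is read from the element itself.
function allSourcesFailed(media) {
  return media.networkState === media.NETWORK_NO_SOURCE;
}

// Hypothetical helper: calls onSourceError(source) whenever a <source>
// child of `media` fails to load. A listener is attached to each child
// individually, since the event is dispatched to the child.
// Returns the number of <source> children being watched.
function watchSourceErrors(media, onSourceError) {
  const sources = media.querySelectorAll("source");
  for (let i = 0; i < sources.length; i++) {
    const source = sources[i];
    source.addEventListener("error", function () {
      onSourceError(source);
    });
  }
  return sources.length;
}
```

A page could log each failing <source> from the callback, then call allSourcesFailed(video) a little later (from a timer, say) to decide whether to show fallback content.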

Secondly, the point at which the load begins has changed. When you set the src attribute of a media element, or add a <source> child element to a media element, a load will be scheduled to run asynchronously as soon as the current JavaScript context exits (basically the load starts the next time we return to the browser application's main event loop). So for example, suppose you have some timeout set such as:

<script>
setTimeout(
  function() {
    var v = document.createElement("video");
    v.src = "video.ogg";
    document.body.appendChild(v);
    // Do some other stuff...
  },
  1000);
</script>

When the function runs, the load for the video element won't begin until after the function returns, and control returns to the browser. This is important, because in Firefox 3.x, the load is started (and could even run to completion!) as soon as you set the media element's src attribute.
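
If your code has to run on both Firefox 3.x and Firefox 4, one defensive pattern is to not depend on when the load starts at all: attach your event listeners first, and set src last. The sketch below shows that ordering. The helper name and its parameters are made up for illustration, and the document is passed in as an argument only so the function doesn't rely on a global.

```javascript
// Hypothetical helper: creates a <video>, attaches listeners, then sets
// src, so the ordering is safe whether the load starts immediately
// (Firefox 3.x) or after the current script returns (Firefox 4).
// `handlers` maps event names to listener functions.
function createVideo(doc, url, handlers) {
  const v = doc.createElement("video");
  const names = Object.keys(handlers);
  for (let i = 0; i < names.length; i++) {
    v.addEventListener(names[i], handlers[names[i]]);
  }
  v.src = url; // Set last: the load may begin as soon as this runs.
  doc.body.appendChild(v);
  return v;
}
```

For example: createVideo(document, "video.ogg", { loadstart: onLoadStart, error: onError }), where onLoadStart and onError are your own functions.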

Lastly, the media element's events no longer bubble. Previously they bubbled, which was a bug in Firefox 3.x and a violation of the spec.