Firefox OS: tracking reflows and event loop lags

In Firefox OS, we want apps to run at 60FPS. To get there, it's important to avoid blocking the event loop for too long. This article explains what the event loop is, what reflows are, and how to spot event loop lags and reflows.

Developer HUD: new tools in Firefox OS 1.4

In this article, I talk about Firefox OS apps, but the same applies to any web page, in any browser.

If you already have a good understanding of what the event loop is and what reflows are, just jump to the end of this article to see how to find jank in your Firefox OS app.


What is the event loop?

Any layout engine (Gecko, WebKit, …) is driven by an event loop.

This is how it works:

Think slow motion. At any time, in the main thread, Gecko holds a list of events to be processed. Events can be: "user clicked", "an XMLHttpRequest is done", "page has closed", "a setTimeout reached its delay", "painting the screen is needed", "mouse moved", "the user scrolled", …

Gecko will consume the oldest event and execute code. If there is a JS callback to execute, like one registered with addEventListener("click", onClick), Gecko will run the callback onClick **to completion**. When the JS code has finished executing, Gecko will consume the next event. Maybe it's time to update the page on the screen. Gecko will then draw.
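As a rough mental model (a toy sketch, not how Gecko is actually implemented), the loop can be pictured as a queue that is drained one event at a time. The names runEventLoop, name and handler are invented for this illustration:

```javascript
// Toy event loop: events are processed oldest first, and each handler
// runs to completion before the next event is looked at.
function runEventLoop(queue) {
  var processed = [];
  while (queue.length > 0) {
    var event = queue.shift();   // consume the oldest event
    event.handler(queue);        // run its code to completion
    processed.push(event.name);
  }
  return processed;
}

var order = runEventLoop([
  {
    name: "click",
    handler: function (queue) {
      // Work scheduled from a handler goes to the back of the queue:
      // it runs after everything that was already waiting.
      queue.push({ name: "timeout", handler: function () {} });
    }
  },
  { name: "paint", handler: function () {} }
]);

console.log(order.join(", "));
```

In this model the "paint" event cannot be handled until the "click" handler has returned, which is why a slow handler delays drawing.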

Operations are executed sequentially, one after the other. JavaScript functions don't run in parallel. Consider setTimeout(callback, 10); foobar(); If foobar takes 100ms, callback will only be executed after those 100ms, because foobar blocks the event loop: the timeout event can't be processed until foobar returns.
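This is easy to observe. The sketch below assumes a foobar that simply busy-waits for the requested time (the ms parameter is an addition for this illustration), and measures how late the timeout callback actually runs:

```javascript
// Busy-wait for roughly `ms` milliseconds. This blocks the event loop.
function foobar(ms) {
  var start = Date.now();
  while (Date.now() - start < ms) {
    // spin
  }
  return Date.now() - start;
}

var scheduledAt = Date.now();
var callbackDelay = null; // filled in once the callback has run

setTimeout(function callback() {
  callbackDelay = Date.now() - scheduledAt;
  // Asked for 10ms, but the delay is at least the time foobar took.
  console.log("callback ran after " + callbackDelay + "ms");
}, 10);

var blockedFor = foobar(100);
console.log("foobar blocked the event loop for " + blockedFor + "ms");
```

The callback cannot run while foobar is still executing, so the reported delay should be around 100ms or more, never the requested 10ms.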

If an operation takes more than about 16 milliseconds (1000ms / 60 ≈ 16.7ms), Gecko won't be able to draw at 60FPS. A frame will be skipped. Ideally, no operation should take more than 16ms. It's not always possible (the developer doesn't control everything happening in the event loop; some of it is Gecko's responsibility), but with a good understanding of the event loop, many slow operations can be avoided by the web developer.

JavaScript can be slow for many reasons: a big loop, DOM operations, or long synchronous calls like localStorage. Making sure these "run to completion" JS calls don't block the event loop for too long is important.
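For the "big loop" case, one common approach is to split the work into chunks and give control back to the event loop between chunks. The sketch below is only one way to do it; processChunk, processAll and the choice of half a frame as the time budget are assumptions made for this example:

```javascript
var FRAME_BUDGET_MS = 1000 / 60; // about 16.7ms per frame at 60FPS

// Process items starting at `index` until the list is done or the time
// budget is used up. Returns the index of the next unprocessed item.
function processChunk(items, index, work, budgetMs) {
  var start = Date.now();
  while (index < items.length && Date.now() - start < budgetMs) {
    work(items[index]);
    index++;
  }
  return index;
}

// Process the whole list without holding the event loop for long:
// after each chunk, yield with setTimeout so other events (input,
// painting, ...) get a chance to run.
function processAll(items, work, done) {
  var index = 0;
  function step() {
    // Arbitrary choice: use half a frame, leave the rest for the engine.
    index = processChunk(items, index, work, FRAME_BUDGET_MS / 2);
    if (index < items.length) {
      setTimeout(step, 0);
    } else if (done) {
      done();
    }
  }
  step();
}
```

The total work takes at least as long as before, but no single run to completion holds the event loop for more than a few milliseconds.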

Event loops in Firefox OS

Each Firefox OS app runs in its own process. Each process has its own event loop. The event loop runs in the main thread. There are other threads running (for network requests, compositing, media decoding, …), but the UI of the app is rendered in the main thread. So if the event loop is blocked, the UI of the app is frozen. This is barely noticeable on desktop browsers, but on Firefox OS (and on mobile devices in general) any slow operation in the event loop will make the app choppy.

Reflows

Computing the layout of a page

A reflow happens when the layout engine needs to calculate the position and/or the size of an element on the page. Reflows happen in the main thread, so they block the event loop and, again, can make Gecko skip frames.

There are various reasons why a reflow may be necessary: the page is resized, a CSS property or a JS function changes the size or position of a node, or a node is added to or removed from the page (DOM operations).

When CSS and JS change the layout, a reflow is not immediately triggered. The layout is flagged as "dirty" (invalid). But at some point, Gecko has to reflow: right before drawing the page, or when JS code requests the size/position of a node.

Let's look at this code:

1  div1.style.margin = "200px";
2  var height1 = div1.clientHeight;
3  div2.classList.add("foobar");
4  var height2 = div2.clientHeight;
5  doSomething(height1, height2);
    

Line 1: the layout is marked as invalid. Line 2: to return clientHeight, Gecko needs to compute the new size of div1, so a reflow is triggered. Line 3: the layout is marked as invalid again. Line 4: a reflow is triggered again. It's possible to batch the invalidations and get only one reflow by moving Line 3 right below Line 1.
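Here is the reordered version, wrapped in a function so it can be reused. The name measureBatched is made up for this example, and the function returns the two heights instead of calling doSomething:

```javascript
// Same work as the snippet above, with the writes grouped before the reads.
function measureBatched(div1, div2) {
  // All writes first: each one only marks the layout as invalid.
  div1.style.margin = "200px";
  div2.classList.add("foobar");

  // Then all reads: the first read triggers the single reflow, and the
  // second read finds the layout already up to date.
  var height1 = div1.clientHeight;
  var height2 = div2.clientHeight;

  return { height1: height1, height2: height2 };
}
```

The general rule is to group the writes (style and class changes) and then the reads (clientHeight and friends), instead of interleaving them.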

These reflows are called uninterruptible reflows (they are absolutely necessary, because JS code needs the actual geometry of a node). Other reflows don't have to happen right away: interruptible reflows can be delayed (usually to let the event loop run, so that the main thread isn't blocked and the page can be drawn early).

It's important to differentiate painting operations from reflows. Think about code like this:

<div class="button">hi</div>

CSS1:
.button:hover {
  border: 1px solid red;
  /* to compensate the new border: */
  margin: -1px;
}

CSS2:
.button {
  border: 1px solid transparent;
}
.button:hover {
  border-color: red;
}
  

CSS1 and CSS2 end up with the same result. But CSS1 requires the layout engine to compute the geometry of the button (and conclude that nothing needs to be moved), whereas CSS2 only requires painting a red rectangle.

Painting is always cheaper than reflowing + painting.

This is why it's recommended to use CSS transforms instead of regular CSS properties like top/left/margins to move elements. For example, transform: translate(20px,40px) will not require a reflow:

  1. the on-screen position and size of the transformed element don't affect other elements (rotating, scaling and translating are painting operations; there's no shift/collision/compensation/…)
  2. the layout properties exposed to the DOM (clientHeight, clientWidth, offsetWidth, offsetHeight) don't reflect the transformed position or size of the element, so no layout computation is needed on the main thread. (getBoundingClientRect() is an exception: it does take transforms into account.)
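From JS, the difference looks like this. Both helpers are hypothetical names for this illustration, and element stands for any DOM node:

```javascript
// Moves the element visually with a transform. Layout is not
// invalidated, so no reflow is needed.
function moveWithTransform(element, x, y) {
  element.style.transform = "translate(" + x + "px, " + y + "px)";
}

// For comparison: moves the element by changing layout properties
// (this only has an effect on positioned elements). The layout is
// marked as invalid, and a reflow will be needed before the next paint.
function moveWithTopLeft(element, x, y) {
  element.style.left = x + "px";
  element.style.top = y + "px";
}
```

When an element is moved often (dragging, animations), the first form is usually the better default.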

Detecting event loop lags and reflows in Firefox OS

On slow CPUs, making sure the event loop doesn't get blocked for too long and reflows don't happen too often is critical.

Jan Keromnes, Vivien Nicolas and I have been working on tools for Firefox OS to show when reflows happen in an app, and when the event loop gets blocked for too long.

To enable them, in recent versions of Firefox OS (1.4+):

Jank is "event loop lag". You can set the lag threshold. Ideally, you should not get any jank > 20ms. But in our experience, non-optimized apps often show lags > 100ms. It's better to start with 100ms and lower the threshold as you hunt the lags.

If you want to enable these tools for certified apps, you'll need to set this preference to false (in prefs.js): devtools.debugger.forbid-certified-apps (this won't be necessary in the near future).

Now, some squares should show up at the bottom right of your app. The purple one is a reflow counter. The blue one shows the time of the latest event loop lag (only if it reaches the threshold set in the developer menu).

To get more details, I encourage you to use the App Manager or simply run adb logcat | grep Widget (uninterruptible reflows include a JS stack). We are working on more tools to make hunting performance issues easier, but in the meantime the Firefox OS developer HUD will help you track lags and reflows.
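Since the event loop works the same way in any web page, lag can also be approximated from inside the page itself: schedule a timer and check how late it fires. The sketch below is not how the developer HUD is implemented; measureLag, watchEventLoopLag and the 50ms default interval are assumptions made for this example:

```javascript
// How late did a timer fire? Never negative.
function measureLag(expectedTime, actualTime) {
  return Math.max(0, actualTime - expectedTime);
}

// Calls onLag(lagInMs) each time the event loop was blocked for at
// least `thresholdMs`. Returns a function that stops the watcher.
function watchEventLoopLag(thresholdMs, onLag, intervalMs) {
  intervalMs = intervalMs || 50;
  var timer = null;
  var stopped = false;

  function schedule() {
    var expected = Date.now() + intervalMs;
    timer = setTimeout(function () {
      // If something blocked the event loop, this runs late.
      var lag = measureLag(expected, Date.now());
      if (lag >= thresholdMs) {
        onLag(lag);
      }
      if (!stopped) {
        schedule();
      }
    }, intervalMs);
  }

  schedule();

  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}
```

For example, var stop = watchEventLoopLag(100, function (lag) { console.log("lag: " + lag + "ms"); }); starts with the same 100ms threshold suggested above, and calling stop() ends the measurement. Timers are not perfectly precise, so small values are mostly noise.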

PS:

I said that JS functions don't run in parallel. That's not true for Web Workers, but workers don't interact with the DOM, layout or anything that could have an impact on the main thread. In some special scenarios, Gecko (and only Gecko) spins the event loop in the middle of a sync JS call (sync XHR and window.alert()), because Firefox Desktop has only one event loop.

I mentioned that drawing happens in the main thread. It's true, but compositing happens in a different thread, and soon, Firefox OS will also support scrolling in a different thread (async pan and zoom), which will make scrolling possible even if the event loop is blocked.

"Interruptible reflows" don't just get delayed, but also don't have to fully update the layout. When gecko paints, it tries to update the layout, but if it takes too long it will just give up and paint whatever we have so far.