Qt vs. Web (Part 1): Three Cautionary Stories About Web

Let XXX be a manufacturer of printers, TVs, set-top boxes (STBs) or home appliances. XXX must decide whether to use Qt or Web for the HMI of their devices. I’ll start my series “Qt vs. Web” with three cautionary stories about using Web technologies. Facebook moved from Web to native. Netflix built its own rendering engine and JavaScript HMI library to continue with Web technologies. Although LG’s smart TVs run on WebOS, the built-in and brand-critical TV UI uses QML and Qt. Third-party apps use standard Web technologies.

Setting the Stage

XXX’s GUI application, whether built with Qt or Web, will have to run on a wide range of devices. So, the chosen HMI framework must scale from very powerful CPUs with GPUs down to fairly weak CPUs without GPUs. I’ll use the stories of Facebook and Netflix to show how badly Web scales down and how expensive it was for Netflix to scale Web down by even one device class. The LG story shows how to complement a core Qt GUI with third-party Web apps, that is, how to use the advantages of both HMI technologies.

At the low end, our manufacturer XXX looks at devices powered by ARM9 CPUs (e.g., i.MX25). The Nintendo DSi and the Lego Mindstorms EV3 are typical devices. In the mid-range, typical CPUs are ARM11s (e.g., Raspberry Pi 1) or Cortex-A8s with GPUs (e.g., i.MX53, AM335x). ARM11s can be found in the iPhone 3G and the Nokia N8. Some premium Electrolux ovens, the iPhone 4, the Nest thermostat and some in-flight entertainment systems sport a Cortex-A8. All these CPUs have a single core.

At the high end, XXX looks at a multi-core Cortex-A9, which can be found in the iPhone 4S, in infotainment systems of middle-class cars, and in terminals of agricultural machines. At the time of this writing, I am not aware of any home appliance (not even the really expensive ones) sporting a multi-core Cortex-A9. The reason is that home appliances have very low profit margins. The same is true for printers, STBs and TVs.

More powerful CPUs like Cortex-A15s or the 64-bit Cortex-A57/A53s are currently “reserved” for infotainment systems of premium cars and for high-end smartphones like the Samsung GS4 and newer. Premium TVs will move in this direction in the near future (say, the next 3 years). Home appliances, printers and STBs will take a lot longer.

ARM9s don’t have any GPUs. Some ARM11s like the Raspberry Pi 1 have a GPU, some don’t. Cortex-A8s, Cortex-A9s and up have more and more powerful GPUs, all supporting OpenGL. Absolutely smooth and fluid animations at 60fps are only possible with OpenGL acceleration.
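The device classes above can be summarised in a small lookup table. This is only a sketch of my own: the names DeviceClass, DEVICE_CLASSES and gpu_always_present are made up for illustration, "premium" is my label for the Cortex-A15 and 64-bit class, and the multi-core flag for that class is an assumption based on the smartphones mentioned rather than something stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceClass:
    tier: str          # label used in this article ("premium" is my own label)
    cpus: tuple        # example CPU families named in the text
    multi_core: bool   # False = single core
    gpu: str           # "none", "some" or "yes"


# Hypothetical summary of the tiers described in the text.
DEVICE_CLASSES = (
    DeviceClass("low end", ("ARM9",), False, "none"),
    DeviceClass("mid-range", ("ARM11",), False, "some"),
    DeviceClass("mid-range", ("Cortex-A8",), False, "yes"),
    DeviceClass("high end", ("Cortex-A9",), True, "yes"),
    # Assumption: multi-core is not stated in the text for this class.
    DeviceClass("premium", ("Cortex-A15", "Cortex-A57"), True, "yes"),
)


def gpu_always_present(device: DeviceClass) -> bool:
    """True if every device of this class can rely on OpenGL acceleration."""
    return device.gpu == "yes"


if __name__ == "__main__":
    for device in DEVICE_CLASSES:
        cores = "multi-core" if device.multi_core else "single core"
        print(f"{device.tier}: {', '.join(device.cpus)} ({cores}, GPU: {device.gpu})")
```

Read this way, only the Cortex-A8 class and up can count on the GPU that 60fps animations depend on.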

I use the term “Web” as a shorthand for Web technologies like HTML5, CSS, JavaScript, Angular, React, Meteor and jQuery.

Facebook – Making a Multi-Million Dollar Mistake with Web

At the Disrupt conference in September 2012, Mark Zuckerberg, the CEO of Facebook Inc., conceded a multi-million-dollar mistake.

“The biggest mistake that we made as a company is betting too much on HTML5 as opposed to native […] We burned two years.”

What had happened? Facebook has always used Web technologies for its desktop application and has been one of the most vocal proponents of Web technologies. They have always had some of the best web developers in the world. It was only natural that Facebook used Web for their application on smartphones. The promise of HTML5 was that it runs on almost any platform (cross-platform) with few changes. Zuckerberg had to admit that the user experience of the Facebook Web application was not good enough for mobile users.

“The mobile [user] experience is so good that good enough is not good enough. We need to have something that is at the highest quality level. The only way we are going to get there is by going native.”

The user experience on mobile devices was so bad that Facebook scrapped about two years of development efforts with HTML5 and switched to native development. Many millions of dollars went down the drain, because Facebook didn’t understand that smartphones are much less powerful than desktop computers.

Facebook couldn’t get a good-enough user experience out of an iPhone 4S. The iPhone 4S sports a dual-core Cortex-A9 slightly underclocked at 800 MHz with 512 MB RAM and a PowerVR GPU (see Apple A5). Compare that with the fact that there is no home appliance in the field in 2017 with a CPU similar to a Cortex-A9. The very first TVs and STBs with such a CPU came out in 2012/2013. Smartphone CPUs are years ahead of CPUs in embedded devices.

Tobie Langel, Facebook’s representative on the W3C Advisory Committee, gave more details about the technical issues.

“The biggest issues we’ve been facing here are memory related […] Unfortunately, it’s difficult for us to understand exactly what’s causing these issues.”
“[Scrolling performance] is one of our most important issues. It’s typically a problem on the newsfeed and on Timeline which use infinite scrolling […]”
“Inconsistent framerates, UI thread lag (stuttering).”

Netflix – Going All in with JavaScript

Netflix made the same mistake as Facebook although Netflix’s HMI is much simpler than Facebook’s. The Netflix HMI is mainly a cover flow with six visible images, which the user can move to the right or left with keys on their remote control. However, Netflix didn’t correct this mistake by going native but by reimplementing their HMI with React and half a dozen other JavaScript frameworks and by writing their own rendering engine.

Around 2009, Netflix faced an enormous fragmentation problem. Its video streaming application had to run on TVs, set-top boxes (STBs), DVD/BD players, gaming consoles, smartphones, tablets, laptops and desktop PCs. The CPUs of these devices ranged from the low end (e.g., ARM9) to the high end (e.g., Intel Core i7). Powerful CPUs were very rare. Some devices had a GPU, many did not. The screen resolutions and formats varied widely.

Netflix chose a hybrid approach for its video streaming application. The HMI was written in standard HTML5, CSS and JavaScript and rendered with the QtWebkit library. In contrast to browsers, QtWebkit allows the application to access hardware capabilities directly. Browsers run in a sandbox and allow only very limited access to hardware capabilities. The user experience was mediocre at best. The scrolling of cover images was sluggish.

Netflix got away with this mediocre user experience until around 2014, when Amazon, Apple and others got serious in the on-demand video market. Netflix’s user experience was not competitive any more. Netflix replaced QtWebkit with their own custom-tailored rendering engine, Gibbon, written in JavaScript. They also rewrote their entire HTML5 HMI with a highly optimised version of ReactJS, a JavaScript library for building user interfaces. Netflix eliminated HTML5 and CSS as much as possible, which reduced the size of the DOM and the number of global transformations of the DOM. Netflix’s HMI consists mostly of JavaScript.

The video Performance without Compromise shows how difficult it was to achieve response times to user inputs of 100ms and animation fluidity of 30fps. Netflix doesn’t say on which SoCs (systems on chip) it achieved this. We can, however, safely assume that Netflix achieved this only on high-end devices with at least a Cortex-A8 inside. And make no mistake: 100ms response time and 30fps are still far away from an iPhone-like user experience (50ms and 60fps). Electrolux showed in a premium oven that it is possible to achieve an iPhone-like user experience on a Cortex-A8 (see this video).
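To make these numbers more tangible, here is a small back-of-the-envelope calculation. It is only a sketch: the function names and the TARGETS table are my own, and it assumes that every frame gets an equal share of the second.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce a single frame, in milliseconds."""
    return 1000.0 / fps


def frames_until_response(response_ms: float, fps: float) -> float:
    """Number of frames that pass before the HMI reacts to a user input."""
    return response_ms / frame_budget_ms(fps)


# The two targets mentioned in the text.
TARGETS = {
    "Netflix target": {"response_ms": 100.0, "fps": 30.0},
    "iPhone-like": {"response_ms": 50.0, "fps": 60.0},
}

if __name__ == "__main__":
    for name, target in TARGETS.items():
        budget = frame_budget_ms(target["fps"])
        frames = frames_until_response(target["response_ms"], target["fps"])
        print(f"{name}: {budget:.1f} ms per frame, reaction within {frames:.0f} frames")
```

Under these assumptions, both targets give the HMI about three frames to react to an input. The iPhone-like target halves the time available per frame from roughly 33.3ms to roughly 16.7ms, so layout, scripting and painting must all fit into half the time.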

Only very few web developers in the world can optimise the performance of a web application on an embedded device in the way Netflix did. Typical web developers build web applications on desktop computers that are hundreds of times more powerful than the average TV or STB. Many of these developers struggle to get 100ms response times and 30fps even on these computers.

This is without doubt a tremendous achievement by Netflix, but its approach does not scale down well to mid-range and low-end devices. Netflix’s approach also required a top-notch developer team and many person-years of effort to scale down one level: from Facebook’s Cortex-A9 to Netflix’s Cortex-A8.

WebOS – A Happy Ending with Qt

Palm developed WebOS as the operating system for its Palm Pre. WebOS was built with Web technologies in mind. WebOS uses Webkit, the rendering engine of Apple’s Safari web browser, as the runtime for Web applications. The main argument for integrating Web and the operating system (Linux, by the way) so tightly was the ready availability of 10 million potential application developers (according to Palm’s CEO Ed Colligan).

As we know from the Netflix story, only very few Web developers can create a good user experience on embedded devices. The average Web developer doesn’t even achieve this on desktop computers. It doesn’t come as a surprise then that “customers immediately recognized that the phone was too slow, [which] led to extremely high return rates” (see “In Flop of H.P. TouchPad, an Object Lesson for the Tech Sector”).

At the end of its first year (2009), the Palm Pre had 1,000 applications compared to the iPhone’s 100,000 applications (see “RIP Palm: it’s over and here’s why”). Before it died, it had a couple of thousand applications – still orders of magnitude fewer than iPhones and Android phones.

HP bought Palm and WebOS with the clear intention to use WebOS on “more devices, including PCs and printers”. Actually, HP wanted to use WebOS as the operating system for all its printers. This never happened, as HP gave up on WebOS quickly and sold it to LG.

LG had learned from Palm’s and HP’s failures. They use WebOS – now based on QtWebengine – on their smart TVs and other devices as stated in this success story. What this success story doesn’t tell you is that LG uses QML for their core TV HMI and standard Web technologies for third-party applications – as I learned from well-informed sources.

The core HMI is critical to LG’s brand image. So, it must offer the best user experience possible. Third-party applications get away with an acceptable but not great user experience. It is more important that these applications can be easily installed and updated, that they run on many TVs from different manufacturers, and that there are lots of developers. This is where Web is slightly better than Qt. LG succeeded in combining the advantages of Qt and Web on their smart TVs.

Conclusion

The Facebook story tells XXX that it would need at least a multi-core Cortex-A9 SoC to create a good user experience with standard Web technologies (HTML5, CSS, JavaScript). The Netflix story shows that a good user experience is possible on a single-core Cortex-A8 SoC, if XXX writes its own rendering engine with JavaScript and if it uses JavaScript almost exclusively for the HMI. XXX would have to spend many person-years of a top-notch development team on this.

The two stories show that Web doesn’t scale down well. It stops at a Cortex-A8. And reaching a good but not great user experience would cost XXX a fortune. As XXX is unlikely to find a development team as talented as Netflix’s, it would have to use at least a multi-core Cortex-A9 for its devices or something even more powerful. A 64-bit Cortex-A53 (as used in the DragonBoard 410c) would reduce the risk of coming up with a not-good-enough user experience considerably.

The more powerful the SoC is, the more expensive it is. An NXP i.MX6 with four Cortex-A9 cores costs more than 20 Euros per device at a volume of 1 million devices! But this will be the topic of the second part of the “Qt vs. Web” series, where we’ll look at the total cost of ownership of a solution using Qt Commercial or “free” Web technologies. Be prepared for a surprise.