
a goal is a dream with a deadline

Updated: 2016-08-26T16:50:29+00:00


Stop Cross-Site Timing Attacks with SameSite cookies


Let's say we have a client that can initiate a network request for any URL on the web, but the response is opaque and cannot be inspected. What could we learn about the client or the response? As it turns out, armed with a bit of patience and rudimentary statistics, "a lot". For example, the duration of the fetch is a combination of the network time for the request to reach the server, the server processing time, and the network time of the response. Each and every one of these steps "leaks" information about both the client and the server. For example, if the total duration is very small (say, <10ms), then we can reasonably intuit that we might be talking to a local cache, which means that the client has previously fetched this resource. Alternatively, if the duration is slightly higher (say, <50ms), then we can reasonably guess that the client is on a low-latency network (e.g. fast 4G or WiFi). We can also append random data to the URL to make it unique and rule out the various HTTP caches along the way.

From there, we can try making more requests to the server and observe how the fetch duration changes to infer changes in server processing times and/or larger responses being sent to the client. If we're really crafty, we can also use properties of the network transport, like CWND-induced roundtrips in TCP (see TCP Slow Start) and other quirks of local network configuration, as additional signals to infer properties (e.g. size) of the response — see the TIME and HEIST attacks. If the response is compressed and also happens to reflect submitted data, then there is also the possibility of using a compression oracle attack (see BREACH) to extract data from the response. In theory, the client could try to stymie such attacks by making all operations take constant time, but realistically that's neither a practical nor an acceptable solution due to the user experience and performance implications of such a strategy. Injecting random delays doesn't fare much better, as it carries similar implications.

"Networking thermodynamics"

Each and every step in the fetch process — from the client generating the request and putting it on the wire, the network hops to the server, the server processing time, response properties, and the network hops back to the client — "leaks" information about the properties of the client, network, server, and the response. This is not a bug; it's a fact of life. Borrowing an explanation from our physicist friends: putting a system to work amounts to extracting energy from it, which we can then measure and interrogate to learn facts about said system. Eyes glazing over yet? The practical implication is that if the necessary server precautions are missing, the use of the above techniques can reveal private information about you and your relationship to that server - e.g. login status, group affiliation, and more. This requires a bit more explanation…

The dangers of credentialed cross-origin "no-cors" requests

The fact that we can use side-channel information, such as the duration of a fetch, to extract information about the response is not, by itself, all that useful. After all, if I give you a URL you can just use your own HTTP client to fetch it and inspect the bytes on the wire. However, what does make it dangerous is if you can co-opt my client (my browser) to make an authenticated request on my behalf and inspect the (opaque) response that contains my private content. Then, even if you can't access the response directly, you can observe any of the aforementioned properties of the fetch and extract private information about my client and the response.

Let's make it concrete… I like to visit a site on which I have an account to pin my favorite images. The authentication mechanism is a login form with all the necessary precautions (CSRF tokens, etc). Once authenticated, the server sets an HTTP cookie scoped to that origin with a private token that is used to authenticate me on future visits. Someone else entices me to visit their site to view more pictures of kittens... While I'm indulging in k[...]
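The defense named in the title is to mark the session cookie as SameSite: the browser then simply omits it from cross-site requests, so a third-party page can no longer trigger authenticated fetches whose timing leaks private state. Below is a minimal sketch of issuing such a cookie, assuming a Node.js/Express-style server; the framework, route, and token placeholder are illustrative, not taken from the article:

// Hypothetical login handler; only the Set-Cookie attributes matter here.
app.post('/login', function (req, res) {
  // ...verify credentials, mint a session token...
  res.setHeader('Set-Cookie',
    // SameSite=Strict: the cookie is never attached to cross-site requests,
    // so another origin cannot co-opt the browser into making an
    // authenticated request and timing the (opaque) response.
    'session=<private-token>; Path=/; Secure; HttpOnly; SameSite=Strict');
  res.end('ok');
});

SameSite=Lax is the gentler variant: the cookie is still sent on top-level navigations, but not on cross-site subresource requests.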

Building Fast & Resilient Web Applications


You've applied all the best practices, set up audits and tests to detect performance regressions, released the new application to the world, and... lo and behold, the telemetry is showing that despite your best efforts, there are still many users—including those on "fast devices" and 4G networks—that are falling off the fast path: janky animations and scrolling, slow loading pages and API calls, and so on. Frustrating. There must be something wrong with the device, the network, or the browser—right? Maybe there is. There is an infinite supply of reasons why the application can fall off the fast path: overloaded networks and servers, transient network routing issues, device throttling due to energy or heat constraints, competition for resources with other processes on the user's device, and the list goes on and on. It is impossible to anticipate all the edge cases that can knock our applications off the fast path, but one thing we know for certain: they will happen. The question is, how are you going to deal with it?

Carving out the fast path is not enough. We need to make our applications resilient. Resilient applications provide guardrails that protect our users from the inevitable performance failures. They anticipate these problems ahead of time, have mechanisms in place to detect them, know how to adapt to them at runtime, and as a result are able to deliver a reliable user experience despite these complications. I won't rehash every point in the video, but let's highlight the key themes:

- (9m3s) Seemingly small amounts of performance variability in critical components quickly add up to create less than ideal conditions. We must design our systems to detect and deal with such cases - e.g. set explicit SLA's on all requests and specify upfront how the violations will be handled.
- (16m28s) The "performance inequality" gap is growing. There are two market forces at play: there is a race for features and performance, and there is high demand for lower prices. These are not entirely at odds (cheap devices are also getting faster), but the flagships are racing ahead at a much faster pace.
- (19m45s) "Fast" devices show spectacular peak performance in benchmarks, but real-world performance is more complicated: we often have to trade off raw performance against energy costs and thermal constraints, compete for shared resources with other applications, and so on.
- (23m35s) Mobile networks provide an infinite supply of performance entropy, regardless of the continent, country, and provider - e.g. the chances of a device connecting to a 4G network in some of the largest European countries are effectively a coin flip; just because you "have a signal" doesn't mean the connection will succeed; see "Resilient Networking".

If we ignore the above and only optimize for the fast path, we shouldn't be surprised when the application goes off the rails and our users complain about unreliable performance. On the other hand, if we accept the above as "normal" operational constraints of a complex system, we can engineer our applications to anticipate these challenges, detect them, and adapt to them at runtime (31m39s):

- Treat offline as the norm. All requests must have a fallback.
- Use available API's to detect device & network capabilities.
- Adapt application logic to match the device & network capabilities.
- Observe real-world performance (runtime, network) at runtime, goto(4). [...]
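To make the "explicit SLA on every request" guardrail concrete, here is a minimal sketch in the spirit of the talk; the 2000 ms budget, the function names, and the cache fallback are my own assumptions, not values from the video:

// Wrap fetch() with an explicit SLA and a declared fallback, so a slow or
// failed request degrades gracefully instead of hanging the experience.
function fetchWithSLA(url, slaMs, fallback) {
  var slaTimer = new Promise(function (resolve) {
    setTimeout(function () { resolve('sla-violation'); }, slaMs);
  });

  return Promise.race([fetch(url), slaTimer]).then(function (result) {
    if (result === 'sla-violation') {
      // Deal with the violation instead of ignoring it: report it and
      // serve the fallback (cached copy, placeholder, degraded experience).
      return fallback();
    }
    return result;
  }, fallback); // an outright network failure also takes the fallback path
}

// Usage: an interactive request gets a tight budget and a cached fallback
// (assumes a Cache Storage entry was populated earlier).
fetchWithSLA('/api/feed.json', 2000, function () {
  return caches.match('/api/feed.json');
});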

Control Groups (cgroups) for the Web?


You've optimized every aspect of your page—it's fast, and you can prove it. However, for better or worse, you also need to include a resource that you do not control (e.g. owned by a different subteam or a third party), and by doing so you lose most, if not all, guarantees about the runtime performance of your page - e.g. an included script resource can execute any code it wants, at any point in your carefully optimized rendering loop, and for any length of time; it can fetch and inject other resources; all of its scheduling and execution is on par with your carefully crafted code. We're missing primitives that enable control over how and where CPU, GPU, and network resources are allocated by the browser. To the browser, all scripts look the same. To the developer, some are more important than others. Today, the web platform lacks the tools to bridge this gap, and that's at least one reason why delivering reliable performance is often an elusive goal for many.

We can learn from those before us...

Conceptually, the above problem is nothing new. For example, Linux control groups (cgroups) address the very same issues "higher up" in the stack: multiple processes compete for a finite number of available resources on the device, and cgroups provide a mechanism by which resource allocation (CPU, GPU, memory, network, etc.) can be specified and enforced at a per-process level - e.g. this process is allowed to use at most 10% of the CPU and 128MB of RAM, is rate-limited to 500Kbps of peak bandwidth, and is only allowed to download 10Mb in total. The problem is that we, as site developers, have no way to communicate and specify similar policies for resources that run on our sites. Today, including a script or an iframe gives it the keys to the kingdom: these resources execute with the same priority and with unrestricted access to the CPU, GPU, memory, and the network. As a result, the best we can do is cross our fingers and hope for the best.

Arguably, Content-Security-Policy offers a functional subset of the larger "cgroups for the web" problem: it allows the developer to control which origins the browser is allowed to access, and the new embedded enforcement proposal extends this to subresources! However, this only controls the initial fetch; it does not address the resource footprint (CPU, GPU, memory, network, etc.) once the resource is executed by the browser.

Would cgroups for the web help?

As a thought experiment, it may be worth considering what a cgroups-like policy could look like in the browser, and what we would want to control. What follows is a handwavy sketch, based on the frequent performance failure cases found in the wild, and conversations with teams that have found themselves in these types of predicaments: ...
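To make the thought experiment slightly more tangible, here is one purely hypothetical way such a policy could be expressed for an embedded third-party resource. None of these knobs exist in any browser; the numbers simply mirror the cgroups example above:

// Hypothetical only: a cgroups-like resource policy for a third-party embed,
// sketched as a plain object. No browser implements anything like this today;
// it just names the kinds of knobs the post argues are missing.
var embedPolicy = {
  cpu:     { maxShare: 0.10, maxTaskMs: 50 },          // cap CPU share and task length
  memory:  { maxBytes: 128 * 1024 * 1024 },            // cap memory footprint
  network: { maxKbps: 500, maxTotalBytes: 10 * 1e6 },  // rate limit plus total budget
  scheduling: 'background'                             // never preempt first-party work
};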

The "Average Page" is a myth


As anyone and everyone in the web performance community will tell you, the size of the average page is continuously getting bigger: more JavaScript, more image and video bytes, growing use of web fonts, and so on. In fact, as of December 2015, the HTTP Archive shows that the average desktop site weighs in at 2227KB, and mobile is up to 1253KB. Except, what is an "average page", exactly? Intuitively, it is a page that is representative of the web at large, in its payload size, distribution of bytes between different content types, etc. More technically, it is a measure of central tendency of the underlying distribution - e.g. for a normal distribution the average is the central peak, with 50% of values greater and 50% of values smaller than its value. Which, of course, raises the question: what is the shape and type of the distribution for transferred bytes, and does it match this model? Let's plot the histogram and the CDF plots...

Desktop: the x-axis shows that we have outliers weighing in at 30MB+. The quantile values are 25th: 699KB, 50th (median): 1445KB, 75th: 2697KB. The CDF plot shows that 90%+ of the pages are under 5000KB.

Mobile: the x-axis shows that we have outliers weighing in at 10MB+. The quantile values are 25th: 403KB, 50th (median): 888KB, 75th: 1668KB. The CDF plot shows that 90%+ of the pages are under 3000KB.

Let's start with the obvious: the transfer size is not normally distributed, there is no meaningful "central value", and talking about the mean is meaningless, if not deceiving - see "Bill Gates walks into a bar...". We need a much richer and more nuanced language and statistics to capture what's going on here, and an even richer set of tools and methods to analyze how these values change over time. The "average page" is a myth. I've been as guilty as anyone in (ab)using averages when talking about this data: they're easy to get and simple to communicate. Except, they're also meaningless in this context. My 2016 resolution is to kick this habit. Join me.

Page weight as of December 2015

Coming up with a small set of descriptive statistics for a dataset is hard, and attempting to reduce a dataset as rich as HTTP Archive down to a single one is an act of folly. Instead, we need to visualize the data and start asking questions. For example, why are some pages so heavy? A cursory look shows that the heaviest ~3% by page weight, both for desktop (>7374KB) and mobile (>4048KB), are often due to a large number of (and/or heavy) images. Emphasis on often, because a deeper look at the most popular content types shows outliers in each and every category. For example, plotting the CDFs for desktop pages shows that we have pages that fetch tens of megabytes of HTML, images, video, and fonts, as well as high single-digit megabytes of JavaScript and CSS. Each of these "obese" outliers is worth digging into, but we'll leave that for a separate investigation.

Let's compare this data to the mobile dataset. Lots of outliers as well, but the tails for mobile pages are not nearly as long. This alone explains much of the dramatic "average page" difference (desktop: 2227KB, mobile: 1253KB) — averages are easily skewed by a few large numbers. Focusing on the average leads us to believe that mobile pages are significantly "lighter", whereas in reality all we can say so far is that the desktop distribution has a longer tail with much heavier pages. To get a better sense for the difference in distributions between the desktop and mobile pages, let's exclude the heaviest 3% that compress all of our graphs and zoom in on the [0, 97%] interval: mobile pages do appear to consume fewer bytes. For example, a 1000KB budget would allow the client to fully fetch ~38% of desktop pages vs. ~54% of mobile pages. However, while the savings for mobile pages are present for all content types, the absolute differences for most of them are not drastic. Most of the total byte difference is explained by fewer image bytes. Structurally, mobile pages are not dr[...]
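As a toy illustration of why the mean misleads for long-tailed data like page weights (the numbers below are invented, not HTTP Archive values):

// A handful of fictional page weights (KB) with a long tail: two heavy
// outliers drag the mean far above the median.
var pageWeightsKB = [300, 450, 600, 800, 900, 1100, 1400, 2000, 9000, 30000];

function mean(xs) {
  return xs.reduce(function (sum, x) { return sum + x; }, 0) / xs.length;
}

function quantile(xs, q) {
  var sorted = xs.slice().sort(function (a, b) { return a - b; });
  var idx = Math.min(sorted.length - 1, Math.floor(q * sorted.length));
  return sorted[idx];
}

console.log(mean(pageWeightsKB));           // 4655 KB -- the "average page"
console.log(quantile(pageWeightsKB, 0.5));  // 1100 KB -- the median page
console.log(quantile(pageWeightsKB, 0.75)); // 2000 KB -- 75th percentile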

Don't lose user and app state, use Page Visibility


Great applications do not lose a user's progress and app state. They automatically save the necessary data without interrupting the user and transparently restore themselves as and when necessary - e.g. after coming back from a background state or an unexpected shutdown. Unfortunately, many web applications get this wrong because they fail to account for the mobile lifecycle: they listen for the wrong events, which may never fire, or they ignore the problem entirely at the high cost of a poor user experience. To be fair, the web platform also doesn't make this easy by exposing (too) many different events: visibilitychange, pageshow, pagehide, beforeunload, unload. Which should we use, and when?

You cannot rely on pagehide, beforeunload, and unload events to fire on mobile platforms. This is not a bug in your favorite browser; this is due to how all mobile operating systems work. An active application can transition into a "background state" via several routes:

- The user can click on a notification and switch to a different app.
- The user can invoke the task switcher and move to a different app.
- The user can hit the "home" button and go to the homescreen.
- The OS can switch the app on the user's behalf - e.g. due to an incoming call.

Once the application has transitioned to the background state, it may be killed without any further ceremony - e.g. the OS may terminate the process to reclaim resources, or the user can swipe away the app in the task manager. As a result, you should assume that "clean shutdowns" that fire the pagehide, beforeunload, and unload events are the exception, not the rule.

To provide a reliable and consistent user experience, both on desktop and mobile, the application must use the Page Visibility API and execute its session save and restore logic whenever the visibilitychange event fires. This is the only event your application can count on.

// query current page visibility state: prerender, visible, hidden
var pageVisibility = document.visibilityState;

// subscribe to visibility change events
document.addEventListener('visibilitychange', function() {
  // fires when user switches tabs, apps, goes to homescreen, etc.
  if (document.visibilityState == 'hidden') { ... }

  // fires when app transitions from prerender, or the user returns to the app / tab.
  if (document.visibilityState == 'visible') { ... }
});

If you're counting on unload to save state, record and report analytics data, and execute other relevant logic, then you're missing a large fraction of mobile sessions where unload will never fire. Similarly, if you're counting on the beforeunload event to prompt the user about unsaved data, then you're ignoring that "clean shutdowns" are the exception, not the rule.

Use the Page Visibility API and forget that the other events even exist. Treat every transition to visible as a new session: restore previous state, reset your analytics counters, and so on. Then, when the application transitions to hidden, end the session: save user and app state, beacon your analytics, and perform all other necessary work. If necessary, with a bit of extra work you can aggregate these visibility-based sessions into larger user flows that account for app and tab switching - e.g. report each session to the server and have it aggregate multiple sessions together.

Practical implementation considerations

In the long term, all you need is the Page Visibility API. As of today, you will have to augment it with one other event — pagehide, to be specific — to account for the "when the page is being unloaded" case. For the curious, here's a full matrix of which events fire in each browser today (based on my manual testing):

- visibilitychange works reliably for task-switching on mobile platforms.
- beforeunload is of limited value as it only fires on desktop navigations.
- unload does not fire on mobile and desktop Safari.

The good news is that Page Visibility reliably covers task-switching scenari[...]
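Putting the recommendation together, here is a minimal sketch of the save/restore pattern with pagehide as the safety net; the helper names (saveAppState, restoreAppState, beaconAnalytics) are placeholders for your own logic:

// End the session at most once per hidden transition: persist state and
// beacon analytics. pagehide is only a safety net for the unload case.
var sessionOpen = false;

function endSession() {
  if (!sessionOpen) return;
  sessionOpen = false;
  saveAppState();      // e.g. write to localStorage / IndexedDB
  beaconAnalytics();   // e.g. navigator.sendBeacon('/analytics', payload)
}

function startSession() {
  if (sessionOpen) return;
  sessionOpen = true;
  restoreAppState();   // treat every transition to visible as a new session
}

document.addEventListener('visibilitychange', function () {
  document.visibilityState === 'hidden' ? endSession() : startSession();
});

// Covers "the page is being unloaded" on browsers where it fires.
window.addEventListener('pagehide', endSession);

// Kick off the first session if the page is already visible.
if (document.visibilityState === 'visible') startSession();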

Eliminating Roundtrips with Preconnect


The "simple" act of initiating an HTTP request can incur many roundtrips before the actual request bytes are routed to the server: the browser may have to resolve the DNS name, perform the TCP handshake, and negotiate the TLS tunnel if a secure socket is required. All accounted for, that's anywhere from one to three — and more in unoptimized cases — roundtrips of latency to set up the socket before the actual request bytes are routed to the server. Modern browsers try their best to anticipate what connections the site will need before the actual request is made. By initiating early "preconnects", the browser can set up the necessary sockets ahead of time and eliminate the costly DNS, TCP, and TLS roundtrips from the critical path of the actual request. That said, as smart as modern browsers are, they cannot reliably predict all the preconnect targets for each and every website. The good news is that we can — finally — help the browser; we can tell the browser which sockets we will need ahead of initiating the actual requests via the new preconnect hint shipping in Firefox 39 and Chrome 46! Let's take a look at some hands-on examples of how and where you might want to use it. Preconnect for dynamic request URLs Your application may not know the full resource URL ahead of time due to conditional loading logic, UA adaptation, or other reasons. However, if the origin from which the resources are going to be fetched is known, then a preconnect hint is a perfect fit. Consider the following example with Google Fonts, both with and without the preconnect hint: In the first trace, the browser fetches the HTML and discovers that it needs a CSS resource residing on With that downloaded it builds the CSSOM, determines that the page will need two fonts, and initiates requests for each from — first though, it needs to perform the DNS, TCP, and TLS handshakes with that origin, and once the socket is ready both requests are multiplexed over the HTTP/2 connection. In the second trace, we add the preconnect hint in our markup indicating that the application will fetch resources from As a result, the browser begins the socket setup in parallel with the CSS request, completes it ahead of time, and allows the font requests to be sent immediately! In this particular scenario, preconnect removes three RTTs from the critical path and eliminates over half of second of latency. The font-face specification requires that fonts are loaded in "anonymous mode", which is why we must provide the crossorigin attribute on the preconnect hint: the browser maintains a separate pool of sockets for this mode. Initiating preconnect via Link HTTP header In addition to declaring the preconnect hints via HTML markup, we can also deliver them via an HTTP Link header. For example, to achieve the same preconnect benefits as above, the server could have delivered the preconnect hint without modifying the page markup - see below. The Link header mechanism allows each response to indicate to the browser which other origins it should connect to ahead of time. For example, included widgets and dependencies can help optimize performance by indicating which other origins they will need, and so on. Preconnect with JavaScript We don't have to declare all preconnect origins upfront. The application can invoke preconnects in response to user input, anticipated activity, or other user signals with the help of JavaScript. 
For example, consider the case where an application anticipates the likely navigation target and issues an early preconnect: function preconnectTo(url) { var hint = document.createElement("link");[...]
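The Link header example referenced above ("see below") is not preserved in this excerpt; as a sketch, a server can emit the preconnect hint roughly like this, where the font origin and the Node.js-style server are placeholders:

// Deliver the hint as an HTTP response header; equivalent in effect to
// <link rel="preconnect" href="https://fonts.example.com" crossorigin>.
var http = require('http');

http.createServer(function (req, res) {
  // Tell the browser to warm up a connection to the (placeholder) font
  // origin before the page markup even references it.
  res.setHeader('Link', '<https://fonts.example.com>; rel=preconnect; crossorigin');
  res.end('<html>...</html>');
}).listen(8080);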

Browser Progress Bar is an Anti-pattern


The user initiates a navigation, and the browser gets busy: it'll likely have to resolve a dozen DNS names, establish an even larger number of connections, and then dispatch one or more requests over each. In turn, for each request, it often does not know the response size (chunked transfers), and even when it does, it is still unable to reliably predict the download time due to variable network weather, server processing times, and so on. Finally, fetching and processing one resource might trigger an entire subtree of new requests. Ok, so loading a page is complicated business, so what? Well, if there is no way to reliably predict how long the load might take, then why do so many browsers still use and show the progress bar? At best, the 0-100 indicator is a lie that misleads the user; worse, its success criterion forces developers to optimize for "onload time", which misses the progressive rendering experience that modern applications are aiming to deliver. Browser progress bars fail both the users and the developers; we can and should do better.

Indeterminate indicators in the post-onload era

To be clear, progress indicators are vital to helping the user understand that an operation is in progress. The browser needs to show some form of a busy indicator, and the important questions are: what type of indicator, whether progress can be estimated, and what criteria are used to trigger its display. Some browsers have already replaced "progress bars" with "indeterminate indicators" that drop the pretense of predicting and estimating something they can't. However, this treatment is inconsistent between different browser vendors, and even between the same browser on different platforms — e.g. many mobile browsers use progress bars, whereas their desktop counterparts use indeterminate indicators. We need to fix this.

Also, while we're on the subject, what are the conditions that trigger the browser's busy indicator anyway? Today the indicator is shown only while the page is loading: it is active until the onload event fires, which is supposed to indicate that the page has finished fetching all of the resources and is now "ready". However, in a world optimized for progressive rendering, this is an increasingly less useful concept: the presence of an outstanding request does not mean the user can't or shouldn't interact with the page; many pages defer fetching and further processing until after onload; many pages trigger fetching and processing based on user input. Time to onload is a bad performance metric and one that developers have been gaming for a while. Making it the success criterion for the busy indicator seems like a decision worth revisiting. For example, instead of relying on what is now an arbitrary initialization milestone, what if it represented the page's ability to accept and process user input?

- Does the page have visible content and is it ready to accept input (e.g. touch, scroll)? Hide the busy indicator.
- Is the UI thread busy (see jank) due to long-running JavaScript or other work? Show the busy indicator until this condition is resolved; the busy indicator may be shown at any point in the application lifecycle.

The initial page load is simply a special case of painting the first frame (ideally in <1000ms), at which time the page is unable to process user input. Post first frame, if the UI thread is busy once again, then the browser can and should show the same indicator. Changing the busy indicator to signal interactivity would address our existing issues with penalizing progressive rendering, remove the need to continue gaming onload, and create direct incentives for developers to build and optimize for smooth and jank-free experiences. [...]

Fixing the 'Blank Text' Problem


"In cases where textual content is loaded before downloadable fonts are available, user agents may render text as it would be rendered if downloadable font resources are not available or they may render text transparently with fallback fonts to avoid a flash of text using a fallback font." - Font loading guidelines.

The ambiguity and lack of developer override in the above spec language are a big gap and a performance problem. First, the ambiguity leaves us with inconsistent behavior across different browsers, and second, the lack of developer override means that we are either rendering content that should be blocked, or unnecessarily blocking rendering where a fallback would have been acceptable. There isn't a single strategy that works best in all cases.

Let's quantify the problem

How often does the above algorithm get invoked? What's the delta between the time the browser was first ready to render text and when the font became available? Speaking of which, how long does it typically take the font download to complete? Can we just initiate the font fetch earlier to solve the problem? As it happens, Chrome already tracks the necessary metrics to answer all of the above. Open a new tab and head to chrome://histograms to inspect the metrics (for the curious, check out histograms.xml in Chromium source) for your profile and navigation history. The specific metrics we are interested in are:

- WebFont.HadBlankText: count of times text rendering was blocked.
- WebFont.BlankTextShownTime: duration of blank text due to blocked rendering.
- WebFont.DownloadTime.*: time to fetch the font, segmented by filesize.
- PLT.NT_Request: time to first response byte (TTFB).

Text rendering performance on Chrome for Android

Inspecting your own histograms will, undoubtedly, reveal some interesting insights. However, is your profile data representative of the global population? Chrome aggregates anonymized usage statistics from opted-in users to help the engineering team improve Chrome's features and performance, and I've pulled the same global metrics for Chrome for Android. Let's take a look...

Metric                              50th       75th       95th
WebFont.DownloadTime.0.Under10KB    ~400 ms    ~750 ms    ~2300 ms
WebFont.DownloadTime.1.10KBTo50KB   ~500 ms    ~900 ms    ~2600 ms
WebFont.DownloadTime.2.50KBTo100KB  ~600 ms    ~1100 ms   ~3800 ms
WebFont.DownloadTime.3.100KBTo1MB   ~800 ms    ~1500 ms   ~5000 ms
WebFont.BlankTextShownTime          ~350 ms    ~750 ms    ~2300 ms
PLT.NT_Request                      ~150 ms    ~380 ms    ~1300 ms

Metric                 No blank text    Had blank text
WebFont.HadBlankText   ~71%             ~29%

29% of page loads on Chrome for Android displayed blank text: the user agent knew the text it needed to paint, but was blocked from doing so due to the unavailable font resource. In the median case the blank text time was ~350 ms, ~750 ms for the 75th percentile, and a scary ~2300 ms for the 95th. Looking at the font download times, it is also clear that even the smallest fonts (<10KB) can take multiple seconds to complete. Further, the time to fetch the font is significantly higher than the time to the first HTML response byte (see PLT.NT_Request) that may contain text that can be rendered. As a result, even if we were able to start the font fetch in parallel with the HTML request, there are still many cases where we would have to block text rendering. More realistically, the font fetch would be delayed until we know it is required, which means waiting for the HTML response, building the DOM, and resolving styles, all of which defer text rendering even further. Developers need control of the text re[...]
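The excerpt cuts off just as it turns to developer control. One way to take that control today is the CSS Font Loading API; below is a sketch of a timeout-then-fallback strategy. The 3000 ms budget, the "Open Sans" family, and the fonts-loaded class are assumptions for illustration, not recommendations from the article:

// Render with fallback fonts immediately; switch to the web font only once
// it has loaded, and give up after an explicit budget so text never stays
// blank indefinitely.
var fontBudget = new Promise(function (resolve, reject) {
  setTimeout(reject, 3000); // illustrative budget
});

Promise.race([document.fonts.load('1em "Open Sans"'), fontBudget])
  .then(function () {
    // Font arrived in time: opt into it via a class the CSS keys off.
    document.documentElement.className += ' fonts-loaded';
  })
  .catch(function () {
    // Budget exceeded (or the load failed): keep the fallback font.
  });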

Resilient Networking: Planning for Failure


A 4G user will see a much better median experience, in terms of both bandwidth and latency, than a 3G user, but the same 4G user will also fall back to the 3G network for some of the time due to coverage, capacity, or other reasons. Case in point: OpenSignal data shows that an average "4G user" in the US gets LTE service only ~67% of the time. In fact, in some cases the same "4G user" will even find themselves on 2G, or worse, with no service at all. All connections are slow some of the time. All connections fail some of the time. All users experience these behaviors on their devices regardless of their carrier, geography, or underlying technology — 4G, 3G, or 2G. You can use the OpenSignal Android app to track your own stats for 4G/3G/2G time, plus many other metrics.

Why does this matter? Networks are not reliable, latency is not zero, and bandwidth is not infinite. Most applications ignore these simple truths and design for the best-case scenario, which leads to broken experiences whenever the network deviates from its optimal case. We treat these cases as exceptions, but in reality they are the norm. All 4G users are 3G users some of the time. All 3G users are 2G users some of the time. All 2G users are offline some of the time. Building a product for a market dominated by 2G vs. 3G vs. 4G users might require an entirely different architecture and set of features. However, a 3G user is also a 2G user some of the time; a 4G user is both a 3G and a 2G user some of the time; all users are offline some of the time. A successful application is one that is resilient to fluctuations in network availability and performance: it can take advantage of the peak performance, but it plans for and continues to work when conditions degrade.

So what do we do? Failing to plan for variability in network performance is planning to fail. Instead, we need to accept this condition as a normal operational case and design our applications accordingly. A simple but effective strategy is to adopt a "Chaos Monkey approach" within our development cycle:

- Define an acceptable SLA for each network request
  - Interactive requests should respect perceptual time constants.
  - Background requests can take longer but should not be unbounded.
- Make failure the norm, instead of an exception
  - Force offline mode for some periods of time.
  - Force some fraction of requests to exceed the defined SLA.
  - Deal with SLA failures instead of ignoring them.

Degraded network performance and offline are the norm, not an exception. You can't bolt on an offline mode, or add a "degraded network experience" after the fact, just as you can't add performance or security as an afterthought. To succeed, we need to design our applications with these constraints from the beginning.

Tooling and API's

Are you using a network proxy to emulate a slow network? That's a start, but it doesn't capture the real experience of your average user: a 4G user is fast most of the time and slow or offline some of the time. We need better tools that can emulate and force these behaviors when we develop our applications (see the sketch after this excerpt). Testing against localhost, where latency is zero and bandwidth is infinite, is a recipe for failure. We need API's and frameworks that can facilitate and guide us to make the right design choices to account for variability in network performance. For the web, ServiceWorker is going to be a critical piece: it enables offline, and it allows full control over the request lifecycle, such as controlling SLA's, background updates, and more. [...]
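As a concrete (and deliberately simplistic) sketch of the "Chaos Monkey approach" above: a development-only Service Worker that forces a fraction of requests to fail or to blow past the SLA, so the application's fallback paths actually get exercised. The ratios and the delay are arbitrary assumptions:

// chaos-sw.js -- register only in development builds.
var FAIL_RATIO = 0.05;     // ~5% of requests fail outright ("offline")
var SLOW_RATIO = 0.10;     // ~10% of requests are delayed past the SLA
var SLOW_DELAY_MS = 5000;  // well beyond any interactive budget

self.addEventListener('fetch', function (event) {
  var roll = Math.random();

  if (roll < FAIL_RATIO) {
    // Simulate a network failure: the page sees a rejected fetch.
    event.respondWith(Promise.reject(new TypeError('chaos: forced failure')));
  } else if (roll < FAIL_RATIO + SLOW_RATIO) {
    // Simulate a slow network: hold the response past the SLA.
    event.respondWith(new Promise(function (resolve) {
      setTimeout(function () { resolve(fetch(event.request)); }, SLOW_DELAY_MS);
    }));
  }
  // Otherwise, fall through to the network as usual.
});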

Capability Reporting with Service Worker


Some people, when confronted with a problem, think: “I'll use UA/device detection!” Now they have two problems... But, despite all of its pitfalls, UA/device detection is a fact of life, a growing business, and an enabling business requirement for many. The problem is that UA/device detection frequently misclassifies capable clients (e.g. IE11 was forced to change its UA); leads to compatibility nightmares; and can't account for continually changing user and runtime preferences. That said, when used correctly it can also be used for good. Browser vendors would love to drop the User-Agent string entirely, but that would break too many things.

However, while it is fashionable to demonize UA/device detection, the root problem is not in the intent behind it, but in how it is currently deployed. Instead of "detecting" (i.e. guessing) the client capabilities through an opaque version string, we need to change the model to allow the user agent to "report" the necessary capabilities. Granted, this is not a new idea, but previous attempts seem to introduce as many issues as they solve: they seek to standardize the list of capabilities; they require agreement between multiple slow-moving parties (UA vendors, device manufacturers, etc.); they are over-engineered - RDF, seriously? Instead, what we need is a platform primitive that is:

- Flexible: browser vendors cannot anticipate all the use cases, nor do they want or need to be in this business beyond providing implementation guidance and documenting the best practices.
- Easy to deploy: developers must be in control over which capabilities are reported. No blocking on UA consensus or other third parties.
- Cheap to operate: compatible and deployable with existing infrastructure. No need for third-party databases, service contracts, or other dependencies in the serving path.

Here is the good news: this mechanism already exists, and it's Service Worker. Let's take a closer look... "Service worker is an event-driven Web Worker, which responds to events dispatched from documents and other sources…" The service worker is a generic entry point for event-driven background processing in the Web Platform that is extensible by other specifications - see the explainer, starter, and cookbook docs. A simple way to understand Service Worker is to think of it as a scriptable proxy that runs in your browser and is able to see, modify, and respond to all requests initiated by the page it is installed on. As a result, the developer can use it to annotate outbound requests (via HTTP request headers, URL rewriting) with relevant capability advertisements:

- The developer defines what capabilities are reported and on which requests.
- Capability checks are executed on the client - no guessing on the server.
- Reported values are dynamic and able to reflect changes in user preference and runtime environment.

This is not a proposal or a wishlist; this is possible today, and is a direct result of enabling powerful low-level primitives in the browser - hooray. As such, now it's only a question of establishing the best practices: what do we report, in what format, and how do we optimize interoperability? Let's consider a real-world example...

E.g. optimizing video startup experience

Our goal is to deliver the optimal — fast and visually pleasing — video startup experience to our users. Simply starting with the lowest bitrate is suboptimal: fast, but consistently poor visual quality for all users, even for those with a fast connection. Instead, we want to pick a starting bitrate that can deliver the best visual experience from the start, while minimizing playback delays and rebuffers. We don't need to be perfect, but we should account for the current network weather on the client. Once the video starts playing, the adaptive bitrate streaming will take over a[...]
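A minimal sketch of the annotation idea for the video example: a Service Worker that attaches a client-measured capability header to same-origin requests, which the server can then use to pick the starting bitrate. The header name, the reported fields, and the reliance on navigator.connection / navigator.deviceMemory (where exposed) are all illustrative assumptions, not an established convention:

// capability-sw.js -- annotate same-origin requests with client capabilities.
self.addEventListener('fetch', function (event) {
  var req = event.request;
  if (new URL(req.url).origin !== self.location.origin) return;

  // Checks run on the client, at request time, so the values can reflect
  // the current network weather and device state.
  var connection = self.navigator.connection || {};
  var caps = {
    connectionType: connection.type || 'unknown',      // e.g. 'cellular', 'wifi'
    downlinkMax: connection.downlinkMax || null,       // Mbps, if exposed
    deviceMemory: self.navigator.deviceMemory || null  // GB, if exposed
  };

  var headers = new Headers(req.headers);
  headers.set('X-Client-Caps', JSON.stringify(caps));  // illustrative header name

  // Simplified: re-issue the request with the extra header. A production
  // worker would preserve method, body, credentials, etc. as needed.
  event.respondWith(fetch(req.url, { headers: headers, credentials: 'same-origin' }));
});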