The Real Costs of Using JavaScript on a Page (video, 36 minutes)
In his presentation, Addy Osmani discusses the costs of JavaScript when building interactive web experiences, emphasizing that doing so often involves sending a large amount of JavaScript to users. While JavaScript is beneficial, too much of it can lead to loading delays and hinder the user experience. He points out the evolution of rendering techniques, moving from traditional server-side rendering to more modern approaches like islands architecture and progressive hydration. These techniques aim to optimize loading and rendering to provide users with a better experience. Osmani stresses that smaller JavaScript bundles can improve download speeds, reduce memory usage, and lower CPU costs, which are vital for ensuring smooth interactions. He also tackles the hardware limitations and network latency that may impact performance. By the end of the presentation, he highlights strategies such as code splitting and component prioritization, which help in achieving a quicker time to interactive. As of the time this article was written, the video has 51,581 views and 1,558 likes, indicating strong interest in this topic within the development community.
Timeline summary
- Introduction to building interactive web experiences using JavaScript.
- Discussion on the costs associated with JavaScript.
- The necessity of JavaScript for smooth transitions and better interactivity.
- Argument for moderation in using JavaScript.
- Introduction of the concept of zero JavaScript and its implications.
- Emphasis on utilizing native browser capabilities over heavy JavaScript dependencies.
- Discussion on evolving rendering patterns in web development.
- Overview of modern architectural patterns, including partial hydration.
- Reminder that each rendering pattern addresses specific use cases.
- Differentiation between rendering approaches for simple and highly interactive applications.
- Examination of server-side rendering and its historical evolution.
- Challenges and trade-offs with server-side rendering.
- Introduction to progressive hydration and its advantages.
- Explanation of how progressive hydration enhances performance by loading JavaScript as needed.
- The need for optimizing JavaScript bundles and minimizing their impact.
- Impact of smaller JavaScript bundles on performance metrics.
- Discussion on the importance of network and hardware optimization.
- Insights into how CPU performance impacts web applications.
- Review of performance differences between devices and their implications.
- Observations on ongoing performance inequalities in web access.
- Understanding the complexities of network performance.
- Introduction to the impact of resource loading on web performance.
- Transitioning into strategies for optimizing loading and rendering.
- Overview of code splitting as a method for optimizing JavaScript.
- Different approaches for component-based loading to enhance performance.
- Discussion on the foundational decisions around content rendering.
- Hydration challenges and the need for efficient JavaScript execution.
- The significance of selective hydration in modern web architecture.
- The importance of performance as an ongoing process in web development.
- Conclusion and resources for further learning on web performance.
Transcription
Hey folks. So building interactive experiences on the web often translates into sending a large amount of JavaScript to users. Today, we're going to talk about the cost of JavaScript and some ways to load our experiences fast and still give users a pretty good first experience. Now, I love JavaScript. The reasons for having JavaScript on your web pages can be good ones. You might be looking for smoother transitions, faster loading of dynamic content, and better interactivity. But much like cake, there is such a thing as having too much of it, and this has become a bit of an unfortunate norm in recent years. So JavaScript is good, but in moderation. Now, zero JavaScript has been a new buzz phrase around the JavaScript community recently. We've seen React talk about server components, components that only render on the server, so you don't ship their JS in your bundles. Things like Moment or Lodash can maybe benefit from that. Remix and Svelte have emphasized leaning into what web browsers can do by default, instead of relying on so much JavaScript. There's been some good conversation floating around on this topic recently. Now, to tackle the cost of JavaScript, how we build and render on the web has been evolving. Rendering patterns have come a long way: from server-side rendering to client-side rendering, all the way back to server-side rendering again, and on to highly nuanced patterns like islands architecture and partial hydration, which we'll talk a little bit more about later. It can in fact sometimes feel like patterns are everywhere all at once, much like ugly Christmas sweaters. Now, while this can get overwhelming, it's important to remember that every pattern was designed to address a pretty specific use case. Now, the level of interactivity you require and things like session depth might inform how you render and how much you rely on JavaScript. For example, your personal blog may not necessarily need a lot of it.
A highly interactive e-commerce site may require a little more. So a pattern that's beneficial for one use case may not be the best fit for another. Now, as I mentioned, server-side rendering is one of the oldest methods of rendering web content. It generates the full HTML for page content to be rendered in response to a user request. That content may include data from a data store or an external API. The connect and fetch operations are usually handled on the server. As such, rendering code isn't really required on the client, and the JavaScript corresponding to the experience doesn't need to be sent to the client either. This is how we used to do things back in the PHP days. But SSR these days is not all that simple. If you've taken your favorite client-side rendering JavaScript framework and you're rendering on the server, simply server-side rendering your single-page app isn't going to suddenly fix everything. In fact, it might introduce some additional trade-offs or surprises. Framework-based SSR pages often look deceptively loaded and interactive, but often can't actually respond to input until the client-side JavaScript is executed and event handlers have been attached. You have to deal with problems like hydration and so on. This can take seconds on slower devices and can lead to an uncanny valley effect where users are tapping but not actually seeing something happen. Now, I mentioned that how we build is evolving to try loading our JavaScript at more opportune times. This has given birth to ideas like progressive hydration. In order to only ship the bare minimum JavaScript required to initially render the most important parts of the page and make it interactive, companies like Airbnb have split each page into multiple sections, and they defer the downloading of their JavaScript and the rendering and hydration of those components if they're not in the viewport. That can lead to things loading and rendering a little bit faster.
Their React app is server-side rendered, then uses progressive hydration to make sure the most important components of the page are fully loaded and interactive while deferring the loading and rendering of the sections that aren't currently in the viewport. We've also seen a lot of talk about the Edge these days. With Edge SSR, we can stream parts of the document as soon as they're ready, hydrate our components granularly, and then take advantage of things like Edge caching. That can reduce the waiting time for users, as they can see components as they stream in one by one. But we can't talk about all of these advanced ideas without first understanding the true cost of JavaScript. Now, we've all had to deal with optimizing JavaScript a little later on than we probably should have. Trying to clean up your large JavaScript bundle six months in looks a little bit like this. It's like sweeping water back into the ocean, really. On my team, our mantra is, if JavaScript doesn't bring users joy, thank it and throw it away. This was probably in a Marie Kondo special somewhere. Now, if you take nothing else away from this talk, smaller JavaScript bundles can improve download speeds, lower memory usage, and reduce CPU costs. Post-download, executing JavaScript is the dominant cost now. So try to keep it minimal. Things like parse and compile costs are no longer really stopping folks from shipping more JS. It's mainly the download and the execution costs to keep in mind. We'll talk about these phases a little bit more really soon. So before we talk about rendering patterns, we're going to talk about two big topics, optimizing for hardware and optimizing for the network. Network latency and bandwidth can impact how soon things get over the wire. On the hardware side of things, processing power bounds computationally intensive tasks. So let's dive into hardware, optimizing for the hardware. Now, the biggest bottleneck in web performance these days is your CPU.
Half of web activity comes from mobile devices and slower desktop devices with a smaller CPU and more limited battery power. Your transmit size might be crucial for low-end networks, but JavaScript execution time is important for CPU-bound devices. JavaScript is CPU-bound. So let's talk about phones. Now, mobile devices are a spectrum. You have low-end, medium, and high-end. Across the spectrum, you can see very big differences in things like thermal throttling, differences in cache size, CPU, GPU, and there's just generally a large disparity between low-end and high-end devices. If you work in tech, there's a chance that you may have one of these medium or high-end devices, while your users may not necessarily have that. So it's important to keep empathy with your end audience, your end users. Let's look at some data. So what's on screen right now is a visualization of some Geekbench data. Geekbench is a cross-platform processor benchmark with a scoring system that separates single-core and multi-core performance, and it uses workloads that try to simulate real-world scenarios. Here's a plot of Geekbench single and multi-core tests for a range of high, mid, and low-end phones. Take a look at some of those expensive higher-end devices. You see lots of iPhones at the top end, and there's a huge performance difference comparing your CPU at that end of the spectrum, compared to some of those cheaper phones, a lot of Android phones, a lot of other kinds of devices using Snapdragon chipsets lower down. Now, these differences can lead to real big deltas in how your users experience your site, and it's not just about mobile. Desktop, laptops, they're affected by this same phenomenon. This trend holds true for those high-end Macs. If you're on a high-end MacBook Pro, anything from the last four years, that might have very, very different CPU characteristics to the average desktop, the average laptop that your users may have. 
This is especially true of things like the education sector as well, where cost is a big factor and buying lower-cost machines is common. So again, you have to have a lot of empathy for your users, because they may not have as strong CPUs or as much memory on their systems as the high-end device that you're developing on. Now, this performance inequality gap is likely to continue growing. We're likely to see flagship devices continue to race ahead, while cost continues to be critical to the next billions of users trying to get on the web. There's an expanding range of performance. The slower devices are not really getting that much faster. Focus is really on driving down the price, while those fast devices are getting faster and faster, and it's just important to make sure that you've got this in mind when you're developing for the web. Now, to understand this topic a little bit more, let's talk about how browsers work. Now, I work on a web browser called Chrome. When users access your site, you're probably sending down a lot of files to the browser, lots of JavaScript. Let me visualize that for you. Looks a little bit like this. Now, I love JavaScript, but my position on it is that byte for byte, JavaScript is the most expensive resource on your site. Let me explain why. Now, the biggest bottleneck in web performance, as I mentioned, is your CPU. You need to make sure that you're optimizing for your CPU, and you're not making big assumptions about what kind of CPUs your users have. It's useful to take a look at the life of a script and how JavaScript engines such as V8 work, just to understand where this all fits in. There's a process that begins with scripts coming in from the network or the cache, or potentially from an extension or an app. That script text gets processed by a parser, yielding an abstract syntax tree. We're getting into computer science here. That tree is then walked to generate bytecode, and that bytecode is interpreted.
Eventually, all of this code is run. But really, the costs you need to keep in mind, again, are your network, getting it to the user, and then execution. A lot of the steps in between have been optimized very well by JavaScript engines and browsers over time. Those are the ends of the spectrum that I'd love you to focus on. As I mentioned, CPU, GPU, and RAM, these are things related to your hardware that can impact performance. Now, hardware constrains computationally expensive tasks. These are often tasks requiring a lot of operations. They can be expensive rendering or expensive JavaScript; even if you don't understand how CSS works, you can imagine rendering gradients is maybe more costly than just rendering a rectangle. Now, up here we have some data from the HTTP Archive. It shows the JavaScript bytes on mobile, as well as the unused JavaScript bytes on mobile. As was the case last year in the Web Almanac, we saw that this year marked yet another increase in the amount of JavaScript that people were shipping to browsers. Overall, there's an eight percent increase from 2021 for mobile devices, and while capabilities continue to improve, not everyone is really running the latest device. According to Lighthouse, the median mobile page loads about 162 kilobytes of unused JavaScript; at the 90th percentile, it's about 604 kilobytes. This is a slight uptick from last year, in fact. We're still shipping a lot of JavaScript down to our users, probably too much. What impact do large bundles have on performance? Just keep in mind that even if you think your code, your first-party code, your application logic is quite small, when you factor in your framework, all of the libraries, your UI components, all of these different layers that you're pulling down from NPM and from the web, all of these add up, and then you throw third-party code on top of it. Maybe your trackers, your analytics, whatever else your company may have, and it all adds up.
I call this the JavaScript tax, useful to just be aware of. So how does this all play into the impact on users? I like to think of the load experience as a little film strip. You want the experience to be visible and ideally interactive as soon as possible. Things like First Contentful Paint are a good place to start improving. There's no real simple answer here. There's no one single metric that captures the real user experience. In the last couple of years, we've been increasingly focused on the Core Web Vitals metrics. I tend to think of these phases in terms of, is it happening, is it useful, and is it usable? Whatever metrics you're looking at, whether they're the Core Web Vitals metrics, including the upcoming Interaction to Next Paint, or your own custom metrics, what's key is that you're giving users early visual feedback, and then the ability to complete actions as quickly as possible. Now, for a page to be interactive, it has to be capable of responding quickly to user input. Whether a user is clicking a link or tapping on a custom component, if the page can respond quickly, then we consider the page to be interactive. Now, late-loading JavaScript can cause server-side rendered pages to fail in some infuriating ways, and that uncanny valley effect is the reason we focus on when pages really become reliably interactive as much as we can. Now, in this example, when the above code is running, it's not just blocking other JavaScript code, but it's blocking all of the other tasks on the main thread, and that includes so-called native interactions you might not expect to be affected by user code. So it means the user can't really use it. Part of what impacts this are long tasks. Long tasks monopolize the UI thread for extended periods of time, typically over 50 milliseconds, and they block other critical tasks from being executed even if the page is visually ready. So break these up.
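A minimal sketch of that task-splitting idea: do a small chunk of work, then hand control back to the main thread before continuing. The chunk size and helper names here are illustrative; setTimeout is the broadly supported way to yield, with the newer scheduler.yield() emerging as a nicer alternative where available.

```javascript
// Yield back to the main thread so input handling and rendering can run.
// setTimeout(0) is used as the widely supported yielding mechanism.
function yieldToMain() {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Process items in small chunks instead of one long task.
// chunkSize is an illustrative tuning knob, not a magic number.
async function processInChunks(items, work, chunkSize = 50) {
  const results = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    for (const item of items.slice(i, i + chunkSize)) {
      results.push(work(item));
    }
    await yieldToMain(); // one long task becomes many short ones
  }
  return results;
}
```

Each chunk now runs as its own short task, so a click handler queued midway no longer has to wait for the entire loop to finish.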
Break these up into smaller tasks by splitting up your code and prioritizing the order in which it gets loaded. Not only can you get your pages interactive faster, but you can reduce long tasks and then hopefully have less input latency and fewer slow frames. So you need to break up those tasks. That means taking a single long task and dividing it into much smaller tasks that take less time to run individually. So the two key problems I want you to keep in mind: download time is critical for slow networks, and JavaScript execution time is critical for devices with slow CPUs. On the server, you get what you pay for with your CPUs, your disks, your networks. As we build sites more heavily reliant on JavaScript, we sometimes pay for what we send down in ways that we can't see. The shape of success is really whatever lets us send less code while delivering the maximum value to our users. There are other factors on phones and desktops and laptops that are worth keeping in mind. Low battery can slow down your CPU performance, overheating can slow down your CPU performance, and background processes can also impact this. So that's it for hardware. Let's briefly talk about optimizing for the network, because network latency and bandwidth can impact how soon things get over the wire. Now, I like to say that I use the free Wi-Fi at Starbucks to complain about how slow their Wi-Fi is. Just a reminder that a fast connection does not mean it's always reliably fast. Networks are complex. So how do networks impact performance? When we talk about networks, we define them in terms of bandwidth and latency. Bandwidth can be thought of as data throughput. If water were data, a straw would be low bandwidth versus a pipe or a fire hose. Latency is the delay due to data travel, and the amount of latency is a factor of transmission medium and distance. This is one reason why CDNs are considered another performance strategy, because they locate data a little bit closer to users.
If you had a short garden hose, you can imagine water coming out a little bit more quickly. If you have a long one, you're probably waiting, waiting, waiting, and eventually it comes out. So the long hose is high latency in this picture. In most markets, it's hard to blame bandwidth for performance issues these days. Increasing bandwidth can have diminishing returns. Gaming and video are probably an exception these days, but bandwidth is less of a direct problem. Latency, on the other hand, is actually highly correlated with performance. Given a performance goal of a second, there are only so many back-and-forth trips to the server we can afford if we want to achieve our goal. Now, sometimes folks will wonder, you know, how much do 4G, 3G, 2G, and 5G play into this topic? Here is a view of the Chrome UX Report for a popular site, twitter.com, showing on mobile just how much coverage we have for 4G versus 3G. 3G is roughly 10%, I would say, in this picture, with the majority being 4G connection types. You can use the Chrome UX Report to try getting data like this for your site if it has enough traffic coming to it. It's really insightful. Now, resource loading is just generally hard. Hopefully, you'll have a better understanding of it after reading through this list and checking out this talk. But performance tends to be tightly coupled to latency. Connection costs are high. Congestion control is often unavoidable. Critical resources can often be hidden. As we rely on JavaScript increasingly these days, sometimes critical resources can require other scripts to be downloaded first before the browser can parse them and actually discover where those other resources are so we can fetch them. And script execution, as I mentioned earlier, continues to be expensive. Now, there are a lot of different facets to navigations. There is all the time we spend in the user's connection. There is the server time, and then there's the browser execution.
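Those phases can actually be measured in the browser. Here's a small sketch that derives rough phase durations from a PerformanceNavigationTiming entry; the attribute names are the standard ones, while the phase labels on the left are my own, not terms from the talk.

```javascript
// Sketch: derive rough navigation phases from a PerformanceNavigationTiming
// entry. All timestamps below are standard attributes on that entry type.
function navigationPhases(entry) {
  return {
    dns: entry.domainLookupEnd - entry.domainLookupStart,
    connect: entry.connectEnd - entry.connectStart,
    serverTime: entry.responseStart - entry.requestStart, // time to first byte
    download: entry.responseEnd - entry.responseStart,
    browserWork: entry.domComplete - entry.responseEnd, // parse, script, layout
  };
}

// In a browser, the real entry comes from the performance timeline:
//   const [nav] = performance.getEntriesByType("navigation");
//   console.log(navigationPhases(nav));
```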
Navigation timing is really great if you're trying to get insight into how this works for your site and your users. Now, there are a number of evergreen network best practices. You can reduce DNS lookup times, reduce your TCP connections, take a look at things like heavily caching resources, and minimize the number of HTTP redirects. There's lots of great content on this topic over on web.dev. And there are a number of compression trade-offs worth also keeping in mind. Compression was one of the tips in the evergreen network best practices, but there's a little bit of a trade-off here. If you end up splitting up your JavaScript bundles pretty granularly, you might have good caching, but you'll have poor compression. Whereas if you go in the other direction and you have larger bundles, that might lead to good compression, but poor caching and poor granularity. So just keep in mind that there are trade-offs to this stuff. We've started to see increases in folks using Brotli for compression rather than GZIP. Generally speaking, I've seen lots of great case studies where it's improved latency and reduced the overall transfer size of JavaScript, and it's effectively been a lot more efficient than GZIP, so consider using Brotli. And then there are a number of ways that you as a developer can influence how browsers think about fetch priorities and when to fetch and prioritize things. So preload helps the browser fetch late-discovered resources earlier. Priority hints, via the fetchpriority attribute, allow you to hint at the relative priority of a resource. DNS prefetching allows the browser to perform DNS lookups on a page in the background while the user is browsing. And there are lots of other hints here, including ones that have recently landed in Chrome, such as 103 Early Hints, worth keeping in mind. So let's talk about optimizing loading and rendering next. This is where we start to talk about patterns. Now, before we do that, I want to talk about JavaScript bundles.
We all NPM install popular packages without realizing that these have exploding costs over time. This is what an NPM install looks like on my device most of the time. It also has an effect on your JavaScript bundles. So just keep in mind auditing what you're shipping to your users. Now, there are some great tools like Lighthouse Treemap and Webpack Bundle Analyzer that can help you here to better understand what you're sending down to your users. I highly recommend checking these out. So let's talk about code splitting. It's a foundational pattern for optimizing how JavaScript loads. We'll briefly cover it before looking at ideas to build on top of it. Starting off with static import. So in JavaScript modules, the import keyword allows us to import code that's been exported by another module. By default, all the modules we're statically importing get added to the initial bundle. A module that's imported using the default ES2015 import syntax is going to be statically imported. This is an example with a very simple chat app. It contains a chat component, which we're statically importing, and then we're rendering a number of components. We've got user info, a chat list, and a chat input. Now, within that chat input module, we're statically importing an emoji picker component to be able to show the user emoji when they toggle the emoji button, but that may not actually be a critical thing to show. Now, since the components were statically imported, Webpack bundled the modules into the initial bundle, and we can see the bundle that Webpack created here. Our chat app's source code gets bundled into a single file, main.bundle.js, and it's pretty big. It's 1.5 megs. A large bundle size can affect our loading times, and that's a lot of code, right? It's a lot of code. So what can we do about that? Well, this is where code splitting comes in. When building a modern web app, you usually use a bundler like Webpack or Rollup to take your application source and bundle it together.
Instead of generating one giant bundle that contains unnecessary code, we can split the bundle into multiple smaller bundles that can be loaded at more opportune times. Now, if we want to create async chunks that we can load at a more opportune time, we can use dynamic import. It allows us to better split and lazy load our modules, enabling patterns like loading based on user interaction. Now, in our chat application, remember that we had a number of components. Only three of those components are used instantly on the initial page load, and then we have that emoji picker that isn't directly visible and may not even be rendered at all if, you know, maybe you don't like emoji. So this would mean that using static import, we might be unnecessarily adding that emoji picker module to our initial bundle, which could increase loading time. To solve that, we can dynamically import the emoji picker component. Instead of statically importing it, we'll only import it when we want to show the emoji picker. By dynamically importing the emoji picker component, we can reduce the initial bundle size from 1.5 megs to 1.3 megs. Although it's still a lot, it's a bit smaller, and, you know, the user might have to wait a sec until the emoji picker has been fully loaded when they're first trying it out, but we have improved the user experience by making sure that for most users, the application is rendered and interactive; only the user who opens the picker waits for that component to load. And you can always do things like prefetching to reduce the time the user's waiting for that emoji picker. By splitting our larger bundle into smaller bundles, main.bundle.js and emoji.picker.bundle.js, we reduce the initial loading time, and we're also able to reduce how much JavaScript needs to be executed, which helped us improve our time to interactive here. Now, to further improve the page's performance, you can use both route-based and component-based code splitting.
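To make that interaction-triggered loading concrete, here's a tiny hedged sketch: a helper that wraps a dynamic import so the async chunk is requested once, on first use, and cached for repeat clicks. The file path and component names in the commented usage are illustrative, not taken from the talk's actual demo.

```javascript
// A tiny lazy-loader: wraps a dynamic import so the chunk is requested once,
// on first use, and the resulting promise is cached for later calls.
function lazy(loader) {
  let cached = null;
  return () => (cached ??= loader());
}

// Hypothetical usage in the chat app (path and names are illustrative):
//   const loadEmojiPicker = lazy(() => import("./emoji-picker.js"));
//   emojiButton.addEventListener("click", async () => {
//     const { EmojiPicker } = await loadEmojiPicker();
//     new EmojiPicker().open();
//   });
```

Caching the promise rather than the module means concurrent clicks share one in-flight request instead of fetching the chunk twice.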
Route-based code splitting ensures that only the content required for the current route will be fetched. Component-based code splitting can improve the loading experience even more. So components that aren't necessary for the initial render, such as components further down the page or ones that are only visible on user interaction, can be deferred. Now let's talk about importing on visibility. This pattern is all about only loading things when they're visible or close to being visible. Invisibility is clearly something that Shaq is an expert on here, as you can see. You can't see me. Now, besides user interaction, we often have components in our web pages that aren't visible on the initial page load. A good example of this is lazy loading images that aren't directly visible in the viewport, but might only get loaded once the user scrolls down. As we're not requesting all of our images instantly, we can reduce the initial loading time. This can also apply to components. So in order to know whether components are currently in our viewport, we can use the Intersection Observer API or libraries to quickly add import on visibility to our app. So whenever the emoji picker is rendered to the screen, after the user clicks on the emoji button, we can detect that the emoji picker element should be visible, and only then will it begin importing the module while the user sees a loading component being rendered. This is just another way of approaching loading up that experience. Now, these patterns of loading on visibility or on idle or on interaction are becoming increasingly popular and can be found with first-class support in stacks like Astro and Eleventy. So let's talk about rendering at last. When you start architecting a new web app, one of the foundational decisions you make is how and where do I want to render content? Should it be rendered on the web server, the build server, on the edge, or directly on the client?
Should it be rendered all at once, partially, or progressively? Gosh, that's a lot of questions. In the last few years, there have been a number of newer solutions to problems in the rendering space worth keeping an eye on. So let's talk about some of them. One of the first problems is hydration. It can feel a lot like what happens when you're washing a spoon and water just defies gravity. Hydration comes in a few different flavors these days, but let's start with a basic definition of it. Hydration, or rehydration, uses client-side JavaScript to convert static HTML pages into dynamic web pages by attaching event handlers to the HTML elements. Although server rendering can provide a fast FCP, it doesn't always provide a fast time to interactive. The JavaScript necessary to load and interact with the website may not have been loaded yet. And so we ideally want to avoid buttons looking interactive but not being. So one of the patterns that tries to work around this is islands architecture, or partial hydration, as it's sometimes known. Partial hydration basically means that not all of the page needs hydration. This means we don't need to send all of our JavaScript for component code down to the browser. The term islands gets used a lot here, and it basically involves breaking the app into a static page with islands of interactivity, so only the component code for those islands needs to be sent to the browser. Now, the islands architecture encourages these small, focused chunks of interactivity within server-rendered pages. And the term islands architecture was popularized by Katie Sylor-Miller and Jason Miller. The output of islands tends to be progressively enhanced HTML, with more specificity around how that enhancement occurs. Next up we have progressive hydration. Now, progressive hydration is the ability to hydrate the page as needed. Maybe when it comes into view or on interaction.
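One framework-agnostic way to sketch that hydrate-on-visibility idea: defer calling your framework's hydration entry point (React's hydrateRoot, for example) until an island scrolls into view. The `hydrate` callback here is a stand-in for whatever your framework provides, not a real API.

```javascript
// Sketch: defer hydration of an island until it is (nearly) visible.
// `hydrate` is a stand-in for your framework's hydration entry point.
function hydrateWhenVisible(el, hydrate) {
  const observer = new IntersectionObserver((entries, obs) => {
    for (const entry of entries) {
      if (entry.isIntersecting) {
        obs.disconnect();      // one-shot: hydrate once, then stop observing
        hydrate(entry.target); // attach event handlers only now
      }
    }
  });
  observer.observe(el);
}
```

Until the island scrolls into view, its server-rendered HTML stays static and none of its component JavaScript needs to execute.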
Even if you end up hydrating the whole page, by breaking up hydration we can reduce the time spent during that initial load. So we've talked about hydration. We've talked about islands architecture, that partial hydration. We've talked about progressive hydration. By progressively hydrating the app, we can delay the hydration of those less important parts of the page. That way we can reduce the amount of JavaScript we have to request in order to make the page interactive, and only hydrate those nodes once the user needs them. Basically, progressive hydration allows us to only hydrate components based on a certain condition. Resumable hydration is a more recent idea, I would say. It's been played around with in the last couple of years, but basically it's a technique to reduce the execution time during hydration by serializing the data needed at a local component level to skip the need to do any calculations at hydration time. A good way to think about the difference between hydration and resumability is by looking at push and pull systems. So in a push system with hydration, you generally eagerly download and execute code to eagerly register the event handlers, just in case of a user interaction. Resumability is a pull-based system. So by default, you do nothing. You wait for a user to trigger an event, and then you lazily create the event handler to process the event. Qwik and Qwik City are solutions that are really driving some interesting innovation in this space. And if you care about looking at resumable hydration, I heavily recommend checking them out. Now, we've got a few more just to wrap us up. React Server Components: it would be an incomplete talk without mentioning these a little bit more. The React team are working on zero-bundle-size React Server Components. These encourage components that only render on the server in React, and they're being advertised as zero-bundle-size components. What does that mean?
Well, it means that you aren't shipping these components in your JavaScript bundle. The rendered templates do make it to the browser, eventually, through a serialized format, but the component-specific data processing that you do after loading some data stays on the server. This means no more having to ship Lodash or Moment to the client, leading to a much smaller bundle size. That's what's being shown here in the Lighthouse treemap: a much smaller bundle, because Moment didn't show up in our client-side bundle. That's a huge win. Now, if you're interested in a more advanced demo, here's a Hacker News app built with Next.js. The version on the left-hand side uses client-side rendering; the one on the right uses React Server Components. The client-side version requires the runtime to be loaded first, and then it fetches data from the client. React Server Components does everything on the server and streams the result. On a slow network, the left shows a normal client-side rendered app; it's a little slower. On the right, we use React Server Components with streaming, which delivers a nicer overall experience. And then we have streaming server-side rendering. Streaming rendering allows you to start streaming components in chunks as soon as they're ready, without risking a slower FCP and time to interactive due to components that take longer to generate on the server. React's built-in renderToNodeStream makes it possible for us to send our application in smaller chunks. The React team are also working on selective hydration. The React team uses partial hydration to mean that different parts of the tree can hydrate independently, and selective hydration means that React will prioritize trees in response to a user interaction. So one way to think of it is that hydration is lazy and only happens in response to user interaction.
And then, as an added feature, React speculatively hydrates in the background whenever the main thread is idle, which it can do because rendering is non-blocking. Normally, React generates a tree on the server using the renderToString method, which gets sent to the client after the whole tree has been generated. The rendered HTML is non-interactive until the JavaScript bundle has been fetched and loaded, after which React walks down the tree to hydrate it and attach those event handlers. And that can lead to some performance issues, unfortunately. With selective hydration, instead of using renderToString, we can now stream-render HTML using the new renderToPipeableStream method. That method, in combination with createRoot and Suspense, makes it possible to start streaming HTML without having to wait for the larger components to be ready. That means you can lazy-load components when using SSR, which wasn't really fully possible before. So our comments component, which earlier slowed down our tree generation and TTI, is now wrapped in Suspense, and selective hydration makes it possible to hydrate the components that were already sent to the client even before the comments component has been sent. If you take nothing else from this section: hydrate, but not too much. Improving performance is a journey, and there are lots of small changes that can lead to big gains. If you've invested in performance before, this might look familiar: "I just fixed the site. Why did you ruin it?" It's not uncommon to see teams fix performance only for it to regress soon after due to feature development. We've pretty much all been there. This is one of the reasons why I encourage folks to invest in performance budgets. Performance budgets keep everybody accountable and on the same page. They enable a culture of shared enthusiasm for improving the lived user experience.
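To make that concrete, here's a hedged sketch. Lighthouse's LightWallet feature accepts a budget.json file with roughly the shape of the object below (verify the exact field names against the Lighthouse documentation), and the small helper shows the kind of check you might wire into CI:

```javascript
// Approximate shape of a Lighthouse budget.json entry (LightWallet);
// resource sizes are in kilobytes, timings in milliseconds.
const budgets = [
  {
    path: '/*',
    resourceSizes: [
      { resourceType: 'script', budget: 150 }, // at most ~150 KB of JS
      { resourceType: 'total', budget: 500 },
    ],
    timings: [{ metric: 'interactive', budget: 5000 }], // TTI under 5 s
  },
];

// A tiny CI-style check in the same spirit: fail the build when the
// measured script weight exceeds the budget.
function withinScriptBudget(budgets, actualScriptKb) {
  const limit = budgets[0].resourceSizes.find(
    (resource) => resource.resourceType === 'script'
  ).budget;
  return actualScriptKb <= limit;
}
```

A check like this, run on every pull request, is what keeps the "I just fixed the site, why did you ruin it?" cycle from repeating.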
Teams with budgets can sometimes also find it easier to track and graph their progress, and I'd encourage you to check budgets out. We have some good tooling for performance budgets in places like Lighthouse now, and lots of good third-party services also support them as a feature. So, wrapping up: please stop taking fast networks, fast CPUs, and plenty of RAM for granted. Fast devices can be slow, fast networks can be slow, and variability makes everything slow. Test on real phones and networks; we're talking about average mobile phones and devices here, and those can be more constrained than the devices in our pockets or on our desktops. Have empathy for your users. And that's it, that's the end of our talk. I hope that you found something in it useful. If you're interested in learning more, there's a free book about patterns that Lydia Hallie and I have written, over on patterns.dev. That's it for me. Thank you.