Untangling Package Management in JavaScript Applications

If a JavaScript developer were frozen in 2005 and miraculously thawed in our present world of 2017, the thing that would likely amaze them most is the massive proliferation of JavaScript packages. The video below gives us a fascinating visual representation of the package explosion over time.

The JavaScript ecosystem today consists of packages for nearly every need, from large framework libraries, to small functional packages that perform niche tasks. These bundles of componentized code have been instrumental in the evolution of JavaScript as a powerful and popular programming language. With the growth in packages, developers have also seen increased need for performant, reliable package managers to install and manage the multitude of dependencies.

The first post in our series discussed the JavaScript revolution and outlined three of the core components of a modern front-end development stack: package management, application bundling, and language specification. In this post, we’ll talk about where we started with package managers, how they’ve evolved, and why Kenzan recommends Yarn as a best bet for scaled applications in the continuing evolution of package management.

How It All Began

As package managers started to take root, and people moved away from downloading packages and including all their files manually, npm (version 2) and Bower both emerged as frontrunners.

Back then, npm exclusively handled Node packages. It was uncommon to house packages with front-end assets like HTML and CSS in the early versions of npm. Bower, on the other hand, was built specifically to handle client-side packages with HTML, CSS, and JS assets. It was an invaluable tool for early front-end projects. Bower had its own registry of front-end packages and delivered a flattened dependency tree, leaving the user to decide which version of a package to keep when conflicts arose. It set a precedent for the package management function and started the JavaScript world down the road of package proliferation.

As our applications grew and the number of libraries increased, this foundation began to crack. Version mismatches became harder to handle. Build processes with Bower required wiredep and quite a bit of configuration. And, most importantly, CommonJS and other module formats didn’t play nicely with Bower packages. Module bundlers like webpack had a difficult time handling the concatenated and bundled format of Bower packages, and occasionally couldn’t modularize them at all. This became a prominent issue as a more modular JavaScript took root.

Amidst these problems, npm launched version 3. It offered a flattened dependency tree with module nesting on conflicts, CommonJS module support, and a single ecosystem for both front-end and Node packages. This was overall a big success for npm, and it was widely adopted in the JavaScript community. However, this system soon began to show flaws in projects. First, npm versions 3 and 4 were non-deterministic, meaning that modules were not always installed in the same order or nesting pattern. This caused notorious “Works on my machine” bugs for developers as node modules began to drift. A second issue was caching. The npm cache was unreliable, and corrupted items could be removed and not replaced. This meant that developers could not rely on previously cached items being available for future installs, and offline installs were out of the question.
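To make the drift problem concrete, here is a minimal sketch of a flattened layout with nesting on conflicts. It is an illustrative model only, not npm's actual installer: the layout function, the package names (pkg-a, pkg-b, shared-lib), and the version numbers are all hypothetical. The assumption is simply that whichever version is processed first claims the top-level slot, and any later conflicting version is nested under the package that required it.

```javascript
// Illustrative model only; this is NOT npm's real install algorithm.
// Each dependency is { name, version, requiredBy } (hypothetical shape).
function layout(installOrder) {
  const topLevel = {}; // name -> version placed directly in node_modules/
  const nested = [];   // conflicting copies nested under their requester
  for (const dep of installOrder) {
    if (!(dep.name in topLevel)) {
      // The first version seen claims the top-level slot.
      topLevel[dep.name] = dep.version;
    } else if (topLevel[dep.name] !== dep.version) {
      // A different version is already hoisted, so nest this one.
      nested.push(`${dep.requiredBy}/node_modules/${dep.name}@${dep.version}`);
    }
  }
  return { topLevel, nested };
}

// Two hypothetical packages that need different versions of shared-lib.
const needsV1 = { name: "shared-lib", version: "1.0.0", requiredBy: "pkg-a" };
const needsV2 = { name: "shared-lib", version: "2.0.0", requiredBy: "pkg-b" };

const first = layout([needsV1, needsV2]);
const second = layout([needsV2, needsV1]);

console.log(first.topLevel, first.nested);
console.log(second.topLevel, second.nested);
```

The same two dependencies produce two different trees depending only on processing order, which is the kind of difference that can surface as a “Works on my machine” bug.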

We also saw new package managers like JSPM begin to appear. JSPM (short for JavaScript Package Manager) came out as a tool alongside SystemJS. The two packages attempted to handle JS module loading with a clean integration into package management. JSPM worked with the npm registry, GitHub, and private registries to install dependencies, and then updated the SystemJS configuration to map to the new module so it could be imported across files. Although package management and module loading shared some common concerns, they didn’t seem so in sync that they required explicit pairing. Additionally, for two tools built to work together, the configuration was very challenging. As webpack gained popularity, JSPM began to lose support. The final nail in the coffin seemed to come when SystemJS was replaced by webpack in the Angular CLI.

And Then Came Yarn

Yarn was introduced in late 2016 by Facebook to address some of the common complaints about npm mentioned above. Yarn takes influence from npm, leveraging the huge npm registry and the package.json file, to make the application scaling process more repeatable and pain-free. Yarn uses a custom lockfile and install algorithm to ensure a deterministic install across all users with the same version of Yarn. This ensures that all developers have identical node modules, no matter how they choose to install them. No more package drift, and no more hidden bugs! It also means consistency between developers and CI environments.
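The idea behind a lockfile can be sketched in a few lines. This is a simplified model under stated assumptions, not Yarn's implementation: the resolve function, the registry snapshots, and the package name shared-lib are hypothetical. The assumption is that the lockfile records, for each requested name and version range, the exact version that was installed, and that later installs prefer that record over whatever the registry currently offers.

```javascript
// Simplified model of lockfile-driven resolution; not Yarn's real code.
// requests:       array of "name@range" strings from package.json
// latestMatching: hypothetical registry snapshot, "name@range" -> newest match
// lockfile:       previously recorded "name@range" -> exact version
function resolve(requests, latestMatching, lockfile = {}) {
  const resolved = {};
  // Sort the keys so the result never depends on the order of requests.
  for (const key of [...requests].sort()) {
    // Prefer the pinned version; ask the registry only for new entries.
    resolved[key] = key in lockfile ? lockfile[key] : latestMatching[key];
  }
  return resolved;
}

const requests = ["shared-lib@^1.0.0"];
const registryMonday = { "shared-lib@^1.0.0": "1.0.0" };
const registryFriday = { "shared-lib@^1.0.0": "1.2.0" }; // a new release landed

// The first install on Monday produces the lockfile.
const lock = resolve(requests, registryMonday);

// Without the lockfile, a Friday install drifts to the new release.
const driftedInstall = resolve(requests, registryFriday);
// With the lockfile, a Friday install matches Monday's exactly.
const lockedInstall = resolve(requests, registryFriday, lock);

console.log(driftedInstall, lockedInstall);
```

Committing the lockfile alongside package.json is what lets every developer and CI machine land on the same versions.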

Further, Yarn introduces a caching mechanism for node modules. With a warm cache, Kenzan has seen a huge decrease in install times, with speeds up to 4x faster. Faster installs translate to faster builds and faster development. The cache also allows for sandbox installs, that is, installs without Internet access. This feature is increasingly important for enterprise applications because it helps prevent malicious content from being injected during the install. The packages are cached and inspected once, and all further installs then come from that inspected cache.

At Kenzan we have largely adopted Yarn over npm to take advantage of the build consistency, decreased install times, and sandbox installs for our clients. With the change we have seen fewer issues with our CI/CD processes and faster install times overall. For our clients, this means less money spent and fewer bugs sneaking into downstream environments where they don’t belong.

This is not to say that Yarn is without flaws. It has shown a couple of issues, including limited support for private npm packages, and some issues with pulling packages from GitHub. If these become important issues for one’s project, Yarn may not be the right choice just yet. However, the package sustains a high level of development support, and it appears that bug fixes are pushed through the PR process faster than with npm. We anticipate Yarn will be able to handle all the situations that npm can handle soon enough.

And Then npm Again!

In spring 2017, npm released version 5, which brings the package manager onto competitive ground with Yarn. This raises an interesting question for all the developers that have hopped on board with Yarn: do we change back or stay the course?

To make an educated choice, we felt we needed to run some tests and do our own research. Is npm v5 really as fast as they say? How does it compare to Yarn when it comes to the cache, integration, and scalability?

First, let’s start with speed. We conducted an install speed test using the create-react-app project. We compared npm v4, Yarn, npm v5, and then the last two options with a warm cache. For each item, we ran 10 install tests to normalize for varying network conditions. Here is what we found.

[Figure: install speed comparison for npm v4, Yarn, npm v5, Yarn with a warm cache, and npm v5 with a warm cache]

npm v5 was fast. Almost 4x faster than npm v4! It was even faster than Yarn on a cold install, but only by a smidgen. The real standout in terms of speed was Yarn with a warm cache, which was roughly 2x faster than the next quickest option. The conclusion of this test was that, yes, npm v5 exhibited strong performance benefits over previous versions, but was still slightly lacking on cache install speed when compared to Yarn.
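For readers who want to repeat this kind of comparison, the repeated-run approach described above could be scripted along the following lines. This is a rough sketch, not the script Kenzan used: the timeRuns helper is hypothetical, and the runner passed to it is whatever install step you want to measure.

```javascript
// Hypothetical helper: time a synchronous task several times and average.
function timeRuns(run, iterations = 10) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint();
    run();
    // Convert elapsed nanoseconds to milliseconds.
    samples.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
  const mean = samples.reduce((sum, ms) => sum + ms, 0) / samples.length;
  return { samples, mean };
}

// Stand-in workload so the sketch runs without touching the network.
const result = timeRuns(() => JSON.stringify(new Array(1000).fill("x")), 10);
console.log(`mean over ${result.samples.length} runs: ${result.mean.toFixed(3)} ms`);
```

To measure a real install, the runner would call something like execSync from Node's child_process module with the install command of interest, removing node_modules (and clearing the cache for a cold run) before each iteration so that every sample starts from the same state.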

Next we looked at caching mechanisms. A big pain point of previous npm versions was an inconsistent cache. On this front, they seem to have caught up. The cache is now self-healing, so when corrupted data is removed it is automatically reinstalled fresh, and installs are retried on failures. This means that cached items should be available when you need them, similar to Yarn.
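A self-healing cache can be pictured as a read-through cache that verifies content before trusting it. The sketch below is an assumption-laden illustration rather than npm's cache code: readThrough, digest, fetchFresh, and the package name hypothetical-pkg are all made up for the example, and an in-memory Map stands in for the on-disk cache.

```javascript
const crypto = require("crypto");

// Hash the content so a cached copy can be checked against an expected value.
function digest(data) {
  return crypto.createHash("sha512").update(data).digest("base64");
}

// cache:      Map of package key -> { data } (stand-in for an on-disk cache)
// fetchFresh: hypothetical function standing in for a registry download
function readThrough(cache, key, expectedIntegrity, fetchFresh) {
  const entry = cache.get(key);
  if (entry && digest(entry.data) === expectedIntegrity) {
    return entry.data; // cache hit with verified content
  }
  // Missing or corrupted: drop the bad entry and refill it from the source.
  cache.delete(key);
  const data = fetchFresh(key);
  if (digest(data) !== expectedIntegrity) {
    throw new Error(`integrity check failed for ${key}`);
  }
  cache.set(key, { data });
  return data;
}

const tarball = "contents of hypothetical-pkg-1.0.0";
const integrity = digest(tarball);
let downloads = 0;
const fetchFresh = () => {
  downloads += 1;
  return tarball;
};

// Start with a corrupted cache entry.
const cache = new Map([["hypothetical-pkg@1.0.0", { data: "corrupted bytes" }]]);
const healed = readThrough(cache, "hypothetical-pkg@1.0.0", integrity, fetchFresh);
const again = readThrough(cache, "hypothetical-pkg@1.0.0", integrity, fetchFresh);
console.log(healed === tarball, again === tarball, downloads);
```

The corrupted entry triggers exactly one fresh download, and the second read is served from the repaired cache.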

On usability and integration, npm has a small leg up. The new npm lockfile contains a deterministic list of all packages, with root-level packages raised to the top of the tree so a full install can be completed with just that file. For a deterministic install, Yarn requires both yarn.lock and package.json. Also, npm integrates seamlessly with private registries and package publishing since they are created by a single force, the npm team.

Finally, we looked into scaling. At this moment, the two package managers have similar capabilities when it comes to scaling for full-size enterprise applications. However, Yarn was built and continues to progress as a package manager designed for large-scale applications. It prioritizes issues of scale and security over advanced registry features (although those are still addressed). It is likely that Yarn will lead when it comes to improvements for large-scale applications, although this is only our best guess.

Where We’ve Landed

In the introduction to this blog series, we stated that all projects should only need one package manager. In our experience, we’ve found that consistency in tooling across machines helps to create reproducible builds and fewer bugs. But which one you choose should take into account the kinds of projects your organization takes on.

With the new release of version 5, npm may be a good choice. It provides similar speeds for smaller projects, and it is fully deterministic for consistency across machines. It has fewer bugs with private npm packages and GitHub repositories, and maintains an adequate level of development for bug fixes and new features. In our opinion, it is suited to smaller applications with less room for scale.

However, at Kenzan our focus is on enterprise applications with a maximum ability to scale. Within these constraints Yarn has proven to be a pragmatic choice. It provides us with speed for quick builds, determinism for package consistency, and an active development network for new and improved features. No matter how many developers and environments exist in an application, we can be sure it will scale with Yarn. For these reasons, we have chosen Yarn for our single package manager and invite you to give it a try as well.

Stay tuned for our next blog post, where we take our front-end stack one step further by exploring tools for bundling modules.

Kenzan is a software engineering and full service consulting firm that provides customized, end-to-end solutions that drive change through digital transformation. Combining leadership with technical expertise, Kenzan works with partners and clients to craft solutions that leverage cutting-edge technology, from ideation to development and delivery. Specializing in application and platform development, architecture consulting, and digital transformation, Kenzan empowers companies to put technology first.

webpack is a trademark of the JS Foundation.