
Shared Libraries are Bad for You
by Ian Mallett

In recent history, a creeping horror has emerged among those who program. The shared library.

It started with good intentions.

====FIX ONCE====

One argument for shared libraries was that they would make everything work better. Library authors can fix a bug once. Then, everyone can update their libraries to solve the problem.

This ignores the reality of the situation.

J. Random User does not go thinking: "Oh, I wonder if X random library has released a bugfix recently? I'll download the latest distros and patch my system." As exceptions, a select few people, desperate to fix some problem, do try to download new shared libraries, and update them incompetently. This leads to system instability at best, and total mayhem at worst. There are entire sites (e.g. dll-files.com, systweak.com, snapfiles.com, dllme.com, dlldump.com, nodevice.com, dlldll.com) dedicated mostly to helping fix shared-library-related errors, but in the hands of an incompetent user (or if some sites are not entirely charitable), these sites can help one mung everything.

====DEPENDENCE====

Sometimes, applications depend on having a certain version of a certain shared library (libpng writes, for example: "a program that was compiled with an older version of libpng may suddenly find itself using a new version with larger or smaller structs"). This is called ABI incompatibility. If the user tries to get the new version to fix application A, then application B might suddenly break. Since systems have only one shared object for each library (or should; that's why they're called "shared"), you can't use applications A and B at the same time! You have to patch and then unpatch your system constantly. In reality, this does happen. I have encountered a few such cases myself, whereupon, probably like most people, I eventually gave up in disgust. Installers often install shared libraries for you, making the problem even more opaque to the average user. Sometimes these installers helpfully try to update an older shared library that everything depends on, thereby accidentally causing massive, invisible, difficult-to-pinpoint lossage.
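To make the "larger or smaller structs" failure concrete, here is a minimal sketch using Python's ctypes. The struct names and fields (InfoV1, InfoV2, flags, width, height) are hypothetical and are not libpng's real structures; the sketch only shows what happens when code built against one layout reads memory laid out by another.

```python
import ctypes

# Hypothetical struct as laid out by "version 1" of a library.
class InfoV1(ctypes.Structure):
    _fields_ = [
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
    ]

# Hypothetical "version 2": a new field was inserted at the front.
class InfoV2(ctypes.Structure):
    _fields_ = [
        ("flags", ctypes.c_uint32),
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
    ]

def read_as_old_application(new_struct):
    """Reinterpret a V2 struct's bytes using the V1 layout, which is what an
    application compiled against the V1 headers effectively does."""
    raw = bytes(new_struct)[: ctypes.sizeof(InfoV1)]
    return InfoV1.from_buffer_copy(raw)

new = InfoV2(flags=7, width=640, height=480)
old_view = read_as_old_application(new)

print(ctypes.sizeof(InfoV1), ctypes.sizeof(InfoV2))  # 8 12
print(old_view.width, old_view.height)               # 7 640
```

The old application sees a 7x640 image instead of 640x480, and nothing warns it. With static linkage the application and the struct layout it was compiled against always travel together.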

Dependence on a misfeature is regrettable—but if updating a random library breaks something, then the user has no idea what happened—which is worse.

With static linkage, individual application maintainers, when updating their build dependencies for the next release, realize at compile-time that their application depended on a misfeature—and they fix their application, redistributing as necessary. This is the Right Thing, since code changes are localized, and the application developer actually realizes that there is a mistake and fixes things. With shared libraries, the user generally kluges around the issue as best as possible and the developer never finds out their application has a problem.

Not only this, but static linkage allows library developers to radically change their entire API without worrying too much about backwards compatibility. Suppose a library is completely rewritten. All existing applications that statically use the library still work—since they still use the old version internally. Later, an application's developer may update that application to use the library's new API. With dynamic linkage, the system would need different versions of the shared libraries to even have a chance of working—but this leads to the problems detailed below.

This last effect is desperately evident in many "mature" libraries, where a huge percentage of functionality is duplicated for the purpose of compatibility with old software. In the best case, this mixed codebase is disgusting. In the worst case, the old compatibility callbacks are broken and backward compatibility doesn't happen anyway.

With dynamic linkage, getting rid of some function calls in the new version of a shared object might break some utility compiled in 1992. If that utility had been statically linked, then the library's change wouldn't have broken anything. Notice that with static linkage, the Right Thing happens: the old utility continues to work as well as it did before and one has the option of recompiling it to take advantage of new features. Notice, most importantly, that nothing broke unnecessarily.
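The difference can be sketched with a toy model. This is not how a real linker or loader works; a "library" here is just a dictionary of exported names, and every name (LIB_V1, LIB_V2, draw_box, build_static, call_dynamic) is made up for illustration.

```python
# Toy model: a "library" is a dict mapping exported symbol names to functions.
LIB_V1 = {"draw_line": lambda: "line", "draw_box": lambda: "box"}
LIB_V2 = {"draw_line": lambda: "line (faster)"}  # draw_box was removed

def build_static(library, needed):
    """Copy the needed symbols into the 'binary' at build time."""
    return {name: library[name] for name in needed}

def call_dynamic(installed_library, name):
    """Look the symbol up in whatever library is installed at run time."""
    if name not in installed_library:
        raise LookupError(f"unresolved symbol: {name}")
    return installed_library[name]()

old_utility = build_static(LIB_V1, ["draw_box"])  # built long ago
installed = LIB_V2                                # system library upgraded since

print(old_utility["draw_box"]())  # the static copy still works
try:
    call_dynamic(installed, "draw_box")
except LookupError as err:
    print(err)                    # the dynamic lookup breaks
```

The statically built utility keeps its own copy of draw_box, so the upgrade cannot reach it; the dynamic lookup depends on whatever happens to be installed today.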

====MULTIPLE COPIES====

One argument is that shared objects save space. While this is sometimes true and is generally a laudable aim, it is by no means of critical importance today. Disk space is (just about) free—and modern applications usually have one or two orders of magnitude between their data's size and their executable size anyway. A couple megabytes of extra space in a binary doesn't hurt anything compared to the gigabytes of data, documentation, and other resources that ship with larger modern applications.

Generally, this is fine for embedded systems too, since most libraries for these platforms are very small no matter how they are compiled (anyone trying to run a bloated library like Unity on an Arduino needs to reexamine their life). Also note that general-purpose dynamic libraries aren't supported on most, if any, embedded devices anyway.

But shared objects generally don't end up saving space. While system-level shared objects are usually effectively shared (and certainly when writing an operating system, dynamic libraries can be useful and necessary), generic shared libraries (what most software developers actually write) usually are not.

One problem is that developers often install libraries for development and then months or years later forget that they installed any of the shared objects that came with them. When the application is distributed, many users report missing-library errors. The application loses credibility. Some tech-savvy users, accustomed to such problems, install the shared objects themselves (or have done so for other applications in the past) and happily report no problems. Often, the users most closely in touch with the application developers (those most likely to submit a bug report, and those that test the application in beta) are of this type. So, sometimes, neither the developers nor any of their aides are able to catch these issues before an application gets out the door.

To correct for this, application developers often realistically assume that the end user does not have fringe packages' shared objects already on their system—so they defensively bundle the shared objects with their executable, just in case. This fixes the problem—and actually a lot of other ones too, since the hosting OS usually gives local shared objects higher precedence than any installed ones. But, this completely defeats the supposed benefits of shared objects! If the application were statically linked, the new application would be the same size as the old one plus its shared objects (or even a little smaller)—and since these shared objects are "hardcoded", all other benefits are effectively moot too.

Maybe you think that doesn't happen. I poked around a Windows box's "Program Files"/"Program Files (x86)" directories. On Windows, ".dll" files are shared libraries. Keep in mind, these are the programs' installation directories; folders meant only for one application. No sharing. Here's from TortoiseHG:

Notice those Qt .dlls. Those handle the GUI of Tortoise. There's no reason (apart from possibly licensing) they can't be bundled with the main application. Notice the Python .dlls. There's a perfectly functional Python 2.7 distribution already installed separately on this machine. But the Python distribution is mirrored here because the installers weren't smart enough.

Here's from Blender:

Same deal with Python 3.2. Those "msvc" .dlls are Microsoft Visual C runtime libraries. They tell me that this version of Blender was made using the MSVC compiler. Rather than link in the C runtime library, which basically every C program needs, it's also copied here.

Here's Maya:

I find it hard to believe that every single one of those .dlls is used by more than one .exe. I mean, all the modeling and rendering ones (I don't actually know, but probably stuff like "Nurbs.dll", "NurbsUISlide.dll", "cgGL.dll", "HWRender.dll", "ModelUISlice.dll", "DynSlice.dll", "GeometryAlg.dll", etc.) are all specific to maya.exe. For a large application like Maya, this is fairly typical. Perhaps splitting things up this way makes updates easier?

. . . but then we have:

Almost all of these .dlls are tiny, and look how many there are. I can only guess this pathology was intentional.
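Anyone who wants to repeat this kind of poking around can script it. The following is a rough sketch, not a rigorous tool: it walks a directory tree, hashes every file whose name ends in a shared-library suffix, and reports libraries that are stored more than once. The function names and the suffix list are my own choices, and versioned names such as libfoo.so.1 are deliberately not matched.

```python
import hashlib
import os
from collections import defaultdict

LIB_SUFFIXES = (".dll", ".so", ".dylib")

def survey_shared_libraries(root):
    """Map content hash -> list of paths for every shared library under root."""
    copies = defaultdict(list)
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(LIB_SUFFIXES):
                path = os.path.join(folder, name)
                with open(path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
                copies[digest].append(path)
    return copies

def duplicated(copies):
    """Keep only the libraries that are stored more than once."""
    return {digest: paths for digest, paths in copies.items() if len(paths) > 1}
```

Calling duplicated(survey_shared_libraries(r"C:\Program Files")) lists every byte-identical library that several applications each carry privately, which is exactly the sharing that isn't happening.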

A general solution is to make an installer that sorts out the whole mess, but then you need an installer, which, for a simple application, is overkill. And when the installer gets it wrong, it can brick a system by inadvertently changing a shared library the system depends on (though robust OSes try to compensate by keeping two versions, with the one the system itself uses stored in a write-protected system directory; see DEPENDENCE above). Fortunately, this latter problem is rare, but it does happen, especially when the OS doesn't expect such a change.

An additional benefit of static linkage (or even mostly static linkage) is that the resultant distribution has drastically fewer files (ideally down to one). This makes it much more obvious where the actual application is (especially on systems with no file extension convention). As in the examples above, I often see an application distribution with a folder containing twenty shared objects, and the actual executable three quarters of the way down the page. With static linkage, you just have the binary itself; you don't even need a folder!

====NAMING====

Another benefit is that explicitly keeping track of different architectures' shared library versions no longer needs to happen. In general, for example, x86 and x64 shared objects are not mutually compatible (x64 certainly isn't on x86 machines, and x64 libraries usually require some adjusting to be used by applications built for x86 running in emulation on x64 machines). On Windows, many libraries don't name these shared libraries differently, leading to horrible cruft like runtime additions to your path and multiple distributions containing files all named the same (*cough* Qt 5.0 *cough*). But, even if they're named differently, then at best you still double the problem above.
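When two .dlls share a name, the only reliable way to tell which architecture each was built for is to look inside the file. Here is a small sketch that reads the Machine field of a PE header; the function name pe_architecture is my own, and only the two machine values relevant here are named.

```python
import struct

# Machine values from the PE/COFF file header.
MACHINES = {0x014C: "x86", 0x8664: "x64"}

def pe_architecture(data):
    """Return the architecture named in a PE image's header, or None if
    the bytes do not look like a PE image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The 32-bit offset of the PE signature is stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    if len(data) < pe_offset + 6:
        return None
    # The 16-bit Machine field directly follows the signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINES.get(machine, hex(machine))
```

For a file on disk, pass it the file's contents, e.g. pe_architecture(open(path, "rb").read()). That one has to crack open a header just to find out which identically named file is which says something about the naming situation.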

The point is that unless you dump all the different architectures' shared objects into their own local folders, you cannot have multiple builds of a program running on the same system—horrific for developers. With static linkage, you put the libraries inside the binary—and you can have as many different architectures installed simultaneously as your system supports.

I have run into at least one issue where two different dynamic libraries depended on two other different dynamic libraries that happened to have the same name. The libraries couldn't be in the same folder since they have the same name, but at the same time, they had to be in the same folder, since otherwise they wouldn't be found (dynamic libraries need to be named with their original names, or else the applications built on them can't find them). With static linkage, I could have just renamed one and linked them both.

====SECURITY====

There are also a number of security problems inherent in shared objects, known in both Windows and UNIX computing spheres under the collective term "DLL injection". The simplest kind is just replacing any shared object the application uses with an adversarial one. Every single application that is dynamically linked suffers from this vulnerability. Static linkage mostly fixes this issue.
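The replacement attack needs nothing clever. The sketch below is a simplified model of a loader's search, not any real OS's algorithm: it assumes only that directories are tried in order and the first file with the right name wins. The directory names, the file name libfoo.so, and the functions resolve and demo are all hypothetical.

```python
import os
import tempfile

def resolve(name, search_dirs):
    """Return the first file called `name` along the search path, as a
    simplified loader would."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None

def demo():
    with tempfile.TemporaryDirectory() as root:
        app_dir = os.path.join(root, "app")
        system_dir = os.path.join(root, "system")
        os.makedirs(app_dir)
        os.makedirs(system_dir)
        search = [app_dir, system_dir]  # the application folder is tried first

        with open(os.path.join(system_dir, "libfoo.so"), "w") as f:
            f.write("genuine")
        before = resolve("libfoo.so", search)

        # Someone drops a file with the same name into the application folder.
        with open(os.path.join(app_dir, "libfoo.so"), "w") as f:
            f.write("planted")
        after = resolve("libfoo.so", search)

        return (os.path.basename(os.path.dirname(before)),
                os.path.basename(os.path.dirname(after)))

print(demo())  # ('system', 'app')
```

The application itself never changed, yet after one file copy it runs someone else's code. A statically linked binary has no such lookup to subvert.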

Think about it. How much widespread damage might be done by downloading such a shared object from one of the sites listed above? It probably has already happened. With Dependency Walker (a program that shows shared object dependencies), it's clear that many programs use shared objects the developers have probably never even heard of. What would happen if any of the ~100,000 functions in the hundreds of different shared libraries that Adobe Photoshop uses were to be corrupted? Suppose Photoshop issued a new release, but neglected to include just one of the shared libraries it needed? Thousands of users would be searching the Internet and more-or-less randomly downloading shared libraries to try to fix the problem before Adobe even discovered the problem. Suppose (extremely generously) that only one in a thousand of these people downloaded a DLL with seriously malicious code. The PR fallout would still be immense, and the reputation Adobe has built would be critically damaged.

Other vulnerabilities include programs being able to directly attack other programs running in an OS, or override legitimate loaded libraries used by the system. Or corrupt governments spying on you. Static linkage fixes these problems entirely.

====PERFORMANCE====

Using shared objects is generally slower, since there's usually a level of indirection when calling into external modules. I know that you can snap pointers and make this mostly go away, but it's still irksome to deal with in the first place.

A secondary (and more significant) cost most new developers are unaware of is that functions are almost always loaded only as necessary instead of being in-core or immediately ready to page in. So, you get a fairly substantial runtime hit the first time you call any function in a shared library—the OS needs to search your path, find a shared object with a matching symbol table from among thousands, and then load the function into an executable segment. There's caching, of course, but it obviously hurts much more than having the function there in the first place. Unless it's a widely-used system library, it won't be.

Well-designed libraries minimize their APIs (usually because that's good anyway), but in the pathological case most of the runtime overhead of a simple batch utility might be in loading its shared libraries. This might not be a problem until you want to run your batch utility in a loop (say in a shell script). The OS is constantly loading the application without its shared objects, then loading functions 1 through n, one at a time, then exiting and unloading everything again.
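A back-of-the-envelope model shows how quickly this adds up. It is deliberately crude: it assumes every run pays the full load cost and ignores any caching the OS does. The numbers are illustrative, not measurements, and the function names are my own.

```python
def loading_fraction(work_ms, libraries, load_ms_per_library):
    """Fraction of one run spent loading shared libraries, assuming every
    run pays the full load cost."""
    load_ms = libraries * load_ms_per_library
    return load_ms / (load_ms + work_ms)

def total_seconds(runs, work_ms, libraries, load_ms_per_library):
    """Wall time for `runs` back-to-back invocations under the same model."""
    per_run_ms = work_ms + libraries * load_ms_per_library
    return runs * per_run_ms / 1000.0

# Illustrative numbers only: 2 ms of real work, 20 libraries at 1 ms each,
# invoked 10,000 times from a shell loop.
print(loading_fraction(2, 20, 1))        # roughly 0.91
print(total_seconds(10_000, 2, 20, 1))   # 220.0
print(total_seconds(10_000, 2, 0, 1))    # 20.0 with nothing to load
```

Under these assumed numbers the loop spends about nine tenths of its time loading libraries rather than doing work; real figures depend heavily on the OS and its caches.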

I mention this latter effect because it comes up more frequently than I'm comfortable with, even though good OSes take steps to try to counteract the issue. Generous (and by no means rigorous) estimates of the massive build times for the popular library Qt (which makes heavy use of such batch files) suggest that, after disk IO, most of the time is spent thrashing shared libraries' functions into and out of core.

====LEGALITY AND CONCLUSION====

There is a part of the LGPL (Version 3, section 4) that states, effectively, that users of an application must be able to replace the library inside it with a modified version. The only practical way to do this (especially for proprietary software) is through shared objects.

There is a massive problem with this. It forces all of the above problems onto every application that uses any library under this license. My heart lies with the free software movement, but, put frankly, the LGPL (and the corresponding section of the GPL) needs to be fixed.

I will not be licensing any of my newer low-level projects under the LGPL or the GPL. Most will be free of course, but they will be released under different terms. This means that I am forced to remove all LGPL and GPL libraries from my sources. This pains me deeply.

However, the plain fact is that shared objects are too great a problem to be ignored any longer. They must be eliminated where unnecessary or at least completely reinvented. They are not tractable in their current incarnation. They are at best a threat to all software's integrity, a security liability, and a massive inconvenience. At worst, they break software, propagate code bloat, horrify developers, and restrict intellectual freedom. They need to die.


Ian Mallett - 2018 - Creative Commons License