TABs vs. Spaces

by Ian Mallett

Perhaps the classic holy war in computing is whether to use TABs in your source code. Most people don't really care anymore, and do whatever everyone else is doing (consistency in group projects is important). While the general industry trend is toward spaces, I want to come down strongly for TABs: you should use TABs to indent and spaces to separate and align.

If you have particular objections, I address them in the following FAQ. Hold on until then (or skip to there; you do you). For now, though, why should you do this?

The most obvious reason is that it's semantically correct. TABs were literally invented for indentation. Spaces are for spacing things. You wouldn't use a TAB between words in a sentence, because that's the wrong character. Similarly, you shouldn't use spaces to indent your code, because that's the wrong character.

One major advantage of this is that it allows you to adjust the indentation width in your editor for all the code, at-once, on-the-fly. Some people prefer a width of four characters. Some people, less-correctly, prefer two or eight. Some people like to watch the world burn, and use three, or other peculiar widths. It doesn't matter—they can configure their TAB display widths and everything will come through as they please. This is a great boon for collaboration!
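
As a rough sketch of what "configure their TAB display widths" means in practice, Python's built-in `str.expandtabs` can stand in for an editor's rendering. The `SOURCE` snippet and the `render` helper are made up for illustration; the point is only that one TAB-indented file displays at whatever width each reader picks.

```python
# Hypothetical snippet: one TAB per indentation level.
SOURCE = "if ready:\n\tfor item in queue:\n\t\tprocess(item)\n"

def render(text: str, tab_width: int) -> str:
    """Show text roughly as an editor with this TAB display width would."""
    return text.expandtabs(tab_width)

# The same bytes, viewed under three different preferences.
for width in (2, 4, 8):
    print(f"--- TAB width {width} ---")
    print(render(SOURCE, width), end="")
```

The file on disk never changes; only the view does.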

There are a few other benefits you get for free:

  • It is much harder to accidentally misindent code. With spaces, a one-space error can go unnoticed. With TABs, since most people choose a display width greater than one, a misindented line is off by a full TAB width, which makes the mistake immediately obvious.
  • Fewer keypresses—to input, delete, or skip over.
  • Typical source files become significantly smaller. This can be hugely significant on the web, where e.g. HTML code is downloaded to display the page (albeit minification is a thing).
  • Similarly, in purely interpreted languages, there are fewer characters to read and tokenize, so the code can load slightly faster.
  • Similarly, in compiled languages, a smaller file is faster to parse and therefore to compile.
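
The file-size point can be put in numbers with a small sketch. The `depths` list below is a hypothetical ten-line file described only by the nesting depth of each line, and `indent_bytes` is an illustrative helper, not a measurement of any real codebase.

```python
def indent_bytes(depths: list, unit: str) -> int:
    """Bytes spent on leading indentation when each level is written as `unit`."""
    return sum(len((unit * depth).encode("utf-8")) for depth in depths)

# Hypothetical file: nesting depth of each of its ten lines.
depths = [0, 1, 2, 2, 3, 3, 2, 1, 1, 0]

with_tabs = indent_bytes(depths, "\t")       # one byte per level
with_spaces = indent_bytes(depths, "    ")   # four bytes per level
print(with_tabs, with_spaces)  # 15 60
```

How much this matters for a whole file depends on how much of it is indentation, and transport compression narrows the gap, so treat the ratio as an upper bound on the saving rather than a typical figure.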

But the main thing I want you to take away, regardless of whether you ultimately agree or disagree with my suggestion to use TABs, is the indisputable fact: spaces are semantically incorrect for indentation. You are free to continue using them, with any justification you please, just so long as you remember this.

Now, I'll address some archetypal arguments from the other side. I intend to be exhaustive; if you have additional objections, please send them to me.

Why should I expend the effort to switch over?

So that you're not using the semantically wrong indentation character and can reap the additional benefits listed above to boot.

TABs have different widths on different platforms! My code won't look the same! [1]

Yes, and that is a good thing. People have different display preferences [9]. They may even require, in an accessibility sense, a different indentation width to make sense of your code.

It doesn't work! My code becomes misaligned.

Then you're doing it wrong. A typical example (writing "." for a space and "→" for a TAB) runs like:

....int function(int arg0,
.................int arg1);

with the result supposedly looking incorrect if TABs are used. But remember: TABs to indent, spaces to align and space. This example needs a mixture of both. The correct way would be:

→   int function(int arg0,
→   .............int arg1);
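
To check that this layout really survives any TAB width, here is a small sketch that again uses `str.expandtabs` as a stand-in for an editor. `DECLARATION` reproduces the corrected example above, and `argument_columns` is a hypothetical helper that reports where each `int arg` starts.

```python
# "\t" is the indentation; the 13 spaces on the second line are alignment
# (13 is the length of "int function(").
DECLARATION = "\tint function(int arg0,\n\t" + " " * 13 + "int arg1);\n"

def argument_columns(text: str, tab_width: int) -> list:
    """Column at which each line's `int arg` starts for a given TAB width."""
    lines = text.expandtabs(tab_width).splitlines()
    return [line.index("int arg") for line in lines]

for width in (2, 4, 8):
    first, second = argument_columns(DECLARATION, width)
    print(width, first, second, first == second)
```

Both arguments shift together as the width changes, because every line carries the same number of leading TABs and the alignment itself is made only of spaces.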

That doesn't work because TABs aren't just used at the beginning of lines! [1]

Again, use TABs to indent, spaces to align and space. If you erroneously use TABs to do aligned spacing, then sure, you'll get problems. This comes up as a strawman argument a lot, so let me repeat: do not use TABs to do alignment or spacing.

In any case, trailing comments are the main use-case for long runs of TABs or spaces in the middle of a line, and putting comments on the same line as code leads to very long lines and messier code, so you shouldn't be doing that anyway.

→   // Incorrect usage of TABs for spacing/alignment.  Unacceptable!
→   int function(int arg0,→ →   →   // This is a multiline comment.
→   .............int arg1);→→   →   // Messy and long lines.

→   // Correct usage of spaces for spacing/alignment.  Acceptable.
→   int function(int arg0,..........// This is a multiline comment.
→   .............int arg1);.........// Messy and long lines.

→   // Better commenting and layout.  My preferred solution.
→   // This is now a single-line comment.  Simple, short lines.
→   int function(
→   →   int arg0,
→   →   int arg1
→   );
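
The rule in this answer (TABs only as leading indentation, never later in the line) is mechanical enough to check automatically. What follows is a minimal sketch of such a check, not an existing linter; the names `WELL_FORMED` and `misplaced_tabs` are invented here. It cannot tell whether leading spaces are indentation or alignment, since that needs knowledge of the language.

```python
import re

# Zero or more leading TABs (indentation), then text with no further TAB.
WELL_FORMED = re.compile(r"\t*[^\t]*")

def misplaced_tabs(text: str) -> list:
    """Return 1-based numbers of lines that use a TAB after the indentation."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if not WELL_FORMED.fullmatch(line)
    ]

good = "\tint function(int arg0,\n\t             int arg1);\n"
bad = "\tint function(int arg0,\t\t// comment\n"
print(misplaced_tabs(good))  # []
print(misplaced_tabs(bad))   # [1]
```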

TABs are semantically for tabulating, not indentation! [4]

This is incorrect. TABs are semantically for both, and in fact their primary meaning today is indentation, because tabulation via embedded text is largely obsolete as a typographical concept.

The very first commercial typewriters had a bewildering variety of keyboard layouts, and generally only one spacing key. The TAB key was patented in the year 1900 with the purpose of moving the carriage to the next TAB Stop (i.e. column), thus separating the task of spacing from indentation and tabulation, and resolving all three problems at once. Therefore, TAB was invented for the purpose of indentation and for tabulation, and this purpose has remained unchanged since the literal Victorian age.

By the time the TAB key was introduced for computers (sometime around 1980, by my research), the primitive software of the time didn't support anything for it but indentation, which is anyway a far more common task than separating table columns (even more so today, since tabulation is typically handled by features of the environment, not its text). ASCII (adopted by ISO 646, which punts to ECMA 48 on the matter) states (§8.3.60 [12]) "HT indicates the beginning of a string of text which is to be positioned within a line", alongside language about the underlying datastream implying that this offsetting can be generalized for tabulation. So this is true for computers as well, and is embedded in our most fundamental standards.

There is a good summary of software interpretations at [2], of their configuration at [8], and at least one proposed replacement at [3], but the general concept of "move over to the next column somehow" seems clear, and its natural modern application is indentation. (It could also apply to some kinds of alignment, though we reject that use because of the problem discussed in the previous section.)

In any case, it should be clear that spaces aren't semantically for either purpose.

Mixing TABs and spaces is dirty!

Why? Good code is both indented and spaced. If you think the combination is dirty, then either don't indent your code (supported in many languages) or don't separate your words (not supported by most languages), but I think it's obvious the result will be dirtier. What's dirty is deliberately using the wrong character for the task, and somehow thinking it a virtue.

Using TABs is difficult! Mixing TABs and spaces is difficult!

Please what? It's easier. You hit the indent key to indent. You hit the space key to space. No more reformatting your coworkers' files just so you can stand to look at them. No more counting spaces to check for misindentation. No more pounding the spacebar 32 times for each line (or binding the TAB key to lie to you to partially automate that process).

If one seriously cannot separate whitespace into trivial semantic categories (understanding the difference between spacing, indentation, line breaks, etc.), how can one possibly expect to deal with actually-difficult semantics, like the actual programming you're trying to do in the first place?

TABs make the source too wide to read [7]! My lines are wrapping!

So set their display width smaller. Problem solved. By the way, with spaces, you'd have been stuck with it.

It makes diffs harder because the first character is a "+" or "-" in a diff!

That would be a problem with the tool, which would be failing to make the correct semantic separation between an annotation for a line and the line itself (and even then, I think the result would look better with TABs anyway). In any case, the complaint is unsubstantiated, since the problem does not occur in said tool, and there are better tools for computing and presenting differences anyway.

Emacs doesn't understand TABs!

If that were true (it isn't), then I'd get a better editor. I fully expect every editor to be competent at editing UTF-8 text at a minimum, and frankly ASCII, of which TAB is an integral part, is so fundamental to modern computing that not supporting it is unthinkable.

I do not know of any text editor people actually use that doesn't support TABs, although I know a scant few that support it incorrectly. Certainly all the major text editors support it just fine, from vi to Visual Studio. Maybe this complaint had substance 30 years ago, but it doesn't today.

I can imagine tools that load space-indented files as TABs and save them back as spaces. So, let's all use spaces, but you can still be happy?

Leave aside the question of whether such tools exist, are integrated with major editors, and are reliable (all three are outstanding questions, with answers generally trending toward "no"). From a technical perspective, converting TABs to spaces is far easier than converting spaces to TABs: it doesn't require any parsing or understanding of the source language! In other words, we might do this the other way round: everyone uses TABs, and by presenting TABs as spaces in editors, the space-indent users might be happy. (Though we shouldn't do that either, for all the other reasons in this article.)
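
To illustrate how little the easy direction needs, here is a minimal sketch of a TAB-to-space converter. The function name `tabs_to_spaces` and the default width are assumptions for the example; it handles only leading TABs, which is all there is to convert if TABs are used for indentation alone.

```python
def tabs_to_spaces(text: str, width: int = 4) -> str:
    """Replace each leading TAB with `width` spaces; leave the rest untouched."""
    converted = []
    for line in text.splitlines(keepends=True):
        body = line.lstrip("\t")          # everything after the indentation
        depth = len(line) - len(body)     # number of leading TABs
        converted.append(" " * (width * depth) + body)
    return "".join(converted)

print(tabs_to_spaces("\tint f(\n\t\tint a\n\t);\n", width=2), end="")
```

No knowledge of the language is involved. The reverse conversion has to decide which leading spaces are indentation and which are alignment, and that generally cannot be read off the whitespace alone.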

With spaces, I can do half-sized indents, like for `public:` in a class definition! [5]

That is, obviously, an abomination.

Famous people and popular languages prefer spaces! [6]

And there are famous people and popular languages that don't.

Regardless, this is a mere argument from authority. Besides which, the authorities disagree with each other! For example, Linus Torvalds [10] vehemently wants 8-character indentation and Guido van Rossum [11] wants 4-character indentation. With TABs and a configurable display width, they could collaborate while each seeing their preferred indentation width.

More reading:

Note: The sources cited are not necessarily primary sources, nor even particularly significant ones. They were selected merely as instances of the topic at hand that I can point to.

[1] https://adamspiers.org/computing/why_no_tabs.html
[2] https://www.jwz.org/doc/tabs-vs-spaces.html
[3] http://nickgravgaard.com/elastic-tabstops/
[4] https://softwareengineering.stackexchange.com/questions/57/tabs-versus-spaces-what-is-the-proper-indentation-character-for-everything-in-e#comment324_72
[5] https://softwareengineering.stackexchange.com/questions/57/tabs-versus-spaces-what-is-the-proper-indentation-character-for-everything-in-e#comment402187_657
[6] https://softwareengineering.stackexchange.com/a/165/128588
[7] https://web.archive.org/web/20080820220453/http://www.movementarian.org/docs/whytabs/
[8] https://web.archive.org/web/20080617053639/http://rikkus.info:80/indentation.html
[9] https://web.archive.org/web/20080726022036/http://www.wiggy.net:80/rants/tabsvsspaces.xhtml
[10] https://adamspiers.org/computing/Linus-Kernel-CodingStyle
[11] https://www.python.org/dev/peps/pep-0008/#indentation
[12] https://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf

Ian Mallett, 2021. Creative Commons License.