~/imallett (Ian Mallett)

TABs vs. Spaces

by Ian Mallett

Perhaps the classic holy war in computing is whether to use TABs in your source code. Most people don't really care anymore, and do whatever everyone else is doing (consistency in group projects is important). While the general industry trend is toward spaces, I want to come down strongly for TABs: use TABs to indent and spaces to separate and align.

Why should you do this? The most obvious reason is that it's semantically correct. TABs were literally invented for indentation. Spaces are for spacing things. You wouldn't use a TAB between words in a sentence, because that's the wrong character. Similarly, you shouldn't use spaces to indent your code, because that's the wrong character.

One major advantage of this is that it allows you to adjust the indentation width in your editor for all the code at once, on the fly. Some people prefer a width of four characters. Some people, less correctly, prefer two or eight. Some people like to watch the world burn, and use three, or other peculiar widths. It doesn't matter: each of them can configure their own TAB display width, and the code will come through as they please. This is a great boon for collaboration!
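To make this concrete, here is a minimal sketch in Python. It uses the built-in `str.expandtabs` to imitate an editor's TAB display width; the `SOURCE` snippet and the `render` helper are made up for illustration and are not taken from any real editor.

```python
# Hypothetical TAB-indented snippet: one TAB per nesting level.
SOURCE = "if ready:\n\tfor item in queue:\n\t\tprocess(item)\n"


def render(text: str, tab_width: int) -> str:
    """Return the text as an editor with the given TAB display width would show it."""
    return text.expandtabs(tab_width)


# The same bytes, viewed by three people with three different preferences.
for width in (2, 4, 8):
    print(f"--- TAB width {width} ---")
    print(render(SOURCE, width), end="")
```

The file itself never changes; only each reader's view of it does.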

There are a few other benefits you get for free:

  • Typical source files become significantly smaller, since each level of indentation costs one byte instead of (say) four. This can matter a great deal on the web, where e.g. HTML code is downloaded to display the page. And purely interpreted languages have less text to read before they can run.
  • In compiled languages, a smaller file means less text to parse, and therefore to compile.
  • It is much harder to accidentally misindent code. With spaces, a one-space error can go unnoticed. With TABs, given that most people choose a TAB width greater than one, a misindented line is off by the full TAB width, which makes the mistake immediately obvious.
  • Fewer keypresses to input, delete, or skip over indentation.
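As a rough illustration of the size claim, the sketch below counts the bytes spent on indentation in a hypothetical file. The nesting depths and the four-space unit are assumptions picked for the example; the real saving depends on how deeply your code nests and on what width the space-indenters chose.

```python
# Hypothetical nesting depth of each line in a small source file.
DEPTHS = [0, 1, 2, 2, 3, 1, 0]


def indentation_bytes(depths, unit):
    """Bytes of leading whitespace when every nesting level is written as `unit`."""
    return sum(depth * len(unit.encode("ascii")) for depth in depths)


tab_bytes = indentation_bytes(DEPTHS, "\t")
space_bytes = indentation_bytes(DEPTHS, " " * 4)
print(f"TABs: {tab_bytes} bytes, four spaces: {space_bytes} bytes")
```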

But the main thing I want you to take away, regardless of whether you ultimately agree or disagree with my suggestion to use TABs, is the indisputable fact: spaces are semantically incorrect for indentation. You are free to continue using them, with any justification you please, just so long as you remember this.

I'll address some archetypal arguments from the other side. I intend to be exhaustive; if you have more objections, send them to me.

Why should I expend the effort to switch over?

So that you're not using the semantically wrong indentation character and can reap the additional benefits listed above to boot.

TABs have different widths on different platforms! My code won't look the same! [1]

Yes, and that is a good thing. People have different display preferences. [9]

It doesn't work! My code becomes misaligned.

Then you're doing it wrong. A typical example runs like this (here "." marks a space and "→" marks a TAB):

....int function(int arg0,
.................int arg1);

with the result supposedly looking incorrect if TABs are used. But, remember: TABs to indent, spaces to align and space. There is a mixture of both here. The correct way would be:

→   int function(int arg0,
→   .............int arg1);
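You can check this yourself. The following sketch (reading "→" as a TAB and "." as a space) renders the correct version at several TAB widths and reports the column at which each `int arg…` starts. For contrast, it does the same for a version where the alignment was also done with TABs. The names `MIXED`, `ALL_TABS`, and `arg_columns` are mine, invented for the example.

```python
# One TAB to indent, then 13 spaces to align under the opening parenthesis.
MIXED = [
    "\tint function(int arg0,",
    "\t" + " " * 13 + "int arg1);",
]

# The misuse: 17 columns of whitespace squashed into TABs at an assumed width of 4.
ALL_TABS = [
    "\tint function(int arg0,",
    "\t\t\t\t int arg1);",
]


def arg_columns(lines, tab_width):
    """Column at which 'int arg' starts on each line, for a given TAB display width."""
    return [line.expandtabs(tab_width).index("int arg") for line in lines]


for width in (2, 4, 8):
    print(width, "mixed:", arg_columns(MIXED, width), "all TABs:", arg_columns(ALL_TABS, width))
```

With TABs for indentation and spaces for alignment, the two columns agree at every width. With TABs used for alignment, they agree only at the width the author happened to have configured.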

That doesn't work because TABs aren't just used at the beginning of lines! [1]

Again, use TABs to indent, spaces to align and space. If you erroneously use TABs to do aligned spacing, then sure, you'll get problems. This comes up as a strawman argument a lot, so let me repeat: do not use TABs to do alignment or spacing. In any case, the main use-case for long runs of TABs or spaces in the middle of a line is putting comments on the same line as code. That leads to very long lines and messier code, so you shouldn't do it anyway.

→   // Incorrect usage of TABs for spacing/alignment.  Unacceptable!
→   int function(int arg0,→ →   →   // This is a multiline comment.
→   .............int arg1);→→   →   // Messy and long lines.

→   // Correct usage of spaces for spacing/alignment.  Acceptable.
→   int function(int arg0,..........// This is a multiline comment.
→   .............int arg1);.........// Messy and long lines.

→   // Better commenting and layout.  My preferred solution.
→   // This is now a single-line comment.  Simple, short lines.
→   int function(
→   →   int arg0,
→   →   int arg1
→   );
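If you would rather have a machine enforce this rule, a check is easy to write. This is a sketch only: the function name and the sample text are hypothetical, and it assumes the rule is simply "no TAB may follow a non-TAB character on the same line". It knows nothing about the source language, so it would also flag a literal TAB inside a string.

```python
import re

# A TAB that comes after any non-TAB character is not indentation.
MISPLACED_TAB = re.compile(r"[^\t]\t")


def lines_with_misplaced_tabs(text):
    """Return the 1-based numbers of lines that use a TAB for spacing or alignment."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MISPLACED_TAB.search(line)
    ]


SAMPLE = (
    "\tint function(int arg0,\t\t// TABs used to line up a comment\n"
    "\t" + " " * 13 + "int arg1);\n"
)
print(lines_with_misplaced_tabs(SAMPLE))
```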

TABs are semantically for tabulating, not indentation! [4]

This is incorrect. TABs are semantically for both, although the common usage now is indentation.

Before the TAB key existed, standard practice was to use spaces both for indenting text and for aligning tables (as there was literally no alternative). The TAB key was patented in the year 1900 with the purpose of moving the carriage to the next TAB Stop (i.e. column), thus resolving both problems. Therefore, TAB was invented for the purpose of indentation and for tabulation.

By the time the TAB key was introduced for computers (sometime around 1980, by my research), the primitive software of the time didn't support anything for it but indentation, which is anyway a far more common task than separating table columns. ASCII (adopted by ISO 646 which punts to ECMA 48 on the matter) states (§8.3.60 [12]) "HT indicates the beginning of a string of text which is to be positioned within a line", alongside language about the underlying datastream implying that this offsetting can be generalized for tabulation.

Hence, we see that indentation was either the whole or the major part of what TABs were intended for. There is a good summary of software interpretations at [2], of their configuration at [8], and at least one proposed replacement at [3]. Still, the general concept of "move over to the next column somehow" seems clear, and its natural modern application is indentation. (It could also serve for some kinds of alignment, but we reject that use because of the problem discussed in the previous section.)

Mixing TABs and spaces is dirty!

Good code is both indented and spaced. If you think the combination is dirty, then either don't indent your code (supported in many languages) or don't separate your words (not supported by most languages)—but I think it's obvious the result will be dirtier.

Using TABs is difficult! Mixing TABs and spaces is difficult!

It's really not. Certainly it's far easier than programming itself. People who cannot separate whitespace into trivial semantic categories (space, alignment, indentation, line breaks, etc.) cannot be expected to understand the far more complex semantics of actual programs, and should not be allowed anywhere near production software.

TABs make the source too wide to read [7]! My lines are wrapping!

So set their display width smaller. Problem solved. By the way, with spaces, you'd have been stuck with it.

It makes diffs harder because the first character is a "+" or "-" in a diff!

That would be a problem with the tool, which fails to make the correct semantic separation between the annotation for a line and the line itself (although even then, I think it would look better with TABs anyway). In any case, the complaint is unsubstantiated: the problem does not actually occur in said tool, and there are better tools for computing and presenting differences anyway.

Emacs doesn't understand TABs!

If that were true (it isn't), then I'd get a better editor. I fully expect every editor to be competent at editing UTF-8 text at a minimum, and frankly it's unimaginable for one not to handle ASCII.

I could imagine tools that load space-indented files as TABs and save them back as spaces. So, let's all use spaces, but you can still be happy?

Set aside the question of whether such tools exist, are integrated with major editors, and are reliable (all three are open questions). From a technical perspective, converting TABs to spaces is far easier than converting spaces to TABs: it doesn't require any parsing or understanding of the source language! In other words, we might do this the other way round: everyone uses TABs, and by converting from/to TABs on load/save the space-indent users might be happy. (Though we shouldn't do this either, for all the other reasons in this article.)
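The asymmetry is easy to demonstrate. In this sketch, `tabs_to_spaces` is a one-line regular expression, while `naive_spaces_to_tabs` has to guess, because a run of leading spaces doesn't say where indentation ends and alignment begins. Both helpers and the width of 4 are assumptions for the example, not the behaviour of any particular tool.

```python
import re


def tabs_to_spaces(text, width):
    """Replace each leading TAB with `width` spaces. No language knowledge needed."""
    return re.sub(r"^\t+", lambda m: " " * (width * len(m.group())), text, flags=re.MULTILINE)


def naive_spaces_to_tabs(text, width):
    """Guess: every full group of `width` leading spaces was one indentation level."""
    def regroup(match):
        count = len(match.group())
        return "\t" * (count // width) + " " * (count % width)
    return re.sub(r"^ +", regroup, text, flags=re.MULTILINE)


# One TAB of indentation, then 13 spaces of alignment on the second line.
original = "\tint function(int arg0,\n\t" + " " * 13 + "int arg1);\n"

as_spaces = tabs_to_spaces(original, 4)
round_trip = naive_spaces_to_tabs(as_spaces, 4)

# The round trip mistakes alignment spaces for indentation levels.
print(round_trip == original)
print(repr(round_trip.splitlines()[1]))
```

Getting the reverse direction right would mean knowing that the second line is a continuation of the first, which is exactly the kind of parsing the forward direction never needs.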

With spaces, I can do half-sized indents, like for `public:` in a class definition! [5]

That is, obviously, an abomination.

Famous people and popular languages prefer spaces! [6]

And there are famous people and popular languages that don't.

Regardless, this is a mere argument from authority. Besides which, the authorities disagree with each other! For example, Linus Torvalds [10] vehemently wants 8-character indentation and Guido van Rossum [11] wants 4-character indentation. (The kernel style in [10], notably, already indents with TABs.) If everyone used TABs, they could collaborate while each using their preferred indentation width.

More reading:

Note: The sources cited are not necessarily primary sources, nor even particularly significant ones. Each was selected merely as an instance of the topic at hand that I can point to.

[1] https://adamspiers.org/computing/why_no_tabs.html
[2] https://www.jwz.org/doc/tabs-vs-spaces.html
[3] http://nickgravgaard.com/elastic-tabstops/
[4] https://softwareengineering.stackexchange.com/questions/57/tabs-versus-spaces-what-is-the-proper-indentation-character-for-everything-in-e#comment324_72
[5] https://softwareengineering.stackexchange.com/questions/57/tabs-versus-spaces-what-is-the-proper-indentation-character-for-everything-in-e#comment402187_657
[6] https://softwareengineering.stackexchange.com/a/165/128588
[7] https://web.archive.org/web/20080820220453/http://www.movementarian.org/docs/whytabs/
[8] https://web.archive.org/web/20080617053639/http://rikkus.info:80/indentation.html
[9] https://web.archive.org/web/20080726022036/http://www.wiggy.net:80/rants/tabsvsspaces.xhtml
[10] https://adamspiers.org/computing/Linus-Kernel-CodingStyle
[11] https://www.python.org/dev/peps/pep-0008/#indentation
[12] https://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf

Ian Mallett - 2018 - Creative Commons License