In Defense of 'Mantissa'

Diagram of bits in a 32-bit float: 1 bit sign, 8 bits biased exponent, and 23 bits of mantissa.

Figure 1

: Example of bits in a float (altered from image source).

What's the 'mantissa', you ask? In floating-point (reminder example in Figure 1), it's that field over on the right: the one you take (with the implicit leading \(1\)) and multiply by the base raised to the exponent to get the stored number. That is, it plays the role of the coefficient 'out front' in scientific notation.
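For concreteness, here's a minimal Python sketch (the variable names are mine, not standard) that pulls the three fields out of a 32-bit float and reassembles the value from them:

```python
import struct

def float_fields(x):
    """Split a 32-bit float into its sign, biased-exponent, and mantissa fields."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign     = bits >> 31            # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF       # 23 bits; the leading 1 is implicit, not stored
    return sign, exponent, mantissa

sign, exponent, mantissa = float_fields(6.5)
# 6.5 = +1.625 * 2^2, so the stored exponent is 2 + 127 = 129,
# and the mantissa field holds the fractional 0.625 scaled by 2^23.
value = (-1)**sign * (1 + mantissa / 2**23) * 2.0**(exponent - 127)
```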

Nowadays, the term 'mantissa' has come under attack, with the ungainly 'significand' proposed as a replacement. Most programmers (who probably know too little about floating-point) would see no problem treating them as synonyms[1]. Indeed, 'mantissa' seems like a fancier yet more convenient name for a 'significand', right?

'Mantissa' has even been called "mathematically incorrect" (gravest of sins!).

We need to untangle this.


'Mantissa' and Logarithms

'Mantissa', along with the (appallingly vaguely named) term 'characteristic', originated with logarithm tables. The 'mantissa' is the fractional part of the logarithm, while the 'characteristic' is the integer part; the sum is the logarithm overall[2]. For example, \(\log_{10}(299\,792\,458)~\)\(\approx 8.476\,821\), and so the characteristic is \(8\) and the mantissa is \(\approx 0.476\,821\).
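In Python terms, the split is just floor and remainder:

```python
import math

log = math.log10(299_792_458)     # ~ 8.476821
characteristic = math.floor(log)  # integer part: 8
mantissa = log - characteristic   # fractional part: ~ 0.476821
```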

Graph showing that the mantissa bits of a floating point number are very similar to a true logarithm.

Figure 2

: Sticking a radix point between the (unbiased) exponent and the mantissa bits makes a surprisingly good approximation to \(\log_2(x)\).

Separating the two is useful because only the mantissas are difficult. You can see that \(299\,792\,458\) must have characteristic \(8\) just by looking at it: it is at least \(10^8\) but less than \(10^9\). To find the mantissa, however, you do need the log table. You would look up \(\log_{10}(2.997\,924\,58)~\)\(\approx 0.476\,821\) and add the characteristic \(8\) to get the final answer.
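A sketch of that table workflow in Python, with `math.log10` standing in for the printed table:

```python
import math

x = 299_792_458
# The characteristic is visible at a glance: one less than the digit count.
characteristic = len(str(x)) - 1                 # 8
# The 'table lookup': the log of the number scaled into [1, 10).
mantissa = math.log10(x / 10**characteristic)    # ~ 0.476821
log_x = characteristic + mantissa                # ~ 8.476821
```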


The term was reused for floating-point (at least since 1946[3][4]), and it has been used widely for this purpose ever since. In fact, the reuse is more than just an analogy. A floating-point number basically is a log. Specifically, if you stick a radix point between the exponent and mantissa fields, and then unbias the exponent, the result almost literally is the base-\(2\) logarithm of the encoded number (see Figure 2):

\[ \log_2(x) \approx \frac{x_\text{int}}{2^{23}} - 127 \]

That is, the mantissa field in a floating-point number is the mantissa of the logarithm, at least up to a piecewise-linear approximation.
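That approximation is easy to check by reinterpreting a float's bits as an integer; here's a sketch in Python using `struct` (the function name is mine):

```python
import math
import struct

def log2_approx(x):
    """Read a float's bits as the fixed-point number
    '(biased exponent).(mantissa bits)' and unbias the exponent:
    a piecewise-linear approximation of log2(x)."""
    x_int = struct.unpack("<I", struct.pack("<f", x))[0]
    return x_int / 2**23 - 127

# Exact at powers of two; off by at most ~0.086 everywhere else.
err = abs(log2_approx(10.0) - math.log2(10.0))
```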

This was, in fact, no accident. William Kahan[5] spearheaded the development of IEEE 754, the floating-point standard, and Kahan made a career out of floating-point math and bithacks. For example, he exploited it in the unpublished paper that gave us the now-infamous fast inverse-square-root and the (less well known) fast square-root methods, both derived from treating the mantissa as a logarithm. In the latter case, what the algorithm fundamentally tries to do is:

\[ \sqrt{x} = \exp_2\left(\log_2\sqrt{x}\right) = \exp_2\left(\frac{1}{2}\log_2(x)\right) \]

And it is only by substituting the above \(\log_2(x) \approx (x_\text{int}/2^{23}) - 127\) in twice that we get \((\sqrt{x})_\text{int} \approx 2^{29} - 2^{22} + \left(x_\text{int}/2\right)\), the fast integer computation you actually execute.
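Here's that fast square root as a Python sketch (`struct` stands in for the C-style bit reinterpretation; `fast_sqrt` and `SQRT_MAGIC` are my names):

```python
import struct

SQRT_MAGIC = (1 << 29) - (1 << 22)  # 2^29 - 2^22, from the derivation above

def fast_sqrt(x):
    """Approximate sqrt(x) by halving the float's bit pattern (i.e. halving
    its approximate log2) and restoring the half of the exponent bias that
    the halving threw away."""
    x_int = struct.unpack("<I", struct.pack("<f", x))[0]
    y_int = SQRT_MAGIC + (x_int >> 1)
    return struct.unpack("<f", struct.pack("<I", y_int))[0]
```

In practice one would refine this with a Newton iteration or two; the bithack only supplies the starting estimate.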


Enter 'Significand'

'Significand' seems to have come along later[4], but somehow it became more conventional. The IEEE 754 standard uses only 'significand', and big names (notably Kahan[5] and Knuth) seem to champion it over 'mantissa'. And everyone else seems to have fallen in line.

Unfortunately, the name offers no advantages while bringing plenty of problems. 'Significand' suggests 'significance', which is flatly wrong, because significance in the order-of-magnitude sense is the exponent's job. It's supposed to be an allusion to significant figures, but what strikes you first is that 'significand' shares its first four letters with the 'sign' bit, making code abbreviations forever obnoxious. Aesthetically, 'significand' has more syllables, and frankly, it just doesn't sound as cool. It's just bad.


Which is 'Right'?

The objection to 'mantissa' among the would-be sticklers is that it's supposedly "less mathematically correct", because the terminology is an abuse of another mathematical concept. The trouble is that it isn't. It was the fractional part of the logarithm (in 1800s/1900s log tables), and today it still is the fractional part of the logarithm (in a floating-point number).

Even if it were a new usage, every mathematician knows that concepts borrow names and notation from other places. Even foundational grade-school math requires awareness of context[6]. Wikipedia currently tracks at least 22 separate definitions of the term 'normal' in mathematics, and yet floating-point itself uses 'normal' numbers to contrast with 'subnormal' numbers! If adapting terminology really were the problem, then surely reusing 'normal' would have been far more objectionable[7]!

Of course, 'mantissa' isn't perfect. It's only exactly the same at powers of \(2\). And though it may be the catchier name, that's not quite a reason to embrace it. Probably the biggest issue is ambiguity: it is sometimes unclear whether an author includes the implicit leading \(1\). (One shouldn't, by analogy with the 'characteristic' vs. 'mantissa' separation, but floating-point remains arcane to many programmers and academics alike, it's easy to forget an implicit number, and it's sometimes more useful to include the \(1\), so the confusion persists.) There's also the detail that the original 'mantissa' was for base \(10\), though extending it to base \(2\) is trivial.

On the other hand, while 'significand' nominally has unambiguity and authority behind it, it is not just an intrinsically deficient name but an actively misleading one. It sounds like something it's not, and it has no advantages over 'mantissa'. It doesn't even escape the ambiguity: different authors confuse whether the implicit \(1\) is included there, too. (For the record, 'significand' is supposed to include the \(1\); the 'fraction' is the part without it. This much is sensible and matches nicely with existing computational terminology[8].)

There are a few alternatives, too. The best I've heard is 'coefficient' (one major advantage over 'significand' is that it's clear that it includes the \(1\), because coefficients are by definition multiplicative). Unfortunately, in floating-point, 'coefficient' is pretty rare. I wouldn't depend on someone knowing it offhand, nor even being able to accurately puzzle it out on the fly. And then we do have 'fraction' from above, as an unambiguous[9] and good choice at least for the non-integer part.


Conclusion

Far from being "mathematically incorrect", the term 'mantissa' in floating-point is a parallel usage to its original meaning for logarithms. If repurposing and making up terminology is to be avoided, then not only should we not use 'significand', we are obliged to use 'mantissa'!

It's true that the coefficient on a scientific-notation-form number and the fraction of a logarithm are different but, for base-2, adding IEEE 754's implicit leading 1 on the latter gets you shockingly close to the former—to the extent that algorithms depend on them being equivalent. We can call the thing on the front the 'coefficient' or the 'significand' and its fractional part the 'fraction' or 'mantissa', but the actual bits we're storing are the mantissa bits, not the significand bits. And if we have to talk about the thing in front, 'coefficient' is at least clearer and means something.

Even though 'mantissa' has a better claim to history and the more accurate meaning, nevertheless, history has turned against it; it is now considered nonstandard. As far as I can tell, this is due to blindly following authorities who, rather inexplicably, say that 'significand' is better.

What to do?

Well, in my quest for righteousness and standardization, I'll accede to using 'significand' and 'fraction'. I guess. Ugh. However, I'll use 'coefficient' and 'fraction' with my friends. And maybe, when I say 'fraction', I'll think 'mantissa', fondly, in my heart.


Notes

