Which Axis Is Up?

Introduction

In graphics, we regularly deal with 2D and 3D coordinate systems, and it isn't immediately clear which axis is up, even to professional application developers.

[Image: chart of different axis and handedness conventions.]
Figure 0: Coordinate axes and handedness in various graphics environments. (Image credit: Freya Holmér.)

In a 2D (spatial) coordinate system, we have two axes, \(\vec{x}\) and \(\vec{y}\). If you're sane, every such graph you've ever drawn has had the \(\vec{x}\)-axis be horizontal—which leaves the \(\vec{y}\)-axis to be vertical. Moreover, the positive \(\vec{y}\)-axis is not merely vertical, but up, because gravity points downward and is a negative, attractive force.

What, then, of 3D, where we have \(\vec{x}\), \(\vec{y}\), and \(\vec{z}\) axes?

Arguments for the \(\vec{y}\)-axis being up

A strong argument can be made for \(\vec{y}\) continuing to be "up". For one thing, it's consistent. When you add another dimension, why should \(\vec{y}\) suddenly change from being up to being one of the sideways directions? Surely, it's nonsensical to redefine \(\vec{y}\) to be something literally orthogonal to its previous meaning when you simply add a new coordinate?

Maya started life at Alias|Wavefront^[1] as an animation tool. 2D graphics were common at the time, so it made sense for \(\vec{y}\) to be "up". If the third dimension was involved at all, it was as a depth. This makes sense^[2], and in fact this sort of thinking doubtless explains the terminology of image depth being referred to as the "\(\vec{z}\)-buffer", and OpenGL's preference for \(\vec{y}\) being up in general.

Arguments for the \(\vec{z}\)-axis being up

A strong argument can be made for \(\vec{z}\) being up, too, but the argument is more subtle than the one people usually make! See, it's not about the \(\vec{z}\)-axis being a better "up" vector per se. It's about "up" being the last coordinate in your coordinate system. The last element in 2D is \(\vec{y}\), and the last element in 3D is \(\vec{z}\). In 4D, it would be \(\vec{w}\), and so on.

The "up" direction is usually the most interesting one for a given problem—and in fact when we move from 2D to 3D it's often because we're adding a new, interesting feature. Putting it last means the other, more-similar, dimensions get grouped together. It also means that the "interesting" dimension gets pulled to the end, where we can work on it easily.

For example, consider level design for a 3D game. You sketch out the level in 2D, and then when you move to 3D, you extrude the level upward^[1]. Keeping the "up" direction the last element means that this is just appending a coordinate, not inserting a new one in the middle.
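A minimal Python sketch of that extrusion, assuming the level outline is stored as a list of (x, y) tuples. The names `extrude_z_up` and `extrude_y_up` are hypothetical, chosen only for illustration, and the y-up version ignores any sign flip that a particular handedness convention might require.

```python
from typing import List, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def extrude_z_up(outline: List[Point2], height: float) -> List[Point3]:
    """Z-up: the height is appended after the existing (x, y) pair."""
    return [(x, y, height) for (x, y) in outline]


def extrude_y_up(outline: List[Point2], height: float) -> List[Point3]:
    """Y-up: the height lands in the middle, pushing the old y into the z slot."""
    return [(x, height, y) for (x, y) in outline]


# Hypothetical floor plan: three corners of a room.
floor_plan = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]

walls_z_up = extrude_z_up(floor_plan, 2.5)
walls_y_up = extrude_y_up(floor_plan, 2.5)

# With z up, every 2D point is still a prefix of its 3D counterpart.
assert all(p3[:2] == p2 for p2, p3 in zip(floor_plan, walls_z_up))
```

With y up, the same prefix check would fail, because the second component of each 3D point is now the height rather than the original 2D y.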

This is more elegant mathematically, too. The 2D planar position of the character is simply a prefix of the full 3D location:

\[ \vecinline{ \hl{x}, \hl{y}, z } \]

In programming, this has the practical consequence that you can just . . . write those two coordinates, without any shuffling around. Consider how useful it is to, for example, take some character's 3D position (perhaps a vec3f), cast it into a vec2f, and then automatically have the position on the map. Or to teleport them to the other side of the world by writing the position's .xy field. This isn't just aesthetics, either; the implementation can be faster too, because the two coordinates you want already sit next to each other and need no shuffling around!
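The prefix idea can be sketched in a few lines of Python, assuming positions are plain three-element lists standing in for a vec3f. All function and variable names here are hypothetical, and the sketch shows only the indexing pattern, not any particular engine's API.

```python
from typing import List, Tuple


def map_position_z_up(position: List[float]) -> Tuple[float, float]:
    """Z-up: the map position is simply the first two components."""
    x, y = position[:2]
    return (x, y)


def map_position_y_up(position: List[float]) -> Tuple[float, float]:
    """Y-up: the ground plane is x and z, so the middle component is skipped."""
    return (position[0], position[2])


def teleport_z_up(position: List[float], xy: Tuple[float, float]) -> None:
    """Z-up: overwrite the leading pair in place, like writing an .xy field."""
    position[:2] = xy


def teleport_y_up(position: List[float], xz: Tuple[float, float]) -> None:
    """Y-up: the two ground coordinates are not adjacent, so set them separately."""
    position[0], position[2] = xz


# The same hypothetical character in both conventions.
hero_z_up = [1.0, 2.0, 0.5]  # [x, y, height]
hero_y_up = [1.0, 0.5, 2.0]  # [x, height, z]

assert map_position_z_up(hero_z_up) == map_position_y_up(hero_y_up)

teleport_z_up(hero_z_up, (-100.0, 250.0))
teleport_y_up(hero_y_up, (-100.0, 250.0))
```

In Python the two variants cost about the same; the difference is more likely to matter in languages with packed vector types, where the leading pair is a contiguous block of memory and the first-and-third pair is not.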

As another example, consider a texture. The 2D space of the texture is like a painting, and so there's a horizontal and a vertical direction. \(\vec{y}\) is up. But now suppose the texture is a heightmap, and we're doing parallax occlusion mapping (POM) on it. Suddenly, the interesting aspect becomes the rays tracing "down into" the surface from "above". The axis coming out of the texture, the \(\vec{z}\) axis, becomes "up". This is such a natural change in perspective we don't even realize how significant it is.

Consider for example the famous "TBN" matrix. In 2D, you have just the tangent and bitangent, a matrix sometimes seen for texture transformations:

\[ \begin{bmatrix} t_x & b_x\\ t_y & b_y \end{bmatrix} \]

When you generalize it to 3D, with \(\vec{z}\) being up, you just add a row and column:

\[ \begin{bmatrix} t_x & b_x & \hl{n_x}\\ t_y & b_y & \hl{n_y}\\ \hl{t_z} & \hl{b_z} & \hl{n_z} \end{bmatrix} \]

Compare that to the case where the \(\vec{y}\)-axis is up, where you have to insert the normal vector awkwardly into the middle:

\[ \begin{bmatrix} t_x & \hl{n_x} & b_x\\ t_y & \hl{n_y} & b_y\\ \hl{t_z} & \hl{n_z} & \hl{b_z} \end{bmatrix} \]

To get the offset to the sampled texture coordinates, you can't just take the first two components. You have to take the first and . . . third.
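To make the component bookkeeping concrete, here is a small Python sketch. It assumes an orthonormal tangent, bitangent, and normal (so a world-space vector's tangent-space components are just dot products with each basis vector), and it uses a made-up view vector. The helper names `dot` and `to_tangent_space` are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def to_tangent_space(basis: Sequence[Vec3], v: Vec3) -> Tuple[float, ...]:
    """Components of v along each basis vector, in the order the basis lists them.

    Assumes the basis is orthonormal, so no matrix inversion is needed.
    """
    return tuple(dot(axis, v) for axis in basis)


# Hypothetical surface frame and view vector.
t: Vec3 = (1.0, 0.0, 0.0)
b: Vec3 = (0.0, 1.0, 0.0)
n: Vec3 = (0.0, 0.0, 1.0)
view: Vec3 = (0.3, -0.4, 0.5)

v_z_up = to_tangent_space((t, b, n), view)  # ordered (tangent, bitangent, normal)
v_y_up = to_tangent_space((t, n, b), view)  # ordered (tangent, normal, bitangent)

# Z-up: the texture-plane part is a plain prefix.
uv_direction_z_up = v_z_up[:2]

# Y-up: the same two numbers sit in the first and third slots.
uv_direction_y_up = (v_y_up[0], v_y_up[2])

assert uv_direction_z_up == uv_direction_y_up
```

Both conventions recover the same texture-plane direction; the only difference is whether it is a contiguous prefix or a pair of components picked around the normal.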

So Which to Choose?

At one point, I vehemently argued for the \(\vec{y}\)-axis being up. It makes sense with our 2D intuitions. However, I now think that the \(\vec{z}\)-axis being up in 3D is generally better. Again, though, the motivation is key: the intuition wasn't wrong! It's just that it's not about \(\vec{z}\) being "up" per se; it's about the last coordinate being "up". This allows you to maintain the same laudable consistency we sought from the former convention, while gaining the mathematically and computationally superior attributes of the latter convention.

The convention is arbitrary either way, but for me, for the future, \(\vec{z}\) is up in 3D^[3].

References

[1]	Note: Wavefront is better known for the ".obj" file format, which incidentally also has \(\vec{y}\) up.
[2]	Credit to "TheJamsh" for noticing this.
[3]	It's worth noting that the only apparent exception to this, the image plane (where \(\vec{y}\) is up), isn't really an exception. The image plane is . . . a plane. Yes, it has a "depth" attribute, but it's because view space is fundamentally referred to the final image, not the world. It should not be thought of as 3D, but as 2D with extra attributes.
