
The goto Statement is Good
by Ian Mallett

====INTRODUCTION====

The following argument attempts to demonstrate that the "goto" statement in languages like C and C++ is actually a Good Thing. Programmers' compulsive dislike of it is objectionable since the goto statement has legitimate application to good program design.

Some time ago, before the rise of modern programming practices, then-modern languages were just adopting a thing called a "function". Curiously enough, no one seemed to like them, preferring instead to use hundreds of goto statements to jump around one or two of them. To counter this, Edsger Dijkstra, a prominent Computer Scientist, wrote a famous essay in 1968 entitled Go To Statement Considered Harmful (though interestingly he did not choose the title himself), wherein he argued against the goto statement as a propagator of poor programming practice.

Dijkstra was referring to the indiscriminate use of goto in higher-level languages as a substitute for function calls and actual structured programming (which had a tenuous hold at the time). That problem is no longer relevant today.

Note that in machine language, there's no such thing as a loop, an if, and so on—it is all jump (goto) instructions; it doesn't even make sense to get rid of them entirely, as some more naïve programmers suggest. Consequently, the following discourse primarily concerns "high"-level (higher than assembly) languages.
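
To make the point concrete, here is a rough sketch of how a simple while loop looks once it is reduced to compare-and-jump form, written in C itself. The function names sum_structured and sum_lowered are made up for illustration, and real compilers emit machine instructions rather than C, so treat this as an analogy only.

```c
/* Structured version: sum the integers 0..n-1 with a while loop. */
int sum_structured(int n) {
	int total = 0;
	int i = 0;
	while (i < n) {
		total += i;
		++i;
	}
	return total;
}

/* Hand-lowered version: the same loop using only "if" and "goto",
   roughly mirroring the compare-and-jump sequence a machine executes. */
int sum_lowered(int n) {
	int total = 0;
	int i = 0;
LOOP_TEST:
	if (!(i < n)) goto LOOP_DONE; /* conditional jump out of the loop */
	total += i;
	++i;
	goto LOOP_TEST;               /* unconditional jump back to the test */
LOOP_DONE:
	return total;
}
```

Both functions should return the same value for any n. The second one is simply what the first one looks like with the structured syntax peeled away.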

====APPLICATIONS OF GOTO====

The goto statement still has genuine application, and sometimes the best program design uses a goto or two.

One application is the problem solved by named break (also called labeled break). In languages without it (C and C++ among them), breaking out of multiple levels of nested loops requires at least one extra variable and a conditional. These are expensive, costing space, instruction locality, and instruction count. Probably more importantly, they are confusing, especially if you have to do it in more than one place. However, a simple goto statement solves the problem cleanly.

The following is a typical example. Without goto:

bool continuing = true;
for (int y=0;y<4096&&continuing;++y) {
	for (int x=0;x<4096;++x) {
		if (arr[y][x] == BLAH) {
			continuing = false;
			break;
		}
		process(x,y);
	}
}

With goto:

for (int y=0;y<4096;++y) {
	for (int x=0;x<4096;++x) {
		if (arr[y][x] == BLAH) goto END;
		process(x,y);
	}
}
END: ; //a label must be followed by a statement; the empty statement will do

In the first example, "continuing" is checked on every iteration of the outer loop. It also has to be stored somewhere, usually in a register. The end result is messy too: look how two different mechanisms (the break and the variable) are used to break out of the nested loop! We could get rid of the boolean variable by setting "y" to its upper bound, but this is even less clear. In the second example, it's immediately obvious what's happening. Notice how, either way, the goto saves instructions. Try generalizing the example to three nested loops, or four. The advantage of goto quickly becomes even clearer.
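
As a sketch of the three-loop case, the following self-contained function leaves all three loops with a single goto. The names sum_until_stop, N, and STOP are hypothetical (STOP stands in for BLAH), and the small grid size is chosen only to keep the example easy to check.

```c
#define N 4
#define STOP (-1) /* hypothetical sentinel, standing in for BLAH */

/* Sums the cells of an N*N*N grid in scan order and stops at the
   first STOP cell, leaving all three loops with a single jump. */
int sum_until_stop(int grid[N][N][N]) {
	int total = 0;
	for (int z=0;z<N;++z) {
		for (int y=0;y<N;++y) {
			for (int x=0;x<N;++x) {
				if (grid[z][y][x] == STOP) goto END;
				total += grid[z][y][x];
			}
		}
	}
END:
	return total;
}
```

Without goto, the same function would need a flag tested in two of the three loop conditions, or the loops would have to be split into helper functions.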

A second, and somewhat related, application is a pattern I see in loops frequently. At each loop step, there is an expensive calculation needed to continue to the next iteration. If the loop only runs a few times, this extra calculation getting thrown away on the last iteration can be significant. For example, without goto:

int i = 0;
ComplicatedObject obj(/*...*/);
do {
	f(i,obj);
	obj.prepareForNext(); //expensive
} while (++i<n); //n is fairly small

With goto:

int i = 0;
ComplicatedObject obj(/*...*/);
LOOP:
	f(i,obj);
	if (++i<n) { //n is fairly small
		obj.prepareForNext(); //expensive
		goto LOOP;
	}

In the second version, we completely avoided that expensive ".prepareForNext()" call that wasn't necessary the last time through the loop! This sort of thing occurs frequently in text processing. I use goto-esque "loops" like the above for my .obj file loader in my graphics library. Little things like this add up: my loader runs 50% faster than any other loader I've used, and I haven't done much in the way of optimization. Just by allowing these control flows, the same algorithm is made more efficient!
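
A small text-processing instance of the same shape, offered as an illustration and not taken from the loader mentioned above: writing a separator between items but not after the last one. The function join_ints and its parameters are hypothetical, and the sketch assumes n is at least 1 and out is a valid buffer, mirroring the do/while version.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes values as "a, b, c" into out (capacity cap, in bytes).
   Assumes n >= 1. Returns the number of characters written, not
   counting the terminator, or -1 if the buffer is too small. */
int join_ints(char* out, size_t cap, const int* values, size_t n) {
	size_t used = 0;
	size_t i = 0;
	int w;
LOOP:
	w = snprintf(out+used, cap-used, "%d", values[i]);
	if (w < 0 || (size_t)w >= cap-used) return -1;
	used += (size_t)w;
	if (++i < n) {
		/* between-items work: never runs after the last item */
		w = snprintf(out+used, cap-used, ", ");
		if (w < 0 || (size_t)w >= cap-used) return -1;
		used += (size_t)w;
		goto LOOP;
	}
	return (int)used;
}
```

Here the "expensive" step is only a separator, but the control flow is the same: the work that prepares for the next item sits inside the test for whether there is a next item.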

A third application of goto is code organization within a function. This most often comes up in error handling. For example, suppose you have some error conditions, and various points where they could occur in a function, but you still need to do cleanup in case of an error. If we didn't have goto, we'd need to duplicate this cleanup code at each point. With goto, the cleanup code can not only be specified once, but also moved to the bottom of the function, where it is clearly distinct from the algorithm proper. A few (yes, recent) C best practices guides actually recommend this, citing specific examples from, among others, the Linux kernel itself. The following is inspired by such an example. Without goto:

#include <stdio.h>
#include <stdlib.h>

//process1(), process2(), process3(), and final_processing() are assumed to be declared elsewhere.

int func(void) {
	char* buffer = calloc(64,sizeof(char)); //zeroed, so the buffer starts out as a valid string for "%s"
	if (buffer == NULL) return -1;

	buffer[0] = 'a';
	if (!process1(buffer)) {
		printf("Final simulation buffer: \"%s\"\n",buffer);
		free(buffer);
		return -1;
	}

	buffer[0] = 'b';
	if (!process2(buffer)) {
		printf("Final simulation buffer: \"%s\"\n",buffer);
		free(buffer);
		return -1;
	}

	buffer[0] = 'c';
	if (!process3(buffer)) {
		printf("Final simulation buffer: \"%s\"\n",buffer);
		free(buffer);
		return -1;
	}

	int result = final_processing(buffer);
	printf("Final simulation buffer: \"%s\"\n",buffer); //printed after the final step, as in the goto version
	free(buffer);

	return result;
}

With goto:

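#include <stdio.h>
#include <stdlib.h>

//process1(), process2(), process3(), and final_processing() are assumed to be declared elsewhere.

int func(void) {
	char* buffer = calloc(64,sizeof(char)); //zeroed, so the buffer starts out as a valid string for "%s"
	if (buffer == NULL) return -1;

	int result = -1; //stays -1 unless every step succeeds

	buffer[0] = 'a';
	if (!process1(buffer)) goto CLEANUP;

	buffer[0] = 'b';
	if (!process2(buffer)) goto CLEANUP;

	buffer[0] = 'c';
	if (!process3(buffer)) goto CLEANUP;

	result = final_processing(buffer);

	CLEANUP: //reached on both the success path and every failure path
		printf("Final simulation buffer: \"%s\"\n",buffer);
		free(buffer);

	return result;
}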

Notice how, with goto, the program is shorter and more understandable. Since it compiles to fewer instructions, it is also more instruction-cache-friendly. There are other ways to implement the first example, and perhaps the cleanest of them deserves a rebuttal: the duplicated code could be moved to a separate function. This is a Bad Thing because the duplicated code is semantically part of the original function.
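
When a function acquires more than one resource, the same idea is commonly extended to several labels that unwind in reverse order of acquisition. The following is a minimal sketch of that variant, not code from any particular project: staged, tracked_alloc, tracked_free, the live counter, and the fail_at parameter are all hypothetical, and fail_at exists only to simulate a failure at a chosen step.

```c
#include <stdlib.h>

/* Counts outstanding allocations so the cleanup paths can be checked. */
static int live = 0;

static void* tracked_alloc(size_t size) {
	void* p = malloc(size);
	if (p != NULL) ++live;
	return p;
}

static void tracked_free(void* p) {
	if (p != NULL) --live;
	free(p);
}

/* fail_at simulates a failure after step 1 or step 2 (0 means none).
   Returns 0 on success and -1 on any failure. */
int staged(int fail_at) {
	int result = -1;
	char* first = NULL;
	char* second = NULL;

	first = tracked_alloc(64);
	if (first == NULL) goto OUT;
	if (fail_at == 1) goto FREE_FIRST;

	second = tracked_alloc(64);
	if (second == NULL) goto FREE_FIRST;
	if (fail_at == 2) goto FREE_SECOND;

	result = 0; /* the real work would go here */

FREE_SECOND:
	tracked_free(second);
FREE_FIRST:
	tracked_free(first);
OUT:
	return result;
}
```

Each failure point jumps to the label that releases exactly what has been acquired so far, and the success path falls through the same labels, so no cleanup code is written twice.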

====PERFORMANCE AND OPTIMIZATION====

The goto statement can also make programs faster. My programs make careful, limited use of goto statements in tight logic, making them faster than any other equivalent—even when they are compiled without optimization.

Many compilers model program flow as a directed graph (the control-flow graph) and run their analyses and optimizations over it. Since the goto statement allows arbitrary jumps, it can produce tangled (irreducible) graphs that many of these optimizations handle poorly or skip altogether. However, in the specific, a careful use of goto does not interfere. In any case, the goto statement is one major tool for programmer optimization. Even if the compiler can't optimize something, that may be fine since the programmer, using goto, already did. You need to be careful, of course, since compilers are very good at doing optimization for you already.

The above examples should suggest that the goto statement allows optimizations that are otherwise completely inexpressible using standard control flow. Furthermore, it should be obvious that these optimizations can require some thought to implement: compilers can't do it for you. In my experience, careful use of the goto statement allows you to make optimizations C/C++ compilers in particular can't possibly do, since they cannot understand semantics. When goto is mixed with pointer arithmetic, one can produce truly marvelous works of engineering—doing things no compiler today could even dream of doing. Mix this with the under-appreciated "restrict" keyword (in C99 and on), and I have written programs literally 10 times faster than their goto-less counterparts.
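
For readers who have not met "restrict", here is a minimal sketch of what it expresses. The function add_arrays is hypothetical and is not one of the programs described above; whether the qualifier produces a measurable speedup depends on the compiler, the flags, and the target.

```c
#include <stddef.h>

/* restrict is a promise from the programmer that dst, a, and b do not
   overlap. The compiler may then keep loaded values in registers and
   vectorize the loop without guarding against aliasing. Calling this
   with overlapping arrays is undefined behavior. */
void add_arrays(float* restrict dst, const float* restrict a,
                const float* restrict b, size_t n) {
	for (size_t i=0;i<n;++i) {
		dst[i] = a[i] + b[i];
	}
}
```

The qualifier changes nothing about what the function computes for valid inputs. It only hands the compiler a fact about the program that it could not have worked out on its own.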

====CONCLUSION====

Dijkstra was right in some sense. The goto statement is not a substitute for function calls. Nor should it be.

But the plain fact is that the goto statement is just another language feature. The fact that some programmers abuse a language feature doesn't make it bad; it makes the programmers abusing it stupid. As demonstrated above, the goto statement has genuine application to good program design, and any blanket suggestion that it should be removed from high-level languages is prescriptivist and fundamentally short-sighted.

Edsger Dijkstra was attempting to force structured programming into a world that didn't want it, and his suggestion to remove goto from all higher-level languages may have been meant literally. But Dijkstra could not have anticipated the incredible advances in languages and technology that would completely change how code is written. In any case, now that it is more than 40 years later, I find it incredible that people still parrot that suggestion like an absolute truth instead of noting it for the ancient and mostly irrelevant historical tidbit that it now is.

The greats of Computer Science understood that dogmatic adherence to any practice is a Bad Thing. Certainly, just because anyone said something does not mean that all programmers everywhere should suddenly treat it like some kind of divine law.


Ian Mallett - 2018 - Creative Commons License