Diving into Gcc: OpenBSD and m88k
by Miod Vallat
10/02/2003
This article describes how the m88k-specific backend of the GNU C compiler, gcc, was fixed, from the discovery and analysis of the problems to the real fixing work. Since I started with almost zero knowledge of gcc internals, it should be understandable by anyone able to read C code, and it shows that diving into gcc is not as hard as one might imagine.
Most of the code snippets displayed below come from gcc 2.95 sources,
in the m88k specific code (found inside
gcc/config/m88k/m88k.*). For more details about the gcc
internals, the reader is welcome to refer to the Resources.
Some Background History
The Motorola 88000 architecture is not well-known today. Think of it as a bridge between the famous Motorola 68000 family and the well-known PowerPC family. Although now a dead architecture, many fine m88k-based systems were produced from 1988 to 1992, such as the Data General Aviion workstations and Motorola's own embedded systems.
Due to the elegance of its design and the availability of second-hand machines, the m88k systems became and remain quite popular among hobbyists. No wonder that several free operating systems have been ported, or are being ported, to these machines. The most advanced effort was the OpenBSD/mvme88k port to the Motorola VME boards.
Nivas Madhur started the OpenBSD/mvme88k port in 1995. Back then, OpenBSD still shipped with gcc 2.7.2.1 and local patches (those were the days!), and the problems Nivas had to face were dire kernel bugs, so no real effort was made on the toolchain. Optimization was disabled in order to prevent compiler bugs, if any, from interfering.
Nivas Madhur eventually stopped working on the port. Dale Rahn integrated it into the OpenBSD main sources, but Dale did not have the resources to maintain it. Steve Murphree, Jr eventually took over. At some point, OpenBSD started to use gcc 2.8, which did not fix the optimizer problems.
Steve was very close to producing an OpenBSD/mvme88k release.
Unfortunately, some weeks before the code freeze, the in-tree gcc was updated
to egcs 1.1 and compiler problems started to plague the port: even at the
-O0 (no optimization) level, the compiler would not always output
correct code. As the kernel was compiled with -O as an exception
to the -O0 rule, people started running unreliable kernels.
At some point, the userret() code path factorization in the
OpenBSD kernel for all architectures required kernels to be built with
optimization, relying on userret() being an inline function.
Unfortunately, gcc, by design, will never inline functions (even if explicitly
requested) at -O0. The point of no return had been crossed: gcc had
to be fixed.
Starting Debugging
Finding and fixing compiler problems is never easy. It's like climbing a mountain: you need a good rope, and good assets. Moreover, a debugger is mostly useless: you're not trying to find why code behaves incorrectly, but rather what causes gcc to produce incorrect code.
In my case, I made sure to keep a known working gcc 2.8 binary in a safe place, which could be used to bootstrap gcc 2.95 first, then as a working reference. I also made sure I had a good backup.
Stack Me Harder
After compiling gcc 2.95 with gcc 2.8, my first test was to compile and run a kernel and hope for the best. After a long compilation, my hopes were smashed very quickly: the kernel failed very early in an assertion:
panic: kernel diagnostic assertion "obj == NULL || anon == NULL" failed:
file "/usr/src/sys/uvm/uvm_page.c", line 899
dropping me into the debugger. Yet the function parameters from the traceback were apparently correct!
From the assertion message and the debugger traceback, I could easily
reconstruct the code flow. Since the assertion failure would happen in a
uvm_pagealloc_strat invocation, I built the following simple
program to reproduce a similar flow:
$ cat assert.c
#include <stdio.h>
#include <sys/types.h>
#define KASSERT(e) ((e) ? (void) 0 : __assert( __FUNCTION__, #e))

void
__assert(const char *funcname, const char *error)
{
    printf("Assertion failed in %s: %s\n", funcname, error);
    /* exit(0); */
}

void *
uvm_pagealloc_strat(void *obj, u_int64_t off, void *anon, int flags, int strat,
    int free_list)
{
    KASSERT(anon == NULL);
    return obj;
}

void *
uvm_pagealloc(void *obj, u_int64_t off, void *anon, int flags)
{
    KASSERT(anon == NULL);
    return obj;
}

main()
{
    char *pg;
    char *kobj = "kobj";

    pg = uvm_pagealloc(kobj, 0, NULL, 0);
    pg = uvm_pagealloc(kobj, 0, NULL, 0);
    pg = uvm_pagealloc_strat(kobj, 0, NULL, 0, 0, 0);
    pg = uvm_pagealloc(kobj, 0, NULL, 0);
}
$
When compiled with gcc 2.95, this program reproduced the problem:
$ gcc295 -O0 -o assert assert.c
$ ./assert
Assertion failed in uvm_pagealloc: anon == NULL
$
gcc 2.8 worked fine:
$ gcc28 -O0 -o assert assert.c
$ ./assert
$
More interestingly, the assertion would only fail for the first call but not afterwards. This would hint toward either incorrect stack or incorrect register usage, eventually corrected by the side effects of multiple function calls.
After some tinkering, I finally ended up with this interesting sample program:
$ cat eighty2.c
#include <stdio.h>
#include <sys/types.h>

void
odd64(int oddmaker, u_int64_t stamper, int value)
{
    printf("odd64: value=%d\n", value);
}

void
odd32(int oddmaker, u_int32_t stamper, int value)
{
    printf("odd32: value=%d\n", value);
}

void
even64(int oddmaker, int evenmaker, u_int64_t stamper, int value)
{
    printf("even64: value=%d\n", value);
}

void
even32(int oddmaker, int evenmaker, u_int32_t stamper, int value)
{
    printf("even32: value=%d\n", value);
}

main()
{
    odd64(0, 0, 1);
    odd32(0, 0, 1);
    even64(0, 0, 0, 1);
    even32(0, 0, 0, 1);
}
$
When run, this program would produce:
$ gcc295 -O0 -o eighty2 eighty2.c
$ ./eighty2
odd64: value=3
odd32: value=1
even64: value=1
even32: value=1
$
instead of:
$ gcc28 -O0 -o eighty2 eighty2.c
$ ./eighty2
odd64: value=1
odd32: value=1
even64: value=1
even32: value=1
$
The problem was apparently tied to the use of a 64-bit argument, but only in some cases. Why?
Let's examine the m88k calling convention for these routines. The
canonical m88k calling convention mandates that the arguments are passed
in registers r2 to r9, with extra arguments
passed on the stack. If an argument cannot fit in one register (such as
a double, or an int64_t), it will be put in two
consecutive registers starting at an even number, so that double word load
and store instructions can be used. In our case, the calling convention
would be:
Calling convention for even64()
r2 - oddmaker
r3 - evenmaker
r4, r5 - stamper
r6 - value
Calling convention for odd64()
r2 - oddmaker
r3 (wasted)
r4, r5 - stamper
r6 - value
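The register assignment rule just described can be sketched in a few lines of C. This is only an illustration under simplifying assumptions, not gcc code: the helper name assign_arg and its word-count interface are hypothetical, every argument is taken to be one or two 32-bit words, and structures and alignment attributes are ignored.

```c
#define FIRST_ARG_REG 2   /* arguments start in r2 */
#define NUM_ARG_REGS  8   /* r2 through r9 */

/*
 * Hypothetical helper: *next counts the argument words consumed so far.
 * Returns the number of the first register holding an argument of
 * `words` words (1 or 2), or -1 if the argument goes on the stack.
 */
int
assign_arg(int *next, int words)
{
    int reg;

    /*
     * r2 corresponds to counter value 0, so an odd counter means an
     * odd-numbered register: skip it for two-word arguments.
     */
    if (words == 2 && (*next & 1))
        (*next)++;

    /* Decide where the argument goes, then advance the counter. */
    reg = (*next + words <= NUM_ARG_REGS) ? FIRST_ARG_REG + *next : -1;
    *next += words;
    return reg;
}
```

Starting from a counter of 0, calling assign_arg with word counts 1, 2, 1 (the odd64() case) returns 2, 4, and 6, which matches the r2, r4/r5, r6 layout listed above; the even64() case (1, 1, 2, 1) returns 2, 3, 4, and 6.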
However, looking at the code generated by gcc 2.95, odd64()
would be invoked with
r2 - oddmaker
r3 (unused)
r4, r5 - stamper
stack - value
Why would the last parameter be passed on the stack with the new compiler?
Let's dive into the gcc sources. A large piece of code in
gcc/calls.c is responsible for function invocations, choosing
where to pass arguments, whether on the stack or in a specific
register. To do this, it relies upon a set of macros provided by the
processor-dependent backend of gcc: the FUNCTION_ARG macro
set.
In this particular case, the macros misbehave as soon as they encounter
a 64-bit parameter for which it is necessary to skip and waste an
odd-numbered register. As such, the problem probably lies in
FUNCTION_ARG or FUNCTION_ARG_ADVANCE. The first
macro decides where to put the argument, while the second one updates a
position counter so that the next FUNCTION_ARG invocation knows at
which register number or stack location to start.
A simple grep through the gcc sources shows that gcc 2.8 invokes
FUNCTION_ARG_ADVANCE in five places:
$ grep FUNCTION_ARG_ADVANCE *.c
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, TYPE_MODE (type), type,
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, Pmode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
function.c: FUNCTION_ARG_ADVANCE (args_so_far, promoted_mode,
$
as does gcc 2.95:
$ grep FUNCTION_ARG_ADVANCE *.c
calls.c: FUNCTION_ARG_ADVANCE (*args_so_far, TYPE_MODE (type), type,
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, Pmode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
function.c: FUNCTION_ARG_ADVANCE (args_so_far, promoted_mode,
$
but note how the first invocation uses a pointer dereference now? Of
course, the FUNCTION_ARG_ADVANCE macro is not entirely
expansion-safe:
$ cd config/m88k
$ head -1069 m88k.h | tail -18
/* A C statement (sans semicolon) to update the summarizer variable
CUM to advance past an argument in the argument list. The values
MODE, TYPE and NAMED describe that argument. Once this is done,
the variable CUM is suitable for analyzing the *following* argument
with `FUNCTION_ARG', etc. (TYPE is null for libcalls where that
information may not be available.) */
#define FUNCTION_ARG_ADVANCE(CUM, MODE, TYPE, NAMED) \
  do { \
    enum machine_mode __mode = (TYPE) ? TYPE_MODE (TYPE) : (MODE); \
    if ((CUM & 1) \
        && (__mode == DImode || __mode == DFmode \
            || ((TYPE) && TYPE_ALIGN (TYPE) > BITS_PER_WORD))) \
      CUM++; \
    CUM += (((__mode != BLKmode) \
             ? GET_MODE_SIZE (MODE) : int_size_in_bytes (TYPE)) \
            + 3) / 4; \
  } while (0)
$
Notice how CUM, the first parameter, is used unprotected?
Guess what happens in the CUM++; statement when
CUM happens to be *args_so_far? Our exact
problem! Compiling a call to odd64() would trigger the
CUM++; statement, which expands to *args_so_far++; and, because
of operator precedence, increments the pointer rather than the counter it
points to. This is a classic bug caused by preprocessor-unsafe code, and it
went unnoticed because the macro had been working correctly in previous gcc
versions.
As a result of the bug, args_so_far would end up incremented,
pointing to a semi-random memory location holding a huge value.
Subsequent FUNCTION_ARG invocations would then consider that all the
r2-r9 registers are in use, placing the remaining arguments on the
stack.
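The precedence problem described above can be reproduced outside of gcc. The following is a minimal sketch with hypothetical names (ADVANCE_UNSAFE, ADVANCE_SAFE, and the two wrapper functions are not from the gcc sources): the first macro uses its parameter unprotected, as the m88k macro does, while the second parenthesizes every use. Parenthesizing is only one conventional remedy, shown here for comparison.

```c
/* Simplified stand-in for the m88k macro: CUM is used unprotected. */
#define ADVANCE_UNSAFE(CUM, WORDS) \
    do { \
        if ((CUM & 1) && (WORDS) == 2) \
            CUM++; \
        CUM += (WORDS); \
    } while (0)

/* Same logic with every use of CUM parenthesized. */
#define ADVANCE_SAFE(CUM, WORDS) \
    do { \
        if (((CUM) & 1) && (WORDS) == 2) \
            (CUM)++; \
        (CUM) += (WORDS); \
    } while (0)

/*
 * Both functions advance past a two-word argument and return how many
 * elements the local pointer has drifted away from slots[0].
 */
int
advance_unsafe(int *slots)
{
    int *args_so_far = slots;

    /* Expands to "*args_so_far++;", i.e. *(args_so_far++). */
    ADVANCE_UNSAFE(*args_so_far, 2);
    return (int)(args_so_far - slots);
}

int
advance_safe(int *slots)
{
    int *args_so_far = slots;

    /* Expands to "(*args_so_far)++;", which increments the counter. */
    ADVANCE_SAFE(*args_so_far, 2);
    return (int)(args_so_far - slots);
}
```

With a two-element array holding { 1, 1000 }, advance_unsafe() returns 1 and leaves the array as { 1, 1002 }: the pointer moved to the neighbouring element and the counter itself was never updated. advance_safe() returns 0 and leaves { 4, 1000 }. A compiler may warn about the unused value in the unsafe expansion, which is a useful hint in itself.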
That's not all. There is another bug left in there. Look more closely at the macro expansion:
enum machine_mode __mode = (TYPE) ? TYPE_MODE (TYPE) : (MODE); \
...
CUM += (((__mode != BLKmode) \
? GET_MODE_SIZE (MODE) : int_size_in_bytes (TYPE)) \
It is obvious that the GET_MODE_SIZE macro should be applied
to the recomputed __mode, not the initial MODE. Yet
people keep telling me gcc 2.95 works for them under SysV/m88k. Lucky fellows.
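To see why the choice between MODE and __mode matters, here is a small sketch using hypothetical stand-ins for gcc's machine modes (the enum, mode_size, and both functions are illustrative names, not gcc code). It assumes a contrived call in which the mode passed by the caller disagrees with the mode derived from the type; whether and when gcc 2.95 actually produces such a call is not something this sketch establishes.

```c
/* Hypothetical stand-ins for gcc's machine modes. */
enum mode { MODE_NONE, MODE_SI, MODE_DI };

/* Stand-in for GET_MODE_SIZE: size of a mode in bytes. */
static int
mode_size(enum mode m)
{
    switch (m) {
    case MODE_SI:
        return 4;
    case MODE_DI:
        return 8;
    default:
        return 0;
    }
}

/*
 * Mirrors the original macro: the size always comes from the caller's
 * mode, even when the type supplies a different one.
 */
int
words_original(enum mode mode, enum mode type_mode)
{
    (void)type_mode;
    return (mode_size(mode) + 3) / 4;
}

/* Uses the mode derived from the type whenever a type is available. */
int
words_fixed(enum mode mode, enum mode type_mode)
{
    enum mode effective = (type_mode != MODE_NONE) ? type_mode : mode;

    return (mode_size(effective) + 3) / 4;
}
```

When both modes agree, or when no type is available, the two functions return the same count. With words_original(MODE_SI, MODE_DI) the counter advances by one word, while words_fixed(MODE_SI, MODE_DI) advances it by two.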
I eventually rewrote this macro as a function in my tree, as it had more
bugs and needed to follow the FUNCTION_ARG logic more closely,
which would make it too big to be worth keeping as a macro:
/* Update the summarizer variable CUM to advance past an argument in
the argument list. The values MODE, TYPE and NAMED describe that
argument. Once this is done, the variable CUM is suitable for
analyzing the *following* argument with `FUNCTION_ARG', etc. (TYPE
is null for libcalls where that information may not be available.) */
void
m88k_function_arg_advance (args_so_far, mode, type, named)
     CUMULATIVE_ARGS *args_so_far;
     enum machine_mode mode;
     tree type;
     int named;
{
  int bytes;

  if ((type != 0) &&
      (TREE_CODE(type) == RECORD_TYPE || TREE_CODE(type) == UNION_TYPE))
    mode = BLKmode;

  bytes = (mode != BLKmode) ? GET_MODE_SIZE (mode) : int_size_in_bytes(type);

  if ((*args_so_far & 1) && (mode == DImode || mode == DFmode
      || ((type != 0) && TYPE_ALIGN (type) > BITS_PER_WORD)))
    (*args_so_far)++;

  (*args_so_far) += (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
}
