Wednesday, August 26, 2009

Enumeration constant in Macro definition

We use symbolic constants to get rid of magic numbers, that is, hard-coded values in programs. There are three ways to define symbolic constants:
  1. Macros
  2. Enumeration constants
  3. const objects
Of these three, common coding practice recommends using enumeration constants.
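As a rough illustration of the three forms, here is a minimal sketch in C. The names BUFFER_SIZE_MACRO, eBUFFER_SIZE, buffer_size_object and the two helper functions are made up for this example and do not come from any real code base.

```c
#include <stddef.h>

#define BUFFER_SIZE_MACRO 16               /* 1. macro: substituted by the preprocessor */
enum { eBUFFER_SIZE = 16 };                /* 2. enumeration constant: known to the compiler only */
static const int buffer_size_object = 16;  /* 3. const object: a read-only variable */

/* Returns the element count of an array sized with the enumeration constant. */
size_t buffer_capacity(void)
{
    int buffer[eBUFFER_SIZE];  /* an enum constant is an integer constant expression */
    return sizeof buffer / sizeof buffer[0];
}

/* Returns 1 when all three symbolic constants carry the same value. */
int constants_agree(void)
{
    return BUFFER_SIZE_MACRO == eBUFFER_SIZE
        && eBUFFER_SIZE == buffer_size_object;
}
```

All three give the same value to ordinary C code. What differs is who gets to see the value: only the macro is visible to the preprocessor.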

Refer to symbolic constants for the details on the above three methods.

The above is just a prerequisite for further reading.

To port GCC (GNU Compiler Collection, a compiler generation framework) to a new architecture, we have to feed information about that architecture into GCC. This information is supplied in the form of a machine description, which consists of macros and a LISP-like language for describing the new architecture.

Some of these macros describe the registers of the new architecture. Each such macro is defined as the corresponding register number.

For example,

#define FIRST_PSEUDO_REGISTER 32

The macro FIRST_PSEUDO_REGISTER defines the register number given to the first pseudo (scratch) register. Pseudo register numbers must start right after the number of the last actual register in the architecture. Suppose the architecture has 32 actual registers. They are numbered 0 to 31, so the pseudo registers are numbered from 32 onwards.

Hence the macro FIRST_PSEUDO_REGISTER is defined as the total number of actual registers.
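The numbering rule can be sketched as follows. The helpers is_actual_register and pseudo_register_number are hypothetical names used only for illustration; they are not part of GCC.

```c
#define FIRST_PSEUDO_REGISTER 32  /* actual registers are numbered 0..31 */

/* Hypothetical helper: nonzero when regno names an actual register. */
int is_actual_register(int regno)
{
    return regno >= 0 && regno < FIRST_PSEUDO_REGISTER;
}

/* Hypothetical helper: number given to the n-th pseudo register (n starts at 0). */
int pseudo_register_number(int n)
{
    return FIRST_PSEUDO_REGISTER + n;
}
```

Here register 31 is the last actual register and the first pseudo register gets number 32.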

Then there is another macro,

#define STACK_POINTER_REGNUM 30

The macro STACK_POINTER_REGNUM defines the number of the register used as the stack pointer.

There are other similar macros, and each of them has to be defined with a register number. Our thought was: instead of using magic numbers for the registers, why not have an enumeration that defines a constant for every register in the architecture? Then we could simply use the enumeration constant names in place of the numbers.

And we did as follows:

#define FIRST_PSEUDO_REGISTER eTOTAL_REGISTERS_KNOWN_TO_COMPILER

#define STACK_POINTER_REGNUM eR7


Do you think this is safe? Can you see any problem in doing so?

Admittedly, we did not see any problem when we made the change. But the compiler build failed.

One thing we had not considered about these macros is that the preprocessor also works with the numbers they define. The macros are used in preprocessor directives, so their values must be visible at preprocessing time. If we use enumeration constant names as shown above, the preprocessor cannot see them as numbers.

For example, consider the code below with preprocessor directives:

#if FIRST_PSEUDO_REGISTER
// do something
#endif

With the macro 'FIRST_PSEUDO_REGISTER' defined as the enum constant 'eTOTAL_REGISTERS_KNOWN_TO_COMPILER', the preprocessor effectively evaluates:

#if eTOTAL_REGISTERS_KNOWN_TO_COMPILER
// do something
#endif

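The #if ... #endif preprocessor directive conditionally includes source code at compile time. The preprocessor expects an integer constant expression, but here it gets the identifier 'eTOTAL_REGISTERS_KNOWN_TO_COMPILER', which means nothing to it because enumerations are only processed later, by the compiler proper. In an #if expression, any identifier that is not a macro is replaced with 0, so the condition is false and the source code inside #if ... #endif is silently left out of the compilation. This was the cause of the build failure.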
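The effect can be reproduced with a small stand-alone sketch. The eight-register enumeration, the SEEN_BY_PREPROCESSOR macro and the two functions below are assumptions made for this example; only FIRST_PSEUDO_REGISTER and eTOTAL_REGISTERS_KNOWN_TO_COMPILER are names taken from the post.

```c
/* Hypothetical register enumeration for a machine with eight registers. */
enum machine_register {
    eR0, eR1, eR2, eR3, eR4, eR5, eR6, eR7,
    eTOTAL_REGISTERS_KNOWN_TO_COMPILER  /* has the value 8 */
};

#define FIRST_PSEUDO_REGISTER eTOTAL_REGISTERS_KNOWN_TO_COMPILER

/* The preprocessor knows nothing about enums: after macro expansion the
   remaining identifier is replaced with 0, so the #else branch is chosen. */
#if FIRST_PSEUDO_REGISTER
#define SEEN_BY_PREPROCESSOR 1
#else
#define SEEN_BY_PREPROCESSOR 0
#endif

/* In ordinary C code the macro expands to the enum constant, i.e. 8. */
int value_seen_by_compiler(void)
{
    return FIRST_PSEUDO_REGISTER;
}

/* Reports which branch of the #if above was actually compiled. */
int branch_taken_by_preprocessor(void)
{
    return SEEN_BY_PREPROCESSOR;
}
```

The same macro is 8 for the compiler and 0 for the preprocessor, and by default no diagnostic is issued. GCC's -Wundef option warns when an undefined identifier is evaluated in an #if directive, which can help catch this kind of mistake.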
