On Mon, Apr 16, 2007 at 11:14:27AM +0200, Philippe Lhoste wrote:
> Really? I am not a compiler expert, far from it, but as a C programmer I
> thought preprocessing phase takes place before anything else.
The C standard (in section 5.1.1.2) breaks translation into 8 distinct
phases. The early phases are probably a lot more fine-grained than
you're used to thinking about. Paraphrased:
1) trigraphs are replaced, and multibyte source characters are mapped
to the source character set
2) lines ending with a backslash are spliced onto the following line
3) the source code is converted to a sequence of preprocessing tokens,
and each comment is replaced by a single space
4) preprocessing directives are executed and macros are expanded;
included files are read and run through phases 1-4
5) escape sequences in character constants and string literals are
converted
6) adjacent string literals are concatenated
7) preprocessing tokens are converted to tokens, which are compiled as
a translation unit
8) linking
Of course most compilers actually implement multiple phases at once,
but the important thing is that the compiler must produce results
consistent with the phases being done separately and in the above
order.
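
To make the ordering requirement concrete, here is a minimal sketch of
phases 2 and 3 written as separate passes. It is an illustration only,
not a conforming implementation: the function names are made up for
this example, and the comment stripper knows about string and character
literals but nothing else about tokenization.

```python
import re

def splice_lines(src):
    """Phase 2: delete every backslash that is immediately followed by a newline."""
    return src.replace("\\\n", "")

# String and character literals are matched first so that comment
# markers inside them are left alone.
_LITERAL_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\\n])*"'     # string literal, kept as-is
    r"|'(?:\\.|[^'\\\n])*'"    # character literal, kept as-is
    r"|//[^\n]*"               # line comment
    r"|/\*.*?\*/",             # block comment
    re.DOTALL,
)

def strip_comments(src):
    """Part of phase 3: replace each comment with a single space."""
    def keep_or_blank(match):
        text = match.group(0)
        return " " if text.startswith("/") else text
    return _LITERAL_OR_COMMENT.sub(keep_or_blank, src)

def phases_2_then_3(src):
    """The order the standard requires: splice first, then strip comments."""
    return strip_comments(splice_lines(src))

def phases_3_then_2(src):
    """The wrong order, shown only for comparison."""
    return splice_lines(strip_comments(src))

# A line comment that ends with a backslash swallows the next line.
SOURCE = "int a; // note \\\nint b;\nint c;\n"

if __name__ == "__main__":
    print(repr(phases_2_then_3(SOURCE)))
    print(repr(phases_3_then_2(SOURCE)))
```

In the required order the declaration of b becomes part of the comment
and disappears; in the swapped order it survives. A compiler that merges
the two phases still has to produce the first result.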
The problem is that when you follow the grammar, phase 3 ends up
needing a little bit of knowledge from phase 4. It's pretty subtle,
and not the sort of thing you notice until you actually try to
implement them separately.
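
The message above does not say which case is meant. One instance that
is certainly in the standard is header names: section 6.4 says that
header-name preprocessing tokens are recognized only within #include
directives, so the phase 3 tokenizer has to know it is looking at a
directive that is not executed until phase 4. The sketch below assumes
that is the kind of dependency in question. The token patterns are
deliberately crude and the name pp_tokens is hypothetical.

```python
import re

# Very rough preprocessing-token pattern: identifiers, numbers, string
# literals, then any single non-space character as a punctuator.
_PP_TOKEN = re.compile(r'[A-Za-z_]\w*|\d+|"[^"\n]*"|\S')

# Header names: <...> or "...", only tried in #include context.
_HEADER_NAME = re.compile(r'<[^>\n]+>|"[^"\n]+"')

def pp_tokens(line):
    """Split one logical source line into preprocessing tokens.

    The tokenizer peeks at what it has produced so far: only when the
    line begins with '#' 'include' does it try to read a header name.
    """
    tokens = []
    pos = 0
    while pos < len(line):
        if line[pos].isspace():
            pos += 1
            continue
        in_include = tokens == ["#", "include"]
        match = _HEADER_NAME.match(line, pos) if in_include else None
        if match is None:
            match = _PP_TOKEN.match(line, pos)
        if match is None:
            # Defensive fallback: treat the character as a punctuator.
            tokens.append(line[pos])
            pos += 1
            continue
        tokens.append(match.group(0))
        pos = match.end()
    return tokens

if __name__ == "__main__":
    print(pp_tokens("#include <stdio.h>"))
    print(pp_tokens("x = a <stdio.h> b;"))
```

The same characters come out as one token after #include and as seven
tokens (<, stdio, ., h, > and so on) in an ordinary expression, which is
why a tokenizer with no knowledge of directives gets this wrong.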