Re: [OT] C parsing, was: Re: [Announce] Alpha release of a Lua debugger

lua-l archive


Subject: Re: [OT] C parsing, was: Re: [Announce] Alpha release of a Lua debugger
From: David Jones <drj@...>
Date: Wed, 18 Apr 2007 13:14:30 +0100


On 18 Apr 2007, at 09:49, Dave Dodge wrote:

> On Wed, Apr 18, 2007 at 09:54:39AM +0100, David Jones wrote:
>
> > Phase 3 does not need a little bit of knowledge from Phase 4.
>
> Footnote 6 (admittedly non-normative, but read on) seems to explicitly
> state that it does.
>
> > The problem you referred to earlier was that of differentiating the
> > string "foo\nar" from the included file "foo\nar".  In stage 3 there
> > is no distinction, it's all just pp-tokens.  You can create a problem
> > for yourself if you decide that your frontmost lexer can distinguish
> > strings from included files, but really the C standard says that
> > strings don't become strings as we know them until stage 5.
>
> Right, one problem is if you're trying to categorize the pp-tokens
> before passing them to phase 4.  The sequence
>
>   "foo"
>
> is ambiguously either a header-name or string-literal when you don't
> have the phase 4 context available.  As you say, one approach is to
> just consider it a generic pp-token and figure it out later.
>
> But there's a more difficult case I forgot about:
>
>   <x>
>
> which can be pp-tokenized two very different ways:
>
>   punctuator identifier punctuator
>   header-name
>
> Choosing the correct pp-token sequence here does require phase 4
> context.

No, it just requires context. I don't have to have done any macro replacements or conditional preprocessing, or file inclusion (in other words I don't have to do any of the things that are actually in phase 4). I simply have to know whether I could be in a control-line #include reduction or not. This means that you can't naively separate your preprocessor lexer from the parser. But in fact the amount of context you have to maintain to decide whether the next preprocessor-token is a header-name is next to trivial.
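
To illustrate how little context that is, here is a rough sketch in C. It is not taken from any real preprocessor: the names pp_ctx and pp_kinds and the one-letter token kinds are made up for this example, and the tokenizer is deliberately crude (one logical line at a time, no comments, every punctuator one character long, no proper pp-numbers). The only state carried is whether the tokens so far on the line were "#" followed by "include".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context states: everything the lexer has to remember. */
enum pp_ctx { LINE_START, AFTER_HASH, AFTER_INCLUDE, OTHER };

/* Tokenize one logical line and write one kind letter per pp-token to out:
 *   H header-name, S string-literal, I identifier, P punctuator, O other.
 * Simplified on purpose; returns the number of tokens written. */
size_t pp_kinds(const char *s, char *out, size_t outsz)
{
    enum pp_ctx ctx = LINE_START;
    size_t n = 0;

    assert(out != NULL && outsz > 0);
    while (*s != '\0' && n + 1 < outsz) {
        if (*s == ' ' || *s == '\t') { s++; continue; }

        const char *start = s;
        const char *end;
        char kind;

        if (ctx == AFTER_INCLUDE && (*s == '<' || *s == '"') &&
            (end = strchr(s + 1, *s == '<' ? '>' : '"')) != NULL) {
            /* Only directly after "# include" may a header-name appear. */
            kind = 'H';
            s = end + 1;
        } else if (*s == '"') {
            /* String literal: scan to the closing quote, skipping escapes. */
            s++;
            while (*s != '\0' && *s != '"') {
                if (*s == '\\' && s[1] != '\0') s++;
                s++;
            }
            if (*s == '"') s++;
            kind = 'S';
        } else if (isalpha((unsigned char)*s) || *s == '_') {
            while (isalnum((unsigned char)*s) || *s == '_') s++;
            kind = 'I';
        } else if (ispunct((unsigned char)*s)) {
            s++;
            kind = 'P';
        } else {
            s++;
            kind = 'O';
        }

        /* Update the context; this is the whole of the "parser" involved. */
        if (ctx == LINE_START && kind == 'P' && *start == '#')
            ctx = AFTER_HASH;
        else if (ctx == AFTER_HASH && kind == 'I' &&
                 (size_t)(s - start) == 7 && memcmp(start, "include", 7) == 0)
            ctx = AFTER_INCLUDE;
        else
            ctx = OTHER;

        out[n++] = kind;
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, pp_kinds("<x>", buf, sizeof buf) fills buf with "PIP", while pp_kinds("#include <x>", buf, sizeof buf) gives "PIH". No macro replacement, conditional processing or file inclusion has happened in either case.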

I'm not sure what you mean when you say "phase 4 context". The C standard says that a program is decomposed into preprocessing tokens. 6.4p4 observes an ambiguity in the grammar and specifies how it is resolved.

When phase 3 says that "The source file is decomposed into preprocessing tokens" it doesn't say how this is done, and we can see (from 6.4p4) that we can't do this using a traditional context-free lexer. So don't do that then.

It seems that you're trying to take shortcuts in analysing a C program strictly according to the grammar specified in the standard, and then complaining that it's tricky. I agree.


Cheers,
 drj
