[Neko] Compiler used for neko
lee.mccoll at lyons-group.co.uk
Tue Mar 7 15:17:34 CET 2006
Skaller, you seem to know a lot about parsers and compilers... Where do
you suggest I look for more info on the subject? I've got myself a copy
of 'The Dragon Book', but I need more than this. Any language platform
Thanks in advance,
From: neko-bounces at lists.motion-twin.com
[mailto:neko-bounces at lists.motion-twin.com] On Behalf Of skaller
Sent: 07 March 2006 14:08
To: Neko intermediate language mailing list
Subject: Re: [Neko] Compiler used for neko
On Tue, 2006-03-07 at 12:47 +0100, Nicolas Cannasse wrote:
> > Not clear how that works! For LL(1) this requires the
> > head token of the stream pattern to be unique?
> For NekoML stream parsers, it works as follows:
> - if it's a token pattern, we check the match and increment the counter
> of tokens read
> - if the token pattern match fails, we reset the counter to 0 and go
> on to the next rule
> - if it's a sub-rule (in fact a function call), then we junk the
> tokens before entering it
> - if no rule matches, an exception is raised
> - if we catch an exception in a rule set, it is an error only if some
> tokens were junked; otherwise it just means the rule didn't match
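The quoted rules can be sketched in Python. This is only an illustration of the described behaviour, not NekoML's actual implementation; all names (Stream, run_rules, NoMatch, ParseError) are made up for the sketch:

```python
class NoMatch(Exception): pass      # rule didn't match; no tokens junked yet
class ParseError(Exception): pass   # tokens were junked: a real syntax error

class Stream:
    """Destructive token stream with a lookahead buffer.
    Junked tokens are discarded for good -- there is no rewinding past them."""
    def __init__(self, tokens):
        self.it = iter(tokens)
        self.buf = []   # tokens read ahead but not yet junked
        self.pos = 0    # the "tokens read" counter for the current rule

    def peek(self):
        if self.pos == len(self.buf):
            self.buf.append(next(self.it, None))
        return self.buf[self.pos]

    def advance(self):
        self.pos += 1            # increment the counter of tokens read

    def reset(self):
        self.pos = 0             # failed before any junking: try the next rule

    def junk(self):
        del self.buf[:self.pos]  # commit matched tokens; they can't be replayed
        self.pos = 0

def run_rules(s, rules):
    """Try each rule in order.  A rule is a list of ('tok', value) and
    ('sub', callable) items; entering a sub-rule junks the tokens matched
    so far -- the cut point discussed below."""
    for rule in rules:
        s.reset()
        junked = False
        try:
            out = []
            for kind, arg in rule:
                if kind == 'tok':
                    if s.peek() != arg:
                        raise NoMatch()
                    s.advance()
                    out.append(arg)
                else:                     # sub-rule: junk, then recurse
                    s.junk()
                    junked = True
                    out.append(arg(s))
            s.junk()
            return out
        except NoMatch:
            if junked:
                raise ParseError()        # error only if tokens were junked
    raise NoMatch()                       # no rule matched, nothing consumed

# tiny grammar: expr ::= '(' expr ')' | 'num'
def expr(s):
    return run_rules(s, [
        [('tok', '('), ('sub', expr), ('tok', ')')],
        [('tok', 'num')],
    ])
```

With this sketch, `expr(Stream(['(', 'num', ')']))` succeeds, while `expr(Stream(['(', 'x']))` raises ParseError rather than NoMatch, because the '(' was already junked before the sub-rule was entered.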
Ah. OK. So basically you have arbitrary lookahead with
a pattern, up to a recursion point (subroutine call).
If the match makes it that far, it becomes a cut point,
and there's no going back: if the subrule match fails,
the outermost enclosing match also fails.
But you have at least junked some tokens at that point,
so the syntax error is not reported at the start of the stream.
However this form of stream parsing is quite weak.
[It can't even parse regular expressions]
There is a very nice way to enhance it you may wish
to think about: GLL. Generalised LL parsing.
This works the same as GLR parsers. Whenever there are
choices you follow all of them, and you junk each
token in turn. There is no backtracking at all,
you just follow all the alternatives simultaneously
and drop the ones that fail as you go. This design
is perfect for destructive streams. AFAIK it can
parse all context-free languages.
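In the same spirit, here is a minimal breadth-first recognizer in Python: every live alternative is kept as a "thread" of symbols still to match, each token is junked exactly once, and failing threads are simply dropped. This is only the GLL idea in miniature, not a full GLL parser (a naive sketch like this loops on left-recursive grammars; real GLL uses a graph-structured stack to handle them):

```python
def recognize(grammar, start, tokens):
    """Follow all alternatives simultaneously; no backtracking.
    grammar maps a nonterminal to a list of productions (symbol lists)."""
    def expand(threads):
        # rewrite leading nonterminals until each thread starts with a
        # terminal (or is empty)
        done = set()
        while threads:
            t = threads.pop()
            if not t or t[0] not in grammar:
                done.add(t)
            else:
                for prod in grammar[t[0]]:
                    threads.add(tuple(prod) + t[1:])
        return done

    threads = expand({(start,)})
    for tok in tokens:
        # consume the token in every thread at once; failures die here
        threads = expand({t[1:] for t in threads if t and t[0] == tok})
        if not threads:
            return False        # every alternative has failed
    return () in threads        # some thread consumed exactly the input

# balanced parens around an 'x':  S ::= '(' S ')' | 'x'
GRAMMAR = {'S': [['(', 'S', ')'], ['x']]}
```

Note how the token loop touches each input token exactly once, which is what makes the approach workable on very long streams.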
For very long or continuous stream parsing there
is no other way! Backtracking over millions of
tokens (e.g. with a large XML document) isn't an option.
On the other hand left factoring isn't an option
if you want user extensible grammars.
Executable GLL is hard to do in a compiled language,
because you need your recursive descents to be control
inverted. But this is easy with an interpreter.
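In a language with coroutines you get the control inversion almost for free. A Python illustration (not how any compiled implementation would do it): generators turn a recursive descent into a resumable parser that suspends whenever it needs a token, so a driver can push tokens in and could step many such parsers in lockstep:

```python
def parens():
    """Recursive descent for '(' * n 'x' ')' * n, written control-inverted:
    instead of pulling tokens from a stream, the parser suspends (yield)
    each time it needs one."""
    tok = yield                       # suspend until the next token arrives
    if tok == 'x':
        return True
    if tok != '(':
        return False
    if not (yield from parens()):     # recursive descent, still resumable
        return False
    tok = yield
    return tok == ')'

def drive(parser, tokens):
    """Push tokens into an inverted parser one at a time.
    Returns as soon as the parser finishes; in this sketch any trailing
    tokens are ignored."""
    next(parser)                      # run to the first suspension
    try:
        for t in tokens:
            parser.send(t)
        return False                  # input ran out mid-parse
    except StopIteration as stop:
        return bool(stop.value)       # the parser's return value
```

The inverted shape is the point: the parser never blocks on input, so the driver stays in control of the stream.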
John Skaller <skaller at users dot sourceforge dot net>
Async PL, Realtime software consultants
Checkout Felix: http://felix.sourceforge.net