Thursday, September 23, 2010

Parsers and grammars

A friend expressed an interest in an improved version of yacc. I think this fits the bill:
    http://en.wikipedia.org/wiki/Lemon_Parser
    http://hwaci.com/sw/lemon/lemon.html

It's always re-entrant and thread-safe, with some other advantages and all the best lessons of
yacc (e.g. error-handling).

Anyway, I'm nearly sold on PEG (Parsing Expression Grammars).
    http://en.wikipedia.org/wiki/Parsing_expression_grammar
    http://piumarta.com/software/peg/peg.1.html

Unlike LR parsers, a PEG parser does not require a separate tokenizer.  It's very elegant, much like BNF.  I want to do some runtime comparisons, but I expect it to be comparable.  I will try to attach a pegleg grammar that I wrote to solve this (with the added constraint of left-associativity):
    https://www.spoj.pl/problems/ONP/
Just install peg/leg, run 'leg' on the attached file (if I ever manage to post it to this blog), and compile.

Here is an interesting table:
    http://en.wikipedia.org/wiki/Comparison_of_parser_generators

To learn a ton about PEG:
    http://pdos.csail.mit.edu/~baford/packrat/

In a nutshell, a Context-Free Grammar parser like
yacc is meant for natural language, based on Chomsky's work, not for an unambiguous computer language.  PEG simplifies the parser by eliminating ambiguity in the grammar.

No comments:

Post a Comment