|
Overview
The Gardens Point Parser Generator (gppg) is a Yacc/Bison like parser generator.
Both the parser generator and the parser runtime components are implemented entirely
in C#. They make extensive use of generic collection classes and so require
Version 2.0 of the .NET Framework. The parser generator takes a Bison/Yacc
style grammar specification with semantic actions coded in C# and produces an LALR(1)
parser. It is designed to be simple, efficient and as similar as possible to Yacc/Bison
in functionality. It does not include a scanner/lexical analyser generator.
The parser's interfaces however, are designed to be as simple as possible with minimal
object-orientation.
Parser Interfaces
The parser requires a scanner to be provided that implements the following interface
public abstract class IScanner<SemanticValueType>
{
// scan for the next token from the input stream
public abstract int yylex();
// for reporting errors during parsing
public abstract void yyerror(string format, params object[] args);
// for passing semantic values to the parser
public SemanticValueType yylval;
}
where SemanticValueType is a "union" type produced by the parser generator.
What does the Parser Generator Produce?
The parser generator produces a C# source file containing:
- a type declaring symbolic tokens:
public enum Tokens {error=127, EOF=128, ... };
- a type for the "union" type specified in the grammar:
public struct SemanticValueType { ... };
- and a class that implements the parser:
public class Parser: ShiftReduceParser<SemanticValueType>
{
...
}
The base class implements the parse method:
public bool Parse();
which returns true iff the parse is successful.
Using the Generated Parser
The main program/compiler should create an instance of class Parser, initialize
its scanner property to an object that implements IScanner and then call
the parser's parse method:
static public void Main(string[] args)
{
Parser parser = new Parser();
parser.scanner = new MyScanner();
bool SyntaxOK = parser.Parse();
...
}
Implementing yylex
The yylex method of the scanner returns an integer which should either
be the "ASCII" value of a character literal used in the grammar, or the integer
value of one of the symbolic tokens declared in the Tokens type (the
symbolic constants in the Tokens type are assigned integer values in
such a way that they don't clash with the character literals used in the grammar).
The parser generator does not currently support string literals in the grammar
specification. Tokens.EOF should be returned on end of file.
See the samples directory for a simple example.
Invoking the Parser Generator
The parser generator can be invoked on the command line and takes the name of the
grammar specification file as input. The source code output is directed to standard
output, so users should typically redirect it to an appropriate output file. Eg:
gppg.exe ruby.y > parser.cs
Compiling the Generated Parser
The generated parser file can then be compiled either on the command line or as
part of a Visual Studio 2005 project. A reference to ShiftReduceParser.dll should
be included to provide access to parser's other runtime components. Eg:
csc /reference:ShiftReduceParser.dll parser.cs scanner.cs main.cs
Debugging
Runtime tracing of the shift/reduce actions performed by the parser can be turned on
by setting the property Trace of the parser to true. The -report
option of the parser generator causes it to output details of the LALR item states
that were generated. Type gppg /help for other command line options.
Bug Reports
Please send bug reports, comments, suggestions or requests for change to
w.kelly@qut.edu.au.
Implementing the Semantic Value "Union" Type
Yacc and Bison allow a union type to be declared (in the %union directive)
that is used for associating different types of semantic values with the symbols in
the derivation tree. C# unfortunately, doesn't directly support C style union data
structures. The parser generator takes whatever C# code exists within the %union
directive and copies it directly into the body of a struct called
SemanticValueType. So, for example:
%union
{
public int i;
public string name;
}
would be translated into:
public struct SemanticValueType
{
public int i;
public string name;
}
This relatively naive translation will often mimic the intended union semantics, albeit
with with a little wasted storage.
Programmers can, however, if they wish implement a set of properties that more accurately
reflects the original union semantics:
%union
{
private object val;
public int i
{
get { return (int) val; }
set { val = value; }
}
public string name
{
get { return (string) name; }
set { val = value; }
}
}
Acknowledgements
The parser generator was produced as part of Microsoft Research sponsored project to
build a Ruby compiler for the .NET CLR. The Ruby compiler is still in development and will
be released under a similar licensing agreement.
I would also like to thank Malcolm Crowe for inspiring this project by making his CSTools
parser generator publicly available.
Source code contributions and bug fixes contributed by the following people:
- Stefan Rusek (directives to control namespaces, visibility and class names)
Licensing
Copyright 2005 Queensland University of Technology (QUT). All rights reserved.
Redistribution and use in source and binary forms, with or without modification are permitted
provided that the following conditions are met:
- Redistribution of source code must retain the above copyright notice, this list of
conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list
of conditions and the following disclaimer in the documentation and/or other materials
with the distribution.
This software is provided by the GPPG project "as is" and any express or implied warranties,
including, but not limited to the implied warranties of merchantability and fitness for a
particular purpose are hereby disclaimed. In no event shall the GPPG project or QUT be liable
for any direct, indirect, incidental, special, exemplary, or consequential damages (including,
but not limited to procurement of substitute goods or services; loss of use, data, or profits;
or business interruption) however caused and on any theory of liability, whether in contract,
strict liability, or tort (including negligence or otherwise) arising in any way out of the
use of this software, even if advised of the possibility of such damage.
|