Welcome to the
Generic Interpreter

A Java-based interpreter of context-free languages with user-defined semantics.

Generic Interpreter 1.3
Protected API

Home: http://www.csupomona.edu/~carich/gi/
Version: Generic Interpreter 1.3
Requires: JDK 1.5 or higher
Author: © 1999-2009 Craig A. Rich <carich@csupomona.edu>
Download: gi-1.3.tar.gz (source, classes, documentation, examples)
          gi-1.3.zip (source, classes, documentation, examples)
          gi-1.3.jar (classes only)
Archived Versions
Example Instances
Public API  (language user's view)
Protected API  (language designer's view)
Private API  (internal view)

The Generic Interpreter is used to produce compact standalone or embedded interpreters and compilers written in Java. It is unique in producing "just-in-time" interpreters with respect to source stream, language specification, and parsing method. Unlike most compiler construction tools, nothing is preconstructed or generated. A Language is specified on the fly, and source streams written in the language can be interpreted at any time, even as the language evolves. Language specifications are analyzed only when source streams are read, and only to the extent the source stream and parsing method require. The Generic Interpreter adapts to changes in language and parsing method, so it is reasonable to treat the language specification and parsing method as variables (to what end, I'm not yet sure ;). The Generic Interpreter has a small footprint: its class archive is under 50K, it uses under 500K of heap memory, and it generates no internal garbage. If you find an interesting use for the Generic Interpreter, I'd like to see it.

A language is specified by a Grammar (mapping nonterminals to phrases containing nonterminals, terminals, and Grammar.Semantics) extending a Lexicon (mapping terminals to expressions denoting regular sets of words). A source character stream is interpreted by grabbing words into terminals according to the Lexicon, and parsing terminals into phrases according to the Grammar. Specifically, Lexicon.grab grabs the longest word matching an expression in the Lexicon, yielding the terminal that matches the word. Grammar.interpret parses successive terminals, yielding a Grammar.ParseTree and evaluating embedded Grammar.Semantics as productions are applied. A Lexicon.Exception is thrown when an I/O, lexical or syntax error first occurs.
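The longest-word rule described above can be illustrated with a minimal sketch built on java.util.regex. It is not the Generic Interpreter's Lexicon class: the class name LongestMatchSketch, its put and grab signatures, the terminal names, and the tie-breaking rule (the terminal put first wins when two words have equal length) are all assumptions made for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a lexicon; NOT the Generic Interpreter's Lexicon API.
class LongestMatchSketch {
    // Terminal name -> expression denoting its regular set of words (insertion order kept).
    private final Map<String, Pattern> expressions = new LinkedHashMap<String, Pattern>();

    void put(String terminal, String regex) {
        expressions.put(terminal, Pattern.compile(regex));
    }

    // Returns {terminal, word} for the longest nonempty word at the start of source,
    // or null when no expression matches there.
    String[] grab(CharSequence source) {
        String bestTerminal = null;
        String bestWord = "";
        for (Map.Entry<String, Pattern> entry : expressions.entrySet()) {
            Matcher matcher = entry.getValue().matcher(source);
            // lookingAt() anchors the match at the start of the input only.
            if (matcher.lookingAt() && matcher.end() > bestWord.length()) {
                bestTerminal = entry.getKey();
                bestWord = matcher.group();
            }
        }
        return bestTerminal == null ? null : new String[] { bestTerminal, bestWord };
    }

    public static void main(String[] args) {
        LongestMatchSketch lexicon = new LongestMatchSketch();
        lexicon.put("INTEGER", "[0-9]+");
        lexicon.put("REAL", "[0-9]+\\.[0-9]+");
        String[] result = lexicon.grab("3.14+x");
        System.out.println(result[0] + " " + result[1]); // prints "REAL 3.14"
    }
}
```

Here INTEGER matches only "3" while REAL matches "3.14", so the longer word decides which terminal is yielded.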

Several parsing methods are supported: LL(1), LR(0), SLR(1) and LR(1). A parser extends a Grammar, can be constructed on demand around any Grammar in constant time, and adapts to changes in the underlying Grammar and Lexicon.
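One way to picture constant-time construction and adaptation to change is a parser object that only stores a reference to its grammar and defers all analysis until it is asked a question. The sketch below is an assumption-laden illustration, not the Generic Interpreter's implementation: every class, field, and method name is hypothetical, nullable-nonterminal analysis stands in for whatever analysis a parsing method actually needs, and the whole analysis is redone after any change, which is coarser than the "only to the extent required" behavior described above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical names throughout; NOT the Generic Interpreter's Grammar or parser API.
class OnDemandParserSketch {

    // A mutable grammar. Each production is stored as {head, symbol, symbol, ...}.
    static class GrammarSketch {
        final List<String[]> productions = new ArrayList<String[]>();
        int version = 0; // bumped on every change so views can notice it

        void put(String head, String... phrase) {
            String[] production = new String[phrase.length + 1];
            production[0] = head;
            System.arraycopy(phrase, 0, production, 1, phrase.length);
            productions.add(production);
            version++;
        }
    }

    // A parser "view": the constructor stores one reference, so it runs in constant time.
    static class ParserSketch {
        private final GrammarSketch grammar;
        private Set<String> nullable;      // filled in lazily
        private int analyzedVersion = -1;  // grammar version the cache belongs to

        ParserSketch(GrammarSketch grammar) {
            this.grammar = grammar;
        }

        // Analyzes the grammar only when asked, and again only if it changed since.
        boolean isNullable(String nonterminal) {
            if (nullable == null || analyzedVersion != grammar.version) {
                nullable = computeNullable();
                analyzedVersion = grammar.version;
            }
            return nullable.contains(nonterminal);
        }

        // Fixed-point iteration: a head is nullable if every symbol of some phrase is.
        private Set<String> computeNullable() {
            Set<String> result = new HashSet<String>();
            boolean changed = true;
            while (changed) {
                changed = false;
                for (String[] production : grammar.productions) {
                    if (result.contains(production[0])) continue;
                    boolean allNullable = true;
                    for (int i = 1; i < production.length; i++) {
                        if (!result.contains(production[i])) {
                            allNullable = false;
                            break;
                        }
                    }
                    if (allNullable) {
                        result.add(production[0]);
                        changed = true;
                    }
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        GrammarSketch grammar = new GrammarSketch();
        ParserSketch parser = new ParserSketch(grammar); // nothing analyzed yet
        grammar.put("List", "item", "List");
        System.out.println(parser.isNullable("List")); // prints "false"
        grammar.put("List");                           // add the empty phrase afterwards
        System.out.println(parser.isNullable("List")); // prints "true"
    }
}
```

The parser is built before the grammar has any productions, yet it answers correctly after each change because nothing was computed at construction time.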

No effort has been made in this version to determine whether the Grammar submits to the chosen parsing method without conflicts. During LL parsing, a conflict is resolved by choosing the applicable production most recently put in the Grammar. During LR parsing, a shift/reduce conflict is resolved by shifting, and a reduce/reduce conflict is resolved by choosing the applicable production most recently put in the Grammar. The class of context-free grammars that submit to LR(1) parsing without conflict is relatively large, and includes grammars used to build most programming languages.
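The conflict resolution rules above can be sketched in a few lines. This is only an illustration of the stated rules under simplifying assumptions, not the Generic Interpreter's parsing code: the class ConflictResolutionSketch, its methods, and the idea of tagging each production with the lookahead terminals on which it applies are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the tie-breaking rules; NOT the Generic Interpreter's API.
class ConflictResolutionSketch {

    static final class Production {
        final String text;
        final Set<String> lookaheads; // terminals on which this production applies

        Production(String text, String... lookaheads) {
            this.text = text;
            this.lookaheads = new HashSet<String>(Arrays.asList(lookaheads));
        }
    }

    // Productions are kept in the order they were put.
    private final List<Production> productions = new ArrayList<Production>();

    void put(String text, String... lookaheads) {
        productions.add(new Production(text, lookaheads));
    }

    // Among the productions applicable on the lookahead, the most recently put one wins.
    String chooseProduction(String lookahead) {
        for (int i = productions.size() - 1; i >= 0; i--) {
            Production production = productions.get(i);
            if (production.lookaheads.contains(lookahead)) {
                return production.text;
            }
        }
        return null; // no production applies
    }

    // LR-style choice: shifting beats reducing; otherwise fall back to the rule above.
    String chooseAction(boolean canShift, String lookahead) {
        if (canShift) {
            return "shift";
        }
        String reduction = chooseProduction(lookahead);
        return reduction == null ? "error" : "reduce " + reduction;
    }

    public static void main(String[] args) {
        ConflictResolutionSketch state = new ConflictResolutionSketch();
        state.put("A -> x", "$", "else");
        state.put("B -> x", "$");
        System.out.println(state.chooseAction(true, "else")); // prints "shift"
        System.out.println(state.chooseAction(false, "$"));   // prints "reduce B -> x"
    }
}
```

Preferring the shift is the same default that yacc-style tools use, and it attaches a dangling else to the nearest if. Because the latest production wins the remaining conflicts, the order in which productions are put becomes part of the language definition.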
