Generic Interpreter 1.2
Private API

gi
Class Lexicon

java.lang.Object
  extended by gi.Lexicon
Direct Known Subclasses:
Grammar

public class Lexicon
extends Object

This class implements a Lexicon.

Version:
1.2
Author:
© 1999-2004 Craig A. Rich <carich@csupomona.edu>

Nested Class Summary
protected static class Lexicon.Alphabet
          This class implements an Alphabet of character symbols.
protected static class Lexicon.Concatenation
          This class implements an Expression expressing the concatenation of two regular languages.
protected  class Lexicon.Exception
          This class implements an Exception.
protected static class Lexicon.Expression
          This class implements an Expression expressing a regular language.
protected static class Lexicon.Match
          This class implements an Alphabet containing some characters.
protected static class Lexicon.NonMatch
          This class implements an Alphabet containing all except some characters.
protected static class Lexicon.PosixClass
          This class implements an Alphabet containing the characters in a POSIX character class.
protected static class Lexicon.Range
          This class implements an Alphabet containing the characters in a range.
protected static class Lexicon.Repetition
          This class implements an Expression expressing the repetition of a regular language.
(package private) static class Lexicon.Set
          This class implements a Set.
protected static class Lexicon.Singleton
          This class implements an Expression expressing a singleton language.
(package private) static class Lexicon.Stack
          This class implements a Stack.
protected static class Lexicon.UnicodeCategory
          This class implements an Alphabet containing the characters in a Unicode general category.
protected static class Lexicon.Union
          This class implements an Expression expressing the union of two regular languages.
 
Field Summary
protected static Object $
          The terminal matched by the character at the end of a source stream.
private static Lexicon.Expression $_EXPRESSION
          The Alphabet containing the character at the end of a source stream.
protected  int debug
          The debug switches, initially zero.
private static Lexicon.Stack delta
          The transition relation of the lexical NFA.
private  Map E
          The mapping representing this Lexicon.
private  Map F
          The final states of the lexical NFA.
private  Lexicon.Set I
          The initial states of the lexical NFA.
protected static int LEXICAL
          debug switch constant enabling printing terminals and associated words grabbed during lexical analysis.
private static int QSize
          The number of lexical NFA states constructed.
private  Lexicon.Set[] R
          The states through which the lexical NFA transitions.
protected static int TERMINALS
          debug switch constant enabling printing the set of terminals before lexical analysis.
protected static int VERBOSE
          debug switch constant enabling all debugging.
private  StringBuffer w
          The StringBuffer containing the word most recently grabbed.
 
Constructor Summary
protected Lexicon()
          Constructs an empty Lexicon.
(package private) Lexicon(Lexicon lexicon)
          Constructs a Lexicon that is a shallow copy of lexicon.
 
Method Summary
private  Integer accept(Lexicon.Set S)
          Computes the current final state, if any, in the lexical NFA.
private static Lexicon.Set closure(Lexicon.Set S)
          Computes a reflexive transitive closure under empty transition using the lexical NFA.
protected static Lexicon.Expression expression(String ere)
          Creates an Expression by interpreting a POSIX extended regular expression (ERE), as used in egrep.
protected  Object grab(LineNumberReader source)
          Grabs a terminal from a source character stream using this Lexicon.
private  Lexicon.Set initial()
          Returns the initial states of the lexical NFA.
 Object interpret()
          Interprets the standard input stream using this Lexicon.
 Object interpret(File source)
          Interprets a source file using this Lexicon.
 Object interpret(InputStream source)
          Interprets a source byte stream using this Lexicon.
(package private)  Object interpret(LineNumberReader source)
          Repeatedly invokes grab(source) until the end of the source stream is reached.
 Object interpret(PipedWriter source)
          Interprets a source pipe using this Lexicon.
 Object interpret(Reader source)
          Interprets a source character stream using this Lexicon.
 Object interpret(String source)
          Interprets a source string using this Lexicon.
 void interpret(String[] arguments)
          Performs lexical analysis as directed by command-line arguments using this Lexicon.
private static void put(Integer s, Lexicon.Alphabet A, Integer r)
          Puts a transition into the lexical NFA.
protected  void put(Object a, Lexicon.Expression e)
          Puts a terminal and associated Expression into this Lexicon.
private static Integer s()
          Creates a new state in the lexical NFA.
(package private)  boolean terminal(Object a)
          Indicates whether a symbol is a terminal in this Lexicon.
private static Lexicon.Set transition(Lexicon.Set S, char a, Lexicon.Set R)
          Computes a transition using the lexical NFA.
protected  String word()
          Returns the word most recently grabbed using this Lexicon.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

$

protected static final Object $

The terminal matched by the character at the end of a source stream.

Since:
1.1, renames END_OF_SOURCE in version 1.0.

$_EXPRESSION

private static final Lexicon.Expression $_EXPRESSION

The Alphabet containing the character at the end of a source stream.


debug

protected int debug

The debug switches, initially zero. The following bits enable debugging to standard output:

0x01 = TERMINALS
Print the set of terminals before lexical analysis
0x02 = LEXICAL
Print terminals and associated words grabbed during lexical analysis
0x04 = FIRST_FOLLOW
Print first and follow sets precomputed during syntax analysis
0x08 = SYNTAX
Print parsing decisions made during syntax analysis
0x10 = CONFLICT
Print parsing conflicts encountered during syntax analysis
0x20 = PARSE_TREE
Print each ParseTree produced by syntax analysis

Since:
1.1
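
As an illustration of how these switches combine, the following minimal sketch uses local constants that copy the bit values listed above. It does not reference the gi classes, and the class name DebugSwitches and the helper enabled are hypothetical names introduced only for this example.

```java
final class DebugSwitches {
    // Bit values copied from the list above; these are local stand-ins,
    // not the fields of gi.Lexicon.
    static final int TERMINALS = 0x01;
    static final int LEXICAL = 0x02;
    static final int FIRST_FOLLOW = 0x04;
    static final int SYNTAX = 0x08;
    static final int CONFLICT = 0x10;
    static final int PARSE_TREE = 0x20;

    // Hypothetical helper: true if every bit of flag is set in debug.
    static boolean enabled(int debug, int flag) {
        return (debug & flag) == flag;
    }

    public static void main(String[] args) {
        int debug = 0;                 // initially zero: no debugging output
        debug |= TERMINALS | LEXICAL;  // switch on two kinds of output
        System.out.println(enabled(debug, LEXICAL));
        System.out.println(enabled(debug, SYNTAX));
    }
}
```

Because each switch occupies its own bit, switches can be OR-ed together and tested independently.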

delta

private static final Lexicon.Stack delta

The transition relation of the lexical NFA.


E

private final Map E

The mapping representing this Lexicon. A terminal is mapped to the initial state of the NFA constructed from the associated Expression.


F

private final Map F

The final states of the lexical NFA. A final state is mapped to the terminal it accepts in this Lexicon. When empty, the current final states need to be computed; they are computed only on demand, by initial().


I

private final Lexicon.Set I

The initial states of the lexical NFA. When empty, the current initial states need to be computed; they are computed only on demand, by initial().


LEXICAL

protected static final int LEXICAL

debug switch constant enabling printing terminals and associated words grabbed during lexical analysis.

Since:
1.1
See Also:
Constant Field Values

QSize

private static int QSize

The number of lexical NFA states constructed.


R

private final Lexicon.Set[] R

The states through which the lexical NFA transitions.


TERMINALS

protected static final int TERMINALS

debug switch constant enabling printing the set of terminals before lexical analysis.

Since:
1.1
See Also:
Constant Field Values

VERBOSE

protected static final int VERBOSE

debug switch constant enabling all debugging.

Since:
1.1
See Also:
Constant Field Values

w

private final StringBuffer w

The StringBuffer containing the word most recently grabbed.

Constructor Detail

Lexicon

protected Lexicon()

Constructs an empty Lexicon.


Lexicon

Lexicon(Lexicon lexicon)

Constructs a Lexicon that is a shallow copy of lexicon. The fields of the new Lexicon refer to the same objects as those in lexicon.

Parameters:
lexicon - the Lexicon to copy.
Method Detail

accept

private Integer accept(Lexicon.Set S)

Computes the current final state, if any, in the lexical NFA.

Parameters:
S - the current states.
Returns:
the maximum final state in S. Returns null if S contains no final states.
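
The selection rule described here can be sketched as follows. This is an assumption-laden illustration rather than the library's code: states are modeled as Integer values, the final states as a plain java.util.Map from state to terminal, and the class name AcceptSketch is hypothetical.

```java
import java.util.Map;
import java.util.Set;

final class AcceptSketch {
    // Returns the largest state in s that is a key of finals,
    // or null if s contains no final state.
    static Integer accept(Set<Integer> s, Map<Integer, String> finals) {
        Integer best = null;
        for (Integer state : s) {
            if (finals.containsKey(state) && (best == null || state > best)) {
                best = state;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<Integer, String> finals = Map.of(3, "ID", 7, "IF");
        System.out.println(accept(Set.of(1, 3, 7, 9), finals));
        System.out.println(accept(Set.of(1, 2), finals));
    }
}
```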

closure

private static Lexicon.Set closure(Lexicon.Set S)

Computes a reflexive transitive closure under empty transition using the lexical NFA. The closure is computed in place by a breadth-first search expanding S.

Parameters:
S - the states whose reflexive transitive closure is computed under empty transition.
Returns:
the reflexive transitive closure of S under empty transition.
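
A breadth-first, in-place closure of this kind might look like the sketch below. It is only an illustration: the empty transitions are modeled as a java.util.Map from a state to the states reachable on no input, which is an assumption and not the representation used by delta, and the class name EpsilonClosure is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class EpsilonClosure {
    // Expands s in place to its reflexive transitive closure under the
    // empty transitions in epsilon, and returns the same set object.
    static Set<Integer> closure(Set<Integer> s, Map<Integer, List<Integer>> epsilon) {
        Deque<Integer> queue = new ArrayDeque<>(s);  // breadth-first frontier
        while (!queue.isEmpty()) {
            Integer state = queue.remove();
            for (Integer next : epsilon.getOrDefault(state, List.of())) {
                if (s.add(next)) {   // true only for a newly discovered state
                    queue.add(next);
                }
            }
        }
        return s;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> epsilon = Map.of(0, List.of(1), 1, List.of(2, 0));
        Set<Integer> s = new TreeSet<>(Set.of(0));
        System.out.println(closure(s, epsilon));  // prints [0, 1, 2]
    }
}
```

The closure is reflexive because the starting states are never removed from s, and the add check keeps cycles of empty transitions from looping forever.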

expression

protected static Lexicon.Expression expression(String ere)
                                        throws Lexicon.Exception

Creates an Expression by interpreting a POSIX extended regular expression (ERE), as used in egrep. The syntax and semantics of EREs are formally specified by the ERE Grammar. This method is a convenient way to construct an Expression, at the cost of an LR(1) parse. Implementations seeking maximum speed should avoid this method and use explicit Expression subclass constructors; for example,

new Union(new NonMatch("0"), new Singleton("foo"))
instead of
expression("[^0]|foo")

Parameters:
ere - the POSIX extended regular expression (ERE) to interpret.
Returns:
the Expression constructed by interpreting ere.
Throws:
Lexicon.Exception - if an ERE syntax error occurs.

grab

protected Object grab(LineNumberReader source)
               throws Lexicon.Exception

Grabs a terminal from a source character stream using this Lexicon. The variable returned by word() is set to the longest nonempty prefix of the remaining source characters matching an Expression in this Lexicon. If no nonempty prefix matches an Expression, a Lexicon.Exception is thrown. If the longest matching prefix matches more than one Expression, the terminal associated with the Expression most recently constructed is returned. Blocks until a character is available, an I/O error occurs, or the end of the source stream is reached.

Parameters:
source - the source character stream.
Returns:
the terminal grabbed from source.
Throws:
Lexicon.Exception - if an I/O or lexical error occurs.
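
The longest-match rule and its tie-break can be illustrated with the sketch below. It makes several assumptions that should be read as such: it uses java.util.regex patterns instead of the lexical NFA, it treats the order of put calls as a stand-in for the order in which Expressions were constructed, it works on a String with an offset rather than a LineNumberReader, and it throws IllegalStateException where the library throws Lexicon.Exception. The class name LongestMatch is hypothetical. Note also that java.util.regex alternation is ordered rather than longest-match, so the example patterns avoid ambiguous alternatives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LongestMatch {
    private final List<String> terminals = new ArrayList<>();
    private final List<Pattern> patterns = new ArrayList<>();
    private String word = "";

    // Registers a terminal and its pattern; later calls win ties in grab.
    void put(String terminal, String regex) {
        terminals.add(terminal);
        patterns.add(Pattern.compile(regex));
    }

    // Returns the terminal whose pattern matches the longest nonempty prefix
    // of source starting at offset, and records that prefix as the word.
    String grab(String source, int offset) {
        String best = null;
        int bestLength = 0;
        for (int i = 0; i < patterns.size(); i++) {
            Matcher m = patterns.get(i).matcher(source);
            m.region(offset, source.length());
            if (m.lookingAt()) {
                int length = m.end() - offset;
                // ">=" lets a later rule replace an earlier one of equal length.
                if (length > 0 && length >= bestLength) {
                    best = terminals.get(i);
                    bestLength = length;
                }
            }
        }
        if (best == null) {
            throw new IllegalStateException("no nonempty prefix matches at offset " + offset);
        }
        word = source.substring(offset, offset + bestLength);
        return best;
    }

    // Returns the word most recently grabbed.
    String word() {
        return word;
    }
}
```

With rules ID = [a-z]+ followed by IF = if, the input "iffy" yields ID with word "iffy" because the longer prefix wins, while the input "if " yields IF with word "if" because both rules match two characters and the later rule takes the tie.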

initial

private Lexicon.Set initial()

Returns the initial states of the lexical NFA.

Returns:
I, computing it and F if there is a need to compute the current initial states and final states.
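
The on-demand computation described for I and F follows a common caching pattern, sketched below. The sketch assumes that adding a terminal empties the cached set so that the next call recomputes it; the source does not state what triggers recomputation, so that detail, the class name LazyInitialStates, and the recomputations counter are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class LazyInitialStates {
    // Terminal -> start state of the NFA built for its expression.
    private final Map<String, Integer> starts = new LinkedHashMap<>();
    // Cached initial states; an empty set means "needs computing".
    private final Set<Integer> initial = new TreeSet<>();
    // Counts recomputations, only to make the laziness observable.
    int recomputations = 0;

    void put(String terminal, int startState) {
        starts.put(terminal, startState);
        initial.clear();  // assumption: a change marks the cache stale
    }

    // Returns the cached set, computing it first if it is empty.
    Set<Integer> initial() {
        if (initial.isEmpty()) {
            initial.addAll(starts.values());
            recomputations++;
        }
        return initial;
    }
}
```

Repeated calls to initial() between changes reuse the cached set, so the cost of recomputation is paid once per batch of put calls rather than once per call.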

interpret

public Object interpret()
                 throws Lexicon.Exception

Interprets the standard input stream using this Lexicon.

Returns:
the ParseTree constructed by interpreting the standard input stream.
Throws:
Lexicon.Exception - if an I/O, lexical, syntax or semantic error occurs.

interpret

public Object interpret(File source)
                 throws FileNotFoundException,
                        Lexicon.Exception

Interprets a source file using this Lexicon.

Parameters:
source - the source file.
Returns:
the ParseTree constructed by interpreting source.
Throws:
FileNotFoundException - if the source file cannot be found.
Lexicon.Exception - if an I/O, lexical, syntax or semantic error occurs.

interpret

public Object interpret(InputStream source)
                 throws Lexicon.Exception

Interprets a source byte stream using this Lexicon.

Parameters:
source - the source byte stream.
Returns:
the ParseTree constructed by interpreting source.
Throws:
Lexicon.Exception - if an I/O, lexical, syntax or semantic error occurs.

interpret

Object interpret(LineNumberReader source)
           throws Lexicon.Exception

Repeatedly invokes grab(source) until the end of the source stream is reached. Blocks until a character is available, or an I/O error occurs. This method is overridden by Grammar and its parser subclasses, so it is only invoked when this Lexicon has not been extended into a Grammar or parser.

Parameters:
source - the source character stream.
Returns:
the ParseTree constructed by interpreting source. A Lexicon always returns null.
Throws:
Lexicon.Exception - if an I/O, lexical, syntax or semantic error occurs.

interpret

public Object interpret(PipedWriter source)
                 throws IOException,
                        Lexicon.Exception

Interprets a source pipe using this Lexicon.

Parameters:
source - the source pipe.
Returns:
the ParseTree constructed by interpreting source.
Throws:
IOException - if the source pipe cannot be connected.
Lexicon.Exception - if an I/O, lexical, syntax or semantic error occurs.

interpret

public Object interpret(Reader source)
                 throws Lexicon.Exception

Interprets a source character stream using this Lexicon.

Parameters:
source - the source character stream.
Returns:
the ParseTree constructed by interpreting source.
Throws:
Lexicon.Exception - if an I/O, lexical, syntax or semantic error occurs.

interpret

public Object interpret(String source)
                 throws Lexicon.Exception

Interprets a source string using this Lexicon.

Parameters:
source - the source string.
Returns:
the ParseTree constructed by interpreting source.
Throws:
Lexicon.Exception - if an I/O, lexical, syntax or semantic error occurs.

interpret

public void interpret(String[] arguments)

Performs lexical analysis as directed by command-line arguments using this Lexicon. The first I/O or lexical error that occurs during lexical analysis is printed to the standard error stream.

Parameters:
arguments - the command-line arguments controlling the interpreter.
The following arguments may appear zero or more times, are processed in order, and have the following effects:
-t, --terminals
Print the set of terminals in this Lexicon before subsequent lexical analyses.
-l, --lexical
Print terminals in this Lexicon grabbed during subsequent lexical analyses.
-v, --verbose
Print maximum debugging. Equivalent to -tl.
-
Lexically analyze the standard input stream using this Lexicon.
filename
Lexically analyze source file filename using this Lexicon.
If no filename arguments are given, the standard input stream is lexically analyzed.
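
Because the arguments are processed in order, a switch affects only the analyses that follow it. The sketch below models that behavior without performing any I/O: it returns a list describing which source would be analyzed with which debug bits. It is a hypothetical illustration, not the library's parser. The class name ArgumentPlan, the "source@bits" result format, the support for clustered short options such as -tl, and the choice to count "-" as a source when deciding whether to fall back to standard input are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class ArgumentPlan {
    static final int TERMINALS = 0x01;
    static final int LEXICAL = 0x02;

    // Returns one "source@debugBits" entry per analysis, in argument order.
    static List<String> plan(String[] arguments) {
        List<String> analyses = new ArrayList<>();
        int debug = 0;
        for (String arg : arguments) {
            if (arg.equals("--terminals")) {
                debug |= TERMINALS;
            } else if (arg.equals("--lexical")) {
                debug |= LEXICAL;
            } else if (arg.equals("--verbose")) {
                debug |= TERMINALS | LEXICAL;
            } else if (arg.equals("-")) {
                analyses.add("<stdin>@" + debug);
            } else if (arg.startsWith("-") && !arg.startsWith("--")) {
                // Clustered short options, e.g. -tl; unknown letters are ignored here.
                for (char c : arg.substring(1).toCharArray()) {
                    if (c == 't') debug |= TERMINALS;
                    else if (c == 'l') debug |= LEXICAL;
                    else if (c == 'v') debug |= TERMINALS | LEXICAL;
                }
            } else {
                // Anything else, including an unrecognized long option,
                // is treated as a filename in this sketch.
                analyses.add(arg + "@" + debug);
            }
        }
        if (analyses.isEmpty()) {
            analyses.add("<stdin>@" + debug);  // no sources given: use standard input
        }
        return analyses;
    }
}
```

For the arguments a.txt -l b.txt, the plan is a.txt with no debugging followed by b.txt with the LEXICAL bit set, which shows why the position of a switch matters.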

put

private static void put(Integer s,
                        Lexicon.Alphabet A,
                        Integer r)

Puts a transition into the lexical NFA.

Parameters:
s - the state from which the transition is made.
A - the Alphabet on which the transition is made.
r - the state to which the transition is made.

put

protected void put(Object a,
                   Lexicon.Expression e)

Puts a terminal and associated Expression into this Lexicon. The Expression supersedes any previously associated with the terminal.

Parameters:
a - the terminal to add to this Lexicon.
e - the Expression associated with terminal a. When grabbing, the language expressed by e matches a.

s

private static Integer s()

Creates a new state in the lexical NFA.

Returns:
a new state in the lexical NFA.

terminal

boolean terminal(Object a)

Indicates whether a symbol is a terminal in this Lexicon.

Parameters:
a - the symbol whose status is requested.
Returns:
true if a is a terminal in this Lexicon; false otherwise.

transition

private static Lexicon.Set transition(Lexicon.Set S,
                                      char a,
                                      Lexicon.Set R)

Computes a transition using the lexical NFA.

Parameters:
S - the states from which the transition is made.
a - the character on which the transition is made.
R - the states to which the transition is made.
Returns:
the states to which the transition is made.
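
A transition step of this shape, where the result set is supplied by the caller and also returned, might be sketched as follows. The representation is assumed for illustration only: each edge carries a java.util.function.IntPredicate standing in for an Alphabet, the edges live in a list rather than in delta, and the names NfaTransition and Edge are hypothetical. Whether the library also applies closure inside this step is not stated in the source, so the sketch does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

final class NfaTransition {
    // One labeled edge: from --alphabet--> to.
    static final class Edge {
        final int from;
        final IntPredicate alphabet;  // stand-in for an Alphabet of characters
        final int to;

        Edge(int from, IntPredicate alphabet, int to) {
            this.from = from;
            this.alphabet = alphabet;
            this.to = to;
        }
    }

    private final List<Edge> delta = new ArrayList<>();

    // Puts a transition into the relation.
    void put(int from, IntPredicate alphabet, int to) {
        delta.add(new Edge(from, alphabet, to));
    }

    // Fills r with every state reachable from a state in s on character a
    // and returns r. The sets s and r must be distinct objects.
    Set<Integer> transition(Set<Integer> s, char a, Set<Integer> r) {
        r.clear();
        for (Edge e : delta) {
            if (s.contains(e.from) && e.alphabet.test(a)) {
                r.add(e.to);
            }
        }
        return r;
    }
}
```

Passing the destination set in lets a caller reuse the same set objects across characters instead of allocating a new set at every step.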

word

protected String word()

Returns the word most recently grabbed using this Lexicon.

Returns:
the word most recently grabbed by grab(source).

 
