CFG Plugin2.0 Output Syntax

From FSL
Jump to: navigation, search

The MOP CFG plugin uses the following output syntax, which is defined in BNF:

// BNF below is extended with {p} for zero or more and [p] for zero or one repetitions of p
 
<CFG Output Syntax> ::= "[" <Action Table> "::" <Goto Table>
[ "::" <Conflict List> ] "]"
<Action Table> ::= {"["<Nat>, <Event Name> "=>" <Action> "]"}
<Goto Table> ::= {"["<Nat>, <NonTerminal Name> "=>" <Int>"]"}
<Conflict List> ::= <Conflict> {<Conflict>} "::" <Collection>
<Nat> ::= <!-- Natural Number -->
<Event Name> ::= <Id>
<Action> ::= "shift(" <Int> ")" | "reduce(" <NonTerminal Name> "," <Nat> ")"
<NonTerminal Name> ::= <Id>
<Event or NonTerminal Name> ::= <Event Name> | <NonTerminal Name>
<Conflict> ::= "< shift-reduce |" <Nat> ">" | "< reduce-reduce |" <Nat> ">"
<Collection> ::= { "[[["<Nat>"]][" <Collection Item Set> "]] ;" }
<Int> ::= <!-- Integer -->
<Collection Item Set> ::= { "{" <NonTerminal Name> "->"
<Collection Item RHS> "," <Event Name>"}," }
<Collection Item RHS> ::= { <Event or NonTerminal Name> | "*" }

<CFG Output Syntax>

<CFG Output Syntax> defines the generic output of the various cfg plugins, which must be converted to the proper language by each MOP instance that uses it. It consists of either a pair of an <Action Table> and a <Goto Table>, or a triple that also adds a <Conflict List>, if any conflicts are present. The pair or triple has its elements separated by ::. A conflict occurs during table generation when a given input either denotes both a shift and a reduction (shift-reduce conflict) or reduction to two different <NonTerminal Name>s (reduce-reduce conflict); see <Conflict> and <Collection> for more information on conflicts.

<Action Table> and <Goto Table>

The <Nat> as the first argument to an action or goto table represents the current state of the push down automata. For action tables the second argument, the <Internal Terminal> is the input to the automata. The last argument, appearing after =>, is the action to perform, either an error (shift(error)), a legitimate shift, or a reduce. In the example below, the action table item [0,a => shift(error)] says in state 0 with input a, there is an error condition. For goto tables the first <Int> argument is, just like in action tables, the state. The <NonTerminal Name> argument is the <NonTerminal Name> specified by a reduce action. The last argument, after the =>, is the state to go to (hence the name of the table).

<Conflict List>

A <Conflict List> is a non-empty list of <Conflict>s followed by a <Collection>, that allows one to see where the <Conflict>s are.

<Nat>

This is a natural number.

<Event Name>

An identifier to represent an event.

<Action>

An <Action> represents either a shift or a reduce.

<Event Name>

An identifier to represent a nonterminal.

<NonTerminal or Event Name>

<NonTerminal or Event Name> can match either a <NonTerminal Name> or an <Event Name>.

<Conflict>

<Conflict>s can be shift-reduce or reduce-reduce conflicts. A shift-reduce conflict occurs when, for a given state, the table generation algorithm cannot deterministically decide whether to shift or reduce for a given input. A reduce-reduce conflict occurs much the same, save that the algorithm cannot chose which non-terminal to reduce to. The <Nat> argument of a conflict describes which state the conflict is in.

<Collection>,<Collection>, and <Collection Item Set>

The <Nat> in the <Collection Item RHS> describes which state is represented. A state can be found in the canonical collection by looking for the proper state number surrounded by [[]]. In a canonical collection item, the current cursor position is marked by *. The cursor tells us at which point the CFG monitor is looking for more input. Multiple items in state 7 of the example below create a shift-reduce conflict. To see this, consider the items {nt(1) -> t(1) t(1) *,t(1)} and {nt(1) -> * t(1) t(1),t(1)}. On input t(1) item {nt(1) -> t(1) t(1) *,t(1)} directs the table generation algorithm to produce a reduce action with t(1) as the look ahead, while {nt(1) -> * t(1) t(1),t(1)} directs the algorithm to produce a shift action.

<Int>

This is an integer number.


The Special $ Terminal

The CFG Monitoring Algorithm page explains why we need the special terminal $ (it matches the hypothetical end of trace). A phony production S -> nt(N) $ is added to the grammar, where nt(N) is the internal version of the original start symbol of the grammar. This makes the table generation algorithm more general, as $ is treated like any other terminal, rather than needing special cases. If S is used in the original grammar it will have already been renamed to some nt(N) before this production is added. This can be seen in collection output from the example below: in state 0 the collection item {S -> * nt(1) $,$} is present.

What if my grammar isn't LR(1)?

If the grammar is not LR(1) there will either be one or more of a shift-reduce or reduce-reduce conflict. shift-reduce is decided in favor of shift, reduce-reduce is decided arbitrarily. Either way a table is generated which may, or may not, match what you want it to (depending on if the language needs more than LR(1) or if the you just wrote a poor grammar for an LR(1) language). Keep in mind that the syntaces for most programming languages, notable exceptions being C and C++, are expressible as LR(1) grammars. For C and C++ the syntaces are not even context free, so even a GLR parser would fail to parse them without tricks. Changing the CFG Plugin to a GLR parser would be fairly straight forward (not much change to the table generation algorithm), but, in general, the runtime overhead incurred does not seem worth it (GLR parsers require backtracking to handle conflicts). More information on LR parsing is available on Wikipedia.

Example

For the grammar:

event a 
event b
lr:   S -> b S b | b A b | epsilon, 
      A -> A a | a

The output, which denotes that there is a shift-reduce conflict in state 7, is:

 
[
[0,$ => shift(error)]
 [0,a => shift(error)]
 [0,b => shift(5)]
 [1,$ => shift(error)]
 [1,a => reduce(A, 1)]
 [1,b => reduce(A, 1)]
 [2,$ => shift(error)]
 [2,a => reduce(A, 2)]
 [2,b => reduce(A, 2)]
 [3,$ => shift(error)]
 [3,a => shift(2)]
 [3,b => shift(9)]
 [4,$ => shift(error)]
 [4,a => shift(2)]
 [4,b => shift(10)]
 [5,$ => shift(error)]
 [5,a => shifb]
 [5,b => shift(6)]
 [6,$ => reduce(S, 2)]
 [6,a => shifb]
 [6,b => shift(7)]
 [7,$ => shift(error)]
 [7,a => shifb]
 [7,b => shift(7)]
 [8,$ => accept]
 [8,a => shift(error)]
 [8,b => shift(error)]
 [9,$ => reduce(S, 3)]
 [9,a => shift(error)]
 [9,b => shift(error)]
 [10,$ => shift(error)]
 [10,a => shift(error)]
 [10,b => reduce(S, 3)]
 [11,$ => shift(error)]
 [11,a => shift(error)]
 [11,b => shift(13)]
 [12,$ => shift(error)]
 [12,a => shift(error)]
 [12,b => shift(14)]
 [13,$ => reduce(S, 3)]
 [13,a => shift(error)]
 [13,b => shift(error)]
 [14,$ => shift(error)]
 [14,a => shift(error)]
 [14,b => reduce(S, 3)]
 ::
[0,A => error]
 [0,S => 8]
 [1,A => error]
 [1,S => error]
 [2,A => error]
 [2,S => error]
 [3,A => error]
 [3,S => error]
 [4,A => error]
 [4,S => error]
 [5,A => 3]
 [5,S => 11]
 [6,A => 4]
 [6,S => 12]
 [7,A => 4]
 [7,S => 12]
 [8,A => error]
 [8,S => error]
 [9,A => error]
 [9,S => error]
 [10,A => error]
 [10,S => error]
 [11,A => error]
 [11,S => error]
 [12,A => error]
 [12,S => error]
 [13,A => error]
 [13,S => error]
 [14,A => error]
 [14,S => error]

:: < shift-reduce | 7 >
::

[[[0]][{S -> * S $,$},{S -> * b b,$},{S -> * b A b,
    $},{S -> * b S b,$}]] ; 
[[[1]][{A -> a *,a},{A -> a *,b}]] ; 
[[[2]][{A -> A a *,a},{A -> A a *,b}]] ; 
[[[3]][{A -> A * a,a},{A -> A * a,b},{S -> b
    A * b,$}]] ; 
[[[4]][{A -> A * a,a},{A -> A * a,b},{S -> b
    A * b,b}]] ; 
[[[5]][{A -> * a,a},{A -> * a,b},{A -> * A a,t(
    0)},{A -> * A a,b},{S -> * b b,b},{S -> b
    * b,$},{S -> * b A b,b},{S -> * b S b,t(
    1)},{S -> b * A b,$},{S -> b * S b,$}]] ; 
[[[6]][{A -> * a,a},{A -> * a,b},{A -> * A a,t(
    0)},{A -> * A a,b},{S -> * b b,b},{S -> b
    * b,b},{S -> b b *,$},{S -> * b A b,b},{
    S -> * b S b,b},{S -> b * A b,b},{S
    -> b * S b,b}]] ; 
[[[7]][{A -> * a,a},{A -> * a,b},{A -> * A a,t(
    0)},{A -> * A a,b},{S -> * b b,b},{S -> b
    * b,b},{S -> b b *,b},{S -> * b A b,b},
    {S -> * b S b,b},{S -> b * A b,b},{S
    -> b * S b,b}]] ; 
[[[8]][{S -> S * $,$}]] ; 
[[[9]][{S -> b A b *,$}]] ; 
[[[10]][{S -> b A b *,b}]] ; 
[[[11]][{S -> b S * b,$}]] ; 
[[[12]][{S -> b S * b,b}]] ; 
[[[13]][{S -> b S b *,$}]] ; 
[[[14]][{S -> b S b *,b}]]
]

Personal tools
Namespaces

Variants
Actions
Navigation