Parsing
There are two broad categories of parsing, top down and bottom up, plus various schemes combining elements of each.
Top Down Parsing
The top-down approach attempts to construct the parse tree (i.e., do the parse or find a derivation) using the start symbol of the grammar as a beginning point.
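Because the parser must guess which rule to apply when expanding a nonterminal, a wrong guess has to be undone. This implies backtracking.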
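To make the idea concrete, here is a minimal sketch of a top-down recognizer with backtracking. It borrows the small grammar introduced later in these notes as Example Grammar 1; the dictionary layout and the function name `derives` are illustrative assumptions, not part of the original notes.

```python
# Grammar as a dict: nonterminal -> list of alternative right-hand sides.
# (Layout and names are assumptions made for this sketch.)
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V"]],
    "D": [["a"], ["the"], ["this"]],
    "N": [["dog"], ["cat"]],
    "V": [["barks"], ["meows"]],
}

def derives(symbols, tokens, grammar):
    """Return True if the symbol sequence can derive exactly the token sequence."""
    if not symbols:
        # Nothing left to expand: succeed only if the input is used up too.
        return not tokens
    first, rest = symbols[0], symbols[1:]
    if first not in grammar:
        # Terminal: it must match the next input token.
        return bool(tokens) and tokens[0] == first and derives(rest, tokens[1:], grammar)
    # Nonterminal: try each alternative in turn. When one fails, any()
    # moves on to the next alternative, which is the backtracking step.
    return any(derives(alt + rest, tokens, grammar) for alt in grammar[first])

print(derives(["S"], "a dog barks".split(), GRAMMAR))  # True
print(derives(["S"], "a dog".split(), GRAMMAR))        # False
```

The search starts from the start symbol `S` and works toward the tokens. Note that this naive sketch would recurse forever on a left-recursive rule, so it is only suitable for grammars without left recursion.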
Problems
Bottom Up
The tree is constructed from the leaves on ``up to the root''.
The archetypal model is shift-reduce parsing.
Algorithm:
repeat
    shift a token onto the "stack"
    while the top of the stack contains a recognizable "structure" (the RHS of a rule)
        reduce it to the rule's LHS
    endwhile
until the stack contains only the start symbol and no input remains
Example Grammar 1
S -> NP VP
NP -> D N
VP -> V
D -> a
D -> the
D -> this
N -> dog
N -> cat
V -> barks
V -> meows
Sample parse of ``a dog barks''.
_ - a dog barks
a - dog barks
D - dog barks
D dog - barks
D N - barks
NP - barks
NP barks -
NP V -
NP VP -
S -
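The shift-reduce algorithm and the sample parse above can be sketched in Python as follows. This is a minimal sketch that assumes a greedy policy (reduce whenever some rule matches the top of the stack, otherwise shift); the names `GRAMMAR_1`, `find_reduction` and `greedy_shift_reduce` are hypothetical and chosen only for this illustration.

```python
# Example Grammar 1 as (LHS, RHS) pairs; each RHS is a tuple of symbols.
GRAMMAR_1 = [
    ("S", ("NP", "VP")),
    ("NP", ("D", "N")),
    ("VP", ("V",)),
    ("D", ("a",)), ("D", ("the",)), ("D", ("this",)),
    ("N", ("dog",)), ("N", ("cat",)),
    ("V", ("barks",)), ("V", ("meows",)),
]

def find_reduction(stack, grammar):
    """Return the first rule whose RHS matches the top of the stack, or None."""
    for lhs, rhs in grammar:
        if len(rhs) <= len(stack) and tuple(stack[-len(rhs):]) == rhs:
            return lhs, rhs
    return None

def greedy_shift_reduce(tokens, grammar, start="S"):
    """Reduce whenever possible, otherwise shift. Returns (accepted, trace)."""
    stack, rest = [], list(tokens)
    trace = [(" ".join(stack), " ".join(rest))]
    while True:
        rule = find_reduction(stack, grammar)
        if rule is not None:
            lhs, rhs = rule
            del stack[-len(rhs):]      # pop the matched RHS ...
            stack.append(lhs)          # ... and push the LHS in its place
        elif rest:
            stack.append(rest.pop(0))  # shift the next input token
        else:
            break                      # nothing to reduce, nothing to shift
        trace.append((" ".join(stack), " ".join(rest)))
    return stack == [start], trace

accepted, trace = greedy_shift_reduce("a dog barks".split(), GRAMMAR_1)
for stack, rest in trace:
    print(stack or "_", "-", rest)
print("accepted:", accepted)
```

For ``a dog barks'' the printed configurations follow the same ten steps as the sample parse. The greedy policy is a simplification: it never has to choose between shifting and reducing, so it only works for grammars where reducing first is always right.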
Problems arise when the decision whether to shift or reduce cannot be made, leading to either a shift/reduce conflict or a reduce/reduce conflict.
Example Grammar 2
S -> NP VP
NP -> D N
VP -> V
VP -> V PP
D -> a
D -> the
D -> this
N -> dog
N -> cat
N -> alley
V -> barks
V -> meows
PP -> P NP
P -> in
Sample parse of ``a dog barks in the alley'':
_ - a dog barks in the alley
a - dog barks in the alley
D - dog barks in the alley
D dog - barks in the alley
D N - barks in the alley
NP - barks in the alley
NP barks - in the alley
NP V - in the alley
<-- s/r here
reducing:
NP VP - in the alley
S - in the alley <-- some input left
shifting:
NP V in - the alley
NP V P - the alley
NP V P the - alley
NP V P D - alley
NP V P D alley -
NP V P D N -
NP V P NP -
NP V PP -
NP VP -
S -
Note that in natural language, recognition (acceptance) often happens, correctly or incorrectly, while some input remains to be processed.
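One simple, if inefficient, way to cope with the shift/reduce choice shown above is to try both alternatives and backtrack when one gets stuck. The following is a minimal sketch under that assumption, using the extended grammar above (the one with VP -> V PP); the names `GRAMMAR_2` and `recognize` are hypothetical, and practical parsers usually resolve such conflicts with lookahead rather than exhaustive search.

```python
# The extended grammar as (LHS, RHS) pairs; each RHS is a tuple of symbols.
GRAMMAR_2 = [
    ("S", ("NP", "VP")),
    ("NP", ("D", "N")),
    ("VP", ("V",)),
    ("VP", ("V", "PP")),
    ("PP", ("P", "NP")),
    ("D", ("a",)), ("D", ("the",)), ("D", ("this",)),
    ("N", ("dog",)), ("N", ("cat",)), ("N", ("alley",)),
    ("V", ("barks",)), ("V", ("meows",)),
    ("P", ("in",)),
]

def recognize(stack, rest, grammar, start="S"):
    """Return the list of configurations leading to acceptance, or None.

    `stack` and `rest` are tuples of symbols, so each recursive call works
    on its own copy and abandoning a branch needs no explicit undo step.
    """
    if stack == (start,) and not rest:
        return [(stack, rest)]
    # Try every applicable reduction first ...
    for lhs, rhs in grammar:
        n = len(rhs)
        if n <= len(stack) and stack[-n:] == rhs:
            path = recognize(stack[:-n] + (lhs,), rest, grammar, start)
            if path is not None:
                return [(stack, rest)] + path
    # ... and fall back to shifting if no reduction led to acceptance.
    if rest:
        path = recognize(stack + (rest[0],), rest[1:], grammar, start)
        if path is not None:
            return [(stack, rest)] + path
    return None

path = recognize((), tuple("a dog barks in the alley".split()), GRAMMAR_2)
for stack, rest in path:
    print(" ".join(stack) or "_", "-", " ".join(rest))
```

At the configuration `NP V - in the alley` the search first tries the reduction, reaches `S` with input left over, fails, and then takes the shifting branch, which ends in `S -`. The sketch relies on the grammar having no empty right-hand sides and no cycles of single-symbol rules; otherwise the recursion might not terminate.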
Generating reduce/reduce conflicts:
Let's add the rule:
VP -> P D N
(perhaps in a simplistic attempt to capture the sentence ``the dog in the alley'', which answers the question ``who barks?'' and is an elliptical form of ``the dog in the alley barks'').
S -> NP VP
NP -> D N
VP -> V
VP -> V PP
VP -> P D N
D -> a
D -> the
D -> this
N -> dog
N -> cat
N -> alley
V -> barks
V -> meows
PP -> P NP
P -> in
Parsing:
_ - the dog in the alley
the - dog in the alley
D - dog in the alley
D dog - in the alley
D N - in the alley
NP - in the alley
NP in - the alley
NP P - the alley
NP P the - alley
NP P D - alley
NP P D alley -
NP P D N -
At this point D N could be reduced to NP:
NP P NP -
NP PP -
but the parse is now stuck, since no rule has NP PP as its right-hand side.
Or P D N could be reduced via VP -> P D N:
NP VP -
S -
which is accepted.
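A reduce/reduce conflict can be detected mechanically by listing every rule whose right-hand side matches the top of the stack. The following is a minimal sketch; the names `GRAMMAR_3` and `applicable_reductions` are hypothetical and used only for this illustration.

```python
# The grammar with the added rule VP -> P D N, as (LHS, RHS) pairs.
GRAMMAR_3 = [
    ("S", ("NP", "VP")),
    ("NP", ("D", "N")),
    ("VP", ("V",)),
    ("VP", ("V", "PP")),
    ("VP", ("P", "D", "N")),
    ("D", ("a",)), ("D", ("the",)), ("D", ("this",)),
    ("N", ("dog",)), ("N", ("cat",)), ("N", ("alley",)),
    ("V", ("barks",)), ("V", ("meows",)),
    ("PP", ("P", "NP")),
    ("P", ("in",)),
]

def applicable_reductions(stack, grammar):
    """Return every rule whose RHS matches the top of the stack."""
    return [
        (lhs, rhs)
        for lhs, rhs in grammar
        if len(rhs) <= len(stack) and tuple(stack[-len(rhs):]) == rhs
    ]

# The configuration "NP P D N -" from the parse above.
choices = applicable_reductions(["NP", "P", "D", "N"], GRAMMAR_3)
for lhs, rhs in choices:
    print(lhs, "->", " ".join(rhs))
# Prints:
# NP -> D N
# VP -> P D N
```

More than one entry in the result means the parser faces a reduce/reduce conflict at that configuration; here only the second choice leads to acceptance.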