Unification-Based Grammars

A unification-based grammar (UBG) is a grammar that:

- encodes information in features and their values (also called attributes and attribute values)
- assigns values to features only through unification (a debatable claim: what about values obtained by dictionary lookup?)

Feature: an attribute/value pair attached to a syntactic element and, hence, to the corresponding node in a parse tree

Notable point:

A phrase-structure (PS) rule is used to legitimize one local part of the parse tree: a parent node together with its immediate children

      NP --> D  N   yields tree     NP
                                   / \
                                  D   N
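
The way a rule legitimizes a piece of tree can be made concrete in Prolog's DCG notation by giving each nonterminal an extra argument that holds the subtree it builds. This is only a sketch: the names det and noun, the tree functors np/2, d/1 and n/1, and the two-word lexicon are assumptions made for illustration.

```prolog
% Each nonterminal carries one extra argument: the subtree it licenses.
np(np(D, N)) --> det(D), noun(N).

% Hypothetical two-word lexicon, for illustration only.
det(d(the))   --> [the].
noun(n(goat)) --> [goat].

% ?- phrase(np(T), [the, goat]).
% T = np(d(the), n(goat)).
```

The term bound to T has the same shape as the tree drawn above: an NP node whose two children are the D and N nodes.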

Clarification:

Given the "phrase" x + 5, construct an "expression tree"

                    exp
                   / | \
                  /  |  \
                 /   |   \
                x    +    5

"Justification" (a rule that legitimizes this tree):

       exp --> [x], ['+'], [5].

Or

       exp --> var, op, const.
       var --> [Z], { atom(Z) }.
       const --> [N], { number(N) }.
       op --> ['+'].

Or

       exp --> rval, op, rval.
       rval --> var ; const.
       var --> [Z], { atom(Z) }.
       const --> [N], { number(N) }.
       op --> [O], { member(O,['+','-','*','/']) }.
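
A grammar like the last one can be tried out by passing a pre-tokenized list to phrase/2. The sketch below assumes SWI-Prolog and assumes the input has already been split into atoms and numbers; the queries are illustrative only.

```prolog
:- use_module(library(lists)).   % member/2

exp   --> rval, op, rval.
rval  --> var ; const.
var   --> [Z], { atom(Z) }.
const --> [N], { number(N) }.
op    --> [O], { member(O, ['+', '-', '*', '/']) }.

% ?- phrase(exp, [x, '+', 5]).    % succeeds
% ?- phrase(exp, [3, '*', y]).    % succeeds
% ?- phrase(exp, [x, '+']).       % fails: the second operand is missing
```

phrase/2 succeeds only if the whole list is consumed, so a query doubles as a yes/no test of whether the token list is a legal exp.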


More extensive example: z = x + sin(y) * 5;

                     assgn
                     / |  \
                    /  |   \
                   /   |    \
                 lval  =    rval
                  |          |
                 var         exp__
                  |         / |   \
                  x       exp aop term___
                          /    |   / |   \
                        term   +  /  |    \
                                term mop   factor
                                /     |       |
                              factor  *      rval
                               /              |
                              /              const
                            funcall            |
                            / |  |  \          5
                           /  |  |   \
                         sin '(' |   ')'
                                 |
                               explist
                                 |
                                exp
                                 |
                                term
                                 |
                                factor
                                 |
                                rval
                                 |
                                 var
                                 |
                                 y

       assgn --> lval, ['='], exp, [';'].
       exp --> exp, aop, term.          % left-recursive
       exp --> term.
       term --> term, mop, factor.      % left-recursive
       term --> factor.
       factor --> ['('], exp, [')'].
       factor --> funcall.
       factor --> rval.
       funcall --> [F], { atom(F) }, ['('], explist, [')'].
       explist --> exp ; exp, [','], explist.
       explist --> [].

       lval --> var.
       rval --> var ; const.
       var --> [Z], { atom(Z) }.
       const --> [N], { number(N) }.
       aop --> [O], { member(O,['+','-']) }.
       mop --> [O], { member(O,['*','/']) }.


Note on eliminating left-recursion (each ui and vj stands for a string of grammar symbols, and no vj begins with A):

       A --> A u1.
       A --> A u2.
          :
       A --> A um.
       A --> v1.
          :
       A --> vn.

       can be rewritten, using a new nonterminal B, as:

       A --> v1.
          :
       A --> vn.
       A --> v1 B.
          :
       A --> vn B.

       B --> u1.
       B --> u2.
          :
       B --> um.
       B --> u1 B.
       B --> u2 B.
          :
       B --> um B.

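This is significant because it:

  1. allows a top-down parse (a top-down parser, such as Prolog's,
    loops forever on a left-recursive rule)
  2. shows that the same language can be described by more than one grammar
  3. shows that not all grammars follow the
    "syntax should denote semantics" (form follows function)
    rule: the rewritten grammar accepts the same strings but
    builds differently shaped trees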
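
As a sketch of the rewriting, here is the scheme applied to the left-recursive exp and term rules of the assignment example, with A = exp, u1 = aop term and v1 = term (and likewise for term). The names exp_rest and term_rest play the role of B and are assumptions, as is the split of the operators into aop and mop; funcall is left out to keep the example short. SWI-Prolog is assumed.

```prolog
:- use_module(library(lists)).   % member/2

% exp --> exp, aop, term.   exp --> term.
exp      --> term.
exp      --> term, exp_rest.
exp_rest --> aop, term.
exp_rest --> aop, term, exp_rest.

% term --> term, mop, factor.   term --> factor.
term      --> factor.
term      --> factor, term_rest.
term_rest --> mop, factor.
term_rest --> mop, factor, term_rest.

factor --> ['('], exp, [')'].
factor --> rval.

rval  --> var ; const.
var   --> [Z], { atom(Z) }.
const --> [N], { number(N) }.
aop   --> [O], { member(O, ['+', '-']) }.
mop   --> [O], { member(O, ['*', '/']) }.

% ?- phrase(exp, [x, '+', y, '*', 5]).   % succeeds and terminates
% ?- phrase(exp, [x, '+']).              % fails and terminates
```

Every rule now consumes at least one token before it recurses, so Prolog's top-down, left-to-right strategy always terminates on a finite token list.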


Adding features to a rule just allows the "unification" of feature values to be legitimized in the same way as the tree structure:

For example:

      NP -->   D       N
    [num:X] [num:X] [num:X]

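places these features on the corresponding nodes of the tree. Because all three share the variable X, the determiner, the noun, and the noun phrase must agree in number.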

Note how we could do this in Prolog:

   np(Num) --> det(Num), noun(Num).

The formalism's notation, e.g., [num:X], has the advantage of "giving a name to each feature".

This notation can be encoded in Prolog as is by using : as an infix operator (many Prolog systems already define it; otherwise it can be declared with op/3). However, the same results can be achieved by writing grammar rules as, e.g.,

   np(number(Num)) --> det(number(Num)), noun(number(Num)).

   det(number(singular)) --> [the].
   det(number(singular)) --> [a].
   det(number(plural)) --> [the].
   noun(number(singular)) --> [goat].
   noun(number(plural)) --> [goats].

How would you test (call) this in Prolog?
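
One possible answer, as a sketch: load the rules and call them through phrase/2, which takes a nonterminal and a list of words. The grammar is repeated here so that the example is self-contained; the queries are illustrative and assume a standard Prolog with DCG support, such as SWI-Prolog.

```prolog
np(number(Num)) --> det(number(Num)), noun(number(Num)).

det(number(singular)) --> [the].
det(number(singular)) --> [a].
det(number(plural)) --> [the].
noun(number(singular)) --> [goat].
noun(number(plural)) --> [goats].

% Recognize a phrase and report its number feature:
% ?- phrase(np(F), [the, goats]).
% F = number(plural).

% Agreement failure: "a" is singular, "goats" is plural:
% ?- phrase(np(_), [a, goats]).
% false.

% Run the grammar "backwards" to generate a plural noun phrase:
% ?- phrase(np(number(plural)), NP).
% NP = [the, goats].
```

The second query fails because the shared variable Num cannot be unified with both singular and plural, which is exactly the agreement check the feature was added to express.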