TRC - expert system building tool (part 1 of 8)
    sources-request at panda.UUCP 
    sources-request at panda.UUCP
       
    Sat Feb  8 08:49:27 AEST 1986
    
    
  
Mod.sources:  Volume 3, Issue 109
Submitted by: ihnp4!dicomed!ndsuvax!nckary (Daniel D. Kary)
This is NOT a shell archive.  Simply delete everything up to and including
the cut mark and save the result as overview.doc.
Dan Kary
ihnp4!dicomed!ndsuvax!nckary
-------------- cut here ---------------
                        CHAPTER  1
                    COMPILER OVERVIEW
     TRC is a useful tool for building expert systems, but
it  is  not  intended  as the only tool that the knowledge
engineer will use.  TRC augments  the  editing,  compiling
and  debugging  tools  already present on the system being
used for development.   Figure  1  illustrates  the  steps
involved in developing an expert system with TRC.
     The knowledge engineer first creates the  TRC  source
code  file  using  whatever  text  file  editing tools are
available.  This text file is then passed to the TRC  com-
piler  which produces a set of C language files.  In addi-
tion  to  the  C  language  files  produced  by  TRC,  the
knowledge  engineer  must  provide  auxiliary  C  language
file(s).  At a minimum the auxiliary file(s) must  contain
a  main procedure which will initialize STM and invoke the
inference engine produced by TRC.  The size of the auxili-
ary  code is limited only by the C compiler itself and the
address space of target  machine.   The  inference  engine
might be a small part of a larger system.
     The input and output facilities provided by  TRC  are
limited.  Any interaction with the user or the file system
on the  target  machine  will  usually  be  coded  in  the
                                                         1
                                                         2
auxiliary  code files.  C language code can be embedded in
either the situation or the action part of a rule.  It may
be convenient to place any procedures that are called from
within a rule in an auxiliary code file.
          _________________
          |               |
          |      TRC      |
          | Specification |
          |_______________|
                 |
            _____V______
            |          |
            |   TRC    |
            | Compiler |
            |__________|
                 |              _____________
            _____V______        |           |
            |          |        | Auxiliary |
            |    C     |        |     C     |
            |  Files   |        |   Files   |
            |__________|        |___________|
                 |__________________|
                          |
                     _____V_____
                     |         |
                     |   CC    |
                     |_________|
                          |
                    ______V_______
                    |            |
                    | Executable |
                    |    Code    |
                    |____________|
              Figure 1: Development Sequence
     The C code produced by TRC and the auxiliary  C  code
provided  by the knowledge engineer are then passed to the
C compiler.  CC is the traditional  name  of  the  command
that   invokes  the  sequence  of  processes  required  to
                                                         3
translate the C language file(s)  to  an  executable  code
file.   This sequence of processes will often include mul-
tiple compiler passes, an assembler process and  a  linker
process. In the context of figure 1, CC refers to whatever
sequence is required to translate the C language files  to
executable code.
     Building an expert system is much like  building  any
other software system.  The system is specified, designed,
coded, tested and finally released.  Each  of  the  steps,
particularly   coding  and  testing,  will  frequently  go
through several iterations.  TRC provides debugging  tools
which  will  profile and trace the execution of the infer-
ence engine.  The TRC debugging tools along with  whatever
debugging  tools  are  provided with the C language system
can be used in the coding and testing  phase  to  simplify
development.
9
9
                        CHAPTER  2
                    DESIGN OBJECTIVES
     An expert system for configuring a packet switch pro-
vided  the  initial stimulus for this project.  The expert
system was implemented using PROS,  a  LISP  based  expert
system  building  tool.  PROS provided a convenient way to
represent the  knowledge  needed  to  configure  a  packet
switch.  The representation was easy to read and expressed
_w_h_a_t the machine was to do more than  _h_o_w  it  was  to  be
done.   There  was an aesthetic quality to the representa-
tion that a seasoned programmer can appreciate.  Execution
turned  out  to  be  a major disappointment.  A relatively
simple problem required over two hours of CPU  time  on  a
VAX  11/780.   A  human expert could easily solve the same
problem in fifteen minutes.
     Artificial  Intelligence  programs  are  not   always
expected  to  solve problems faster than humans.  For some
problems, being able to solve the problem on a computer at
all  is a major accomplishment.  Being able to configure a
packet switch with a computer program did not seem like  a
major  accomplishment  and it seemed that a program should
be able to solve the problem  much  faster  than  a  human
expert.  It also seemed that a program could be written in
                                                         4
                                                         5
a procedural language that would do the same thing in  the
same way as the expert system, only much faster.
     The result of  the  initial  attempt  to  produce  an
expert  system  using a procedural language was a compiler
called 't' (translator).  The 't'  compiler  was  suitable
for the packet switch problem and similar simple problems.
The packet switch problem which required two hours of  cpu
time  in  the  LISP based implementation, executed in less
than one second when compiled  with  't'.   The  execution
time  of  the code produced by 't' was more reasonable for
the complexity of the problem.  This  seemed  to  indicate
that  it  might  be  possible to speed up the execution of
more complex problems.
     The first step in designing  TRC  was  to  study  the
operation  of  PROS  and  PASVII.   The most objectionable
aspect of both these tools is the amount of time  required
for execution.  The expectation was that understanding the
operation of these tools would suggest ways in which  fas-
ter  executing expert systems could be produced.  PROS and
PASVII operate in similar manners and are not unlike other
LISP based implementation tools.
     In PROS and PASVII, both STM and LTM are  represented
by  LISP  lists.   The STM list contains all the data that
the rules will operate on and the LTM  list  contains  all
                                                         6
the  rules.   Each  system has a general purpose inference
engine that searches through the LTM list for a rule whose
situation  part  is  satisfied.   To test if the situation
part is satisfied, the STM list is searched  for  whatever
pattern was specified in the situation part.  Both systems
use the trivial conflict resolution  strategy  of  testing
the  rules sequentially and selecting the first rule whose
situation part is satisfied.
     A LISP list can be used to  represent  any  structure
that  can  be imagined.  A single list is certainly suffi-
cient to represent STM.  Searching the list for a  pattern
match  involves  decomposing the list.  This operation can
be time consuming when the list  is  large  and/or  deeply
nested.   The  list must be decomposed each time a rule is
tested.  In both PROS and PASVII the list is  also  decom-
posed  in  the action part, if something has to be removed
from STM.  Reducing the time required to searching STM  is
an obvious way to speed up execution.
     Since the LTM is also represented by a  single  list,
it  too is continuously decomposed in the execution of the
program.  Consider an expert system composed of a  hundred
rules.   If  each rule is equally likely to be selected by
the resolution strategy, then on the average  fifty  rules
will  have  to  be  tested  before  the rule that is to be
applied is found.  This means that fifty rules have to  be
                                                         7
extracted  from  the  LTM  list and the STM list has to be
decomposed fifty times before one rule can be applied.  It
is  not  uncommon  for this type of system to spend 90% of
it's execution time testing rules and  only  10%  of  it's
time applying actions[12].  Eliminating the need to decom-
pose the LTM and reducing the time spent testing rules are
other obvious ways to improve execution speed.
     Both PROS and PASVII  are  acceptable  languages  for
expressing  rules.   The  TRC language is quite similar to
both PROS and PASVII.  Differences  between  TRC  and  the
LISP  based  languages  are  due  primarily to differences
between C and  LISP.   TRC  also  contains  some  language
features  not  found  in  either  PROS or PASVII.  The TRC
language is described in detail in Appendix C.
     Finally,  why  translate  to  the  C  language?   The
machine  of interest (VAX 11/780) runs an operating system
(4.2BSD) that is written largely in C.  The 4.2BSD C  com-
piler  is  intended  as a production compiler.  Other com-
pilers supplied with the system  (e.g.  PASCAL)  are  pri-
marily  instructional tools[18].  The C language is widely
used and compilers are available for small computers.  The
availability  of  C compilers for small computers makes it
feasible to develop expert systems with TRC that are  tar-
geted to small computers.
9
9
                        CHAPTER  3
                   TRANSLATION STRATEGY
     The first design objective is to reduce the amount of
time spent searching STM.  The way STM is represented will
affect the way a search is conducted.  Since speed is  the
objective,  a  representation that can be searched quickly
is desirable.  The representation must also be  sufficient
for  the  problem.        LISP based implementations use a
LISP list to represent STM.  The LISP list  representation
for  STM  has been sufficient for many complex problems[8,
13, 14, 15, 16].
     There is little possibility  that  any  program  will
exhaust  human  imagination  by using a LISP list in every
way it can possibly be used.  This implies that  the  full
capability  of a LISP list may not be required.  The ques-
tion, then, is how much or how little is enough.   A  LISP
list  can contain atoms or other LISP lists.  Atoms can be
strings or numbers, and numbers can be integers or  reals.
A  variable in a procedural language can be a string or an
integer or a real, so atoms are  simple  to  represent  in
procedural   languages.    The  question  now  is  how  to
represent or replace a list?
_D_a_t_a _R_e_p_r_e_s_e_n_t_a_t_i_o_n
                                                         8
                                                         9
     It was decided  that  STM  would  be  represented  by
linked  lists  of  structures  that could contain whatever
variables (atoms) that were needed.  This is the subset of
a LISP list that is easy to build and manipulate in a pro-
cedural language.  The structures that are used  to  build
the linked list will be referred to as objects.  The vari-
ables that the object structures contain will be  referred
to  as elements.  Element values distinguish the otherwise
identical objects from one another.
     The number and type of elements that are required  to
distinguish  an object will vary between applications.  An
expert system will often refer to  objects  that  bear  no
similarity to one another.  One object may be described by
two strings, while  another  object  may  require  a  real
number  and  an  integer  to  distinguish  it  from  other
objects.  It would be wasteful to require every object  to
contain  the  same number and type of elements.  Therefore
the description of STM is extended  to  a  set  of  lists,
rather  than  a single list.  Each list is a collection of
related objects.
     One side effect of representing STM as a set of lists
is  that  STM  is  in effect partially sorted.  In the TRC
language every reference to an object or element  includes
an  implicit  reference  to  the list where the object may
exist.  Stated another way, it is not possible to refer to
                                                        10
an  object or an element without implicitly mentioning the
list where the object or element  might  be  found.   This
means  that when STM is to be searched for an object there
is only one list where it can  possibly  exist,  therefore
only one of the set of lists has to be searched.
_R_u_l_e _R_e_p_r_e_s_e_n_t_a_t_i_o_n
     A situation-action rule  is  essentially  an  if-then
statement;  "if  this  situation  exists,  then  take this
action".  LTM is translated to a  single  procedure  which
contains an if statement for each rule.  The if statements
(rules) are tested successively  until  one  evaluates  to
true.   The action part of that statement is then executed
(the rule fires).  Control then branches back to the first
if  statement  (rule) and testing continues at that point.
This sequence of events continues until  no  if  statement
(rule)  evaluates to true (fires), at which point the pro-
cedure terminates.
     Up to this point the overall action of an expert sys-
tem  has been described as "searching LTM for a rule whose
situation part is true".  On each iteration only one  rule
is  applied.   If  next  rule to be applied could be found
without searching LTM, the search time would be reduced to
zero.   Reducing  search time is a primary goal of the TRC
rule representation.   From  this  point  on  the  overall
                                                        11
action  of an expert system will be to "reject at the ear-
liest possible moment rules that cannot be applied until a
rule that can be applied is found".  There are some conse-
quences of this new attitude worth noting.
     One side effect of the representation for STM is that
it  is  possible  to have some knowledge of what is in STM
without searching STM.  Suppose there is an expert  system
where  two  types of objects can exist in STM, there would
then be two lists; call them list A  and  list  B.   Since
each list can contain only objects and not other lists, it
is possible to keep a count of how  many  objects  are  in
each  list.   The  count  of the number of objects in each
list can be used to reject a rule without  searching  STM.
Suppose the situation part of a rule specified two objects
from list A and one from list B.  If either list is  empty
or  if  list  A  contains only one object, the rule can be
rejected without searching  either  list.   TRC  places  a
precondition  on  every  rule  that  causes the rule to be
rejected if STM does not contain enough of  each  type  of
object to make the situation part possible.
_S_e_a_r_c_h_i_n_g
     The default strategy for searching STM is called  the
LINEAR  strategy.   A  search is conducted for each object
listed in the situation part, in the order the objects are
                                                        12
listed.   If  any  single  search fails, the rule is aban-
doned.  This is consistent with the "reject at the  earli-
est  possible  moment"  attitude  adopted for LTM.  Unfor-
tunately this simple strategy may not  be  a  sufficiently
powerful  pattern  matching  tool  for  some expert system
implementation requirements.
     An alternate search strategy available in TRC, called
the  RECURSIVE strategy, is designed to perform a combina-
toric search on STM.  When the RECURSIVE strategy is used,
the  failure  of  an  individual search causes the rule to
fail only when there is no previous  search  that  can  be
redone.  Speed of execution can be dramatically reduced by
using the RECURSIVE strategy. Time loss is largely  depen-
dent on the size of the lists being searched.
     Allowing the search strategy to be  selected  by  the
knowledge  engineer  on  a  per-rule basis is a compromise
designed to give the best of both  worlds;  fast  searches
when  combinatoric possibilities do not exist and powerful
pattern matching when they do.  The search  strategy  used
by PROS and PASVII is similar to the RECURSIVE strategy.
     Both search strategies are detailed  in  Appendix  B.
Of particular interest will be section 6.3.3 of Appendix B
where a small example system illustrates  the  differences
between the two strategies.
                                                        13
_A_c_t_i_o_n_s
     The action part consists primarily  of  additions  to
stm  and deletions from stm.  On the surface it seems that
adding and deleting objects should be trivial.  There are,
however, performance issues to be considered.
     Deletions usually refer to objects that were found in
the  situation  part.   This is a matter of practical con-
cern, since only those objects found in the situation part
are  guaranteed  to  exist  in STM.  There are two general
strategies for deleting the objects, either remember where
in  STM the objects were found or search STM in the action
part for the objects to  delete.   Both  PROS  and  PASVII
search STM in both the situation and the action part.  The
objects that are found in the situation part are moved  to
the  front of STM to minimize the time spent searching STM
in the action part.
     There are some objectionable aspects to the  strategy
of  searching  STM in both the situation and action parts.
Every rule that fires can reorder STM.  It can  be  argued
that  reordering STM is a good idea, but it may not always
be what the knowledge engineer wants.  If STM is reordered
in  the  situation  part it is possible that the search in
the action part  will  find  different  objects  than  the
search  in  the  situation part found.  The possibility of
                                                        14
finding something different in the action  part  than  was
found  in the situation part is at least counter intuitive
and possibly incorrect.  Finally, searching STM twice  for
the exact same object(s) is objectionable in and of itself
because it consumes time redoing work.
     PASVII has an interesting way of  adding  objects  to
STM.   The list that represents STM is initialized to con-
tain some number of objects which may be atoms  or  lists.
An  atom  or  a  list  can  replace an atom or a list that
exists in STM.  If an atom or a list is  inserted  at  the
head  of  the  list, the last object (atom or list) in the
list falls off.  This action is called a metaphor for  the
action  of  short  term memory in humans.  As knowledge is
gathered old unused knowledge fades to the back of  memory
and eventually is forgotten.  Quite frankly, this metaphor
sounds more  like  a  clever  explanation  for  a  program
'feature' than a useful tool.
     The actions of adding and deleting objects in TRC are
not  quite  as clever as the previous example.  Objects to
be added to STM are simply inserted at  the  head  of  the
list,  nothing ever falls off the end of the list.  STM is
searched only in the situation part.  Objects that are  to
be  deleted in the action part must have been found in the
situation part.  This rule is enforced  by  the  compiler.
When  an  object  is  found  in  the situation part, it is
                                                        15
identified with an indexed pointer.  The object can now be
referred to or deleted without searching STM.
9
9
                        CHAPTER  4
                       OPTIMIZATION
     Most of the improvements in execution speed  provided
by  TRC are side effects of the translation strategy.  STM
is partially sorted by virtue of being  represented  as  a
set  of  lists  rather  than  as a single list.  For every
object that can exist, there is only  one  list  that  can
contain that object.  The TRC lists themselves are simpler
than LISP lists.  A single linear pass through a TRC  list
will reveal every object.  A LISP list can be more complex
to search because it can be arbitrarily nested.  Precondi-
tions placed on every rule eliminate testing rules when it
is known that the rule's situation part can  not  possibly
be met.  Selectable search strategies allow quick searches
of STM when combinatoric possibilities do not exist.
     The optimizer does not produce code that  is  optimum
in any sense.  What it does is to perform a single, useful
code modification that can have a positive impact on  exe-
cution time.
     The goal of the optimizer is to reduce the amount  of
time  spent searching. Each time a rule fires a great deal
of  implicit  knowledge  about  the  content  of  STM   is
obtained.   It  is  known  that  no  rule  previous to the
                                                        16
                                                        17
current rule is true and no rule previous to  the  current
rule  can  be true after the execution of the current rule
unless the current rule modifies STM in such a way  as  to
make  some previous rule true.  These simple facts are the
entire basis of the optimizer.  Three tests must  be  per-
formed  to determine if it is possible for a rule to fire.
These tests will be called the NOT test, the ADD test  and
the MARK test.
     The tests are named after  the  TRC  statements  that
figure  prominently  in  the  test.   The  details of each
statement are presented in Appendix B.  For the purpose of
this  discussion  it  should  suffice to know that the NOT
statement is an explicit test for an empty list,  the  ADD
statement  is  a  request  to add something to STM and the
MARK statement is a request to remove something from STM.
     The first case to be considered is the case of a rule
which  contains  a  NOT  statement  in the situation part.
When a rule that fires contains an ADD statement  it  will
not be possible for any previous rule with a NOT statement
referring to that list to be the next rule to fire.  If  a
rule  that  fires  contains  a  MARK  statement and no ADD
statement referring to that same list, it is possible that
the list will become empty making it possible for the rule
with the NOT statement that previously  failed  to  become
true.   If it is determined that it is possible for a rule
                                                        18
to fire after the NOT test, that rule becomes  the  candi-
date rule and no further testing is done.
     The ADD test finds recursive rules that can not  fire
on the next pass.  Consider the case of a rule with no NOT
statements that recursively searches STM for a  situation.
If  this  rule fails, it will continue to fail until some-
thing is added to STM to  make  it  true.   If  all  rules
searched  STM  recursively  it  would be known when a rule
fires that of the rules that  precede  the  current  rule,
only those rules that search for something added to STM by
the current rule can possibly fire in the next pass.
     If the current rule adds an object  to  STM,  control
could  continue with the first rule that searches for that
object rather than the first rule  in  LTM.   If  no  rule
prior  to the current rule searches for those things added
to STM by the current rule or if  the  current  rule  adds
nothing  to  STM  then no prior rule can execute.  Control
could continue with the current rule rather  than  at  the
beginning of LTM.
     The MARK test applies to rules that perform a  linear
search  on STM.  The previous conclusion about items being
added to STM is still true; a rule that adds something  to
STM  can  cause a linear search rule to become true.  With
linear search it is also possible that a rule will  become
                                                        19
true  if  something is removed from STM.  If a linear rule
searches for several similar items which are  present  but
not  correctly  ordered  it  is  possible  for this linear
search to fail where a recursive  search  would  not  have
failed.  If there were excess items,  removing one or more
may cause a different linear assignment which could make a
linear rule true.
     The TRC optimizer selects a  continuation  point  for
each  rule  based on what the rule adds to or deletes from
STM rather than testing from the beginning  of  LTM  after
any  rule  fires. The continuation point is the first rule
that could fire based on the NOT and  ADD  tests  for  all
rules, and the MARK test for linear rules.  The TRC optim-
izer is somewhat naive in that  it  considers  only  items
added or deleted with the ADD and MARK statements.
9
9
                        CHAPTER  5
                     FURTHER RESEARCH
     A hierarchical arrangement  for  expert  systems  has
been  suggested[19].  The divide and conquer strategy is a
technique used by  experts  in  almost  every  field.   By
decomposing  a  set of rules into several subsets arranged
in a hierarchy, only the rules that apply to  the  current
part  of  the problem need to be considered.  Reducing the
number of rules that have to be tested at  any  one  point
will  generally reduce the average amount of time it takes
to select a rule.
     Since hand optimization in TRC  allows  an  arbitrary
control  structure to be imposed on the rule based system,
it is not impossible to build a hierarchical expert system
with  TRC.  However, it might not be convenient to build a
hierarchical system with the current TRC compiler.
     The input language to TRC  should  be  redesigned  to
include the convenient expression of hierarchical systems.
Many programming languages support  multiple  module  pro-
grams.   Each  module in a multiple module program usually
contains a group of related procedures.  It might be  rea-
sonable  to  place  each system of rules in a hierarchy of
rule based systems in a separate module.
                                                        20
                                                        21
     In languages that support multiple  module  programs,
some  means  of  binding the modules together is provided.
In the C language the '#include' facility  permits  common
definitions  to  be loaded by each module.  In Ada[20] the
package specification is used to make type,  variable  and
procedure  declarations  visible to other modules.  Either
of these facilities could serve as a model for designing a
hierarchical language.
     Experts are frequently  asked  to  explain  how  they
arrived  at  a  conclusion.  It is reasonable to expect an
expert program to do the same  thing.   TRC  provides  lip
service  to  this  requirement  of expert systems with the
TRACE facility.  The  ordered  list  of  rules  that  were
applied  can  explain how or why a given answer was found,
but inferring an explanation from a trace may not be  sim-
ple.   A  more convenient facility for generating explana-
tions should be designed.
     With the current TRC grammar it is possible to search
for  the  absence  of object/element combinations by using
two rules.  TRC should be extended to  include  a  way  to
specify  a search for the absence of object/element combi-
nations in a single rule.  This could be  accomplished  by
extending the NOT statement and will have an impact on the
optimizer and search code generation.
9
9
                                                        22
     Some expert systems allow  certainty  factors  to  be
associated  with rules.  A confidence factor is the proba-
bility that it is appropriate to  apply  this  rule  given
that  the  situation part is true.  Confidence factors can
also be used to suggest the probability  that  the  answer
generated  by  the expert system is correct.  A convenient
facility for representing  confidence  factors  should  be
included in TRC.
     TRC uses the trivial conflict resolution strategy  of
applying the first rule whose situation part is satisfied.
Alternate conflict resolution strategies  should  be  con-
sidered.   If confidence factors are implemented, one con-
flict resolution strategy may be to  test  all  rules,  if
more  than  one rule is satisfied then select one based on
confidence factors.
     The C language is not the only language that could be
generated  by  a  compiler  like  TRC.  In a separate pro-
ject[21] TRC was modified to generate  TURBO  PASCAL.   It
has  been  suggested[22]  that  TRC  could generate INGRES
code.  STM can be viewed as a database, the situation part
of a rule can be viewed as a database query and the action
part of a rule can be viewed as  a  database  transaction.
For  problems  that  deal with a large amount of data, the
file handling capabilities of a DBMS could be put to  good
use.  Relational calculus is a powerful tool for searching
                                                        23
a data base that could be put to good use  on  some  prob-
lems.
     The current optimizer is very weak.   By  looking  at
the  elements  that are being searched for in STM in addi-
tion to the objects, additional implicit knowledge of  the
state  of  STM is gained.  It may be possible to skip over
some rules based on this knowledge, thus  reducing  search
time.   Consider this sketch where object A has an integer
element B:
     R1:     (A.B == 7)
             =>
             MARK A
             ;
     R2:     A
             =>
             MARK A
             ;
     R3:     =>
             ADD A(B => 6)
             ;
     When rule R3 is optimized by the  current  optimizer,
it will decide that it is possible for R1 to fire after R3
has fired because R3 adds an  object  of  type  A  and  R3
searches for an object of type A.  Clearly R1 can not fire
after R3 fires because the object of type A  added  by  R3
has  element  B equal to 6 while R1 searches for element B
equal to 7.  The current optimizer finds  the  first  rule
that  can possibly fire.  This does not mean the rule will
fire.  There can be any number of rules  between  the  the
last  rule  that fired and the first one that can possibly
                                                        24
fire next.  Preconditions could be placed on  these  rules
to  eliminate  testing  intermediate  rules  where  possi-
ble[23].  Consider this example:
     R1:     B C
             =>
             MARK B
             ;
     R2:     A B
             =>
             MARK A
             ;
     R3:     A C
             =>
             MARK A
             ;
     R4:     A
             =>
             MARK A
             ;
     R5:     =>
             ADD C
             ;
     The optimizer will correctly deduce that it is possi-
ble  for  R1  to fire after R5 fires.  It is also possible
that R1 will not fire.  If R5 fires and R1 does not  fire,
it is not possible for R2, R3 or R4 to fire either.  Since
R5 fired it is known that no  previous  rule  could  fire.
Since  R4  could not fire, it is not possible for R2 or R3
to fire.  When R5 fires, preconditions could be placed  on
R2,  R3 and R4 that would prevent even testing those rules
since it is known that they cannot fire.
9
9
                        CHAPTER  6
                       CONCLUSIONS
     A compiler has been described and built.   This  com-
piler  translates  a  rule  based language to a procedural
language and is a useful tool for building expert systems.
The  translation  to a procedural language may be advanta-
geous for reasons of speed,  portability  or  convenience.
Translation  to a procedural language is particularly con-
venient when integration of procedural code and an  expert
system is desirable.
     Some observations about building expert systems  have
been  made.  These observations are not necessarily unique
to the compiler  that  is  described,  i.e.  they  may  be
applied to other expert system tools.
     If the data objects that the rules will refer to  are
defined, it is possible to represent STM as a set of lists
rather than as a single list.  For many search  algorithms
reducing  the  size of the list to be searched will reduce
search time.  Defining data objects also  makes  automatic
generation  of  preconditions  that can eliminate the need
for searching a possibility.
9
9                                                        25
                                                        26
     Many expert system tools are  conceptually  interpre-
tive.   A  single general purpose inference engine is used
for whatever problem is being addressed.   The  notion  of
generating a custom inference engine for each set of input
rules is novel.
     The optimizer is probably the most  significant  out-
come,  and it too is made possible by defining the objects
to be manipulated.  Optimization  of  interpretive  expert
system  tools has centered on developing efficient search-
ing and matching strategies.  The  notion  of  a  separate
optimizer  that  changes  the  operation  of the inference
engine without affecting the  order  in  which  rules  are
fired  is  novel and can be applied to other expert system
building tools.
9
9
                       BIBLIOGRAPHY
1.  Aho  and  Ullman,  _P_r_i_n_c_i_p_l_e_s  _o_f   _C_o_m_p_i_l_e_r   _D_e_s_i_g_n.
Addison-Wesley, 1977.
2. Pyster,  _C_o_m_p_i_l_e_r  _D_e_s_i_g_n  _a_n_d  _C_o_n_s_t_r_u_c_t_i_o_n,  Prindle,
Weber and Schmidt, 1980.
3. Johnson, _Y_a_c_c: _Y_e_t _A_n_o_t_h_e_r _C_o_m_p_i_l_e_r _C_o_m_p_i_l_e_r.  Computer
Science Technical Report No. 32, Bell Laboratories, Murray
Hill, NJ   1975.
4. Juell, _A_n _I_n_t_r_o_d_u_c_t_i_o_n _t_o _t_h_e _P_R_O_S  _P_r_o_d_u_c_t_i_o_n  _S_y_s_t_e_m.
Computer  Science  Department,  North Dakota State Univer-
sity, 1983.
5. Mittal, _P_A_S-_I_I _U_s_e_r _M_a_n_u_a_l.  Department of Computer and
Information Science, Ohio State University, 1977.
6. Forgy, _O_P_S_5 _U_s_e_r'_s _M_a_n_u_a_l.   Technical  Report  CMU-CS-
81-135, Carnegie-Mellon University, Pittsburgh, 1981.
7. Kernighan and  Ritchie,  _T_h_e  _C  _P_r_o_g_r_a_m_m_i_n_g  _L_a_n_g_u_a_g_e.
Prentice-Hall, NJ, 1978.
8. Hayes-Roth, Waterman and Lenat,  _B_u_i_l_d_i_n_g  _E_x_p_e_r_t  _S_y_s_-
_t_e_m_s.  Addison-Wesley, 1983.
9.  Winston,  _A_r_t_i_f_i_c_i_a_l  _I_n_t_e_l_l_i_g_e_n_c_e.    Addison-Wesley,
                                                        27
                                                        28
1984.
10. Ritchie and Thompson, _T_h_e  _U_N_I_X  _T_i_m_e-_S_h_a_r_i_n_g  _S_y_s_t_e_m.
The Bell System Technical Journal, Vol. 57, No. 6, Part 2,
1978.
11. Winston and Horn, _L_i_s_p.  Addison-Wesley, 1984.
12. Gupta, _P_a_r_a_l_l_e_l_i_s_m _i_n _P_r_o_d_u_c_t_i_o_n _S_y_s_t_e_m_s: _T_h_e  _S_o_u_r_c_e_s
_a_n_d  _t_h_e  _E_x_p_e_c_t_e_d  _S_p_e_e_d-_u_p.  Department of Computer Sci-
ence, Carnegie-Mellon University, 1984.
13. Lindsay, Buchanan, Feigenbaum and Lederberg,  _A_p_p_l_i_c_a_-
_t_i_o_n_s  _o_f  _A_I  _f_o_r _O_r_g_a_n_i_c _C_h_e_m_i_s_t_r_y: _T_h_e _D_E_N_D_R_A_L _P_r_o_j_e_c_t.
McGraw-Hill, 1981.
14.  Shortliffe,  _C_o_m_p_u_t_e_r-_B_a_s_e_d  _M_e_d_i_c_a_l   _C_o_n_s_u_l_t_a_t_i_o_n_s:
_M_Y_C_I_N.  American Elsevier, New York, 1976.
15. Davis, Buchanan and Shortliffe, _P_r_o_d_u_c_t_i_o_n _R_u_l_e_s _a_s  _a
_R_e_p_r_e_s_e_n_t_a_t_i_o_n _f_o_r _a _K_n_o_w_l_e_d_g_e-_B_a_s_e_d _C_o_n_s_u_l_t_a_t_i_o_n _P_r_o_g_r_a_m.
Artificial Intelligence, Vol. 8, No. 1, 1977.
16. Erman, et. al,  _T_h_e  _H_e_a_r_s_a_y-_I_I  _S_p_e_e_c_h  _U_n_d_e_r_s_t_a_n_d_i_n_g
_S_y_s_t_e_m:  _I_n_t_e_g_r_a_t_i_n_g  _K_n_o_w_l_e_d_g_e  _t_o  _R_e_s_o_l_v_e  _U_n_c_e_r_t_a_i_n_t_y.
Computing Surveys, June 1980.
17. Davis, _E_x_p_e_r_t _S_y_s_t_e_m_s: _W_h_e_r_e _A_r_e _W_e? _A_n_d _W_h_e_r_e  _D_o  _W_e
_G_o  _F_r_o_m  _H_e_r_e?.   Massachusetts  Institute of Technology,
A.I. Memo 665, 1982.
                                                        29
18. Joy, et. al, _B_e_r_k_e_l_e_y _P_a_s_c_a_l _U_s_e_r'_s _M_a_n_u_a_l.   Computer
Science  Division,  University  of  California,  Berkeley,
1983.
19. Mizoguchi and Kakusho, _H_i_e_r_a_r_c_h_i_c_a_l _P_r_o_d_u_c_t_i_o_n _S_y_s_t_e_m,
IJCAI-79, p586, 1979.
20. _A_m_e_r_i_c_a_n _N_a_t_i_o_n_a_l _S_t_a_n_d_a_r_d _R_e_f_e_r_e_n_c_e  _M_a_n_u_a_l  _f_o_r  _t_h_e
_A_d_a  _P_r_o_g_r_a_m_m_i_n_g  _L_a_n_g_u_a_g_e.   American  National Standards
Institute, Inc., 1983.
21. Nygard, personal communication, 1985.
22. Shapiro, personal communication, 1985.
23. Rebel, personal communication, 1985.
9
9
    
    
More information about the Mod.sources
mailing list