This section describes how data (i.e., a set of cases) can be used to estimate the conditional probability distributions and densities when only the graphical structure is given. This is known as batch learning. Note that batch learning can also be used if initial conditional probability distributions and densities are given.
Batch learning requires that all data are available when the learning process starts, whereas sequential learning allows the knowledge to be updated incrementally. On the other hand, sequential learning requires that initial assessments of the conditional probabilities are given, whereas batch learning only requires the graphical structure. The batch learning method currently implemented in Hugin is the EM algorithm (EM learning), developed by Lauritzen (1995); see also the work of Cowell & Dawid (1992).
A case is an assignment of values to some or all of the nodes of a domain. If values have been assigned to all nodes, the case is said to be complete; otherwise, it is said to be incomplete. The data used by the learning algorithm consists of a set of cases. The cases are numbered sequentially from 0 to N-1, where N is the total number of cases.
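As an illustration (not Hugin syntax), a case can be thought of as a mapping from node names to values, with a marker for values that are missing. The node names and values in the sketch below are hypothetical.

```python
# A hypothetical data set of N = 3 cases for three discrete nodes A, B, and C.
# None marks a missing value, so case 0 is complete and cases 1 and 2 are incomplete.
cases = [
    {"A": "yes", "B": "no", "C": "low"},   # case 0: complete
    {"A": "yes", "B": None, "C": "high"},  # case 1: incomplete (B missing)
    {"A": None, "B": "no", "C": None},     # case 2: incomplete (A and C missing)
]

N = len(cases)  # total number of cases; cases are indexed 0 .. N-1
```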
Before learning can take place, the set of nodes for which conditional probability distributions and densities should be estimated must be specified in addition to the data set. This set is specified as the set of nodes having experience tables (described in section Learning - Adaptation). The input to the learning algorithm is the case data; this set of cases must be non-empty. Note that if the domain is a LIMID model, then the domain's decision nodes must be instantiated in all cases.
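For a LIMID model, the requirement that all decision nodes are instantiated in every case can be checked before learning is started. The helper below is only a sketch using the dictionary-based case representation from the example above; it is not part of Hugin, and the node names are hypothetical.

```python
def decisions_instantiated(cases, decision_nodes):
    """Return True if every decision node has a value in every case."""
    return all(case.get(node) is not None
               for case in cases
               for node in decision_nodes)

# Hypothetical example: the decision node "D" is missing in the second case,
# so this data set would not be accepted for learning in a LIMID.
limid_cases = [
    {"D": "treat", "X": "yes"},
    {"D": None, "X": "no"},
]
print(decisions_instantiated(limid_cases, ["D"]))  # False
```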
The learning algorithm needs to perform inference, so the domain must be compiled before the learning algorithm is called. If successful, the procedure updates the conditional probability table (or density) and the experience table for each node of the domain that has an experience table. Moreover, the inference engine is reset and uses the new conditional probability tables (and densities).
If the contents of the experience tables are all zeros, then the algorithm computes the best ("maximum likelihood") conditional probability tables and densities, assuming that any table is valid (i.e., there are no restrictions on the form of the tables). If the contents are not all zeros, then the experience tables and the conditional probability tables (or densities) are used to form counts ("prior counts") that are added to those derived from the data set. This is known as "penalized EM". If the experience count for a parent configuration is negative, then parameter estimation is disabled for that parent configuration.
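The role of the experience counts can be illustrated for a single discrete node and one parent configuration. The sketch below is one plausible reading of the description above, not Hugin code: prior counts are formed as the experience count multiplied by the current conditional probabilities, a negative experience count leaves the probabilities unchanged, and an all-zero experience table reduces to a plain maximum-likelihood estimate.

```python
def m_step_for_parent_config(data_counts, experience, current_probs):
    """Re-estimate P(node | one parent configuration) with prior counts.

    data_counts   : expected state counts derived from the data set (E-step)
    experience    : experience count for this parent configuration
    current_probs : current conditional probabilities for this configuration
    """
    if experience < 0:
        # Negative experience count: estimation is disabled, keep the table.
        return current_probs

    # Prior counts: the experience count spread according to the current
    # conditional probabilities.  With an all-zero experience table this
    # term vanishes, giving a plain maximum-likelihood estimate.
    totals = [d + experience * p for d, p in zip(data_counts, current_probs)]
    norm = sum(totals)
    return [t / norm for t in totals] if norm > 0 else current_probs

# Hypothetical numbers: expected data counts (8, 2), experience count 10,
# and current probabilities (0.5, 0.5) give the estimate (0.65, 0.35).
print(m_step_for_parent_config([8.0, 2.0], 10.0, [0.5, 0.5]))
```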
The starting point of the EM algorithm is the conditional probability tables and densities specified prior to calling the algorithm, assuming that the domain has been compiled with these tables. If no table has been specified for a discrete node, uniform distributions are assumed; for a continuous node, initial values for the means and variances are computed from the data. Sometimes it is desirable to enforce zeros in the conditional probability tables for configurations that should be impossible (i.e., have zero probability). Because a configuration with zero probability receives zero expected counts in each iteration, such enforced zeros are preserved by the algorithm.
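The preservation of enforced zeros follows from the update itself: a configuration with zero probability under the current tables receives zero expected counts (and zero prior counts, since the experience count is multiplied by zero), so the normalized update keeps it at zero. The fragment below is a toy illustration with hypothetical states and counts.

```python
# Hypothetical initial table for a node with states (low, medium, high) given
# one parent configuration: "high" is declared impossible by an enforced zero,
# and the remaining probability mass is spread uniformly over the other states.
initial_probs = [0.5, 0.5, 0.0]

# Because the impossible state can never receive expected counts in the E-step,
# the normalized update keeps the enforced zero in place.
expected_counts = [7.0, 3.0, 0.0]
updated = [c / sum(expected_counts) for c in expected_counts]
print(updated)  # [0.7, 0.3, 0.0]
```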
The EM algorithm performs a number of iterations. In each iteration, the logarithm of the probability of the case data given the current joint probability distribution is computed. This quantity is known as the log-likelihood, and the EM algorithm attempts to maximize it. The number of iterations can be controlled in two ways: (1) a maximum number of iterations can be specified (zero means no maximum); (2) the EM algorithm terminates when the relative difference between the log-likelihoods of two successive iterations becomes less than a given tolerance, which must be a positive number. By default, the tolerance is 10⁻⁴.
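The two stopping criteria can be summarized as in the sketch below, which only shows the control logic. The em_iteration callable stands for one E/M update returning the new log-likelihood and is assumed for illustration; the default tolerance of 10⁻⁴ matches the description above.

```python
def run_em(em_iteration, max_iterations=0, tolerance=1e-4):
    """Run EM iterations until the stopping criteria are met.

    em_iteration   : callable performing one E/M step and returning the
                     log-likelihood of the case data under the updated tables
    max_iterations : maximum number of iterations (0 means no maximum)
    tolerance      : positive relative log-likelihood tolerance (default 10^-4)
    """
    previous_ll = None
    iteration = 0
    log_likelihood = None
    while max_iterations == 0 or iteration < max_iterations:
        log_likelihood = em_iteration()
        iteration += 1
        if (previous_ll is not None
                and abs(log_likelihood - previous_ll) / abs(previous_ll) < tolerance):
            break  # relative improvement below tolerance: converged
        previous_ll = log_likelihood
    return log_likelihood

# Toy usage: a made-up sequence of log-likelihoods approaching -40.
fake_lls = iter([-50.0, -43.0, -41.0, -40.1, -40.09, -40.089])
print(run_em(lambda: next(fake_lls)))  # -40.089
```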
To activate EM learning, either press the EM Learning Button or select the EM Learning Wizard item of the Wizards Menu.
See also notes on EM learning in OOBNs.