Settings & events
The settings of a Bayes Point Machine classifier are
obtained through its
Settings property. There are separate settings for
training and prediction.
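As a hedged illustration, the following C# sketch shows where the two groups of settings live. It assumes a binary Bayes Point Machine classifier created from an existing standard data format mapping; the mapping variable is a placeholder, and the namespace differs between Infer.NET releases:

    using Microsoft.ML.Probabilistic.Learners;   // namespace may differ in older releases

    // 'mapping' is assumed to be an existing standard data format mapping
    // (an IClassifierMapping implementation); it is only a placeholder here.
    var classifier = BayesPointMachineClassifier.CreateBinaryClassifier(mapping);

    // Training and prediction settings are exposed as separate groups.
    var trainingSettings = classifier.Settings.Training;
    var predictionSettings = classifier.Settings.Prediction;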
Training settings

IterationCount sets the number of training algorithm
iterations to be run. The number of training iterations
specifies how often the message-passing algorithm (Expectation
Propagation in the case of the Bayes Point Machine
classifiers) iterates over the (entire) training data. The
higher the number of training iterations, the more accurate the
predictions; however, training will be slower.
For
most data sets, the default setting of 30 training iterations is
sufficient for the algorithm to make accurate predictions.
Sometimes accurate predictions can be made using fewer
iterations. On data sets with highly correlated features, one
may benefit from more training iterations. Typically, the number
of iterations varies between 5 and 100.
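For example, a minimal sketch using the classifier variable from the snippet above:

    // Use more training iterations, e.g. for data sets with highly correlated features.
    classifier.Settings.Training.IterationCount = 50;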

BatchCount sets the number of batches that the training
data is split into. It is set to 1 by default, but should be
increased if the system runs out of memory during training.
Batching allows you to train the Bayes Point Machine classifier even
if the training data does not fit into memory! If a
standard data format mapping is used to read data, then the
batching will be automatically performed by the Bayes Point
Machine classifiers; but if you use a
native data format mapping, you are expected to have split
the data into batches prior to training.
The
computational cost of batching increases markedly with the
number of batches because the Bayes Point Machine classifiers
need to switch the batches in and out of memory as well as
reconstruct messages for each batch. Batching should hence be
avoided, if possible.
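A sketch, again assuming the classifier from above:

    // Split the training data into 5 batches to reduce peak memory consumption
    // (at the price of slower training).
    classifier.Settings.Training.BatchCount = 5;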

ComputeModelEvidence determines whether or not the Bayes
Point Machine classifier can compute model evidence (marginal
likelihood), a measure of how likely the data is given the
model. Model evidence is hence used in model comparison. Since
computing evidence slows down training,
ComputeModelEvidence is set to
false by default. In contrast to the other settings, it
cannot be changed once a Bayes Point Machine classifier has been
trained.
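For example (a sketch; the trainingSet variable is a placeholder, and the member used to read back the evidence is an assumption for illustration):

    // Evidence must be requested before training and cannot be changed afterwards.
    classifier.Settings.Training.ComputeModelEvidence = true;
    classifier.Train(trainingSet);
    // Read back the (log) model evidence after training; assumed member name.
    double logEvidence = classifier.LogModelEvidence;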
It is important to realize that the Bayes Point Machine
classifiers do not require the specification of parameters
such as prior distributions over weights. The Bayes Point
Machines are practically
hyperparameter-free. As mentioned in the tutorial
introduction, this is achieved via heavy-tailed prior
distributions over weights (see
The probabilistic model) and has a number of immediate
benefits:
- It avoids a misspecification of the prior distributions
(which typically results in suboptimal performance).
- It entirely frees the Bayes Point Machine classifiers from
the need for time-consuming hyperparameter tuning.
- It also avoids the need for feature values to be normalized
in a preprocessing step, as the Bayes Point Machine classifiers
automatically adapt to the different feature scales during
training.
Prediction settings

SetPredictionLossFunction sets the loss function based on
which a precise classification (predictive point estimate) is
computed from the predicted distributions over class labels (see
Prediction). There are four options (a usage sketch follows the list):

ZeroOne - the 0/1 loss function is equivalent to
choosing the mode of the predicted distribution as a point
estimate. Use this loss function to minimize mean
classification error.
ZeroOne is the default loss function.

Squared - the squared or quadratic loss function is
equivalent to choosing the mean of the predicted
distribution as a point estimate. Use this loss function to
minimize mean squared error on class labels which have an inherent
ordering. Note, however, that such an ordering violates the
assumptions underlying the Bayes Point Machine classifiers
(more here). Moreover, the
Squared option is available
only if labels are supplied via a native data format
mapping. Otherwise you need to specify a custom squared loss
function.

Absolute - the absolute loss function is equivalent
to choosing the median of the predicted distribution as a
point estimate. Use this loss function to minimize mean
absolute error on class labels which have an inherent
ordering. Note, however, that such an ordering violates the
assumptions underlying the Bayes Point Machine classifiers
(more here). Moreover, the
Absolute option is available
only if labels are supplied via a native data format
mapping. Otherwise you need to specify a custom absolute loss
function.

Custom - a custom loss function can be specified if
none of the above loss functions satisfy your specific needs
(in medical diagnosis, for instance, classification loss is
often judged to be asymmetric). Using a custom loss function
may slow down point prediction (i.e. the
Predict method).
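As referenced above, here is a hedged sketch of selecting the loss function; the custom-loss delegate signature and the Boolean label type are assumptions for illustration:

    // Default: the 0/1 loss, which minimizes mean classification error.
    classifier.Settings.Prediction.SetPredictionLossFunction(LossFunction.ZeroOne);

    // A custom, asymmetric loss (sketch, assumes Boolean labels):
    // false negatives cost five times as much as false positives.
    classifier.Settings.Prediction.SetPredictionLossFunction(
        LossFunction.Custom,
        (trueLabel, estimate) => trueLabel.Equals(estimate) ? 0.0 : (trueLabel ? 5.0 : 1.0));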
The following prediction setting only applies to the
multiclass Bayes Point Machine classifier:

IterationCount sets the number of prediction algorithm
iterations to be run. In multiclass classification, the
message-passing algorithm (Expectation
Propagation) needs to iterate a small number of times over
the prediction data. Typically, this number varies between 2 and
10. As in the case of the training algorithm, the higher the
number of prediction iterations, the more accurate the
predictions; however, prediction will be slower.
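For example (a sketch, assuming a multiclass classifier created via BayesPointMachineClassifier.CreateMulticlassClassifier):

    // Run 5 message-passing iterations over the prediction data.
    multiclassClassifier.Settings.Prediction.IterationCount = 5;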
Events
The Bayes Point Machine classifiers can provide (limited)
diagnostic information about the progress of the training
algorithm via the
IterationChanged event. Listening to this event gives you
access to the posterior distributions over weights after the
training algorithm has completed an iteration over the training
set. Especially if there are many features (and thus weights) or
classes, attaching a handler to this event comes at a cost as
the weights need to be copied. However, in rare cases the Bayes
Point Machine classifiers may exhibit relatively poor
performance. In such a case, it is often informative to look at
the change in posterior weight distributions over time and see
whether there is a problem with convergence of the training
algorithm.
Handlers attached to the
IterationChanged event will
not be serialized when saving the Bayes Point Machine
classifiers.
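A hedged sketch of attaching such a handler; the member exposing the posterior weight distributions on the event arguments is an assumption and may differ between releases:

    classifier.IterationChanged += (sender, args) =>
    {
        // Inspect the current posterior distributions over the weights, e.g. to
        // monitor convergence; 'WeightPosteriorDistributions' is an assumed member name.
        var weights = args.WeightPosteriorDistributions;
        Console.WriteLine("Completed one training iteration over the data.");
    };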




