

Distributed inference
The techniques described in the Online learning document can also be used
to perform inference over a cluster of machines. Each machine runs the same
model on its own partition of the data, with its own incoming messages, and
the machines then exchange messages with each other. The steps are as follows:
1. Determine how you want to partition the model, and identify the set of
   variables shared among multiple partitions.
2. For each shared variable, create a variable to hold incoming messages, and
   attach it via ConstrainEqualRandom. We will call this the inbox.
   Each machine will have a different value for the inbox. Initially, the
   inboxes are all uniform.
3. Each machine runs inference, obtaining the MarginalDividedByPrior for each
   shared variable, and divides it by the inbox. We will call this the upward
   message. If machines are indexed by m, then we will get an array of
   messages indexed by m.
4. On the next iteration, the inbox for a variable on machine m is equal to the
   product of the upward messages from all other machines.
5. At convergence, the posterior for the variable is its marginal computed by
   any of the machines (they should all be equal).
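The message exchange in the steps above can be sketched in a few lines of plain Python. This is not Infer.NET code: it assumes a toy model in which the shared variable is a Gaussian mean observed with known noise variance, and the names `Gauss`, `local_marginal` and `run_distributed` are hypothetical helpers introduced only for illustration. Gaussians are stored in natural parameters, so a product of messages is an addition and a division is a subtraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gauss:
    """Gaussian in natural parameters; the default (0, 0) is the uniform distribution."""
    precision: float = 0.0
    mean_times_precision: float = 0.0

    def __mul__(self, other):
        # Product of two Gaussian messages: natural parameters add.
        return Gauss(self.precision + other.precision,
                     self.mean_times_precision + other.mean_times_precision)

    def __truediv__(self, other):
        # Ratio of two Gaussian messages: natural parameters subtract.
        return Gauss(self.precision - other.precision,
                     self.mean_times_precision - other.mean_times_precision)

    @staticmethod
    def from_mean_variance(mean, variance):
        return Gauss(1.0 / variance, mean / variance)


def local_marginal(prior, inbox, data, noise_variance):
    """Marginal on one machine: prior * inbox * likelihood of that machine's data."""
    marginal = prior * inbox
    for x in data:
        marginal = marginal * Gauss.from_mean_variance(x, noise_variance)
    return marginal


def run_distributed(prior, partitions, noise_variance, iterations=2):
    """Simulate the machines in one process and return each machine's final marginal."""
    n = len(partitions)
    inboxes = [Gauss() for _ in range(n)]  # step 2: inboxes start uniform
    marginals = []
    for _ in range(iterations):
        marginals = [local_marginal(prior, inboxes[m], partitions[m], noise_variance)
                     for m in range(n)]
        # Step 3: marginal divided by prior, then divided by the inbox.
        upward = [(marginals[m] / prior) / inboxes[m] for m in range(n)]
        # Step 4: each new inbox is the product of the other machines' messages.
        new_inboxes = []
        for m in range(n):
            inbox = Gauss()
            for k in range(n):
                if k != m:
                    inbox = inbox * upward[k]
            new_inboxes.append(inbox)
        inboxes = new_inboxes
    return marginals
```

Because this toy model is conjugate, the upward messages are exact after the first pass, so the second pass already gives every machine the same marginal as inference on the full data set. A general model would need to repeat the exchange until the messages stop changing.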
An example of this procedure can be found in the Infer.NET distribution under
Learners. The paths to the batched training methods for the classifier and
recommender are:
 Classifier\BayesPointMachineClassifierInternal\Classifiers\NativeDataFormatBayesPointMachineClassifier.cs > TrainOnMultipleBatches()
 Recommender\MatchboxRecommenderInternal\NativeDataFormatMatchboxRecommender.cs > Train()
To divide out the inbox in step 3, we can use a clever modelling trick
instead of doing the division manually. The trick is to make a copy of the
shared variable and infer the MarginalDividedByPrior of the copy. Since
the inbox is part of the prior of the copy, it will be divided out for free.
Here is an example:
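// Shared variable with its usual prior.
Variable<double> weight = Variable.GaussianFromMeanAndVariance(0, 1);
// Inbox holding the incoming message (here a Gaussian with mean 3 and variance 4).
Variable<Gaussian> weightInbox = Variable.Observed(new Gaussian(3,4));
Variable.ConstrainEqualRandom(weight, weightInbox);
// The inbox is part of the copy's prior, so MarginalDividedByPrior excludes it.
Variable<double> weightCopy = Variable.Copy(weight);
weightCopy.AddAttribute(QueryTypes.MarginalDividedByPrior);
InferenceEngine engine = new InferenceEngine();
// The upward message for this machine.
var message = engine.Infer<Gaussian>(weightCopy, QueryTypes.MarginalDividedByPrior);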
This method is recommended because it avoids both the cost of doing the division and the numerical inaccuracies that roundoff errors can introduce. (The cost of copying a variable in Infer.NET is negligible since the compiler will optimize it away in the generated code.)
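Why the copy gives the same message as the manual division can be checked with plain arithmetic. The sketch below is library-independent Python, not Infer.NET code. It assumes that the factors for the local data attach to the copy, and it stands in for them with a single hypothetical `likelihood` message; the helper names are likewise invented for illustration. Gaussians are written as (precision, precision times mean) pairs.

```python
def nat(mean, variance):
    """Gaussian as natural parameters (precision, precision * mean)."""
    return (1.0 / variance, mean / variance)


def mul(a, b):
    """Product of two Gaussian messages."""
    return (a[0] + b[0], a[1] + b[1])


def div(a, b):
    """Ratio of two Gaussian messages."""
    return (a[0] - b[0], a[1] - b[1])


def upward_by_division(prior, inbox, likelihood):
    # Step 3 done manually: marginal of the shared variable,
    # divided by its prior and then by the inbox.
    marginal = mul(mul(prior, inbox), likelihood)
    return div(div(marginal, prior), inbox)


def upward_via_copy(prior, inbox, likelihood):
    # The message flowing into the copy already contains the inbox,
    # so it plays the role of the copy's prior.
    copy_prior = mul(prior, inbox)
    copy_marginal = mul(copy_prior, likelihood)
    # What remains is just the message from the local data.
    return div(copy_marginal, copy_prior)


prior = nat(0.0, 1.0)       # matches GaussianFromMeanAndVariance(0, 1)
inbox = nat(3.0, 4.0)       # matches new Gaussian(3, 4)
likelihood = nat(2.0, 0.5)  # hypothetical message from this machine's data

manual = upward_by_division(prior, inbox, likelihood)
via_copy = upward_via_copy(prior, inbox, likelihood)
```

Both results agree with `likelihood` up to floating-point rounding. The divisions here exist only to make the comparison explicit.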




