Model Generation

The Models

As previously discussed, PyMC uses models to represent the graph structure that it samples from.

These models live in Python source files. To use a model, you import it and pass it to PyMC.

The Problem

The problem with this approach appears when one wants to change the model slightly or update the learnt probabilities. Editing the source by hand is inefficient, and it gets harder still when the graph structure has to change in several subsystems at once.

The Solution

The solution chosen was to generate the Python source files that make up the models automatically. Whilst this is not the most obvious solution, every other workaround considered looked even more painful.

The generator reads the graph from a file written in the simple graph language discussed previously. It also takes a list of observed variables as input.

The graph is then parsed. For each node in the graph, a node is created in the model. A probability function is created alongside it, based on the relationships in the graph definition and on the stored conditional probability table (CPT) data gathered during the training stage.

If a node appears in the list of observed variables, it is marked as observed in the model as well.

Because the observed nodes can be set in code, the system can generate a model on demand to answer questions like P(X|A,C).
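
The generation step can be sketched roughly as follows. This is a minimal illustration, not the actual generator: the function name `generate_model_source`, the dictionary layout for the graph and CPTs, and passing observations as a name-to-value mapping are all assumptions made for the example. It also assumes boolean nodes and emits text in the PyMC 2 style (`pymc.Bernoulli`, `pymc.Lambda`). The sketch only builds the source text, so PyMC does not need to be installed to run it.

```python
def generate_model_source(parents, cpts, observed):
    """Return Python source text for a model of boolean nodes.

    All three argument layouts are hypothetical, chosen for this sketch:
    parents:  {node: [parent, ...]}, listed so parents come before children
    cpts:     {node: {tuple_of_parent_values: P(node is True)}}
    observed: {node: observed_value} for the nodes we condition on
    """
    lines = ["import pymc", ""]
    for node, node_parents in parents.items():
        table = cpts[node]
        if node_parents:
            # Bind parents as default arguments so the generated Lambda
            # depends on them, then look the probability up in the CPT.
            args = ", ".join(f"{p}={p}" for p in node_parents)
            key = "(" + ", ".join(f"bool({p})" for p in node_parents) + ",)"
            lines.append(
                f"p_{node} = pymc.Lambda('p_{node}', "
                f"lambda {args}: {table!r}[{key}])"
            )
            prob = f"p_{node}"
        else:
            # Root node: the CPT holds a single prior probability.
            prob = repr(table[()])
        extra = ""
        if node in observed:
            extra = f", value={observed[node]!r}, observed=True"
        lines.append(f"{node} = pymc.Bernoulli('{node}', {prob}{extra})")
    return "\n".join(lines) + "\n"


# Example: a model for P(X|A,C), with A and C observed.
example_parents = {"A": [], "C": [], "X": ["A", "C"]}
example_cpts = {
    "A": {(): 0.3},
    "C": {(): 0.6},
    "X": {
        (True, True): 0.9,
        (True, False): 0.5,
        (False, True): 0.4,
        (False, False): 0.1,
    },
}
source = generate_model_source(
    example_parents, example_cpts, {"A": True, "C": False}
)
print(source)
```

Changing the question to, say, P(X|A) only means passing a different `observed` mapping and regenerating. Nothing in the graph definition or the CPT data has to be touched.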

Once the model has been generated, all that remains is to reload it and sample from it.
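
One way to make sure the freshly generated file is what actually gets sampled is to execute it in a new module object each time, rather than relying on a cached import. The sketch below shows that idea with the standard library only. The helper name `load_generated_model` and the file name `model.py` are made up for the example, and the stand-in "model" is just a constant so the snippet runs without PyMC.

```python
import tempfile
import types
from pathlib import Path


def load_generated_model(path, module_name="generated_model"):
    """Execute the generated file in a fresh module object and return it."""
    source = Path(path).read_text()
    module = types.ModuleType(module_name)
    module.__file__ = str(path)
    exec(compile(source, str(path), "exec"), module.__dict__)
    return module


# Stand-in for two successive generator runs writing to the same file.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.py"
    model_path.write_text("VALUE = 1\n")
    first = load_generated_model(model_path)
    model_path.write_text("VALUE = 2\n")
    second = load_generated_model(model_path)

# With a real generated model, and assuming the PyMC 2 API, sampling
# would then look something like:
#   sampler = pymc.MCMC(load_generated_model("model.py"))
#   sampler.sample(iter=10000, burn=1000)
```

Loading into a fresh module sidesteps stale state from an earlier import, which is one of the things that can make the reload step annoying in practice.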

Simple to explain, annoying to get working.
