Bayesian inference in F# - Part IIa - A simple example - modeling Maia


Other parts:

Let’s start with a sim­ple ex­am­ple: in­fer­ring the un­der­ly­ing at­ti­tude of a small baby by ob­serv­ing her ac­tions. Let’s call this par­tic­u­lar small baby Maia. People al­ways asks her fa­ther if she is a good’ baby or not. Her fa­ther started to won­der how he can pos­si­bly know that. Being good’ is not very clear, so he chooses to an­swer the re­lated ques­tion if her at­ti­tude is gen­er­ally happy, un­happy or sim­ply quiet (a kind of mid­dle ground).

/// Underlying unobservable, but assumed stationary, state of the process (baby). Theta.
type Attitude =
    | Happy
    | UnHappy
    | Quiet

Her poor fa­ther does­n’t have much to go with. He can just ob­serve what she does. He de­cides, for the sake of sim­pli­fy­ing things, to cat­e­go­rize her state at each par­tic­u­lar mo­ment as smil­ing, cry­ing or look­ing silly (a kind of mid­dle ground).

/// Observable data. y.
type Action =
    | Smile
    | Cry
    | LookSilly


The fa­ther now has to de­cide what does it mean for Maia to be of an happy at­ti­tude. Lacking an uni­ver­sal de­f­i­n­i­tion of hap­pi­ness in terms of these ac­tions, he makes one up. Maia would be con­sid­ered happy if she smiles 60% of the times, she cries 20% of the times and looks silly the re­main­ing 20% of the times. He might as well have ex­per­i­mented with clearly happy/​un­happy ba­bies to come up with those num­bers.

/// Data to model the underlying process (baby)
let happyActions = [ Smile, 0.6; Cry, 0.2; LookSilly, 0.2]
let unHappyActions = [Smile, 0.2; Cry, 0.6; LookSilly, 0.2]
let quietActions = [Smile, 0.4; Cry, 0.3; LookSilly, 0.3]

What does it mean ex­actly? Well, this fa­ther would call his wife at ran­dom times dur­ing the day and ask her if Maia is smil­ing, cry­ing or look­ing silly. He would then keep track of the num­bers and then some­how de­cide what her at­ti­tude is. The gen­eral idea is sim­ple, the some­how part is not.

/// Generates a new uniformly distributed number between 0 and 1
let random =
    let rnd = new System.Random()

We can now model Maia. We want our model to re­turn a par­tic­u­lar ac­tion de­pend­ing on which at­ti­tude we as­sume Maia is in mostly. For ex­am­ple, if we as­sume she is an happy baby, we want our model to re­turn Smile about 60% of the times. In essence, we want to model what hap­pens when the (poor) fa­ther calls his (even poorer) wife. What would his wife tell him (assuming a par­tic­u­lar at­ti­tude)? The gen­eral idea is ex­pressed by the fol­low­ing:

/// Process (baby) modeling. How she acts if she is fundamentally happy, unhappy or quiet
let MaiaSampleDistribution attitude =
    match attitude with
    | Happy     -> pickOne happyActions
    | UnHappy   -> pickOne unHappyActions
    | Quiet     -> pickOne quietActions

The pickOne’ func­tion sim­ply picks an ac­tion de­pend­ing on the prob­a­bil­ity of it be­ing picked. The name sam­ple dis­tri­b­u­tion is sta­tis­tic-lingo to mean what you ob­serve’ and in­deed you just can ob­serve Maia’s ac­tions, not her un­der­ly­ing at­ti­tude.

The im­ple­men­ta­tion of pick­One gets tech­ni­cal. You don’t need to un­der­stand it to un­der­stand the rest of this post. This is the beauty of en­cap­su­la­tion. You can start read­ing from af­ter the next code snip­pet if you want to.

pickOne’ works by con­struct­ing the in­verse cu­mu­la­tive dis­tri­b­u­tion func­tion for the prob­a­bil­ity dis­tri­b­u­tion de­scribed by the Happy/UnHappy/Quiet/Actions lists. There is an en­try on wikipedia that de­scribes how this works and I don’t wish to say more here ex­cept pre­sent­ing the code.

/// Find the first value more or equal to a key in a seq<'a * 'b>.
/// The seq is assumed to be sorted
let findByKey key aSeq = aSeq |> Seq.find (fun (k, _) -> k >= key) |> snd /// Simulate an inverse CDF given values and probabilities let buildInvCdf valueProbs = let cdfValues = valueProbs |> Seq.scan (fun cd (_, p) -> cd + p) 0. |> Seq.skip 1 let cdf = valueProbs |> fst |> cdfValues |> Seq.cache fun x -> cdf |> findByKey x /// Picks an 'a in a seq<'a * float> using float as the probability to pick a particular 'a let pickOne probs = let rnd = random () let picker = buildInvCdf probs picker rnd

Another way to de­scribe Maia is more math­e­mat­i­cally con­ve­nient and will be used in the rest of the post. This sec­ond model an­swers the ques­tion: what is the prob­a­bil­ity of ob­serv­ing an ac­tion as­sum­ing a par­tic­u­lar at­ti­tude? The dis­tri­b­u­tion of both ac­tions and at­ti­tudes (observable vari­able and pa­ra­me­ter) is called joint prob­a­bil­ity.

/// Another, mathematically more convenient, way to model the process (baby)
let MaiaJointProb attitude action =
    match attitude with
    | Happy     -> happyActions |> List.assoc action
    | UnHappy   -> unHappyActions |> List.assoc action
    | Quiet     -> quietActions |> List.assoc action

List.assoc re­turns the value as­so­ci­ated with a key in a list con­tain­ing (key, value) pairs. Notice that in gen­eral, if you are ob­serv­ing a process, you don’t know what its joint dis­tri­b­u­tion is. But you can ap­prox­i­mate it by run­ning the MaiaSampleDistribution func­tion on known ba­bies many times and keep­ing track of the re­sult. So, in the­ory, if you have a way to ex­per­i­ment with many ba­bies with known at­ti­tudes, you can cre­ate such a joint dis­tri­b­u­tion.

We now have mod­eled our prob­lem, this is the cre­ative part. From now on, it is just ex­e­cu­tion. We’ll get to that.




Luca Bolognese's WebLog


Other parts: Part I — Background Part II — A sim­ple ex­am­ple — mod­el­ing Maia The pre­vi­ous post ended on