# Bayesian inference in F# – Part IIb – Finding Maia's underlying attitude



The previous post ended on this note:

```
/// Probability of observing an action, given each underlying attitude.
/// Note: List.assoc came from early F#'s OCaml-compatibility library;
/// a plain lookup over the association list does the same job in modern F#.
let lookup key table = table |> List.find (fun (k, _) -> k = key) |> snd

let MaiaJointProb attitude action =
    match attitude with
    | Happy   -> happyActions   |> lookup action
    | UnHappy -> unHappyActions |> lookup action
    | Quiet   -> quietActions   |> lookup action
```
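For context, `happyActions`, `unHappyActions`, and `quietActions` were defined in the previous post. Here is a sketch of that setup; the probability values below are assumptions, chosen only to be consistent with the numbers quoted later in this post:

```
// Hypothetical reconstruction of the previous post's setup.
// The probabilities are illustrative assumptions, not the canonical values.
type Attitude = Happy | UnHappy | Quiet
type Action = Smile | Cry | LookSilly

// Each list maps an action to its probability, given the attitude.
let happyActions   = [ Smile, 0.6; Cry, 0.2; LookSilly, 0.2 ]
let unHappyActions = [ Smile, 0.2; Cry, 0.6; LookSilly, 0.2 ]
let quietActions   = [ Smile, 0.4; Cry, 0.3; LookSilly, 0.3 ]
```

Note that each list sums to 1: given an attitude, Maia must perform one of the three actions.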

This is just a matrix with one row per attitude and one column per action. It simply represents the probability associated with each (attitude, action) pair. It is useful to think about it in these terms, because it makes it easier to grasp the following function:

```
/// Likelihood of each mental state, given a particular observed action
let MaiaLikelihood action = fun attitude -> MaiaJointProb attitude action
```

This is simply a slice of the matrix: the action is fixed and the attitude varies. It answers the question: given that I observe a particular action, what is the probability that Maia has a certain attitude? This is called the "likelihood function" in statistics. Its general form is: given that I observe an outcome, what is the probability that it was generated by a process with a particular parameter?

A related question is then: what if I observe a sequence of independent actions? What is the probability that the baby has a certain attitude then? This is answered by the following:

```
/// Combined likelihood for a series of actions (the individual likelihoods multiply)
let MaiaLikelihoods actions =
    let composeLikelihoods previousLikelihood action =
        fun attitude -> previousLikelihood attitude * MaiaLikelihood action attitude
    actions |> Seq.fold composeLikelihoods (fun attitude -> 1.)
```

It is a trivial extension of the previous function (really), once you know that to combine the likelihoods of independent observations you multiply them.
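As a quick sanity check with made-up numbers: if P(Smile | Happy) = 0.6 and P(Cry | Happy) = 0.2, the likelihood of observing [Smile; Cry] under Happy is 0.6 * 0.2 = 0.12. The fold above reduces to exactly this product:

```
// Made-up numbers: P(Smile|Happy) = 0.6, P(Cry|Happy) = 0.2.
// Folding multiplication over the observations combines the likelihoods.
let likelihoodOfBothUnderHappy = [ 0.6; 0.2 ] |> List.fold (*) 1.   // 0.6 * 0.2 = 0.12
```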

We now need to describe what our prior is. A prior is our preconceived notion about a particular parameter (in this case the baby's attitude). You might be tempted to express that notion with a single value, but that would be inaccurate. You need to indicate how confident you are about it. In statistics you do that by choosing a distribution for your belief. This is one of the beauties of Bayesian statistics: everything is a probability distribution. In this case we really don't have any previous belief, so we pick the uniform distribution.

`let MaiaUniformPrior attitude = 1. / 3.`

Think of this as: you haven't read any baby-attitude-specific study or received any external information about the likely attitude of Maia, so you cannot prefer one attitude over another.

We are almost done. Now we have to apply Bayes' theorem and get the un-normalized posterior distribution. Forget about the "un-normalized" word for now. What is a posterior distribution? This is your output, your return value. It says: given my prior belief on the value of a parameter and given the outcomes that I observed, this is what I now believe the parameter to be. In this case it goes like: I had no opinion on Maia's attitude to start with, but after I observed her behavior for a while, I now think she is Happy with probability X, UnHappy with probability Y and Quiet with probability Z.

```
/// Calculates the un-normalized posterior given prior and likelihood
let unNormalizedPosterior (prior: 'a -> float) likelihood =
    fun theta -> prior theta * likelihood theta
```

We then need to normalize this thing (it doesn't sum to one). The way to do it is to divide each probability by the sum of the probabilities over all possible values of the parameter (all possible attitudes).
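A small self-contained sketch of that step, with made-up un-normalized values:

```
// Made-up un-normalized values for the three attitudes.
let unNormalized = [ "Happy", 0.00864; "UnHappy", 0.00096; "Quiet", 0.00576 ]
// Divide each value by the total so the results sum to one.
let total = unNormalized |> List.sumBy snd                        // 0.01536
let normalized = unNormalized |> List.map (fun (a, p) -> a, p / total)
// normalized = [("Happy", 0.5625); ("UnHappy", 0.0625); ("Quiet", 0.375)]
```

Each normalized value is just the original divided by the total, so the three results sum to one.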

```
/// All possible values for the unobservable parameter (mental state)
let support = [Happy; UnHappy; Quiet]

/// Normalize the posterior (so that it sums to 1.)
let posterior prior likelihood =
    let post = unNormalizedPosterior prior likelihood
    let sum = support |> List.sumBy (fun attitude -> post attitude)
    fun attitude -> post attitude / sum
```

We are done. Now we can start modeling scenarios. Let's say that you observe [Smile;Smile;Cry;Smile;LookSilly]. What could the underlying attitude of Maia be?

`let maiaIsANormalBaby = posterior MaiaUniformPrior (MaiaLikelihoods [Smile;Smile;Cry;Smile;LookSilly])`

We can then execute our little model:

```
maiaIsANormalBaby Happy
maiaIsANormalBaby UnHappy
maiaIsANormalBaby Quiet
```

And we get (0.5625, 0.0625, 0.375). So Maia is likely to be happy and unlikely to be unhappy. Let's now model one extreme case:

```
/// Extreme case
let maiaIsLikelyHappyDist = posterior MaiaUniformPrior (MaiaLikelihoods [Smile;Smile;Smile;Smile;Smile;Smile;Smile])
maiaIsLikelyHappyDist Happy
maiaIsLikelyHappyDist UnHappy
maiaIsLikelyHappyDist Quiet
```

And we get (0.944, 0.000431, 0.05). Now Maia is almost certainly Happy. Notice that I can confidently make this claim because the end result is exactly what I was looking for when I started my quest. Using classical statistics, that wouldn't be the case.

A related question I might want to ask is: given the posterior distribution for attitude that I just found, what is the probability of observing a particular action? In other words, given the model that I built, what does it predict?

```
/// Probability of observing a particular action, given the posterior over attitudes
let posteriorPredictive jointProb posterior =
    let composeProbs previousProbs attitude =
        fun action -> previousProbs action + jointProb attitude action * posterior attitude
    support |> Seq.fold composeProbs (fun action -> 0.)

let nextLikelyUnknownActionDist = posteriorPredictive MaiaJointProb maiaIsLikelyHappyDist
```

I don't have the strength right now to explain the mathematical underpinning of this. In words, it says: considering that Maia can have one of the three possible attitudes, with the probabilities calculated above, what is the probability that I observe a particular action? Notice that its signature is (Action -> float), which is the compiler's way of saying the same thing.
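Spelled out, the fold above computes a weighted average over the attitudes: P(action) = Σ over attitudes of P(action | attitude) · posterior(attitude). A quick sketch with made-up numbers:

```
// P(action) = sum over attitudes of P(action | attitude) * posterior(attitude).
// Made-up numbers: P(Smile|Happy) = 0.6, P(Smile|UnHappy) = 0.2, P(Smile|Quiet) = 0.4,
// and a posterior of (0.944, 0.0004, 0.055) over (Happy, UnHappy, Quiet).
let pSmile = 0.6 * 0.944 + 0.2 * 0.0004 + 0.4 * 0.055   // ≈ 0.588
```

Each attitude contributes its own probability of producing the action, weighted by how much we now believe Maia holds that attitude.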

Now we can run the thing.

```
nextLikelyUnknownActionDist Smile
nextLikelyUnknownActionDist Cry
nextLikelyUnknownActionDist LookSilly
```

And we get (0.588, 0.2056, 0.2055). Why is that? We’ll talk about it in the next post.
