Notice that we can’t make any additional eliminations from { \Pr(P | B_0, B_1) } because { P \not\perp B_0 \mid B_1 }. While P and B_0 are statistically independent a priori, conditioning on B_1 induces statistical dependence between P and B_0 (recall the collider bias and the other causal motifs).
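The induced dependence can be checked by direct enumeration. The sketch below assumes binary variables and uses made-up probabilities (none of these numbers come from the worked example). It builds the joint for the motif P → B_1 ← B_0 and compares Pr(P=1 | B_0) with Pr(P=1 | B_0, B_1=1).

```python
# Hypothetical probabilities, chosen only for illustration.
p_P = {1: 0.3, 0: 0.7}    # Pr(P)
p_B0 = {1: 0.6, 0: 0.4}   # Pr(B_0)
# Pr(B_1 = 1 | P, B_0): the collider depends on both of its parents.
p_B1_given = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}


def joint(p, b0, b1):
    """Pr(P=p, B_0=b0, B_1=b1) = Pr(P) Pr(B_0) Pr(B_1 | P, B_0)."""
    q = p_B1_given[(p, b0)]
    return p_P[p] * p_B0[b0] * (q if b1 == 1 else 1 - q)


def prob_P1(b0, b1=None):
    """Pr(P=1 | B_0=b0), or Pr(P=1 | B_0=b0, B_1=b1) when b1 is given."""
    b1_values = (0, 1) if b1 is None else (b1,)
    num = sum(joint(1, b0, v) for v in b1_values)
    den = sum(joint(p, b0, v) for p in (0, 1) for v in b1_values)
    return num / den


marginal = [prob_P1(b0) for b0 in (0, 1)]           # B_1 not observed
conditioned = [prob_P1(b0, b1=1) for b0 in (0, 1)]  # B_1 observed to be 1
print("Pr(P=1 | B_0=0), Pr(P=1 | B_0=1):", marginal)
print("Pr(P=1 | B_0=0, B_1=1), Pr(P=1 | B_0=1, B_1=1):", conditioned)
```

Without B_1, the value of B_0 leaves the probability of P unchanged. Once B_1 is observed, the two conditional probabilities differ, which is why B_0 cannot be dropped from the conditioning set.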
Similarly, while S is causally independent of P (causal influence flows in the direction of the arrows), information flows in both directions, so S and P are statistically dependent {( S \not\perp P )}. Knowing something about P gives you information about S and vice versa (recall the definition of statistical dependence).
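A small numerical illustration, using hypothetical numbers for a binary S and P with the single edge S → P: Bayes' rule turns Pr(P | S) into Pr(S | P), and the result differs from the prior Pr(S), so observing P is informative about S even though P does not cause S.

```python
# Hypothetical numbers for the edge S -> P (both variables binary).
p_S1 = 0.5                       # Pr(S=1)
p_P1_given_S = {1: 0.8, 0: 0.2}  # Pr(P=1 | S)

# Marginal Pr(P=1), obtained by summing over the two values of S.
p_P1 = p_P1_given_S[1] * p_S1 + p_P1_given_S[0] * (1 - p_S1)

# Bayes' rule runs against the arrow: Pr(S=1 | P=1).
p_S1_given_P1 = p_P1_given_S[1] * p_S1 / p_P1

print("Pr(S=1) =", p_S1)
print("Pr(S=1 | P=1) =", p_S1_given_P1)
```

If the two rows of `p_P1_given_S` were set equal, P would carry no information about S and the two printed values would coincide.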
The equality we’re trying to derive (i.e. Equation 2) is
Wow, that was easier. Why didn’t we just do that to start? Because we didn’t know what order of variables to use.
Exercises
Based on the posterior given in Equation 1, what are the prior, the likelihood, and the marginal evidence?
In Equation 3, we used the conditional independencies of the causal graph to express { \Pr(S | B_0, B_1, D, P) } as { \Pr(S | P) }. Explain why the causal graph permits the elimination of B_0, B_1, and D, but not P. For instance, we condition on B_1, which is a collider on the path between S and B_0. Why can we still eliminate B_1?
We have shown that there are multiple (potentially very many) factorizations of a joint distribution that are probabilistically equivalent. E.g. in the worked example,
\begin{align*}
\Pr(B_0, B_1, &D, P, S, A)
\\=&\\
\Pr(A | B_1, D) \times \Pr(B_1 | P, B_0) &\times \Pr(P | S) \times\Pr(S, B_0, D)
\\=&\\
\Pr(A | B_1, D) \times \Pr(B_1 | B_0) \times& \Pr(P | B_0, B_1) \times \Pr(S | P) \times \Pr(B_0) \times \Pr(D)
\end{align*}
Why did these authors choose the former factorization rather than the latter, or any of the myriad other factorizations that are mathematically equivalent? Why bother factorizing at all? Why not just use the joint probability { \Pr(B_0, B_1, D, P, S, A) }?
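The equivalence of the two factorizations above can be checked numerically. The following is a sketch under stated assumptions, not the authors' model: all six variables are treated as binary, the conditional probabilities are drawn at random, and { \Pr(S, B_0, D) } is assumed to factor as { \Pr(S) \times \Pr(B_0) \times \Pr(D) }. The helpers `bern`, `pr`, and `cond` are hypothetical names introduced for this example.

```python
import random
from itertools import product

random.seed(0)
VARS = ("B0", "B1", "D", "P", "S", "A")


def bern(p1, value):
    """Probability that a binary variable with Pr(1)=p1 takes `value`."""
    return p1 if value == 1 else 1 - p1


# Hypothetical, randomly drawn parameters for the assumed graph.
pS, pB0, pD = (random.uniform(0.1, 0.9) for _ in range(3))
pP = {s: random.uniform(0.1, 0.9) for s in (0, 1)}                      # Pr(P=1 | S)
pB1 = {k: random.uniform(0.1, 0.9) for k in product((0, 1), repeat=2)}  # Pr(B1=1 | P, B0)
pA = {k: random.uniform(0.1, 0.9) for k in product((0, 1), repeat=2)}   # Pr(A=1 | B1, D)

# First factorization, with Pr(S, B0, D) assumed to equal Pr(S) Pr(B0) Pr(D).
joint = {}
for b0, b1, d, p, s, a in product((0, 1), repeat=6):
    joint[(b0, b1, d, p, s, a)] = (
        bern(pA[(b1, d)], a) * bern(pB1[(p, b0)], b1) * bern(pP[s], p)
        * bern(pS, s) * bern(pB0, b0) * bern(pD, d)
    )


def pr(**fixed):
    """Marginal probability of the assignment in `fixed`, summed from the joint."""
    return sum(v for k, v in joint.items()
               if all(k[VARS.index(n)] == x for n, x in fixed.items()))


def cond(target, given):
    """Pr(target | given), both passed as dicts of variable name -> value."""
    return pr(**target, **given) / pr(**given)


# Second factorization: every term is recomputed from the joint by marginalizing.
max_gap = 0.0
for (b0, b1, d, p, s, a), value in joint.items():
    second = (
        cond({"A": a}, {"B1": b1, "D": d}) * cond({"B1": b1}, {"B0": b0})
        * cond({"P": p}, {"B0": b0, "B1": b1}) * cond({"S": s}, {"P": p})
        * pr(B0=b0) * pr(D=d)
    )
    max_gap = max(max_gap, abs(second - value))

print("largest difference between the two factorizations:", max_gap)
```

The difference should be at the level of floating-point rounding for every one of the 64 assignments. The check says nothing about which factorization is more useful; that question is left to the exercise.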
Render env
%reset -f
import sys
import platform
import importlib.metadata

print("Python:", sys.version)
print("Platform:", platform.system(), platform.release())
print("Processor:", platform.processor())
print("Machine:", platform.machine())
print("\nPackages:")
for name, version in sorted(
    ((dist.metadata["Name"], dist.version) for dist in importlib.metadata.distributions()),
    key=lambda x: x[0].lower(),  # Sort case-insensitively
):
    print(f"{name}=={version}")
Baker, Chris L., Jara-Ettinger, Julian, Saxe, Rebecca, & Tenenbaum, Joshua B. (2017). Rational quantitative attribution of beliefs, desires and percepts in human mentalizing. Nature Human Behaviour, 1(4), 598. https://doi.org/10.1038/s41562-017-0064