Question regarding Homework assignment 2.2, subtask 2

system · May 12, 2020 at 17:19

Disclaimer: This thread was imported from the old forum, so some formatting may not be displayed correctly. The original thread begins in the second post of this thread.

Username123 · May 12, 2020 at 17:19

Hello,

I lost a few points on this subtask of the above-mentioned homework assignment, so I have the following question:

The solution of subtask 2 says “We normalize to F= Age_25, making …”. Does this have anything to do with the normalization we do in Bayesian networks? The steps in the solution seem intuitive to me: we only look at Age_25 women now, so instead of writing P(Down|F=Age_25), we simply write P(Down) (correct me if this is wrong). However, I'm not sure whether I'm missing something, because I can't see how this normalization is the same normalization we use in Bayesian networks. They're different concepts, right?

Thanks in advance and have a nice day!

rappatoni · May 13, 2020 at 15:32

You are on the right track: whenever P(X) is a probability distribution, so is P(X|Y=y) for any fixed value y of Y. Hence you can, e.g., write P_(Y=y)(X) to denote that probability distribution. In the example, if P(Down) is a probability distribution, so is P_(F=Age_25)(Down).

This is indeed equivalent to the normalization technique (which, among other things, is used in Bayesian networks). For the vector P_(F=Age_25)(Down) = alpha*[x,y] to be a probability distribution, its values have to sum to one. Note that (by definition) x = P(down /\ F=Age_25) and y = P(not down /\ F=Age_25). Since these are only part of the joint probability distribution P(Down, F), the two probabilities will in general not sum to 1. So to obtain a probability distribution we have to choose a suitable normalization constant alpha. This is done by setting alpha = 1/(x+y) = 1/P(F=Age_25) (the last equality holds by marginalization).
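
To make the normalization step concrete, here is a minimal sketch in Python. The joint probabilities are made-up placeholders (not the homework values), and `normalize` is just a hypothetical helper name chosen for illustration:

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to 1.

    Returns the normalization constant alpha and the scaled list.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("cannot normalize: weights must sum to a positive value")
    alpha = 1.0 / total
    return alpha, [alpha * w for w in weights]


# Hypothetical joint entries, NOT the homework values:
# x = P(down /\ F=Age_25), y = P(not down /\ F=Age_25)
x, y = 0.0002, 0.2498

# x + y = P(F=Age_25) by marginalization, so alpha = 1 / P(F=Age_25)
alpha, distribution = normalize([x, y])

print("alpha =", alpha)
print("P_(F=Age_25)(Down) =", distribution)
```

The two printed entries sum to 1, which is exactly what turns the slice [x, y] of the joint distribution into the conditional distribution P(Down | F=Age_25).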

In the homework, most of the probabilities you were given were already conditioned on F=Age_25, so normalizing did not require any computation. However, there was one exception which, strictly speaking, one would also have to condition on age: the false positive/false negative rates P(pos|not down) and P(not pos|down). These might depend on the age of the tested person. However, most of you assumed that the test result is independent of age given the Down status, hence P(pos|not down) = P(pos|not down, F=Age_25) (and likewise for P(not pos|down)), which allows you to use the given values.
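
As a rough sketch of how that independence assumption is used, the following Python snippet computes P(down | pos, F=Age_25) by normalization. All numbers are hypothetical placeholders, not the values from the homework, and the variable names are only chosen for illustration:

```python
# All numbers below are hypothetical placeholders, NOT the homework values.
p_down_given_age = 0.001       # P(down | F=Age_25), already conditioned on age
p_pos_given_not_down = 0.05    # false positive rate P(pos | not down)
p_neg_given_down = 0.01        # false negative rate P(not pos | down)

# Assumption: the test result is independent of age given the Down status,
# so P(pos | down, F=Age_25) = P(pos | down), and likewise for "not down".
p_pos_given_down = 1.0 - p_neg_given_down

# Unnormalized entries: P(pos /\ down | F=Age_25) and P(pos /\ not down | F=Age_25)
x = p_pos_given_down * p_down_given_age
y = p_pos_given_not_down * (1.0 - p_down_given_age)

# Normalization constant: alpha = 1 / P(pos | F=Age_25)
alpha = 1.0 / (x + y)
p_down_given_pos = alpha * x
p_not_down_given_pos = alpha * y

print("P(down | pos, F=Age_25)     =", p_down_given_pos)
print("P(not down | pos, F=Age_25) =", p_not_down_given_pos)
```

Without the independence assumption, the two rates would have to be replaced by their age-specific versions P(pos | not down, F=Age_25) and P(not pos | down, F=Age_25), which the sketch above does not have.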

Username123 · May 13, 2020 at 17:27

Thank you very much, that explanation has made me understand it!