How to calculate the odds of something being true vs. it being false, given the evidence (Bayesian inference)

Mauro

Active Member
The name 'Bayesian inference' is intimidating, but in the end the method is very simple and only requires multiplications and divisions. If you want a more formal (and probably more accurate) explanation you can start from Wikipedia: Bayesian inference. I tried to keep things as simple as I could (without losing too much rigour, I hope) and, being an engineer, I focused more on how to do it in practice than on math or philosophy.

THE VERY BASICS & DEFINITIONS

We need at least two competing hypotheses. We could call them H and K, but luckily in the simplest case one is the negation of the other (i.e. it was a cat, or it was not a cat). Since this is easier, I'm going to use the simplest case in what follows, so instead of calling the hypotheses H and K (and then L, M, N...) I'll stick to H and notH from now on.

P(H) means 'the probability that H is true' while P(notH) means 'the probability that notH is true', or equivalently 'the probability that H is false'.

Notation: 70% probability is the same as 70/100 = 0.7; I'll use both notations.


THE BAYESIAN ALGORITHM: HOW TO MAKE THE CALCULATIONS

Let's start from the very beginning: we have the two hypotheses but nothing else at all, no information, no nothing. Which numbers should we assign to P(H) and P(notH)? We have no reason to prefer H over notH, or vice versa, so the answer is P(H) = 50% (0.5) and P(notH) = 50%. We call these numbers prior probabilities, for reasons which will become clear later. Knowing the prior probabilities we can now calculate the prior odds of H being true vs. H not being true (or vice versa, if you prefer): P(H)/P(notH) = 0.5/0.5 = one to one, a fair bet.


We then get some evidence (let's call it E) which may help us to shed light on the mystery of H vs. notH. What we need now is a way to account for the new evidence in the calculation of our odds. To do that we need two more numbers:​
  • P(E, given H) = the probability that, if H is true, we will get evidence E​
  • P(E, given notH) = the probability that, if H is not true, we will get evidence E nonetheless​
Finding these two numbers is the tricky part! For now, let's assume we have found those two numbers and let's see how we should use them to revise our prior probability. This is the formula, it looks nasty, but hold on and it will become simple:

P(H, given E)/P(notH, given E) = P(E, given H)/P(E, given notH) * P(H)/P(notH)

The formula is a consequence of a fundamental theorem of probability theory called "Bayes' theorem" (hence the name 'Bayesian inference'). For the demonstration, see the appendix at the end, but first I want to explain what all those Ps, Hs and Es actually are.
  • P(H, given E)/P(notH, given E) : this is the number we want to find! In plain language it is the odds (the ratio) of the probability of H being true given evidence E, versus the probability of notH being true given the same evidence.​
  • P(E, given H) and P(E, given notH) are the two tricky-to-find numbers that we assumed to have somehow determined before​
  • P(H)/P(notH) are the prior odds we started from​
So we can just plug the numbers we know into the formula and calculate the new odds we want to find!


One simple example to fix ideas. I have two hypotheses: there was a cat in my backyard (H) and there was no cat in my backyard (notH). For some reason I haven't looked into my backyard for ages and I have no idea whether there were cats, nor do I have any idea how frequently cats come to my backyard, nor anything else: the only thing I can soundly say at this point is P(there was a cat) = P(there were no cats) = 50%. Then my neighbour shows me a photograph of a proud tabby tomcat sitting in my backyard: this is E, the evidence. Now the tricky part: how probable is it for that picture to exist if there really was a cat in my backyard? And how probable if there were no cats? Well, in this case I could reasonably say P(cat picture, given there was a cat) = 99%, P(cat picture, given there were no cats) = 1%. But this is not a given: for instance my neighbour could be a notorious serial prankster and then I'd better say P(cat picture, given there was a cat) = 60%, P(cat picture, given there were no cats) = 40% [I told you this was the tricky part!]. But anyway, let's stick with the original 99% and 1% and apply the formula:

The odds that there really was a cat in my backyard, given the picture, against 'there really were no cats', is = (0.99/0.01) * (0.5/0.5) = 99 times to 1. I bet it was a cat!

What if my neighbour is a prankster? The odds become (0.6/0.4) * (0.5/0.5) = only 1.5 times to 1. Should I bet now? Hmmmm...

What if my neighbour shows me a picture of my backyard with a blurry smudge which could be any kind of small animal (or even no animal at all)? Pick your two numbers, P(blurry picture, if there was a cat) and P(blurry picture, if there were no cats), put them in the formula and calculate.
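For anyone who prefers to see the arithmetic spelled out, here is a minimal Python sketch of the odds formula applied to the cat example. The function name `posterior_odds` is just a label chosen here, and the two numbers used for the blurry picture are pure assumptions picked for illustration; the text above deliberately leaves them to the reader.

```python
def posterior_odds(prior_odds, p_e_given_h, p_e_given_not_h):
    """Odds of H vs. notH after evidence E: likelihood ratio times prior odds."""
    if p_e_given_not_h <= 0:
        raise ValueError("P(E, given notH) must be greater than zero")
    return (p_e_given_h / p_e_given_not_h) * prior_odds


prior = 0.5 / 0.5  # P(H)/P(notH) with no information: one to one

honest = posterior_odds(prior, 0.99, 0.01)     # trustworthy neighbour
prankster = posterior_odds(prior, 0.60, 0.40)  # prankster neighbour
blurry = posterior_odds(prior, 0.30, 0.25)     # ASSUMED numbers, illustration only

for label, odds in [("honest", honest), ("prankster", prankster), ("blurry", blurry)]:
    print(f"{label:>9}: {odds:.2f} to 1 in favour of the cat")
```

With the assumed blurry-picture numbers the odds barely move from one to one, which is what one would expect from weak evidence.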



And now here comes the magic of Bayesian inference: the new odds we just calculated become our new prior odds, and we can start from them and factor in another piece of evidence we have found, using the same formula as before, and then more and more pieces of evidence, until we have exhausted all the evidence we have and we are left with our final odds (often called the posterior or consequent odds, even if, should we find yet one more piece of evidence, they'll become the new priors from which to start a new round of calculations).
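The chaining can be sketched in a few lines of Python. This is only an illustration: the helper name `chain_updates` and the three evidence pairs are hypothetical, and the sketch assumes that each pair of probabilities is estimated in the light of the evidence already factored in (or that the pieces of evidence are independent of each other once H or notH is fixed).

```python
def chain_updates(prior_odds, evidence):
    """Apply the odds formula once per piece of evidence, in order.

    `evidence` is a list of (P(E, given H), P(E, given notH)) pairs.
    Returns the odds after each step, starting with the prior odds.
    """
    odds = prior_odds
    history = [odds]
    for p_e_given_h, p_e_given_not_h in evidence:
        odds *= p_e_given_h / p_e_given_not_h  # old posterior acts as the new prior
        history.append(odds)
    return history


# Hypothetical evidence, for illustration only.
evidence = [(0.60, 0.40), (0.70, 0.20), (0.30, 0.50)]
history = chain_updates(1.0, evidence)
print(history)
```

Since the update is a plain multiplication, the order in which the pieces of evidence are processed does not change the final odds.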

There is another, psychological, good thing about using Bayes: one is compelled to think about both the preferred hypothesis (H) and the hypothesis one doesn't like (notH), because both P(E, given H) and P(E, given notH) must be calculated. This helps a lot in keeping things in perspective and in not being led astray by what one would like the final answer to be. Just try reasoning this way (even if you don't do any calculations, though having to follow an algorithm helps a lot), then you'll tell me.


THE TRICKY PART

We still need to address the 'tricky part': how do we find the two numbers we need, the probability of getting our evidence E, first in case H is true, then in case H is not true? In the example with the cat reasonable numbers were easy to find and no one will yell at you from a forum if you cannot justify why you think P(cat picture, given there was a cat) is 99%. But in a real-life discussion one cannot simply throw out probabilities at random or, worse, juggle the numbers so they fit one's theory. This problem is difficult and there is no general formula for it; there are, however, some sound methods to approach it (and many unsound ones).

1) FORMAL ARGUMENTS

From many points of view this method is ideal; the problem is that it's rarely applicable in practice. If you can find a correct formal logical argument which allows you to directly calculate the probabilities (starting from some indisputable premises) the problem is solved. In practice, this is much easier said than done.


2) THE REFERENCE CLASS

A much more broadly applicable method is the 'reference class'. Easy example: a friend of mine has taken a picture of a random person on the street, what is the probability it was a female? One possible reference class is easy to find: 'persons'. Do we have any data on how many persons are male and how many are females? Sure we do, we have lots of data, and so we can assign a probability to the picture representing a female, which will be about 50%. First rule for a reference class to be a sound one: we need to have (reliable) data about the members of the class. If we don't, our reference class is useless.

Now imagine that my friend took his picture in the streets at Mount Athos, a place in Greece which women are forbidden to enter (it's an absurd thing, I know, but it's true). This completely invalidates the previous assumption: the 50% probability we determined before is completely wrong! Second rule for a reference class to be a sound one: it must be as specific as possible, or in other words it must include all the available, relevant information we have. If we leave out the information about the picture being taken at Mount Athos we make a big mistake. Our reference class cannot be 'persons' anymore, it must become 'persons at Mount Athos' to be meaningful.


3) LAPLACE'S RULE OF SUCCESSION

This is even more broadly applicable than reference classes. What is the probability that the cat in my backyard had a bright pink coat and a blue tail? There are no known confirmed examples of pink cats with blue tails, but it would be a mistake to answer 'zero percent': this would mean denying the not-hypothesis by definition or, if you prefer, begging the question that no cat is or ever will be pink with a blue tail, a bad logical mistake, akin to saying 'no aliens have ever been confirmed to exist, thus no aliens exist'. Laplace's rule of succession comes to the rescue: we can start examining cats (or aliens) one by one and see if we find a pink blue-tailed one. After, say, 10000 cats and (say, who knows?) zero weirdos found we apply Laplace's formula:

Probability = (N+1)/(S+2)

What does that mean? N is the number of pink blue-tailed cats we found, S is the number of cats we sampled. From the evidence given by our sampling, the probability that the next cat is a pink blue-tailed one is (0+1)/(10000+2) =~ 1/10000.

This formula can be used generally of course (it's in effect a recipe for building a reference class from scratch), but when N=0 it's the only possible way to rationally assign a meaningful probability.
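As a small illustration, the rule can be written as a one-line Python function. The name `rule_of_succession` is chosen here for the sketch, and exact fractions are used only to keep the result readable.

```python
from fractions import Fraction


def rule_of_succession(found, sampled):
    """Laplace's estimate (N+1)/(S+2) for the next sample being a 'hit'."""
    if not 0 <= found <= sampled:
        raise ValueError("need 0 <= found <= sampled")
    return Fraction(found + 1, sampled + 2)


pink_cat = rule_of_succession(0, 10000)  # the pink blue-tailed cat example
no_data = rule_of_succession(0, 0)       # nothing sampled yet

print(pink_cat, float(pink_cat))
print(no_data)
```

Note that with no samples at all the formula gives 1/2, which matches the 50% starting point used at the beginning of the algorithm.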


4) GUESSWORK

When everything else fails, guesswork is the last resort. It's also the most dangerous way of proceeding. A way to lessen the risk of being yelled at is to keep wide margins, i.e. not from 70% to 75%, but maybe from 55% to 85%, and to lean a lot against your favoured hypothesis. In the end, it's more a matter of achieving consensus than a matter of precision. The good news is that, surprisingly, even very broad estimates of the probabilities often yield strong odds in favor of (or against) one of the hypotheses.


5) THERE IS NO WAY TO KNOW

If something is totally unknown and unknowable, even by applying the broadest and least reliable method, guesswork, it's an unknown, and it cannot be used as evidence for, nor against, H (or notH). The probabilities of something unknown are 50%/50%, just as the very first prior probabilities from which all the calculations started: the effect is to cancel out in the formula and to change nothing at all, as it should be. And here lies another beauty of Bayesian inference: you stop worrying about what you can't know and concentrate instead on what you do know, and on what you may come to know (and hopefully on how to get it).
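The cancellation is easy to check numerically. In this sketch the prior odds of 3 to 1 are an arbitrary assumption; the pair 0.5/0.5 is the one from the text, and the other equal pairs are added (also as an assumption) to show that the ratio is what matters.

```python
prior_odds = 3.0  # ASSUMED prior odds, 3 to 1 in favour of H

results = []
for p in (0.5, 0.9, 0.01):
    # P(E, given H) and P(E, given notH) are both set to p
    likelihood_ratio = p / p
    results.append(likelihood_ratio * prior_odds)

print(results)  # every entry equals the prior odds
```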



TIPS & PITFALLS

Remember that probabilities are numbers which must be between zero and one: any other number cannot be a probability. This is important to know because it allows one to spot (big) mistakes in the premises, i.e. if I find the probability of something is -2 or 1.5 I surely made some error somewhere. Odds, on the contrary, can be any positive number. And try not to confuse probabilities with odds: they are two very different beasts even if closely related (I always need to repeat that to myself too!). Probabilities give more information, but odds are more tractable; they're both needed.
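To keep the two beasts apart, it may help to have the conversions written down. This is a sketch for the two-hypothesis case only (H vs. notH, with P(notH) = 1 - P(H)); the function names are made up for the example.

```python
def probability_to_odds(p):
    """Odds of H vs. notH for a probability p = P(H), with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be a probability strictly below 1")
    return p / (1.0 - p)


def odds_to_probability(odds):
    """P(H) for given odds of H vs. notH (odds must not be negative)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (odds + 1.0)


print(probability_to_odds(0.5))   # a fair bet
print(odds_to_probability(99.0))  # the cat example with the honest neighbour
```

The range checks reflect the point made above: a probability outside zero to one, or negative odds, signals a mistake somewhere in the premises.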

If you have two mutually exclusive and exhaustive hypotheses, H and notH, then P(notH) = 1 - P(H): if it's a cat with 70% probability, then it's not a cat with 30% probability. But take care that the two really cover all the possibilities, i.e. if it's a cat with 70% probability you cannot conclude it's a dog with 30% probability, because it could be any other kind of small animal (or even not an animal at all). This mistake is not always as easy to spot as in this example, beware. And, while we are on this, please notice that the two tricky numbers, P(E, given H) and P(E, given notH), do not need to add up to one: they are two totally different things. I.e. the probability of the sun rising tomorrow is ~100% if today is an odd day, but it is ~100% also if today is an even day: they add up to ~2 (and they'll cancel out when calculating the new odds on such silly evidence).

Probability zero and probability one. Wonderful quote from my University textbook: probability zero does not mean impossible and probability one does not mean certain. This is very true! One should never find nor suggest a 0% or 100% probability (that would be begging the question), even if it's perfectly legal to find and propose numbers such as 0.00000000000001% or 99.9999999999999%. In these cases writing just '0%' and '100%' is customary, but always keep in mind that 'impossibility' and 'certainty' can never be granted. To avoid confusion, it is better to add a tilde: ~0%, ~100%.

Beware of correlations. Instead of using the full Bayesian calculation algorithm it's often faster to start from some 'given' probabilities and calculate the first prior by just multiplying them together. For instance, if the probability of a cat in my backyard is 50% and the probability of a cat being black is 10% then the probability of a black cat in my backyard is 0.5 * 0.1 = 5%. This is a correct way to do things unless there is a correlation between the two probabilities we are multiplying together. In the case of cats and backyards I could have installed an automatic door with a camera which only allows black cats in: the two variables ('cat in my backyard' and 'cat is black') are now correlated and multiplying them would be a (very bad) mistake. Yet again, this is easy to see with cats and computer-controlled doors, but not easy at all in many practical cases.
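A rough numerical sketch of the difference follows. The general rule is P(A and B) = P(A) * P(B, given A), which reduces to the plain product only when the two are uncorrelated. The 99% used for the door is an assumption made up for the example (the door is taken to be almost, but not perfectly, reliable), and the 50% for 'cat in my backyard' is kept unchanged just to make the comparison easy.

```python
p_cat_in_yard = 0.5  # from the text
p_black = 0.1        # from the text: share of cats that are black

# No correlation: being in my backyard says nothing about the colour.
uncorrelated = p_cat_in_yard * p_black

# With the black-cats-only door, colour depends on being in the backyard.
p_black_given_cat_in_yard = 0.99  # ASSUMED reliability of the door
correlated = p_cat_in_yard * p_black_given_cat_in_yard

print(f"plain product:    {uncorrelated:.3f}")
print(f"with correlation: {correlated:.3f}")
```

Under these assumed numbers the plain product is off by roughly a factor of ten.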

Chasing butterflies. When I talked about reference classes I stressed that, to be meaningful, "it must be as specific as possible, or in other words it must include all the available, relevant information we have". There the focus was on 'all the available', but the 'relevant' part is very important too, and unfortunately forgetting it is a sure way to 'demonstrate' whatever one prefers the answer to be, which is a really bad thing. Example: I start from the 'persons at Mount Athos' reference class, but wait... there's a butterfly in the picture, shouldn't I multiply everything by the probability of having a butterfly in the picture? That exact butterfly maybe? At that exact position? Then I should conclude that my friend, actually, never took any picture: it's too improbable even to exist! The absurdity of doing this is pretty clear, but yet again, this may not be as easy to spot in real-life cases: chasing butterflies is indeed a common trick used to 'demonstrate' the most absurd rubbish.


APPENDIX: DEMONSTRATION OF THE BASIC FORMULA TO CALCULATE ODDS

We start from Bayes' theorem, which can be written this way:

P(A, given B) = P(B, given A) * P(A) / P(B)

I rewrite it, using the same notation I used before:

P(H, given E) = P(E, given H) * P(H) / P(E)

Mathematically it's an astonishing theorem, really, but don't focus on it, we just use it as a stepping stone. We now write two copies of Bayes' theorem, one for H and one for notH:

1) P(H, given E) = P(E, given H) * P(H) / P(E)
2) P(notH, given E) = P(E, given notH) * P(notH) / P(E)

And now we divide 1) by 2), getting rid of P(E):

P(H, given E)/P(notH, given E) = P(E, given H)/P(E, given notH) * P(H)/P(notH)

which is the formula for calculating odds given before, q.e.d.


Edit: added the 'Guesswork' chapter

Edit: corrected a mistake in the labelling of cats and pictures probabilities (thanks to @Mendel who first noticed this)

Edit: added the 'There is no way to know' chapter

Edit: added the 'Chasing butterflies' paragraph in 'Tips & Pitfalls'

Edit: removed an unnecessary and wrong sentence from the Appendix (thanks to @Mendel again and to @jplaza)
 

Mendel

Senior Member.
Well in this case I could reasonably say P(cat, given the picture) = 99%, P(no cats, given the picture) = 1%
This is not P(E, given H), it's P(H, given E); you've confused them.

Your example goes wrong from there because you plug it into the formula as if it was the other.
 

Mendel

Senior Member.
how can we ever expect to be able to say what P(to have evidence E) is? There's no way to answer such a question
Yes, there is.

P(E)=P(E|H)×P(H)+P(E|notH)×P(notH)

Note that all of the entities on the right are also inputs to your formula, so they're (un)known either way.
 

Mauro

Active Member
This is not P(E, given H), it's P(H, given E); you've confused them.

Your example goes wrong from there because you plug it into the formula as if it was the other.
Answering here to both your previous posts. In the first case the problem is in the P(cat, given the picture) description: that's wrong as you say, thank you for noticing that! The following calculations in the example (and the final results) are instead right. I'll edit the post soon [done now].

You're probably right about the second part too, so I should probably add something about that in my post. But here on the spot I cannot work through your formula and understand how it fits in the context. Could you maybe help me and give me a reference to a demonstration (or explanation) of your formula, so I can avoid googling around for info you probably already have? Thank you. [answered in post #9]
 

Mauro

Active Member
Bayes' is simpler expressed in odds ratios. 3b1b did a pithy vid on the subject a while back:
I have no doubt Bayesian inference has already been explained by many people, and in better ways than mine :) This is just my take; by the way, I wrote this not only for posting on Metabunk but also, in large part, for myself, to check whether I had really understood what the whole damn thing means and how it works. Thank you for the link to the vid.
 

jplaza

Member
Thank you @Mauro for the post.

I am interested in this topic, as it has popped up in a mailing list. Given a UFO case, what are the chances that the proposed solution is the true one? Or with respect to other possible solutions? Bayesian inference came up.

It seems to me that this is a reasonable tool to use when you still have uncertainties and want to evaluate whether you are on the right track, or whether a possible solution has such low odds that it is better to continue with the others (provided that you can calculate probabilities rather than rely on guesswork! That's where our disagreement was in the Gimbal thread :) )

But when you find a "smoking gun" kind of evidence, that is, evidence that definitively demonstrates whether H is true or not, how is it included or reflected in these calculations?
 

Mauro

Active Member
Thank you @Mauro for the post.

I am interested in this topic, as it has popped up in a mailing list. Given a UFO case, what are the chances that the proposed solution is the true one? Or with respect to other possible solutions? Bayesian inference came up.

It seems to me that this is a reasonable tool to use when you still have uncertainties and want to evaluate whether you are on the right track, or whether a possible solution has such low odds that it is better to continue with the others (provided that you can calculate probabilities rather than rely on guesswork! That's where our disagreement was in the Gimbal thread :) )

But when you find a "smoking gun" kind of evidence, that is, evidence that definitively demonstrates whether H is true or not, how is it included or reflected in these calculations?
Thank you @jplaza, I'm glad you liked it

A 'smoking gun' piece of evidence has a very high P(E, given H) and a very low P(E, given notH), say 99.999% and 0.001% (*). This will raise the final odds, in this example, by a factor of 0.99999/0.00001 = about one hundred thousand times. Even the most desperate hypothesis can be rescued by strong enough new evidence (but 'certain' and 'impossible' are never granted!)

(*) btw, they don't need to add up to 1, they are two totally different probabilities (see how they are written, one is not the negation of the other). [I'd better add this to the original post] [done, thank you for making that come to my mind]
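To put a number on 'rescuing a desperate hypothesis', here is a tiny sketch. The starting odds of 1 to 1000 against H are an assumption invented for the example; the two probabilities are the ones given above.

```python
prior_odds = 1 / 1000  # ASSUMED: H starts out as a 1000-to-1 long shot

p_e_given_h = 0.99999      # 99.999%
p_e_given_not_h = 0.00001  # 0.001%

likelihood_ratio = p_e_given_h / p_e_given_not_h
final_odds = likelihood_ratio * prior_odds
final_probability = final_odds / (final_odds + 1)  # valid for H vs. notH

print(f"likelihood ratio: {likelihood_ratio:.0f}")
print(f"final odds:       {final_odds:.1f} to 1")
print(f"final P(H):       {final_probability:.4f}")
```

Even so, the final probability stays below 100%, in line with the remark that 'certain' is never granted.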
 

Mauro

Active Member
Yes, there is.

P(E)=P(E|H)×P(H)+P(E|notH)×P(notH)

Note that all of the entities on the right are also inputs to your formula, so they're (un)known either way.
The formula is exact, but the problem in using it is that we only (un?)know the odds, P(H)/P(notH), we don't know P(H) and P(notH) separately as the formula requires (this except in the trivial case of the algorithm initialization, P(H) = P(notH) = 0.5)
 

jplaza

Member
The formula is exact, but the problem in using it is that we only (un?)know the odds, P(H)/P(notH), we don't know P(H) and P(notH) separately as the formula requires (this except in the trivial case of the algorithm initialization, P(H) = P(notH) = 0.5)
You know the odds to be any real number k
P(H)/P(notH)=k,

And at least in this specific case, you also know that :
P(H)=1-P(notH).

You can solve and get
P(H)=k/(k+1)
P(notH)=1/(k+1)

( for k=1, P(H)=0.5=P(notH) )
( for k=2, P(H)=2/3, P(notH)=1/3 )
(...)

So at least for the P(H) vs P(notH) case, P(E) can be computed
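A short sketch of that computation, using the numbers of the cat example from post #1 as input. The function name is made up here, and the whole thing assumes the two-hypothesis case where P(H) = 1 - P(notH).

```python
def probability_of_evidence(k, p_e_given_h, p_e_given_not_h):
    """P(E) from the odds k = P(H)/P(notH) and the two conditional probabilities."""
    p_h = k / (k + 1)
    p_not_h = 1 / (k + 1)
    return p_e_given_h * p_h + p_e_given_not_h * p_not_h


k = 1.0  # even odds, as at the start of the cat example
p_e = probability_of_evidence(k, 0.99, 0.01)

# Cross-check with Bayes' theorem: P(H, given E) = P(E, given H) * P(H) / P(E)
p_h_given_e = 0.99 * (k / (k + 1)) / p_e

print(p_e, p_h_given_e)
```

The resulting P(H, given E) corresponds to odds of 99 to 1, so the probability route and the odds route agree.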
 

Mauro

Active Member
You know the odds to be any real number k
P(H)/P(notH)=k,

And at least in this specific case, you also know that :
P(H)=1-P(notH).

You can solve and get
P(H)=k/(k+1)
P(notH)=1/(k+1)

( for k=1, P(H)=0.5=P(notH) )
( for k=2, P(H)=2/3, P(notH)=1/3 )
(...)

So at least for the P(H) vs P(notH) case, P(E) can be computed
Ahh you may very well be right, give me some time to check it please (and to dine xD).

Eventually, I'll remove all the discussion on the troublesome P(E) from the post, it's unnecessary anyway for the demonstration. Thanks as always )
 

Mauro

Active Member
You know the odds to be any real number k
P(H)/P(notH)=k,

And at least in this specific case, you also know that :
P(H)=1-P(notH).

You can solve and get
P(H)=k/(k+1)
P(notH)=1/(k+1)

( for k=1, P(H)=0.5=P(notH) )
( for k=2, P(H)=2/3, P(notH)=1/3 )
(...)

So at least for the P(H) vs P(notH) case, P(E) can be computed
You're indeed right (and silly me not to have thought that sooner). I have removed the offending unnecessary elucubration from post #1, thanks again.
 

Edward Current

Active Member
The probabilities of something unknown is 50%/50%
Is this part of Bayesian inference, or did you make it up? I don't see it mentioned in the Wikipedia article, or in the equivalent article in the Stanford Encyclopedia of Philosophy.

The probability that there's a ball of pure gold orbiting somewhere in the Solar System is 50/50?
The probability that invisible aliens live among us on Earth is 50/50?
The probability that a ball of pure gold orbits the Sun, with invisible aliens living among us, is 25%?
 

Mauro

Active Member
Is this part of Bayesian inference, or did you make it up? I don't see it mentioned in the Wikipedia article, or in the equivalent article in the Stanford Encyclopedia of Philosophy.

The probability that there's a ball of pure gold orbiting somewhere in the Solar System is 50/50?
The probability that invisible aliens live among us on Earth is 50/50?
The probability that a ball of pure gold orbits the Sun, with invisible aliens living among us, is 25%?
I didn't make that up at all, but give me at least until tomorrow for an authoritative reference though, please.

The examples you make miss the point: you know a lot of things about the Solar System and what orbits in it, so you can calculate (or at least estimate) the probability of a pure gold ball orbiting the Sun (which surely is extremely low). In the 'There is no way to know' chapter I'm talking about situations where just nothing can be said (mainly because we don't have any data at all: if we are at this point, not even guesswork could help us). For example (sorry for the silliness, it's late here): did Caesar like cheese or not? How are we supposed to ever know that, barring the miraculous find of some previously unknown writing mentioning Caesar's taste for cheese (*)? Whatever the case, no inferences can be made starting from Caesar's loving cheese or not, because we know nothing about it. Thus P(Caesar loved cheese) and P(Caesar did not love cheese) should both be set at 50%, because in this way the odds of Caesar's loving cheese are even and will not influence any result: nothing in, nothing out (**).

But I agree, this fact is not usually mentioned (I first read about it on Richard Carrier's website, among other things), which is a pity because the whole algorithm starts from just this consideration, and not mentioning it leaves one without the 'foundation' from which to start. This is one of the problems which made me scratch my head when I started toying with Bayesian inference, btw. It's just a technicality if you wish, but it's important if one wants to use Bayes in practice (and stop worrying about the unknowable).



(*) Maybe some author did in fact mention Caesar's tastes for cheese, I really don't know! Make that Nero, then, or Trajan, whoever. I told you the example was silly.

(**) of course one could try different ways, ie. try to determine which percentage of Romans loved cheese, but once you know that you now know something, which is different from knowing nothing. The 50%/50% 'rule' does not apply anymore. Better knowledge replaces inferior knowledge, and no knowledge can be lower than 'unknown'.
 

Mauro

Active Member
Is this part of Bayesian inference, or did you make it up? I don't see it mentioned in the Wikipedia article, or in the equivalent article in the Stanford Encyclopedia of Philosophy.

The probability that there's a ball of pure gold orbiting somewhere in the Solar System is 50/50?
The probability that invisible aliens live among us on Earth is 50/50?
The probability that a ball of pure gold orbits the Sun, with invisible aliens living among us, is 25%?
[Following the previous post] You made me think again about the '50% rule' and, in effect, you spotted a (small) problem (yet again, confusion between probabilities and odds.. I haven't told enough times to myself to take care, it seems..). What I wrote before about Caesar's love for cheese holds, but it was applied to P(H) and P(notH) while the probabilities involved in this case are P(E, given H) and P(E, given notH). The latter ones actually do not need to be set at 50%: they just need to be set equal, so they'll factor out in the odds calculations formula. They need to be set at 50% only at the start of the algorithm (because at the start of the algorithm those probabilities reduce simply to P(H) and P(notH), and in this case P(H) = 1 - P(notH), while this is not generally true for P(E, given H) and P(E, given notH)).

Conclusion: I think the basic idea underlying the "There is no way to know" chapter should have been clarified now: no knowledge in, no knowledge out. I'm going to modify my unfortunate 50% sentence to try to say it in the right way. Thanks for having made me think again!

... sigh.. the edit window has closed :(
 

Mendel

Senior Member.
In the example with the cat reasonable numbers were easy to find and noone will yell at you from a forum if you cannot justify why you think P(cat picture, given there was a cat) is 99%. B
I will.
Cats prowl at night, but the neighbor can't see your yard at night, so P(cat picture, given there was a cat) is at most 66%.
If your neighbor looks in your yard only 3 times a day, the probability is even lower.

You are guessing numbers with no data to come to a conclusion you feel is right, and the best that can achieve is to uncover an inconsistency in what you feel are the facts; but without actual data, your result remains an unsupported guess.

The Bayesian analysis doesn't create facts from nothing.
 

Mauro

Active Member
I will.
Cats prowl at night, but the neighbor can't see your yard at night, so P(cat picture, given there was a cat) is at most 66%.
If your neighbor looks in your yard only 3 times a day, the probability is even lower.
I'd really like to see you try an argument like that in a real-life debate on some heated real-life forum. By the way, do you know more about my backyard and my neighbour than I do? You also forgot the most obvious reason why a picture of a cat might not show a real cat: https://thiscatdoesnotexist.com/

You are guessing numbers with no data to come to a conclusion you feel is right, and the best that can achieve is to uncover an inconsistency in what you feel are the facts; but without actual data, your result remains an unsupported guess.

The Bayesian analysis doesn't create facts from nothing.

I explicitly said:
But for a real life discussion one cannot simply shoot probabilities at random or, worse, juggle the numbers so they fit his theory. This problem is difficult

Then I devoted some hours of my life to explain some sound methods (and even some unreliable ones, such as guesswork, explicitly marked as such) which one can hope to use to overcome this problem (explicitly marked as difficult).

I did not say 'Bayesian inference is a magical tool which allows you to create knowledge from nothing'. It's pretty obvious that if you put garbage in you get garbage out (and I know of published 'Bayesian inferences' which do exactly that, with comical results). For 'nothing in, nothing out' see also my previous two posts. Don't come tell me that without data there can be no results please, I've been saying that since yesterday.

What you seem to be implying with sentences like those in your post is that Bayesian inference is just rubbish and a waste of time. Is that what you're claiming? Or is it specifically me who misuses Bayesian inference?
 

Mendel

Senior Member.
Is this part of Bayesian inference, or did you make it up? I don't see it mentioned in the Wikipedia article, or in the equivalent article in the Stanford Encyclopedia of Philosophy.

The probability that there's a ball of pure gold orbiting somewhere in the Solar System is 50/50?
The probability that invisible aliens live among us on Earth is 50/50?
The probability that a ball of pure gold orbits the Sun, with invisible aliens living among us, is 25%?
"Probability" has two meanings. If you consider a bag with red and white balls in it, you can use a probability to describe the ratio of red balls in the bag. You can determine that ratio by sampling the bag. "Frequentists" would do that. (They also use Bayes's Theorem, which is where it gets confusing.)

Bayesians use a probability to describe their knowledge of the bag. At first, we know nothing about the bag, so we have no reason to prefer a color, and our knowledge (not the bag!) is described as P(red)=0.5. That's what we'd bet against someone else who knows as much as we do. Pulling a ball out of the bag changes our knowledge piece by piece, and Bayesian analysis can model that.

You can use Bayesian modeling to describe what you know about the Solar System and aliens to arrive at probabilities other than 50%. In fact, if you think the probability of gold balls orbiting is less than 50%, a Bayesian approach will force you to consider why you think that, since you'll need to add some evidence to move off the 50% probability.

Bonus question: is it more likely that aliens live among us, or that invisible aliens live among us?
 

Mauro

Active Member
"Probability" has two meanings. If you consider a bag with red and white balls in it, you can use a probability to describe the ratio of red balls in the bag. You can determine that ratio by sampling the bag. "Frequentists" would do that. (They also use Bayes's Theorem, which is where it gets confusing.)

Bayesians use a probability to describe their knowledge of the bag. At first, we know nothing about the bag, so we have no reason to prefer a color, and our knowledge (not the bag!) is described as P(red)=0.5. That's what we'd bet against someone else who knows as much as we do. Pulling a ball out of the bag changes our knowledge piece by piece, and Bayesian analysis can model that.
You could also read about sampling methods in my "Laplace's rule of succession" chapter, I'm pretty proud of having used pink blue-tailed cats rather than dull red balls.

You can use Bayesian modeling to describe what you know about the Solar System and aliens to arrive at probabilities other than 50%. In fact, if you think the probability of gold balls orbiting is less than 50%, a Bayesian approach will force you to consider why you think that, since you'll need to add some evidence to move off the 50% probability.
Thus, Bayesian inference is useful at the end of the day? It's just me who misuses it (see my post #17)?

Bonus question: is it more likely that aliens live among us, or that invisible aliens live among us?
Nice question, but not in the mood for exponential notations today, sorry.
 

Mendel

Senior Member.
I did not say 'Bayesian inference is a magical tool which allows you to create knowledge from nothing'. It's pretty obvious that if you put garbage in you get garbage out (and I know of published 'Bayesian inferences' which do exactly that, with comical results). For 'nothing in, nothing out' see also my previous two posts. Don't come tell me that without data there can be no results please, I've been saying that since yesterday.
You've done it without data for the cat picture example and for Gimbal. I'd love to see you demonstrate a data-driven Bayesian analysis.

What you seem to be implying with sentences like those in your post is that Bayesian inference is just rubbish and a waste of time; is that what you're claiming? Or that it's specifically me who misuses Bayesian inference?
I'm sure you're not alone in this.

One advantage of people (mis)using Bayesian inference is that they reveal why they're thinking what they're thinking.

When used with data, it's a great method.
 

Mendel

Senior Member.
There are no known confirmed examples of pink cats with blue tails, but it would be a mistake to answer 'zero percent': this would mean to deny the not-hypothesis by definition, or, if you prefer, to beg the question that no cat is or ever will be pink with a blue tail, a bad logical mistake,
To affirm that there can be a pink cat with a blue tail is a bad logical mistake.

Article:
Since we have the prior knowledge that we are looking at an experiment for which both success and failure are possible, our estimate is as if we had observed one success and one failure for sure before we even started the experiments.

If you do not have that prior knowledge, you can't justify using the rule of succession.
(A frequentist would compute a confidence interval.)
 

Mauro

Active Member
You've done it without data for the cat picture example and for Gimbal. I'd love to see you demonstrate a data-driven Bayesian analysis.
Ahhh, that's what you say. I don't see it that way, really. Just go look at the Gimbal thread and check how I opened every premise and numbers up to discussion and how, when presented with new evidence, I had no qualms about putting into the calculations odds of ten to one against the hypothesis which, from the preliminary evidence, I had calculated to be the best. Why are you accusing me? And what does my use (or misuse) of Bayesian analysis have to do with Bayesian analysis itself, which should be the topic of this thread?
 

Mendel

Senior Member.
I opened every premise and numbers up to discussion
The negation of "didn't use data" is not "opened for discussion". My point stands.
Oh c'mon @Mendel, please.
Look, if you can claim something's a "bad logical mistake" with no argument, so can I.
And since your claim was first, it's your turn to support it.

I suggest slowing down the discussion and taking time to ponder.
 

Mauro

Active Member
The negation of "didn't use data" is not "opened for discussion". My point stands.
But I did use data. The time/date of the Atlas launch is a datum, so is the time/date of the Gimbal video (unfortunately not so precisely known). Same goes for the fact that a Flir can see the exhaust of an Atlas, at least in certain (many) conditions, and suffer from glare by looking at it (at least in certain conditions), and this sums up all of my initial assumptions. And all those data are open for discussion, I can't see any problem here.

Look, if you can claim something's a "bad logical mistake" with no argument, so can I.
And since your claim was first, it's your turn to support it.
But I did put forth an argument, and not once, but twice! One cannot logically claim impossibility (nor certainty) because that would amount to begging the question (even for pink blue-tailed cats!). Go read again what I wrote in post #1, I even made a (hopefully clear) more 'mundane' example.

I suggest slowing down the discussion and taking time to ponder.
I wholeheartedly agree!
 

Mendel

Senior Member.
But I did put forth an argument, and not once, but twice!
You did argue that pink cats exist?
Could you quote those arguments?
One cannot logically claim impossibility (nor certainty) because that would amount to begging the question (not even for pink blue-tailed cats!).
If you can't be certain that a pink blue-tailed cat exists, you can't justifiably use the rule of succession. The rule only applies when an experiment can have 2 outcomes. If it can only have one outcome, you can't use it.
 

Mauro

Active Member
You did argue that pink cats exist?
Could you quote those arguments?

Sure I can! Only I didn't argue that pink cats exist, of course; I actually argued that one cannot claim it's impossible that a pink cat exists (nor a pink blue-tailed one, for that matter).

Post #1, Chapter 'Laplace's rule of succession'
What is the probability that the cat in my backyard had a bright pink coat and a blue tail? There are no known confirmed examples of pink cats with blue tails, but it would be a mistake to answer 'zero percent': this would mean to deny the not-hypothesis by definition, or, if you prefer, to beg the question that no cat is or ever will be pink with a blue tail, a bad logical mistake, akin to saying 'no aliens have ever been confirmed to exist, thus no aliens exist'.

Post #1, Chapter 'Tips & Pitfalls', paragraph 'Probability zero and probability one'
One should never find nor suggest a 0% or 100% probability (that would be begging the question),

[Mendel]If you can't be certain that a pink blue-tailed cat exists, you can't justifiably use the rule of succession.
Read the aliens example quoted above: isn't it absurd to say no aliens can exist because we aren't certain that aliens exist? The same goes for weird cats: one can never discount the possibility that one exists (unless we have examined each and every cat on Earth, of course, but if we could do that we wouldn't need to talk about Laplace's rule of succession in the first place, we'd just go for the "1) Formal arguments" solution). What if a mutation in some genes, or some genetic engineering, or just selective breeding produces a breed of pink cats with a blue tail? Improbable, yes; impossible, no.

[Mendel] The rule only applies when an experiment can have 2 outcomes. If it can only have one outcome, you can't use it.
And indeed the experiment has two outcomes: it's a pink blue-tailed cat (whether they actually exist or not), or it's a cat coloured some other way, a not-(pink blue-tailed cat).
 

FatPhil

Active Member
Mendel said: "Bonus question: is it more likely that aliens live among us, or that invisible aliens live among us? "

Nice question, but not in the mood for exponential notations today, sorry.

@Mendel's question is insightful, as it highlights a common mistake (one of too many assumptions) people make. It's worth the effort to give it a rethink.

Consider how your answer to the original would be different from your answer to the following: Is it more likely that aliens live among us or that visible aliens live among us? More explicitly - which is more likely, A exists, or A exists and has additional property X?
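The general rule behind the question can be put in a few lines of Python. This is only a sketch: the function name and every number in it are placeholders chosen for illustration, not estimates of anything. It uses the product rule P(A and X) = P(A) * P(X | A); since P(X | A) can never exceed 1, "A with property X" can never be more probable than "A", and the two are equal only when every A is certain to have X.

```python
def p_joint(p_a: float, p_x_given_a: float) -> float:
    """P(A and X) = P(A) * P(X | A)."""
    for p in (p_a, p_x_given_a):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_a * p_x_given_a


if __name__ == "__main__":
    p_aliens = 0.01  # placeholder value, NOT an estimate
    # Try several placeholder values for P(invisible | aliens live among us).
    for p_invisible_given_aliens in (0.0, 0.3, 1.0):
        joint = p_joint(p_aliens, p_invisible_given_aliens)
        print(p_invisible_given_aliens, joint, joint <= p_aliens)
```

The last placeholder, P(X | A) = 1, is the only case in which the two answers coincide.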
 

Mauro

Active Member
@Mendel's question is insightful, as it highlights a common mistake (one of too many assumptions) people make. It's worth the effort to give it a rethink.

Consider how your answer to the original would be different from your answer to the following: Is it more likely that aliens live among us or that visible aliens live among us? More explicitly - which is more likely, A exists, or A exists and has additional property X?
I agree it's interesting! I said something along those lines (not exactly the same, but related) in the 'Tips&Pitfalls, Chasing butterflies' paragraph.
 

deirdre

Senior Member.
which is more likely, A exists, or A exists and has additional property X?
aren't you assuming an additional property?

say aliens are like angels (ie invisible), then aliens exist and invisible aliens exist are the same number.

(ps. people dye cats.. they shouldn't because it's abuse, but they do)
 

FatPhil

Active Member
aren't you assuming an additional property?

say aliens are like angels (ie invisible), then aliens exist and invisible aliens exist are the same number.

The clause starting the sentence beginning "say ..." is an assumption of an additional property. That is an assumption I am not making. That doesn't mean I'm making an assumption.
 

deirdre

Senior Member.
i'm pointing out Mendel's example may be akin to "what are the chances a cat is in your backyard vs what are the chances a cat with feet is in your backyard."
 

Rory

Senior Member.
But angels aren't invisible to those that can see them. ;)

The answer is "it's more likely that aliens live among us than invisible aliens live among us - unless we can prove that there are no visible aliens."

I can't pretend to be as educated as you guys are in theories of probability - but I do think it would go a long way if you could back up your points with real world examples.

For instance, as some of you may know I designed and ran an esports player rating system that was based on probability (similar to Elo). What it said was "a player x points higher than another is predicted to win y percentage of the matches between them" - and when we looked back and searched all matches between players rated x points difference we found the win% of the higher rated player was (pretty much) y.

Point being, it was a theory based on knowledge and the theory could be validated by analysis.
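To make that kind of check concrete, here is a small Python sketch. It uses the standard Elo expected-score formula with the conventional 400-point scale; the system described above is only said to be "similar to Elo", so the formula, the function names, and the sample results are all assumptions for illustration, not the actual rating system.

```python
def elo_expected_score(rating_diff: float, scale: float = 400.0) -> float:
    """Predicted score of the higher-rated player under the standard Elo curve.

    rating_diff is (own rating - opponent rating); scale=400 is the usual
    chess convention and is an assumption here.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


def observed_win_rate(results: list[bool]) -> float:
    """Fraction of matches won by the higher-rated player (True = win)."""
    if not results:
        raise ValueError("need at least one match")
    return sum(results) / len(results)


if __name__ == "__main__":
    diff = 200.0
    predicted = elo_expected_score(diff)
    # Made-up match outcomes for players about 200 points apart.
    sample = [True, True, False, True, True, False, True, True]
    print(f"predicted {predicted:.3f}, observed {observed_win_rate(sample):.3f}")
```

Validation then amounts to comparing the predicted number with the observed frequency over many matches at each rating difference.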

So in this discussion I would think either the theory can be validated by real life examples or it can't - and if it can't it's a theory that doesn't really work.

And as far as Mauro's Gimbal/Atlas figures go, they do seem like guesswork - not based on data, not based on knowledge - and so, in that case, don't really serve much useful purpose, other than as a bit of fun.

I'd even go so far as to wonder even if they were based on data and knowledge what purpose they could serve, or what they could be used for? For example, if the probability is 1% that it's aliens and 99% that it's not, what then? Do you stop looking at it? Do you look at it more? Or won't people just do what they're going to do anyway?

But I understand: numbers are fun, speculating is fun, and a man's got to have his hobbies and his pastimes to fill the days and years. :)
 

Mauro

Active Member
And as far as Mauro's Gimbal/Atlas figures go, they do seem like guesswork - not based on data, not based on knowledge - and so, in that case, don't really serve much useful purpose, other than as a bit of fun.
I don't agree my estimates are not based on data (as I didn't agree with Mendel in post #25). I used and tried to make the most out of the very few data we actually have. And surely that involved guesswork: had I had more data I could have tried formal arguments or reference classes or Laplace; unfortunately I had to use what I actually had, and make (educated) guesses. Am I claiming that my data are the infallible and inerrant solution? Not at all, I just claim they're better than nothing and I hope they may be improved, or at least discussed critically rather than disparaged tout court, but oh well.

I'd even go so far as to wonder even if they were based on data and knowledge what purpose they could serve, or what they could be used for? For example, if the probability is 1% that it's aliens and 99% that it's not, what then? Do you stop looking at it? Do you look at it more? Or won't people just do what they're going to do anyway?
Isn't knowing that something is 99 times more probable than something else useful? Isn't it at least better than not being able to say anything at all? Then, what you do once you have that information is a totally unrelated topic.

But I understand: numbers are fun, speculating is fun, and a man's got to have his hobbies and his pastimes to fill the days and years. :)
Very much agreed! All work and no play makes Jack a dull boy :)
 

Rory

Senior Member.
I don't agree my estimates are not based on data (as I didn't agree with Mendel in post #25). I used and tried to make the most out of the very few data we actually have.

That's true that some of it did include data - but maybe it's like how anything multiplied by zero is always zero, anything multiplied by guesswork is always guesswork. Like given the data here -

The time/date of the Atlas launch is a datum, so is the time/date of the Gimbal video (unfortunately not so precisely known).

- what is the probability that they both overlap?

Isn't knowing that something is 99 times more probable than something else useful?

I would say (thinking about UAPs in particular) "why?"

Very much agreed! All work and no play makes Jack a dull boy :)

I was very gladdened to read these sentences. They made me smile also. Nice to know you took these challenges in the spirit intended. :)
 

Mauro

Active Member
That's true that some of it did include data - but maybe it's like how anything multiplied by zero is always zero, anything multiplied by guesswork is always guesswork. Like given the data here -

The time/date of the Atlas launch is a datum, so is the time/date of the Gimbal video (unfortunately not so precisely known).

- what is the probability that they both overlap?
I'm glad to answer this because you touched on a very interesting point. The answer is that... it does not really matter, except for poor Atlas which, after this omission, finds itself worse off. Why? Because you must apply the same reasoning to any other possible cause besides the Atlas, i.e., if Gimbal was in fact a transdimensional craft, what's the probability that the Gimbal video and the transdimensional craft's flight time overlapped? So every probability (of every possible cause) should be multiplied by the number you're asking for (which will in general be different for each cause). That will be an advantage for the Atlas, because we know for sure there was an Atlas launch around that time, while we do not know for sure (at all, in fact) of any other possible candidate (I may be wrong on this! Correct me if so). At this game, what exists with ~100% probability tends to beat what only might exist, and with a low x% probability at that (were x large, we'd have another candidate, and it couldn't have been overlooked).
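A rough Python sketch of that bookkeeping, for whoever wants to play with it: each candidate cause gets its own multiplying factor, and the results are renormalised so they sum to one. The function name, the hypothesis labels, and all the numbers are placeholders invented for illustration; they are not the figures from the Gimbal thread and not estimates of anything.

```python
def reweight(priors: dict[str, float], factors: dict[str, float]) -> dict[str, float]:
    """Multiply each hypothesis' probability by its own factor, then renormalise."""
    if priors.keys() != factors.keys():
        raise ValueError("priors and factors must cover the same hypotheses")
    weights = {h: priors[h] * factors[h] for h in priors}
    total = sum(weights.values())
    if total <= 0.0:
        raise ValueError("all weights are zero; nothing to normalise")
    return {h: w / total for h, w in weights.items()}


if __name__ == "__main__":
    # Placeholder probabilities before considering overlap (NOT real estimates).
    before = {"Atlas": 0.5, "something else": 0.5}

    # Same overlap factor for every cause: the factor cancels out.
    print(reweight(before, {"Atlas": 0.3, "something else": 0.3}))

    # Different overlap factors: the cause more certain to have been there gains.
    print(reweight(before, {"Atlas": 0.9, "something else": 0.1}))
```

If the factor were the same for every cause it would cancel in the normalisation; it only changes the picture when it differs between causes.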

As a sidenote, reasoning on 'probabilities of overlap' is often a bad idea (not always of course, but that's why I avoided them in the first place, and in any case they have the tendency to become a mess). Why? There's a big risk of going 'chasing butterflies':
[...] but wait... there's a butterfly in the picture, shouldn't I multiply everything by the probability of having a butterfly in the picture? That exact butterfly maybe? At that exact position? Then I could even conclude that my friend, actually, never took any picture, it's too improbable even to exist!


I would say (thinking about UAPs in particular) "why?"
I don't know, I find it useful, like any other piece of knowledge, or at least interesting. Isn't it also useful to have a quantitative analysis, with numbers which can be compared, instead of a qualitative one? Or to know how much each of your assumptions actually influences the final results, to have numbers showing clearly on what exactly your beliefs really rest? Or to have a method to extract all that is possible from meager data?

I was very gladdened to read these sentences. They made me smile also. Nice to know you took these challenges in the spirit intended. :)
I was glad too to read your post. I'm also very glad to have the Metabunk community review what I wrote, I couldn't hope for anything better, and you're challenging (and often right :)).
 

Rory

Senior Member.
Isn't it also useful to have a quantitative analysis, with numbers which can be compared, instead of a qualitative one?

I mean, yes - perhaps - probably not. That would rely on real numbers, real data, and real knowledge to use as input. And even then, as someone who works in analysing quantitative and qualitative data in my job as well as in my hobby, in cases like these I would say "no". ;)
 

Mendel

Senior Member.
But I did use data. The time/date of the Atlas launch is a datum, so is the time/date of the Gimbal video (unfortunately not so precisely known). Same goes for the fact that a Flir can see the exhaust of an Atlas, at least in certain (many) conditions, and suffer from glare by looking at it (at least in certain conditions), and this sums up all of my initial assumptions. And all those data are open for discussion, I can't see any problem here.
None of these are statistical data, and the way you derive probabilities from them involves guesswork in every case.

Rory's example shows two ways to derive a probability: via the Elo ratings, and via frequency counting. Neither involved any guesswork.

One should never find nor suggest a 0% or 100% probability (that would be begging the question)
Take a bag, put a bunch of red balls in, draw a ball at random and determine the probability that it's white.
• According to your approach, we have two possible outcomes, so the rule of succession applies.
• According to my approach, you need to show there's a white ball in there before you apply it.

(What makes your approach absurd is if you generalise the rule for more possible outcomes, then the more colors you consider, the lower the probability for red becomes.)
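The parenthetical can be checked with the textbook generalisation of the rule of succession to k possible outcomes, (count + 1) / (draws + k), which follows from a uniform prior over the k colour proportions. The Python below is only a sketch under that assumption; the function name and the choice of ten red draws are illustrative.

```python
from fractions import Fraction


def p_next_colour(count: int, total_draws: int, n_colours: int) -> Fraction:
    """Generalised rule of succession: (count + 1) / (total_draws + n_colours).

    Assumes a uniform prior over the proportions of n_colours colours.
    """
    if n_colours < 2:
        raise ValueError("need at least two possible colours")
    if not 0 <= count <= total_draws:
        raise ValueError("need 0 <= count <= total_draws")
    return Fraction(count + 1, total_draws + n_colours)


if __name__ == "__main__":
    # Ten draws, all red. Only the number of colours we *consider* changes.
    for k in (2, 3, 5, 10):
        p_red = p_next_colour(10, 10, k)
        print(f"{k} colours considered: P(next is red) = {p_red}")
```

The data (ten reds out of ten) never change in the loop, yet the probability assigned to red drops as more colours are merely considered.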

Your belief seems to be that anything you can think of can exist. That is both an old Greek philosophical notion and a modern technocratic credo.

I am deeply suspicious of any model of knowledge that has no way to express uncertainty. Your presentation of Bayesian inference here and your use of it in the Gimbal case don't. (A full Bayesian analysis can do this.)

You argue that we can't decide if a pink cat with blue tail exists, and therefore the analysis should proceed as if it did exist. But that is a shortcoming of your method. You simply want to proceed; your method requires that this cat exists; therefore you assume it, for no reason other than that you can't continue if it doesn't.
I say, if you can't continue, choose a different method.

The pitfall is a confusion about the use of a probability to express knowledge, and the use of a probability to express reality. Laplace requires reality.

If you don't use Laplace, go sample some cats. You'll be able to put the point estimate for pink blue-tailed cats at 0%, with a 95% confidence interval expressing your uncertainty that includes some probabilities close to 0 and shrinks with sample size.
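A small Python sketch of that confidence interval, for the case where none of the sampled cats is pink with a blue tail. It uses the exact (Clopper-Pearson) two-sided interval, whose upper end for zero successes in n independent samples is 1 - (alpha/2)^(1/n) and whose lower end is 0. The function name and the sample sizes are illustrative assumptions.

```python
def zero_success_upper_bound(n_sampled: int, confidence: float = 0.95) -> float:
    """Upper end of the exact two-sided interval when 0 of n samples are hits.

    Solves (1 - p)^n = alpha / 2 for p; the lower end of the interval is 0.
    Assumes independent samples from the same population.
    """
    if n_sampled < 1:
        raise ValueError("need at least one sample")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - (alpha / 2.0) ** (1.0 / n_sampled)


if __name__ == "__main__":
    # Hypothetical numbers of cats examined, none of them pink with a blue tail.
    for n in (10, 100, 1000, 10000):
        upper = zero_success_upper_bound(n)
        print(f"n = {n:>5}: point estimate 0%, 95% interval [0%, {100 * upper:.3f}%]")
```

The point estimate stays at 0% throughout, while the upper end of the interval shrinks toward 0 as the sample grows.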

It's way better than arguing, "because I can think of a pink blue-tailed cat, its probability can't be zero" (because that is begging the question).
 

Mauro

Active Member
None of these are statistical data, and the way you derive probabilities from them involves guesswork in every case.
Of course it does; if I had better data I could rely on better methods than 'Guesswork', ideally using 'Formal arguments'. But with the data at hand the last resort is guesswork, which, by the way, is regularly used by everybody, even if it goes unnoticed more easily when one does not explicitly state some numbers, as I do instead. It's better to make (hopefully educated) guesses than to conclude 'I just can't say anything', I think. Low-reliability knowledge? For sure! Can we do anything better? I doubt it.


Rory's example shows two ways to derive a probability: via the Elo ratings, and via frequency counting. Neither involved any guesswork.
I wouldn't use guesswork either if I had comparable data about Gimbal. See above.


Take a bag, put a bunch of red balls in, draw a ball at random and determine the probability that it's white.
• According to your approach, we have two possible outcomes, so the rule of succession applies.
• According to my approach, you need to show there's a white ball in there before you apply it.

(What makes your approach absurd is if you generalise the rule for more possible outcomes, then the more colors you consider, the lower the probability for red becomes.)

Your belief seems to be that anything you can think of can exist. That is both an old Greek philosophical notion and a modern technocratic credo.
And here we are again. So, following the same reasoning, we will never find an alien, because we never found one? C'mon. And did I say anywhere that one can generalize Laplace's rule to the multivariate (more than two possible outcomes) case? I know it's not possible (or at least, that it becomes a mathematical mess), so I didn't; I don't even know if I have enough math skills to tackle it, go figure. Wasn't I clear enough in the introduction?
We need at least two competing hypotheses, we could call them H and K, but luckily in the simplest case one is the negation of the other (ie.: it was a cat, or it was not a cat). Given this is easier I'm going to use the simplest case in what follows
But you do have a point: I should have written something about the multivariate case in the 'Pitfalls' section. It's a pity the edit window is long closed now :(.



I am deeply suspicious of any model of knowledge that has no way to express uncertainty.
Me too! That's why I like Bayes, uncertainty to the very bottom 0% (and to the very top 100% too).

Your presentation of Bayesian inference here and your use of it in the Gimbal case don't. (A full Bayesian analysis can do this.)
I can't really understand where you got, from my presentation, the idea that I said uncertainty does not exist or cannot be expressed or considered or anything of the sort, but as regards the Gimbal case specifically, what makes you feel that the numbers I suggested, i.e. "a probability between 80% and 30%" (Gimbal vs. Atlas thread, post #46), do not express uncertainty? Yes I could have more reliably written 'between 100% and 0%', but that last statement does not express uncertainty, it expresses total ignorance.


You argue that we can't decide if a pink cat with blue tail exists, and therefore the analysis should proceed as if it did exist. But that is a shortcoming of your method. You simply want to proceed; your method requires that this cat exists; therefore you assume it, for no reason other than that you can't continue if it doesn't.
I say, if you can't continue, choose a different method.

The pitfall is a confusion about the use of a probability to express knowledge, and the use of a probability to express reality. Laplace requires reality.

If you don't use Laplace, go sample some cats. You'll be able to put the point estimate for pink blue-tailed cats at 0%, with a 95% confidence interval expressing your uncertainty that includes some probabilities close to 0 and shrinks with sample size.

It's way better than arguing, "because I can think of a pink blue-tailed cat, its probability can't be zero" (because that is begging the question).
And we're yet again (and again) at the same mistake: pink blue-tailed cats cannot exist, because we have never seen one (in a nutshell). See above (and above, and above, and....).

And, just to put this whole pink blue-tailed cats thing in perspective, and maybe help a newly arrived reader with a summary: what Mendel is disputing (illogically, as I see it, but you can judge for yourself) is the applicability of one of the tools I proposed as helpful for Bayesian inference (Laplace's rule of succession) in a specific case. His argument rests on the total impossibility for pink blue-tailed cats (or aliens, gods, many many things actually) to exist, given we never reliably saw one.

And now I wonder, had I chosen a glowing green cat instead, would I have avoided all this? Possibly yes... since I later discovered glowing green cats (genetically engineered to express a green fluorescent protein) do indeed exist. Lesson learnt: choose your cat color carefully!
 

Edward Current

Active Member
Yes I could have more reliably written 'between 100% and 0%', but that last statement does not express uncertainty, it expresses total ignorance.
But I thought total ignorance was expressed by 50%. "Between 100% and 0%" seems more correct — if there is no information, the probability is undefinable.

Sabine Hossenfelder has criticized other theoretical physicists for trying to calculate probabilities that we live in a multiverse, or that the anthropic principle explains fine-tuned parameters. Neither is possible, she says, because we can't inspect an ensemble of universes to learn how many belong to multiverses, or how many have parameters like ours. Otherwise I suppose it would be "50% probability that we live in a multiverse" or "50% probability that another random universe would have the same physical parameters" which doesn't seem useful.

Informationally, I don't see how one can go from ø to a specific number. Did you find a source?
 
