I have had two tests on medical dowsing. Here are the statistics of the results, first from the IIG Test, then the TAM Test, and then both combined. This page is messy but all the statistics are here.
- Start from the top of the image. We start with one million people who are going to be guessing on the IIG test. I chose one million people so that I will always have whole numbers at the end rather than fractions.
- Every time you go down one row, all of the million people make their own guess. The first time you go down a row they are guessing for trial 1. There are six persons in trial 1, one of whom is missing a kidney. So if you guess randomly, which all of these million people are doing, 5 out of 6 guessers will be wrong and 1 out of 6 guessers will be right, just by random chance. So you divide the million by six. Five of those sixths split off as a group to the left, and the remaining sixth, who guessed right, split off into another group to the right. That means that after guessing at the person in trial 1 there are 833,333 people who got it wrong and the lucky 166,667 people who got it right. It doesn't matter which of the five wrong subjects the 833,333 each picked, because wrong is wrong.
- On to the next segment of the test. Once the people have all chosen a subject in trial 1, they are now going to be "looking into the person" to try to guess on which side of their chosen subject a kidney is missing. Is it the left kidney or the right kidney that is missing? Well, the 833,333 who already got the wrong person are not going to get any points whether they choose left or right, so all 833,333 get the same score here: wrong. Meanwhile the lucky 166,667 now have a 1 in 2 chance of getting the right side, so they split into two groups of 83,333 each: one group who got the right person AND the right side, and the other group who got the right person but the wrong side.
- Now comes trial 2, and everybody picks one subject out of six subjects again. The odds are once again 1 in 6 to get the right person. The first trial (person and side chosen) produced three separate groups out of the original one million people, and within each of these groups everybody has the same score. Now each of these three groups will be split into two: five sixths who get it wrong, and one sixth who get it right. We end up with a total of six groups now. In each group all the people in that group have been guessing exactly the same and in the same sequence.
- Notice that the way I drew it is that whenever a group makes a decision (either a 1 in 6 decision or a 1 in 2 decision), the group that splits to the LEFT got it WRONG, and the group that splits to the RIGHT got it RIGHT.
- So, all the people keep guessing through the three trials where they pick a person and then the side. We end up with 27 groups at the end! Each group has by random chance been making the very same guesses within their group.
- But it does not matter in what sequence you got a score! If you got one subject right and that's all you got, it doesn't matter whether you got that right in the first trial, or in the second trial, or in the third trial. And of course the groups cover every possible sequence of right and wrong guesses. So we add together groups that got the same score, no matter in what sequence they got the score.
- Only one group of people were able to get everything wrong, and they are more than half of the 1,000,000, at 578,704. More than half of the people guessing, or more exactly, nearly 58%, will get nothing right at all! Next we have the lucky guessers who managed to get one subject right. Doesn't matter in which trial they had a lucky hit. There are three ways to get one subject right: either you got it in trial 1, or trial 2, or trial 3. So we group together these people, adding 57,870 who got trial 1 right, 57,870 who got trial 2 right, 57,870 who got trial 3 right, for a total of 173,610 out of the million who got one person right but not the side, and that is about 17% of the people.
- Adding together all the different total scores: none right, one subject right, one subject and one side, two subjects no sides, two subjects one side, two subjects two sides, three subjects no sides, three subjects one side, three subjects two sides, three subjects three sides (which is all correct). These are all the different types of total scores that a guesser can end up with, and you can end up with that score in a number of ways because of the sequence that leads you there.
- Then simply divide the total number of people who got a particular score, by one million, and multiply by a hundred, and you get the percentage of guessers who would get that particular score by random guessing alone. For instance 578,704 divided by 1,000,000 times 100 is 58%.
- These calculations were done entirely without bias. I do not see any other or better way of calculating it. I don't see how anybody could dispute these figures. And once again, I find it interesting that when a skeptic attempts to calculate the stats they come up with numbers such as 25%, and when a woo or supporter calculates the stats they come up with 3.8%. I produced what I believe to be the true numbers. I applied no bias; I just wanted to know what the actual numbers were, and here they are.
- Note that on the picture, I have highlighted in blue the sequence that I personally arrived at at the IIG test, but note that it itself does not give the probability of achieving my score. You first have to add all the different ways of getting two people one side right! This table shows all the ten different total IIG test scores that a random guesser can arrive at. The first column says how many people out of one million guessers will get the score of that row. For that row, "P" says how many people they guessed right, "S" how many sides of the people they got were right. The best score is to get everything right with 3P and 3S, the worst score is to get everything wrong with 0P and 0S. The fourth column shows the percentage of people who would guess the total test score of that row.
Possible total IIG test scores from random guessing
So we see that most people will get everything wrong, at 58% of all the people. Note that it is easier to get 3P and 2S than it is to get 3P and 0S due to the magic of statistics!
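The tree described above can also be walked by a few lines of code instead of by hand. This is only a sketch under the assumptions stated in the text: three independent trials, a 1 in 6 chance of the right person, and a 1 in 2 chance of the right side once the person is right. The names `TRIAL_OUTCOMES` and `iig_distribution` are made up for this illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# One IIG trial for a random guesser (assumed model from the text):
#   wrong person                 -> 5/6
#   right person, wrong side     -> 1/6 * 1/2
#   right person, right side     -> 1/6 * 1/2
TRIAL_OUTCOMES = [
    ((0, 0), Fraction(5, 6)),
    ((1, 0), Fraction(1, 12)),
    ((1, 1), Fraction(1, 12)),
]


def iig_distribution(trials=3):
    """Map (persons right, sides right) -> exact probability."""
    dist = defaultdict(Fraction)
    # Every path through the tree is one of the 3**trials groups.
    for path in product(TRIAL_OUTCOMES, repeat=trials):
        persons = sum(score[0] for score, _ in path)
        sides = sum(score[1] for score, _ in path)
        prob = Fraction(1)
        for _, p in path:
            prob *= p
        # Groups with the same total score are added together.
        dist[(persons, sides)] += prob
    return dict(dist)


if __name__ == "__main__":
    for (p, s), prob in sorted(iig_distribution().items()):
        per_million = float(prob) * 1_000_000
        print(f"{p}P {s}S: {per_million:>9.0f} per million ({float(prob):.2%})")
```

Because the sketch enumerates all 27 groups, it counts every ordering of the per-trial outcomes automatically. In this sketch 2P 1S is reached through six orderings (one wrong person, one right person with the wrong side, one right person with the right side), so that row in particular is worth comparing against a hand count of the groups in the picture.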
So here we have it! My IIG test results were not 25%, or 3.8%, or anything else. I achieved a score of 2P and 1S and the probability of guessing to that score is 2.89%. So out of 100 people guessing randomly, 3 would be expected to have the same results as I did, or 29 out of 1,000 people, or 289 out of 10,000 people, or 28,900 out of 1,000,000 people.
Test size matters when we want to find the real accuracy of a method
What if I could always get two people and one side in such a test? What if I were consistently as good as 2.89%? Meaning, what if I could do it again? Yeah right, it will only be 1 in 35 again. No big deal. Or is it? To do the same test with the same probabilities twice means that you get to combine the two probabilities, because it extends the total test to six trials, three from each. In statistics, if you multiply the probability of a score with itself, you get the probability of achieving the same score twice in two consecutive tests. My total IIG score of 1 in 34.6 achieved twice on two consecutive tests is 1 in (34.6*34.6) which is 1 in 1197 or 0.08%. Achieved in three tests is 1 in 41,422. Achieved in four tests is 1 in 1,433,192.
Am I just skewing the data? Well let's say that a person got one person correct and no sides. The odds of achieving that was 17.36%, which is 1 in 5.8 or 1 in 6. They might want to try a test again to add the results many times over. Well if they manage to get the same results again, done twice makes 1 in 33. Done three times makes 1 in 191. Done four times makes 1 in 1,101. Not impressive. Does not even pass the IIG test. How about someone who got two people and two sides correctly in the first test, doing better than I did with odds at 1.74% which is 1 in 57? If they repeat their score in two tests, it is 1 in 3303. Three times is 1 in 189,824. Four times is 1 in 10,909,454. Pretty impressive, and better than mine.
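The "same score again" arithmetic above is just a probability raised to a power. Here is a small sketch of it, assuming the tests are independent and taking the per-test probabilities exactly as quoted on this page. The helper name `one_in` is made up for this illustration.

```python
def one_in(probability, repeats):
    """Odds, as the N in '1 in N', of hitting an outcome with the given
    per-test probability `repeats` times in a row (independent tests assumed)."""
    return (1 / probability) ** repeats


# Per-test probabilities as quoted in the text (inputs, not derived here).
quoted = [("2P 1S", 0.0289), ("1P 0S", 0.1736), ("2P 2S", 0.0174)]

for label, p in quoted:
    odds = ", ".join(f"1 in {one_in(p, n):,.0f}" for n in range(1, 5))
    print(f"{label}: {odds}")
```

Changing the numbers in `quoted` shows how quickly the odds grow for a rare score and how slowly they grow for a common one.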
And any person who was guessing and got a particular score would be more than welcome to try a test again, in which they are most likely to do poorly, since more than half will get a total test score of zero. So what if I was consistently as good as 2.89%? It would be something! And it is not cheating, or skewing with the data. It is adding more data sets to the total, and in the world of science, more data sets is ALWAYS better. There is not a single scientist in the world who will say that running more trials is cheating or is making a random fluke seem better than it was. A skeptic might say that, and skeptics do say that. Skeptics think that a paranormal claimant must stop testing once they reach a score that was not 1 in 1,000,000 the first time around. But in science, and in particular in the science of chemistry which is one of my two areas in science that I study, we often work with small samples and small effects but illuminate those and detect them by running more careful testing. And more careful testing always requires more data sets. That is the problem with working with skeptics, they discourage the scientific approach to paranormal claims.
I have valid reason backed up by statistics to run more tests. If I had another three-trial test, it would immediately show whether my 2.89% is a consistent achievement or if it was only the case of random statistical outcome. The more trials that are done, the clearer the true answer becomes. If I have no ability, with further tests the total score will go even further down, not up. Skeptics often discourage claimants from further testing, thinking that the claimant will somehow by doing so manage to twist the results in their favor. That is simply not possible when additional trials are added together. If there was any luck in the first test, it is not likely to happen again. The results of my IIG test seem almost better than they should be, at least they are in the upper region, and only by having another test can we find out if that is a consistent performance, or random chance. And let's not forget that I knew beforehand which answers were correct and which were not based on my confidence in the medical images.
Here is a table prepared similarly for the TAM test with the six possible total test scores listed. Again, the first column says how many people out of one million guessers are expected to arrive at the score of that row. 5C and 0W means five correct and zero wrong. The table also gives the percentage of guessers who arrive at that total test score.
Possible total TAM test scores from random guessing
The TAM test involved five persons. There are three options for what answer one chooses for each person. Either the person is missing their left kidney, is missing their right kidney, or has both kidneys. It is a 1 in 3 chance to guess the one correct option of the three, and a 2 in 3 chance to guess the wrong option of the three.
Of course for this test we had been told that only one person is missing a kidney, so the odds for this particular test are not quite that, but I do not know how to account for that in statistics. Meanwhile, the way that I did the test, I personally treated each person as a test of its own. That is how I ended up with not seeing a kidney in two of the people, rather than guessing at one of my two and saying that the other has both kidneys. I went with what I saw.
I got nine out of ten kidneys correct, or four out of five people correct. The odds of guessing all five people correctly is 1/3*1/3*1/3*1/3*1/3=1/243=0.4%. The odds of guessing four out of five people correctly in one particular sequence is 1/3*1/3*1/3*1/3*2/3=2/243=0.8%, and there are five different sequences that arrive at such a result (the one wrong answer can fall on any of the five people), so the total is 5*2/243=10/243=1/24.3=4.1%. I got four of five people correct so I got 4.1%.
Note that it is 13.2% chance to get all wrong, because there is only one way of getting all wrong. Meanwhile there are five different ways of getting one right, so to get one right is 32.9% chance. Interesting the way statistics goes, it is more likely to guess one or two correct than to get them all wrong! So if somebody thinks that getting one right out of five is *better* than to get none right? Nope. Not what statistics tells us. Isn't statistics scary?
What would we expect out of a random guesser in this test? It is far more likely to get one or two correct out of five (at 32.9% each), or, we could combine those and say that it is a whopping 65.8% chance that a person gets *one or two* correct out of five. Of course the least likely is for a guesser to get all five correct. But to get four out of five correct at 4.1% chance is almost in the upper range, and most guessers would not fall into this luckier region of the statistics, yet, 4 out of 100 people guessing would. Or, 41,152 people out of a million people guessing would. Does that mean that 41,152 people out of every one million people out there are psychic? No. And that is why getting 4 out of 5 right and at 4.1% can not indicate to being psychic.
Nor is five out of five correct at 0.4% psychic either, because then 4,115 out of every million people guessing would be psychic because just by random guessing they would also get that result, or out of the 6,775,235,741 people in the world, 27,100,942 would be psychic, but they aren't. They're just lucky guessers, and maybe I was too?
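The TAM figures above follow the usual "number of ways times probability of one way" pattern. This is a minimal sketch of that pattern, assuming five independent guesses with a 1 in 3 chance each, as described in the text (and ignoring, as the text does, the hint that only one person was missing a kidney). The function name `tam_distribution` is made up for this illustration.

```python
from fractions import Fraction
from math import comb


def tam_distribution(people=5, p_correct=Fraction(1, 3)):
    """Map number of correct answers -> exact probability.

    comb(people, k) counts the different sequences that give k correct
    answers; the rest is the probability of any one such sequence.
    """
    return {
        k: comb(people, k) * p_correct**k * (1 - p_correct) ** (people - k)
        for k in range(people + 1)
    }


for k, prob in tam_distribution().items():
    per_million = float(prob) * 1_000_000
    print(f"{k}C {5 - k}W: {per_million:>9.0f} per million ({float(prob):.1%})")
```

Passing a different `people` or `p_correct` gives the same table for a test of another size.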
Test size and consistent results matter
If I had another TAM-type test with five people with the same statistics again, I would be all over again most likely to get *one or two* correct out of five at the combined percentage of 65.8%, and all the same numbers for the likelihood of which category luck would put me in would apply again. If we combine the percentages for getting zero, one, two or three correct out of five (ie. any score lower than what I had at the first TAM test) then the combined percent is 95.5%. It is 95.5% likely that at a TAM-type test with five people I would get a score lower than the four I had or the five which I didn't have.
If it's really hard to guess oneself to the upper 4.5% of four or five correct once, it is much harder to guess to it a second time again! 4.5% equals 1 in 22. To achieve 4.5% twice is 0.045*0.045=0.002025, or about 1 in 494, a 0.2% chance to get 4 or 5 out of 5 correct two tests in a row. Which is still 13,719,852 people in the world.
Will a test ever be good enough?
James Randi of the JREF offers a psychic challenge in which if a psychic claimant passes odds of 1 in 1,000,000 they would be awarded a USD $1,000,000 prize and "pass" as psychic. I personally do not think it's good enough. If I held a paranormal challenge whose objective is to determine whether someone is psychic, I might make the odds higher than that. Possibly even as hard as 1 in 6,775,235,741, why not. All it takes is seven or eight consecutive tests and 35 or 40 test subjects to produce odds that beat any other guesser in the world. How's that?
Does a paranormal claim have to be 100% correct and at all times?
Let's say a psychic gets correct at odds of 50% and claims that they are psychic anyway. Let's make them have to achieve total test odds of 1 in 1,000,000 by consecutive tests. How many tests total would they have to do with consistent 50% odds to achieve 1 in 1,000,000 odds? 50%=1/2, so we have to find the power of 2 that reaches 1,000,000, ie. the equation goes like this: 2^x=1,000,000, find x where x is the number of consecutive tests. If they have ten tests with the same result in each, they only arrive at odds of 1 in 1024. If they have eighteen consecutive tests with the meager 50% result always in each test, they have achieved total test odds of 1 in 262,144. Nineteen tests brings them to 1 in 524,288. Twenty tests is what they would have to have for 1 in 1,048,576 odds, beating Randi's 1 in 1,000,000 odds.
Let's say the claim is that they can predict which side a tossed coin falls, but make this a coin that *is* perfectly 50/50 in terms of which side it falls on. So they get it right the first time and say they are psychic. If they manage to get it right twenty times in a row they have actually beat 1,000,000 odds. But does that make them psychic? I say no because there are 6,775,235,741 people in the world and 6,775 of them would get the same result by pure luck alone. I say a psychic has to beat not one in a million odds, but at least do better than anybody else would in the world who is guessing.
So do we have to worry that a psychic is going to get lucky by having more tests and end up with a passing total score? Skeptics are really scared of that and they won't let psychics have more tests once they fail to meet 100% on the first test they have. Let's look at that for a moment.
If a psychic beats odds that are 4.1% even though they did not achieve 100% correct on a test by doing so (this is my 4 of 5 score for the TAM test), how easy is it to accumulate a higher score in consecutive tests? If they have one test, they are 4.5% likely to make a score as good or higher, or 95.5% likely to get a score lower than that. If they add a second test into the equation, they are 99.8% likely to *not* get the 4.5% score TWICE in a row. For three consecutive tests they are 99.99% likely to *not* get the 4.5% score THREE times in a row. See? It gets harder and harder, not easier. Yet, even an ability that were not 100% always accurate but a little bit less than that would have a great chance of showing itself, if it exists, if it is given more tests total. And a claim that doesn't work would simply fall flat.
More tests is good. More tests can only illuminate the true result. More tests added to the total score actually increases the truthfulness of the result whether indicating to an ability of some extent, or no ability. More tests improves on the resolution of the true answer, it makes the picture clearer. More tests makes it harder, not easier, to seem good if random chance is all there is.
My total test score is such that if people were guessing randomly, 0 of 100 would achieve my total score, 1 of 1,000, 11 of 10,000, 118 of 100,000, or 1,185 of 1,000,000. Which might seem pretty neat, except that it still makes for 8,027,977 people in the world that I have to beat.
If I were to achieve this test total of 0.1% TWICE, it would be 1 in 712,257, which is 0.0001%. Which would be 0 out of 100 random guessing people. 0 out of 1,000. 0 out of 10,000. 0 out of 100,000. And 1 out of 1,000,000. So, if I were to do another IIG test-format and another TAM test-format all with the same total score not lower, I would have achieved Randi's odds of 1 in 1,000,000, unless the JREF is of those skeptics who think that 1 in 1,000,000 is only such if achieved through 100% accuracy throughout and that a different path would somehow not be the same even though it is and is still 1 in 1,000,000 hard to achieve by guessing. But, even after a double additional set of tests and all with the same score, I would still have to beat 9,512 people in the world. A psychic's work is never done. More tests than the double would be needed at a minimum. And, with consistent results, which they so far have seemed to be. So I continue and go on, don't you think?
Ok here it is. I have to combine all test sequences together. To get everything right in IIG and TAM test is 1/6*1/2*1/6*1/2*1/6*1/2*1/3*1/3*1/3*1/3*1/3=1/419,904=0.00024% and I am sure of that. My score was 5/6*1/2*1/6*1/2*1/6*1/2*1/3*1/3*2/3*1/3*1/3=10/419,904=1/41,990=0.0024% but that does not yet account for the multiple number of ways that the same score can be achieved so that needs to be added. 0.95% to have everything wrong. I need to think about this more. I hate statistics.
The odds of 0.0024% to achieve my score overall needs to be multiplied by the number of different ways that a person might arrive at that total score, so the number won't be as good as that in the end. The TAM side with four 1/3 and one 2/3 contributes with five different ways in which the TAM-side of the score could be achieved. This "five" is added to every different way that the IIG-test side can be achieved.
With the 5/6 1/2 1/6 1/2 1/6 1/2 IIG score it can be arranged in three different ways, so the total is fifteen. The total test score would then be 0.036%, oops, much better than the 0.1% I computed earlier, and here I was thinking I would get a lower score overall by diving into the statistics again. Never mind. There'd still be 2,420,280 people in the world I'd have to beat. But, hey, if this my second calculation is right, it means that I've achieved total odds of ... do we dare to say it? I don't dare to say it. It says that I've done as good as 0 in 100. 0 in 1,000. 0 in 10,000. 2 in 100,000. 24 in 1,000,000. I'm almost there! I can still beat Randi's million and the whole world population of guessers! I just need more tests! And don't forget, having more tests makes it harder, not easier, to do good. I think? Statistics is so hard. I know these numbers must be wrong. Numbers are scary.
Perhaps the best way to look at the statistics
Possible total IIG test scores from random guessing
Somebody who gets everything wrong should not be given a "score" of 57.87%, they should get no score at all. Even though the way I have indicated it, the smaller the percentage the better one has done. So here is a better way to think about it:
Percentage of probability beaten by a score IIG
The first column still shows how many people out of a million who guess randomly would arrive at the score on the row, and 0P 0S for instance, means that these guessers got no person right no side right. The percentage this time says "how many people out of the million did they beat". So the people who got all wrong beat nobody, they did better than no other. Whereas the people who guessed one person but no side, did better than all the 578,704 of 1,000,000, so they did better than 57.87% of all the people.
Even though the earlier calculations show what the odds are of arriving at a particular result, it didn't really say how good you were compared to all the others. And with this new form of calculation, it says that my score of 2P 1S meant that I did better than 94.33% of all the guessers, which puts me at the upper 5.67% of the guessers who did as good or better than me. This number is more indicative of my results, than the 2.89% whose format would still allow for somebody who got all wrong to get points.
As for TAM, here is the previous table again that shows what the probability is of arriving at a particular result, yet it doesn't say how good you did because somebody who got all wrong, even though it's 13.2% likely to guess all wrong, should not get any points at all.
Possible total TAM test scores from random guessing
Here is the table again:
Percentage of probability beaten by a score TAM
So this time, the percentage tells you how many people you beat with a particular score. My score of 4C 1W beat 95.47% of all the other guessers, putting me in the upper 4.53%. This scoring system doesn't say how likely you are to get a particular score, but it says how well you did by getting a particular score.
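The "percentage beaten" column is a running total of all the rows with a worse score. A small sketch of that idea for the TAM test follows, assuming the same five independent 1 in 3 guesses as before. The function name `beaten_table` is made up for this illustration.

```python
from fractions import Fraction
from math import comb


def beaten_table(ordered_probs):
    """Given probabilities ordered from the worst score to the best,
    return for each score the share of guessers with a strictly worse score."""
    beaten, running = [], Fraction(0)
    for prob in ordered_probs:
        beaten.append(running)  # everyone counted so far did worse
        running += prob
    return beaten


# TAM scores 0C..5C under the assumed model (five guesses, 1 in 3 each).
tam = [comb(5, k) * Fraction(1, 3) ** k * Fraction(2, 3) ** (5 - k) for k in range(6)]

for k, share in enumerate(beaten_table(tam)):
    print(f"{k}C {5 - k}W beats {float(share):.2%} of guessers")
```

The function only needs the rows in order from worst to best, so the ten IIG rows can be fed through it the same way once an order has been chosen for them.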
Score overall
There is a total of 60 possible combinations. Here is the one I got:
So according to this, 1185 out of 1,000,000 people would be expected to get the same total results as me, and that it is 0.1% likely for a guesser to get my score. That is the same result of 0.1% that I got earlier. Statistical analysis seems to indicate that my results are better than they should be. They were not only in the upper ~5% the first time, they were again the second time, and then overall for a 0.1%. A third test would give poor results if all I am doing is guessing. But you can't be lucky three times, can you? I couldn't be in the upper 5% a third time just by guessing, the odds of that would be... 0.01% to be in 5% three times in a row.
How many people did I beat overall? To answer that question, I computed all 60 combinations and will next list them in order and add together the groups that I did better than.
Here are the 60 different possible outcomes, each given based on IIG score with the six possible outcomes due to the TAM score following that particular IIG score.
Those who got 3P 3S on the IIG test:
Those who got 3P 2S on the IIG test:
Those who got 3P 1S on the IIG test:
Those who got 3P 0S on the IIG test:
Those who got 2P 2S on the IIG test:
Those who got 2P 1S on the IIG test:
Those who got 2P 0S on the IIG test:
Those who got 1P 1S on the IIG test:
Those who got 1P 0S on the IIG test:
Those who got 0P 0S on the IIG test:
All duplicates, ie. there being more than one way often to arrive at a particular result depending on the sequence of answers, is already accounted for and all these ten tables add up to 1,000,000 people and 100%, everybody is there.
Ok. So if we give everybody who got a person right on the IIG test 6 points, a side right on the IIG test 2 points, and correct answer on TAM test 3 points, we can order these based on who got the most points. (I know this is a lot of work but once I've done this I will know for sure how I did, and nobody else wants to tell me, I've asked several people to calculate the statistics for me.)
39 points, 37 points, 36 points, 35 points, 34 points, 33 points, 32 points, 31 points, 30 points, 29 points, 28 points, 27 points, 26 points, 25 points, 24 points, 23 points, 22 points, 21 points, 20 points, 19 points, 18 points, 17 points, 16 points, 15 points, 14 points, 12 points, 11 points, 9 points, 8 points, 6 points, 3 points, 0 points
Alright. I got 2P 1S 4C 1W so I got 26 points total. Adding together the total number of people below 26 points says I did better than 992,230 of the other people, or I did better than 99.223% which puts me in the upper 0.777% overall. Still, 1744 people out of 1,000,000 people did as good as me (point-wise) which is 0.1744% who did as good as I did. And, 6004 did better than me which is 0.6004%.
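The point ranking above can be redone by machine, which saves writing out all 60 combinations. This is a sketch only, assuming the IIG and TAM tests are independent, the per-trial chances described earlier (1 in 6 for the person, 1 in 2 for the side, 1 in 3 for each TAM answer), and the point values from the text (6 per person, 2 per side, 3 per TAM answer). All function names are made up for this illustration, and its totals depend on those assumptions, so they may not match a hand count exactly.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Assumed per-trial IIG model: (persons, sides) -> probability.
IIG_TRIAL = {(0, 0): Fraction(5, 6), (1, 0): Fraction(1, 12), (1, 1): Fraction(1, 12)}


def iig_scores():
    """Map (persons right, sides right) -> probability over three trials."""
    dist = {}
    for path in product(IIG_TRIAL.items(), repeat=3):
        key = (sum(o[0] for o, _ in path), sum(o[1] for o, _ in path))
        prob = Fraction(1)
        for _, p in path:
            prob *= p
        dist[key] = dist.get(key, Fraction(0)) + prob
    return dist


def tam_scores():
    """Map number of correct TAM answers -> probability over five people."""
    return {k: comb(5, k) * Fraction(1, 3) ** k * Fraction(2, 3) ** (5 - k) for k in range(6)}


def points(persons, sides, correct):
    """Point values from the text: 6 per person, 2 per side, 3 per TAM answer."""
    return 6 * persons + 2 * sides + 3 * correct


def rank(persons, sides, correct):
    """Return (share with fewer points, share with equal points, share with more)."""
    mine = points(persons, sides, correct)
    below = equal = above = Fraction(0)
    for ((p, s), p_iig), (c, p_tam) in product(iig_scores().items(), tam_scores().items()):
        prob = p_iig * p_tam  # independence of the two tests is assumed
        pts = points(p, s, c)
        if pts < mine:
            below += prob
        elif pts == mine:
            equal += prob
        else:
            above += prob
    return below, equal, above


below, equal, above = rank(2, 1, 4)
print(f"worse: {float(below):.4%}  same points: {float(equal):.4%}  better: {float(above):.4%}")
```

Note that "same points" pools every combination that reaches 26 points, not only 2P 1S 4C 1W, which is the same choice made in the ranking above.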
I like having this final answer, that I did better than 99.223% of people would, and that 0.1744% of people would do as well as I did, and that 0.6004% would do better than I did. It was almost worth all the calculations.
So how do we interpret these results? 99% of everybody guessing would be expected to do worse than that. My score and higher accounts for 7748 people which is 0.7748%, putting me in the upper 0.7748% of everybody. And the probability of achieving the particular score I had is 0.118% likely, or 0.056% if you arrive at the same points from a different mechanism.
Is the claim falsified? It can't be, not with individual test scores and an overall score like this. Is the claim proven and verified? Not to my standards and not to typical standards set. If I have more tests, they could give me an even higher total score, or a lower one, and remember, to have more tests is not "cheating"; more tests enhance accuracy and resolution. If I do have an ability that is beyond random chance, then more tests would narrow down to a smaller percentage value. Yet if I do not have an ability, then the percentage either stays here (constant at a value that is not good enough to pass as a paranormal ability, either because it is not an ability, or because it is an ability but one that does not perform well enough to really matter much), or random chance starts putting me into those more likely categories with poorer performance to reduce the total score significantly.
The answer is not clear, but what is clear is that more tests are needed. Note that if I were to do two more tests of the same size and with the same results as previously (0.118%), the total would then be 0.118% * 0.118% = 0.00014%, because the probabilities have to be multiplied as fractions (0.00118 * 0.00118), not as percentages. If repeated once more on top of that, the total would be 0.00000016%. I'm currently at about 1 in 850. One repeat of both tests with the same results puts me at about 1 in 718,000, and a second repeat at about 1 in 609,000,000. The odds, whether achieved quickly with fewer tests or slowly with a greater number of tests, are still the same odds. It is just as difficult to make 1 in 1,000,000 by doing several tests as it is to make 1 in 1,000,000 by doing one test.
Note that if we omit the parts of the IIG test that I knew beforehand (not ad hoc) would be wrong because the claim had not performed then, my total odds would be 0.0171%. Calculated as 1/6*1/2*1/6*1/3*1/3*1/3*1/3=1/5832=0.0171%, which is 0 of 100, 0 of 1,000, 2 of 10,000 or 1 in 5,000. Or 17 in 100,000, or 171 in 1,000,000.
My overall test score if evaluating the claim without requirement that it kicks in every time it is asked to, but evaluating its accuracy when it does form perceptions that are compelling (no ad hoc bias): 0.013% odds of achieving my results by random guessing. Upper percentage and how many I did better than is not available (and involves a tedious calculation). Note that these numbers might be calculated by the wrong method. If anyone spots an error in these calculations please do let me know, but I don't want to hear anything about "you failed so it's zero" or "25%" because that simply isn't so.
Useful about statistics Laws of Chance Tables