Wikipedia:Reference desk/Archives/Mathematics/2010 June 21
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.
June 21
Books on Randomness
Could anyone recommend some good books on randomness? More specifically, how randomness affects everyday lives and is often interpreted by people as having patterns. It needs to be at a level that a non-mathematician can understand. 'A Random Walk Down Wall Street' is a good example of the kind of book I am looking for, though it does not need to be financially based. Also, I would be interested if there are any books that really delve into how randomness and perceived patterns can be utilized/exploited, for example casinos using the house edge against gamblers who think they have patterns and systems that beat the odds.
I searched the archive to see if there were any previous questions along this line but could not find any, although searching for randomness is a tough search. 63.87.170.174 (talk) 16:26, 21 June 2010 (UTC)
- You might want to try The Black Swan by Nassim Nicholas Taleb. -mattbuck (Talk) 16:36, 21 June 2010 (UTC)
- Based on the reviews at Amazon, it looks like The Black Swan is significantly worse than Taleb's earlier book, Fooled by Randomness, which I think is also more aligned to what the OP is looking for. I haven't read either yet. -- Meni Rosenfeld (talk) 17:35, 21 June 2010 (UTC)
- The Drunkard's Walk: How Randomness Rules Our Lives by Leonard Mlodinow is a good non-mathematician's introduction, covering a lot of the areas you mention such as casinos. --OpenToppedBus - Talk to the driver 13:59, 22 June 2010 (UTC)
Grand Piano lid prop angles
Most modern grand pianos' lid props appear to form a 90° angle where they meet the underside of the piano's lid. It seems logical to me that the lid prop is less likely to slip at that angle because there would be a direct load transfer of the weight of the piano's lid to the support stick. That is, grand piano manufacturers intentionally use a 90° angle for safety reasons. Could someone show me the mathematics, perhaps using vector analysis, to prove my hypothesis?
The reader may want to visit http://en.wikipedia.org/wiki/Grand_Piano to see a couple of pianos that do not appear to use the 90° angle. Note the Louis Bas grand piano of 1781 and Walter and Sohn piano of 1805. Don don (talk) 16:52, 21 June 2010 (UTC)
- Thinking about it I'm surprised any of the props meet the lid at a right angle. That means the prop would have to be supported on all sides to stop it from slipping after being knocked. If it is at an angle it could just fit into an angle, sounds easier to me. I don't think one need worry much about the strength of the prop and the force will always go straight down it since the ends aren't held firmly in place. Dmcq (talk) 19:57, 22 June 2010 (UTC)
Question continues at Wikipedia:Reference desk/Science#Piano Lid Prop angle. 94.72.242.84 (talk) 01:57, 6 July 2010 (UTC)
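As a rough illustration of the vector decomposition asked about above, here is a minimal sketch. It assumes an idealized model that is not stated in the thread: the prop is a light strut whose ends are not clamped, so it carries force only along its own axis (as Dmcq suggests), and the lid contact is held by friction alone with a hypothetical coefficient `mu`. The function names and the sample value of `mu` are illustrative only.

```python
import math

def tangential_to_normal_ratio(angle_from_normal_deg):
    """Ratio of sliding (tangential) force to pressing (normal) force at the lid.

    A force F along the prop, tilted by theta from the lid's normal, splits into
    F*cos(theta) pressing into the lid and F*sin(theta) trying to slide along it,
    so the ratio is tan(theta).
    """
    return math.tan(math.radians(angle_from_normal_deg))

def holds_by_friction(angle_from_normal_deg, mu):
    """True if friction alone (coefficient mu, an assumed value) can resist sliding."""
    return tangential_to_normal_ratio(angle_from_normal_deg) <= mu

# A prop meeting the lid at 90 degrees is 0 degrees away from the lid's normal.
for tilt in (0, 10, 30):
    print(tilt, round(tangential_to_normal_ratio(tilt), 3), holds_by_friction(tilt, mu=0.3))
```

Under these assumptions the sliding component vanishes only when the prop is perpendicular to the lid, which is consistent with the original hypothesis. A real piano usually seats the prop in a cup or notch, so this sketch ignores that restraint.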
Which Statistics Test?
I'm trying to determine the effect taking a particular class has on student retention rates to the next grade (sophomore year). I have the number of freshman students who started and the number who went on to sophomore year (about 66% of 1600). I also have the number of those same freshmen who took this class, and how many of this subset went on (about 90% of 300). Obviously the class made a difference, but what test do I use to prove it (with significance)? It's been over five years since my last statistics class. I'm not really dealing with samples; these are the official numbers. Do I still use the z-score, right-tailed hypothesis test... even though the z-score is 9.19 before fpc, 10.18 after? Do I use fpc even though it's not really a sample? I've got to run similar tests for ten other years. 160.10.98.34 (talk) 20:47, 21 June 2010 (UTC)
- Pearson's chi-square test seems appropriate here. But it looks like you have made an observational study, so keep in mind that correlation does not imply causation. -- Meni Rosenfeld (talk) 07:49, 22 June 2010 (UTC)
- McNemar's test. HTH, Robinh (talk) 07:51, 22 June 2010 (UTC)
- I can't see any pairing here, so I can't see why McNemar's test would be appropriate rather than Pearson's chi-square test. As for finite population correction (fpc), you use that if you're trying to estimate uncertainty in your estimate of a population parameter when your sample is a substantial fraction of the size of the population. Here, your sample is the entire population, so the fpc should be 0 as you know the proportion in the population exactly, i.e. its standard error is 0. However, I think you can still interpret statistical tests of a null hypothesis, such as Pearson's chi-square test, without assuming variability is due to sampling from a larger population. Another matter to consider as you're repeating this for ten other years is multiple comparisons. And Meni is right to remind you about not reading causation into this - you say "obviously, the class made a difference", but that's not obvious at all from what you've told us. Were the students randomly allocated to take this class? If they weren't, you can't assume those who took the class are comparable to those who didn't, e.g. maybe the more motivated students were more likely to choose this class. See self-selection. Qwfp (talk) 08:26, 22 June 2010 (UTC)
- You meant "but that's not obvious at all". -- Meni Rosenfeld (talk) 08:34, 22 June 2010 (UTC)
- Fixed, thanks! Qwfp (talk) 08:41, 22 June 2010 (UTC)
- Thanks to all. Can I still do a chi-square test even though the 1600 number includes the 300 students who took the class? Most of the examples I've come across deal with two independent, non-overlapping groups. Just so you know, I appreciate it's correlation, not causation. We're planning on testing whether the major indicators of academic success are different for these groups too. —Preceding unsigned comment added by 160.10.98.106 (talk) 13:06, 22 June 2010 (UTC)
- The first step before you conduct the test is to construct a 2×2 contingency table of non-overlapping groups, i.e. "took class", "didn't take class" vs. "went on to next year", "didn't go on to next year". That's just a couple of straightforward subtractions. Qwfp (talk) 13:45, 22 June 2010 (UTC)
- The probability P1, that a student having taken the class went on, has a beta distribution with α=1+0.90·300=271, β=1+0.10·300=31, α+β=302, mean μ = α/(α+β) = 0.897351, standard deviation σ = √(αβ/((α+β)²(α+β+1))) = 0.0174356. So P1 ≈ 0.897351±0.0174356. The probability that a student not having taken the class went on is P2 ≈ 0.646837±0.0131106. The difference P1 − P2 = 0.250514±0.0218149. Zero is μ−11.4836σ. This difference is highly significantly different from zero. Bo Jacoby (talk) 14:47, 23 June 2010 (UTC).
- Huh? How do beta distributions come into this? This is just a straightforward comparison of two binomial probabilities, i.e. Pearson's chi-squared test for a 2×2 contingency table. There's no need to give the probabilities themselves a distribution. Granted in reality the probability of going on to the next grade may vary between individuals within each group (class takers and non-takers) depending on all sorts of other factors, but there's not the information here to say anything about that; all you know are the overall probabilities for each group. Qwfp (talk) 23:55, 23 June 2010 (UTC)
- If you know a probability, P, and a sample size, n, then the number of successes in the sample, i, has a binomial distribution. But our situation is that we know n and i, while P is unknown. The distribution of the continuous parameter P is not binomial, but beta, with parameters α=i+1 and β=n−i+1. Bo Jacoby (talk) 09:15, 24 June 2010 (UTC).
- Oh, I see now, sorry; you're taking a Bayesian approach, while Meni and I (and, I think, the original poster) were being frequentist. Either is fine and going to come to essentially the same conclusion here (though I don't usually associate Bayesian inference with phrases like "highly significantly different"). I'm still more concerned about (over)interpretation. 21:38, 24 June 2010 (UTC)
- The difference between the frequentist and the Bayesian approach is obvious when the sample is very small. If the sample size n = 0 (and then of course i = 0 too), then the beta distribution gives P ≈ 0.5±0.3, which makes sense, while the frequentist approach gives P = 0/0, which does not make sense. Bo Jacoby (talk) 14:41, 25 June 2010 (UTC).
- True, but with no data the Bayesian posterior distribution is the same as the prior distribution, so it depends on which prior you choose. You're implicitly assuming a uniform prior on the probability, but there are other possible choices even if we stay with 'uninformative' priors. See prior probability#Uninformative priors. Qwfp (talk) 21:15, 25 June 2010 (UTC)
- The continuous uniform distribution f(P)dP = dP for 0≤P≤1 is the limiting case for large values of N of the uniform discrete distribution for I=0,...,N. (Here N is the size of the population, and I is the number of successes in the population). The discrete uniform distribution is the correct choice of uninformed prior distribution in the finite case, and so the continuous uniform distribution is the correct choice in the limiting case. This is the beta distribution for α=β=1 : P ≈ 0.5±0.3 . Bo Jacoby (talk) 06:36, 26 June 2010 (UTC).
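The beta-distribution figures quoted in this thread can be checked with a short sketch. It assumes the uniform prior Bo Jacoby describes, so that i successes out of n give a Beta(i+1, n−i+1) posterior. The function name `beta_posterior` is made up for illustration, and the count of 270 out of 300 is read off the "about 90% of 300" in the question.

```python
import math

def beta_posterior(successes, n):
    """Mean and standard deviation of Beta(successes + 1, n - successes + 1).

    This is the posterior for an unknown proportion under a uniform prior.
    """
    alpha = successes + 1
    beta = n - successes + 1
    total = alpha + beta
    mean = alpha / total
    sd = math.sqrt(alpha * beta / (total ** 2 * (total + 1)))
    return mean, sd

# Students who took the class: assumed 270 of 300 went on.
mean1, sd1 = beta_posterior(270, 300)
print(f"P1 = {mean1:.6f} +/- {sd1:.7f}")

# No data at all (n = 0): the posterior is just the uniform prior.
mean0, sd0 = beta_posterior(0, 0)
print(f"P  = {mean0:.1f} +/- {sd0:.1f}")
```

With n = 0 the result depends entirely on the chosen prior, which is the point Qwfp raises above; a different uninformative prior would give different numbers.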
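The 2×2 table and Pearson's chi-square test suggested in this thread can be sketched as follows. The exact counts are assumptions, since the question only gives "about 66% of 1600" and "about 90% of 300"; the helper name `chi_square_2x2` is hypothetical. The sketch uses only the standard library and applies no continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for the table [[a, b], [c, d]].

    With 1 degree of freedom the statistic is a squared standard normal,
    so the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Assumed counts, rounded from the percentages in the question.
total_students, total_went_on = 1600, 1056   # "about 66% of 1600"
class_students, class_went_on = 300, 270     # "about 90% of 300"

# The "couple of straightforward subtractions" that give non-overlapping groups.
took_on = class_went_on
took_not = class_students - class_went_on
other_on = total_went_on - class_went_on
other_not = (total_students - class_students) - other_on

chi2, p_value = chi_square_2x2(took_on, took_not, other_on, other_not)
print(f"table: [[{took_on}, {took_not}], [{other_on}, {other_not}]]")
print(f"chi-square = {chi2:.2f}, p = {p_value:.3g}")
```

A very small p-value here only shows an association between taking the class and going on. As noted in the thread, it says nothing about causation, and repeating the test for ten other years raises the multiple-comparisons issue.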