Mathematics desk
< February 26	<< Jan \| February \| Mar >>	February 28 >

Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.

February 27

stats question/puzzles.

these are not homework, though probably that's exactly what they look like. The give away is if I was a stats student I should have a clue how to approach these kind of problems. I've been coming up with these stats questions but don't have the math to solve them. Most start with: say you have a 100 sided die. Here's a couple: On average how many unique numbers will NOT have appeared after 100 rolls? On average, how many rolls would it take to see each number at least once. And lastly, on average, after 100 rolls, what number roll will produce the last unique number, not a repeat. Can someone come up with a mathematical solution or is that pretty much only solvable if you modelled it with a random number generator? I reckon the first two are probably solvable. Not sure about number three. Vespine (talk) 09:06, 27 February 2009 (UTC)[reply]

The second problem is the Coupon collector's problem. McKay (talk) 09:41, 27 February 2009 (UTC)[reply]

The first problem can be solved exactly by the following iterative algorithm: After one throw, the probability is 1 that exactly one number has appeared. After two throws, the probability is 99/100 that exactly two numbers have appeared, and 1/100 that exactly one number has appeared. For each succeeding throw, the probability of having thrown only one unique value will be the value of the preceding iteration multiplied by 1/100. The probability of K unique values (where 100 >= K > 1), will be p(K) from previous iteration multiplied by K/100 (i.e. the probability of again throwing one of the K values), plus p(K-1) from the previous iteration multiplied by (100-(K-1))/100, i.e. the probability of having thrown one less than K in the previous iteration, and now throwing a number that is not among the K-1 that have previously appeared.

For 1,2,3 and 4 throws, we get these probabilities:

n (number of throws)	p(u=1,n=row)	p(u=2,n=row)	p(u=3,n=row)	p(u=4,n=row)
1	1	0	0	0
2	1/100	99/100	0	0
3	p(u=1,n=2)*1/100	p(u=2,n=2)2/100+p(u=1,n=2)99/100	p(u=2,n=2)*98/100	0
4	p(u=1,n=3)*1/100	p(u=2,n=3)2/100+p(u=1,n=3)99/100	p(u=3,n=3)3/100+p(u=2,n=3)98/100	p(u=3,n=3)*97/100

Expand the table until row 100 is reached, and use the probabilities to calculate the expected value. The C++ code in the collapsed box below does this, and finds that the expected value is 36.6032. I'm reasonably sure this is correct, as I've also run simulations which show that the value is close to 36.6. I'm sure the algorithm could be expressed a lot more succinctly in mathematical notation, but I don't have the skills necessary.

C++ code to compute E(number of unique values that have NOT appeared after 100 throws of 100-sided die).

double question1()
{
	std::vector<double> p; // Probabilities in previous iteration
	std::vector<double> q; // Probabilities in current iteration
	p.resize(100);
	q.resize(100);

	int i;
	for (i = 0; i < 100; ++i)
	{
		p[i] = 0.0;
	}

	// Situation after two throws
	p[0] = 1.0/100.0;
	p[1] = 99.0/100.0;

	for (i = 2; i < 100; ++i)
	{
		int j;
		for (j = 0; j < 100; ++j)
		{
			q[j] = 0.0; // Reset
		}

		q[0] = p[0]/100.0;
		for (j = 1; j <= i; ++j)
		{
			q[j] = p[j]*(j+1)/100.0 + p[j-1]*(100.0-j)/100;

		}

		// Verify that total probability is 1.
		double tot_prob = 0.0;
		for (j = 0; j < 100; ++j)
		{
			tot_prob += q[j];
		}
		std::cout << "Iteration: " << i+1 << ", Total probability: " << tot_prob << '\n';

		p.swap(q); // Prepare for next iteration
	}

	// Calculate expectation (of number of numbers that have appeared)
	double u = 0.0;
	for (i = 0; i < 100; ++i)
	{
		u += (i+1.0)*p[i];
	}
	return 100.0 - u;
}

--NorwegianBlue^talk 14:49, 1 March 2009 (UTC)[reply]

Exponential/Log of a continuous linear operator

I'm trying to work out this problem (which is not homework): Suppose $A:\mathbb {X} \to \mathbb {Y}$ is a bounded linear operator, where $\mathbb {Y}$ is a Banach space, with $\|I-A\|<1$ (and $I$ the identity operator). The exponential of $A$ is defined as $e^{A}=\exp(A)=\sum _{n=1}^{\infty }A^{n}/n!$ and the logarithm is defined as $\log(A)=-\sum _{k=1}^{\infty }{\frac {1}{k}}(I-A)^{k}$ . As both of these series are absolutely convergent, it is clear that they do indeed converge. My problem is to show that $\exp(\log A)=A$ .

My first idea is to show that the exponential function (as an operator from $L(\mathbb {X} ,\mathbb {Y} )$ into itself) is continuous (a relatively simple task), so that the problem reduces to showing $\exp {\Bigl (}-\sum _{k=1}^{n}{\frac {1}{k}}(I-A)^{k}{\Bigr )}=\exp {\Bigl (}-\sum _{k=1}^{n}{\frac {1}{k}}\sum _{j=0}^{k}{\binom {k}{j}}(-A)^{j}{\Bigr )}\to A$ .

I thought perhaps changing the order of summation could be helpful, though doing this didn't really reveal any form that appeared useful to me. Does anyone have any useful suggestions? Nm420 (talk) 19:04, 27 February 2009 (UTC)[reply]

I haven't followed up this idea at all, but it might be useful to look at some proofs that those power series define functions from R to R. It's possible that the ideas easily generalize. Algebraist 19:12, 27 February 2009 (UTC)[reply]

Exact; let's write A=I-x so it is exp(log(I-x))=I-x ; this is true in the sense of real functions (|x|<1), therefore in the sense of formal power series, therefore in any Banach algebra (again with ||x||<1). No need of computation... (and I guess you mean

\mathbb {X} =\mathbb {Y}

) --pma (talk) 19:50, 27 February 2009 (UTC)[reply]

In many fields of mathematics, objects are generalized/based from/on Euclidean space. So if you prove it for Euclidean space, it should be easy to generalize (note that not all objects are based on Euclidean space). --PS T 12:41, 28 February 2009 (UTC)[reply]

pma's solution is very slick, but if you actually want to do a calculation, then I suggest setting

B=I-A

and then showing that

\exp(\log(I-B))=I-B

. Both sides of this are well-defined formal power series in

B

, and you should be able to show that the coefficient of

B^{n}

is the same on both sides by reversing the order of summation, as you expected, though I haven't worked it out, so I'm not sure how complicated it turns out to be. (Note also that your definition of

\exp(A)

is missing

I+

from its beginning.)Ctourneur (talk) 14:12, 28 February 2009 (UTC)[reply]

I'd say it's not really slick (thanks :) anyway), it is just the standard way people do that. A proper setting for the OP is a sort of 'baby holomorphic functional calculus', that is, with

f(A)

being defined by means of a complex power series

\scriptstyle f(z):=\sum _{k=0}^{\infty }a_{k}z^{k}

with radius of convergence

\scriptstyle r>\|A\|

. If

\scriptstyle g(z):=\sum _{k=0}^{\infty }b_{k}z^{k}

is another power series, with radius of convergence

\scriptstyle R>\sum _{k=0}^{\infty }|a_{k}|\|A\|^{k}\geq \|f(A)\|

then

\scriptstyle g\circ f

, expanded formally, has in fact radius of convergence at least r and

\scriptstyle (g\circ f)(A)=g((f(A))

. To prove this, no particular computation is needed, but to do it well you need the notion of summability and absolute summability for families of vectors in a Banach space. The point is that when dealing with composition and multiplication of power series, the concept of series indexed on natural numbers is too narrow, a true PITA. Objects of the right generality to deal with are

\scriptstyle \sum _{\lambda \in \Lambda }x_{\lambda }

, infinite sums indexed on general sets

\Lambda

(not necessarily ordered); where the sum is conveniently defined. Then you have nice properties about products, and about associativity, that allow to say that

\scriptstyle (g\circ f)(A)=g((f(A))

because by associativity each side equals the sum of the family

\{b_{k}a_{j_{1}}..a_{j_{k}}A^{j_{1}}..A^{j_{k}}\}_{(k,j_{1},..j_{k})}

(and as you see, in a natural way one is lead to consider sets of indices quite different from

\scriptstyle \mathbb {N}

). Unfortunately I can't find here a reference for infinite sums; the French Wikipédia has this nice article [1]. Otherwise, if you know about theory of integration, you may observe that the notion of absolute summability is a very particular case of Bochner integral (in that a sum is an integral wrto the counting measure); you can recognize that the various properties of infinite sums are very easy cases of analogous theorems in integration theory.--pma (talk) 22:38, 28 February 2009 (UTC)[reply]

Calendar question.

What day of the week were Nov 11, 1918 July 4, 1776 June 10, 1215 ? 65.167.146.130 (talk) 20:25, 27 February 2009 (UTC)[reply]

Where in the world are you asking for? Different countries adopted things like the Gregorian calendar or Julian calendar at different times, to say nothing of the ambiguities offered with just a year number BC or AD etc... 78.151.212.201 (talk) 22:03, 27 February 2009 (UTC)[reply]

Check out this page and choose a "day of the week" calculator. Using the top of the list link I got Monday, Thursday, and Wednesday -hydnjo talk 02:07, 28 February 2009 (UTC)[reply]

These are all famous historical dates, so it's easy to guess the calendar. Magna Carta explicitly says that June 10, 1215 is Julian. I assume July 4, 1776 is Gregorian since the British Empire switched in 1752, and Nov 11, 1918 obviously is. I'm not sure I'd trust random day-of-the-week calculators to handle mixed-calendar dates and dates that old. The top Google hit for me claims to handle only Gregorian dates. All three of your week days are correct, but Julian and Gregorian days of the week happen to coincide for 13th-century dates. This calculator looks trustworthy. -- BenRG (talk) 21:45, 28 February 2009 (UTC)[reply]

They may all be famous historical dates *for some people*, but that doesn't excuse the lack of ambiguity. For instance, a casual glance may suggest William Shakespeare and Miguel de Cervantes died on the same day, famous playwrights both, on 23 April 1616. Yet they were actually some 10 days or so apart. Not a rant per se, just a caution to be precise and concise. 78.151.212.201 (talk) 10:40, 1 March 2009 (UTC)[reply]

In the above, I think for "lack of ambiguity" you should read "ambiguity". -- SGBailey (talk) 12:20, 1 March 2009 (UTC)[reply]

You are, of course, right. Two sentences in my mind, plus too early on a Sunday. Thanks :) 78.151.212.201 (talk) 16:37, 1 March 2009 (UTC)[reply]

This isn't an answer to the question, just a moan (or rant), so I've written it in small! Writing dates as month day comma year is the daftest convention ever invented. Firstly the usual complaint about the elements being out of order being neither big-endian nor little endian but sort of "middle-endian-then-dancing-about". Secondly the comma is crazy. ~~You~~ I look at 65's post and I see a list of 4 dates. I see (Nov 11), (1918 July 4), (1776 June 10) and (1215). I can't see anyway to get the daft 50% of the world to change to anything sensible, so I just sound off about it occasionally. End of rant -- SGBailey (talk) 09:22, 1 March 2009 (UTC)[reply]

Is it really 50%? I thought it was pretty much just the US. (We, of course, have an section of an article on it here - far less than 50%.) --Tango (talk) 15:55, 1 March 2009 (UTC)[reply]

In my experience, that comma thing is probably explained by him/her putting the dates onto separate lines, and then the formatting was off so that the lines were combined. I will demonstrate. Hi How Are You I put each of those words onto a separate line, but they were combined. 12.216.168.198 (talk) 15:12, 1 March 2009 (UTC)[reply]

Well that is obviously true, but the problem is in the fact that the date is in a screwed up order and folk stuff a comma in the middle of the item without anything forming a boundary for the date 'phrase'. -- SGB not logged in. —Preceding unsigned comment added by 82.26.223.56 (talk) 16:58, 1 March 2009 (UTC)[reply]