Jul 14, 2017

# Student Puzzle Corner 18 (and solution to Puzzle 17)

Contributing Editor Anirban DasGupta sets this problem. Student members of the IMS are invited to submit solutions (to bulletin@imstat.org with subject “Student Puzzle Corner”). The deadline is September 4, 2017.

After a relaxed rendezvous with effulgent nothingness, we should now seriously get back to the problem corner. It is the turn of a problem in statistics.
We will pose a problem on deconvolution, sometimes brandished as noisy data. The basic model is that you get to observe a random variable $Y$ which has the distribution of the convolution of $X$ and $Z$, it being usually assumed that $Z$ has a completely known distribution, while the distribution of $X$ has unknown parameters, perhaps infinite dimensional, associated with it. We would want to infer about the distribution of $X$, knowing only $Y$; often, it is assumed that iid replicates of $Y$ are available. There is massive literature on deconvolution, particularly Gaussian deconvolution. Generally, the results are asymptotic in some sense. The problem we describe today was originally posed by C.R. Rao in 1952.

Here is the exact problem of this issue.

Suppose $X \sim \mbox{Bin}(n_1, 1/2)$ and $Z \sim \mbox{Bin}(n_2, p)$, $0 < p <1$ being an unknown parameter; $X$ and $Z$ are assumed to be independent. Due to (spatial) aggregation, we can only observe $Y = X+Z$.

(a) Is there always an MLE of $p$?

(b) In suitable asymptotic paradigms, are there consistent estimators of $p$ based on $Y$ alone?

(c) How does one construct a confidence interval for $p$, again, based on $Y$ alone?

(d) What can be said about minimax estimation of $p$ on the basis of $Y$, using squared error loss?

## Solution to Puzzle 17

The problem asked was about the first hitting time $\tau$ of the set of prime numbers by the process $\{S_n\}, S_n$ being the sum of the first $n$ rolls in an infinite sequence of rolls of an honest die. Evidently, $P(\tau < \infty ) > 0$. The set of primes $\mathcal{P}$ is an infinite set, and in fact, with probability one, (the process) $S_n$ will hit $\mathcal{P}$ infinitely many times (see, e.g. Feller, Vol. 2, pp 360, 381). We know from classic number theory that for fixed $n$, large, $P(S_n \in \mathcal{P}) \approx \frac{1}{\log n}$, and since the series $\sum_{n = 2}^\infty \frac{1}{\log n}$ diverges, the expected number of hits is easily $+\infty$. The tailsum formula tells us

$E(\tau ) = 1 + \sum_{n = 1}^\infty P(\tau > n) = 1 + \sum_{n = 1}^\infty P(\cap _{k = 1}^n S_k \not\in \mathcal{P})$,

and a geometric approximation by terminating the infinite series at one million terms gives $E(\tau ) \approx 7.6$. The initial terms $P(\tau > k)$ can be calculated exactly. By summing $P(\tau > k)$ from $k = 0$ to $9$, we get $E(\tau ) > 2.34 > 7/3$.

It is less clear if $\tau$ has a finite variance, but it probably does. To my knowledge, the answer is not explicitly known.

References:
Brown, L. (private communication)
Davis, B. (private communication)
Feller, W. (1971)
Pitman, J. (private communication; provided an approximate value of 7.62 for $E(\tau )$)
Rao, C.R. (1952)
Szekely, G. (private communication)

## Welcome!

Welcome to the IMS Bulletin website! We are developing the way we communicate news and information more effectively with members. The print Bulletin is still with us (free with IMS membership), and still available as a PDF to download, but in addition, we are placing some of the news, columns and articles on this blog site, which will allow you the opportunity to interact more. We are always keen to hear from IMS members, and encourage you to write articles and reports that other IMS members would find interesting. Contact the IMS Bulletin at bulletin@imstat.org

## What is “Open Forum”?

In the Open Forum, any IMS member can propose a topic for discussion. Email your subject and an opening paragraph (to bulletin@imstat.org) and we'll post it to start off the discussion. Other readers can join in the debate by commenting on the post. Search other Open Forum posts by using the Open Forum category link below. Start a discussion today!