# Anirban’s Angle: A New Core?

Anirban DasGupta writes:

PhD students in statistics departments across the world are asked to take a course on the core theory of inference, the so-called “qualifier theory course”. I took mine at the ISI in Calcutta in 1977. It was masterfully taught by K.K. Roy, and covered what was then essentially globally regarded as the core of inference: exponential families, sufficiency, ancillarity, completeness and UMVU, MLEs, Fisher information and the Cramér-Rao bound, asymptotics of the MLE, consistency, the delta method and Slutsky’s theorem, the Neyman-Pearson lemma, MP and UMP tests, MLR families, the LRT, UMA confidence sets, the duality of confidence sets and testing, basic game theory, Bayes rules, admissibility, minimaxity, Wald’s SPRT, and rank tests. Bickel and Doksum had just come out and hadn’t reached India; we used Ferguson (1967), and we all agreed that it was a fulfilling course.

But that was then, and this is now. Through purely personal interactions, I have gathered anecdotal evidence of a sentiment that part of what was long regarded as the core of inference is no longer considered very relevant. The events and discoveries of the last thirty years make it necessary to re-evaluate the core theory of statistics that a fresh PhD ought to be expected to know and understand. To take my colleagues’ pulse, I checked the syllabi of the core course at Berkeley, Stanford, Chicago, Washington, UPenn, Carnegie Mellon, and Duke. I also contacted 11 experts in the US, Europe, Australia, and India, and solicited their definitions of what should be in the qualifier theory course. The responses quite surprised me.

I was surprised by the fantastic diversity of opinions on what should be in that first theory course. The intersection of the definitions was nearly empty: only sufficiency, exponential families, and MLEs appeared in every one. But there was an unmistakable desire to put less emphasis on parametrics, unbiasedness, UMP tests, certain parts of decision theory, traditional sequential analysis, and the old nonparametrics. On the other hand, the responses included many “new age” topics: the bootstrap, AIC and BIC, VC theory, empirical processes, permutation tests, EM, MCMC, function estimation, sparsity, causal inference, extreme values, and more.

The responses are revealing. They tell me that while there is a sharp hunger for change in the traditional core course, it is no longer possible to reach even an approximate global consensus on what that first course should teach. The core course will probably become rather local, and we will no longer be able to assume that a fresh PhD from a statistics program has seen, and been tested on, a common set of topics in inference.

What would I personally teach in that first course? After reading all the responses I received, I find myself agreeing with these two statements from John Marden and Philip Stark: the first math-stat course should be about how to think about models and inference, and about the mathematical (as opposed to computational) framework for attacking the problems; and you need to have some idea of what’s possible, and where to look for approaches, ideas, inspirations, and theorems. My own dottrina à nouveau could be something like the following list of topics, each followed by the number of 50-minute lectures I would devote to it (43 in total).

- Problems and basic principles of inference; selecting and evaluating a procedure; loss and risk; bias; variance; parametric vs. nonparametric modelling; optimality vs. robustness; Hogg’s adaptive estimate (nontechnical) (2)
- Modelling: location-scale, exponential families, mixtures, heavy tails, non/semi-parametric models, dependence (4)
- Data summary: the likelihood function, sufficiency, factorization and Rao-Blackwell, definition of UMVU (3)
- Score function, Fisher information, the information matrix, Cramér-Rao (3)
- MLEs: general and in exponential families, some nonregular, some multiparameter, nonexistence, difficulty of computation, reference to EM in Bickel-Doksum (3)
- Simulating the Cauchy MLE and median; statement of the portmanteau theorem; asymptotic normality under the Cramér-Rao conditions; observed information and the sandwich estimator; the two-parameter Gamma; plug-in; delta method and Slutsky; applications (5)
- Priors, posteriors, conjugate priors, posterior means, posterior and Bayes risk, comparison with the MLE, Bayes vs. minimaxity, from Bickel-Doksum (5)
- Testing: error probabilities and power, the NP lemma, applications, statement of one-sided UMP tests (3)
- The LRT: three examples from Bickel-Doksum, the chi-square limit (3)
- Confidence sets: duality with testing, the t interval, asymptotic confidence intervals, definition of a posterior credible interval (3)
- The empirical CDF; SLLN, Glivenko-Cantelli and DKW; purpose, use, and scope of the bootstrap; bootstrap bias and variance estimation; consistency of the bootstrap; permutation tests (4)
- A personal selection from: James-Stein and Donoho-Johnstone estimates, sparsity, SURE, the Gaussian sample maximum, kernels, RKHS and function estimation, model choice, AIC, BIC, VC and martingale inequalities, Bayes factors, the Dirichlet process, Bernstein-von Mises (5)
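The Cauchy simulation mentioned in the syllabus above fits in a few lines. Here is one minimal sketch (the grid-search maximizer, sample size, and replication count are my own illustrative choices, not prescribed in the column): it compares the Monte Carlo variances of the Cauchy location MLE and the sample median against their asymptotic values, 2/n for the MLE and (π²/4)/n for the median.

```python
import numpy as np

rng = np.random.default_rng(0)

def cauchy_mle(x, width=2.0, grid=2001):
    """Location MLE for Cauchy(theta, 1) by grid search near the median,
    where the global maximum of the likelihood usually lies."""
    m = np.median(x)
    thetas = np.linspace(m - width, m + width, grid)
    # Negative log-likelihood (up to constants) at each grid point
    nll = np.log1p((x[None, :] - thetas[:, None]) ** 2).sum(axis=1)
    return thetas[np.argmin(nll)]

n, reps, theta0 = 50, 2000, 0.0
mle = np.empty(reps)
med = np.empty(reps)
for i in range(reps):
    x = theta0 + rng.standard_cauchy(n)
    mle[i] = cauchy_mle(x)
    med[i] = np.median(x)

# Asymptotic variances: 2/n for the MLE, (pi^2/4)/n for the median
print("var(MLE)    ~", mle.var(), " theory:", 2.0 / n)
print("var(median) ~", med.var(), " theory:", (np.pi ** 2 / 4) / n)
```

Students see immediately that the median is a respectable but inefficient competitor (asymptotic relative efficiency 8/π² ≈ 0.81), while the MLE requires numerical maximization of a possibly multimodal likelihood.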

Perhaps someone else will address what should be taught in Stat 100. Opinions will probably differ on that, too. But that’s an issue for another day.

## Comments


**David Banks** · August 11, 2011 at 5:41 pm

I’m delighted that Professor DasGupta has raised this issue.

My sense is that our profession needs two kinds of doctoral programs. One kind would emphasize mathematical statistics, and the other would emphasize applied statistics. The appropriate course content for these would have relatively little overlap.

The world needs a great many more high-level applied statisticians. Such people are versatile in computation, knowledgeable in specific areas of application, broad in their toolkit, and capable collaborators/consultants. These are not skills that academic programs have emphasized. Most successful applied statisticians have had to learn these on their own, after graduation.


**Vaidyanathan Ramaswami** · August 17, 2011 at 5:13 pm

The crucial element is not which specific topics are taught, but that the course(s) bring out the rich choices available. It is equally important not to impart a methodological bias based on one’s own involvement in a certain set of areas. In that context, note that with the easy availability of canned software, a professional statistician needs to distinguish himself or herself from a casual engineer or scientist using statistics. That distinction comes from being not merely good at pushing a few buttons, but truly skilled at grasping the real problems, tailoring the right solutions to them, and inventing new ones as needed. It is doubtful that the latter can be done effectively without a reasonably good grounding in the mathematics of it all.
