No vibrant society would want as its president someone who feels entitled to the honor, or treats it as an honorary title. I am therefore honored by your trust, and shall do my best to earn the honor. No society is entitled to vibrancy either, regardless of its past laurels, without ongoing and on-target efforts by its membership and leadership. For those inclined to consider this assertion merely a new president’s scare tactic, my Chicago colleague Stephen Stigler’s historical account, “Does the American Statistical Association Have a Future?” (*Amstat News*, August 2004, pp 2–3), would be a timely read.

“But why should I care about IMS’s vibrancy?” some of you may ask. “Duh!” perhaps is the most expected answer from any IMS president, but let’s elaborate on that “Duh.” Don’t worry, I am not here to preach about why you shouldn’t ask what IMS can do for you (you should), but rather ask what you can do for IMS (you can). Nor am I here artfully to back out of any “read-my-lips” campaign promises. I didn’t make, or rather need, any. IMS decided a long while ago, wisely, that a sure way to eliminate empty campaign promises was to eliminate the campaign. “Conditioning is the soul of statistics,” as my Harvard colleague Joe Blitzstein emphasizes in all his teaching. The vote of confidence I received was conditioned on *N*=1, where *N* is the population size of candidates determined by the nomination committee (thanks a bunch!). My previous “*N*=2” experiences (at ASA and ISBA) reminded me that this conditional confidence should give me little confidence that the majority of IMS members actually endorse the three priorities in my candidate statement, or even know what they are. Indeed, when I asked about what to write for this President’s Column, I got a clear message that I should not make the obnoxious assumption that the majority of members actually know who I am. This gives me a perfect excuse to invoke *n*=1, interweaving a few personal stories with the thinking behind the three priorities. As a (reputable) statistician, I am well aware of the danger of relying on *n*=1 (or any *n*). My hope is that these stories can help personalize messages that otherwise may be perceived to be mostly rhetorical.

I was trained, during 1978–1982, as a pure mathematician at Fudan University in Shanghai. Like thousands of youths then in China, I was attracted to mathematics not because I understood any of its applications, but because of an inspiring article on Jingrun Chen, a mathematician who, back in the early 1970s, had made the most significant contribution towards solving the Goldbach conjecture. By comparison, the statistical profession has relied almost exclusively on the applicability and impact of statistics to attract students, researchers and the public’s attention. This has been working increasingly well as we enter more deeply into the digital age. For instance, the Department of Statistical Sciences at the University of Toronto now attracts more than 3,500 undergraduate students from all walks of (college) life to its statistics programs, apparently because of the greater job prospects. Even if only a small percentage of them end up as card-carrying statisticians or probabilists, the large denominator greatly drives up the numerator.

There is still more we can and should do, however. For instance, the pure-math community constantly attracts some of the world’s most powerful minds, as well as public fascination, by emphasizing (wisely) not the job prospects of mathematicians, but the sheer intellectual challenges created by the community, from Fermat’s last theorem to the twin prime conjecture. As I argued in XL-Files, job prospects can fluctuate greatly even for seemingly infallible fields (e.g., CS). Yet solving the world’s most difficult problems, as they accumulate, requires an increasing supply of creative minds and deep thinkers. A nano-sample of such problems might include: formulating and guiding the trade-off between data utility and data privacy, building a general theory for cost-effective and information-preserving data preprocessing, establishing a rigorous evaluative framework for individualized treatments, analytical modeling of digital blockchain networks, coherent statistical learning with many and diverse low-resolution inputs, and deep understanding of deep learning. It would be a missed opportunity if we did not also emphasize their intellectual demands, including being harder to formulate than mathematical conjectures, when conveying their importance and impact. The joy of intellectual pursuit is catnip for brilliant minds, and IMS, being the world’s leading scholarly society for mathematical statistics and probability, is a natural home for such minds. Therefore, we should also compete for them to the fullest extent.

Young minds (of whatever biological age) are not only more curious, they are also more likely to make good use of the joy implicit in “*aha*” moments as fuel for the drive to overcome increasingly higher-level obstacles. In my own case, although I did well enough in middle school to forgo high school, I ran into a major learning block upon encountering the concept of open sets in college. I just couldn’t comprehend the notion “open” without connecting it to a geometric object. It took a whole week for me to realize that it is fruitful to let the meaning of a property outgrow its original root. I still recall how joyful I was when I finally “got it!” My ability to engage in abstract thinking took off after that “aha” moment. I started to seek out difficult abstract concepts instead of avoiding them, ultimately leading me to study abstract algebra for my senior thesis. IMS should cultivate interest from college students around the world, and from all data science-oriented fields, who live for such moments of intellectual exhilaration. Relative to the pool of graduate students, undergraduates represent an even larger supply of extraordinary minds, with increasingly diverse backgrounds. Indeed, once when I was writing a recommendation letter for a PhD student, I had to replace “the best student” by “the best PhD student,” upon realizing that the most original thinker I had ever worked with was an immigrant undergraduate student, whose senior thesis was far deeper than my PhD thesis (and most of my publications).

When I started to take courses at Harvard in 1986, I encountered my second learning block. I was well advised, by a classmate from Fudan who had come to the US a year earlier, to take some hard courses and some easy ones, in order to achieve a “grade-learning balance.” Unfortunately, he neglected to tell me which courses were easy and which ones were hard. Probability Theory was obviously hard, and Regression Analysis surely was on the easier side: how hard could it be to draw a reasonable line across a bunch of points? The reality opened my eyes and my mind. My homework answers on Probability Theory were used by the instructor to replace the TA’s answers, thanks to years of training in getting the right answers in the most elegant and succinct ways (getting right answers would only earn partial credits then). For the regression homework, I knew how to start, but didn’t know when to stop. The residual plots always had the shapes—especially when I stared at them long enough—that my textbook said I should avoid. I took log of *Y*, and of *X*, and of both of them, and sometimes log of log *Y*. Shapes changed, but never disappeared. The book told me to also try the square root transformation. So I did. I also cleverly avoided taking a square root of the square root—there was something called the Box-Cox transformation. More shapes emerged, and some were more amusing than others. But the young course instructor was not amused, when I handed in about 100 pages of computer output. I was called to his office, and I was asked to explain what I had done. I was frustrated. If I had known what I was doing, would I have handed in all 100 pages?? Thankfully, my English was so poor then that soon the instructor had to let me go, saving both of us from going from regression to aggression.
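That transformation ladder—log, square root, and their common generalization—can be sketched in a few lines. Below is a minimal illustration on simulated data (the dataset and model are hypothetical, not the original homework): SciPy’s `boxcox` estimates the Box–Cox parameter λ by maximum likelihood, with λ=0 recovering the log transform and λ=0.5 the square root.

```python
# A hedged sketch of the transformation search described above,
# on simulated (hypothetical) data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 200)
# A response whose spread grows with its level -- the classic shape
# a residual plot tells you to avoid.
y = np.exp(0.5 * x + rng.normal(0, 0.3, size=200))

# Box-Cox picks lambda by maximum likelihood; lambda = 0 is the log
# transform and lambda = 0.5 the square root.
y_bc, lam = stats.boxcox(y)

# Refit a simple line on the transformed scale and inspect residuals.
slope, intercept, r, p, se = stats.linregress(x, y_bc)
residuals = y_bc - (intercept + slope * x)
print(f"estimated lambda: {lam:.2f}")
```

Knowing when to stop, of course, is the part no software will estimate for you.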

Recently, the same instructor—now a senior professor at a world-class university—and I reminisced about those good old days with a few colleagues. He said that he didn’t recall that encounter at all. Instead, he told everyone how once I cracked a mathematical problem for the course on the spot, a problem that he had struggled with for days. I, of course, was flattered, but didn’t have the slightest memory of it. The conversation reminded me of how young minds are more likely to be impressed, not depressed or suppressed, by challenges. It is also a telling example of the effectiveness of having people with different training and experiences work together to solve challenging problems. Back then the ratio of my understanding and skills in probability to that in statistics was essentially infinite, and hence I was more useful for mathematical and theoretical problems. By now, the ratio is essentially zero—I didn’t intentionally decrease the numerator, but I have tried hard to grow the denominator. I now have a far better understanding of residual plots (I hope!), and much deeper statistical insights with which to conduct principled corner-cutting and formulate foundational research problems, such as those in *“A Trio of Inference Problems that Could Win You a Nobel Prize in Statistics (If You Help Fund It)”* (http://www.stat.harvard.edu/Faculty_Content/meng/COPSS_50.pdf). But to tackle these problems, I am in far more need of probabilistic and other analytical skills than the young instructor was in need of my mathematical skills more than three decades ago.

Generally speaking, foundational problems in data science require penetrating statistical and computational thinking, complex probabilistic modeling, and intricate mathematical derivations. As my predecessor, Professor Alison Etheridge, rightly and repeatedly stressed in her presidential address during the 2018 IMS annual meeting, there is an increasingly urgent need for statisticians and probabilists to work together as a team, within and beyond IMS, to tackle the most challenging problems in data science. She also stressed the importance of being as inclusive and supportive as possible, for example, to female probabilists and statisticians. We need to work together, continuously and creatively, to ensure that IMS excels as a scholarly society for mathematical statistics and probability, establishing and sustaining a socially welcoming and intellectually supportive hub and professional network for young talents in statistics, probability, and, more broadly, data science. Our key aim is to facilitate collaboration and career development, especially for those who are most in need.

With these priorities in mind, I was very encouraged when NSF (the US National Science Foundation) contacted me during the JSM in Vancouver, inviting IMS to play a more leading role in conducting foundational research for data science, and in helping NSF to shape its priorities and strategies for various data science initiatives, such as the HDR (Harnessing the Data Revolution) investment (https://www.nsf.gov/news/news_summ.jsp?cntn_id=244678&org=CI). As a small but immediate step, I have requested the leadership of the IMS Data Science Group to accelerate their efforts to organize a cross-disciplinary conference on Foundations of Data Science, and to increase their emphasis on attracting and accommodating young talents from probability, statistics, computer science, and beyond. I am also forming an IMS Task Force, consisting of probabilists and statisticians, with the charge to recommend strategies and activities aimed at building younger, broader, and deeper pipelines, in terms of demographic and intellectual spectra. IMS is not immune to the phenomenon of declining membership; see, for example, the trends in the figure below, especially for non-student membership. Yet our vibrant future lies not only in increasing our numbers, but in enriching the scholarly community of probabilists, statisticians, and, more broadly, data scientists. We aim to strengthen our existing foundations, as well as to break the ground within which new ones can be built.

To summarize, foundations cannot be simulated; they necessitate designing, digging, and excavating, and hence, powerful minds and muscles. If IMS is to play a more leading role in building the foundations of data science, probabilists and statisticians need to work more closely and more frequently. This aim should compel us to cultivate collaboration and communication early on in the careers of young talents, and to be as inclusive and supportive as possible in our pipeline-building effort. IMS will also serve data science, and itself, well by attracting many more extraordinary talents from all walks of life, who derive joy and develop perseverance from intellectual challenges themselves. The more broadly and deeply we can attract and engage such talents, the more effectively IMS can lead in building the foundations of data science. Our vitality depends upon exercising our capacities to carry out the tasks right in front of us, while taking a long view of the intellectual challenges that are as yet far beyond us.

Thanks for indulging my long “Duh,” going from vibrancy to vitality. Increasingly, I find myself hoping that I am aging like Château Musar (I cannot afford Château Lafite, at least not those from Château Lafite). But there is no escape from the reality that in youth (which is both an inward and outward quality) lies vitality. I surmise that few of you would have the patience to hear yet another anecdote from my youth to illustrate this point. But in case you feel your time has been wasted reading my anecdotes, I’m happy to refund your time by reading your stories (president@ims.org), especially if they can inspire us to build, collectively, a more vibrant and vital IMS.

Until the next time (with stories from the older me), please stay young and vibrant!


Donald B. Rubin, professor of statistics at Yau Mathematics Center, Tsinghua University, and professor emeritus at Harvard University, has been appointed Murray Schusterman Senior Research Fellow in the Department of Statistical Science in Temple University’s Fox School of Business. Don is a fellow/member/honorary member of the Woodrow Wilson Society, Guggenheim Memorial Foundation, Alexander von Humboldt Foundation, American Statistical Association, Institute of Mathematical Statistics, International Statistical Institute, American Association for the Advancement of Science, American Academy of Arts and Sciences, European Association of Methodology, the British Academy, and the US National Academy of Sciences. He has authored or co-authored about 450 publications (including 10 books), has four joint patents, and is one of the most highly cited authors in the world, with nearly 250,000 citations.

Edoardo M. Airoldi will join the Department in fall 2018 as the Millard E. Gladfelter Professor of Statistics and Data Science. He will also serve as director of the Fox School’s Data Science Center. Edo comes from Harvard University, where he had served since 2009 as a full-time faculty member in the Department of Statistics. He founded the Harvard Laboratory for Applied Statistics and Data Science and served as its director until 2017. Additionally, he has held visiting positions at MIT, Yale University, and Microsoft Research, and served as a research associate at Princeton University. A distinguished researcher, Edo has authored more than 140 publications with over 12,000 citations. His work focuses on statistical theory and methods for designing and analyzing experiments on large networks and, more generally, on modeling and inferential issues that arise in analyses that leverage network data in some way.

Vishesh Karwa also joined the department as an assistant professor. He is also a Patrick J. McGovern Research Fellow at the Simons Institute, Berkeley.

The recipient of the Second Akaike Memorial Lecture Award is Professor **Mike West** of Duke University, USA. Professor West’s contributions to Bayesian statistics include seminal work in dynamic modeling, and the implementation of nonparametric models that paved the way to practical data analyses via the first realization of large-scale simulation-based methods. Professor West has also actively worked at the frontiers of various research fields to which Bayesian statistics can be applied, contributing to the creation of data-driven sciences. For example, he established a new approach for biomarker discovery using gene expression data, thus creating a novel trend in -omics biology based on data analysis.

The award ceremony and memorial lecture will be held during the plenary session of the Japanese Joint Statistical Meeting 2018, which will take place at the Korakuen Campus of Chuo University on September 10, 2018.

More information: http://www.ism.ac.jp/ura/press/ISM2018-05_e.html


A great deal of statistical analysis is aimed at making inferences from a sample to a population. This might be with a view to predicting the value of some characteristics for previously unseen population members, for predicting future values in a series of observations, for understanding mechanisms and processes which generate the data, or for other reasons. Statistical theory, based on a sound understanding of probability developed over the past three hundred years, has given us very powerful tools for such inferences. Increasingly, however, we find ourselves faced with a rather different data paradigm. Instead of sampled data, we now find ourselves presented with what is claimed to be *all* of the data.

For example, supermarkets do not record just some of the transactions made; details of all of them are entered in the database. Tax offices retain all of their records, not just a random sample. Schools do not test just some of their students, but all of them.

Data of this kind have been termed *administrative data* (Hand, 2018). They are data collected during the course of some administrative exercise, not primarily with a view to statistical analysis, but simply to run an operation—a supermarket, a tax system, or an education system in the examples above. But once the data have been collected, the extra cost of retaining them in a database is negligible. The data represent a sort of exhaust from the process, and can be immensely valuable for shedding light on the organisation and operation that is generating the data.

This potential has stimulated a burst of enthusiasm for the analysis of such data. In fact, it is often claimed that such data have significant advantages over traditional sampled data.

The first claimed advantage is the negligible cost of data acquisition compared with, say, conducting a survey, since the cost is already mainly borne through the administrative exercise which is driving the measurement. That is true, but effort—and hence cost—will be needed to quality assure and perhaps clean the data, as well as link them to other relevant material. Furthermore, the cost of collating a supermarket’s transaction data might be negligible to that supermarket, but you try asking for the data and see if the cost is negligible to you.

Secondly, as we have already noted, for very good reasons we might expect *all* of the data to be there. It is true that all of a particular supermarket’s transactions can be retained in its database. But there is more than one supermarket. Different supermarkets often tend to target slightly different population segments, so that their data sets differ in subtle ways. If your aim is to describe the past customers of the organisation for which you have data, that’s fine, but if you hope to go beyond that then various imponderables arise.

Thirdly, one might justifiably expect the data to be of high quality, since otherwise the organisation is likely to go bust. A supermarket which incorrectly charges for its goods may not last long. It’s true that there is no sampling variability in the data, but there will be all sorts of other issues. Product switching with new electronic self-checkouts provides an example. Here people have been seen to enter other vegetables as carrots, since carrots are typically the cheapest vegetables sold, far cheaper than items like avocados, for example. One supermarket even appeared to have sold more carrots than it had in stock.

There’s also a more subtle (but arguably at least as important) aspect to data quality. This is that the data might not be ideal for answering your particular research question. After all, they were collected to run the organisation, not primarily for later analysis to shed light on the organisation. The definition a tax office uses for “employed” may not match the definition a social scientist would like to use.

Fourthly, and this is one of the keys of the “big data” revolution, at least in commerce and government the data will be as up-to-date as it is possible to be. Supermarket transactions enter the database essentially as soon as they are made. Contrast this with the delay of months which may arise if the data are collected by a survey. Once again, however, while the data might be instantly available to the supermarket, they are unlikely to be so readily available to external analysts.

Fifthly, administrative data tell us how people really are behaving, not how they say they behave. This suggests you cannot get any closer to social reality than with administrative data. Which is fine if that social reality is what you really want to study. Carrot misrepresentation may be social reality, but it is really a minor aspect requiring resolution as far as the macroeconomics of the supermarket are concerned.

Finally, it is claimed, administrative data necessarily provide more precise definitions than alternative sources: you know precisely the make of the vodka being sold. But the fact is that sometimes simplifications are necessary. Government trade statistics group different goods into larger classes, inevitably glossing over subtle distinctions. Credit card transactions do not record the precise nature of the goods being bought at the micro-level.

There is no doubt that, in the face of globally declining survey response rates, new strategies for data collection have great attractions. In many ways, these alternative sources of data have properties complementary to more traditional sources. And that, of course, tells us the way forward. Combining, linking, and merging data from different kinds of sources is likely to yield more accurate and more insightful perspectives on the systems we are trying to understand.
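As a toy sketch of that linking step (all data and column names here are hypothetical), joining administrative records to survey responses on a shared identifier immediately exposes the coverage differences discussed above:

```python
# A toy illustration of linking two data sources; the data and
# column names are hypothetical.
import pandas as pd

admin = pd.DataFrame({"id": [1, 2, 3, 4],
                      "spend": [120, 80, 200, 55]})       # administrative records
survey = pd.DataFrame({"id": [2, 3, 5],
                       "reported_spend": [75, 210, 40]})  # survey responses

# An outer merge keeps everyone; indicator=True flags whether each
# person appears in the administrative data, the survey, or both.
linked = admin.merge(survey, on="id", how="outer", indicator=True)
print(linked["_merge"].value_counts())
```

The rows found in only one source are precisely where the coverage questions, and the opportunities for combined analysis, live.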

—

**Reference: **

Hand, D.J. (2018) Statistical challenges of administrative and transaction data (with discussion). *Journal of the Royal Statistical Society, Series A*, **181**, 555–605.

Recent years have seen the dramatic rise of data science, revolutionizing industry and science. The NSF-funded National Academies consensus report entitled *Data Science for Undergraduates: Opportunities and Options* (NASEM, 2018) noted that “as more data and ways of analyzing them become available, more aspects of the economy, society, and daily life will become dependent on data.”

Much has been written on the growth of data science and the role statistics plays within it (see, for example, Donoho, 2017). Historically, working in data science has required a graduate degree. However, many reports indicate a shortage of well-trained data scientists to fill new positions, with many opportunities now available to those with appropriate undergraduate training. Given the demands of the workforce, the committee, chaired by Laura Haas (University of Massachusetts Amherst) and Al Hero (University of Michigan), was charged with setting forth a vision for undergraduate data science with a focus on applications of and careers in data science.

The second chapter of the report laid out key concepts that data science professionals need to know. Building on the work of De Veaux et al. (2017), the report proposes “data acumen” as a framework for the education of future data scientists. This requires “exposure to key concepts in data science, real-world data and programs that can reinforce the limitations of tools, and ethical considerations that permeate many applications.” The committee outlined ten (overlapping) areas fundamental to developing data acumen: *Mathematical foundations; Computational foundations; Statistical foundations; Data management and curation; Data description and visualization; Data modeling and assessment; Workflow and reproducibility; Communication and teamwork; Domain-specific considerations; *and *Ethical problem solving*.

Mathematics is essential to data science, but questions remain about what type and how much mathematics is needed for bachelor’s graduates. The committee identified key concepts that would be important for all students, including set theory and basic logic; multivariate thinking (via functions and graphical displays); basic probability theory and randomness; matrices and basic linear algebra; networks and graph theory; and optimization.

Statistics was also seen as foundational to data science. Key concepts identified by the committee include variability, uncertainty, sampling error, and inference; multivariate thinking; non-sampling error, design, experiments, biases, confounding, and causal inference; exploratory data analysis; statistical modeling and model assessment; and simulations and experiments.

The third chapter of the report focused on how to develop courses (e.g., data science for all, introduction to data science) and programs (e.g., certificates, minors, and majors) that would provide flexible pathways for students. The fourth chapter reviewed challenges and barriers that need to be addressed in developing data science programs. The fifth chapter reiterated the key role that formative and summative assessment and faculty development play in advancing data science.

What are the implications of the report and the growth of undergraduate data science for statisticians and the IMS? De Veaux et al (2017) noted that: “Students should understand the basic statistical concepts of data collection, data wrangling, data analysis, modeling, and inference. … Successful graduates should be able to apply statistical knowledge and computational skills to formulate problems, plan data collection campaigns or identify and gather relevant existing data, and then analyze the data to provide insights.”

More work is needed to create courses and flexible pathways that can provide sufficient mathematical and statistical background without a long succession of prerequisite courses, while also ensuring that students have strength in algorithmic thinking, data technologies, and domain knowledge.

The report notes that data science is in a formative development stage with robust growth likely. Academic institutions are recommended to “embrace data science as a vital new field” and “provide and evolve a range of educational pathways to prepare students for an array of data science roles in the workplace” (NASEM, 2018).

More discussion is also needed about future preparation at the graduate level, to ensure that interested data science graduates at the bachelor’s level are able to matriculate and successfully complete doctoral programs in statistics.

At a time when many (most?) institutions are pioneering data science programs, it is important for mathematical statisticians to ensure that they are part of the process of attracting students with varied backgrounds and degrees of preparation and preparing them for success in a variety of careers.

—

**References: **

De Veaux, R., et al. (2017). Curriculum guidelines for undergraduate programs in data science. *Annual Review of Statistics and Its Application*, **4**: 15–30. https://www.annualreviews.org/doi/abs/10.1146/annurev-statistics-060116-053930.

Donoho, D. (2017). 50 Years of Data Science, *Journal of Computational and Graphical Statistics*, **26**:4, 745–766. doi:10.1080/10618600.2017.1384734.

National Academies of Sciences, Engineering, and Medicine (2018). *Data Science for Undergraduates: Opportunities and Options*. Washington, DC: The National Academies Press. doi:10.17226/25104.

These are interesting times for statistical science departments throughout the world. The demand for a statistician’s expertise is at an all-time high across a multitude of sectors: tech, finance, health and, not least, government and academia. This gives us the power to grow and prosper as long as we can adapt quickly enough to a rapidly changing environment that may seem challenging to the more traditional aspects of our culture.

But I bring good news from Toronto! Just like our city has had to manage enormous growth over the last decade, so has our department had to navigate the tumultuous waters of growth and change that carry potential hazards but also great promise. So how did the University of Toronto’s Department of Statistical Sciences manage to accelerate from (at most) two job searches a year to over 30 in the last five years, and build a large network of joint interests with other departments, including the usual suspects (e.g., Computer Science and Public Health) but also Astronomy and Astrophysics, Information Sciences, Psychology, and Sociology, to name just a few? Read on.

The story begins about eight years ago, when the world learned from Google’s chief economist Hal Varian that statisticians would be the sexiest professionals at the vortex of the tech revolutions engulfing the world. Subsequently, the world learned that Data Science is much more than a semi-pleonastic association of terms. From Tukey’s wishful thinking—nicely summarized in David Donoho’s seminal 2015 piece, “50 Years of Data Science” (published in *JCGS*, 2017), which went viral before it even hit the printing presses—to Silicon Valley’s high demand for scientists able to swim in a sea of data, emerged a vague, yet seductive idea about the kind of training that is suitable for the modern world. It turns out that the *ideal* is a multifarious scientist capable of handling superhuman computing tasks and juggling sound statistical methods while building authoritative subject-matter knowledge that can impact scientific discovery.

Incoming students at the University of Toronto have heard the call loud and clear, which is why our specialist, major and minor programs have exploded in size. Since 2012, our undergraduate programs have grown by 25–40 percent *yearly*, leading to the current cohort of over 3,500 statistics major, minor and specialist students. Yes! You have read those numbers correctly, and if it gives you pause as a neutral observer, imagine how *we* felt.

The pressure on our department has been enormous. Instead of panicking, the former chair of our department, James Stafford, coordinated an ambitious plan to develop professional Master of Science programs, make numerous joint hires with other data-rich programs in the University, and develop a strong undergraduate teaching culture with new course initiatives that have quickly turned us into a showcase for the Faculty of Arts and Sciences. A quick visit to our webpage (www.utstat.toronto.edu) will reveal the extent of these successful initiatives and the unparalleled (to my knowledge) growth of our department. We currently have over 30 research- and teaching-stream faculty in the department, working in statistics and its intersections with computer science (machine learning, visualization, neural nets, etc.) and other disciplines, such as sociology, genetics, neuroscience, public health and more. These departments have a genuine need to make sense of their data, usually big piles of them. In turn, working with other departments opens new horizons for our faculty, introducing them to interesting problems and allowing our department to grow in directions that would have remained unexplored otherwise. The mantra “follow the data” has never been more fitting than now.

Of course, these innovations could have led to many outcomes, somewhat less scintillating. As we take a retro- and prospective look at our evolution, we realize that at the core of our story lies a penchant for data-driven vision augmentation, as well as many wonderful students, faculty and staff, who have bootstrapped this department into becoming an important center for Data Science research and education.

Speaking of very good people, I would like to encourage your undergraduate trainees to apply to our graduate program, while we invite applications from your PhD students and postdocs for our eight open positions.

Thriyambakam Krishnan, former Dean of Academic Studies and Professor at the Indian Statistical Institute, passed away in Chennai, India on July 4; he was 81. Jointly with G.J. McLachlan he authored the masterly and popular Wiley text on the EM algorithm and its applications. Dr. Krishnan, a PhD student of C.R. Rao, specialized in experimental design, classification, and unsupervised clustering in his early years. He served as consulting statistician to Systat and to Mu-Sigma in India. At the time of his death, he was still teaching courses in probability and statistics at the Chennai Mathematical Institute.

After more than six years of being published through a cooperative agreement between the IMS and the Institute for Operations Research and the Management Sciences (INFORMS) Applied Probability Society (APS), Stochastic Systems has become an INFORMS journal. The mission of the journal remains the same: Stochastic Systems is the flagship journal of the APS. It seeks to publish high-quality papers that substantively contribute to the modeling, analysis, and control of stochastic systems. A paper's contribution may lie in the formulation of new mathematical models, in the development of new mathematical or computational methods, in the innovative application of existing methods, or in the opening of new application domains. Stochastic Systems is open access and there are no submission fees or page charges.

Relative to other applied probability outlets, Stochastic Systems focuses exclusively on operations research content. Relevant work includes, for example, queueing theory, as well as papers that explore the ties between applied probability and statistics, optimization, or game theory. (These are just a few examples.) We are also interested in papers in a range of application areas, including, but not limited to, service operations, healthcare, logistics and transportation, communications networks (including the Internet), computer systems, finance and risk management, manufacturing operations and supply chains, market and mechanism design, revenue management and pricing, the sharing economy, social networks, and cloud computing.

The editorial board provides direct evidence of our breadth of interests, and if you don't see someone in an area that covers your paper, then why not contact me (sgh9@cornell.edu)? I'll provide you with a quick decision about the relevance of your paper to the journal.

**Why should you publish your work in Stochastic Systems?**

1. All papers published in Stochastic Systems are open access. There are no submission fees or page charges. Published papers are available at http://pubsonline.informs.org/journal/stsy where you can also find author guidelines and much more.
2. Stochastic Systems aims to return reviews to authors within three months of submission, and to publish articles online within two months of acceptance. I am proactive in striving to keep to this timetable. Issues appear quarterly.
3. The editorial board of Stochastic Systems includes several IMS members. Indeed, one of our goals in publishing the journal is to strengthen the ties between the IMS and the APS.
4. Stochastic Systems is publishing, and will continue to publish, work of the highest quality. This quality is in evidence in its editorial board, which includes five Fellows of INFORMS (Jim Dai, Paul Glasserman, Peter Glynn, John Tsitsiklis, Ruth Williams), three Fellows of the IMS (Dai, Glynn, Williams), three Fellows of the IEEE (Bruce Hajek, R. Srikant and Tsitsiklis), and four members of the National Academy of Engineering (Glynn, Hajek, Frank Kelly, Tsitsiklis)! See the journal website for the full editorial board.

I’m looking forward to receiving your submission. Submit your work at http://mc.manuscriptcentral.com/ssy/


So, how was I going to write a Presidential address? The best way to start seemed to be to see what my predecessors had said, and so I took to the internet. There I found a series of articulate and thoughtful pieces, each addressing issues of importance to our profession, with, understandably, a particular recent emphasis on data science. I was left more than a little intimidated. I toyed with trying to convince the authors that plagiarism was the highest form of flattery, but decided that this was not an appropriate solution, and so, instead, what follows is a somewhat more personal reflection on statistics, probability and, of course, the IMS.

Nonetheless, I shall draw upon the 2015 address of Erwin Bolthausen. Erwin opened with a question:

If one opens any scientific work about a topic where statistics plays a role, there are usually probabilistic concepts behind. How does it then come that probability theory and statistics, in research, have become more and more separated? The answer is to some extent evident:

Probability theory has nowadays many relations with other mathematical fields, and also with applied fields outside statistics.

For modern statistics, probability is just one crucial basis, but there are many more, often also non-mathematical ones. For instance, one has to decide which probabilistic models lead to computationally feasible procedures, and still mirror reality closely enough. This cannot be answered by probability theory.

He then goes on to remark on the lack of statistical training amongst probabilists, at least in Europe, and to say that:

I think that in the modern development of probability, the relations with pure mathematics and with mathematical physics have become stronger than those with statistics.

I would perhaps put a slightly different *spin* (a pun for those who have read Erwin's excellent summary of some of the most spectacular developments of modern probability theory) on the situation. And, of course, given my own research interests, I have to mention the tremendous impact that biology has had on all of the mathematical (including statistical) sciences.

It is true that both statistics and probability have grown at an unprecedented rate in recent years—we are told that this is the golden age of statistics; equally, we see probabilistic approaches underpinning huge swathes of modern mathematics. And, conversely, statistics and probability are calling upon a broader and broader range of techniques from the rest of the mathematical sciences—for example, one needs a far greater understanding of abstract algebra than mine to really come to grips with rough paths (or more generally, Martin Hairer’s regularity structures). In turn, the father of rough paths, Terry Lyons, now spends a large part of his time at the Alan Turing Institute in London (the UK’s National Institute for Data Science and Artificial Intelligence), developing applications of rough paths to data science.

The importance of statistical and probabilistic techniques across science has never been more widely recognised. I thought that I'd share a quote, which I learned from Sebastian Schreiber of the Department of Ecology at UC Davis. It is from the English poet and dramatist John Gay (1685–1732):

Lest men suspect your tale untrue, Keep probability in view.

I confess that including this here is a little gratuitous, but I rather like it. (I’m also trying to match Erwin’s earliest citation. I’m failing of course: Erwin referenced *Ars Conjectandi*, published in 1713, but this quote is from *“The painter who pleased everybody and nobody”*, Fable XVIII in Gay’s first book of fables, published in 1727.) Here’s the whole of the first verse:

Lest men suspect your tale untrue, Keep probability in view.

The traveller leaping o’er those bounds, The credit of his book confounds.

Who with his tongue hath armies routed, Makes even his real courage doubted:

But flattery never seems absurd; The flattered always take your word:

Impossibilities seem just; They take the strongest praise on trust.

It seems to me that, despite when he lived, Gay had the makings of a statistician as well as a politician. And perhaps there is a lesson for data science there. Gay tells us that if we make unrealistic claims, people won’t even believe in the results and applications that we really do have. Certainly a very large part of the UK Industrial Strategy seems to be founded upon the claim that data science (and perhaps more especially AI and Machine Learning) is the solution to all our economic and social woes; but if we are really to rely upon data science as an underpinning technology in our daily lives, so that those lives very literally depend upon it, then it must maintain its scientific integrity: “Keep probability in view”.

My point is that I don't think that it is a problem that each and every one of us is part of a huge intellectual landscape, of which we only *really* understand some very small part. In fact, we have been in this boat for a long time. Science is a massive continuum; gone are the days of renaissance (usually) man, mastering all of natural philosophy (and probably writing a few poems on the side). There is simply too much of it. As long as we can communicate with one another, we can combine our expertise, often to stunning effect.

When I was a young Research Fellow in Oxford, in the College where the great mathematician G.H. Hardy had been a Professor, other Fellows were a little too fond of reminding me that Hardy had believed that mathematicians did all their great work before the age of forty.

Young men should prove theorems, old men should write books.

I found this rather depressing—but a distinguished Professor, already considerably beyond his best-before date if we were to believe Hardy, explained to me that yes, mathematical scientists are supposed to be at the height of their computational powers when young, but there is also a “knowledge” distribution—as we grow older we know more and more (I confess that I am no longer convinced that this increases indefinitely); our power as scientists is a convolution of these distributions. In fact, these days many of us work in teams; we can combine the breathtaking speed and skill of our students with the knowledge and experience of scientists with complementary expertise. Some of our colleagues, especially at the more applied end of the field, are involved in huge collaborations. They are just one part of a complex picture—but a vital part.

We also have to be open to the idea that work of no immediate obvious practical benefit may still be of lasting importance. Hardy, of course, believed that mathematics would never be of any practical use and indeed he took pride in this:

No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world.

He was wrong, of course; his work has been of tremendous practical importance. But he also knew about teams (not just through his cricket team, “the Mathematicals”). He collaborated widely and, especially impressive for the time, internationally. His network included Pólya, Cramér, Wiener. I wonder how he would have felt to learn that probably more people know his name through the Hardy–Weinberg equilibrium than through his profound contributions to analysis and number theory.

Teamwork requires a team. That in turn requires the networks that Hardy so successfully cultivated. And I claim that here the IMS has two important roles to play. The first, most obvious, is to facilitate and cultivate those networks (at least within statistical science). This is the purpose of **IMS groups**. They have suffered varied fates, but what is clear is that the responsibility for running a group cannot rest on a single person’s shoulders indefinitely—without some sort of scaffold, the whole thing lasts for as long as the energy of its founder. So when Sofia Olhede and Patrick Wolfe offered to take over the **Data Science Group** last year, we agreed that we would put in place a minimal governance structure, borrowed to some extent from that of the highly successful **New Researchers Group**. If, as I hope, the group flourishes, then their terms of reference can provide a loose template for others.

Crucially, the Data Science Group has an executive committee that not only reflects the geographic diversity of the IMS, but also the immense range of interests overlapping data science, including high-dimensional statistics, biostatistics, algebraic statistics, Bayesian methods, big observational data, probabilistic methods, and post-secondary education in data science. We really want this to succeed, and I’d like to thank Sofia and Patrick for everything that they have done so far—anyone who has put together a scientific program committee will know just how hard it is to achieve subject, geographic, and gender diversity, but with their executive committee they have managed just that. It has taken a while for us to get this far, but the Data Science Group has an email address [datascience@imstat.org] and a website [http://groups.imstat.org/datascience/], and a session set up at the JSM in Vancouver; and much more is planned for next year.

Groups are only one way in which the IMS can help. The pressure on less established (and possibly also more established) researchers is to publish large numbers of papers, which in turn incentivizes a very narrow and specialized view. This can lead to a very distorted view of the world. Another distinguished Oxford mathematician had a very nice allegory of this. Imagine you are in a dark room. You turn on a lamp at your desk and the objects that are illuminated suddenly seem large and important. If you move the light and shine it somewhere else, other objects take on more importance. When someone turns on the overhead light, you see that everything is equally important (and that what you were looking for was not under your light at all). Although most of us will always be rather specialized, we need to be able to stand back and take a broader view.

Of course, the annual **IMS New Researchers Conference** is a great contribution here—people from very different backgrounds meet and exchange scientific ideas, as well as tips on how to navigate the treacherous waters of academia, in a relaxed environment. But this only reaches a relatively small number of people. Another extremely important contribution is through our scientific programmes and, in particular, the special lectures. The **IMS Medallion and named lectures** (some in collaboration with the Bernoulli Society) provide an opportunity to showcase, at an accessible level, some of the most exciting research from across our discipline. More thanks are certainly in order here, both to the New Researchers Group, for inviting me along to their meeting again this year, and to the Committee on Special Lectures, who have once again put together a phenomenal slate of lectures.

I want to say a little more about governance. I have spent much of my academic life in Oxford. If anyone wants to learn about the ways in which an Oxbridge [i.e. Oxford or Cambridge] College is governed, they should take a look at *Microcosmographia academica*, written by Francis Cornford, the husband of Charles Darwin's granddaughter Frances. It is a satire on university politics which came out in 1908, but in places it still rings very true. Cornford is very good at drawing out the reasons for doing nothing. I particularly like the Principle of Unripe Time: the argument is that although a particular action should be taken, now is not quite the right moment to take it. He goes on to say,

Time, by the way, is like the medlar [a fruit]; it has the trick of going rotten before it is ripe.

The extraordinarily long institutional memories in Oxbridge Colleges, combined with the application of Cornford's principles of academic government, can make it extremely difficult to effect change. The IMS has almost the opposite problem. For us, institutional memory is rather short. We (rightly) have three-year terms of office on essentially all our committees, and because we are spread across the globe, new members of committees don't routinely run into old-timers and gossip at the water cooler.

Here, I want to point to a specific issue. We have just announced the election of a list of twenty fantastic new Fellows. Every single one of them is an outstanding scientist and deserves their election. However, there are only two women among them. This is not the fault of the Committee on Fellows, who performed their charge admirably; only two women were in fact nominated. This prompted us to take a look at nominations over a longer period, and we see a cyclical trend. It seems that numbers fall, action is taken (probably by an individual), numbers increase, things look fine, everyone relaxes, numbers fall.

I have never thought of myself as a diversity champion, and don’t find it to be a particularly comfortable role. Indeed, I find the issues to be extremely difficult. But in 1867 the British philosopher and political theorist John Stuart Mill delivered an inaugural address at the University of St Andrews in which he said,

Let not any one pacify his conscience by the delusion that he can do no harm if he takes no part, and forms no opinion.

He goes on to say something similar to a dictum often (incorrectly it seems) attributed to the British statesman Edmund Burke around 1800:

All that is required for the triumph of evil is that good men do nothing.

Evil is a strong word. Even Cornford only regarded women as the *second* most dangerous threat to the young academic; young men in a hurry (those data scientists not paying heed to the statistical underpinnings, perhaps?) came in at number one—but it is certainly wrong that more women are not being nominated. Burke did say something relevant:

No man, who is not inflamed by vain-glory into enthusiasm, can flatter himself that his single, unsupported, desultory, unsystematic endeavours are of power to defeat [evil].

So I am certainly, rather belatedly, willing to step forward and do my bit. But I can’t do it alone and I have been trying to get some structures in place to help us to break the cycle. Council has agreed guidelines on unconscious bias and conflicts of interest for all our committees, and to the creation of a diversity committee. But we all need to be much more proactive, especially in seeking nominations.

I hope that an allegory of my own will prove appropriate. There is a very famous bridge [*pictured above*] in Cambridge, England, often called “the mathematical bridge.” It was designed not, as many believe, by Newton, but by William Etheridge (a distant relative) in 1748, and built by James Essex in 1749. The design is based on that of the wooden structures used at the time as supports upon which to build stone bridges. Etheridge himself was a carpenter who worked on the building of the first bridge to cross the River Thames at Westminster in London. He is credited with inventing an underwater saw to cut through the piles that ran down into the riverbed, so that once the stone bridge was complete, the wooden supporting structure could be lowered into the water and floated away. Maybe one day we’ll be able to allow the diversity bridge to float down the river…

Diversity means much more than gender. Diversity is ethnicity, geography, discipline, … Our strength as a society comes from that diversity. It is not without its challenges, but equally there is no doubting the rewards. It is also not just about diversity among the Fellows and across our scientific programme; we need diversity across our membership—the broadening of the membership is lagging behind the broadening of our journals. Part of the issue in Europe is the division between mathematics and statistics in our degree structures and training. Erwin pointed to the lack of statistical training among European probabilists, and I think this leads to a lack of proper appreciation and respect.

I'm sure that when I was a graduate student I would have taken much less pride in my own intellectual "family tree" than I do now: I was supervised by David Edwards, who led the functional analysis group in Oxford. He was supervised by David Kendall, who was supervised by Maurice Bartlett, and continuing backwards we have John Wishart, Karl Pearson and Sir Francis Galton. Not a bad line-up. In fact, I only discovered this fairly recently and realised that, far from being rebellious and striking out on my own when I left Oxford and functional analysis behind, in this company I was merely reverting to type.

I have heard some question the relevance of the IMS to probabilists (although I notice that those asking routinely publish in our journals). I am still far from being a statistician—most of my work is either concerned with infinite-dimensional stochastic processes or population genetics (or more likely both) and I am still acutely aware of my lamentable lack of knowledge of statistics—but that doesn't reduce the relevance of the IMS to me. In fact, I had a quick scan of the papers I've written in the last couple of years and realised that I had used results of at least six past-Presidents, not to mention the person who talked me into accepting the role of President, and several of our new Fellows. And, yes, I try to publish in our journals, because they are simply the best.

I’d like to reiterate to non-members that by joining the IMS, one is in a position to help shape the scientific programme, the future of our journals, and the contribution to the profession of this great society.

Fortunately, I didn’t set out with grand aims when I accepted the gavel from Jon just under a year ago. I hope that some of the things that have been initiated will reach fruition over the next year. Of course, Presidents don’t do the real work, all they do is set up an *ad hoc* committee and issue them with a charge, and I would like to say a huge thank you to all those colleagues who responded so generously to my requests for input, especially the very large number who I have only ever ‘e-met’, and so have never been able to thank in person.

As I come to the end of my term as President, I still have an immense sense of pride in the IMS. And so I’ll end by plagiarising my own piece in the *Bulletin* a year ago: The IMS is a badge of academic quality; we publish outstanding journals and our Committee on Special Lectures have once again excelled themselves in their contribution to the scientific programme. But most of all, the IMS is a community of scholars that supports and nourishes talent from right across the spectrum of our discipline. Long may it thrive!


The next President-Elect is **Susan Murphy**, and the five new members of Council are: **Christina Goldschmidt, Susan Holmes, Xihong Lin, Richard Lockhart and Kerrie Mengersen**. All of these will serve a three-year term starting at the IMS meeting in Vilnius in July 2018.

Christina, Susan, Xihong, Richard and Kerrie will replace, on Council, Andreas Buja, Gerda Claeskens, Nancy Heckman, Kavita Ramanan and Ming Yuan, whose terms end in July 2018. They will be joining Jean Bertoin, Song Xi Chen, Mathias Drton, Elizaveta Levina and Simon Tavaré (whose terms run through July 2019); and Peter Hoff, Greg Lawler, Antonietta Mira, Axel Munk and Byeong Park (who will be on Council until August 2020).

In addition to these elected members, IMS Council is made up of the Executive Committee and the Editors, who serve ex officio. The Executive Committee will, from August, comprise:

President: **Xiao-Li Meng**

Past President: **Alison Etheridge**

President-elect: **Susan Murphy**

Treasurer: **Zhengjun Zhang**

Program Secretary: **Ming Yuan**

Executive Secretary: **Edsel A. Peña**

Jon Wellner, outgoing past-president, will be leaving the Executive Committee after three years’ service. Judith Rousseau will be stepping down after six years as Program Secretary, replaced by Ming Yuan.

The IMS journal editors are: Bálint Tóth (*Annals of Applied Probability*); Amir Dembo (*Annals of Probability*); Tilmann Gneiting (*Annals of Applied Statistics*); Ed George and Tailen Hsing (*Annals of Statistics*); and Cun-Hui Zhang (*Statistical Science*). T.N. Sriram is Managing Editor for Statistics & Probability, and for the *IMS Bulletin*.

Thanks to all the outgoing officers, editors and council members for their dedicated service to our institute. Thanks, too, to all the candidates, and all who voted!

*Left to right: Noel Cressie, Kerrie Mengersen, Ruth Williams*

**Noel Cressie** is a world leader in statistical methodology for analyzing spatial and spatio-temporal data, and its applications to environmental science. His fundamental contributions changed the basic paradigm for analyzing observations in space and space-time. Noel has also contributed to research on pollution monitoring, climate prediction, ocean health, soil chemistry, and glacier movement, and is a NASA Science Team Member for the Orbiting Carbon Observatory-2 mission. Responding to the huge volumes of complex data in environmental research, Noel has made ground-breaking innovations for big data analytics for remote sensing and climate change.

**Kerrie Mengersen** has made internationally recognised contributions to the field of Bayesian statistics. She has consistently maintained a dual focus on statistical methodology and its application, with methodological contributions at the frontier of Bayesian theory, methodology and computation, and applied contributions to substantive problems in health, environment and industry. Kerrie is also well known for her leadership ability and passion for developing young researchers in statistics and the applied sciences.

**Ruth J. Williams** has been admitted to the AAS as a "Corresponding Member" (this is a special category within the Fellowship, comprising eminent international scientists with strong ties to Australia who have made outstanding contributions to science) for outstanding scientific contributions to her field. Ruth was born in Australia. Her work has had a deep and lasting impact on heavy traffic analysis within the field of stochastic networks. In 2016, she was awarded the John von Neumann Theory Prize "for seminal research contributions over the past several decades, to the theory and applications of stochastic networks/systems and their heavy traffic approximations."