Statistics in the Age of AI: Reflections from the GW Panel
I was invited as a panelist to GW Statistics Department's 90th birthday conference on "Statistics in the Age of AI". Here are my thoughts from the panel discussion and the ideas that shaped my responses.
I'd like to thank the conference organizers, Xiaoke, Judy, Tanya, and Subrata, again for the invite. The panel was chaired by Ron Wasserstein; the other panelists were Gizem Korkmaz, Jiayang Sun, and Jenny Thompson.
Speaking at the GW Statistics 90th anniversary panel (that's Jenny Thompson next to me)
Overall Take
Compared to the other panelists, I had very ambitious/crazy AGI timelines. I expect to have AI models as capable as COPSS-winner-level statisticians within 3-10 years (4 years is my median estimate; 10 years is my 99.99% confidence point). This should change statisticians' overall approach to the discipline, and how we should be teaching statistics. All of my answers below are based on these assumptions. Given the other AGI timelines out there, I don't find my estimates particularly aggressive.
With that in mind, here are the panel questions that I participated in. I'm listing each question and my take below.
Should we continue investing in statistical theory as a core research priority—or pivot toward building tools and heuristics that solve problems one at a time?
tl;dr: No, statisticians should not pivot, but the nature of theoretical work will (and needs to) change.
- Most theory doesn't result in applications, and that's ok. We have to be honest about theory's track record: only a small fraction (10%, maybe more?) turns into good or great statistical applications. That's completely fine, and that's how things work. People pursue these hard mental problems because they are hard and interesting, and it's almost impossible to foresee in advance which theory will yield the best applied solutions. So, yes, statisticians doing pure theory should keep at it.
- Statistical theory is finding its use in the development of AI. High-dimensional statistics helps with mechanistic interpretability: we've been working on sparse recovery for 20+ years, and now it's crucial for understanding what's happening inside billion-parameter models. There are massive open questions around evaluation too. Current AI benchmark metrics are often overfit to their test sets and are reported without uncertainty quantification (see the sketch after this list). How do we know if a model will work on problems it hasn't seen? Statisticians have been tackling generalization for decades. Even small improvements matter enormously here: when training runs cost billions, a 5% efficiency gain from better statistical methods saves hundreds of millions.
- Once AGI is here, work will be very different. Imagine having dozens of superstar PhD students who can work 24/7 without getting tired. That's what's coming. Theory development won't be a solo activity anymore - it'll be about managing and directing AI research assistants. If you're a great theorist, start thinking about how you'd leverage a team like that. How do you break down problems for AI to tackle? How do you verify their work? The best theorists will become research directors, not individual contributors.
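To make the uncertainty-quantification point above concrete, here is a minimal sketch in Python. The per-question results are simulated, and `bootstrap_ci` is just an illustrative helper introduced for this example, not part of any benchmark's tooling; the point is simply that a percentile bootstrap turns a single leaderboard accuracy into an interval.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated per-question outcomes from a benchmark run (1 = correct,
# 0 = incorrect); a real eval harness would supply these.
results = rng.binomial(1, 0.82, size=500)

def bootstrap_ci(x, n_boot=10_000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for a mean."""
    boot_rng = np.random.default_rng(seed)
    boot_means = np.array([
        boot_rng.choice(x, size=len(x), replace=True).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return x.mean(), (lo, hi)

acc, (lo, hi) = bootstrap_ci(results)
print(f"accuracy = {acc:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
# A leaderboard gap narrower than this interval is hard to
# distinguish from sampling noise.
```

With 500 questions the interval is a few percentage points wide, which is larger than many reported model-to-model gaps; that is exactly the kind of observation statisticians should be pushing on.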
How can we redesign statistics curricula to meet the demands of an AI-driven world without losing the soul of the discipline?
tl;dr: We don't change the curriculum, but we need to reinvent the classroom experience.
In 5 years, ChatGPT will be smarter and more knowledgeable than most instructors. What does the classroom experience look like then? Why would college students still wake up in the morning and go to class?
There is one way I can see this happening: classes are truly interactive, with some form of active learning in which each student also uses LLMs, AND they are centered on discussions between students where the lecturer acts as a guide (think HBS case studies). Approaching a statistical problem requires computational thinking (breaking down the problem, abstracting away details, etc.), so why can't that be the focus of the class? Once LLMs are better at modeling data than we are, most students probably won't need to learn how to model; instead we need them to be critical consumers of statistics, i.e. people who can evaluate the models, analyze and improve the approach, and identify wrong claims or assumptions. These skills are teachable through interactive guided sessions and mentorship from lecturers, and they are very hard to replace with an experience that consists solely of a computer screen and a chatbot.
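As a toy illustration of what "critical consumer" could look like in such a class, here is a sketch under assumed, simulated data and an off-the-shelf scikit-learn regression (nothing here comes from the panel itself). The fitting step is exactly what an AI assistant can automate; the student's contribution is asking whether the claimed fit survives held-out data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Simulated data: only the first feature carries signal;
# the other 49 are pure noise.
n, p = 120, 50
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + rng.normal(size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Suppose an AI assistant proposes "just regress y on everything".
model = LinearRegression().fit(X_train, y_train)

# Evaluating the proposal, not fitting it, is the student's job:
# the impressive in-sample fit largely evaporates on held-out data.
print("in-sample R^2:", round(model.score(X_train, y_train), 3))
print("held-out R^2: ", round(model.score(X_test, y_test), 3))
```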
AI also enables something new: personalized classes. Classes can focus on core concepts and the big picture, then students branch out based on their interests. Theory-lovers can explore proofs with AI tutors. Applied folks can dive into real datasets. Everyone gets what they need without slowing anyone down.
How do we thoughtfully incorporate AI tools—like ChatGPT—into our professional activities, such as research, technical writing, and teaching?
tl;dr: Go all in on AI tools but be radically honest about your use.
Statisticians should fully embrace AI in pretty much everything. Definitely let AI handle the boring stuff: writing boilerplate code, cleaning messy datasets, and drafting the paper sections you're struggling with. But as models improve, statisticians should also use AI to generate and evaluate ideas and hypotheses, develop complex codebases, and help with interdisciplinary work (both in communication and in execution). Statistics, with its focus on algorithms, coding, and math, is perfectly positioned to benefit from this.
But here's the key: the community needs radical honesty about AI use. If ChatGPT helped write your paper, say so. We need to avoid AI-to-AI busy work loops. Instead, use AI to actually move your thinking forward. Have it generate 20 approaches while you pick the best one. Let it critique your experimental design. Share what works too - which prompts, which tools, which workflows. We're all figuring this out together.
These were the main questions I tackled during the panel. The other panelists brought different perspectives; there was definitely some caution about how soon we will get these AI capabilities, but nobody called me out on my timelines, so I'm assuming they somewhat agreed. Here's hoping that statistics navigates this new transition well.
AI Notes: I used Claude Opus 4 in editing this article.