Faculty Q&A: Sameer Deshpande on the present and future of sports analytics

Assistant Professor Sameer Deshpande (photo courtesy of Deshpande)

In recent decades, sports has become an increasingly popular domain for applying advanced statistical techniques. The world of sports analytics has exploded, and its takeoff was just beginning as Assistant Professor Sameer Deshpande was starting his PhD at the University of Pennsylvania.

What began as a class project assessing NBA player performance sparked Deshpande’s interest in using statistical modeling to evaluate decisions and outcomes in sports. He has since pursued a research focus in sports analytics, including explorations of baseball, basketball, and football. He has conducted this work as an assistant professor in the Department of Statistics since 2021.

Deshpande sat down to discuss his background and entry into the sports analytics field, insights from studies on baseball decision-making and football’s health risks, and promising analytical frontiers like leveraging player tracking data. He also reflected on the collaborative environment at UW-Madison that has empowered him to do impactful, interdisciplinary work at the intersection of sports and statistics. The following conversation has been condensed and edited for length and clarity.

Talk a bit about your background before coming to UW-Madison. What led you to join the faculty in the Department of Statistics here?

I knew from the time I started my undergraduate degree that I wanted to go to graduate school and be a professor. I started as a mathematics major at MIT, doing a lot of pure math. Gradually, I gravitated toward statistics because I realized I was very happy as a consumer of mathematics, but maybe not so much as a producer.

I really started my training in statistics in graduate school, largely from the ground up as I worked toward a PhD. I also found a research focus in Bayesian statistics and laid a nice foundation for future research during this time.

Fast forward to 2020, and I had completed my PhD at Wharton and was wrapping up a postdoctoral research role. I saw UW-Madison had an opening, but I didn’t know where Madison was. I applied, and the people were a big selling point right off the bat. During the interview process, I had such lovely interactions with everybody and it just seemed like a really nice place to live and to work. And that’s borne out.

One of your research interests is sports analytics, or using statistical methods to analyze sports-related data. What sparked your interest in this area?

I’ve always been a sports fan, and when I was an undergraduate, sports analytics were just starting to permeate the broader culture. When I started graduate school, I thought it would be fun to do some deeper analysis of sports data. Abraham Wyner and Shane Jensen were working in this area and showed me what academic sports analytics research looked like. They both became very influential in my career.

One of my first research projects was inspired by Dr. Jensen, who had written a paper assessing how we rate hockey players and adjust statistically for the fact that some players have very good teammates. This idea is now known as adjusted plus/minus (+/-).

Looking at NBA data, the project allowed us to understand that not only do we have to assess performance vis-à-vis who you’re playing with, but we also have to account for the context in which you’re playing. For instance, if you’re scoring a lot of points toward the end of a 30-point blowout, it doesn’t matter as much, because the game has been decided. So we had to do adjusted +/- on a particular scale that accounted for context. This project got me excited about working with sports data, and it also introduced me to Bayesian statistics, which is an approach I’ve used heavily in my career ever since.
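To make the idea of adjusted +/- concrete, here is a minimal sketch. It assumes a ridge regression of each stint's outcome on indicators for who was on the floor (+1 for home players, -1 for away players). This is one common way to set up adjusted +/-, not necessarily the model used in the project described above. The player names, the outcome values, and the `ridge` setting are all hypothetical. The outcome is assumed to be measured on some context-aware scale, so that points scored late in a blowout move it very little.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_plus_minus(stints, players, ridge=1.0):
    """Ridge-regress stint outcomes on who was on the floor.

    Each stint is (home_players, away_players, outcome), with the outcome
    measured from the home team's perspective.
    """
    index = {p: i for i, p in enumerate(players)}
    k = len(players)
    # Start X'X at ridge * I so the system is always solvable.
    xtx = [[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
    xty = [0.0] * k
    for home, away, outcome in stints:
        row = [0.0] * k
        for p in home:
            row[index[p]] = 1.0
        for p in away:
            row[index[p]] = -1.0
        for i in range(k):
            xty[i] += row[i] * outcome
            for j in range(k):
                xtx[i][j] += row[i] * row[j]
    return dict(zip(players, solve(xtx, xty)))


# Hypothetical two-on-two stints; outcomes are invented for illustration.
players = ["A", "B", "C", "D"]
stints = [
    (["A", "B"], ["C", "D"], 0.10),
    (["A", "C"], ["B", "D"], 0.06),
    (["A", "D"], ["B", "C"], 0.08),
]
ratings = adjusted_plus_minus(stints, players, ridge=1.0)
for name in players:
    print(name, round(ratings[name], 3))
```

Because every player appears with different teammates and opponents, the regression can separate an individual's contribution from the quality of the lineup around them. The ridge penalty plays the same role as a zero-mean Gaussian prior on each player's effect, which is one place where a Bayesian viewpoint enters naturally.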

You’ve also looked at plate discipline in baseball and developed a way to statistically evaluate which pitches batters should swing at and which they should not. Can you explain what you found there?

A lot of times, we evaluate decisions based on the result, like when the Seahawks passed the ball instead of running it at the end of the 2015 Super Bowl. In baseball, if a batter hits a home run, in retrospect it seems like swinging at the pitch was always the right choice in that situation. But was it? It’s difficult to say, because there are many possible paths an at-bat can take. A player can swing and miss, hit a pop fly, ground out, or foul it off, in play or out of play.

Figure 2 from Yee & Deshpande (2023): Framework for modeling the outcomes of a pitch.

Our work tried to do a full probabilistic treatment of the different paths that an at-bat can follow in order to determine whether swinging at a given pitch increases a team’s run expectancy. We considered a lot of different variables in our models. For instance, we might consider a 3-2 pitch thrown down and away in the bottom of the 7th inning, with two runners on and the batting team trailing by 5. In addition, we needed to quantify the uncertainties inherent in a baseball game, and our Bayesian treatment allowed us to do this neatly.
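The comparison described above can be illustrated with a small expected-value calculation. This is only a sketch: the outcome categories, probabilities, and run values below are invented for illustration and are not taken from the study, and it uses single point values where the study describes a Bayesian treatment that also quantifies uncertainty.

```python
def expected_runs(outcome_probs, run_value):
    """Average the run value of each outcome, weighted by its probability."""
    total = sum(outcome_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * run_value[outcome] for outcome, p in outcome_probs.items())


# Hypothetical expected runs for the rest of the half-inning after each outcome.
run_value = {
    "miss": 0.30,           # swinging strike three
    "called_strike": 0.30,  # called strike three
    "foul": 0.55,           # count stays 3-2
    "out_in_play": 0.32,
    "hit": 1.10,
    "ball": 0.90,           # walk
}

# Hypothetical outcome probabilities for a 3-2 pitch thrown down and away.
swing_probs = {"miss": 0.25, "foul": 0.35, "out_in_play": 0.25, "hit": 0.15}
take_probs = {"called_strike": 0.20, "ball": 0.80}

swing_value = expected_runs(swing_probs, run_value)
take_value = expected_runs(take_probs, run_value)
decision = "swing" if swing_value > take_value else "take"
print(decision, round(swing_value, 4), round(take_value, 4))
```

The point of the sketch is that the decision is judged by averaging over every path the pitch could have taken, not by the single outcome that happened to occur.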

Let’s talk about football. Can you talk more about the work you’ve done looking at the effects of youth sports on health later in life?

Sure. Some colleagues and I did a large study a few years ago using data from the Wisconsin Longitudinal Study (WLS), a project at UW-Madison that followed 10,317 men and women who graduated from Wisconsin high schools in the 1950s. It tracked all kinds of data, including whether they played youth football. They also had regular follow-ups to evaluate mental and physical health.

What we found in this big observational study was that when you controlled for lots of background variables, the people who played football were about as healthy as the people who didn’t. This was kind of surprising to us, because we thought we would find evidence that any amount of participation is dangerous.

This study launched a multi-year effort, and now we have multiple studies that followed up on this, using a similar approach across different sports. In all of them, we’re trying to figure out, ‘Is there a detectable difference in health outcomes?’ To date, we haven’t found anything in the directions people are expecting. We’re not identifying a large average negative impact.

At the same time, I tend to think that any sport with repetitive head trauma can hardly be considered safe, even if we aren’t seeing large population-level effects in our analysis. There is still a lot more work to be done in this area.

Is there an area of sports analytics that is currently especially exciting or promising to you?

There’s a lot of real-time tracking data across different sports coming in now that we haven’t had before, and that’s incredibly exciting, but often researchers have a hard time accessing it. I think there is a real opportunity for sports leagues to work with academic researchers to allow us to access the data so we can work with them synergistically. People are spending all this money to collect this new data, and it would be a wasted opportunity if they couldn’t do anything useful with it.

Finally, tell us about your experience in the Department of Statistics since you started here in 2021.

It has been very positive. It’s a really stimulating environment. The students are great, and my colleagues are great. The larger research environment at UW-Madison is supportive for junior faculty, and people are collaborative and generous with their time. I’m especially appreciative of our department’s senior faculty, who set the tone from the top of being very welcoming and passionate about their work—and they take a real interest in everything that we do.

Teaching-wise, I’ve had a lot of fun. I’ve been able to update our Bayesian Statistics course (STAT 775) and really think about novel ways to teach that material. We’re now seeing more students from outside the department take the course, which is exciting to me.

Overall, I love the agency and the freedom that’s provided here, and I work with an amazing group of people every day. That’s really the key.

For more information on Sameer Deshpande’s research, visit his website.

To read more about Deshpande’s work studying the impacts of football participation on health, read this article from the College of Letters & Science.