Q&A: Women's Elite Cycling Team Uses Analytics to Put Them Ahead of the Curve
The use of advanced analytics in professional sports is still in its infancy. According to Tim Alexander, senior product marketing manager at TIBCO Software, "Applying data science -- using different methodologies to find trends -- is just starting" in professional sports. One example of analytics in sports is an elite women's professional cycling team based in Silicon Valley, Team TIBCO-SVB. The team of women racers, who hope to compete in the 2016 Summer Olympics in Rio de Janeiro, is using data analytics to analyze training and race data and optimize individual and team performances. In this interview, Alexander discusses specific ways the cycling team is analyzing its data to improve results using technologies such as live data analysis -- and where he sees the future of analytics and sports. "It's going to be fascinating," he says.
BI This Week: How long has TIBCO been a sponsor of the women's cycling team? How does the sponsorship benefit the team and TIBCO?
Tim Alexander: TIBCO-SVB is the longest-standing women's pro cycling squad in North America. TIBCO is proud to have supported the team as title sponsor since the team's inception. Team TIBCO-SVB wrapped up its eleventh season [this year] with a long list of victories and accolades at the most demanding races in the world, all with an eye toward the 2016 Olympics in Brazil.
The partnership with TIBCO began when team founder and former Olympian Linda Jackson approached the company's founder, Vivek Ranadivé, with a request for sponsorship to support her mission -- to boost elite-level opportunities for women racers. Originally confined to a traditional sponsorship arrangement, the partnership is now evolving to harness TIBCO's Fast Data capabilities -- specifically the Spotfire data visualization and analytics software -- to analyze training and race data and uncover the bigger picture in the sport. Significantly, it is a development that will bring live data analysis to this arena.
What's interesting about working with the women's cycling team is that they actually generate a lot of data. We're all about fast data, and in a funny way, sponsoring a cycling team just sort of naturally fits the TIBCO message. Here's a team that is doing everything to shave off literally a third of a second to win an event. The only way to really do that effectively, to shave those portions of a second here and there to take home gold medals, is to be able to know exactly how fast you are. You need to know where you are slowing down even a little bit, even if you don't feel like you're slowing down. [Our sponsorship] is a nice fit in so many ways, both in theme and in providing us with the ability to see what it looks like to generate a lot of data and then analyze it for trends.
How do you collect data? Do you have devices on the bicycles?
All of the team uses Strava, a tracking software for athletes. It's professional and thorough. Their training data, their race data, every single thing for every individual rider -- it's all recorded in Strava. If it's a small data set, we just analyze it in memory. Otherwise, if it's a huge data set -- every race that we've ever analyzed, for example -- then we analyze it in a database.
Who on the cycling team uses the TIBCO Spotfire software? Can coaches themselves access the data? The riders themselves?
When the collaboration with TIBCO is complete, the team director, the athletes, and their individual coaches will be using the information generated by TIBCO's data analytics to measure and improve performance.
That said, in general, Spotfire is targeted to anyone who needs to ask questions about their data. We've been in this space for a very long time, and our focus isn't creating the prettiest dashboard. Our focus is the person who says, I have all of this data and I need to get my question answered and move along. That user may be an analyst, it may be someone on the marketing team, or it may be somebody who has downloaded a profit and loss sheet from Oracle and needs to quickly answer a question or break something out and then move on.
We are very much focused on the experience of analyzing rather than already knowing the three questions that you need to answer and then making a pretty version of those answers.
We're not about, "Hey, you can change the color here or the background there." We cater to people with data problems, people who might say, "I actually don't know my next question until I ask this question. And as soon as I answer this question, I have another question, and I need to be able to answer that one just as quickly." That's what we do.
[We are geared toward] your average user who says, "I don't know everything about data science. I don't know everything about how to understand all this data I have at a much deeper level. All I know is I have a lot of data and I need to answer some questions and I'd like something to show me some trends. Show me some groupings. Show me some things that I can't see just by eyeballing the data." That's what we do for business users.
Is the use of advanced analytics in professional sports fairly common? Are other cycling teams doing the same thing?
I think this is something that's in its infancy. It's ironic -- sports has always been seen as this classic playground for statisticians. They say that baseball, which is a great example, has this pure dataset that goes back 150 years. There's so much granularity to it. Statisticians have been looking for ways to measure things in sports -- that's what Moneyball is all about, right?
The idea of applying data science -- of using different methodologies to find trends -- is just starting to happen. So far, no one's been saying, "Let's look at races over the past 50 years and see where we've made the most improvements" or "What types of training regimens have dramatically increased a certain metric that we're tracking?"
Even in some of the higher-profile sports, I don't think you're seeing this level of data science applied. It's just not natural to think of starting a sports team or taking over a sports team and right away going out to find a data scientist. That's not the way people think, so I think we're in the earliest stages here.
How might analytics be used in cycling? Can you give some examples?
As an example, one of the things that TIBCO Spotfire has out of the box is line similarity. That means -- and we see this quite a bit -- that when you are analyzing data over time, nine times out of ten, if you're not visualizing it as a line, you're probably wrong. Sometimes you can view it as a bar chart, but in general, lines indicate distance over time.
Let's say you have hundreds of thousands of races that you want to analyze over time. You find one that makes you say, "Wow, this is what we want to do. This curve, which is super-linear -- that's what we want." So now we say, "Who else has a curve that looks like this one, a curve that is similar to what we want?"
If you could eyeball it, you might say that it looks similar to this race or that race. However, if you're analyzing thousands of races, you want to be able to ask, "Statistically, which race was the absolutely most similar to this ideal curve we're analyzing?" You can do that with a right-click in Spotfire. You might say, "Take this race and rank all the other races in terms of how similar they were to this particular run." We can then see the top 20 and do some additional work from there. That's an example of line similarities.
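The ranking idea described above can be sketched in a few lines of Python. This is only an illustration, not Spotfire's implementation: the interview does not say which similarity measure Spotfire uses, so Pearson correlation is assumed here as a stand-in, and the race names and distance samples are invented for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally sampled curves (assumed metric)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat curve has no defined correlation; treat as dissimilar
    return cov / sqrt(var_a * var_b)


def rank_by_similarity(ideal, races, top_n=20):
    """Return (race name, score) pairs, most similar to the ideal curve first."""
    scored = [(name, pearson(ideal, curve)) for name, curve in races.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical data: cumulative distance (km) sampled at equal time intervals.
ideal = [0.0, 1.0, 2.0, 3.0, 4.0]
races = {
    "race_a": [0.0, 0.9, 2.1, 3.0, 4.1],  # close to the steady ideal
    "race_b": [0.0, 2.0, 2.5, 2.8, 3.0],  # fast start, then fading
    "race_c": [0.0, 0.2, 0.5, 1.5, 4.0],  # slow start, late surge
}

ranking = rank_by_similarity(ideal, races)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Correlation compares the shape of two curves and ignores their scale; a distance-based measure would be a reasonable alternative if absolute pace matters as well.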
Here's another example of how data science can give you very objective answers. If I were to sit down and analyze the performances of a range of people on different cycling teams, I might ask, "What is the most important piece of data from a particular race? Is it how fast you start off, how steady you keep your acceleration through the race, or is it getting a certain boost at a certain point?"
Everyone is going to have an opinion of what's most important. The interesting thing about data science is it removes the opinion from the conversation. I can say, "Instead of arguing, let's take our best hundred races, analyze them, and find the most important thing in order for [a rider] to be able to get to a certain speed." With data science, you can do something called classification modeling where you ask the software to tell you the most important thing. With Spotfire, it's just point-and-click on a menu. There's no code involved. The answers can surprise people. It's really nice. It removes the uncertainty from the conversation.
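For readers curious what "ask the software to tell you the most important thing" can look like under the hood, here is a minimal sketch. It is not how Spotfire does it; the interview does not describe the model. The sketch assumes one simple approach, scoring each feature by how much a single threshold split on it reduces Gini impurity (the basic building block of a classification tree). The feature names, values, and labels are all hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split_gain(values, labels):
    """Largest impurity reduction achievable by one threshold on this feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must put races on both sides
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_features(rows, labels):
    """Return (feature name, gain) pairs, most informative feature first."""
    gains = {
        name: best_split_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical races: 1 means the rider reached the target speed, 0 means not.
rows = [
    {"start_speed": 5, "pace_steadiness": 0.8, "late_boost": 0.9},
    {"start_speed": 3, "pace_steadiness": 0.6, "late_boost": 0.8},
    {"start_speed": 4, "pace_steadiness": 0.7, "late_boost": 0.7},
    {"start_speed": 5, "pace_steadiness": 0.7, "late_boost": 0.3},
    {"start_speed": 3, "pace_steadiness": 0.5, "late_boost": 0.2},
    {"start_speed": 4, "pace_steadiness": 0.4, "late_boost": 0.1},
]
labels = [1, 1, 1, 0, 0, 0]

feature_ranking = rank_features(rows, labels)
for name, gain in feature_ranking:
    print(f"{name}: {gain:.3f}")
```

In this made-up data the late boost separates successful races perfectly, so it ranks first, while starting speed carries no information. A real analysis would use many more races and a full model rather than single splits.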
You're now seeing advanced techniques that were reserved for hardcore data scientists. We're now seeing some of their techniques prepackaged and ready to use by people who don't make analysis their life; they're now able to take advantage of it.
You see analytics as being in its infancy in sports. Where do you see it going from here?
For professional athletes, sure, we'll discover that there's lots we don't know yet, but in the consumer market, there is going to be so much more data. With new technologies that let us accurately collect data on the body's motion, combined with wearables like Fitbit, the Apple Watch, and upcoming Android smartwatches -- there's going to be so much more data about individuals.
You'll start seeing groupings and clustering that might tell you that you're in this particular percentile and this is the most likely problem you might have with your current exercise pattern. Or here's the most likely path to shave that minute off your seven-mile run.
Analytics for so long has been charts and graphs, but in the next step, predictive analytics is going to be more and more important. We're going to be using techniques to find the subtle patterns that separate the signal from the noise. Combined with all the additional data we're going to be getting from wearable devices, and the better analytics on motion capture, the next five to ten years will make today look primitive in comparison. It's going to be fascinating.