Hello everyone!
I was talking with CJ Selig recently and she convinced (guilted?) me into taking a look at what happens when a privateer qualifies well.
Why is this interesting? It’s interesting because there has been so much shifting of rules, ideas, and solutions recently to try to make the World Cup Finals live stream “interesting”, or maybe “complete” is the right word, that CJ thought it would be good to take a step back, look at the data and then make a conclusion. I said don’t tempt me with a good time, CJ. Well, she did. And now we’re here.
The Problem
A little backstory. So the problem originated back in Lourdes in 2015 when Aaron Gwin crashed and got disqualified in qualifying. He then went on to win in the finals. The problem was that, because he didn’t qualify in the top 20, he wasn’t on the live feed and therefore, no one saw the winning run in Lourdes that year. So they added the protection rule that said if you’re top 20 overall, you should be on the live feed.
Ok, great. But then we had the case of riders who finished near the top of the overall the previous year getting hurt or crashing in the first few races, which left them unprotected and off the live feed. So they added a rule: if you were top 10 overall the previous year, you are protected for all of the current season, and the rest of the spots get filled with riders who are currently in the top 20 and were not in the top 10 the previous year.
In other words, you could theoretically have to be top 10 in the current overall to be protected, if everyone who was in the top 10 last year is outside the top 10 this year. This is why you sometimes see only 17-19 people protected.
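The slot arithmetic is easier to see in code. This is a minimal sketch of the rule as described above, assuming 20 protected slots in total that are filled first by last year's top 10 and then by the current overall in order. The function name, the slot count, and the fill order are assumptions for illustration, not the official rule text.

```python
def protected_riders(prev_top10, current_overall, slots=20):
    """Return the protected riders under the rule as described in the text.

    Assumption: there are `slots` protected places in total. Last year's
    top 10 take theirs first; the rest go to riders in current overall
    order who are not already protected.
    """
    protected = list(prev_top10)
    for rider in current_overall:
        if len(protected) >= slots:
            break
        if rider not in protected:
            protected.append(rider)
    return protected


# Worst case from the text: none of last year's top 10 is in this year's top 10.
prev_top10 = [f"old{i}" for i in range(1, 11)]
current_overall = [f"new{i}" for i in range(1, 31)]
protected = protected_riders(prev_top10, current_overall)
print("new10" in protected, "new11" in protected)  # True False
```

In this sketch a rider from last year's top 10 who is absent (injured, say) still holds a slot, which is one way fewer than 20 protected riders could line up at a given race.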
Very confusing I know.
To make matters even more confusing, the live feed uses yet another set of rules. It seems like everyone who is protected is on the live feed, and if you are not protected you have to qualify top 10 to guarantee a spot on TV. But there also seems to be a bit of discretion and/or movement in these rules.
Great, Eliot. What’s your point? My point is it’s bloody difficult to get on the live feed if you are a privateer!
So before we say anything, let’s think about what we are optimizing for. Being on the live feed doesn’t affect the outcome of the race (ok, ok, setting aside weather and the argument that a privateer can’t get sponsors because they weren’t on the feed), so I think it’s safe to say we are optimizing for the most entertaining show possible.
Ok, cool, so what makes a show entertaining? To each their own, but I would say I want to see performance at a high level, I want to see my favorite riders battling it out, and I probably want a bit of drama or build up in there. Now we start to see why it might not be a good idea to include riders with a higher (numerically worse) overall in the mix: while you could potentially have a high level of drama, they are probably not my favorite rider, and they might not perform at a high level.
Exploring that might is what this article is about.
How likely is it that a privateer who qualifies well will finish well?
If we know this, we can say with better accuracy that they have the ability to satisfy our performance criteria on a regular basis and based off that, we can say whether they should be included in the feed more often than they are now.
To start off, let’s do some exploration around the probability of a rider getting a certain finals position based on their current overall.
p.s. I just barely scratched the surface on this and as I began digging in, there are soooooo many interesting things to look at here. This isn’t meant to be an in-depth analysis, more of an overview and an outlay of some data. I’ll try to add tidbits that I thought of as we go along!
p.p.s. You may be wondering why the number of races goes down in the tables (e.g. overall 1 has done 132 races and overall 20 has done 121). That’s because we are only looking at finals, and the further down the overall you are, the less likely you are to qualify for the final.
The Data
    overall  Races  Top 20s  probability_of_top_20
0         1    132      114              86.363636
1         2    133      116              87.218045
2         3    134      116              86.567164
3         4    132       99              75.000000
4         5    130      106              81.538462
5         6    113       92              81.415929
6         7    126       96              76.190476
7         8    125       82              65.600000
8         9    130       78              60.000000
9        10    119       80              67.226891
10       11    126       81              64.285714
11       12    122       81              66.393443
12       13    127       69              54.330709
13       14    130       60              46.153846
14       15    125       72              57.600000
15       16    126       56              44.444444
16       17    122       61              50.000000
17       18    118       62              52.542373
18       19    113       64              56.637168
19       20    121       54              44.628099
20       21    109       32              29.357798
21       22    120       58              48.333333
22       23    115       47              40.869565
23       24    112       49              43.750000
24       25    101       38              37.623762
25       26    112       34              30.357143
26       27    112       42              37.500000
27       28     99       36              36.363636
28       29    113       36              31.858407
29       30    111       33              29.729730
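For anyone who wants to reproduce a table like the one above, here is a minimal sketch of the calculation: count finals starts per overall position, count how many finished inside the cutoff, and divide. The record layout and function name are assumptions, and the toy results are invented purely to show the shape of the input.

```python
from collections import defaultdict


def finish_probability(results, cutoff):
    """results: iterable of (overall, finals_position), one per finals start.

    Returns {overall: (races, hits, percent)}, where a hit is a finish
    at or inside `cutoff`.
    """
    races = defaultdict(int)
    hits = defaultdict(int)
    for overall, position in results:
        races[overall] += 1
        if position <= cutoff:
            hits[overall] += 1
    return {o: (races[o], hits[o], 100.0 * hits[o] / races[o])
            for o in sorted(races)}


# Invented toy results, not real race data.
toy = [(1, 1), (1, 3), (1, 25), (2, 12), (2, 40)]
table = finish_probability(toy, cutoff=20)
print(table)

# Sanity check against the first row above: 114 top 20s in 132 races.
print(round(100.0 * 114 / 132, 6))  # 86.363636
```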
One of the things that make World Cup Downhill so interesting (and annoying as a rider) is the variance and uncertainty of the results. You can see there is a relatively linear drop off until you get to around 30. Even after that, it doesn’t go much below 10%, which means if you race for two seasons and are in the top 80, you will probably have at least one top 20 at some point.
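The "at least one top 20" claim can be checked with a quick calculation. This sketch assumes every finals start is an independent try with the same 10% chance, and it assumes 14 finals starts over two seasons. Both numbers are illustrative assumptions, and real results are certainly not independent from race to race.

```python
def prob_at_least_one(p, n):
    """Chance of at least one success in n independent tries, each with chance p."""
    return 1.0 - (1.0 - p) ** n


# Assumed: 10% chance per race, 14 finals starts across two seasons.
print(f"{prob_at_least_one(0.10, 14):.1%}")  # 77.1%
```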
As an aside, it’s a bit more complicated than that, as over the season/years I could be a regular top 10 person and have bad luck and/or crashes that put me lower in the overall, which means the same person could have gotten a top 20 for overall 34, 29, and 50 and it would skew the probability of those overall positions. As always, this is correlation, not causation.
The other interesting thing is the bump at 81. This is because, historically, to be in the top 80, you had to have World Cup points. This meant if you had a lot of points from last year (they take a year to expire), but crashed at the first race and didn’t qualify, then you would be 81st. So there have been a disproportionate number of fast people racing as #81. Cool!
    overall  Races  Top 10s  probability_of_top_10
0         1    132      105              79.545455
1         2    133      105              78.947368
2         3    134      102              76.119403
3         4    132       86              65.151515
4         5    130       79              60.769231
5         6    113       71              62.831858
6         7    126       70              55.555556
7         8    125       56              44.800000
8         9    130       56              43.076923
9        10    119       50              42.016807
10       11    126       45              35.714286
11       12    122       39              31.967213
12       13    127       34              26.771654
13       14    130       26              20.000000
14       15    125       22              17.600000
15       16    126       28              22.222222
16       17    122       19              15.573770
17       18    118       22              18.644068
18       19    113       20              17.699115
19       20    121       24              19.834711
20       21    109       11              10.091743
21       22    120       19              15.833333
22       23    115       22              19.130435
23       24    112       23              20.535714
24       25    101       17              16.831683
25       26    112       11               9.821429
26       27    112       17              15.178571
27       28     99       10              10.101010
28       29    113        9               7.964602
29       30    111       15              13.513514
Um, yeah. Getting a top 10 is hard. There’s a big difference between becoming a top 20 guy and a top 10 guy. To put it in perspective, if you are around 15th, you have roughly a 15% chance to make up those 5 spots and get a top 10. Compare that to getting a top 20: if you are around 30th, you have roughly a 30% chance to make up those 10 spots.
Goes to show that making up spots gets harder and harder and harder as you move up the ladder.
    overall  Races  Top 5s  probability_of_top_5
0         1    132      87             65.909091
1         2    133      82             61.654135
2         3    134      77             57.462687
3         4    132      61             46.212121
4         5    130      43             33.076923
5         6    113      31             27.433628
6         7    126      36             28.571429
7         8    125      27             21.600000
8         9    130      27             20.769231
9        10    119      29             24.369748
10       11    126      14             11.111111
11       12    122      11              9.016393
12       13    127      17             13.385827
13       14    130       9              6.923077
14       15    125       8              6.400000
15       16    126      11              8.730159
16       17    122       6              4.918033
17       18    118       7              5.932203
18       19    113       7              6.194690
19       20    121       7              5.785124
20       21    109       6              5.504587
21       22    120       6              5.000000
22       23    115       8              6.956522
23       24    112       6              5.357143
24       25    101       8              7.920792
25       26    112       4              3.571429
26       27    112      12             10.714286
27       28     99       3              3.030303
28       29    113       0              0.000000
29       30    111       4              3.603604
We see even more of a dropoff for a Podium.
    overall  Races  Wins  probability_of_win
0         1    132    34           25.757576
1         2    133    30           22.556391
2         3    134    14           10.447761
3         4    132     9            6.818182
4         5    130     4            3.076923
5         6    113     6            5.309735
6         7    126     7            5.555556
7         8    125     4            3.200000
8         9    130     8            6.153846
9        10    119     2            1.680672
10       11    126     3            2.380952
11       12    122     2            1.639344
12       13    127     3            2.362205
13       14    130     0            0.000000
14       15    125     2            1.600000
15       16    126     1            0.793651
16       17    122     0            0.000000
17       18    118     1            0.847458
18       19    113     0            0.000000
19       20    121     0            0.000000
20       21    109     0            0.000000
21       22    120     1            0.833333
22       23    115     1            0.869565
23       24    112     1            0.892857
24       25    101     0            0.000000
25       26    112     0            0.000000
26       27    112     3            2.678571
27       28     99     0            0.000000
28       29    113     0            0.000000
29       30    111     0            0.000000
This is crazy. You pretty much have no chance of winning the race if you are outside the top 3 haha.
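To put a number on that, the win counts from the table above can be summed directly. This only covers the 30 overall positions shown, so wins by riders ranked lower than 30th are not included.

```python
# Wins per overall position 1..30, copied from the table above.
wins = [34, 30, 14, 9, 4, 6, 7, 4, 8, 2,
        3, 2, 3, 0, 2, 1, 0, 1, 0, 0,
        0, 1, 1, 1, 0, 0, 3, 0, 0, 0]

top3_share = sum(wins[:3]) / sum(wins)
print(sum(wins[:3]), sum(wins), f"{top3_share:.1%}")  # 78 136 57.4%
```

So within these rows, riders sitting in the top 3 overall took a bit more than half of the wins, and the top 10 took 118 of the 136.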
I think as we have looked at better and better results, the variance has gone down. This makes me think of a few things:
- The riders <= 10 in the overall deserve to be paid a lot
- Contrary to what we might think, that perfect run still isn’t going to net you a win/podium if you are outside the top 20 or so
- Overall is the most predictive single feature of the current season’s final results (Also based on a bunch of other stuff I’ve done) (Also, Also, hint for your fantasy team)
    overall  Races  Top 20s  probability_of_top_20
0         1    116      102              87.931034
1         2    119      104              87.394958
2         3    118      106              89.830508
3         4    114       94              82.456140
4         5    106       91              85.849057
5         6     87       73              83.908046
6         7     89       72              80.898876
7         8     93       70              75.268817
8         9     91       64              70.329670
9        10     77       61              79.220779
10       11     82       63              76.829268
11       12     79       66              83.544304
12       13     83       53              63.855422
13       14     73       43              58.904110
14       15     70       48              68.571429
15       16     57       36              63.157895
16       17     56       35              62.500000
17       18     61       38              62.295082
18       19     49       41              83.673469
19       20     51       36              70.588235
20       21     43       21              48.837209
21       22     65       45              69.230769
22       23     52       35              67.307692
23       24     49       34              69.387755
24       25     44       28              63.636364
25       26     39       20              51.282051
26       27     43       31              72.093023
27       28     35       20              57.142857
28       29     30       17              56.666667
29       30     28       18              64.285714
Interesting! The lower overall probabilities didn’t change much (probably because they were already qualifying top 20) but the higher ones changed a lot! I think this is down to two things: 1. Like I mentioned before, you have riders who should be in the top 20 overall, but aren’t and 2. If you’re on fire in qualifying, it is going to be rare that you’re not on fire during the race.
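The tables from here on look like the same calculation restricted to races where the rider qualified well. Here is a minimal sketch of that conditioning step, assuming each record carries the rider's overall, qualifying position, and finals position. The record layout, function name, cutoffs, and toy data are all hypothetical, not the code behind these tables.

```python
def top20_given_quali(records, quali_cutoff, finals_cutoff=20):
    """records: iterable of (overall, quali_position, finals_position).

    Returns {overall: percent of finishes inside finals_cutoff}, counting
    only races where the rider qualified inside quali_cutoff.
    """
    races, hits = {}, {}
    for overall, quali, final in records:
        if quali > quali_cutoff:
            continue  # skip races where the rider did not qualify well
        races[overall] = races.get(overall, 0) + 1
        hits[overall] = hits.get(overall, 0) + (final <= finals_cutoff)
    return {o: 100.0 * hits[o] / races[o] for o in sorted(races)}


# Invented toy records, not real race data.
toy = [(25, 8, 12), (25, 35, 40), (25, 15, 30), (3, 2, 1)]
result = top20_given_quali(toy, quali_cutoff=20)
print(result)  # {3: 100.0, 25: 50.0}
```

Tightening `quali_cutoff` shrinks the number of races per row, which is why the later tables get noisier.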
    overall  Races  Top 20s  probability_of_top_20
0         1    103       93              90.291262
1         2    102       90              88.235294
2         3    102       94              92.156863
3         4     91       77              84.615385
4         5     81       72              88.888889
5         6     60       53              88.333333
6         7     65       52              80.000000
7         8     64       53              82.812500
8         9     58       47              81.034483
9        10     48       42              87.500000
10       11     47       39              82.978723
11       12     45       35              77.777778
12       13     44       34              77.272727
13       14     33       21              63.636364
14       15     34       24              70.588235
15       16     19       11              57.894737
16       17     24       18              75.000000
17       18     16       13              81.250000
18       19     20       17              85.000000
19       20     19       13              68.421053
20       21     11        7              63.636364
21       22     20       13              65.000000
22       23     19       15              78.947368
23       24     21       15              71.428571
24       25     16       11              68.750000
25       26     11        8              72.727273
26       27     23       19              82.608696
27       28     12        7              58.333333
28       29      7        5              71.428571
29       30      9        6              66.666667
Continuing to even out…
    overall  Races  Top 20s  probability_of_top_20
0         1     87       78              89.655172
1         2     76       66              86.842105
2         3     78       74              94.871795
3         4     58       49              84.482759
4         5     49       43              87.755102
5         6     37       32              86.486486
6         7     31       25              80.645161
7         8     37       32              86.486486
8         9     30       24              80.000000
9        10     24       21              87.500000
10       11     17       16              94.117647
11       12     21       16              76.190476
12       13     14       13              92.857143
13       14     12        9              75.000000
14       15     13        8              61.538462
15       16      4        2              50.000000
16       17      6        3              50.000000
17       18      2        2             100.000000
18       19      5        5             100.000000
19       20      7        6              85.714286
20       21      4        3              75.000000
21       22      7        6              85.714286
22       23      5        5             100.000000
23       24      7        5              71.428571
24       25      5        4              80.000000
25       26      3        2              66.666667
26       27      9        9             100.000000
27       28      2        1              50.000000
28       29      2        1              50.000000
29       30      2        2             100.000000
Again, we see more and more overall positions with a 100% probability of a top 20 if they qualify well, although those rows rest on only a handful of races each. I didn’t dig into the individual results, but just at a glance I see #256, which was Sam Blenkinsop in 2008 when he won qualifying and won the finals. So just because a rider has a high overall number doesn’t mean they are slow.
Take Aways
I came into this thinking that the current way the Live Feed worked was a bit biased. Now, I think it’s extremely biased. Haha, just kidding. I actually think they are probably solving the problem about as well as they could be. It’s important to keep in mind what the business goals are for things like this, which is why we defined a metric at the start.
Looking at the data, if it were me and I was only optimizing for the best show then I would make the top 10 the same top 10 from qualifying because I think we have seen that if you qualify top 10 you have a pretty good chance at doing well again. I would then fill 10-20 with anyone who is protected (to take advantage of that new protection rule) that didn’t qualy top 10, then I would fill the rest of 10-20 with 10-20 from qualifying regardless of overall.
This is pretty much what they are doing.
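The selection order described above can be written down as a short sketch. The function name, the 20-spot feed size, and the rider labels are assumptions for illustration; this is not the organisers' actual procedure.

```python
def live_feed(quali_order, protected, size=20):
    """quali_order: riders sorted by qualifying result, fastest first.
    protected: set of protected riders.

    1. The top 10 qualifiers get in regardless of overall.
    2. Protected riders who missed the qualifying top 10 come next.
    3. Remaining spots go to qualifiers 11-20, in qualifying order.
    """
    feed = list(quali_order[:10])
    for rider in quali_order[10:]:
        if rider in protected and len(feed) < size:
            feed.append(rider)
    for rider in quali_order[10:20]:
        if rider not in feed and len(feed) < size:
            feed.append(rider)
    return feed


# Hypothetical field: q1 is the fastest qualifier, q40 the slowest.
quali_order = [f"q{i}" for i in range(1, 41)]
protected = {"q3", "q12", "q35"}
feed = live_feed(quali_order, protected)
print(len(feed), "q35" in feed, "q20" in feed)  # 20 True False
```

In this toy example the protected rider who qualified 35th takes a spot, which bumps the 20th-place qualifier off the feed.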
Now if we just wanted to be fair, I think there is a pretty compelling case to be made that if you qualify top 20 then you are probably going to get top 20 which means you should probably be on the Live Feed! But I don’t have data on fairness so let’s leave that out.
Sustainability for the Sport
This is a bit of a classic trade-off between short term and long term gains. In the short term, we should ONLY put the top riders on the live feed because that will get the most viewers RIGHT NOW. For the long term, we should only put the people who qualify well on the live feed because they will have more opportunities to get sponsors and exposure.
I think it’s important to put ourselves in the position of having to optimize for the life of a sport, to be honest about what the trade-offs are, and to be realistic that there is no right answer. I think it’s obvious that Loic Bruni gets more views than the guy in 30th, but the guy in 30th might turn out to be Loic Bruni in 3 years.
As I said, there are so many more things we could look at and take into account, but I think I’ll leave it open for you to form your own opinion and thank you for reading another nerdy article by yours truly 🙂