Can science select our next eventing team?

In a WEG or a Games year, one favorite pastime in the world of the horse is speculating about who will, and who will not, make the Team. Perhaps the hardest discipline to evaluate is eventing. Here there are so many factors, and it is very hard to create a head-to-head final trial. So how do you really measure Badminton against Adelaide? How do you pick the riders who rise to the occasion, and leave out the ones who go to pieces? How do you spot the combination on the rise, or the ones about to slip into obscurity? Impossible, you say? Think again: a couple of Irishmen (!) have crunched mountains of stats and come up with groundbreaking results!

I must confess that I have long been a sceptic when it comes to ‘sports science’, not because I have anything against science, but because the quality of the ‘science’ served up in the horse world has so often been simplistic, or worse, simple-minded. That is, until I found myself listening, fascinated, to Australian eventing selector Georgia Widdup. Georgia competed herself and got as far as a World Cup final in Sweden, but she is brighter than the average bear – a qualified lawyer – and when she starts applying her formidable intellectual skills to eventing, and eventing selection, listen hard.

She began to tell me about a company in Ireland, Equiratings, that was crunching the numbers on our sport with amazing results. The project began ten years ago, when Sam Watson, an international four-star eventer, began to track his own performance on paper against the world’s best. It’s now an award-winning business with clients in fifteen countries.

So how does it work? It’s hard to give a simple answer because their database can answer so many different questions, but let’s look at an issue that is very real for the future of Australian eventing: the changes in format that will be in place for the next Games in Tokyo. Okay, we know they are very dumb changes and the effect will be so disastrous that they will only last one Games, but the truth is that there are going to be teams of three with no drop score. And that is a worry for Australia, because Equiratings tells us that we tend to star or not make the finish line.

This is how Georgia explained it to me:

You were saying that one thing the stats told us was that we had a very low reliability rate of cross country completion, compared to Germany and the UK…

“One of the data points that Equiratings has been able to produce is a ‘reliability rating’. This data point tells you the likelihood of any individual competitor completing the competition. Equiratings have provided us with some analysis they have done on the last 5 championships. This was pretty fascinating as it showed that it’s the Brits who’ve got the highest reliability rate of cross country completion. They are at 94%, Germany is 88% and we are down to 58%. It is an issue for our team. Now we need to adjust this somewhat for selection strategy – at London, we had five in the team, with three to count, so your strategy in selection was recognizing that you had two potential drop scores. That is obviously different to a championships where no drop scores are allowed, as is the case under the new rules. Then I think your approach changes: getting home becomes way more important.”

“The Equiratings figures show that if you select three combinations each with an 80% reliability rating (Equiratings tells us that an 80% reliability rating is high), the statistics say 80% times 80% times 80%, and you end up on about 50%. That means we have only a 50:50 chance of ending up with a team come showjumping day. What that tells you is that reliability is going to be everything if you want to finish with a team. The calculations from the last 5 championships showed us that based on our current reliability ratings, we had a 30% chance of finishing with a team score at the Tokyo Games.”
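The arithmetic behind that “three times 80%” figure can be sketched in a few lines. This is only an illustration, not Equiratings’ own model: it assumes each combination’s chance of completing is independent of the others, and the function name `team_completion_probability` is made up for this example.

```python
def team_completion_probability(reliabilities):
    """Chance that every team member completes, assuming independence.

    `reliabilities` is a list of individual completion probabilities
    (0.80 means an 80% reliability rating). With no drop score, the
    team only finishes if all of them get home, so the individual
    probabilities are multiplied together.
    """
    probability = 1.0
    for reliability in reliabilities:
        probability *= reliability
    return probability


# Three combinations, each with an 80% reliability rating.
three_at_80 = team_completion_probability([0.80, 0.80, 0.80])
print(f"{three_at_80:.1%}")  # 51.2%
```

So three “high” ratings of 80% still leave roughly a coin toss as to whether the team has a score at all. The 30% figure Georgia quotes for Tokyo comes from Equiratings’ own calculations and is not reproduced here.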

Ouch!

“It is interesting to look at the Brits – their reliability is really high, and you wouldn’t think that, given that they didn’t have a great time cross country at Rio, but what we do know is that they have stops, but they finish. And clearly that is not what we are doing. At the last few championships, our riders have tended to go clear, or be eliminated, whereas the Brits’ horses have jumping penalties cross country but they don’t have eliminations to the same extent as we do… Now we don’t have the option of a drop score, the eliminations are a killer.”

This is an example of how the system works analysing past performance, in this case, Shane Rose’s Virgil:

Shane and Virgil at Adelaide

What do they tell us about our dressage vis a vis the rest of the world?

“Our dressage is third, very close to most competitive nations, and we are sixth for showjumping, which we would have known.”

The Equiratings stats also allow us to come up with an objective ranking of the degree of difficulty of the various events around the world, this has been a long standing bone of contention, just how do you measure a score at Adelaide against a score at Burghley or Pau…

“Yes, this is what they call the ‘High Performance Rating’. This is an individual number that is assigned to an individual performance. This number adjusts for a range of factors – firstly the score that was achieved (i.e. lower overall finishing score = better HPR score), but it also adjusts for what we would normally consider a range of subjective factors: difficulty of the cross country course (including time), difficulty of the showjumping course, and subjectivity across dressage judging.”


“One of the great things about these statistics is that they give an objective rating for a particular event, not an average of a series of events over time. So if in the run up to, say, Badminton, you want to canter round a CIC or not do the cross country, it doesn’t pollute the high performance score that comes out of the main competition that you are preparing for. Each separate performance generates its own high performance score. It’s a great way of comparing an individual result at Adelaide to Badminton to Pau to Saumur to Wallaby Hill. They can be compared in a pretty accurate way.”

“They say they have 52,000,000 data points to feed into that. Often you look at it and think intuitively, that makes sense, but it is also a great way of picking up the odd performance that you might have thought looks good, but that is actually better than that – or the performance that looks really good, but it isn’t really all that great. It’s a great way of comparing apples to oranges which is always a great challenge to us as selectors – Australia versus USA versus Europe, 3-star versus 4-star, CIC versus CCI, it’s a great way of making all those comparisons.”

So they have taken into account all the aspects of an event – terrain, cross country course, dressage judging, standard of the showjumping and so on, and come up with an objective ranking?

“They have a whole lot of data points to assess how difficult a cross country course was relative to another one. It is based in part on what they know about the horses in the competition. They know the likelihood of each individual competitor going clear cross country, or having a stop or whatever, so they can then measure what actually happens against what was likely to happen, and you can tell if the course was harder or easier than the average. If they know that you, Chris, generally score 45 in your dressage, based on everything they know of your horse, then on a particular day it comes in and does a blinder and scores 40, that’s fine, that might be a great test. But if they can see that everyone in the field is scoring five marks better than what they usually score, then there is an adjustment in the HPR score.”

“Same with the cross country. If we know based on the data, that of the horses in this field, 10% of them are likely to have cross country penalties and/or time penalties, but everyone jumps clear and goes under time, then there is an adjustment to recognize that it was a slightly easier track. Same with the showjumping, if you have horses that the data indicates are four faulters and everyone jumps clean, then there is an adjustment. This way we can accurately compare events…”
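The idea of measuring what actually happened against what was likely to happen can be illustrated with the dressage example above. The article does not describe the real HPR formula, so what follows is a deliberately simple stand-in: it assumes the adjustment is just the average gap between each horse’s usual score and its score on the day, and the names `field_shift` and `adjusted_scores` are hypothetical.

```python
from statistics import mean


def field_shift(expected, actual):
    """Average gap between scores on the day and usual scores.

    A negative value means the whole field scored better (lower
    penalties) than its form suggested, for example generous judging.
    """
    return mean(a - e for e, a in zip(expected, actual))


def adjusted_scores(expected, actual):
    """Remove the field-wide shift from each score on the day."""
    shift = field_shift(expected, actual)
    return [a - shift for a in actual]


# Hypothetical field: usual dressage scores versus scores on the day.
usual = [45, 50, 38, 42]
on_the_day = [40, 45, 33, 37]   # everyone is five marks better than usual

print(field_shift(usual, on_the_day))      # -5
print(adjusted_scores(usual, on_the_day))  # [45, 50, 38, 42]
```

In this toy field a 40 from a horse that usually scores 45 is not treated as a blinder, because everyone else improved by the same five marks. Had only one horse improved, most of its gain would survive the adjustment.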

How did you get on to these guys?

“I started following them a few years ago. There’s a whole safety dimension that I thought was pretty interesting. Mike E-S (Etherington-Smith) has done a bit with them, I think around the safety area, so he’d gotten to know them as well. After Rio, Chris Webb went to speak to them, asking what we could do from a High Performance perspective. They came up with a deal, and we signed up, from now to Tokyo. I think we are in the early stages of understanding the full functionality of what they can do. They are very keen that we use the data, and we use it well. We’re still trying to get our heads around it, so I talk to them quite a bit. The AIS have supported us with Dr Alison Alcock – a statistics guru with a PhD in performance analysis – and she is really interested in using this data to help us improve our performances, so over the next few years, I think we are going to use this data better because there is a multitude of uses for it.”

“One is for selection, and that’s sometimes just cross referencing intuitive insights. I kind of feel like this, does the data back it up? But then there is a whole performance angle, using it for riders to improve their performances, so they can have a look where they are relative to other riders.”

Can we go back and look at the top five and top ten percent?

“We had the guys do a ranking to see where Australia is at four and three-star level, so we could get a bit of a comparison of riders based in Australia, in England, in Europe and in the US. It is just a ranking, but a few things stood out. At four-star, Chris Burton has three of the top ten performances for Australia, on three different horses. Sam Griffiths has two of them, both on his mare, we knew that’s a pretty impressive horse. Then Sammi Birch and Emma McNab both fit into the top 10 based on their performances at Pau. It tells us, that of the places where Australians competed, Pau was very much the most difficult competition, and those performances of Emma and Sammi, were outstanding.”

Santano – in the top 4 but not ours any more…

“To get into the top 10% for 4-star events, you have to have a ranking of 158 and we’ve only got two over 158, Burton on Santano (who has been sold) and Nobilis and Griffiths on Paulank Brockagh. If you go down to three-star, the cut-off is 132, so we’ve got another three that fit inside the top 5%, Santano again, Tapner on Prince Mayo (also sold) and Hoy and Cheeky Calimbo. Last year may not have been our best year, but I for one think those Pau performances were pretty exciting.”

Emma McNab and Fernhill Tabasco – the stats say they are exciting

Is there any way we can look at the results from Pau, and we know the course builder there, Michelet, always throws up a pretty difficult track, but we are really selecting for Tryon, can we rank Mark Phillips’ tracks and include the information from previous WEGs and get some idea of how all that will translate?

“We haven’t done it yet, but I know we can say to the guys, these are the four or five people that we are looking to take to Tryon, can you run some scenarios for us based on the possible combinations of three in the team? You can do their performances based on previous results, but also with that reliability overlay. Another example: recently I asked them to show me what the likelihood was of the riders producing their best performances at championships, and they are looking into that.”
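Running scenarios over “the possible combinations of three” is easy to picture in code. The sketch below is not how Equiratings does it: the rider labels and reliability numbers are invented placeholders, and it assumes completions are independent, so a trio’s chance of finishing is simply the product of its members’ ratings.

```python
from itertools import combinations
from math import prod

# Placeholder shortlist with made-up reliability ratings.
shortlist = {
    "Rider A": 0.90,
    "Rider B": 0.80,
    "Rider C": 0.75,
    "Rider D": 0.60,
}


def rank_trios(ratings):
    """Return every team of three with its chance of all three finishing,
    sorted from most to least reliable."""
    trios = [
        (names, prod(ratings[name] for name in names))
        for names in combinations(ratings, 3)
    ]
    return sorted(trios, key=lambda item: item[1], reverse=True)


ranked = rank_trios(shortlist)
for names, chance in ranked:
    print(", ".join(names), f"{chance:.1%}")
```

With four candidates there are only four possible trios; with five there are ten. A real selection scenario would also weigh likely scores, not just the chance of getting home.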

You had a riders meeting recently, how did the riders react to the stats?

“The UK based riders have met with the Equiratings guys, but it is still in the preliminary stages where everyone is getting their head around it. I met with the Australian based riders at the end of the year, and the focus of that was very much talking about these reliability statistics, which I think really surprised them, like it surprised us. Looking at individual results it does throw up the time faults and the one showjumping rail… riders are always focused on their dressage scores and if they are generally decent jumpers they are not focused on the one or two rails, and these stats bring all that into focus.”

“Equiratings makes great podcasts and I really want to get our riders listening to them and thinking about the issues they raise. They have done some great analysis of the scores as they happened at Rio, and they make the point that if the media knew what was really happening round by round, we could make our sport so much more marketable. There was a point in time at Rio where it was so finely balanced, but no-one really knew what was happening because no-one knew how to interpret the statistics. What was happening was so exciting, but none of the commentators were across it, they didn’t know how to read what was happening, and understand how the drop score was going to work. These guys make the point – oh, we just lost the biggest opportunity to promote our sport by not having a really sophisticated understanding of it!”

“What they do with this tool is this: once the first horses have all gone for all the teams, and we already know who the second, third and fourth horses are going to be, you can really see where the day is going. Say the French have run their absolutely most reliable horse first, and it has a penalty. I don’t think that happened, but say that happened. That affects France’s cumulative potential in a whole lot of ways, and using this tool, we know what the horses that are coming are likely to do in both the cross country and the showjumping. So you can say, okay, our last horse is about to run, we know what he can do, we know how our horses showjump, and then you can make different decisions. For instance, the Americans let their last horse go the direct way at an influential fence, where it fell. If they had known really accurately where they were standing, they might have said, go the long way, take the 10 penalties, and we can still go well.”

It’s sophisticated enough that you could have it in the tent with you at Tokyo?

“Yes, Alison is going to really get across it, and we are hoping to take her to Tryon and use the WEG as a practice run. My view on Tryon is, yes we have a drop score but our whole mentality should be, this is a dress rehearsal for Tokyo and everyone has to finish, and we should be acting like we didn’t have a drop score.”

“I feel it’s a fascinating scenario and we can get ahead of the rest of the world if we start practicing now. I haven’t experienced it, but everyone says there is a different level of pressure at the Olympics, and the pressure was really on in the past if you were running fourth, and we had already lost a horse – now every single rider is going to be in that position. How do you prepare for that?”

Can you see a day where we actually pick a team on the basis of these numbers?

“I don’t think so. I think there is always more information that gets fed into us, horse soundness, fitness, there’s a whole lot of that stuff that we take in, but I definitely think the stuff this system produces is going to become more relevant. I think the days of all the selectors traveling to all the events around the world are done. It’s expensive, it’s time consuming, and you can actually access so much footage from the events, and that footage with this sort of information, you don’t need a lot more than that. Personally, I think it really helps, to see the horse at least once, so you get a sense of what the horse is like. I like to think I’ve seen all the contenders at some point, but other than that, we are not in a funding position to have people traveling all over, it’s not a good use of resources. The video footage, plus this, is a great way of doing it.”

Conversely, can you see the day when someone goes to the Court of Arbitration for Sport and says, I have an Equiratings score that is better than that rider who has been selected…

“Definitely, but I don’t think it will be successful. This information is helpful but it is only part of the picture. I have put into our selection policy that we will use this data, but we won’t rely on it, and it won’t put one person ahead of another. Ultimately if you want to challenge the selection you have to do it on the basis of the policy.”

There is also an important predictive analysis in terms of safety…

“The other element of what these guys do is the quality index. The basic premise is that big falls have often followed a similar trajectory, something like 20 penalties, 20 penalties, 20 penalties, fall. That’s perhaps what you intuitively think, the horse is running out and the rider suddenly gets really determined and holds the horse to the face of a fence and they fall. They say there is a predictability, not in every fall but in a lot of falls. Using their algorithms, which are way more sophisticated than I can convey, they can rate combinations: red, orange, green. Green obviously is fine; orange, they are saying hang on, there’s something in your form that we think is of concern; and red, there is a lot in your form that we think is of concern.”

“I think now in Ireland, if you come up red you can’t enter a competition at that level, you have to go back a step and do some competitions successfully at a lower level. I think they are working with the FEI to bring a version of this into all FEI competitions, and here, they are working with EA, and with the safety officer, Roger Kane, to look at these traffic lights. They were saying to me at Badminton last year that there was a rider who had got in on points, but whose previous form included a couple of falls cross country. That raises the question, should you be at Badminton? Surely it is in the sport’s interest to protect some riders from themselves…”

But you and I know, going to events over so many years and you look at a particular combination, and you know there are going to be tears, and more often than not, there are…

“Okay, accidents happen, and we do know that some combinations are accidents waiting to happen, but what they are doing is really clever in terms of improving safety in the sport.”

2 thoughts on “Can science select our next eventing team?”

Sophie Roberts says:

April 24, 2018 at 12:25 pm

Wow, that is really fascinating! Great to see our sport catching up with the rest of the world and utilising all the technology and data available to us. Looking forward to seeing how this impacts our result at Tryon!

Brigid Woss says:

October 21, 2018 at 3:20 pm

How did this new technology impact on our results at Tryon?
Thank you for this information. I do enjoy The horse magazine on line when I get time to read it.
Thank you again..but I still miss having a written copy in my hand.
