Progress of World Records in Select Track and Field Events and Detection of Doping
Background and Motivation
For those just wishing to play with the app, click here.
The modern Olympiad began in 1896 in order to promote harmony and friendship between participating nations. Despite crass commercialism and episodes of corruption, the Olympic games still bring a period of peace and goodwill every two years in which obscure sports suddenly come to the forefront of our lives. Whether the Jamaican Bobsledders or Irish Curling Team charms us or we are suddenly taken by a Ukrainian gymnast, we all stop and watch the spectacle. The Olympic motto is "Citius, Altius, Fortius" or "Faster, Higher, Stronger," and the competitors serve as exemplars of our best ideals. For better or worse, they are the gods who live among us. Their behavior and actions can influence the values of millions.
Thus, it is in our collective interest to ensure that competition is fair. However, performance enhancing drugs serve to imperil this very notion and are a threat to the Games as a whole. As the heart of the Summer Olympiad, Track and Field (or simply "Athletics" in Europe) is the sport with the highest profile events. The winner of the 100m race is traditionally known as the world's fastest man (or woman). The winners in the high jump competition are the world's best leapers. If you doubt this, consider for one second what it would be like to leap in the air and clear the door frame in a typical room by one foot.
Track events have been besieged by scandal at a distressing rate since 1988, when Ben Johnson had his 100m gold medal stripped from him after testing positive for anabolic steroids. Soviet and Eastern-Bloc countries had systematic, state-sponsored doping programs beginning in the early 1950s, but steroids were not formally banned at the Olympics until 1976. However, there was considerable debate about the efficacy of anabolics until one saw them in action in the person of Mr. Johnson. His devastation of the best 100m field ever assembled served notice that steroids not only worked, they could make you superhuman. In the intervening years, the world has seen scandals in the US, China, Russia, and other countries. Most recently, Russia's track and field team was banned from the Rio Olympics in 2016 because of yet another state-sponsored doping program.
Despite our circumstantial knowledge that many athletes are doping, positive drug tests are a rarity. As we will see, the most troubling aspect of these scandals is that most high-profile cases hinge on the concept of the "non-analytical" positive. In the BALCO scandal, ledgers were found that detailed doping regimens linked with athletes' names, and cancelled checks were found that accompanied shipment details. So, a non-analytical positive is said to occur when there is sufficient circumstantial evidence to conclude that an athlete is doping even though they have not failed a drug test. Some may well remember Lance Armstrong's frequent proclamations that he was one of the world's most tested athletes and had never failed a drug test. Putting aside the failed test for corticosteroids that was expunged on the basis of a back-dated prescription, he is correct. Yet his US Postal Team ran one of the most sophisticated and widespread doping operations in athletic history. This suggests that our current testing protocols are fairly ineffective, and leads one to wonder whether they can be fixed.
At this point, it would be reasonable to ask - if doping is inevitable, then why fight it? We seem to waste lots of energy testing athletes who are one step (or many!) ahead of anti-doping controls. Why not just accept the fact that doping will occur and set some reasonable limits? First, steroids and other drugs are highly correlated with excess morbidity. Young, and frankly stupid, athletes will abuse their bodies for fleeting fame and glory. Others are forced to dope without their consent and/or knowledge and pay the price many years later. If we watch, we participate. If we don't, the Olympics and their ideals will perish. In a time when we are increasingly connected digitally but socially isolated, this would be tragic.
Events considered for this study
For now, we will only consider performances by male athletes due to the availability of a much larger data set over a longer period of time. Women only began to compete at Olympic track races longer than 3000m after 1984, and there was probably never a period when 'clean' competition prevailed given the proclivity of both China and the Soviet/Eastern-Bloc regimes to dope their athletes.
Two of the events we analyzed are "bookends" in track and field - the 100m and 10000m footraces. They represent opposite ends of the respiration spectrum. The 100m race is purely anaerobic and the 10000m race is estimated to be about 95% aerobic. Sprinters have impressive physiques with large muscles. Distance runners are lithe and elfin, and when highly fit, are typically thin to the point of appearing fragile.
In addition, we consider the 1500m race, which is sometimes referred to as the 'metric mile' in the US, although it is 109m short of the imperial distance. This race requires a highly developed aerobic engine with a concomitant powerful finishing 'kick' that can see the last 400m covered in 49s. Experts estimate that the balance of this race is 80-84% aerobic and 16-20% anaerobic.
Types of doping and doping controls
In general terms, the goals of doping are the following:
- increase muscle fiber density and strength
- reduce inflammation and tissue damage
- improve cardiovascular efficiency
Anabolic steroids promote muscle growth and allow an athlete to recover more quickly. This allows them to simultaneously increase the intensity and frequency of workouts and leads to a higher degree of fitness. Many people think that steroid use is like the cartoon "Popeye" where the hero consumes spinach and spontaneously develops very large and strong muscles. In fact, steroid supplementation without hard workouts is ineffective and pointless.
Corticosteroids reduce inflammation. Insulin leads to higher glycogen storage in muscles and reduces tissue damage, but its use is extremely dangerous, as it can cause blood sugar to drop to lethal levels. Human growth hormone (hGH) allows one to recover more quickly and to build muscle.
Blood doping can occur through both autologous and homologous transfusion. In the former, one stores blood only to re-infuse it right before competition. Before storage, the blood is centrifuged to increase the red cell density which allows one to more efficiently carry oxygen to cells. This is manifested in a higher hematocrit level (HC) which represents the percentage of red cells by volume in the athlete's blood. Homologous transfusions occur when an athlete is injected with someone else's blood (of the same or compatible blood type). These are the only transfusions that can be detected through testing.
Another form of blood doping utilizes the hormone erythropoietin (EPO) and its variants. Here, the administration of EPO stimulates the body to produce more red blood cells. Unmonitored usage of the drug can lead to death due to heart attacks as the blood becomes too viscous to push through vessels. There have been cluster deaths in groups of cyclists who were using the drug. They were otherwise healthy and some of the most fit people on our planet. These cluster deaths started shortly after the drug was introduced in Europe in 1987. Officials were completely at a loss as to how to deal with this development and had to resort to banning all athletes who had a hematocrit level in excess of 50% until their blood levels were "normal." This approach was not without detractors, as it is an arbitrary standard and does not seem to be guided by strong science. One has to wonder how many athletes came in at the 49.9 mark during these years.
The first test for EPO was not developed until 2000 and it was ineffective. Refinements were made, and the first truly effective test was introduced at the Athens Olympics. The fact that it was a urine, rather than blood, test made storage and collection much more efficient and effective.
The primary weapons in the anti-doping agencies' arsenal are serological and urine tests for prohibited substances. As the reader may have gleaned, they have been woefully inadequate in detecting doping, except for athletes from developing nations and those who have been unfortunate enough to consume tainted supplements. Competitors from wealthy nations or those with a state-sponsored doping apparatus can be tested with impunity but will not yield an analytical positive. In addition, there are many athletes from developing nations who have no out-of-competition testing from their national associations.
Athletes from wealthy nations can resort to using "designer" steroids that cannot be detected, or they can simply take anabolics or blood dopers under the guidance of physicians or trainers. There have been a number of infamous collaborations between physicians, trainers, and team management - Michele Ferrari is perhaps most notorious.
Investigations that produce non-analytical positives are expensive and time-consuming. They use questionable methods that may impinge upon athletes' human and civil rights. For example, the US Anti-Doping Agency (USADA) pursued Lance Armstrong in a manner that was reminiscent of the way in which US Attorneys have pursued mafiosi in the past. Do we really want to run roughshod over people to catch a few dopers?
More recently, the World Anti-Doping Agency (WADA) has promulgated the concept of the "Biological Passport." This consists of digital records that track key indicators in blood values over a period of time. If an athlete's values are deemed to be sufficiently anomalous, they are banned and considered to be a non-analytical positive. However, this approach is not universally accepted and is generally considered quite controversial, especially with respect to how the 'suspicious' levels are determined.
Targeted testing and follow-up investigation can work, and more importantly be fair, when the decisions are data-driven. What if performances could be analyzed to determine suspicious patterns to guide follow-up testing and analysis? Would this allow us to increase the efficacy of testing?
Analysis of Track Races
The data used in this study was obtained from a variety of sources, principally All-time athletics. Each event has approximately 5000-7000 performances in a ranked list. For the purposes of classification, we consider all-time 'AT' and world record 'WR' performances separately. The world record data is relatively sparse and consists of no more than a few hundred points given the relative rarity of this occurrence.
Each race presents its own opportunities and challenges for analysis. Typically, sprinters dope with anabolics, and distance runners with blood dopers, although there have been notable cases that shed light on the complicated cocktail that many athletes have consumed. For example, Marion Jones was found to be on EPO, insulin, hGH, and designer anabolics. As she was a sprinter, this caused surprise and consternation among those who follow track and field - was everybody on everything? There is no widespread agreement as to why she would be on such a regimen. Perhaps the increased O2 in the blood aids recovery. Perhaps the athlete demanded to be on everything, much the way patients with viral infections hector physicians for antibiotics. Whatever the reason, it has quickly become clear that athletes and their enablers are becoming ever more sophisticated in their approach to performance enhancing drugs and evading anti-doping controls.
First, we will consider the 100m and 10000m events. The 100m has a very useful subset of annulled performances in which athletes were known to be on PEDs while competing. There are at least 120 performances on this list and they range from world records to 'median'. This will form the basis for comparisons between known dopers and the larger pool of performances that may or may not include dopers.
The 10000m event is the most grueling running event that is held on the track. It is exactly 100 times longer than the 100m race, and it typically takes more than 160 times as long to complete. It is a race which depends on very different energy systems than the sprints do and therefore has different training and performance requirements. People who excel at this distance are ectomorphs and do not have the need or desire to build muscle like the mesomorphs who dominate the sprints. Thus anabolics are not considered to be a primary mode of doping. However, blood doping will enhance the O2 carrying capacity of the blood and lead to greater efficiency while running. If we turn our attention to the plot which displays the 10000m world record over time, we see that there is a sudden and unremitting assault on the world record beginning in the late 1980s. Curiously enough, this coincides with the release of EPO in Europe and the US as a therapy for patients who were suffering from anemia.
Thus, the period of time marked by the red rectangle represents a period of little or no doping control of EPO and its variants - a kind of 'wild west' period in distance running the likes of which has not been seen since. In fact, further examination of elite performances, i.e. sub-27 minutes, shows that while many more performances have been recorded under this mark in the post-EPO era than in the EPO era, no one has come within 25s of the world record since 2011. This plot can be accessed under the 10000m analysis tab of the Shiny App.
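The distance and time ratios quoted above are easy to check with a little arithmetic. The short sketch below assumes record marks of 9.58 s for the 100m and 26:17.53 for the 10000m; these specific figures do not appear in the text and are assumed here to be the records standing at the time of writing, so substitute whatever marks are current.

```python
def to_seconds(mark):
    """Convert a mark written as 'mm:ss.ss' or 'ss.ss' to seconds."""
    parts = mark.split(":")
    seconds = float(parts[-1])
    if len(parts) == 2:
        seconds += 60 * int(parts[0])
    return seconds

# Assumed record marks (not taken from the article's data set)
RECORD_100M = "9.58"
RECORD_10000M = "26:17.53"

distance_ratio = 10000 / 100
time_ratio = to_seconds(RECORD_10000M) / to_seconds(RECORD_100M)

# Average speeds in metres per second, for comparison
speed_100m = 100 / to_seconds(RECORD_100M)
speed_10000m = 10000 / to_seconds(RECORD_10000M)

print(f"distance ratio: {distance_ratio:.0f}x")
print(f"time ratio:     {time_ratio:.1f}x")
print(f"avg speed:      {speed_100m:.2f} m/s vs {speed_10000m:.2f} m/s")
```

With these assumed marks the time ratio lands a little above 160, in line with the figure quoted in the text.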
Non-Parametric Statistical Methods Utilized and Discussion of Results
A cursory look at the "all-time" density plots for each event shows that the data are significantly skewed, and there is room to question whether it is permissible to employ approaches suited to normally distributed quantities. Traditional remedies such as transforming the data to a log scale were unsatisfactory and might raise more questions than they answer. The safest approach seems to be to begin with Levene's test for the equality of variances. Applying this to the 100m dash, we typically get a very large p-value if we compare a sample of 120 performances from the all-time list with the 120 that are annulled due to doping offenses. The test is designed to detect heteroskedasticity, i.e. unequal variances between groups, so a large p-value means we have no evidence against the hypothesis that the two groups are homoskedastic, which is consistent with the plots below.
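For readers who want to see what Levene's test actually computes, here is a minimal sketch for the two-sample case. It is not the code behind the app: it uses the median-centred (Brown-Forsythe) variant, approximates the p-value with a large-sample normal approximation rather than the exact F distribution, and runs on synthetic placeholder times rather than the real performance lists. The function name `levene_two_sample` is hypothetical. In practice a library routine such as `scipy.stats.levene` would be the usual choice.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, median

def levene_two_sample(a, b):
    """Median-centred Levene (Brown-Forsythe) test for two samples.

    Returns (W, approximate p-value). The p-value treats sqrt(W) as a
    standard normal deviate, a large-sample approximation to F(1, N - 2)
    that is reasonable for groups of roughly 120 observations each.
    """
    # Absolute deviation of each observation from its own group's median
    devs = []
    for group in (a, b):
        centre = median(group)
        devs.append([abs(x - centre) for x in group])

    n_total = len(a) + len(b)
    grand_mean = mean(devs[0] + devs[1])
    group_means = [mean(d) for d in devs]

    # One-way ANOVA sums of squares computed on the deviations
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(devs, group_means))
    within = sum((x - m) ** 2
                 for d, m in zip(devs, group_means) for x in d)

    w = (n_total - 2) * between / within  # (N - k) / (k - 1) with k = 2
    p = 2 * (1 - NormalDist().cdf(sqrt(w)))
    return w, p

# Synthetic, skewed placeholder data: NOT the study's performances
rng = random.Random(42)
all_time_sample = [9.80 + rng.expovariate(10) for _ in range(120)]
annulled_sample = [9.80 + rng.expovariate(10) for _ in range(120)]

w_stat, p_value = levene_two_sample(all_time_sample, annulled_sample)
print(f"W = {w_stat:.3f}, approximate p = {p_value:.3f}")
```

A large p-value here only says the test found no evidence of unequal spread; it does not prove the variances are equal.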
Further, if we employ a Wilcoxon Rank Sum Test to compare the two distributions, the large p-value suggests that we should retain the null hypothesis, i.e. that the distributions are the same. Coupled with the fact that both distributions have identical medians, we are left with a startling choice - either doping is ineffective, or we cannot distinguish between a group of known dopers and a random selection of the same size from the all-time list. If we accept that doping works, the unpalatable conclusion is that most of the world's top 100m runners are doped. One can see more detail by clicking "include annulled performances" in the Shiny app.
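The rank sum comparison can be sketched in a few lines as well. The version below is an illustration under stated assumptions, not the app's implementation: it uses the normal approximation with midranks and a tie correction (100m times recorded to 0.01s contain many ties), and the sample values are synthetic placeholders. The helper names `midranks` and `rank_sum_test` are hypothetical; `scipy.stats.mannwhitneyu` is the equivalent library routine.

```python
import random
from math import sqrt
from statistics import NormalDist

def midranks(values):
    """Return (ranks, tie_term): tied values share the average rank and
    tie_term is the sum of t**3 - t over every group of t tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    tie_term = 0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    return ranks, tie_term

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test using the normal approximation."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    ranks, tie_term = midranks(list(a) + list(b))
    w = sum(ranks[:n1])               # rank sum of the first sample
    mean_w = n1 * (n + 1) / 2         # expected rank sum under the null
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / sqrt(var_w)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Synthetic placeholder times rounded to 0.01s: NOT the study's data
rng = random.Random(7)
all_time_sample = [round(9.80 + rng.expovariate(10), 2) for _ in range(120)]
annulled_sample = [round(9.80 + rng.expovariate(10), 2) for _ in range(120)]

z_stat, p_value = rank_sum_test(all_time_sample, annulled_sample)
print(f"z = {z_stat:.3f}, p = {p_value:.3f}")
```

Note that the rank sum test responds mainly to one group being shifted relative to the other, so a large p-value is weaker evidence about differences in shape than about differences in location.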
Analyzing the 10000m races is somewhat more difficult. It is hard not to notice that the world record took an unprecedented assault over a 6-8 year period when EPO was readily available and there was limited to no doping control. If we group the times by "era", i.e. before EPO (<1990), during the period when it existed and testing was lax or non-existent (1990-2005), and after testing became at least marginally effective (2006 onward), we can see that the distributions are markedly different as displayed in a density plot. A Kruskal-Wallis test is a kind of non-parametric one-way ANOVA that can be used to good effect when the data do not meet normality requirements. The p-value obtained is reported as 0, i.e. it is too small to display, which tells us that at least one era's distribution differs markedly from the others.
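The era grouping and the Kruskal-Wallis statistic can be sketched as follows. This is an illustration only: the era labels, the function names `era_of` and `kruskal_wallis_three`, and the handful of (year, seconds) pairs are all hypothetical placeholders, not the study's data. The sketch handles exactly three groups because the chi-square tail probability with two degrees of freedom has the closed form exp(-H/2); for the general case a routine such as `scipy.stats.kruskal` would be used.

```python
from math import exp

def era_of(year):
    """Assign a performance to one of the three eras described above."""
    if year < 1990:
        return "pre-EPO"
    if year <= 2005:
        return "EPO"
    return "post-EPO"

def kruskal_wallis_three(groups):
    """Kruskal-Wallis H (tie-corrected) and p-value for three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    order = sorted(range(n), key=pooled.__getitem__)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # midrank for tied values
        tie_term += (j - i + 1) ** 3 - (j - i + 1)
        i = j + 1

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)

    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)   # correction for ties
    h = max(h, 0.0)                    # guard against rounding below zero
    return h, exp(-h / 2)              # chi-square survival, 2 d.o.f.

# Hypothetical (year, time in seconds) pairs for illustration only
performances = [
    (1978, 1642.5), (1984, 1633.8), (1989, 1628.2),
    (1994, 1612.4), (1998, 1582.8), (2004, 1580.3),
    (2008, 1599.9), (2012, 1611.0), (2016, 1605.2),
]
by_era = {"pre-EPO": [], "EPO": [], "post-EPO": []}
for year, seconds in performances:
    by_era[era_of(year)].append(seconds)

h_stat, p_value = kruskal_wallis_three(list(by_era.values()))
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A significant result says only that at least one era differs from the others; pairwise follow-up comparisons would be needed to say which.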
While the record time drops slightly in the period between 2000-2005, it has remained unapproachable since. Moreover, of the 113 sub-27 minute performances recorded in history, 70 occurred after 2006. Yet, when we examine the density plot shown in the '10000m Analysis' tab, it is clear that there are twice as many performances below 26:45 during the EPO era than after it. So while the median time continues to drop for the group as a whole, the top performers are slower. It is astonishing that the world record has not been approached within 25s (or 1s per lap) since 2011, yet the median time has significantly dropped as one can see from the plots below.
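The per-lap figure and the share of sub-27 performances follow from simple arithmetic, sketched below. The counts 113 and 70 and the 25s gap come from the text; the 400m lap is the standard outdoor track length; the list of times passed to the hypothetical helper `count_under` is made up purely to show how the cutoffs would be applied.

```python
LAP_M = 400                      # standard outdoor track lap
laps = 10000 / LAP_M             # number of laps in a 10000m race
gap_per_lap = 25 / laps          # a 25s gap spread evenly over the race

sub27_total = 113
sub27_after_2006 = 70
share_after = sub27_after_2006 / sub27_total

def count_under(times_s, cutoff_s):
    """Count performances (in seconds) strictly faster than a cutoff."""
    return sum(1 for t in times_s if t < cutoff_s)

SUB_27 = 27 * 60                 # 27:00 in seconds
SUB_26_45 = 26 * 60 + 45         # 26:45 in seconds

# Hypothetical times in seconds, for illustration only
example_times = [1598.2, 1604.9, 1611.7, 1618.3, 1625.0]

print(f"{laps:.0f} laps, {gap_per_lap:.0f}s per lap")
print(f"share of sub-27 runs after 2006: {share_after:.1%}")
print(count_under(example_times, SUB_27), count_under(example_times, SUB_26_45))
```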
If we perform both a Levene and a Wilcoxon test on the data shown in the sub-27 minute plot, p-values of close to 0 are obtained. These distributions are very different, and while the median times are similar, the skew and kurtosis are markedly different. This raises the question of what might cause such a change in the distribution of times. While EPO use does seem to have been curtailed somewhat by the more recent testing protocols, evidence suggests that the drug continues to be abused. In the summer of 2016, Spanish police burst into the hotel rooms of one of the world's foremost middle distance groups and found EPO. There have been other, less consequential busts before and after. It appears that many groups may be utilizing a protocol known as "microdosing" in which athletes take repeated sub-therapeutic doses that clear the system within hours. While the efficacy of this dosing regimen is unknown and even the subject of some controversy, it appears that many athletes are following it. Thus it might be expected that athletes would still gain a training advantage from microdosing, but not to the degree that they would from the full dose. If this smaller form of doping were more widespread, it could explain why more athletes are running faster today, yet the fastest athletes are slower than they were twenty years ago. The only comparable period of stagnation occurred between 1924 and 1937, when running was in its infancy and there were no consistent training practices.
Moving on to the 1500m race, we see that there is no trend in the world record similar to that for the 10000m. The periods before and after EPO testing becomes effective are markedly similar. There are several possible explanations for this, ranging from a lack of widespread doping in the event to microdosing being as effective as a full dose of EPO in this particular race. It seems more likely that a combination of anabolics and blood doping agents would be the approach that most athletes would take given what has been gleaned from non-analytical positives. It may be interesting to examine the combination of 1500m and 5000m times that top athletes in both events post. Beginning with Said Aouita in the late 1980s, a surprisingly large number of athletes have become fast at both events - running in the low 3:30s or below in the 1500m and sub-13 minutes in the 5000m race. The highest-profile drug bust in this group is Dieter Baumann, who famously blamed his failed test on "spiked toothpaste." In addition, Daniel Komen ran otherworldly times of 3:29 and 12:39 and failed a test for caffeine but was later cleared on a technicality.
An analysis of all-time performance lists in three events along with world-record data shows that there is at least a strong case to be made that doping has been an operative mechanism in track and field and that the dance between anti-doping authorities and doping athletes has left an identifiable signature in the progress of median times and world records in these events. Further analyses could be made of other events and finally women's events as well.