Understanding the relative age effect
In 2017 the company I was working with was collecting growth and physical performance data on Malaysian school age children. And, like almost everyone who gets their hands on a little bit of data like this, I did an analysis of relative age growth and performance for a few of the age groups. In this article I've included two charts from that analysis just to show what this looks like. But I didn't do that analysis to see if there was a relative age effect buried in the data because I knew there was. It exists in any data set that tracks growth or performance of youngsters. The question is not, is it there? The question is what to do about it?
What I want to do in this letter is focus on why we need to be aware of the relative age differences in our athletes and present some ideas as to how harmful effects can be mitigated. If it weren't for the numerous routines that professionalize youth sports, such as tryouts, team selections, training camps, and ill-conceived talent identification schemes, the relative age effect would be much less of a concern. But since these things already exist at even the youngest levels of sport, mitigation of the relative age effect is a serious issue.
Grouping athletes by age and sex is the primary method of categorization used in sport. But sometimes the way we calculate age is detrimental to some athletes simply because of the month they were born.
What is the relative age effect?
The relative age effect (RAE) describes observable differences, usually related to growth and performance, between young athletes who are the same chronological age but have different developmental ages. The effect is caused by the relationship between chronological age and an eligibility cut-off date used for cohort selection.
Table 1 shows how the relative age quarters (RAQ) are assigned to athletes. Since the most common cut-off date for determining age in international sport is 1 January, those born in the first three months of a calendar year are in the Q1 cohort. The Q2 cohort is for athletes born in April through June, and so on. If a national governing body (NGB) uses a different cut-off date, such as 1 September, which coincides with the start of the school year in some countries, then the RAQs shift (September, October, and November become the Q1 cohort), but the idea is the same. Q1 always represents the relatively older athletes in the group.
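The quarter assignment described above can be sketched in a few lines of Python. This is only an illustration, not the logic of any particular tool: the function name relative_age_quarter and the cutoff_month parameter are hypothetical, and the sketch assumes the cut-off always falls on the first day of a month.

```python
from datetime import date

def relative_age_quarter(birth_date: date, cutoff_month: int = 1) -> str:
    """Return 'Q1'..'Q4' for a birth date.

    cutoff_month is the month in which the selection period starts
    (1 = January, 9 = September). Assumes the cut-off is the 1st of that month.
    """
    # Whole months between the start of the selection period and the birth month (0-11).
    months_into_period = (birth_date.month - cutoff_month) % 12
    # Three months per quarter; quarters are numbered from 1.
    return f"Q{months_into_period // 3 + 1}"

# 1 January cut-off: a February birthday falls in Q1, a November birthday in Q4.
print(relative_age_quarter(date(2010, 2, 14)))
print(relative_age_quarter(date(2010, 11, 3)))
# 1 September cut-off: an October birthday moves into Q1.
print(relative_age_quarter(date(2010, 10, 1), cutoff_month=9))
```

Whatever the cut-off, the athletes born in the first three months of the selection period land in Q1, so Q1 still marks the relatively older group.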
Since the Q1 cohort is relatively older than others in the selection period they usually perform better and earlier than the relatively younger athletes. Thus they are often perceived to be better athletes. This relative age advantage means Q1 athletes will get more attention from coaches, more praise for their efforts, and more advanced training and travel opportunities throughout their sporting career. These advantages add up and eventually mean that being born early in a sport selection period can be a jackpot of future opportunity and better training.
Conversely, those born late in the selection period (Q4) will find themselves at a perpetual disadvantage. Unlike the late maturer who eventually catches up with early-maturing counterparts, the athlete who suffers a relative age disadvantage will feel the effect over their entire sport career. The RAE persists because it results from an administratively defined selection period, not a mere difference in the rate of growth. Growing out of the effect is not possible because the younger athletes will already have missed out on the many early opportunities enjoyed by the relatively older cohort: they have not received significant coaching attention over a period of years, nor have they had opportunities to participate in camps or select competitions. By the time the RAE no longer matters, at around 14 years old, they will not have the same game sense, skill level, or other attributes necessary to compete as successfully as the older cohort.
The problem caused by the relative age effect is that we are making decisions about athletes way too early in their sport experience. What we see as 'ability' is merely a result of being a little older than their friends.
Relative age bias is easy to find; the only problem is that we usually find it long after the damage is done. The original research that highlighted the RAE's existence was a 1980s analysis of Canadian junior hockey players, whose birthdays were unaccountably clustered in the early part of the year (Q1 and Q2). Births in a population are spread mostly evenly throughout the year, so when the birth dates of the hockey players fell well outside this pattern, researchers began to wonder why. Once researchers knew what to look for, a similar skewing of birth dates was found in European soccer players. We now know that this pattern shows up in many different sports when upper-level youth teams are analyzed.
It should be noted that the RAE is baked into any administrative structure using fixed age selection periods. In academics, parents of children born late in the eligibility period for kindergarten typically hold their child out a year so they are not the youngest in the school. The typical age to start kindergarten in the United States is five years, which means that the ages of the children could range from just barely five to almost seven, a whopping 40% difference.
Figures 1 and 2 illustrate typical growth and performance differences for 8-year-old Malaysian school children. All of the children were born in the same year and their RAQs were calculated from their birth date.
Notice the difference in height between the Q1 and Q4 cohorts in Figure 1. The boys show an average difference of almost 3.5 cm with the girls showing almost 2 cm difference. Additionally, average height gets progressively lower for the Q2 and Q3 cohorts though not as much as Q4.
On the performance side, Figure 2 shows the Q1 boys are 0.5 seconds faster in a 50 m run with the Q1 girls about 0.3 seconds faster. Overall the data follows the same pattern as in Figure 1 with younger athletes getting progressively slower.
I included these charts to show that even though relative age differences are mostly unrecognized on the field, when reduced to numbers it's clear that we are not chasing phantoms.
It's important to note that Q1 athletes do exhibit higher ability, but it’s not athletic ability. It’s simply a result of growing. Q1 athletes are taller, stronger, and heavier than the younger cohorts. It’s easy to overlook this, however, and assume some other sport-related reason is in play.
Coaches, parents, and officials often don't realize that what they're observing is the RAE in action and not evidence of higher athletic ability. Indeed, the very idea that 8-, 9-, or 10-year-olds would have 'high' athletic ability is suspect to begin with. Nonetheless they are allowing a relative age bias to inform their decisions about who should receive special attention or advanced opportunities.
Youth sport organizations frequently create situations where premature judgements about young athletes have to be made such as travel or all-star teams, select camps, etc. But it could also be something on a very local level such as most valuable player awards. While these kinds of selections and awards are normal and typically done in any sport scenario it is inappropriate to apply the same selection logic at young ages as is done at much older ages.
Relative age bias causes a form of artificial elimination. Even though it's quite unintentional, athletes in the younger cohorts eventually find that they are not progressing as much as their older counterparts and the activity is just not fun for them anymore. They switch to another sport or, worse, drop out of sport completely. But by reducing this built-in bias we can help more youngsters have successful sport experiences. We can also increase the number of athletes who reach an elite level of performance within the NGB population. If we can reduce relative age bias we will keep more youngsters engaged longer with fewer disadvantages accruing to the younger cohort. Here are some ideas for reducing the RAE:
Know your athletes' RAQs. Knowing an athlete's relative age quarter should be as common as knowing their age or sex. Coaches can use the Sportkid Metrics Relative Age Calculator to calculate RAQs for their entire team.
Be aware of the RAE during practice sessions. How you are grouping athletes should be top of mind during practice at very young ages. Although some separation of relatively older and relatively younger athletes may be necessary, keep in mind that there is nothing at stake that should override the need to reduce bias. Many adverse consequences can be avoided simply by paying attention.
Avoid professionalizing youth sport with selections or stratification of youngsters. Since the RAE disappears as athletes age, avoiding unnecessary selections for very young athletes will diminish its effects. For example, eliminate the select-team or all-star-team process until athletes are older (over 14 years). Assign coaches strategically: don't have the chief coach always working with the best athletes, and don't routinely assign the Q4 athletes to assistant coaches or 'B' squads. Remember that in sport these stratifications will happen soon enough on their own (as they should); there's no need to artificially introduce them at young ages.
NGBs and international federations should stop using fixed dates to determine age. Governing bodies for individual sports should use the athlete's age on the day of competition, or day of the start of the select camp etc. This reduces the RAE by spreading it around the calendar. In one competition an athlete might be in the relatively older group and in another, the younger. It just depends on where their birthday is in relation to the start date of the event. This way, even though the RAE is never completely eliminated, its influence is lessened. For team sports this is a tough one: Keeping teams together for an entire season often depends on having a fixed selection period. Overall, sport organizations need to be aware of their relative age environment and make appropriate local changes if they can.
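To make the day-of-competition idea in the last suggestion concrete, here is a minimal Python sketch. The function name age_on and the athletes' birth dates are hypothetical, made up purely for illustration; no federation's actual eligibility rule is implied.

```python
from datetime import date

def age_on(birth_date: date, event_date: date) -> int:
    """Age in completed years on the day of the event."""
    years = event_date.year - birth_date.year
    # Subtract one if the birthday has not yet occurred in the event year.
    if (event_date.month, event_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

# Two hypothetical athletes born in the same calendar year.
early_birthday = date(2012, 2, 10)
late_birthday = date(2012, 11, 20)

# At a June event they fall in different age groups;
# at a December event they are in the same one.
for event in (date(2024, 6, 15), date(2024, 12, 1)):
    print(event, age_on(early_birthday, event), age_on(late_birthday, event))
```

Because the reference date moves with each event, which athletes are the relatively older ones in an age group changes from one competition to the next instead of being fixed by the calendar.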
Sometimes reducing ages to numbers helps us to understand better why the RAE is damaging at younger ages. A 16-year-old is only about 7% older than a 15-year-old and both are closing in fast on their adult height, which means that growth and the attributes that typically go with it, like strength and lung capacity, are almost at their maximum and slowing down. This means that differences in performance potential would be negligible. A 7-year-old, on the other hand, is about 17% older than a 6-year-old and is in a period of rapid growth with increases in strength and speed. This is a volatile period of growth and training. Giving one athlete advanced coaching and special opportunities and not the other can have long-lasting consequences that cannot be made up. We owe all our athletes our best efforts, and knowing that the RAE can be damaging to some youngsters means that we should make honest efforts to reduce it.
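The arithmetic behind these comparisons is simple enough to check in a few lines. The helper name percent_older below is hypothetical; it just expresses how much older one athlete is than another as a percentage of the younger athlete's age.

```python
def percent_older(older_age: float, younger_age: float) -> float:
    """How much older the first athlete is, as a percentage of the younger age."""
    return (older_age - younger_age) / younger_age * 100

# A one-year gap shrinks in relative terms as athletes get older.
print(f"16 vs 15: {percent_older(16, 15):.1f}%")  # 6.7%
print(f"7 vs 6:   {percent_older(7, 6):.1f}%")    # 16.7%
```

The same one-year gap is roughly two and a half times larger in relative terms at age 6 than at age 15, which is why early selections carry so much more risk.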
How to use the relative age calculator
The Sportkid relative age tool was created in Google Sheets. To use the tool with your own team go to this link and click 'yes' when asked to make a copy of the sheet.
Paste the birth dates of your athletes into the birthdate column. The dates have to be in a recognized date format. The tool will convert the dates you enter to the YYYY-MM-DD format. This conversion is not foolproof, so check that it has been done correctly. You may have to reformat some of your dates to get it right. You can also add names to go with the birth dates, but this is optional; the tool works with or without names.
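If you would rather clean up your dates before pasting them in, a short script can do the reformatting. The sketch below is a Python illustration only; it is not how the Sheets tool converts dates. The function name to_iso_date and the list of accepted formats are assumptions, and the list assumes day-first dates (as in 04/03/2015 meaning 4 March), which you would need to change if your records are month-first.

```python
from datetime import datetime

# Hypothetical list of accepted input formats, tried in order.
# Assumes day-first dates; swap %d and %m if your records are month-first.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")

def to_iso_date(text: str) -> str:
    """Convert a date string to YYYY-MM-DD, or raise ValueError if unrecognized."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    raise ValueError(f"Unrecognized date format: {text!r}")

print(to_iso_date("04/03/2015"))   # day-first: 4 March 2015
print(to_iso_date(" 2015-03-04 "))  # already in the target format
```

A date such as 04/03/2015 is ambiguous between day-first and month-first conventions, which is one reason any automatic conversion deserves a manual check.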
As dates are entered the sheet will calculate the current age and the relative age quarter for each athlete. Calculating RAQs for all dates is not very useful since you may have athletes born in a range of years. After a certain age (around 16 to 18 years) the RAQs lose relevance, so you will probably want to limit the calculations to athletes born in a specific year so that you're looking at mostly younger athletes. However, RAQs of older athletes might show earlier relative age bias somewhere in the past.
By entering a year in the green bar you can narrow the RAQ calculation to a specific birth year, which is more useful than calculating all RAQs.
You can also change the age calculation date to anything you want. This only changes the calculated age; it doesn’t affect the RAQ chart.
The frequency table counts all values in the RAQ column. The chart displays data from the frequency table.
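For readers who keep their roster outside of Sheets, the same kind of frequency table can be sketched in Python. This is an illustration of the idea rather than the tool's own formulas: the function name raq_frequency is hypothetical, the birth dates are invented, and a 1 January cut-off is assumed.

```python
from datetime import date

def raq_frequency(birth_dates, birth_year=None):
    """Count athletes per relative age quarter, assuming a 1 January cut-off.

    If birth_year is given, only athletes born in that year are counted.
    """
    counts = {"Q1": 0, "Q2": 0, "Q3": 0, "Q4": 0}
    for born in birth_dates:
        if birth_year is not None and born.year != birth_year:
            continue  # skip athletes outside the chosen birth year
        quarter = (born.month - 1) // 3 + 1
        counts[f"Q{quarter}"] += 1
    return counts

# Invented roster, for illustration only.
roster = [date(2014, 1, 5), date(2014, 2, 20), date(2014, 8, 9),
          date(2014, 12, 30), date(2015, 3, 3)]
print(raq_frequency(roster, birth_year=2014))
```

Quarters with no athletes are kept in the table with a count of zero, so a missing Q3 or Q4 group is visible rather than silently dropped.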
Based on past data collection projects I was associated with, you will find that the RAQs for younger ages are more or less evenly distributed. An important trend to watch for though is fewer athletes in the Q3 and Q4 quarters as athletes get older. These youngsters are the ones who would have suffered from a relative age disadvantage at younger ages. If premature attrition is taking place due to the RAE then this is where it will show up in the data.
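One way to put a number on "more or less evenly distributed" is a chi-square goodness-of-fit test against four equal quarters. This check is not part of the Sportkid tool; it is a standard statistical technique offered here as a rough sketch. The function name and the counts are hypothetical, and the test is unreliable for small squads (a common rule of thumb asks for an expected count of at least five per quarter).

```python
def chi_square_uniform(counts):
    """Chi-square statistic comparing observed counts with an even split."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)

# Critical value of the chi-square distribution for 3 degrees of freedom
# (four quarters minus one) at the 5% significance level.
CRITICAL_3DF_5PCT = 7.815

# Hypothetical Q1..Q4 counts for an older age group, for illustration only.
older_group = [40, 32, 18, 10]
statistic = chi_square_uniform(older_group)
print(f"chi-square = {statistic:.2f}")
print("skewed" if statistic > CRITICAL_3DF_5PCT else "consistent with an even split")
```

A statistic above the critical value only says the quarters are uneven; whether the shortfall is in Q3 and Q4, as relative age attrition would predict, still has to be read from the counts themselves.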
The RAE is part of the reason some athletes leave sport prematurely. By making judgements about athletes that would otherwise be perfectly legitimate at older ages, sport officials are raising the level of professionalism for many younger athletes and lowering the enjoyment factor. Professionalizing the youth level is driving youngsters away either through poor experiences or artificial elimination. The RAE is a primary culprit of artificial elimination and should be reduced whenever possible.