(Re)cycled Internet Data does not Make ‘Attractive’ Science

(June 11th, 2014) We just received a new Research Letter from Switzerland by our corresponding author, Albern Räder (a.k.a. Jeremy Garwood).



The internet revolution has been around for two decades. It has given everyone rapid access to huge quantities of information. But it has also encouraged some scientists to search for data that can be exploited at little cost in ‘original’ research.

Consider the case of Erik Postma, an evolutionary biologist from the University of Zurich. He has recycled internet data from the world’s most famous cycle race – the Tour de France – and used it to find ‘A relationship between attractiveness and performance in professional cyclists’. In his report, Postma claims to have found a new evolutionary link between men’s faces and their physical endurance, at least when pedalling a bicycle.

In order to look at how females select ‘high quality males’ for mating, Postma decided to test whether endurance performance is associated with ‘facial attractiveness’ in elite professional cyclists who completed the 2012 Tour de France race. He explains that, according to his reading of evolutionary theory, prehistoric hunters with more endurance should be favoured, not only in the hunt for wild animals but also when getting the girls for mating. Quite why he considers modern professional cyclists to be a good model for prehistoric hunters is not clear, but a big advantage is the availability of lots of data sets on the official Tour de France website.

Who’s a pretty face?

Out of 153 cyclists who completed the 2012 Tour, Postma selected 80 riders and downloaded their details – height, weight, race results – and, above all, the photographs of their faces. His main hypothesis is that people should be capable of assessing the performance qualities of these cyclists just by looking at their photographed faces.

Postma conducted an online survey in which volunteers from around the world were asked to ‘rate’ the selected photographs. There were a total of 816 ‘raters’, each of whom assessed the faces for ‘attractiveness’, ‘likeability’, and ‘masculinity’ on a scale of 1 to 5, 5 being the top score.

Postma duly processed these personal judgements and matched the riders’ attractiveness, likeability and masculinity scores against their relative performances in the Tour de France. He analysed various combinations of the survey results and certain personal details that the survey participants had given about themselves, then compared these to details for individual riders.

Overall, Postma claims to have found one big result, namely that ‘the slope of attractiveness’ plotted against the cyclists’ race performance differed ‘significantly among female raters using the pill, female raters in the non-fertile part of their cycle, female raters in the fertile part of their cycle, and male raters’.

In effect, when using the responses from all 816 raters, Postma could not find a strong correlation between the riders’ attractiveness and their performance. Therefore, he sub-divided the data. The raters had also been asked whether they were men or women and about their sexual preferences. Furthermore, the women were asked about their menstrual cycles and whether they took oral contraceptives. By sub-dividing his raters’ ratings, he now finds that ‘there was no significant difference between men and pill-using women’ – but that their ‘slope’ was ‘significantly weaker’ than the one that he obtained from plotting just the results for the non-pill using women.
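As a general illustration of why sub-dividing data invites caution, the short Python sketch below computes the chance of at least one ‘significant’ result arising by luck alone when several comparisons are run. It is not a reconstruction of Postma’s analysis: the assumption that four rater groups are compared pairwise, that the comparisons are independent and that a 5% significance threshold is used are ours, chosen only for illustration.

```python
from math import comb

# Illustrative assumptions (ours, not taken from Postma's paper):
n_groups = 4    # pill users, non-fertile, fertile, male raters
alpha = 0.05    # conventional significance threshold

# Number of possible pairwise comparisons between the groups
n_comparisons = comb(n_groups, 2)

# Chance that at least one comparison looks 'significant' by luck alone,
# assuming the comparisons are independent (a simplification)
p_any_false_positive = 1 - (1 - alpha) ** n_comparisons

print(f"Pairwise comparisons: {n_comparisons}")
print(f"Chance of at least one false positive: {p_any_false_positive:.1%}")
```

Under these assumptions, the risk of a chance finding is several times higher than the nominal 5% for a single test.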

In his article, and especially in his official press release and answers to journalists’ questions, Postma clearly thinks he has found something meaningful. “Attractive riders are, therefore, faster”, summed up Postma in the University of Zurich press release. He told the BBC that “studies find that women using the pill have a reduced preference for masculine faces, and we found the same phenomenon: women on the pill had a reduced preference for faster cyclists” (BBC News 5/02/2014).

But what exactly is Postma measuring?

In his article, there are surprisingly few details about the questionnaire that was given to his 816 participants. Not that there is a lack of paperwork. Postma’s final report may be only 4 pages long, but he has provided 21 pages of online supplementary material plus (hidden in the small print after the Acknowledgements) there is a link to a data file containing his survey results, tabulated across 19 columns, that fills a whopping 644 pages!

Unfortunately, there are no examples of the photographs used in the survey, so we can’t give our own assessment. Furthermore, we can’t identify individual cyclists from the survey results since Postma has given them each a “unique identifier” that only he knows. He has also statistically converted all of their heights, weights and performance data into his own “relative” measurements – presented to 14 decimal places!

But Postma has simply retrieved his experimental materials over the internet from the official Tour de France site: “I use data from elite professional cyclists that finished the 2012 Tour de France.”

Faces on photos

To measure attractiveness, he took 80 portraits of riders “taken on the day before the start of the race”. These portraits “showed the head, neck and part of the shoulders.”

Postma told the BBC that: “If we took the 10% best riders and compared their performance to the 10% worst, we found the best were on average 25% more attractive than the worst ones.” With 80 riders in the survey, this means the 8 best performers were rated, on average, 25% more “attractive” than the 8 worst.

If you look at the portrait photos of the riders, arranged by teams on the 2012 Tour de France site, you can see they are indeed quite a varied bunch.

In the scientific paper, readers must rely on Postma’s assurances and statistical manipulation. However, in response to pressing demands from journalists, Postma provided far more useful information. He gave them a list of the Top 10 riders! And the Swiss newspaper, ‘Tagesanzeiger’, not only prints the list, it also shows the Top 10 photographs side-by-side.

With this list, you can go back to the Tour site and look up all of the riders’ details in order to judge their qualities for yourself. For a start, it should be noted that the Top 10 came from 8 teams but that 3 riders came from one team – ‘BMC racing’: the 1st, 5th and 10th rated riders. Is there a correlation between being in ‘BMC racing’ and being ‘attractive’? Well, as Postma says, we’re looking at their “head, neck and part of the shoulders”. All of the riders are wearing cycling jerseys – these have smart collars that hug the neck and distinct colours on the shoulders. Could the raters have been influenced by the jersey? ‘BMC racing’ have a black collar and black and red jersey.

What about their faces? Even Postma has recognised that smiling may affect how the riders are rated. In his Supplementary Figure S2, he shows there is a statistically significant correlation between riders who smile and their higher ‘likeability’ score.

And, sure enough, in the Top 10 photos, 8 of the Top 10 are smiling – hardly a major feat of physical endurance? Postma admits that “the only other rider-specific variable that affected his likeability score was his facial expression, with smiling riders being rated as significantly more likeable”. However, Postma insists that smiling does not correlate to the undefined quality of ‘attractiveness’.

Another obvious point about the photos is the non-rider-specific presence of the Tour de France advertising stand behind the riders. This shows a mixture of coloured texts on a white background. The Top-rated rider is flanked by ‘Le Tour’ in black geometric squiggles to his right and nicely positioned over his left shoulder is the logo for the Province de Liege (where the Tour started) which features a white lion on a red background. In fact, if you look at the Top 10 photos side-by-side, it could above all be argued that the winner, Amael Moinard, has the best-composed photograph. One can only guess at what the photos of the Bottom 10 looked like!

Performance ratings?

Postma gives over several pages of his supplementary material to explaining his performance rating of the 80 chosen riders. But why? We already know where they were placed at the end of a grueling 3-week race that saw them cycling up incredibly steep mountain tracks, whizzing through innumerable villages, trying their luck breaking away from the ‘peloton’, and sprinting at stage finishes.

The race had a total length of 3,496.9 km. It was won by Bradley Wiggins in 87 hours 35 minutes at an average speed of 39.9 km/hr. Yet, Postma insists on “quantifying performance” by adding the results of three individual time trials (that represented a total of 100 km) and then gives each of these equal weighting to the Tour itself. Once more, individual details are absent from his statistical manipulations that present the relative rider performances to 14 decimal places.
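The figures quoted above can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers given in the text (the 100 km time-trial total is the approximate figure stated there) and shows how small a share of the race distance the three time trials represent.

```python
# Figures quoted in the article
total_km = 3496.9                # total race distance
winner_hours = 87 + 35 / 60      # winning time: 87 hours 35 minutes
time_trial_km = 100.0            # approximate total of the three time trials

# Winner's average speed over the whole race
average_speed = total_km / winner_hours

# Share of the total distance covered in individual time trials
time_trial_share = time_trial_km / total_km

print(f"Average speed: {average_speed:.1f} km/h")
print(f"Time trials as share of distance: {time_trial_share:.1%}")
```

The time trials make up under 3% of the total distance, yet each one is given the same weight as the Tour itself.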

Why not simply look at their final position in the overall Tour de France? Thanks to the Top 10 list sent to journalists, we can see that the ‘most attractive’ rider (Amael Moinard) came 45th and the second best (Yann Huguet) only came 138th. So they weren’t the highest performing Top riders after all. In fact, Postma excluded the winners of the Tour from his survey. He told the BBC that Bradley Wiggins (no.1), Chris Froome (no.2 and subsequent Tour winner in 2013) and the other high-performing members of ‘Team Sky’ were excluded because they wore sunglasses in their pre-Tour photographs. The cyclist as model for prehistoric hunter does not apparently extend to sunglasses!

But is this really a fair comparison? The Tour de France is one of the toughest sporting events in the world. Anybody who can complete the race is a top athlete by any standard. But comparing 80 of the world’s best cyclists still seems unlikely to fulfil Postma’s criteria. Just how much variation in the assessment of performance does he think is there to be detected? Wouldn’t it be better if he had designed his experiment to include people who were not particularly athletic, or who were sprinters rather than marathon runners, or who had health problems hidden from view, or who were pretty-faced models, or good smilers, or… There are no controls and besides, how clearly defined are the criteria of ‘attractiveness’ and ‘masculinity’? 28% of the raters were men, yet they apparently scored the same as women taking the contraceptive pill?

“We don't know what people are picking up in the faces that is signalling the riders’ performance,” Postma admitted to the BBC. But this doesn’t stop him from speculating that women have been programmed by evolution to find attractive men who are physically talented. “The endurance in a man is a big plus for women”.

Unbiased questionnaire?

How scientific were Postma’s raters in their judgements of the riders’ faces? Everything was done over the Internet. Postma tells us his questionnaire was online at a commercial survey site (‘FluidSurveys’). He found his survey participants by ‘posting’ an announcement on two mailing lists for evolutionary biologists: ‘Oikos-listan’ and ‘EvolDir’. The former is ‘subscriber-only’ but hunting through the archives of the latter reveals Postma’s ‘post’ dated 01/08/2012: ‘Human sexual selection survey’ (EvolDir August 1, 2012 Month in Review, p.50-51).

When reading this, it appears Postma never thought about possible experimental bias if participants were already informed of the desired result before beginning the experiment. Could they be in any doubt when he asks: ‘Do nice guys finish first, or are looks deceiving?’ then goes on to explain that his study is “investigating the relationship between looks and performance. You will be presented with the portraits of professional cyclists that have taken part in the 2012 Tour de France. You will judge how attractive, masculine and likeable you find each one of them, based on their looks alone.”

Postma also fails to mention that he offered an incentive to participants: “Furthermore, I am giving away a 100 Euro Amazon gift voucher to a randomly selected participant.” One wonders about his statistical definition of ‘random’.

Perhaps, in the end, this scientific article says less about human evolution than about the evolution of lazy scientists, happy to exploit internet ‘data’ that they can statistically ‘process’ and regurgitate in some publishable, but not very meaningful, form. Hardly a measure of endurance, and not particularly attractive either.

 

Photo: Tour de France




Last Changes: 07.31.2014


