A new Science(ability)?

(November 2nd, 2015) Big Data, Innovation, Personalised Medicine and co. – Are these the hallmarks of a new science(ability) in medicine? An essay by Gerd Antes, Freiburg.

If one is to believe what one reads in editorials, comments or opinion articles in scientific journals, then we are at the beginning of a golden age for patients and healthy people alike. Patients will be diagnosed much earlier and more accurately, and then treated precisely, efficiently and free of side effects by personalised medicine. Healthy people are not at risk at all, because perfect preventive healthcare will protect them from becoming sick in the first place.

This is achieved with "systems medicine", through which the biological mechanisms of pathogenesis (disease development) may be better understood by using methods from "omics" research, systems biology, computer science and network theory – understanding which, in turn, is supposedly transferred automatically, through "translation", into personalised or individualised medicine. "Big Data" is omnipresent as a universal tool; obstacles and barriers are almost non-existent. All that is required is unlimited computing power, unrestricted data acquisition and its storage in limitless clouds, as well as free access to the data and the resulting journal articles – Open Access makes it possible. That the necessary resources for all of the above will be available goes, of course, without saying.

This Land of Cockaigne contrasts, however, with the dark, conservative world of science, where distortions, aberrations, failure and waste are commonplace and thus the rule rather than the exception. Weak points are found throughout the entire process of knowledge generation – from the first ideas and fundamental research through to crucial clinical trials in humans and the implementation of the knowledge gained. This, among other things, was described in January 2014 in the impressive special series "Research: increasing value, reducing waste" in The Lancet, in which five articles dealt with the deficits in the processes of medical science and its application from different perspectives, and suggested how to overcome or at least reduce them. The focus was on how more gain and less waste could be achieved in health research and patient care.

"The highly-acclaimed 'Translation' does not work"

Some deficiencies are particularly noticeable because, on the one hand, they have significant adverse effects and, on the other, can be studied relatively easily – with the result that they ultimately have to be accepted as proven beyond reasonable doubt. The greatest source of waste lies in the systematic ignoring of existing knowledge during the planning, implementation and reporting of the next research steps. Hence, the highly acclaimed "translation" functions as poorly in application as it does within the scientific process.

Particularly alarming is that this failure frequently occurs in clinical trials. One would actually assume that clinical trials are planned purposefully and carefully, aimed specifically at finding answers where there is a relevant gap in knowledge – answers which, in turn, provide an essential basis for diagnostic or therapeutic decisions. First of all, the crucial question should, of course, be whether identical or similar studies already exist. But far from it – as a wealth of thorough methodological studies over recent decades has shown. Scientists, ethics commissions and authorities, one after the other, persistently consider only existing studies from the recent past and conveniently forget everything prior to that, in a kind of collective amnesia.

Ethics commissions in particular should, at this point, already be in a high state of alarm and striving for improvement, since every needlessly repeated study is unethical and any such application should be rejected outright. Again, the reality is sobering: in general, ethics commissions are in a position neither in terms of their authority nor of their competence to verify whether an application is justified in the light of existing studies. Funders of trials are in the same boat: they have the possibility, if not the duty, to make the systematic checking of existing studies a prerequisite for financially supporting a study.

Collectively turning a blind eye is made easier by a publication system that, in terms of the completeness of reported results, does not deserve its name. A multitude of studies in recent decades shows a globally consistent publication rate of around 50 percent. In other words, in addition to the 20,000 randomised clinical trials that are recorded annually in Medline, there are a further 20,000 that are never published and thus cannot serve as a knowledge base for decisions. The effects are literally fatal. Since those "vanishing" trials are not a random selection but on average tend to show unfavourable results, the visible part of the study results leads to a significant overestimation of the respective therapies or diagnostic methods – and thus to chronic over-optimism.

"Apparently, more data will solve every problem."

It is not easy to identify just how great this bias is. Trials with antidepressants, for example, show an overestimation of between 20 and 50 percent. The situation is similar for potential side effects and harms; here the risks are often left out of reports, so that the benefit/harm ratio, on average, is greatly overestimated.
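The mechanism by which an outcome-dependent publication rate of around 50 percent inflates apparent effects can be illustrated with a short simulation. This is a hypothetical sketch with invented numbers (the true effect, the noise level and the simple "publish the better half" rule are all assumptions for illustration), not data from any real trial programme:

```python
import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.10   # modest true benefit of a hypothetical therapy
N_TRIALS = 2000      # number of simulated trials
NOISE_SD = 0.20      # sampling noise of each trial's effect estimate

# Each trial observes the true effect plus random sampling error.
estimates = [random.gauss(TRUE_EFFECT, NOISE_SD) for _ in range(N_TRIALS)]

# Selective publication: trials with more favourable estimates are the
# ones that appear in the literature. This crude rule publishes exactly
# half of all trials, echoing the ~50 percent rate cited in the essay.
published = [e for e in estimates if e > statistics.median(estimates)]

print(f"true effect:              {TRUE_EFFECT:.3f}")
print(f"mean of ALL trials:       {statistics.mean(estimates):.3f}")
print(f"mean of PUBLISHED trials: {statistics.mean(published):.3f}")
```

The mean over all trials recovers the true effect, while the mean over the published half lies well above it – the visible literature overestimates the therapy even though every single trial was conducted correctly.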

So much for a brief excerpt from the dark side of the deficits.

In general, we seem to be facing a period of transition – between the "Old World" of science, which seems overly preoccupied with its own deficits in quality, and a "New World", in which we are spared the self-tormenting confrontation with the shortcomings of scientific processes. How, then, can the optimistic prospect of a golden future on the one side be reconciled with the grim view of present-day empirical evidence on the other? Quite simply: by selling the "Golden Future" as science while at the same time resolutely relinquishing the usual quality standards of science, and instead being content with promises, good faith and anecdotal "evidence". Consequently, today's scientific principles, summarised under catchphrases such as "integrity of science" or "Good Scientific Practice" and presented on the websites of most medical faculties and major research institutes, ought to be firmly ignored, as they would rapidly tarnish the bright prospects and dampen optimism. Error – essentially the basic element of science and research – will simply not occur. In future, evaluation, validation, plausibility checks and quality assurance will no longer be needed – which of course makes life significantly easier.

This contrast – between the boring, time-consuming, never-ending struggle for greater quality in science on the one hand, and a future unburdened by it on the other – is particularly striking when you take a look at the generous promises and visions of the future that "Big Data" offers. It begins with its definition, which on the one hand is rather dazzling (Wikipedia: "mass data, which cannot be evaluated using classical methods") but at the same time reveals "Big Data" as a buzzword subject to constant change – which, in turn, allows much room for interpretation and makes it hard to pin down. Big Data is simply everything – the data itself, the complex technologies used for collecting and evaluating the data, as well as the methods by which the data cow can be milked.

While on the technological side progress towards ever higher processing performance through continued development is not surprising, the methodological claims can only be followed with sheer astonishment. The all-dominant creed is the belief in correlation as the sole carrier of information. The age of causality is over; we are now in the new era of correlation. Any search for explanations is unnecessary and a waste of resources – let us simply trust in the power of data and its correlations. That correlations can indeed reveal connections, but with respect to causality can also be extremely misleading, is mentioned only in passing and is regarded as a problem only when not enough data is available. Hence, "more data" solves every problem.

"Every needlessly repeated study is unethical."

The fact that a simultaneous decline in stork populations and birth rates does not mean that storks causally deliver babies is bound to be well known. What hundreds of lecturers have instilled into generations of young students through this example, the advocates of Big Data sweep away in one fell swoop. All it takes is a mere reference to the new philosophy that data is everything.
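The stork example can be reproduced in a few lines: a hidden confounder drives both variables, producing a strong correlation without any causal link between them. The "rurality" score per district and all coefficients below are invented purely for illustration:

```python
import random

random.seed(7)

# A hypothetical confounder: how rural a district is. In this toy model,
# rural districts have both more storks (habitat) and higher birth rates
# (demographics) - but storks never influence births.
N_DISTRICTS = 500
rurality = [random.random() for _ in range(N_DISTRICTS)]

storks = [10 * r + random.gauss(0, 0.5) for r in rurality]
births = [5 * r + random.gauss(0, 0.5) for r in rurality]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Storks never appear in the birth formula, yet the correlation is strong:
print(f"correlation(storks, births) = {pearson(storks, births):.2f}")
```

The correlation is close to 1 even though removing every stork would change nothing about the birth rate – exactly the distinction between correlation and causation that "more data" cannot resolve.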

When searching for evidence for this completely game-changing claim, however, very little can actually be found. The reasoning is almost always restricted to examples in which Google, Amazon, Yahoo and other corporations have made the right decisions with respect to the central criterion of increasing sales. The fact that this "evidence" is based on individual examples, and is thus almost entirely anecdotal – incidentally, a key criticism levelled at orthodox methodology – is a further demonstration of how Big Data either ignores or thwarts the principles of the scientifically established methodological approach developed over many decades.
Germany, with its usual tardiness, has finally taken up the discussion. One can sense the urge to catch up with this progress as fast as possible (even if, according to Gartner's Hype Cycle, Big Data in medicine has already passed the peak of its expectations) – and one can also witness large-scale naivety and gullibility in accepting the claims of Big Data's success story, or even in seeing Big Data where there is none. The fear of being left behind appears to outweigh everything else. Although healthcare is named as one of the sectors with the greatest potential for Big Data, exactly where this potential lies is not usually revealed. Instead, the mantra is adopted that Big Data is simply good per se!

The examples given usually involve huge amounts of data (such as that of the German National Cohort). These data are, however, usually well structured and thus complex, yet they are conventionally recorded and evaluated – hence they cannot be regarded as Big Data. The often ignored difference lies simply between meticulously planned and structured data acquisition and data management on the one hand, and access to arbitrary data and its linkages on the other. Particularly in medicine, little joy may await the advocates of Big Data in Germany because, in this country, the legal restrictions on data access and uncontrolled data usage already prevent the fulfilment of Big Data's promises at this level.

"In the big data future, any form of stringent methodology will be disruptive."

The systematic hyping of Big Data is, however, at the same time the basis for the belief in its infinite, inherent potential. That this comes hand in hand with an astonishing inability to accept criticism is not necessarily a surprise. What is surprising is that even within the scientific community there emerges a willingness to compromise on good scientific quality. The literature shows that Big Data is more of a marketing concept than a scientifically sustainable vision of the future. This is particularly evident in the debate on the limits of, and potential harm from, Big Data. Possible erroneous conclusions are deliberately ignored. Because validation, gold standards and similar methodological cornerstones are not required, the corresponding methodological limitations rarely surface.

It is undisputed, however, that Big Data needs unrestricted access to data in order to achieve its full potential. It is conceded that this would be tantamount to a general attack on the privacy of individuals, institutions and companies. Hence, the proposed solution is all the more surprising: responsibility for the data simply has to be transferred from the owner to the user – and the doors to the "Big Data future" will open (as promised in "Big Data: The revolution that will change our lives"). This ceaseless demand either stems from absolute naivety or is driven by the desire to permanently weaken data protection and privacy. If one looks at examples of the use of Big Data, the latter usually appears to be the case: gaining more and more personal data for commercial purposes. And thus it goes without saying that every form of stringent methodology is disruptive.

The hype in favour of the "brave new world" is supported by the sweeping demand for "innovation". Its mantra-like, ever-present repetition suggests that it already takes a certain amount of courage to announce an event, present an article on a new method or issue a press release about the development of a new procedure without using the epithet "innovative" ("innovation" has 391 million Google hits). In medicine and healthcare, however, sharp conflicts arise around this term. The mainly economically driven demand for innovation does not automatically mirror medical benefit. On the contrary: particularly for new diagnostic or therapeutic procedures or medical devices, the new must in fact prove its superiority over the well-established, or at least be on a par with it. The approval process regulates this for drugs as well as for non-drug interventions – for medical devices this is, to date, not the case.

"The economically-driven demand for innovation actually does not automatically mirror medical benefit."

Whoever hampers the introduction of new methods, or the pursuit of more revenue and profit, by persistently applying established methodology will quickly be branded a hindrance to innovation. A textbook example was described in the Süddeutsche Zeitung on September 5, 2015 under the title "Systemfehler" ("System Error"). A law intended to prevent bogus innovations "failed" in the case of a particular epilepsy drug from which, supposedly, many patients could have "benefited" – simply because the authorities remained unconvinced that the drug had a clear added benefit, so it would not generate an appropriate revenue for the manufacturer.

The usual ingredients were all involved: descriptions of individual patients; disregard for the benefit/harm ratio; the claim that this illness is something special and that, therefore, the common methodology ought not to be applied in this case; the statutory rulebook cast as preventing innovation. For in the latter case it is not a matter of a drug being prevented, but rather of the threat that it could disappear because, after this verdict, the manufacturer can no longer demand the targeted price. Innovation is apparently being prevented by methodologically derived rules when, in fact, it boils down to simple economics.

On the scale of exaggerations, personalised or individualised medicine also deserves a place at the top. Headlines such as "personalised medicine for innovation strategy" and similar phrases show how such terms mutually embellish one another on the hype scale. "Big Data and medical technology for personalised medicine" is another example of such a combination. Google is very helpful for checking the existence and frequency of such combinations. The result is overwhelming – every conceivable combination exists.

According to this, the connection to Big Data runs mainly along the "omics" lines. The largely naive expectation that, via the identification of genetic constellations, a clear connection to disease symptoms could be found and, with this knowledge, therapeutic approaches easily developed has meanwhile given way to great disillusionment. Above all, the complexity of the relationships between genetic basis and symptoms has led to a massive expansion of computing capacity for bioinformatics. If one adds the rapidly developing gene-sequencing industry, the desire for Big Data is understandable here, too. In fact, large amounts of data are already available, but only strictly structured data – hence not Big Data in terms of its definition and requirements.

But back to personalised medicine. The term frequently appears to be a synonym for relaxing methodological requirements. Tailoring a therapy to the individual patient is an understandable desire, whose fulfilment is perhaps conceivable in the distant future. At present, however, realisation means developing therapies for limited groups of patients with similar needs. Thus one is directly confronted with the problem of no longer being able to routinely apply established study methodology, owing to a shortage of suitable patients. The temptation to slacken the requirements for proof of efficacy and safety is therefore certainly great. In addition to the objective challenges facing methodology, one also detects the deep desire for a certain, deterministic and truly personalised medicine, instead of having to deal with biological variability, populations, uncertain statements and the necessary statistical analyses. It is obvious that this will not succeed.

"Personalised medicine appears frequently as a synonym for relaxing methodological requirements."

Hence we observe an increasing polarisation between self-critical reflection, especially in the life sciences and medical research, on the one hand, and the promise of salvation through the revolution that, among other things, Big Data is to bring, on the other. The continuation of the above-mentioned Lancet initiative at the REWARD/EQUATOR conference last September in Edinburgh, as well as related activities by Nature and Science together with the US National Institutes of Health (NIH), show just how seriously and widely this crisis is now being taken.

On the other hand, the pressure on science is mounting due to increasing economisation, especially in medicine – in particular through the growing gap between increasing biologisation on the research side and the realities of patient care. If, in the end, this leads to a significant erosion of methodological quality, it would certainly be detrimental to both sides. In the short term, quickly learned lessons and "innovations" might make an impression; in the long term, however, there appears to be no alternative to rigorous, scientifically derived principles – from a quality as well as an economic perspective.


Gerd Antes is a biostatistician and director of the German Cochrane Centre at the University Hospital Freiburg.

(The essay first appeared in German in Laborjournal 10/2015 Illustration: Tim Teebken)

Last changed: 12.11.2015