Close your Eyes and Listen to Tumour Music
(May 9th, 2017) Gene expression profiles often look boring and confusing. Perhaps a little music could make data presentation and analysis all the more enjoyable.
Some presentations on high-throughput genomic data look like someone crashed a motorcycle into a paint store. Some time ago, your Lab Times reporter witnessed a speaker show endless slides of gene expression profiles on matrices in neon green and garish red, the shades of green and red representing upregulated and downregulated genes in cancer cells, respectively. To be honest, it looked a real mess. And imagine if you were one of the eight percent of men or 0.5 percent of women with colour blindness. Technically, you might also be “visually impaired” and unable to recognise complex patterns in colourful rows and columns if you are not from the field or if you decide to print the figures in black and white. Your Lab Times reporter wondered whether complex gene expression data could be represented in a more intuitive manner. The answer is, of course, yes.
Martin Staege at the Martin Luther University Halle-Wittenberg, Germany, proposed an alternative form of information display, based on “musically-interpreted gene expression data”. The idea of transforming a sequence of nucleotides or amino acids into musical notes has been pioneered by some labs in Europe. Staege, however, has taken the concept to a new scale by providing an automated computer tool called Gene Expression Music Algorithm (GEMusicA), “a method for the transformation of DNA microarray data into melodies that can be used for the characterisation of differentially expressed genes,” he explains.
Using his tool, Staege was able to create musical representations of gene expression data from samples of small round blue cell tumours, such as neuroblastomas and Ewing sarcomas, and from Hodgkin’s lymphoma cell lines as well as normal B cells. The expression measurements in these cell lines, using, for instance, an Affymetrix Human Exon array, involve several hundred thousand probes. The Hodgkin’s lymphoma profile alone contained 1,411,399 probe sets. GEMusicA easily crunched this large number of individual measurements and, after some filtering of regions of low variability, the algorithm delivered standard pianoforte melodies that emphasised the differences in the expression profiles. In some cases, Beethoven’s “Ode to Joy” from the 9th symphony or the ever-controversial Wagner’s “Ride of the Valkyries” served as templates. With this set-up, minute details, like the mutually exclusive expression of the v-myc avian myelocytomatosis viral oncogene homolog (MYC), could be identified as notes at very high frequencies.
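For readers who like to tinker, the general idea of filtering out low-variability probes and turning the rest into notes can be sketched in a few lines of Python. This is not GEMusicA itself: the function names, the variance threshold, the toy expression values and the linear mapping onto MIDI note numbers 48 to 84 are all assumptions made purely for illustration, and the template melodies mentioned above are left out entirely.

```python
from statistics import pvariance

def filter_variable_probes(profiles, min_variance):
    """Return indices of probes whose variance across samples is at least min_variance.

    `profiles` maps a sample name to a list of expression values,
    one value per probe, in the same probe order for every sample.
    """
    samples = list(profiles.values())
    n_probes = len(samples[0])
    keep = []
    for i in range(n_probes):
        across_samples = [sample[i] for sample in samples]
        if pvariance(across_samples) >= min_variance:
            keep.append(i)
    return keep

def to_midi_notes(values, lo, hi, low_note=48, high_note=84):
    """Linearly map expression values in [lo, hi] onto MIDI note numbers."""
    if hi == lo:
        # No spread at all: play everything on the middle note.
        return [(low_note + high_note) // 2] * len(values)
    span = high_note - low_note
    return [low_note + round((v - lo) / (hi - lo) * span) for v in values]

# Toy data (invented): four probes measured in two samples.
profiles = {
    "tumour": [2.0, 9.5, 5.0, 7.0],
    "normal": [2.1, 3.0, 5.1, 12.0],
}

# Hypothetical threshold: drop probes that barely differ between samples.
keep = filter_variable_probes(profiles, min_variance=1.0)

# Use one shared scale so the two melodies are directly comparable.
kept_values = [p[i] for p in profiles.values() for i in keep]
lo, hi = min(kept_values), max(kept_values)

melodies = {
    name: to_midi_notes([p[i] for i in keep], lo, hi)
    for name, p in profiles.items()
}
```

In this toy example, only the two probes that differ clearly between the samples survive the filter, and a higher expression value simply becomes a higher note, so the "tumour" and "normal" melodies differ exactly where the profiles do.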
“We demonstrate that this Gene Expression Music Algorithm (GEMusicA) can be used for discrimination between samples with different biology and for the characterisation of differentially expressed genes,” Staege states. In one of his latest papers, GEMusicA helped characterise the Ewing sarcoma stem cell signature.
The main drawback of this approach, of course, is the need for a functional auditory sense. It does, however, represent a concrete and joyful alternative to displaying huge amounts of information in confusing rows of red and green. Future studies might involve optimisation of labelling because, at this point, the names of the audio files are inappropriate for radio marketing. But who knows, maybe the next summer hit will be based on a tune from a neuroblastoma expression profile called “9SKNMC2228st”.