The primary goal of this work is to convert genome-encoded protein sequences into musical notes so that patterns in proteins can be heard. Previous efforts have attempted this, but one of the main problems is that a one-to-one assignment of amino acids to musical notes spans a 20-note range (2.5 octaves) and produces large jumps between consecutive notes. Other concerns include assigning rhythm, dynamics, and accompaniment according to the characteristics of the protein sequence.
We derived a reduced range of 13 base notes by ordering amino acids according to hydrophobicity and pairing similar amino acids. The two members of each pair are distinguished by variants of three-note chords, namely the root position and the first inversion. Rhythm is encoded into the musical sequence according to the organism’s codon distribution for the codons used in the genome-encoded protein sequence, so that each amino acid can be represented by different note durations. The result is a set of rules that can be applied to any protein sequence to produce a musical composition. As an example, we have used a prototype human protein, Thymidylate Synthase A (ThyA). A detailed description of our coding assignment can be found in the Project Evolution.
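The rules above can be illustrated with a minimal Python sketch. It is not the GENE2MUSIC implementation: the hydrophobicity scale (Kyte-Doolittle), the seven amino acid pairs, the choice of a C major scale with the least hydrophobic group on the lowest note, and the usage thresholds in `note_duration` are all assumptions made for illustration, since the actual coding assignment is described in the Project Evolution rather than here. Pitches are MIDI note numbers.

```python
# Kyte-Doolittle hydropathy values. ASSUMPTION: the text does not say which
# hydrophobicity scale was used; this is one widely used published scale.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}

# HYPOTHETICAL grouping: 7 pairs of similar residues + 6 singles = 13 base notes.
GROUPS = ["IL", "VA", "FY", "ST", "NQ", "DE", "KR", "G", "P", "C", "M", "W", "H"]

MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets of a major scale


def scale_pitch(degree, tonic=48):
    """MIDI pitch of a major-scale degree (0-based) above the tonic."""
    octave, step = divmod(degree, 7)
    return tonic + 12 * octave + MAJOR_STEPS[step]


def build_chord_table(groups=GROUPS, tonic=48):
    """Map each amino acid to a three-note chord (tuple of MIDI pitches)."""
    def mean_hydropathy(group):
        return sum(HYDROPATHY[aa] for aa in group) / len(group)

    # ASSUMPTION: least hydrophobic group gets the lowest base note.
    ordered = sorted(groups, key=mean_hydropathy)
    table = {}
    for degree, group in enumerate(ordered):
        root, third, fifth = (scale_pitch(degree + k, tonic) for k in (0, 2, 4))
        # First member of a group: triad in root position.
        table[group[0]] = (root, third, fifth)
        if len(group) == 2:
            # Second member of a pair: first inversion (root moved up an octave).
            table[group[1]] = (third, fifth, root + 12)
    return table


def note_duration(relative_usage):
    """Map codon usage (0..1) to a duration in beats. HYPOTHETICAL thresholds."""
    if relative_usage >= 0.5:
        return 1.0
    if relative_usage >= 0.25:
        return 0.5
    return 0.25


def sonify(protein, codons, codon_usage, table=None):
    """Return a list of (chord, duration) events, one per residue."""
    table = table or build_chord_table()
    return [
        (table[aa], note_duration(codon_usage.get(codon, 0.0)))
        for aa, codon in zip(protein, codons)
    ]
```

Calling `sonify("MK", ["ATG", "AAA"], usage)` with a codon usage dictionary for the organism of interest would yield two chord and duration events. Because paired amino acids share a base note and differ only in chord inversion, the melody stays within 13 base notes instead of 20.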
In addition to the primary goal, we aim to use this conversion to make protein sequences more approachable and tangible for the general public and children. The project also opens opportunities for visually impaired scientists to access protein sequences more readily. We present examples of several proteins translated into music by these methods, which can be listened to, and provide the opportunity for others to convert their own gene of interest using our GENE2MUSIC program.
- Takahashi R, Miller JH: Conversion of Amino Acid Sequences in Proteins to Classical Music: Search for Auditory Patterns. Genome Biology 2007.