Angiotensin-converting enzyme 2 (ACE2) is a protein found on endothelial cells that line our arteries, lungs, and intestines. ACE2 sits on the cell surface and spans the cell membrane, which makes it a suitable entry point for viruses. ACE2 is known to be the main entry point of multiple coronaviruses, including SARS-CoV-2.
Click the play button to find out what the protein sounds like.
Included below is a 3D ‘ribbon’ view of the protein.
Like other coronaviruses, SARS-CoV-2 consists of an RNA genome surrounded by a membrane containing proteins. One of these proteins, the spike glycoprotein, is responsible for binding to proteins on the surface of human epithelial cells, leading to infection. The virus is then able to use cellular machinery to produce virions that may reach other target cells for infection. Click the play button to find out what the SARS-CoV-2 spike protein sounds like.
To develop treatments against the virus and take away its virulence, studying and targeting the spike protein is paramount.
Included below is a 3D ‘ribbon’ view of the protein.
Each amino acid, which possesses distinct physicochemical properties, is represented by a different color. The Venn diagram below illustrates the color encodings for similar amino acids.
A table of colors with their corresponding amino acids is included below.
Amino Acid | Symbol | Color |
---|---|---|
Proline | P | |
Asparagine | N | |
Aspartate | D | |
Serine | S | |
Glutamine | Q | |
Glutamate | E | |
Arginine | R | |
Methionine | M | |
Isoleucine | I | |
Leucine | L | |
Phenylalanine | F | |
Tyrosine | Y | |
Tryptophan | W | |
Lysine | K | |
Histidine | H | |
Alanine | A | |
Glycine | G | |
Valine | V | |
Cysteine | C | |
Threonine | T | |
In addition to the primary sequence, this representation of proteins encodes secondary structure as the duration of the notes. The table below shows the mappings.
Secondary structure | Duration |
---|---|
β-sheet (all types) | 1.0s |
helices (α and others) | 0.5s |
random coil and unstructured | 2.0s |
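The duration mapping in the table can be sketched as a simple lookup. This is a minimal illustration, not the code behind this page: it assumes the secondary structure is given as a string of DSSP-style one-letter codes (`E`/`B` for β structures, `H`/`G`/`I` for helices, anything else treated as coil), and the names `DURATION_BY_CODE`, `note_durations`, and `total_duration` are hypothetical.

```python
# Durations in seconds, taken from the table above.
SHEET, HELIX, COIL = 1.0, 0.5, 2.0

# Assumption: DSSP-style one-letter codes label each residue's structure.
DURATION_BY_CODE = {
    "E": SHEET, "B": SHEET,               # beta strand, beta bridge
    "H": HELIX, "G": HELIX, "I": HELIX,   # alpha, 3-10, and pi helices
}

def note_durations(ss_string):
    """Return one note duration per residue; unknown codes count as coil."""
    return [DURATION_BY_CODE.get(code, COIL) for code in ss_string]

def total_duration(ss_string):
    """Total playing time in seconds if notes are played back to back."""
    return sum(note_durations(ss_string))
```

For instance, `note_durations("HHE-")` gives two short helix notes, one sheet note, and one long coil note, so helical stretches play quickly while unstructured regions linger.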
Each amino acid has been assigned a chord that plays whenever that amino acid is encountered. Starting one octave below middle C, the chord assignments (with three-letter amino acid codes) are as follows:
Trp-C, Met-D, Pro-E, His-F, {Tyr-G (RP), Phe-G (FI)}, {Leu-A (RP), Ile-A (FI)}, {Val-B (RP), Ala-B (FI)}, Cys-C, Gly-D, {Thr-E (RP), Ser-E (FI)}, {Gln-F (RP), Asn-F (FI)}, {Glu-G (RP), Asp-G (FI)}, {Arg-A (RP), Lys-A (FI)}.
Here RP is the root position of the chord and FI is the first inversion. For example, cysteine is C major.
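As a rough sketch of how these assignments could be turned into notes, the snippet below maps each three-letter code to MIDI note numbers. Several things here are assumptions rather than facts from this page: the list is read as ascending through two octaves (so Trp is the C below middle C and Cys is middle C), every chord is treated as a major triad (the text only confirms this for cysteine), and the names `ROOTS`, `FIRST_INVERSION`, and `chord` are hypothetical.

```python
C3 = 48  # MIDI note one octave below middle C (middle C is 60)

# Semitone offset of each chord root above C3, following the listed order.
# Assumption: the list ascends, so the second C (Cys) is middle C.
ROOTS = {
    "Trp": 0, "Met": 2, "Pro": 4, "His": 5,
    "Tyr": 7, "Phe": 7, "Leu": 9, "Ile": 9, "Val": 11, "Ala": 11,
    "Cys": 12, "Gly": 14, "Thr": 16, "Ser": 16, "Gln": 17, "Asn": 17,
    "Glu": 19, "Asp": 19, "Arg": 21, "Lys": 21,
}

# Residues marked FI in the list; all others use the RP voicing.
FIRST_INVERSION = {"Phe", "Ile", "Ala", "Ser", "Asn", "Asp", "Lys"}

def chord(residue):
    """Return the chord for a three-letter code as a tuple of MIDI notes."""
    root = C3 + ROOTS[residue]
    third, fifth = root + 4, root + 7  # assumed major triad
    if residue in FIRST_INVERSION:
        # First inversion: the root moves up an octave, the third is lowest.
        return (third, fifth, root + 12)
    return (root, third, fifth)
```

Under these assumptions, paired residues such as Tyr and Phe share the same three pitch classes and differ only in which note is lowest, which keeps chemically similar amino acids sounding related but distinguishable.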
Amino acid building blocks connect to form macromolecules (proteins) that support all life in one way or another. Their versatility and the vast number of possible sequence combinations lead to the formation of a great many structures with different useful properties.
Just as viruses exploit the properties of protein sequence and structure to infect their hosts, we can use the sequence and structural information gained by analyzing viral proteins to find their weak spots and remove the viruses' ability to bind to and infect target cells.
This visualization asks whether proteins can be represented effectively as music with accompanying abstract visuals and, if so, what value that representation adds when used alongside more traditional approaches. I think that sonic and visual representations that follow the sequence in order will support pattern recognition in protein sequences, helping to find and identify interesting regions.
Click one of the cards associated with proteins above to learn more about it and explore its properties through sound and color.