Proteins, supercomputers, and drug discovery

Jul 15, 2023

“Everything that living things do can be understood in terms of the jiggling and wiggling of atoms.” (Richard Feynman)

Have you ever wondered how your body manages to function so seamlessly? How does oxygen get transported from the air we breathe to our cells? How do we digest the food we eat? And what protects us from harmful bacteria and viruses? The answer to these questions is proteins: complex molecules that act like tiny machines, performing the critical biological functions that keep us alive and well.

Did you know that an estimated 50 to 100 trillion proteins exist in the human body, each with a specific function? For example, hemoglobin is a protein that carries oxygen to the body’s cells. Digestive enzymes like pepsin and trypsin are proteins that break down food in our digestive tract. And antibodies like immunoglobulin A are proteins that help us fight off infections. But when proteins malfunction, serious health problems can follow. For example, abnormal proteins are implicated in diseases such as Alzheimer’s, a neurodegenerative disorder, and cancer. Keeping these tiny machines running is essential for good health. But we do not always stay healthy, especially when pathogenic bacteria and viruses attack our bodies.

Sometimes our immune system and its protein machinery are ineffective against deadly pathogens like the coronavirus. But the interesting thing is that proteins also play a critical role in the lives of these pathogens. Imagine if we could hack some of their proteins and make them useless: we could then shut down some functions of the pathogen's machinery. If such a function is crucial for its survival, the pathogen will die.

Take the Human Immunodeficiency Virus (HIV), which causes the life-threatening condition known as Acquired Immune Deficiency Syndrome (AIDS). One of the proteins that HIV needs to survive is called HIV protease; its structure is shown in Figure 1a. HIV protease cleaves a large protein chain into the pieces needed to form new virus particles (Figure 1b). It has ‘flaps’ that act as a gateway into its cavity. When the flaps are open, the large chain enters the HIV protease, and the red-colored portion is the site that cleaves it. This site is called the ‘active site’ of the protein. Scientists have developed drugs, such as Ritonavir, that bind to the active site of HIV protease and block the cleaving process, which inhibits the formation of new virus particles and stops the progression of AIDS. For example, Figure 1c shows the drug molecule bound at the active site, and Figure 1d shows the flaps closed over it, blocking the active site.

Discovering a new drug that can inhibit the functioning of a crucial protein in a pathogen is a complex and time-consuming process. The process of drug discovery involves identifying a set of candidate molecules that have the potential to bind to the target protein and inhibit its function. One of the major challenges in this process is the need to calculate the binding affinity between each candidate molecule and the target protein.

The binding affinity is a measure of the strength of the attraction between the drug molecule and the protein. To calculate it, we need to take into account the forces between hundreds of atoms in the drug molecule, thousands of atoms in the protein, and even more atoms in the water and salt ions that surround the protein. The scale of these calculations is so huge that regular computers would need thousands of months to complete them. This is where supercomputers come in. Supercomputers are massively parallel collections of thousands to hundreds of thousands of processors that perform many calculations simultaneously and can complete the work in a few days.
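To get a feel for that scale, here is a minimal sketch that sums a simple interaction energy over every drug-protein atom pair and counts how many pair evaluations a simulation would need. The functional form (Lennard-Jones plus Coulomb), the single `sigma` and `epsilon` values, and the atom and step counts are illustrative assumptions, not the force field or system used in the work described here; real simulation packages use per-atom parameters, distance cutoffs, and also include the surrounding water and ions.

```python
import numpy as np

# Electrostatic conversion factor in kJ mol^-1 nm e^-2 (the unit system GROMACS uses).
COULOMB_CONST = 138.935


def interaction_energy(pos_a, q_a, pos_b, q_b, sigma=0.3, epsilon=0.5):
    """Sum Lennard-Jones + Coulomb energy over all pairs between groups A and B.

    pos_* are (N, 3) coordinates in nm, q_* are (N,) charges in units of e.
    sigma (nm) and epsilon (kJ/mol) are single illustrative values for all atoms.
    """
    diff = pos_a[:, None, :] - pos_b[None, :, :]   # (Na, Nb, 3) separation vectors
    r = np.linalg.norm(diff, axis=-1)              # (Na, Nb) pair distances
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 ** 2 - sr6)
    coulomb = COULOMB_CONST * q_a[:, None] * q_b[None, :] / r
    return float(np.sum(lennard_jones + coulomb))


# Hypothetical system sizes, chosen only to illustrate the bookkeeping.
n_ligand_atoms = 100
n_protein_atoms = 5000
n_time_steps = 1_000_000

n_pairs = n_ligand_atoms * n_protein_atoms
print(f"pairs per snapshot: {n_pairs:,}")
print(f"pair evaluations over the run: {n_pairs * n_time_steps:,}")
```

Even this toy count ignores the solvent, which usually contributes most of the atoms, so the real workload is larger still.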

One of the commonly used simulation programs in drug discovery is GROMACS (GROningen MAchine for Chemical Simulations), an open-source molecular dynamics package that can be used to calculate the binding affinity between a drug molecule and the target protein. Simulations are not only faster and less expensive than experiments but also allow us to replay them, identify mistakes, and do better next time. This way, the chances of failure in actual experiments are significantly reduced.

However, one of the major challenges in studying the interactions between proteins and drug molecules is that proteins are not rigid: they constantly change shape. A single fixed structure may therefore not reflect the true binding affinity between a protein and a drug molecule. To address this challenge, our team at IIT Delhi developed a technique called “Linear Response driven Molecular Dynamics Simulation.” This technique has two aspects. The first is to predict how the protein's shape changes when a drug molecule binds. For example, when a drug molecule binds to the protein’s active site, the protein may change its shape to accommodate the drug molecule better, thus increasing the binding affinity. The second is to accelerate the relevant parts of the protein in the direction of their predicted movement, which reduces the simulation time required for the structural change. Together, these allow us to simulate the changes and predict the most likely binding mode between a drug molecule and a protein, even when the protein has a flexible structure.

Since we want the protein's shape to change correctly, the prediction, which is the first aspect, must be correct. So how do we predict the direction of the shape change? Consider a person holding a large stretched spring (see Figure 2). He will be under a lot of stress and will try to minimize it by changing his shape. The spring is a perturbation to the relaxed state of the person, just as drug binding is a perturbation to the native state of the protein. He can change his shape in several ways to reduce the spring-induced stress, e.g., bending forward, backward, or sideways. Some ways are rigid, like bending backward, and some are flexible, like bending forward. All of them reduce the spring-induced stress, but the rigid ways cause extra internal stress in his body. So the person will change his shape to minimize the total stress (spring-induced + internal), and he will likely bend forward or along another flexible way. Similarly, drug binding induces stress in the protein, which changes its shape along the direction that minimizes the total stress (drug-induced + internal).

For a person holding a spring, it is intuitive which way is flexible and which is rigid. But how do we identify the same for a protein? The key is to identify the intrinsic directions of motion along which the protein can change its shape. Different directions have different rigidity, and identifying them is technically called normal mode analysis. The drug molecule may apply force along both rigid and flexible directions. Under these forces, the protein will tend to change shape mostly along the flexible directions, as that causes the least internal stress.
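The idea can be sketched on a toy model. The example below is not the technique described in this article; it swaps in a simple elastic network model (beads joined by identical springs) as a stand-in for a protein, finds its normal modes, and then computes how the structure deforms under a force in the harmonic approximation, where each mode responds in proportion to the force along it divided by its stiffness. The helix geometry, the `cutoff` and `gamma` values, and all function names are hypothetical choices made for illustration.

```python
import numpy as np


def anm_hessian(coords, cutoff=1.5, gamma=1.0):
    """Hessian of a toy elastic network: equal springs join beads closer than `cutoff`."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(d, d) / r2
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hessian[si, sj] = block
            hessian[sj, si] = block
            hessian[si, si] -= block
            hessian[sj, sj] -= block
    return hessian


def flexible_modes(hessian, rel_tol=1e-8):
    """Normal modes sorted from most flexible to most rigid, rigid-body motions dropped."""
    eigvals, eigvecs = np.linalg.eigh(hessian)     # eigenvalues in ascending order
    keep = eigvals > rel_tol * eigvals.max()       # discard the zero-stiffness modes
    return eigvals[keep], eigvecs[:, keep]


def response_to_force(stiffness, modes, force):
    """Harmonic response: each mode moves by (force along mode) / (mode stiffness)."""
    amplitude = (modes.T @ force) / stiffness
    return modes @ amplitude, amplitude


# Toy "protein": 12 beads on a helix (arbitrary units).
turn = np.deg2rad(100.0) * np.arange(12)
coords = np.column_stack([0.5 * np.cos(turn), 0.5 * np.sin(turn), 0.15 * np.arange(12)])

hessian = anm_hessian(coords)
stiffness, modes = flexible_modes(hessian)

# Toy "drug" force: squeeze the two end beads toward each other.
pull = coords[-1] - coords[0]
pull /= np.linalg.norm(pull)
force = np.zeros(3 * len(coords))
force[:3] = pull
force[-3:] = -pull

displacement, amplitude = response_to_force(stiffness, modes, force)
dominant = int(np.argmax(np.abs(amplitude)))
print(f"{len(stiffness)} internal modes; mode {dominant} responds most strongly")
```

Because each amplitude is divided by the mode's stiffness, a force of the same size moves the structure further along flexible modes than along rigid ones, which mirrors the person bending forward rather than backward.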

To predict which flexible direction is best among all those available, we use advanced theories such as the linear response theory of statistical mechanics. For example, Figure 1d shows how the structure of HIV protease changed when it was bound to the drug, as predicted using our technique. You can see that the flaps, which are the gateways of this protein, are closed.

In summary, proteins are complex molecules that perform critical functions in the human body. The same is true for pathogenic bacteria and viruses, in whose lives proteins also play a critical role. By hacking these proteins and making them useless, we can stop some functions of the pathogen machinery, and if such a function is crucial for its survival, the pathogen will die.

One of the major challenges in computational drug discovery is the need to calculate the binding affinity between each candidate molecule and the target protein. However, proteins are highly flexible and can assume multiple shapes, making it difficult to predict their interactions with drugs. Our newly developed technique addresses this challenge by predicting the new shape of the protein, allowing for a more accurate binding affinity calculation. This is expected to increase the accuracy of identifying potential drugs and reduce the occurrence of false-positive predictions.

For any queries, email us at info@moleculeai.com.

Written by MoleculeAI
