August 15, 2019

Using Supercomputers and Machine Learning to Discover Defective Amino Acids that Cause Diseases

Tiny defects under high-tech observation could point to breakthroughs.

Many diseases, including cancer, diabetes and digestive disorders, are caused by malfunctioning ribosomes and proteins. In the human body, ribosomes read genetic instructions and assemble amino acids into proteins. A research team led by Narayana R. Aluru, Ph.D., M.S., of the Department of Mechanical Science and Engineering and the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign, is studying amino acids to help locate faulty amino acids and proteins.

Observed with high-tech tools, these minuscule defects could point the way to medical breakthroughs, according to experts.

“Many diseases are caused by the faulty reading of DNA in the ribosomes which leads to a faulty amino acid chain,” said Mohammad Heiranian, a Ph.D. candidate leading the research. “Our team is using nanopore-sequencing technology for protein detection to help determine single point mutations which can cause a variety of diseases. The goal is to identify the 20 essential amino acids with high precision and high resolution to aid in disease detection. Performing this research requires a fast, inexpensive way to identify the amino acids.”

“Our team uses supercomputers and machine learning (ML) to perform simulations in our amino acid research,” said Amir Taqieddin, another Ph.D. candidate. “Using supercomputers and ML provides a huge leap forward allowing our team to do experiments that are hard to do and run thousands of simulations, which would not be possible in our lab.”

The team used Stampede2, one of the most powerful supercomputers in the U.S. for open science research, located at the Texas Advanced Computing Center (TACC) at the University of Texas at Austin. On it they ran 4,293 simulations studying amino acids, covering 65 microseconds of molecular dynamics simulation, which is four orders of magnitude longer than a typical simulation time.

“Due to the large amount of data and computation required, this work would take approximately 100 to 200 years of processing on a laptop or 50 years on a cluster computer,” said Taqieddin. “Our team was able to perform over 4,000 amino acid simulations on Stampede2 in slightly over a month of computation time.”
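The figures quoted above can be put side by side with some back-of-the-envelope arithmetic. The sketch below is only a rough check, not part of the team's workflow. It assumes that the 65 microseconds is the aggregate over all 4,293 runs and that "slightly over a month" can be rounded to exactly one month, so the speedups it prints are approximate upper bounds. The function and constant names are illustrative.

```python
# Figures quoted in the article.
NUM_SIMULATIONS = 4293
TOTAL_SIMULATED_US = 65.0      # microseconds of molecular dynamics
LAPTOP_YEARS = (100, 200)      # estimated range on a laptop
CLUSTER_YEARS = 50             # estimate on a cluster computer

# Assumption: "slightly over a month" is rounded to exactly one month.
STAMPEDE2_MONTHS = 1.0


def average_length_ns(total_us, n_runs):
    """Average simulated time per run, in nanoseconds."""
    return total_us * 1000.0 / n_runs


def speedup(years_elsewhere, months_on_stampede2):
    """How many times faster the Stampede2 runs were than the estimate."""
    return years_elsewhere * 12.0 / months_on_stampede2


avg_ns = average_length_ns(TOTAL_SIMULATED_US, NUM_SIMULATIONS)
laptop_speedups = tuple(speedup(y, STAMPEDE2_MONTHS) for y in LAPTOP_YEARS)
cluster_speedup = speedup(CLUSTER_YEARS, STAMPEDE2_MONTHS)

print(f"Average simulated time per run: about {avg_ns:.1f} ns")
print(f"Speedup over a laptop: roughly {laptop_speedups[0]:.0f}x to {laptop_speedups[1]:.0f}x")
print(f"Speedup over a cluster: roughly {cluster_speedup:.0f}x")
```

Under these assumptions each run averages about 15 nanoseconds of simulated time, and a month on Stampede2 replaces work estimated at hundreds to thousands of months elsewhere.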

Discovery of Defective Amino Acids with Nanopore-Sequencing

The team uses supercomputers to simulate nanopore-sequencing technology, which allows parallel analysis of thousands of protein pores and can read a chain of DNA thousands of times. Biological nanopore sequencing uses transmembrane proteins called porins, which cross a cellular membrane and act as pores through which molecules can diffuse. The pores contain size-dependent porous surfaces, with nanometer-scale "holes" distributed across the membranes.

The nanopore has tiny holes, but most materials used in nanopore sequencing are too thick, meaning that they span multiple amino acids in a chain, according to the scientific literature. This causes problems because the signal returned from a simulation comes from multiple amino acids rather than a single one. A real-world analogy is a slot meant to hold a single soccer ball: if the slot is too wide, perhaps ten soccer balls would fall into it, and the results of testing would be inaccurate for a single ball.

The team used nanoporous single-layer molybdenum disulfide (MoS₂), a two-dimensional (2D) material, in their research.

“The significance of MoS₂ is that it is thin, only covering three atoms,” said Heiranian. “We can accurately identify the signal from a single amino acid to determine the properties of proteins. If simulations show the result of a faulty amino acid, then we know it is from a single, specific amino acid rather than multiple amino acids.”

Figure 1. Simulation setup for the polypeptide chain with 16 units, MoS₂ nanopore, and ions. Courtesy of University of Illinois at Urbana-Champaign.

Supercomputers and Software Used in the Research

The team used the open-source Nanoscale Molecular Dynamics (NAMD) software in their research. NAMD is noted for its parallel efficiency and is often used to simulate large systems containing millions of atoms. They also used Intel MPI, which provided additional parallelization capabilities.

The TACC Stampede2 supercomputer used for the simulations is an 18-petaflop system with 4,200 Intel Xeon Phi nodes. It also uses Intel Xeon Scalable processors and Intel Omni-Path Architecture.

“The scaling on Stampede2 was near ideal allowing us to complete our extensive simulations,” explained Heiranian.

Results of the Research

The nanopore research included 4,000 data points of ionic current and residence time. Because of the volume of data, it was impossible to plot the whole domain for the different types of amino acids without doing millions of simulations. Using extensive simulations and the Random Forest ML algorithm, the team characterized the ionic current and residence time associated with the 20 standard amino acids by translocating them through a single-layer MoS₂ nanopore. Supervised and unsupervised machine learning and classification techniques were used to classify and detect signals with a prediction accuracy of up to 99.6%.
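To give a feel for how a random forest can separate classes using only two features, ionic current and residence time, here is a minimal sketch in plain Python. It is not the team's pipeline. The three class labels, their cluster centers and the noise levels are made up for illustration, and a small hand-written forest stands in for whatever library the researchers used. Each tree is trained on a bootstrap sample and considers a random subset of features at every split, and the forest predicts by majority vote.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, n_features, rng, depth=0, max_depth=6):
    """Grow one tree; each split looks at a random subset of the features."""
    if depth >= max_depth or len(set(labels)) == 1:
        return majority(labels)
    candidates = rng.sample(range(len(rows[0])), n_features)
    split = best_split(rows, labels, candidates)
    if split is None:
        return majority(labels)
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]

    def grow(part):
        return build_tree([r for r, _ in part], [y for _, y in part],
                          n_features, rng, depth + 1, max_depth)

    return (f, t, grow(left), grow(right))


def predict_tree(node, row):
    # Internal nodes are tuples; leaves are plain class labels (strings).
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest


def predict(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])


# Made-up cluster centers: (ionic current in nA, residence time in ns).
HYPOTHETICAL_CLASSES = {"A": (1.0, 5.0), "B": (2.0, 12.0), "C": (3.0, 20.0)}


def make_data(n_per_class, rng):
    """Synthetic samples scattered around the made-up centers."""
    rows, labels = [], []
    for name, (current, dwell) in HYPOTHETICAL_CLASSES.items():
        for _ in range(n_per_class):
            rows.append((rng.gauss(current, 0.1), rng.gauss(dwell, 1.0)))
            labels.append(name)
    return rows, labels


data_rng = random.Random(42)
train_rows, train_labels = make_data(60, data_rng)
test_rows, test_labels = make_data(20, data_rng)
forest = fit_forest(train_rows, train_labels)
hits = sum(predict(forest, r) == y for r, y in zip(test_rows, test_labels))
accuracy = hits / len(test_labels)
print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

The synthetic clusters here are deliberately well separated, so the accuracy printed says nothing about real amino acid signals. Distinguishing 20 amino acids from simulated currents is a far harder problem than this toy case.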
