Kevin Yip Archives - Sanford Burnham Prebys
Institute News

Simulating science or science fiction? 

Author: Greg Calhoun
Date: August 27, 2024

By harnessing artificial intelligence and modern computing, scientists are simulating more complex biological, clinical and public health phenomena to accelerate discovery.

While scientists have always employed a vast set of methods to observe the immense worlds among and beyond our solar system, in our planet’s many ecosystems, and within the biology of Earth’s inhabitants, the public’s perception tends to reduce this mosaic to a single portrait.

A Google image search will reaffirm that the classic image of the scientist remains a person in a white coat staring intently at a microscope or sample in a beaker or petri dish. Many biomedical researchers do still use their fair share of glassware and plates while running experiments. These scientists, however, now often need advanced computational techniques to analyze the results of their studies, expanding the array of tools researchers must master to push knowledge forward. For every scientist pictured pipetting, we should imagine others writing code or sending instructions to a supercomputer.

In some cases, scientists are testing whether computers can be used to simulate the experiments themselves. Computational tools such as generative artificial intelligence (AI) may be able to help scientists improve data inputs, create scenarios and generate synthetic data by simulating biological processes, clinical outcomes and public health campaigns. Advances in simulation one day might help scientists more quickly home in on promising results that can be confirmed more efficiently through real-world experiments.

“There are many different types of simulation in the life sciences,” says Kevin Yip, PhD, professor in the Cancer Genome and Epigenetics Program at Sanford Burnham Prebys and director of the Bioinformatics Shared Resource. “Molecular simulators, for example, have been used for a long time to show how certain molecules will change their shape and interact with other molecules.”

“One of the most successful examples is in structural biology with the program AlphaFold, which is used to predict protein structures and interactions,” adds Yip. “This program was built on a very solid foundation of actual experiments determining the structures of many proteins. This is something that other fields of science can work to emulate, but in most other cases simulation continues to be a work in progress rather than a trusted technique.”

In the Sanford Burnham Prebys Conrad Prebys Center for Chemical Genomics (Prebys Center), scientists are using simulation-based techniques to more effectively and efficiently find new potential drugs.

Video: Nanome virtual reality demonstration

To expedite their drug discovery and optimization efforts, the Prebys Center team uses a suite of computing tools to run simulations that model the fit between proteins and potential drugs, how long it will take for drugs to break down in the body, and the likelihood of certain harmful side effects, among other properties.

“In my group, we know what the proteins of interest look like, so we can simulate how certain small molecules would fit into those proteins to try and design ones that fit really well,” says Steven Olson, PhD, executive director of Medicinal Chemistry at the Prebys Center. In addition to fit, Olson and team look for drugs that won’t be broken down too quickly after being taken.

“That can be the difference between a once-a-day drug and one you have to take multiple times a day, and we know that patients are less likely to take the optimal prescribed dose when it is more than once per day,” notes Olson. 


Steven Olson, PhD, is the executive director of Medicinal Chemistry at the Prebys Center.

“We can use computers now to design drugs that stick around and achieve concentrations that are pharmacologically effective and active. What the computers produce are just predictions that still need to be confirmed with actual experiments, but it is still incredibly useful.”

In one example, Olson is working with a neurobiologist at the University of California, Santa Barbara, and an X-ray crystallographer at the University of California San Diego on new potential drugs for Alzheimer’s disease and other forms of dementia.

“This protein called farnesyltransferase was a big target for cancer drug discovery in the 1990s,” explains Olson. “While targeting it never showed promise in cancer, my collaborator showed that a farnesyltransferase inhibitor stopped proteins from aggregating in the brains of mice and creating tangles, which are a pathological hallmark of Alzheimer’s.”

“We’re working together to make drugs that would be safe enough and penetrate far enough into the brain to be potentially used in human clinical trials. We’ve made really good progress and we’re excited about where we’re headed.”

The Molecular Operating Environment program is one commercially available application that enables the team to visualize candidate drugs’ 3D structures and simulate interactions with proteins. Olson and his collaborators can manipulate the models of their compounds even more directly in virtual reality by using another software application known as Nanome. DeepMirror is an AI tool that helps predict the potency of new drugs while screening for side effects, and StarDrop uses machine learning models to help the team design drugs that aren’t metabolized too quickly or too slowly.


The Prebys Center team demonstrates how the software application known as Nanome allows scientists to manipulate the models of potential drug compounds directly in virtual reality.

“In addition, there are certain interactions that can only be understood by modeling with quantum mechanics,” Olson notes. “We use a program called Gaussian for that, and it is so computationally intense that we have to run it over the weekend and wait for the results.”

“We use these tools to help us visualize the drugs, make better plans and give us inspiration on what we should make. They also can help explain the results of our experiments. And as AI improves, it’s helping us to predict side effects, metabolism and all sorts of other properties that previously you would have to learn by trial and error.”

While simulation is playing an active and growing role in drug discovery, Olson continues to see it as complementary to the human expertise required to synthesize new drugs and put predictions to the test with actual experiments.

“The idea that we’re getting to a place where we can simulate the entire drug design process, that’s science fiction,” says Olson. “Things are evolving really fast right now, but I think in the future you’re still going to need a blend of human brainpower and computational brainpower to design drugs.”


Programming in a Petri Dish, an 8-part series

How artificial intelligence, machine learning and emerging computational technologies are changing biomedical research and the future of health care

  • Part 1 – Using machines to personalize patient care. Artificial intelligence and other computational techniques are aiding scientists and physicians in their quest to prescribe or create treatments for individuals rather than populations.
  • Part 2 – Objective omics. Although the hypothesis is a core concept in science, unbiased omics methods may loosen attachments to incorrect hypotheses that can undermine impartiality and slow progress.
  • Part 3 – Coding clinic. Rapidly evolving computational tools may unlock vast archives of untapped clinical information—and help solve complex challenges confronting health care providers.
  • Part 4 – Scripting their own futures. At Sanford Burnham Prebys Graduate School of Biomedical Sciences, students embrace computational methods to enhance their research careers.
  • Part 5 – Dodging AI and computational biology dangers. Sanford Burnham Prebys scientists say that understanding the potential pitfalls of using AI and other computational tools to guide biomedical research helps maximize benefits while minimizing concerns.
  • Part 6 – Mapping the human body to better treat disease. Scientists synthesize supersized sets of biological and clinical data to make discoveries and find promising treatments.
  • Part 7 – Simulating science or science fiction? By harnessing artificial intelligence and modern computing, scientists are simulating more complex biological, clinical and public health phenomena to accelerate discovery.
  • Part 8 – Acceleration by automation. Increases in the scale and pace of research and drug discovery are being made possible by robotic automation of time-consuming tasks that must be repeated with exhaustive exactness.

Dodging AI and other computational biology dangers

Author: Greg Calhoun
Date: August 13, 2024

Sanford Burnham Prebys scientists say that understanding the potential pitfalls of using artificial intelligence and computational biology techniques in biomedical research helps maximize benefits while minimizing concerns

ChatGPT, an artificial intelligence (AI) “chatbot” that can understand and generate human language, dominates most AI-related headlines, along with rising concerns about using AI tools to create false “deepfake” images, audio and video that appear convincingly real.

But scientific applications of AI and other computational biology methods are gaining a greater share of the spotlight as research teams successfully employ these techniques to make new discoveries such as predicting how patients will respond to cancer drugs.

AI and computational biology have proven to be boons to scientists searching for patterns in massive datasets, but some researchers are raising alarms about how AI and other computational tools are developed and used.

“We cannot just purely trust AI,” says Yu Xin (Will) Wang, PhD, assistant professor in the Development, Aging and Regeneration Program at Sanford Burnham Prebys. “You need to understand its limitations, what it’s able to do and what it’s not able to do. Probably one of the simplest examples would be people asking ChatGPT about current events as they happen.”

(ChatGPT’s knowledge extends only to the training cutoff date of its current version, based on the websites and other information used to train it. Thus, its awareness of current events is not necessarily current.)

“I see a misconception where some people think that AI is so intelligent that you can just throw data at an AI model and it will figure it all out by itself,” says Andrei Osterman, PhD, vice dean and associate dean of curriculum for the Graduate School of Biomedical Sciences and professor in the Immunity and Pathogenesis Program at Sanford Burnham Prebys.


Yu Xin (Will) Wang, PhD, is an assistant professor in the Development, Aging and Regeneration Program at Sanford Burnham Prebys.

“In many cases, it’s not that simple. We can’t look at these models as black boxes where you put the data in and get an answer out, where you have no idea how the answer was determined, what it means and how it is applicable and generalizable.”

“The very first thing to focus on when properly applying computational methods or AI methods is data quality,” adds Kevin Yip, PhD, professor in the Cancer Genome and Epigenetics Program at Sanford Burnham Prebys and director of the Bioinformatics Shared Resource. “Our mantra is ‘garbage in, garbage out.’”


Andrei Osterman, PhD, is a professor in the Immunity and Pathogenesis Program at Sanford Burnham Prebys.

Once researchers have ensured the quality of their data, Yip says the next step is to be prepared to confirm the results.

“Once we actually plug into certain tools, how can we actually tell whether they are doing a good job or not?” asks Yip. “We cannot just trust them. We need to have ways to validate either experimentally or even computationally using other ways to cross-check the findings.”

Yip is concerned that AI-based research and computational biology are moving too fast in some cases, contributing to challenges in reproducing and generalizing results.

“There are so many new algorithms, so many tools published every day,” adds Yip. “Sometimes, they are not maintained very well, and the investigators cannot be reached when we can’t run their code or download the data they analyzed.”

For AI and computational biology techniques to continue their rapid development, the scientific community must share data, code and trained AI models responsibly, transparently and collaboratively so that studies can be reproduced, building trust as these fields grow.

Privacy is another potential breeding ground for mistrust in research using AI algorithms to analyze medical data, from electronic health records to insurance claims data to biopsied patient samples.

“It is completely understandable that members of the public are concerned about the privacy of their personal data as it is a primary topic I discuss with colleagues at conferences,” says Yip. “When we work with patient data, there are very strict rules and policies that we have to follow.”

Yip adds that the most important rule is that scientists must never re-identify samples without proper consent, meaning they must not use algorithms to predict which patient provided certain data.


Kevin Yip, PhD, is a professor in the Cancer Genome and Epigenetics Program at Sanford Burnham Prebys.

Ultimately for Yip, using AI and computational methods appropriately—within their limitations and without violating patients’ privacy—is a matter of professional integrity for the owners and users of these emerging technologies.

“As creators of AI and computational tools, we need to maintain our code and models and make sure they are accessible along with our data. On the other side, users need to understand the limitations and how to make good use of what we create without overstepping and claiming findings beyond the capability of the tools.”

“This level of shared responsibility is very important for the future of biomedical research during the data revolution.”



Scripting their own futures

Author: Greg Calhoun
Date: August 8, 2024

At Sanford Burnham Prebys Graduate School of Biomedical Sciences, students embrace computational methods to enhance their research careers

Although not every scientist-in-training will need to be an ace programmer, the next generation of scientists will need to take advantage of advances in artificial intelligence (AI) and computing that are shaping biomedical research. Scientists who understand how best to process, store and access data, and how to employ algorithms to analyze ever-increasing amounts of information, will help lead the data revolution rather than follow in its wake.

“I think the way to do biology is very different from just a decade or so ago,” says Kevin Yip, PhD, a professor in the Cancer Genome and Epigenetics Program at Sanford Burnham Prebys and the director of the Bioinformatics Shared Resource. “Looking back, I could not have imagined playing much of a role as a data scientist, and now I see that my peers and I are at the core of the whole discovery process.”

In 2017, bioinformatics experts suggested in Genome Biology that graduate education programs should focus on teaching computational biology to all learners rather than just those with a special interest in programming or data science. The authors noted that the changing nature of the life sciences required researchers to respond in kind. Teams of scientists must be able to formulate algorithms to keep pace and detect new discoveries obscured within oceans of data too vast to parse with prior methods.

“I think most people now would agree that data science and the use of computational methods—AI included—are indispensable in biology,” says Yip. “To use these approaches to the greatest effect, computational biologists and bench laboratory scientists need to be trained to speak a common language.”


Kevin Yip, PhD, is a professor in the Cancer Genome and Epigenetics Program at Sanford Burnham Prebys.

When Yip joined Sanford Burnham Prebys in 2022, he was tasked with directing a course on computational biology for the Institute’s Graduate School of Biomedical Sciences.

“We believe that the new generation of graduate students needs to have the ability to understand what algorithms are and how they work, rather than just treating those tools as black boxes,” says Yip. “They may not be able to invent new algorithms right out of the course, but they’ll be better equipped to participate in collaborative projects.”

Andrei Osterman, PhD, is a professor in the Immunity and Pathogenesis Program at Sanford Burnham Prebys.

Yip’s work developing the course has been well received by graduate students, based on their evaluations of the class.

“I loved the computational biology course,” says Katya Marchetti, a second-year PhD student in the lab of Karen Ocorr, PhD, and the recipient of an Association for Women in Science scholarship.

“It was so helpful to learn skills that I could immediately see incorporating into my own research. I’m so glad I had this course. I know for a fact that I will need this knowledge and experience to be successful in whatever comes after my PhD. The people who have these skills objectively do better in postdoctoral fellowships or in the biotechnology industry.”

Yip and his fellow faculty members in the graduate school see an opportunity to further expand their approach to computational biology and data science topics.

“In the current course, students learn to use computational methods to analyze transcriptomics data,” says Andrei Osterman, PhD, vice dean and associate dean of curriculum for the Graduate School and a professor in the Immunity and Pathogenesis Program at Sanford Burnham Prebys. “This is very useful hands-on training, but not advanced enough for some students.”

“We are seeing students with a computer science background coming into our graduate program,” notes Yip. “We are thinking about adding a new elective course for students who want to go beyond what our current class is offering.”

Graduate education is quickly evolving at Sanford Burnham Prebys and throughout the biomedical research community to match the demands of an era defined by effectively integrating computation and biology.

“Mutual understanding among data scientists and biologists is very important for where research is heading,” says Yip. “We will keep improving our training to set our students up for success.”


Katya Marchetti is a second-year PhD student at Sanford Burnham Prebys.

