
Yijun Sun, PhD, is developing groundbreaking analytical tools and fostering collaborations between computer scientists and biologists.

Bioinformatics Project Provides Tools for Using Massive Datasets

Published April 19, 2013

Story by Suzanne Kashuba

Yijun Sun, PhD, assistant professor of microbiology and immunology, is using a $519,000 award from the National Science Foundation (NSF) to develop novel analytic methods for the study of microbial communities.

“This work ... has the potential to significantly advance discovery and understanding of the hidden microbial world.”
Yijun Sun, PhD
assistant professor of microbiology and immunology

“This work represents a major transformation of the bioinformatics methodology used for investigating microbial communities,” says Sun.

It has the potential to significantly advance discovery and understanding of the hidden microbial world, he explains.

Developing Tools to Study Microbial Communities

Sun’s project involves creating an integrated suite of computational tools and statistical methods that will allow researchers from various disciplines to analyze tens of millions of 16S rRNA sequences.

The tools could be applied to research ranging from human epidemiological studies to global ocean surveys. Scientists will be able to use these tools to extract biologically relevant patterns from massive sequence data.

Overcoming Current Computational Hurdles

Complex microbial communities remain poorly characterized because the volume of sequence data overwhelms existing computational resources and analytic methods.

To overcome these hurdles, the researchers are developing algorithms using advanced techniques, including parallel computing, online learning, graphical modeling and dimensionality reduction.

They also plan to establish a web application for performing comparative microbial community analysis.

These new analytical approaches will allow scientists to:

  • derive microbial community diversity
  • develop quantitative disease-associated microbial profiles
  • determine environment-microbe and microbe-microbe interactions
  • identify and quantify sequences from unclassified species.
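
As a rough illustration of the first item, community diversity is often summarized with the Shannon index, H = -Σ p_i ln p_i, where p_i is the fraction of sequences assigned to taxon i. The article does not say which diversity measures Sun's tools compute, so the sketch below is only an assumption; the function name and the sample taxon labels are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(counts):
    """Return the Shannon index H = -sum(p_i * ln(p_i)) for per-taxon counts.

    Taxa with a zero count are skipped; an empty sample yields 0.0.
    """
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical taxon assignments for the sequences in one sample.
sample = ["taxon_a", "taxon_a", "taxon_b", "taxon_c", "taxon_a", "taxon_b"]
taxon_counts = Counter(sample)
diversity = shannon_diversity(taxon_counts.values())
print(f"Shannon diversity: {diversity:.3f}")
```

A sample dominated by a single taxon scores near zero, while a sample spread evenly across many taxa scores higher. At the scale the project targets (tens of millions of sequences), the counting step would likely need the parallel or streaming techniques the article mentions.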

Open Source, Online Tools On the Way

The results and tools developed will be made widely available to researchers through publications, web applications, workshops and open source projects.

Software and results will be accessible via a website for Sun’s lab.

Fostering Cross-Disciplinary Research, Training

“Open source projects will invite researchers from other fields, such as mathematics and statistics, to join this project,” emphasizes Sun.

Close interactions between computer scientists and biologists will create new teaching and training opportunities and spark new algorithmic research, he says.

“Currently, there is a shortage of researchers with a deep understanding of both computer science and molecular biology,” he adds. “This project will provide intensive training in both areas for two graduate students and a postdoctoral fellow.”

Authority on Machine Learning Joined UB in 2012

Sun, a machine learning and bioinformatics researcher, joined UB in fall 2012. His lab is located within UB’s New York State Center of Excellence in Bioinformatics & Life Sciences.

This NSF grant represents funds remaining from a three-year award he received in 2011.

Volker Mai, PhD, associate professor of epidemiology at the University of Florida, is collaborating with Sun on the project.