Gene set enrichment analysis is integral to the study of genomics, allowing researchers to interpret data collected at the level of single genes in terms of those genes' collective involvement in biological pathways. However, the assessment and comparison of such methods have generally been ad hoc and lacking in data related to public health.
In a paper published recently in the journal Briefings in Bioinformatics, Ludwig Geistlinger, an investigator with the CUNY Institute for Implementation Science in Public Health, and CUNY SPH Associate Professor Levi Waldron provided the most thorough benchmarking of gene set enrichment analysis methods to date, along with open-source software and extensive curated databases for applying these benchmarks to new gene set enrichment methods.
“While most benchmarking studies of gene set enrichment analysis have employed simulations and a couple datasets on model organisms, our work incorporates a curated compendium of 75 expression datasets investigating 42 human diseases that have gene pathways of known relevance,” says Geistlinger, the study’s lead author. “This study improves our understanding of how gene set enrichment analysis methods perform in biomedical applications, and will help future developments focus on improvements relevant to public health. We wanted to provide specific advice to researchers about currently popular methods, but also to provide an easy-to-use framework for others to extend our efforts.”
Geistlinger L, Csaba G, Santarelli M, Ramos M, Schiffer L, Turaga N, Law C, Davis S, Carey V, Morgan M, Zimmer R, Waldron L: “Toward a gold standard for benchmarking gene set enrichment analysis.” Brief. Bioinform. 2020.