Statistical relationships across epigenomes using large-scale hierarchical clustering

Published in Bioinformatics Advances, 2025

Abstract:
Recent advances in genomics and sequencing platforms have revolutionized our ability to generate immense data sets, particularly for studying the epigenetic regulation of gene expression. However, this avalanche of epigenomic data is difficult to interpret biologically because of its complex, non-linear patterns and relationships. This challenge lends itself to machine learning approaches for discerning infectivity and susceptibility. In this study, we explore over 3,000 epigenomes of uninfected individuals and provide a framework that uses hierarchical clustering to characterize the relationships among epigenetic modifiers, their modifiers, genetic loci, and specific immune cell types across all chromosomes.
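
As a rough illustration of the kind of hierarchical clustering the abstract refers to, the sketch below clusters a small synthetic matrix with SciPy. It is not the authors' pipeline: the data are randomly generated stand-ins, and the matrix shape, the Ward linkage, and the two-cluster cut are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in for an epigenomic signal matrix (assumption, not real data):
# 30 samples (rows) by 50 genomic features (columns), drawn from two groups
# with clearly different mean signal.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.2, scale=0.05, size=(15, 50))
group_b = rng.normal(loc=0.8, scale=0.05, size=(15, 50))
signal = np.vstack([group_a, group_b])

# Build the agglomerative tree. Ward linkage on Euclidean distances is one
# common choice; the abstract does not say which linkage the study uses.
tree = linkage(signal, method="ward")

# Cut the tree into at most two flat clusters and read off one label per sample.
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

In practice the rows might be epigenomes and the columns signal at genetic loci, and the tree could be cut at several depths to examine relationships at different levels of the hierarchy.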