The Applications of the General Expression Transformer

March 17, 2025 By Stuart P. Atkinson

Multi-panel figure showing MPRA experiment, results, and predicted enhancer/promoter interactions
Next-generation lentiMPRA workflow. Barcoded enhancer libraries are cloned into lentiviral vectors, mapped by paired-end sequencing, packaged and transduced into cells, and quantified by RNA/DNA sequencing to measure activity of ~20,000–240,000 elements in parallel. From Fu, Mo, and Buendia et al.

In the first part of this blog series, Epigenome Technologies offered a brief overview of the development of GET, the general expression transformer, described in a Nature study from researchers led by Xi Fu, Raul Rabadan (Columbia University), and Eric P. Xing (MBZUAI/Carnegie Mellon University) (Fu, Mo, and Buendia et al.).

We previously noted that understanding relationships between the chromatin landscape and transcription remains challenging: most approaches employ separate epigenetic and transcriptomic assays performed in distinct cell samples, and reintegrating these diverse datasets may reduce robustness. Parallel analysis of individual cells for RNA expression and DNA from targeted tagmentation by sequencing, or "Paired-Tag," from Epigenome Technologies generates joint epigenetic and gene expression profiles at single-cell resolution and detects histone modifications and RNA transcripts in individual nuclei with an efficiency similar to single-nucleus RNA-seq/ChIP-seq assays. Paired-Tag technology enables researchers to advance their understanding of transcriptional regulation and improve disease management.

In this second part of the series, Epigenome Technologies summarizes the applications of GET. Predicting regulatory activity and recognizing regulatory elements and physical interactions between transcription factors (TFs) supports the identification of unreported distal regulatory regions in erythroblasts, the discovery of a TF-TF interaction in B cells that explains the significance of a leukemia-associated germline risk mutation, and the construction of a TF/coactivator interaction catalog.

Multi-panel figure demonstrating cCRE inference from various methods
Comparative CRE inference at NFIX and BCL11A loci. Genome browser tracks show HiChIP-defined loops (ground truth), ATAC accessibility in HUDEP-2 and fetal erythroblasts, and predicted regulatory element activity from GET (Jacobian), Enformer, HyenaDNA (ISM), and ABC power-law methods over 1.1 Mb (chr19) and 0.75 Mb (chr2). From Fu, Mo, and Buendia et al.

Investigating Fetal Hemoglobin-Regulating Loci

Multi-panel figure describing how TF inference works and showing causal prediction curves
Causal transcription-factor network recovery. (a) ICA-derived motif–motif interaction graph highlighting key TF pairs. (b) Macro F₁-score curves for GET causal predictions versus motif-colocalization and random baseline, benchmarked against AP/BioID-MS and HepG2 ChIP-seq, as a function of retained motif-pair percentile. From Fu, Mo, and Buendia et al.

Building a Structural Catalog of the Human Transcription Factor Interactome

Multi-panel figure showing folding prediction of PAX5 IDR mutations, network interactions, and co-IP
PAX5 IDR mutation and interaction analysis. (a) Per-residue pLDDT profile of PAX5 showing low-confidence regions and mutational hotspots (V26G, P80R, G183S). (b) BioID network of PAX5 interactors enriched in the low-confidence IDR. (c) Structural model of the PAX5 F domain with G183S highlighted. (d–e) Co-immunoprecipitation and quantification of NR2C2 binding to wild-type versus G183S PAX5. From Fu, Mo, and Buendia et al.

TF-TF Co-binding Prediction: A Closer Look

Multi-panel figure showing motif enrichments and overlaps
Promoter motif enrichment and gene‐set analysis. (a) Venn diagram of genes with top 10,000 promoters containing PAX2 or NR/3 motifs, listing select lineage markers. (b) Bar plots of TF perturbation signature enrichment (e.g., ETV6–PAX5 MUT vs. WT) and GO terms for genes downregulated in mutant vs. wild-type. (c) Enrichment of ProB/MatB‐activated and -repressed gene sets (–log₁₀ p-value).

A Focus on PAX5 and the Functional Role of Mutations

Multi-panel figure showing model predictions with no training and one-shot training
Zero- and one-shot GET expression predictions across diverse human samples. (a–c) Scatterplots of predicted vs. observed log-expression in lymphoma B cells, normal B cells, and HSPCs (R², r shown). (d) Schematic of zero-shot GET mapping from fetal atlas to GBM samples. (e) Violin plots of Pearson’s r in GBM tumors, macrophages, and oligodendrocytes. (f) Radar plot of per-chromosome R², Pearson, and Spearman performance in GBM tumors.

The Applications of the General Expression Transformer - The Highlights

Overall, the authors report exciting examples of the potential outcomes of applying GET, but what's next? The authors note that future enhancements to GET may include the integration of additional biological information (e.g., 3D chromatin architecture and single-cell data) and the incorporation of more cell states and a broader range of assays (e.g., TF binding and histone modification profiles) to further improve model outcomes. They also note the potential for GET in predicting the functional impact of non-coding genetic variants, which may provide further insight into disease susceptibility and perhaps support the development of novel therapeutics. Moving forward, single-cell datasets provided by Paired-Tag, an analytical platform that creates joint epigenetic and gene expression profiles at single-cell resolution and detects histone modifications and RNA transcripts in individual nuclei, may support the development of related studies and exciting new tools. The Bing Ren lab developed Paired-Tag, and Epigenome Technologies offers optimized Paired-Tag kits and services to researchers in the epigenetics field under an exclusive license from the Ludwig Institute for Cancer Research.

See Nature (January 2025) for more on the application of GET, and stay tuned to Twitter, Bluesky, and LinkedIn to keep up to date with all the new epigenetics studies; furthermore, check out our Products and Services pages to see how Epigenome Technologies can elevate your research today.