Abstract
Transcription factors (TFs) are fundamental regulators of gene expression and perform diverse functions in cellular processes. The management of three-dimensional (3D) genome conformation and gene expression relies primarily on TFs. They recruit transcriptional machinery to the enhancers or promoters of specific genes, thereby activating or inhibiting transcription. Identifying these TFs is a significant step towards understanding cellular gene expression mechanisms. Because experimental methods are time-consuming and labor-intensive, computational models are essential. In this work, we introduce a two-layer prediction framework based on a support vector machine (SVM) that uses the latent space representation of a protein language model, ProtBert. The first layer predicts whether a protein is a TF, and the second layer predicts whether an identified TF prefers binding to methylated deoxyribonucleic acid (TFPM). We also tested the proposed method on an imbalanced dataset. In detecting TFs and TFPMs, the proposed model consistently outperformed state-of-the-art approaches in both cross-validation analysis and independent tests.
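The two-layer decision flow described in the abstract can be sketched as a simple cascade. This is a minimal illustration, not the authors' implementation: the function names, the toy two-dimensional vectors, the weights, and the label for the second layer's negative class are all assumptions made for this example. In practice each layer would be an SVM trained on ProtBert embeddings; here a hand-set linear decision rule stands in for each trained model.

```python
from typing import Callable, List, Sequence

Embedding = Sequence[float]
Classifier = Callable[[Embedding], bool]


def linear_rule(weights: Sequence[float], bias: float) -> Classifier:
    """Return a stand-in classifier: positive when w.x + b > 0.

    Hypothetical placeholder for a trained SVM; the weights are not learned.
    """
    def predict(vec: Embedding) -> bool:
        return sum(w * x for w, x in zip(weights, vec)) + bias > 0
    return predict


def two_layer_predict(
    embeddings: Sequence[Embedding],
    is_tf: Classifier,
    is_tfpm: Classifier,
) -> List[str]:
    """Layer 1 separates TFs from non-TFs; layer 2 runs only on predicted TFs."""
    labels = []
    for vec in embeddings:
        if not is_tf(vec):
            labels.append("non-TF")
        elif is_tfpm(vec):
            labels.append("TFPM")
        else:
            # Assumed label for TFs not predicted to prefer methylated DNA.
            labels.append("TF-non-methylated")
    return labels


# Toy 2-D vectors standing in for protein embeddings (illustrative only).
layer1 = linear_rule([1.0, 0.0], -0.5)
layer2 = linear_rule([0.0, 1.0], -0.5)
toy_embeddings = [[0.1, 0.9], [0.9, 0.9], [0.9, 0.1]]
predictions = two_layer_predict(toy_embeddings, layer1, layer2)
print(predictions)
```

Running the second layer only on proteins that pass the first keeps the TFPM classifier focused on the narrower question it was trained for, at the cost that any first-layer false negative is never reconsidered.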
| Original language | English |
|---|---|
| Article number | 4234 |
| Journal | International Journal of Molecular Sciences |
| Volume | 26 |
| Issue number | 9 |
| State | Published - 2025.05 |
Keywords
- bidirectional encoder representations from transformers
- machine learning
- methylated deoxyribonucleic acid
- non-methylated deoxyribonucleic acid
- protein language model
- transcription factors
Quacquarelli Symonds (QS) Subject Topics
- Computer Science & Information Systems
- Engineering - Petroleum
- Data Science
- Engineering - Chemical
- Chemistry
- Biological Sciences
Title
TFProtBert: Detection of Transcription Factors Binding to Methylated DNA Using ProtBert Latent Space Representation