Programmable protein-DNA crosslinking for the direct capture and quantification of 5-formylcytosine

Abstract

5-Formylcytosine (5fC) is an epigenetic nucleobase of mammalian genomes that occurs as an intermediate of active DNA demethylation. 5fC uniquely interacts and reacts with key nuclear proteins, indicating functions in genome regulation. Transcription-activator-like effectors (TALEs) are repeat-based DNA-binding proteins that can serve as probes for the direct, programmable recognition and analysis of epigenetic nucleobases. However, no TALE repeats for the selective recognition of 5fC are available, and the typically low genomic levels of 5fC pose a particular sensitivity challenge. Here, we advance TALE-based nucleobase targeting from recognition to covalent crosslinking. We report TALE repeats bearing the ketone amino acid p-acetylphenylalanine (pAcF) that universally bind all mammalian cytosine nucleobases but selectively form diaminooxy-linker-mediated dioxime crosslinks to 5fC. We identify repeat-linker combinations enabling single-CpG resolution and demonstrate the direct quantification of 5fC levels in a human genome background by covalent enrichment. This strategy provides a new avenue to expand the application scope of programmable probes with selectivity beyond A, G, T and C for epigenetic studies.
