Finding cis-regulatory elements using comparative genomics: some lessons from ENCODE data.
Identification of functional genomic regions using interspecies comparison will be most effective when the full span of relationships between genomic function and evolutionary constraint are utilized. We find that sets of putative transcriptional regulatory sequences, defined by ENCODE experimental data, have a wide span of evolutionary histories, ranging from stringent constraint shown by deep phylogenetic comparisons to recent selection on lineage-specific elements. This diversity of evolutionary histories can be captured, at least in part, by the suite of available comparative genomics tools, especially after correction for regional differences in the neutral substitution rate. Putative transcriptional regulatory regions show alignability in different clades, and the genes associated with them are enriched for distinct functions. Some of the putative regulatory regions show evidence for recent selection, including a primate-specific, distal promoter that may play a novel role in regulation.