Systematic Identification of Abundant Adenosine to Inosine Editing Sites in the Human Transcriptome
Levanon, E.Y., Eisenberg, E., Yelin, R., Nemzer, S., Hallegger, M., Shemesh, R., Fligelman, Z.Y., Shoshan, A., Pollock, S.R., Sztybel, D., Olshansky, M., Rechavi, G., and Jantsch, M.F. (2004), Nature Biotechnology 22, 1001-1005.

Compugen Ltd., 72 Pinchas Rosen St., Tel-Aviv 69512, Israel

RNA editing by members of the double-stranded RNA-specific ADAR family leads to site-specific conversion of adenosine to inosine (A-to-I) in precursor messenger RNAs. Editing by ADARs is believed to occur in all metazoa and is essential for mammalian development. Currently, only a limited number of human ADAR substrates are known, while indirect evidence suggests that a substantial fraction of all pre-mRNAs is affected. Here we describe a computational search for ADAR editing sites in the human transcriptome, using millions of available expressed sequences. We mapped 12,723 A-to-I editing sites in 1,637 different genes, with an estimated accuracy of 95%, raising the number of known editing sites by two orders of magnitude. We experimentally validated our method by verifying the occurrence of editing in 26 novel substrates. A-to-I editing in humans occurs primarily in non-coding regions of the RNA, typically in Alu repeats. Analysis of the large set of editing sites points to a role for editing in controlling dsRNA stability.

Links to Supplementary Data

Editing Sites Database
The database contains all genes that are predicted to be edited and that have NCBI LocusLink annotation. For each such gene, it provides annotation data, a list of the related editing sites, the flanking genomic sequences, an alignment to an Alu repeat, and a multiple alignment of some of the GenBank sequences exhibiting editing.

Flanking Regions of the 12,723 Sites (in FASTA format)
Genomic sequences flanking the predicted editing locations. A lower-case "a" marks the editing site.
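
The lower-case convention makes the editing position easy to recover programmatically. The following is a minimal Python sketch, not part of the original supplementary material. It assumes that all flanking bases are upper case, so that any lower-case "a" is an editing site; if the files also use lower case for other purposes, it would over-report. The record headers and sequences in the example are hypothetical, as are the function names.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def editing_site_offsets(sequence: str) -> List[int]:
    """Return 0-based offsets of every lower-case 'a' in the sequence.

    Assumption: flanking bases are upper case, so a lower-case 'a'
    can only be a marked editing site.
    """
    return [i for i, base in enumerate(sequence) if base == "a"]


# Hypothetical records; the real header layout is not described here.
example = """>site_1 (hypothetical header)
GGCTCACGCCTGTAaTCCCAGCACTTTG
>site_2 (hypothetical header)
TTGGCCaGGCTGG
""".splitlines()

sites = {header: editing_site_offsets(seq) for header, seq in read_fasta(example)}
```

To process the downloaded file, pass an open file handle to `read_fasta` instead of the in-memory example lines.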

Chromatograms of the Lab Verified Editing Sites
Chromatograms appear in high resolution in slideshow mode. Each gene is named after a representative RefSeq sequence; the RefSeq sequence itself is not necessarily edited.
View online (HTML) | Download (Zip, 4.8 MB)