G2Cdb::Gene report

Gene id
Gene symbol
Homo sapiens
WD repeat domain 47
G00000804 (Mus musculus)

Databases (6)

ENSG00000085433 (Ensembl human gene)
22911 (Entrez Gene)
1223 (G2Cdb plasticity & disease)
WDR47 (GeneCards)
Marker Symbol
HGNC:29141 (HGNC)
Protein Sequence
O94967 (UniProt)

Synonyms (1)

  • KIAA0893

Literature (7)

Pubmed - other

  • Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes.

    Kimura K, Wakamatsu A, Suzuki Y, Ota T, Nishikawa T, Yamashita R, Yamamoto J, Sekine M, Tsuritani K, Wakaguri H, Ishii S, Sugiyama T, Saito K, Isono Y, Irie R, Kushida N, Yoneyama T, Otsuka R, Kanda K, Yokoi T, Kondo H, Wagatsuma M, Murakawa K, Ishida S, Ishibashi T, Takahashi-Fujii A, Tanase T, Nagai K, Kikuchi H, Nakai K, Isogai T and Sugano S

    Life Science Research Laboratory, Central Research Laboratory, Hitachi, Ltd., Kokubunji, Tokyo, 185-8601, Japan.

    By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by more than 500 bp and thus are very likely to constitute mutually distinct alternative promoters. To our surprise, at least 7674 (52%) human RefSeq genes were subject to regulation by putative alternative promoters (PAPs). On average, there were 3.1 PAPs per gene, with the composition of one CpG-island-containing promoter per 2.6 CpG-less promoters. In 17% of the PAP-containing loci, tissue-specific use of the PAPs was observed. The richest tissue sources of the tissue-specific PAPs were testis and brain. It was also intriguing that the PAP-containing promoters were enriched in the genes encoding signal transduction-related proteins and were rarer in the genes encoding extracellular proteins, possibly reflecting the varied functional requirement for and the restricted expression of those categories of genes, respectively. The patterns of the first exons were highly diverse as well. On average, there were 7.7 different splicing types of first exons per locus partly produced by the PAPs, suggesting that a wide variety of transcripts can be achieved by this mechanism. Our findings suggest that use of alternate promoters and consequent alternative use of first exons should play a pivotal role in generating the complexity required for the highly elaborated molecular systems in humans.

    Genome research 2006;16;1;55-65

  • Transcriptome analysis of human gastric cancer.

    Oh JH, Yang JO, Hahn Y, Kim MR, Byun SS, Jeon YJ, Kim JM, Song KS, Noh SM, Kim S, Yoo HS, Kim YS and Kim NS

    Laboratory of Human Genomics, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 305-333, Korea.

    To elucidate the genetic events associated with gastric cancer, 124,704 cDNA clones were collected from 37 human gastric cDNA libraries, including 20 full-length enriched cDNA libraries of gastric cancer cell lines and tissues from Korean patients. An analysis of the collected ESTs revealed that 97,930 high-quality ESTs coalesced into 13,001 clusters, of which 11,135 clusters (85.6%) were annotated to known ESTs. The analysis of the full-length cDNAs also revealed that 4862 clusters (51.7%) contained at least one putative full-length cDNA clone with an initiation codon, with the average length of the 5' UTR of 140 bp. A large number appear to have a diverse transcription start site (TSS). An examination of the TSS of some genes, such as TEGT and GAPD, using 5' RACE revealed that the predicted TSSs are actually found in human gastric cancer cells and that several TSSs differ depending on the specific gastric cell line. Furthermore, of the human gastric ESTs, 766 genes (9.5%) were present as putative alternatively spliced variants. Confirmation of the predicted spliced isoforms using RT-PCR showed that the predicted isoforms exist in gastric cancer cells and some isoforms coexist in gastric cell lines. These results provide potentially useful information for elucidating the molecular mechanisms associated with gastric oncogenesis.

    Mammalian genome : official journal of the International Mammalian Genome Society 2005;16;12;942-54

  • A human protein-protein interaction network: a resource for annotating the proteome.

    Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H and Wanker EE

    Max Delbrueck Center for Molecular Medicine, 13092 Berlin-Buch, Germany.

    Protein-protein interaction maps provide a valuable framework for a better understanding of the functional organization of the proteome. To detect interacting pairs of human proteins systematically, a protein matrix of 4456 baits and 5632 preys was screened by automated yeast two-hybrid (Y2H) interaction mating. We identified 3186 mostly novel interactions among 1705 proteins, resulting in a large, highly connected network. Independent pull-down and co-immunoprecipitation assays validated the overall quality of the Y2H interactions. Using topological and GO criteria, a scoring system was developed to define 911 high-confidence interactions among 401 proteins. Furthermore, the network was searched for interactions linking uncharacterized gene products and human disease proteins to regulatory cellular pathways. Two novel Axin-1 interactions were validated experimentally, characterizing ANP32A and CRMP1 as modulators of Wnt signaling. Systematic human protein interaction screens can lead to a more comprehensive understanding of protein function and cellular processes.

    Cell 2005;122;6;957-68

  • The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).

    Gerhard DS, Wagner L, Feingold EA, Shenmen CM, Grouse LH, Schuler G, Klein SL, Old S, Rasooly R, Good P, Guyer M, Peck AM, Derge JG, Lipman D, Collins FS, Jang W, Sherry S, Feolo M, Misquitta L, Lee E, Rotmistrovsky K, Greenhut SF, Schaefer CF, Buetow K, Bonner TI, Haussler D, Kent J, Kiekhaus M, Furey T, Brent M, Prange C, Schreiber K, Shapiro N, Bhat NK, Hopkins RF, Hsie F, Driscoll T, Soares MB, Casavant TL, Scheetz TE, Brownstein MJ, Usdin TB, Toshiyuki S, Carninci P, Piao Y, Dudekula DB, Ko MS, Kawakami K, Suzuki Y, Sugano S, Gruber CE, Smith MR, Simmons B, Moore T, Waterman R, Johnson SL, Ruan Y, Wei CL, Mathavan S, Gunaratne PH, Wu J, Garcia AM, Hulyk SW, Fuh E, Yuan Y, Sneed A, Kowis C, Hodgson A, Muzny DM, McPherson J, Gibbs RA, Fahey J, Helton E, Ketteman M, Madan A, Rodrigues S, Sanchez A, Whiting M, Madari A, Young AC, Wetherby KD, Granite SJ, Kwong PN, Brinkley CP, Pearson RL, Bouffard GG, Blakesly RW, Green ED, Dickson MC, Rodriguez AC, Grimwood J, Schmutz J, Myers RM, Butterfield YS, Griffith M, Griffith OL, Krzywinski MI, Liao N, Morin R, Morrin R, Palmquist D, Petrescu AS, Skalska U, Smailus DE, Stott JM, Schnerch A, Schein JE, Jones SJ, Holt RA, Baross A, Marra MA, Clifton S, Makowski KA, Bosak S, Malek J and MGC Project Team

    The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.

    Funded by: PHS HHS: N01-C0-12400

    Genome research 2004;14;10B;2121-7

  • Prediction of the coding sequences of unidentified human genes. XII. The complete sequences of 100 new cDNA clones from brain which code for large proteins in vitro.

    Nagase T, Ishikawa K, Suyama M, Kikuno R, Hirosawa M, Miyajima N, Tanaka A, Kotani H, Nomura N and Ohara O

    Kazusa DNA Research Institute, Kisarazu, Chiba, Japan.

    In this paper, we report the sequences of 100 cDNA clones newly determined from a set of size-fractionated human brain cDNA libraries and predict the coding sequences of the corresponding genes, named KIAA0819 to KIAA0918. These cDNA clones were selected on the basis of their coding potentials of large proteins (50 kDa and more) by using in vitro transcription/translation assays. The sequence data showed that the average sizes of the inserts and corresponding open reading frames are 4.4 kb and 2.5 kb (831 amino acid residues), respectively. Homology and motif/domain searches against the public databases indicated that the predicted coding sequences of 83 genes were similar to those of known genes, 59% of which (49 genes) were categorized as coding for proteins functionally related to cell signaling/communication, cell structure/motility and nucleic acid management. The chromosomal locations and the expression profiles of all the genes were also examined. For 54 clones including brain-specific ones, the mRNA levels were further examined among 8 brain regions (amygdala, corpus callosum, cerebellum, caudate nucleus, hippocampus, substantia nigra, subthalamic nucleus, and thalamus), spinal cord, and fetal brain.

    DNA research : an international journal for rapid publication of reports on genes and genomes 1998;5;6;355-64

  • Construction and characterization of a full length-enriched and a 5'-end-enriched cDNA library.

    Suzuki Y, Yoshitomo-Nakagawa K, Maruyama K, Suyama A and Sugano S

    International and Interdisciplinary Studies, The University of Tokyo, Japan.

    Using 'oligo-capped' mRNA [Maruyama, K., Sugano, S., 1994. Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides. Gene 138, 171-174], whose cap structure was replaced by a synthetic oligonucleotide, we constructed two types of cDNA library. One is a 'full length-enriched cDNA library' which has a high content of full-length cDNA clones and the other is a '5'-end-enriched cDNA library', which has a high content of cDNA clones with their mRNA start sites. The 5'-end-enriched library was constructed especially for isolating the mRNA start sites of long mRNAs. In order to characterize these libraries, we performed one-pass sequencing of randomly selected cDNA clones from both libraries (84 clones for the full length-enriched cDNA library and 159 clones for the 5'-end-enriched cDNA library). The cDNA clones of the polypeptide chain elongation factor 1 alpha were most frequently (nine clones) isolated, and more than 80% of them (eight clones) contained the mRNA start site of the gene. Furthermore, about 80% of the cDNA clones of both libraries whose sequence matched with known genes had the known 5' ends or sequences upstream of the known 5' ends (28 out of 35 for the full length-enriched library and 51 out of 62 for the 5'-end-enriched library). The longest full-length clone of the full length-enriched cDNA library was about 3300 bp (among 28 clones). In contrast, seven clones (out of the 51 clones with the mRNA start sites) from the 5'-end-enriched cDNA library came from mRNAs whose length is more than 3500 bp. These cDNA libraries may be useful for generating 5' ESTs with the information of the mRNA start sites that are now scarce in the EST database.

    Gene 1997;200;1-2;149-56

  • Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides.

    Maruyama K and Sugano S

    Institute of Medical Science, University of Tokyo, Japan.

    We have devised a method to replace the cap structure of a mRNA with an oligoribonucleotide (r-oligo) to label the 5' end of eukaryotic mRNAs. The method consists of removing the cap with tobacco acid pyrophosphatase (TAP) and ligating r-oligos to decapped mRNAs with T4 RNA ligase. This reaction was made cap-specific by removing 5'-phosphates of non-capped RNAs with alkaline phosphatase prior to TAP treatment. Unlike the conventional methods that label the 5' end of cDNAs, this method specifically labels the capped end of the mRNAs with a synthetic r-oligo prior to first-strand cDNA synthesis. The 5' end of the mRNA was identified quite simply by reverse transcription-polymerase chain reaction (RT-PCR).

    Gene 1994;138;1-2;171-4

Gene lists (5)

Gene List  Source  Species       Name                               Description                                                       Gene count
L00000009  G2C     Homo sapiens  Human PSD                          Human orthologues of mouse PSD adapted from Collins et al (2006)  1080
L00000016  G2C     Homo sapiens  Human PSP                          Human orthologues of mouse PSP adapted from Collins et al (2006)  1121
L00000061  G2C     Homo sapiens  BAYES-COLLINS-MOUSE-PSD-CONSENSUS  Mouse cortex PSD consensus (ortho)                                984
L00000069  G2C     Homo sapiens  BAYES-COLLINS-HUMAN-PSD-FULL       Human cortex biopsy PSD full list                                 1461
L00000071  G2C     Homo sapiens  BAYES-COLLINS-MOUSE-PSD-FULL       Mouse cortex PSD full list (ortho)                                1556
© G2C 2014. The Genes to Cognition Programme received funding from The Wellcome Trust and the EU FP7 Framework Programmes:
EUROSPIN (FP7-HEALTH-241498), SynSys (FP7-HEALTH-242167) and GENCODYS (FP7-HEALTH-241995).

This site is hosted by Edinburgh University and the Genes to Cognition Programme.