G2Cdb::Gene report

Gene id
Gene symbol
Homo sapiens
calcium binding and coiled-coil domain 1
G00000803 (Mus musculus)

Databases (6)

ENSG00000012822 (Ensembl human gene)
57658 (Entrez Gene)
1222 (G2Cdb plasticity & disease)
CALCOCO1 (GeneCards)
Marker Symbol
HGNC:29306 (HGNC)
Protein Sequence
Q6FI59 (UniProt)

Synonyms (3)

  • Cocoa
  • KIAA1536
  • calphoglin

Literature (22)

Pubmed - other

  • Genome-wide association study of tanning phenotype in a population of European ancestry.

    Nan H, Kraft P, Qureshi AA, Guo Q, Chen C, Hankinson SE, Hu FB, Thomas G, Hoover RN, Chanock S, Hunter DJ and Han J

    Channing Laboratory, Department of Medicine, Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA. hnan@hsph.harvard.edu

    We conducted a multistage genome-wide association study (GWAS) of tanning response after exposure to sunlight in over 9,000 men and women of European ancestry who live in the United States. An initial analysis of 528,173 single-nucleotide polymorphisms (SNPs) genotyped on 2,287 women identified LOC401937 (rs966321) on chromosome 1 as a novel locus highly associated with tanning ability, and we confirmed this association in 870 women controls from a skin cancer case-control study with joint P-value=1.6 x 10(-9). We further genotyped this SNP in two subsequent replication studies (one with 3,750 women and the other with 2,405 men). This association was not replicated in either of these two studies. We found that several SNPs reaching the genome-wide significance level are located in or adjacent to the loci previously known as pigmentation genes: MATP, IRF4, TYR, OCA2, and MC1R. Overall, these tanning ability-related loci are similar to the hair color-related loci previously reported in the GWAS of hair color.

    Funded by: NCI NIH HHS: CA122838, CA128080, R01 CA122838, R01 CA122838-01A2, R03 CA128080, R03 CA128080-02

    The Journal of investigative dermatology 2009;129;9;2250-7

  • Genome-wide association study of panic disorder in the Japanese population.

    Otowa T, Yoshida E, Sugaya N, Yasuda S, Nishimura Y, Inoue K, Tochigi M, Umekage T, Miyagawa T, Nishida N, Tokunaga K, Tanii H, Sasaki T, Kaiya H and Okazaki Y

    Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.

    Panic disorder (PD) is an anxiety disorder characterized by panic attacks and anticipatory anxiety. Although a number of association studies have been conducted, no gene has been identified as a susceptibility locus. In this study, we conducted a genome-wide association study of PD in 200 Japanese patients and the same number of controls, using the GeneChip Human Mapping 500 K Array Set. Genotypes were determined using the Bayesian Robust Linear Model with Mahalanobis (BRLMM) genotype calling algorithm. The genotype data were data-cleaned using criteria for SNP call rate (≥95%), Hardy-Weinberg equilibrium (P≥0.1%) and minor allele frequency (≥5%). The significance level of the allele P-value was set at 1.0 x 10(-6), to make false discovery rate (FDR) <0.05. As a result, seven SNPs were significantly associated with PD, which were located in or adjacent to genes including PKP1, PLEKHG1, TMEM16B, CALCOCO1, SDK2 and CLU (or APO-J). Studies with other samples are required to confirm the results.

    Journal of human genetics 2009;54;2;122-6

  • Screening and association testing of common coding variation in steroid hormone receptor co-activator and co-repressor genes in relation to breast cancer risk: the Multiethnic Cohort.

    Haiman CA, Garcia RR, Hsu C, Xia L, Ha H, Sheng X, Le Marchand L, Kolonel LN, Henderson BE, Stallcup MR, Greene GL and Press MF

    Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, California, USA. haiman@usc.edu

    Background: Only a limited number of studies have performed comprehensive investigations of coding variation in relation to breast cancer risk. Given the established role of estrogens in breast cancer, we hypothesized that coding variation in steroid receptor coactivator and corepressor genes may alter inter-individual response to estrogen and serve as markers of breast cancer risk.

    Methods: We sequenced the coding exons of 17 genes (EP300, CCND1, NME1, NCOA1, NCOA2, NCOA3, SMARCA4, SMARCA2, CARM1, FOXA1, MPG, NCOR1, NCOR2, CALCOCO1, PRMT1, PPARBP and CREBBP) suggested to influence transcriptional activation by steroid hormone receptors in a multiethnic panel of women with advanced breast cancer (n = 95): African Americans, Latinos, Japanese, Native Hawaiians and European Americans. Association testing of validated coding variants was conducted in a breast cancer case-control study (1,612 invasive cases and 1,961 controls) nested in the Multiethnic Cohort. We used logistic regression to estimate odds ratios for allelic effects in ethnic-pooled analyses as well as in subgroups defined by disease stage and steroid hormone receptor status. We also investigated effect modification by established breast cancer risk factors that are associated with steroid hormone exposure.

    Results: We identified 45 coding variants with frequencies ≥1% in any one ethnic group (43 non-synonymous variants). We observed nominally significant positive associations with two coding variants in ethnic-pooled analyses (NCOR2: His52Arg, OR = 1.79; 95% CI, 1.05-3.05; CALCOCO1: Arg12His, OR = 2.29; 95% CI, 1.00-5.26). A small number of variants were associated with risk in disease subgroup analyses and we observed no strong evidence of effect modification by breast cancer risk factors. Based on the large number of statistical tests conducted in this study, the nominally significant associations that we observed may be due to chance, and will need to be confirmed in other studies.

    Conclusion: Our findings suggest that common coding variation in these candidate genes do not make a substantial contribution to breast cancer risk in the general population. Cataloging and testing of coding variants in coactivator and corepressor genes should continue and may serve as a valuable resource for investigations of other hormone-related phenotypes, such as inter-individual response to hormonal therapies used for cancer treatment and prevention.

    Funded by: NCI NIH HHS: CA54281, CA63464, N01-PC-35137, N01-PC-35139

    BMC cancer 2009;9;43

  • Role of the N-terminal activation domain of the coiled-coil coactivator in mediating transcriptional activation by beta-catenin.

    Yang CK, Kim JH and Stallcup MR

    Department of Biochemistry and Molecular Biology, University of Southern California, 1333 San Pablo Street, MCA 51A, Los Angeles, California 90089-9151, USA.

    The coiled-coil coactivator (CoCoA) is involved in transcriptional activation of target genes by nuclear receptors and the xenobiotic aryl hydrocarbon receptor, as well as target genes of the Wnt signaling pathway, which is mediated by the lymphocyte enhancer factor (LEF)/T cell factor transcription factors and the coactivator beta-catenin. The recruitment of CoCoA by nuclear receptors is accomplished by the interaction of the central coiled-coil domain of CoCoA with p160 coactivators; the C-terminal activation domain (AD) of CoCoA is used for downstream signaling, whereas the function of the N-terminal region is undefined. Here we report that the N terminus of CoCoA contains another AD, which is necessary and sufficient for synergistic activation of LEF1-mediated transcription by CoCoA and beta-catenin. The N-terminal AD contains a p300 binding motif, which is important for synergistic cooperation of CoCoA and p300 as coactivators for LEF1 and beta-catenin. p300 contributes to the function of the CoCoA N-terminal AD primarily through its histone acetyltransferase activity. Moreover, in cultured cells, endogenous p300 is recruited to the promoter of an integrated reporter gene by the N terminus of CoCoA. Thus, the coactivator function of CoCoA for nuclear receptors and LEF1/beta-catenin involves differential utilization of two different CoCoA ADs.

    Funded by: NIDDK NIH HHS: DK43093, P03 DK48522, P30 DK048522, R01 DK043093

    Molecular endocrinology (Baltimore, Md.) 2006;20;12;3251-62

  • A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration.

    Lim J, Hao T, Shaw C, Patel AJ, Szabó G, Rual JF, Fisk CJ, Li N, Smolyar A, Hill DE, Barabási AL, Vidal M and Zoghbi HY

    Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

    Many human inherited neurodegenerative disorders are characterized by loss of balance due to cerebellar Purkinje cell (PC) degeneration. Although the disease-causing mutations have been identified for a number of these disorders, the normal functions of the proteins involved remain, in many cases, unknown. To gain insight into the function of proteins involved in PC degeneration, we developed an interaction network for 54 proteins involved in 23 inherited ataxias and expanded the network by incorporating literature-curated and evolutionarily conserved interactions. We identified 770 mostly novel protein-protein interactions using a stringent yeast two-hybrid screen; of 75 pairs tested, 83% of the interactions were verified in mammalian cells. Many ataxia-causing proteins share interacting partners, a subset of which have been found to modify neurodegeneration in animal models. This interactome thus provides a tool for understanding pathogenic mechanisms common for this class of neurodegenerative disorders and for identifying candidate genes for inherited ataxias.

    Funded by: NICHD NIH HHS: HD24064; NINDS NIH HHS: NS27699

    Cell 2006;125;4;801-14

  • Differential use of functional domains by coiled-coil coactivator in its synergistic coactivator function with beta-catenin or GRIP1.

    Yang CK, Kim JH, Li H and Stallcup MR

    Department of Biochemistry and Molecular Biology, University of Southern California, Los Angeles, California 90089, USA.

    beta-Catenin, a pivotal component of the Wnt-signaling pathway, binds to and serves as a transcriptional coactivator for the T-cell factor/lymphoid enhancer factor (TCF/LEF) family of transcriptional activator proteins and for the androgen receptor (AR), a nuclear receptor. Three components of the p160 nuclear receptor coactivator complex, including CARM1, p300/CBP, and GRIP1 (one of the p160 coactivators), bind to and cooperate with beta-catenin to enhance transcriptional activation by TCF/LEF and AR. Here we report that another component of the p160 nuclear receptor coactivator complex, the coiled-coil coactivator (CoCoA), directly binds to and cooperates synergistically with beta-catenin as a coactivator for AR and TCF/LEF. CoCoA uses different domains to bind GRIP1 and beta-catenin, and it uses different domains to transmit the activating signal to the transcription machinery, depending on whether it is bound to GRIP1 or beta-catenin. CoCoA associated specifically with the promoters of transiently transfected and endogenous target genes of TCF/LEF, and reduction of the endogenous CoCoA level decreased the ability of TCF/LEF and beta-catenin to activate transcription of transient and endogenous target genes. Thus, CoCoA uses different combinations of functional domains to serve as a physiologically relevant component of the Wnt/beta-catenin signaling pathway and the androgen signaling pathway.

    Funded by: NIDDK NIH HHS: DK43093, R01 DK043093

    The Journal of biological chemistry 2006;281;6;3389-97

  • The LIFEdb database in 2006.

    Mehrle A, Rosenfelder H, Schupp I, del Val C, Arlt D, Hahne F, Bechtel S, Simpson J, Hofmann O, Hide W, Glatting KH, Huber W, Pepperkok R, Poustka A and Wiemann S

    Division Molecular Genome Analysis, German Cancer Research Center, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany. a.mehrle@dkfz.de

    LIFEdb (http://www.LIFEdb.de) integrates data from large-scale functional genomics assays and manual cDNA annotation with bioinformatics gene expression and protein analysis. New features of LIFEdb include (i) an updated user interface with enhanced query capabilities, (ii) a configurable output table and the option to download search results in XML, (iii) the integration of data from cell-based screening assays addressing the influence of protein-overexpression on cell proliferation and (iv) the display of the relative expression ('Electronic Northern') of the genes under investigation using curated gene expression ontology information. LIFEdb enables researchers to systematically select and characterize genes and proteins of interest, and presents data and information via its user-friendly web-based interface.

    Nucleic acids research 2006;34;Database issue;D415-8

  • Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes.

    Kimura K, Wakamatsu A, Suzuki Y, Ota T, Nishikawa T, Yamashita R, Yamamoto J, Sekine M, Tsuritani K, Wakaguri H, Ishii S, Sugiyama T, Saito K, Isono Y, Irie R, Kushida N, Yoneyama T, Otsuka R, Kanda K, Yokoi T, Kondo H, Wagatsuma M, Murakawa K, Ishida S, Ishibashi T, Takahashi-Fujii A, Tanase T, Nagai K, Kikuchi H, Nakai K, Isogai T and Sugano S

    Life Science Research Laboratory, Central Research Laboratory, Hitachi, Ltd., Kokubunji, Tokyo, 185-8601, Japan.

    By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by more than 500 bp and thus are very likely to constitute mutually distinct alternative promoters. To our surprise, at least 7674 (52%) human RefSeq genes were subject to regulation by putative alternative promoters (PAPs). On average, there were 3.1 PAPs per gene, with the composition of one CpG-island-containing promoter per 2.6 CpG-less promoters. In 17% of the PAP-containing loci, tissue-specific use of the PAPs was observed. The richest tissue sources of the tissue-specific PAPs were testis and brain. It was also intriguing that the PAP-containing promoters were enriched in the genes encoding signal transduction-related proteins and were rarer in the genes encoding extracellular proteins, possibly reflecting the varied functional requirement for and the restricted expression of those categories of genes, respectively. The patterns of the first exons were highly diverse as well. On average, there were 7.7 different splicing types of first exons per locus partly produced by the PAPs, suggesting that a wide variety of transcripts can be achieved by this mechanism. Our findings suggest that use of alternate promoters and consequent alternative use of first exons should play a pivotal role in generating the complexity required for the highly elaborated molecular systems in humans.

    Genome research 2006;16;1;55-65

  • Towards a proteome-scale map of the human protein-protein interaction network.

    Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, Smolyar A, Bosak S, Sequerra R, Doucette-Stamm L, Cusick ME, Hill DE, Roth FP and Vidal M

    Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, 44 Binney Street, Boston, Massachusetts 02115, USA.

    Systematic mapping of protein-protein interactions, or 'interactome' mapping, was initiated in model organisms, starting with defined biological processes and then expanding to the scale of the proteome. Although far from complete, such maps have revealed global topological and dynamic features of interactome networks that relate to known biological properties, suggesting that a human interactome map will provide insight into development and disease mechanisms at a systems level. Here we describe an initial version of a proteome-scale map of human binary protein-protein interactions. Using a stringent, high-throughput yeast two-hybrid system, we tested pairwise interactions among the products of approximately 8,100 currently available Gateway-cloned open reading frames and detected approximately 2,800 interactions. This data set, called CCSB-HI1, has a verification rate of approximately 78% as revealed by an independent co-affinity purification assay, and correlates significantly with other biological attributes. The CCSB-HI1 data set increases by approximately 70% the set of available binary interactions within the tested space and reveals more than 300 new connections to over 100 disease-associated proteins. This work represents an important step towards a systematic and comprehensive human interactome project.

    Funded by: NCI NIH HHS: R33 CA132073; NHGRI NIH HHS: P50 HG004233, R01 HG001715, RC4 HG006066, U01 HG001715; NHLBI NIH HHS: U01 HL098166

    Nature 2005;437;7062;1173-8

  • A human protein-protein interaction network: a resource for annotating the proteome.

    Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H and Wanker EE

    Max Delbrueck Center for Molecular Medicine, 13092 Berlin-Buch, Germany.

    Protein-protein interaction maps provide a valuable framework for a better understanding of the functional organization of the proteome. To detect interacting pairs of human proteins systematically, a protein matrix of 4456 baits and 5632 preys was screened by automated yeast two-hybrid (Y2H) interaction mating. We identified 3186 mostly novel interactions among 1705 proteins, resulting in a large, highly connected network. Independent pull-down and co-immunoprecipitation assays validated the overall quality of the Y2H interactions. Using topological and GO criteria, a scoring system was developed to define 911 high-confidence interactions among 401 proteins. Furthermore, the network was searched for interactions linking uncharacterized gene products and human disease proteins to regulatory cellular pathways. Two novel Axin-1 interactions were validated experimentally, characterizing ANP32A and CRMP1 as modulators of Wnt signaling. Systematic human protein interaction screens can lead to a more comprehensive understanding of protein function and cellular processes.

    Cell 2005;122;6;957-68

  • The lacrimal gland transcriptome is an unusually rich source of rare and poorly characterized gene transcripts.

    Ozyildirim AM, Wistow GJ, Gao J, Wang J, Dickinson DP, Frierson HF and Laurie GW

    Department of Cell Biology, UVa Health System, University of Virginia, Charlottesville, VA 22908 USA.

    Purpose: To sequence and comprehensively analyze human and mouse lacrimal gland transcriptomes as part of the NEIBank project.

    Methods: cDNA libraries generated from normal human and mouse lacrimal glands were sequenced and analyzed by PHRED, RepeatMasker, BLAST, and GRIST. Human "lacrimal-preferred genes" and putative gene regulatory elements were respectively identified in UniGene and ConSite, and gene clustering was analyzed by chromosomal mapping. "Hypothetical proteins," identified by keyword search, were verified by genomic alignment and queried in the Conserved Domain database and GEO Profiles.

    Results: The top six transcripts in human and mouse differed, revealing a previously unappreciated molecular divergence. The human transcriptome is enriched with transcripts from 29 lacrimal-preferred genes and a content of poorly characterized hypothetical proteins, proportionally greater than in all other tissues. Only 45% of lacrimal preferred, but 71% of hypotheticals, have mouse orthologs. Many of the latter display apparently altered cancer expression in the CGAP SAGE library collection-often in keeping with predicted WD40, protein kinase, Src homology 2 and 3, RhoGEF, and pleckstrin homology domains involved in cell signaling. At the genomic level, lacrimal-expressed genes show some evidence of clustering, particularly on human chromosomes 9 and 12. Binding sites for TFAP2A, FOXC1, and other transcription factors are predicted.

    Conclusions: Interspecies divergence cautions against use of mouse models of human dry eye syndromes. Lacrimal preferred and hypothetical proteins, gene clustering, and putative gene regulatory elements together provide new clues for a molecular understanding of lacrimal gland function and mechanisms of coordinated tissue-specific transcriptional regulation.

    Funded by: NEI NIH HHS: EY 13143, R01 EY013143, R01 EY013143-04

    Investigative ophthalmology & visual science 2005;46;5;1572-80

  • Cellular signaling mediated by calphoglin-induced activation of IPP and PGM.

    Takahashi K, Inuzuka M and Ingi T

    Department of Neurophysiology, Brain Research Institute, Niigata University, 1 Asahi-machi, Niigata 951-8585, Japan.

    Universal protein networks conserved from bacteria to animals dictate the core functions of cells. Inorganic pyrophosphatase (IPP) is an essential enzyme that plays a pivotal role in a broad spectrum of cellular biosynthetic reactions such as amino acid, nucleotide, polysaccharide, and fatty acid biosynthesis. However, the in vivo cellular regulation mechanisms of IPP and another key metabolic enzyme, phosphoglucomutase (PGM), remain unknown. This study aimed to examine the universal protein regulatory network by utilizing genome sequences, yeast proteomic data, and phosphoryl-transfer experiments. Here we report a novel human protein, henceforth referred to as calphoglin, which interacts with IPP and activates it. Calphoglin enhances PGM activity through the activated IPP and more directly on its own. Protein structure and assembly, catalytic function, and ubiquitous cellular localization of the calphoglin (-IPP-PGM) complex were conserved among Escherichia coli, yeast, and mammals. In the rat brain, calphoglin mRNA was enriched in the hippocampus and the cerebellum. Further, the linkage of the calphoglin complex to calcium signaling was demonstrated by its interactive co-localization within the calmodulin/calcineurin signaling complex, by Ca(2+)-binding and Ca(2+)-controlled activity of calphoglin-IPP, and by calphoglin-induced enhancement of microsomal Ca(2+) uptake. Collectively, these results suggest that the calphoglin complex is a common mechanism utilized in mediating bacterial cell metabolism and Ca(2+)/calmodulin/calcineurin-dependent mammalian cell activation. This is the first report of an activator of IPP and PGM, a function novel to proteins.

    Biochemical and biophysical research communications 2004;325;1;203-14

  • From ORFeome to biology: a functional genomics pipeline.

    Wiemann S, Arlt D, Huber W, Wellenreuther R, Schleeger S, Mehrle A, Bechtel S, Sauermann M, Korf U, Pepperkok R, Sültmann H and Poustka A

    Molecular Genome Analysis, German Cancer Research Center, 69120 Heidelberg, Germany. s.wiemann@dkfz.de

    As several model genomes have been sequenced, the elucidation of protein function is the next challenge toward the understanding of biological processes in health and disease. We have generated a human ORFeome resource and established a functional genomics and proteomics analysis pipeline to address the major topics in the post-genome-sequencing era: the identification of human genes and splice forms, and the determination of protein localization, activity, and interaction. Combined with the understanding of when and where gene products are expressed in normal and diseased conditions, we create information that is essential for understanding the interplay of genes and proteins in the complex biological network. We have implemented bioinformatics tools and databases that are suitable to store, analyze, and integrate the different types of data from high-throughput experiments and to include further annotation that is based on external information. All information is presented in a Web database (http://www.dkfz.de/LIFEdb). It is exploited for the identification of disease-relevant genes and proteins for diagnosis and therapy.

    Genome research 2004;14;10B;2136-44

  • Complete sequencing and characterization of 21,243 full-length human cDNAs.

    Ota T, Suzuki Y, Nishikawa T, Otsuki T, Sugiyama T, Irie R, Wakamatsu A, Hayashi K, Sato H, Nagai K, Kimura K, Makita H, Sekine M, Obayashi M, Nishi T, Shibahara T, Tanaka T, Ishii S, Yamamoto J, Saito K, Kawai Y, Isono Y, Nakamura Y, Nagahari K, Murakami K, Yasuda T, Iwayanagi T, Wagatsuma M, Shiratori A, Sudo H, Hosoiri T, Kaku Y, Kodaira H, Kondo H, Sugawara M, Takahashi M, Kanda K, Yokoi T, Furuya T, Kikkawa E, Omura Y, Abe K, Kamihara K, Katsuta N, Sato K, Tanikawa M, Yamazaki M, Ninomiya K, Ishibashi T, Yamashita H, Murakawa K, Fujimori K, Tanai H, Kimata M, Watanabe M, Hiraoka S, Chiba Y, Ishida S, Ono Y, Takiguchi S, Watanabe S, Yosida M, Hotuta T, Kusano J, Kanehori K, Takahashi-Fujii A, Hara H, Tanase TO, Nomura Y, Togiya S, Komai F, Hara R, Takeuchi K, Arita M, Imose N, Musashino K, Yuuki H, Oshima A, Sasaki N, Aotsuka S, Yoshikawa Y, Matsunawa H, Ichihara T, Shiohata N, Sano S, Moriya S, Momiyama H, Satoh N, Takami S, Terashima Y, Suzuki O, Nakagawa S, Senoh A, Mizoguchi H, Goto Y, Shimizu F, Wakebe H, Hishigaki H, Watanabe T, Sugiyama A, Takemoto M, Kawakami B, Yamazaki M, Watanabe K, Kumagai A, Itakura S, Fukuzumi Y, Fujimori Y, Komiyama M, Tashiro H, Tanigami A, Fujiwara T, Ono T, Yamada K, Fujii Y, Ozaki K, Hirao M, Ohmori Y, Kawabata A, Hikiji T, Kobatake N, Inagaki H, Ikema Y, Okamoto S, Okitani R, Kawakami T, Noguchi S, Itoh T, Shigeta K, Senba T, Matsumura K, Nakajima Y, Mizuno T, Morinaga M, Sasaki M, Togashi T, Oyama M, Hata H, Watanabe M, Komatsu T, Mizushima-Sugano J, Satoh T, Shirai Y, Takahashi Y, Nakagawa K, Okumura K, Nagase T, Nomura N, Kikuchi H, Masuho Y, Yamashita R, Nakai K, Yada T, Nakamura Y, Ohara O, Isogai T and Sugano S

    Helix Research Institute, 1532-3 Yana, Kisarazu, Chiba 292-0812, Japan.

    As a base for human transcriptome and functional genomics, we created the "full-length long Japan" (FLJ) collection of sequenced human cDNAs. We determined the entire sequence of 21,243 selected clones and found that 14,490 cDNAs (10,897 clusters) were unique to the FLJ collection. About half of them (5,416) seemed to be protein-coding. Of those, 1,999 clusters had not been predicted by computational methods. The distribution of GC content of nonpredicted cDNAs had a peak at approximately 58% compared with a peak at approximately 42% for predicted cDNAs. Thus, there seems to be a slight bias against GC-rich transcripts in current gene prediction procedures. The rest of the cDNAs unique to the FLJ collection (5,481) contained no obvious open reading frames (ORFs) and thus are candidate noncoding RNAs. About one-fourth of them (1,378) showed a clear pattern of splicing. The distribution of GC content of noncoding cDNAs was narrow and had a peak at approximately 42%, relatively low compared with that of protein-coding cDNAs.

    Nature genetics 2004;36;1;40-5

  • The secreted protein discovery initiative (SPDI), a large-scale effort to identify novel human secreted and transmembrane proteins: a bioinformatics assessment.

    Clark HF, Gurney AL, Abaya E, Baker K, Baldwin D, Brush J, Chen J, Chow B, Chui C, Crowley C, Currell B, Deuel B, Dowd P, Eaton D, Foster J, Grimaldi C, Gu Q, Hass PE, Heldens S, Huang A, Kim HS, Klimowski L, Jin Y, Johnson S, Lee J, Lewis L, Liao D, Mark M, Robbie E, Sanchez C, Schoenfeld J, Seshagiri S, Simmons L, Singh J, Smith V, Stinson J, Vagts A, Vandlen R, Watanabe C, Wieand D, Woods K, Xie MH, Yansura D, Yi S, Yu G, Yuan J, Zhang M, Zhang Z, Goddard A, Wood WI, Godowski P and Gray A

    Departments of Bioinformatics, Molecular Biology and Protein Chemistry, Genentech, Inc, South San Francisco, California 94080, USA. hclark@gene.com

    A large-scale effort, termed the Secreted Protein Discovery Initiative (SPDI), was undertaken to identify novel secreted and transmembrane proteins. In the first of several approaches, a biological signal sequence trap in yeast cells was utilized to identify cDNA clones encoding putative secreted proteins. A second strategy utilized various algorithms that recognize features such as the hydrophobic properties of signal sequences to identify putative proteins encoded by expressed sequence tags (ESTs) from human cDNA libraries. A third approach surveyed ESTs for protein sequence similarity to a set of known receptors and their ligands with the BLAST algorithm. Finally, both signal-sequence prediction algorithms and BLAST were used to identify single exons of potential genes from within human genomic sequence. The isolation of full-length cDNA clones for each of these candidate genes resulted in the identification of >1000 novel proteins. A total of 256 of these cDNAs are still novel, including variants and novel genes, per the most recent GenBank release version. The success of this large-scale effort was assessed by a bioinformatics analysis of the proteins through predictions of protein domains, subcellular localizations, and possible functional roles. The SPDI collection should facilitate efforts to better understand intercellular communication, may lead to new understandings of human diseases, and provides potential opportunities for the development of therapeutics.

    Genome research 2003;13;10;2265-70

  • Immunomic analysis of human sarcoma.

    Lee SY, Obata Y, Yoshida M, Stockert E, Williamson B, Jungbluth AA, Chen YT, Old LJ and Scanlan MJ

    Ludwig Institute for Cancer Research, New York Branch at Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021, USA.

    The screening of cDNA expression libraries from human tumors with serum antibody (SEREX) has proven to be a powerful method for identifying the repertoire of tumor antigens recognized by the immune system of cancer patients, referred to as the cancer immunome. In this regard, cancer/testis (CT) antigens are of particular interest because of their immunogenicity and restricted expression patterns. Synovial sarcomas are striking with regard to CT antigen expression, with >80% of specimens homogeneously expressing NY-ESO-1 and MAGE-A3. In the present study, 54 sarcoma patients were tested for serum antibodies to NY-ESO-1, SSX2, MAGE-A1, MAGE-A3, MAGE-A4, MAGE-A10, CT7, and CT10. Two patients had detectable antibodies to CT antigens, and this seroreactivity was restricted to NY-ESO-1. Thus, although highly expressed in sarcoma, CT antigens do not induce frequent humoral immune responses in sarcoma patients. Sera from these two patients were used to immunoscreen cDNA libraries from two synovial sarcoma cell lines and normal testis, resulting in the identification of 113 distinct antigens. Thirty-nine antigens were previously identified by SEREX analysis of other tumor types, and 23/39 antigens (59%) had a serological profile that was not restricted to cancer patients, indicating that only a proportion of SEREX-defined antigens are cancer-related. A novel CT antigen, NY-SAR-35, mapping to chromosome Xq28 was identified among the cancer-related antigens, and encodes a putative extracellular protein. In addition to testis-restricted expression, NY-SAR-35 mRNA was expressed in sarcoma, melanoma, esophageal cancer, lung cancer and breast cancer. NY-SAR-35 is therefore a potential target for cancer vaccines and monoclonal antibody-based immunotherapies.

    Proceedings of the National Academy of Sciences of the United States of America 2003;100;5;2651-6

  • Toward a catalog of human genes and proteins: sequencing and analysis of 500 novel complete protein coding human cDNAs.

    Wiemann S, Weil B, Wellenreuther R, Gassenhuber J, Glassl S, Ansorge W, Böcher M, Blöcker H, Bauersachs S, Blum H, Lauber J, Düsterhöft A, Beyer A, Köhrer K, Strack N, Mewes HW, Ottenwälder B, Obermaier B, Tampe J, Heubner D, Wambutt R, Korn B, Klein M and Poustka A

    Molecular Genome Analysis, German Cancer Research Center, 69120 Heidelberg, Germany. s.wiemann@dkfz.de

    With the complete human genomic sequence being unraveled, the focus will shift to gene identification and to the functional analysis of gene products. The generation of a set of cDNAs, both sequences and physical clones, which contains the complete and noninterrupted protein coding regions of all human genes will provide the indispensable tools for the systematic and comprehensive analysis of protein function to eventually understand the molecular basis of man. Here we report the sequencing and analysis of 500 novel human cDNAs containing the complete protein coding frame. Assignment to functional categories was possible for 52% (259) of the encoded proteins, the remaining fraction having no similarities with known proteins. By aligning the cDNA sequences with the sequences of the finished chromosomes 21 and 22 we identified a number of genes that either had been completely missed in the analysis of the genomic sequences or had been wrongly predicted. Three of these genes appear to be present in several copies. We conclude that full-length cDNA sequencing continues to be crucial also for the accurate identification of genes. The set of 500 novel cDNAs, and another 1000 full-coding cDNAs of known transcripts we have identified, adds up to cDNA representations covering 2%-5% of all human genes. We thus substantially contribute to the generation of a gene catalog, consisting of both full-coding cDNA sequences and clones, which should be made freely available and will become an invaluable tool for detailed functional studies.

    Genome research 2001;11;3;422-35

  • DNA cloning using in vitro site-specific recombination.

    Hartley JL, Temple GF and Brasch MA

    Life Technologies, Inc., Rockville, Maryland 20850, USA. jhartley@lifetech.com

    As a result of numerous genome sequencing projects, large numbers of candidate open reading frames are being identified, many of which have no known function. Analysis of these genes typically involves the transfer of DNA segments into a variety of vector backgrounds for protein expression and functional analysis. We describe a method called recombinational cloning that uses in vitro site-specific recombination to accomplish the directional cloning of PCR products and the subsequent automatic subcloning of the DNA segment into new vector backbones at high efficiency. Numerous DNA segments can be transferred in parallel into many different vector backgrounds, providing an approach to high-throughput, in-depth functional analysis of genes and rapid optimization of protein expression. The resulting subclones maintain orientation and reading frame register, allowing amino- and carboxy-terminal translation fusions to be generated. In this paper, we outline the concepts of this approach and provide several examples that highlight some of its potential.

    Genome research 2000;10;11;1788-95

  • Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing.

    Simpson JC, Wellenreuther R, Poustka A, Pepperkok R and Wiemann S

    Department of Cell Biology and Biophysics, EMBL Heidelberg, Germany.

    As a first step towards a more comprehensive functional characterization of cDNAs than bioinformatic analysis, which can only make functional predictions for about half of the cDNAs sequenced, we have developed and tested a strategy that allows their systematic and fast subcellular localization. We have used a novel cloning technology to rapidly generate N- and C-terminal green fluorescent protein fusions of cDNAs to examine the intracellular localizations of > 100 expressed fusion proteins in living cells. The entire analysis is suitable for automation, which will be important for scaling up throughput. For > 80% of these new proteins a clear intracellular localization to known structures or organelles could be determined. For the cDNAs where bioinformatic analyses were able to predict possible identities, the localization was able to support these predictions in 75% of cases. For those cDNAs where no homologies could be predicted, the localization data represent the first information.

    EMBO reports 2000;1;3;287-92

  • Prediction of the coding sequences of unidentified human genes. XVII. The complete sequences of 100 new cDNA clones from brain which code for large proteins in vitro.

    Nagase T, Kikuno R, Ishikawa K, Hirosawa M and Ohara O

    Kazusa DNA Research Institute, Kisarazu, Chiba, Japan. nagase@kazusa.or.jp

    To provide information regarding the coding sequences of unidentified human genes, we have conducted a sequencing project of human cDNAs which encode large proteins. We herein present the entire sequences of 100 cDNA clones of unknown human genes, named KIAA1444 to KIAA1543, from two sets of size-fractionated human adult and fetal brain cDNA libraries. The average sizes of the inserts and corresponding open reading frames of cDNA clones analyzed here were 4.4 kb and 2.6 kb (856 amino acid residues), respectively. Database searches of the predicted amino acid sequences classified 53 predicted gene products into the following five functional categories: cell signaling/communication, nucleic acid management, cell structure/motility, protein management and metabolism. It was also revealed that homologues for 32 KIAA gene products were detected in the databases, which were similar in sequence through almost their entire regions. Additionally, the chromosomal loci of the genes were determined by using human-rodent hybrid panels unless their chromosomal loci were already assigned in the public databases. The expression levels of the genes were monitored in spinal cord, fetal brain and fetal liver, as well as in 10 human tissues and 8 brain regions, by reverse transcription-coupled polymerase chain reaction, products of which were quantified by enzyme-linked immunosorbent assay.

    DNA research : an international journal for rapid publication of reports on genes and genomes 2000;7;2;143-50

  • Construction and characterization of a full length-enriched and a 5'-end-enriched cDNA library.

    Suzuki Y, Yoshitomo-Nakagawa K, Maruyama K, Suyama A and Sugano S

    International and Interdisciplinary Studies, The University of Tokyo, Japan.

    Using 'oligo-capped' mRNA [Maruyama, K., Sugano, S., 1994. Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides. Gene 138, 171-174], whose cap structure was replaced by a synthetic oligonucleotide, we constructed two types of cDNA library. One is a 'full length-enriched cDNA library' which has a high content of full-length cDNA clones and the other is a '5'-end-enriched cDNA library', which has a high content of cDNA clones with their mRNA start sites. The 5'-end-enriched library was constructed especially for isolating the mRNA start sites of long mRNAs. In order to characterize these libraries, we performed one-pass sequencing of randomly selected cDNA clones from both libraries (84 clones for the full length-enriched cDNA library and 159 clones for the 5'-end-enriched cDNA library). The cDNA clones of the polypeptide chain elongation factor 1 alpha were most frequently (nine clones) isolated, and more than 80% of them (eight clones) contained the mRNA start site of the gene. Furthermore, about 80% of the cDNA clones of both libraries whose sequence matched with known genes had the known 5' ends or sequences upstream of the known 5' ends (28 out of 35 for the full length-enriched library and 51 out of 62 for the 5'-end-enriched library). The longest full-length clone of the full length-enriched cDNA library was about 3300 bp (among 28 clones). In contrast, seven clones (out of the 51 clones with the mRNA start sites) from the 5'-end-enriched cDNA library came from mRNAs whose length is more than 3500 bp. These cDNA libraries may be useful for generating 5' ESTs with the information of the mRNA start sites that are now scarce in the EST database.

    Gene 1997;200;1-2;149-56

  • Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides.

    Maruyama K and Sugano S

    Institute of Medical Science, University of Tokyo, Japan.

    We have devised a method to replace the cap structure of a mRNA with an oligoribonucleotide (r-oligo) to label the 5' end of eukaryotic mRNAs. The method consists of removing the cap with tobacco acid pyrophosphatase (TAP) and ligating r-oligos to decapped mRNAs with T4 RNA ligase. This reaction was made cap-specific by removing 5'-phosphates of non-capped RNAs with alkaline phosphatase prior to TAP treatment. Unlike the conventional methods that label the 5' end of cDNAs, this method specifically labels the capped end of the mRNAs with a synthetic r-oligo prior to first-strand cDNA synthesis. The 5' end of the mRNA was identified quite simply by reverse transcription-polymerase chain reaction (RT-PCR).

    Gene 1994;138;1-2;171-4

Gene lists (5)

Gene List | Source | Species | Name | Description | Gene count
L00000009 | G2C | Homo sapiens | Human PSD | Human orthologues of mouse PSD adapted from Collins et al (2006) | 1080
L00000016 | G2C | Homo sapiens | Human PSP | Human orthologues of mouse PSP adapted from Collins et al (2006) | 1121
L00000061 | G2C | Homo sapiens | BAYES-COLLINS-MOUSE-PSD-CONSENSUS | Mouse cortex PSD consensus (ortho) | 984
L00000069 | G2C | Homo sapiens | BAYES-COLLINS-HUMAN-PSD-FULL | Human cortex biopsy PSD full list | 1461
L00000071 | G2C | Homo sapiens | BAYES-COLLINS-MOUSE-PSD-FULL | Mouse cortex PSD full list (ortho) | 1556

© G2C 2014. The Genes to Cognition Programme received funding from The Wellcome Trust and the EU FP7 Framework Programmes:
EUROSPIN (FP7-HEALTH-241498), SynSys (FP7-HEALTH-242167) and GENCODYS (FP7-HEALTH-241995).

This site is hosted by Edinburgh University and the Genes to Cognition Programme.