User:Crary033/sandbox
C6orf10 is a protein that in humans is encoded by the C6orf10 gene.[3][4] C6orf10 is an open reading frame on chromosome 6 that encodes an uncharacterized protein with accession number NP_006772.[5] It is also known as testis specific basic protein 1 (TSBP1) and testis expressed basic protein 1.[6] The gene is located at 6p21.32, and its longest known mRNA transcript is 2,194 base pairs long.[7][8] Expression is highest in the testis but is also seen in other tissues such as the brain, the lens of the eye, and the medulla.[9][10][11] The C6orf10 gene produces nine different mRNAs, eight alternatively spliced variants and one unspliced form, and four alternative polyadenylation sites have been identified.[9] The C6orf10 protein has three complete isoforms and five partial isoforms.[5] The longest, isoform 1, contains 563 amino acids and has a molecular mass of 61,495 Da (about 61.5 kDa). The protein is part of the Neuromodulin-N superfamily.[12] C6orf10 orthologs are restricted to mammals, suggesting that the protein is evolutionarily recent.
Gene
C6orf10 is located on the reverse strand of chromosome 6 at 6p21.32. It is also known as testis expressed basic protein 1 (TSBP1) and is 81,390 nucleotides long with 29 exons in total. The gene neighborhood includes butyrophilin like 2 (BTNL2)[1], which encodes a class II major histocompatibility complex transmembrane protein involved in immunoregulation; NOTCH4[2], which encodes a highly conserved transmembrane protein involved in signaling between adjacent cells during development; and HLA-DRA[3], which encodes a major histocompatibility complex class II transmembrane protein involved in antigen presentation in the immune system.
mRNA Transcript
C6orf10 has seven human mRNA splice variants. The most common splice variant in humans is isoform a[4], which is 2,194 bp long and contains 22 exons. Isoform X4[5] is the longest at 4,480 bp but is not common in humans; it is more commonly seen in orthologs of human C6orf10.
Protein
C6orf10 isoform a[6] is a 563 amino acid protein, and isoform X4[7] is a 607 amino acid protein. Isoform a is rich in lysine (K), glutamine (Q), and glutamic acid (E) and poor in histidine (H) and phenylalanine (F)[8]. Overall, isoform a is a basic protein with an isoelectric point of 9 and a molecular weight of about 62 kDa[9]. This isoform contains two transmembrane regions near the beginning of the amino acid sequence. The first spans residues 6 to 25 (20 residues) and has an isoelectric point of 5. The second spans residues 100 to 119 (20 residues) and has an isoelectric point of 8. Isoform a also contains a PTZ00121 domain that extends from residue 221 to the end of the protein. There are several repetitive sequences within this domain.
Homologs
Orthologs
A protein-protein BLAST (blastp) search of the NCBI database[1] found C6orf10 only in mammals. The search returned the highest number of homologs in Primates, Artiodactyla, and Carnivora. There were only a couple of homologs in the orders Rodentia, Chiroptera, and Perissodactyla. In the orders Scandentia, Eulipotyphla, Tubulidentata, and Sirenia there was only one complete homolog each, although a few partial sequences exist. There were partial protein sequences in Lagomorpha, Dermoptera, and Macroscelidea, and no orthologs in Diprotodontia, Didelphimorphia, Cetacea, Dasyuromorphia, Pilosa, Monotremata, and Proboscidea. BLAST recovered one potential reptilian homolog in the tiger snake (Notechis scutatus); however, the peptide sequence was much longer than the C6orf10 sequence and the alignment contained many gaps.
Primates
Northern white-cheeked gibbon (Nomascus leucogenys)
Carnivora
Dog (Canis lupus familiaris), Cat (Felis catus)
Cellular Localization
C6orf10 is predicted to localize to the nucleus or the endoplasmic reticulum, based on the presence of endoplasmic reticulum retention signals and nuclear localization signals.
[1] BLAST: Basic Local Alignment Search Tool. National Center for Biotechnology Information. Available at: https://blast.ncbi.nlm.nih.gov/Blast.cgi. Accessed 4 March 2019.
- ^ "BTNL2 butyrophilin like 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 1 May 2019.
- ^ "NOTCH4 notch receptor 4 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 1 May 2019.
- ^ "HLA-DRA major histocompatibility complex, class II, DR alpha [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 1 May 2019.
- ^ "Homo sapiens testis expressed basic protein 1 (TSBP1), transcript variant 1, mRNA". 11 November 2018.
- ^ "PREDICTED: Homo sapiens chromosome 6 open reading frame 10 (C6orf10), transcript variant X4, mRNA". 26 March 2018.
- ^ "testis-expressed basic protein 1 isoform a [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 1 May 2019.
- ^ "uncharacterized protein C6orf10 isoform X4 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 1 May 2019.
- ^ "SAPS Results". www.ebi.ac.uk. Retrieved 1 May 2019.
- ^ "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 1 May 2019.
C6orf10 is a protein that in humans is encoded by the C6orf10 gene.[1][2] C6orf10 is an open reading frame on chromosome 6 whose protein product is ubiquitously expressed at low levels in adult tissues and may play a role during fetal development. C6orf10 has been linked to both neurodegenerative and autoimmune diseases in adults. Expression of this gene is highest in the testis but is also seen in other tissues such as the brain, the lens of the eye, and the medulla.[3][4][5]
Gene
C6orf10 is located on the reverse strand of chromosome 6 at 6p21.32. It is also known as testis expressed basic protein 1 (TSBP1) and is 81,390 nucleotides long with 29 exons in total. The gene neighborhood includes butyrophilin like 2 (BTNL2)[1], which encodes a class II major histocompatibility complex transmembrane protein involved in immunoregulation; NOTCH4[2], which encodes a highly conserved transmembrane protein involved in signaling between adjacent cells during development; and HLA-DRA[3], which encodes a major histocompatibility complex class II transmembrane protein involved in antigen presentation in the immune system.
mRNA Transcript
C6orf10 has seven human mRNA splice variants (a, b, c, X1, X2, X3, X4). The most common splice variant in humans is isoform a[4], which is 2,194 bp long and contains 22 exons. Isoform X4[5] is the longest at 4,480 bp but is not common in humans; it is more commonly seen in orthologs of human C6orf10. The region most commonly conserved across isoforms is the second half of the protein.
Protein
C6orf10 isoform a[6] is a 563 amino acid protein, and isoform X4[7] is a 607 amino acid protein.
Composition
C6orf10 isoform a is rich in lysine (K), glutamine (Q), and glutamic acid (E) and poor in histidine (H) and phenylalanine (F)[8]. Isoform a is a basic protein with an isoelectric point of 9 and a molecular weight of about 62 kDa[9].
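As a rough sanity check on the reported mass, a protein's size can be approximated from its length. The sketch below assumes a rule-of-thumb average of about 110 Da per amino acid residue; this figure and the helper name `estimate_mass_kda` are illustrative assumptions, not the method used by the cited Compute pI/Mw tool, which works from the actual sequence.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an assumption for illustration, not a value from the cited tool.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int,
                      residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Return an approximate protein mass in kilodaltons from its length."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * residue_mass_da / 1000.0


# Lengths of the two isoforms described in the text.
for name, length in (("isoform a", 563), ("isoform X4", 607)):
    print(f"{name}: {length} aa, roughly {estimate_mass_kda(length):.1f} kDa")
```

For the 563-residue isoform a this estimate comes out near 62 kDa, that is, about 62,000 Da, which is the order of magnitude expected for a protein of this length.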
This isoform contains two transmembrane regions near the beginning of the amino acid sequence. The first spans residues 6 to 25 (20 residues) and has an isoelectric point of 5. The second spans residues 100 to 119 (20 residues) and has an isoelectric point of 8. Isoform a also contains a PTZ00121 domain that extends from residue 221 to the end of the protein. There are several repetitive sequences within this domain.
Secondary Structure
C6orf10 consists mostly of alpha helices and random coils. Only a few regions contain beta sheets.
Tertiary Structure
The C6orf10 mRNA contains a highly conserved stem-loop structure in the 3' UTR, spanning bases 100-124.
Subcellular Localization
C6orf10 is predicted to localize to the nucleus and the endoplasmic reticulum. There is a signal peptide cleavage site between amino acids 30 and 31, so the signal peptide includes the first transmembrane domain. This N-terminal region of C6orf10 is likely localized to the endoplasmic reticulum. The C-terminal region of the protein contains two nuclear localization signals, at amino acids 489-505 and 513-529, indicating that the section of the protein after the signal peptide cleavage site is localized to the nucleus.
Expression
C6orf10 is ubiquitously expressed at low levels in adult human tissues. In adults, expression of this gene is highest in the testis. C6orf10 is expressed at higher levels in fetal and embryonic tissues, which suggests that it may play a role in development.
Regulation of Expression
Transcriptional Regulation
C6orf10 has a promoter that is 1,206 bases long. This promoter overlaps with the 5' UTR but ends before the first codon. It is fairly well conserved across primates except for a 136 nucleotide region midway through the promoter and a region at its end. Other primates have insertions at these two regions that humans lack, which may suggest that these regions of the promoter are not essential in humans.
Transcription Factors
C6orf10 transcription is regulated by the binding of many transcription factors to the promoter region. The CCAAT box and TATA box are highly conserved elements that are important in the initiation of transcription. Several of the transcription factors, including EH1, NACA, NKX5-2, SIX4, and VCR, are involved in developmental pathways.
Abbreviation | Transcription Factor Full Name | Matrix score | Strand |
CSRNP-1 | Cysteine-serine-rich nuclear protein 1 (AXUD1, AXIN1 up-regulated 1) | 1.0 | + |
CCAAT Box | CCAAT/enhancer binding protein (C/EBP), gamma | 0.923 | + |
EH1 | Engrailed Homeobox 1 | 0.862 | + |
Cart-1 | Cartilage homeoprotein 1 | 0.997 | - |
ZFP 263 | Zinc finger protein 263, ZKSCAN12 (Zinc finger protein with KRAB and SCAN domains 12) | 0.921 | + |
SWI/SNF | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 3 | 0.999 | - |
TATA Box | Vertebrate TATA binding protein factor | 0.899 | + |
HSF2 | Heat shock Factor 2 | 0.974 | - |
Hmx2/Nkx5-2 | Hmx2/Nkx5-2 homeodomain transcription factor | 0.933 | + |
Pdx1 | Insulin promoter factor 1; pancreatic and duodenal homeobox 1 (Pdx1) | 0.924 | + |
LMX1A | LIM homeobox transcription factor 1 alpha | 1.0 | + |
NACA | Nascent polypeptide-associated complex subunit alpha 1 | 1.0 | - |
Oct1 | Octamer binding factor 1 | 0.921 | + |
POU6F1 | POU class 6 homeobox 1 | 0.973 | + |
STAT 5B | Signal transducer and activator of transcription 5B | 0.973 | + |
SIX 4 | Sine oculis homeobox homolog 4 | 0.96 | - |
NMP4 | Nuclear matrix protein 4 | 0.971 | + |
MSX | Homeodomain proteins MSX-1 and MSX-2 | 0.989 | - |
AREB6 | Atp1a1 regulatory element binding factor 6 | 1.0 | - |
VCR | Vertebrate caudal related homeodomain protein | 0.963 | + |
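The table above can be summarized programmatically, for example to count predicted sites per strand or to pull out the highest-scoring matrices. The following is a minimal sketch with the abbreviations, matrix scores, and strands transcribed from the table; the function names `strand_counts` and `top_sites` and the score cutoff are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter

# (abbreviation, matrix score, strand), transcribed from the table above.
PREDICTED_SITES = [
    ("CSRNP-1", 1.0, "+"), ("CCAAT Box", 0.923, "+"), ("EH1", 0.862, "+"),
    ("Cart-1", 0.997, "-"), ("ZFP 263", 0.921, "+"), ("SWI/SNF", 0.999, "-"),
    ("TATA Box", 0.899, "+"), ("HSF2", 0.974, "-"),
    ("Hmx2/Nkx5-2", 0.933, "+"), ("Pdx1", 0.924, "+"), ("LMX1A", 1.0, "+"),
    ("NACA", 1.0, "-"), ("Oct1", 0.921, "+"), ("POU6F1", 0.973, "+"),
    ("STAT 5B", 0.973, "+"), ("SIX 4", 0.96, "-"), ("NMP4", 0.971, "+"),
    ("MSX", 0.989, "-"), ("AREB6", 1.0, "-"), ("VCR", 0.963, "+"),
]


def strand_counts(sites):
    """Count how many predicted sites fall on each strand."""
    return Counter(strand for _, _, strand in sites)


def top_sites(sites, min_score):
    """Return sites scoring at least min_score, highest score first.

    sorted() is stable, so ties keep their order from the table.
    """
    kept = [site for site in sites if site[1] >= min_score]
    return sorted(kept, key=lambda site: site[1], reverse=True)


print(dict(strand_counts(PREDICTED_SITES)))
for abbreviation, score, strand in top_sites(PREDICTED_SITES, 0.99):
    print(f"{abbreviation}\t{score}\t{strand}")
```

A matrix score only reflects how well the promoter sequence matches a binding-site model, so a high score is a prediction of binding rather than evidence of it.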
Protein Interactions
Most of the predicted protein interactions with C6orf10 are based solely on text mining and information gathered from genome-wide association studies. The two proteins with the highest interaction scores were butyrophilin-like protein 2 (BTNL2) and the tetratricopeptide repeat domain-containing protein TTC32. BTNL2 is a negative regulator of T-cell activity and a member of the immunoglobulin superfamily; its gene is located in the C6orf10 gene neighborhood. TTC32 belongs to a protein family defined by structural repeat motifs that mediate protein-protein interactions in the formation of protein complexes. This may indicate that C6orf10 has the potential to interact with another protein to form a complex.
Clinical Significance
C6orf10 has been found to be associated with both neurodegenerative and autoimmune diseases. These associations come mostly from genome-wide association studies. Neurodegenerative diseases associated with C6orf10 include frontotemporal dementia, Parkinson's disease, and Alzheimer's disease. Autoimmune diseases associated with C6orf10 include rheumatoid arthritis, psoriasis, multiple sclerosis, Graves' disease, and lupus.
Homologs
Orthologs
A protein-protein BLAST (blastp) search of the NCBI database[1] found C6orf10 only in mammals. The search returned the highest number of homologs in Primates, Artiodactyla, and Carnivora. There were only a couple of homologs in the orders Rodentia, Chiroptera, and Perissodactyla. In the orders Scandentia, Eulipotyphla, Tubulidentata, and Sirenia there was only one complete homolog each, although a few partial sequences exist. There were partial protein sequences in Lagomorpha, Dermoptera, and Macroscelidea, and no orthologs in Diprotodontia, Didelphimorphia, Cetacea, Dasyuromorphia, Pilosa, Monotremata, and Proboscidea. BLAST recovered one potential reptilian homolog in the tiger snake (Notechis scutatus); however, the peptide sequence was much longer than the C6orf10 sequence and the alignment contained many gaps.
Common Name | Latin Name | Taxonomic Group | Abbreviation | Date of divergence (MYA) | Accession number | % Sequence identity | % Sequence similarity | Sequence length (aa) |
Human | Homo sapiens | Primates | HSA | 0 | NP_006772.3 | 100 | 100 | |
Northern white-cheeked gibbon | Nomascus leucogenys | Primates | NML | 19.43 | XP_012358645.1 | 84.35 | 87 | 311 |
Nancy Ma's night monkey | Aotus nancymaae | Primates | ANM | 42.6 | XP_012296102.1 | 73.68 | 79 | 240 |
Tree shrew | Tupaia chinensis | Scandentia | | 85 | XP_027622457.1 | 47.2 | 66 | 378 |
Long-tailed chinchilla | Chinchilla lanigera | Rodentia | CLA | 88 | XP_013372168.1 | 42.03 | 54 | 522 |
Lesser Egyptian jerboa | Jaculus jaculus | Rodentia | JJS | 88 | XP_012807521.1 | 46.43 | 64 | 311 |
Cat | Felis catus | Carnivora | FCT | 94 | XP_023109532.1 | 40.8 | 52 | 481 |
Large flying fox | Pteropus vampyrus | Chiroptera | PVY | 94 | XP_023378855.1 | 41.64 | 57 | 358 |
Star-nosed mole | Condylura cristata | Eulipotyphla | CCT | 94 | XP_012590317.1 | 42.92 | 54 | 389 |
Przewalski's horse | Equus przewalskii | Perissodactyla | EPZ | 94 | XP_008507892.1 | 43.98 | 66 | 302 |
White-tailed deer | Odocoileus virginianus texanus | Artiodactyla | OVT | 94 | XP_020765388.1 | 44.07 | 55 | 703 |
Big brown bat | Eptesicus fuscus | Chiroptera | EFS | 94 | XP_027989578.1 | 45.3 | 56 | 518 |
Cattle | Bos taurus | Artiodactyla | BTS | 94 | XP_024839688.1 | 48.22 | 62 | 832 |
Dromedary camel | Camelus dromedarius | Artiodactyla | CDD | 94 | XP_010980533.1 | 48.4 | 60 | 463 |
Horse | Equus caballus | Perissodactyla | ECB | 94 | XP_023480439 | 50.17 | 62 | 808 |
Dog | Canis lupus familiaris | Carnivora | CLF | 94 | XP_022281580.1 | 50.43 | 64 | 510 |
Polar bear | Ursus maritimus | Carnivora | UMM | 94 | XP_008710138.1 | 50.6 | 71 | 352 |
Aardvark | Orycteropus afer afer | Tubulidentata | OAA | 102 | XP_007949632.1 | 44.75 | 63 | 628 |
West Indian manatee | Trichechus manatus latirostris | Sirenia | TML | 102 | XP_004391060.1 | 45.49 | 57 | 399 |
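One way to read the ortholog table is to group species by their estimated date of divergence from humans and average the percent sequence identity within each group. The sketch below does this with the divergence dates and identity values transcribed from the table; the function name `mean_identity_by_divergence` and the choice of a simple arithmetic mean are illustrative assumptions rather than part of the original analysis.

```python
from collections import defaultdict
from statistics import mean

# (common name, divergence from humans in MYA, % sequence identity),
# transcribed from the table above.
ORTHOLOGS = [
    ("Human", 0, 100.0),
    ("Northern white-cheeked gibbon", 19.43, 84.35),
    ("Nancy Ma's night monkey", 42.6, 73.68),
    ("Tree shrew", 85, 47.2),
    ("Long-tailed chinchilla", 88, 42.03),
    ("Lesser Egyptian jerboa", 88, 46.43),
    ("Cat", 94, 40.8),
    ("Large flying fox", 94, 41.64),
    ("Star-nosed mole", 94, 42.92),
    ("Przewalski's horse", 94, 43.98),
    ("White-tailed deer", 94, 44.07),
    ("Big brown bat", 94, 45.3),
    ("Cattle", 94, 48.22),
    ("Dromedary camel", 94, 48.4),
    ("Horse", 94, 50.17),
    ("Dog", 94, 50.43),
    ("Polar bear", 94, 50.6),
    ("Aardvark", 102, 44.75),
    ("West Indian manatee", 102, 45.49),
]


def mean_identity_by_divergence(rows):
    """Map each divergence date (MYA) to the mean % identity of its species."""
    groups = defaultdict(list)
    for _, divergence_mya, identity in rows:
        groups[divergence_mya].append(identity)
    # Sort by divergence date so the result reads from recent to ancient.
    return {mya: mean(values) for mya, values in sorted(groups.items())}


for mya, identity in mean_identity_by_divergence(ORTHOLOGS).items():
    print(f"{mya:>6} MYA: mean identity {identity:.2f}%")
```

In this data set, identity falls steeply across the primates and then levels off at roughly 40 to 50 percent for the non-primate orders, which share similar divergence dates.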
Paralogs
C6orf10 has one paralog, thioredoxin domain-containing protein 2 (TXNDC2), which diverged about 135.6 million years ago.
- ^ Stammers M, Rowen L, Rhodes D, Trowsdale J, Beck S (May 2000). "BTL-II: a polymorphic locus with homology to the butyrophilin gene family, located at the border of the major histocompatibility complex class II and class III regions in human and mouse". Immunogenetics. 51 (4–5): 373–82. doi:10.1007/s002510050633. PMID 10803852.
- ^ "Entrez Gene: C6orf10 chromosome 6 open reading frame 10".
- ^ "AceView: Gene:C6orf10, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". AceView. www.ncbi.nlm.nih.gov. Retrieved 25 February 2019.
- ^ "TSBP1 testis expressed basic protein 1 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 25 February 2019.
- ^ "Tissue expression of C6orf10 - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 25 February 2019.
- ^ "The Mfold Web Server | mfold.rit.albany.edu". unafold.rna.albany.edu. Retrieved 5 May 2019.