Chromosome 4 open reading frame 50
Chromosome 4 open reading frame 50 is a protein that in humans is encoded by the C4orf50 gene.[1] The protein localizes to the nucleus.[1] C4orf50 has orthologs in vertebrates but not in invertebrates.[2]
Gene
The C4orf50 gene is located on the minus strand of chromosome 4 at position 4p16.2.[1][3] The gene's longest isoform consists of 11 exons, a coding sequence of 6370 nucleotides, and an upstream in-frame stop codon.[4] Neighboring genes include CRMP1 and JAKMIP1.[1]
Protein
C4orf50 is 1508 amino acids long, with a reported calculated molecular weight of 30 kDa.[1] That figure appears low for a protein of this length: at an average of roughly 110 Da per residue, a 1508-residue chain would be expected to weigh about 166 kDa. The isoelectric point is approximately pH 5.6.[5] Compared with typical proteins, C4orf50 is enriched in glutamic acid and arginine and depleted in phenylalanine and tyrosine.[6]
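A protein's mass can be roughly estimated from its length by multiplying the residue count by an average residue mass. The following is a minimal sketch of that back-of-envelope check, assuming the commonly used rule-of-thumb value of about 110 Da per residue; the constant and the function name are illustrative and do not come from the cited sources.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# Assumption: this is a generic approximation, not a value from the cited tools.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Return an approximate protein mass in kDa from its residue count."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Length of the human C4orf50 protein as stated in the article.
c4orf50_length = 1508
print(f"{estimate_mass_kda(c4orf50_length):.1f} kDa")  # prints "165.9 kDa"
```

This is only an order-of-magnitude check; a sequence-based tool that sums the actual residue masses gives a more precise value.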
Tertiary structure
I-TASSER and Phyre2 predict that C4orf50 has a tertiary structure rich in alpha helices, concentrated near the N-terminus and C-terminus.[7][8]
Gene level regulation
Expression
C4orf50 RNA is expressed at low levels in most tissue types, with much higher expression in the brain, testis, adrenal gland, and prostate.[3] Within the brain, C4orf50 is expressed in specific regions, including the hippocampus and striatum;[3] the frontal lobe, parietal lobe, and amygdala show moderate expression.[3] All available RNA-sequencing data show C4orf50 expression in the brain.
Protein level regulation
Modification
C4orf50 is predicted to have 21 phosphorylation sites, one sulfonation site, one N-glycosylation site, and several O-glycosylation sites.[9]
Subcellular localization
The primary subcellular location is the nucleus.[1] Immunofluorescent staining with antibodies against C4orf50 shows that the protein is present in the nucleus, but the reason for this localization remains unknown.[10] C4orf50 is less abundant than most human proteins.[10]
Evolution
Orthologs
Human C4orf50 is poorly conserved. It is found in vertebrates but not in invertebrates, with orthologs in mammals, reptiles, birds, amphibians, and fish.[11] Table 1 below lists representative orthologs from each of these groups. C4orf50 is evolving quickly relative to the reference proteins cytochrome c and fibrinogen alpha, as shown by comparing the divergence rates of the three proteins.
Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (MYA*) | Accession # | Sequence Length (aa) | Sequence Identity to Human Protein (%) | Sequence Similarity to Human Protein (%) |
---|---|---|---|---|---|---|---|
Homo sapiens | Human | Primate | 0 | XP_047271622 | 1508 | 100 | 100 |
Tupaia chinensis | Chinese Tree Shrew | Tupaiidae | 85 | XP_027622007 | 1448 | 93 | 53.2 |
Mus musculus | House Mouse | Rodentia | 87 | XP_006504299 | 1238 | 90 | 41.9 |
Talpa occidentalis | Iberian Mole | Talpidae | 94 | XP_037386436 | 1364 | 79 | 44.3 |
Mauremys mutica | Yellow Pond Turtle | Testudines | 319 | XP_044874448 | 1954 | 62 | 30.5 |
Alligator mississippiensis | American Alligator | Crocodilia | 319 | XP_019333198 | 1893 | 37 | 28.3 |
Apteryx rowi | Okarito Kiwi | Apterygiformes | 319 | XP_025910622 | 1459 | 8 | 47.2 |
Aquila chrysaetos chrysaetos | Golden Eagle | Accipitriformes | 319 | XP_040979081 | 1611 | 10 | 38.3 |
Gallus gallus | Chicken | Galliformes | 319 | XP_046772670 | 1627 | 7 | 44.6 |
Anser cygnoides | Swan Goose | Anseriformes | 319 | XP_047902118 | 1596 | 18 | 31.7 |
Falco cherrug | Saker Falcon | Falconiformes | 319 | XP_027669980 | 1518 | 8 | 50.4 |
Strigops habroptila | Kakapo | Psittaciformes | 319 | XP_030347251 | 1497 | 8 | 50.4 |
Geotrypetes seraphini | Gaboon Caecilian | Dermophiidae | 353 | XP_033815404 | 1897 | 11 | 37.8 |
Halichoerus grypus | Grey Seal | Phocidae | 94 | XP_035960566 | 1536 | 85 | 51 |
Amblyraja radiata | Thorny Skate | Rajiformes | 464 | XP_032876992 | 2434 | 74 | 50.8 |
*MYA = Million Years Ago
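Divergence-rate comparisons of this kind are often made by converting percent identity into a corrected distance and relating it to the date of divergence. The sketch below assumes the Poisson correction, d = -ln(q), where q is the fraction of identical residues; the article does not state which correction was used, so this choice is an assumption, and the function name is illustrative. The rows are taken from the "Sequence Identity" column of Table 1 exactly as listed.

```python
import math


def corrected_divergence(identity_percent: float) -> float:
    """Poisson-corrected distance d = -ln(q), with q the identity fraction.

    Assumption: the Poisson correction is one common choice; the article
    does not say which correction underlies its comparison.
    """
    q = identity_percent / 100.0
    if not 0.0 < q <= 1.0:
        raise ValueError("identity must be in the range (0, 100]")
    # log(1/q) equals -log(q) and avoids returning -0.0 when q is 1.
    return math.log(1.0 / q)


# (species, median date of divergence in MYA, sequence identity in %),
# copied from Table 1 without adjustment.
orthologs = [
    ("Homo sapiens", 0, 100),
    ("Mus musculus", 87, 90),
    ("Talpa occidentalis", 94, 79),
    ("Mauremys mutica", 319, 62),
]

for species, mya, identity in orthologs:
    d = corrected_divergence(identity)
    print(f"{species}: {mya} MYA, corrected divergence {d:.3f}")
```

Plotting the corrected distance against the divergence date for C4orf50 and for each reference protein gives one line per protein, and a steeper slope indicates faster evolution.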
References
- "C4orf50 Gene - GeneCards | CD050 Protein | CD050 Antibody". www.genecards.org. Retrieved 2022-07-29.
- "Home - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-07-29.
- "C4orf50 chromosome 4 open reading frame 50 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-07-29.
- "PREDICTED: Homo sapiens chromosome 4 open reading frame 50 (C4orf50), transcript variant X2, mRNA". 2022-04-05.
- "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2022-07-29.
- "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2022-07-29.
- www.sbg.bio.ic.ac.uk. http://www.sbg.bio.ic.ac.uk/~phyre2/html/. Retrieved 2022-07-29.
- "I-TASSER results". seq2fun.dcmb.med.umich.edu. Retrieved 2022-07-29. [permanent dead link]
- "Services". www.healthtech.dtu.dk. Retrieved 2022-07-29.
- "C4orf50 Antibody (PA5-63550)". www.thermofisher.com. Retrieved 2022-07-29.
- "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2022-07-29.