Tetratricopeptide repeat protein 39B

TTC39B
Identifiers
Aliases	TTC39B, C9orf52, tetratricopeptide repeat domain 39B
External IDs	OMIM: 613574; MGI: 1917113; HomoloGene: 25228; GeneCards: TTC39B; OMA:TTC39B - orthologs
Gene location (Human)
Chr.	Chromosome 9 (human)
End	15,307,360 bp
Gene location (Mouse)
Chr.	Chromosome 4 (mouse)
End	83,242,492 bp
RNA expression pattern
	Top expressed in
	pancreatic epithelial cell; buccal mucosa cell; gingival epithelium; skin of limb; skin of leg; skin of abdomen; skin of arm; gonad; islet of Langerhans; skin of thigh
	Top expressed in
	spermatocyte; spermatid; granulocyte; deep cerebellar nuclei; zygote; secondary oocyte; medial vestibular nucleus; epithelium of stomach; cumulus cell; facial motor nucleus
Orthologs
Species	Human	Mouse
Entrez	158219	69863
Ensembl	ENSG00000155158	ENSMUSG00000038172
UniProt	Q5VTQ0	Q8BYY4
RefSeq (mRNA)	NM_001168339; NM_001168340; NM_001168341; NM_001168342; NM_152574	NM_025782; NM_027238
RefSeq (protein)	NP_001161811; NP_001161812; NP_001161813; NP_001161814; NP_689787	NP_081514

Tetratricopeptide repeat protein 39B is a protein that in humans is encoded by the TTC39B gene. TTC39B is also known as C9orf52 or FLJ33868. The main feature within tetratricopeptide repeat 39B is the domain of unknown function 3808 (DUF3808), spanning the majority of the protein.

Gene

TTC39B Gene Location on Chromosome 9

The gene for TTC39B is located on the short arm of the ninth chromosome at 9p22.3. The genomic DNA is 136,517 bases long, consists of 39 introns and 20 exons, and is on the minus strand. The mRNA has a length of 3,276 bases. TTC39B is surrounded by LOC100419056, a chloride channel, voltage-sensitive 3 pseudogene.^[5]

Function

TTC39B is predicted to have a molecular binding function and a role in lipid regulation; its phenotype and in vivo function are unknown.^[6]

Homology and evolution

Paralogs

There are two known paralogs for TTC39B: TTC39A and TTC39C. TTC39A has two splice isoforms and TTC39C has three splice isoforms.

TTC39A has been tested for association with diseases such as breast neoplasms. It is predicted to have a molecular binding function and to localize to various compartments (extracellular space, membrane, nucleus).^[7]

TTC39C is expected to localize in cytoplasm. No phenotype has been discovered, and the gene's in vivo function is unknown.^[8]

Orthologs

Genus and Species	Common Name	RNA Percent Identity	Divergence
Pan paniscus	Bonobo	99%	6.3 MYA
Pan troglodytes	Chimpanzee	99%	6.3 MYA
Gorilla gorilla gorilla	Gorilla	99%	8.8 MYA
Nomascus leucogenys	Gibbon	98%	20.4 MYA
Papio anubis	Baboon	97%	29.0 MYA
Pongo pygmaeus	Orangutan	97%	15.7 MYA
Callithrix jacchus	Marmoset	96%	42.6 MYA
Saimiri boliviensis boliviensis	Squirrel monkey	94%	42.6 MYA
Canis lupus familiaris	Dog	91%	94.2 MYA
Otolemur garnettii	Bushbaby	90%	74.0 MYA
Felis catus	Cat	89%	94.2 MYA
Bos taurus	Cow	88%	94.2 MYA
Cricetulus griseus	Hamster	87%	92.3 MYA
Ovis aries	Sheep	85%	94.2 MYA
Rattus norvegicus	Rat	85%	92.3 MYA

Distant homologs

Genus and Species	Common Name	RNA Percent Identity	Divergence
Sarcophilus harrisii	Tasmanian devil	78%	162.6 MYA
Gallus gallus	Chicken	75%	296.0 MYA
Taeniopygia guttata	Zebra finch	75%	296.0 MYA
Anolis carolinensis	Lizard	75%	296.0 MYA
Xenopus laevis	Frog	74%	371.2 MYA

Phylogeny

TTC39B is conserved in organisms from humans to platyhelminthes but is not conserved in yeast or other fungi.

Protein

The TTC39B gene has five different transcript variants, each coding for a different protein. This article focuses on tetratricopeptide repeat protein 39B isoform 1, the longest of the encoded proteins. When translated, the TTC39B protein is composed of 682 amino acids and has a molecular weight of 76,955.64 Da (about 77 kDa). The isoelectric point of the protein is 7.16.^[9]
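As a quick unit check, the quoted mass can be divided by the residue count; the result should land near the commonly used rule-of-thumb average of roughly 110 Da per amino acid residue. The sketch below assumes the figure 76,955.64 is expressed in daltons, which is the only reading consistent with a 682-residue protein. The variable and function names are illustrative.

```python
# Values quoted in the text for isoform 1.
N_RESIDUES = 682
MASS_DA = 76_955.64  # assumption: the quoted figure is in daltons


def mean_residue_mass(mass_da: float, n_residues: int) -> float:
    """Average mass per residue in daltons."""
    return mass_da / n_residues


mass_kda = MASS_DA / 1000
avg_residue_da = mean_residue_mass(MASS_DA, N_RESIDUES)

print(f"{mass_kda:.2f} kDa total, {avg_residue_da:.1f} Da per residue")
```

An average of about 113 Da per residue is in the expected range, so the value corresponds to roughly 77 kDa.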

Conservation

Close Orthologs:

Genus and Species	Common Name	Protein Percent Identity	Divergence
Pan troglodytes	Chimpanzee	99%	6.3 MYA
Pan paniscus	Bonobo	99%	6.3 MYA
Nomascus leucogenys	Gibbon	98%	20.4 MYA
Papio anubis	Baboon	98%	29.0 MYA
Callithrix jacchus	Marmoset	97%	42.6 MYA
Saimiri boliviensis boliviensis	Squirrel monkey	96%	42.6 MYA
Heterocephalus glaber	Naked mole-rat	92%	92.3 MYA
Canis lupus familiaris	Dog	91%	94.2 MYA
Cricetulus griseus	Hamster	90%	92.3 MYA
Ovis aries	Sheep	89%	94.2 MYA
Cavia porcellus	Guinea pig	86%	92.3 MYA

Distant Orthologs:

Genus and Species	Common Name	Protein Percent Identity	Divergence
Sarcophilus harrisii	Tasmanian devil	73%	162.6 MYA
Taeniopygia guttata	Zebra finch	72%	296.0 MYA
Pteropus alecto	Bat	55%	94.2 MYA
Bos taurus	Cow	54%	94.2 MYA
Rattus norvegicus	Rat	54%	92.3 MYA
Gallus gallus	Chicken	54%	296.0 MYA
Danio rerio	Zebrafish	54%	400.1 MYA
Crassostrea gigas	Oyster	50%	782.7 MYA
Camponotus floridanus	Ant	43%	782.7 MYA
Nasonia vitripennis	Wasp	42%	782.7 MYA
Ciona intestinalis	Sea squirt	40%	722.5 MYA
Clonorchis sinensis	Liver fluke	35%	792.4 MYA

Domains and motifs

The domain of unknown function 3808 (DUF3808) is conserved from fungi to humans and currently has no known function. It spans amino acids 142 to 568 (a length of 427 amino acids). Proteins of this family also contain a TPR_2 domain at their C-terminus, which also has an unknown function.^[10]

Another conserved domain in the TTC39B protein is the TPR_12 tetratricopeptide repeat, which spans amino acids 600 to 658 (a length of 59 amino acids).^[11] TPR domains are found in many proteins, where they facilitate specific interactions with a partner protein. Three-dimensional structural data have shown that a TPR region forms two antiparallel alpha-helices. TPR motifs arranged in tandem create a right-handed helical structure with an amphipathic channel that could accommodate the complementary region of a target protein. Most TPR-containing proteins are associated with multiprotein complexes, and there is extensive evidence that TPR motifs are important to the functioning of chaperone, cell-cycle, transcription, and protein transport complexes.^[12] Two more TPR domains are found in the TTC39B protein: TPR1, which spans amino acids 393 to 426 (34 amino acids long), and TPR2, which spans amino acids 626 to 659 (also 34 amino acids long).^[13]
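The domain lengths quoted above follow from inclusive residue coordinates (length = end - start + 1). A minimal sketch that reproduces those lengths and checks how the annotated regions overlap is given below; the `Domain` class and `overlap` helper are illustrative names, not part of any annotation tool.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    start: int  # first residue, 1-based
    end: int    # last residue, inclusive

    @property
    def length(self) -> int:
        # Inclusive coordinates: both endpoints count.
        return self.end - self.start + 1


def overlap(a: Domain, b: Domain) -> int:
    """Number of residues shared by two domains (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# Coordinates as quoted in the text.
DUF3808 = Domain("DUF3808", 142, 568)
TPR_12 = Domain("TPR_12", 600, 658)
TPR1 = Domain("TPR1", 393, 426)
TPR2 = Domain("TPR2", 626, 659)

for d in (DUF3808, TPR_12, TPR1, TPR2):
    print(f"{d.name}: {d.start}-{d.end} ({d.length} aa)")

print("TPR1 within DUF3808:", overlap(TPR1, DUF3808) == TPR1.length)
print("TPR_12/TPR2 shared residues:", overlap(TPR_12, TPR2))
```

With these coordinates, TPR1 lies entirely inside DUF3808, while TPR_12 and TPR2 share 33 residues, which suggests the two C-terminal annotations describe largely the same region.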

TTC39B contains three transmembrane regions, all located within the DUF3808 region.^[14] Because the number of membrane-spanning regions is odd, the N-terminus and C-terminus of the protein are predicted to lie on opposite sides of the membrane.
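The topology argument is a parity rule: each membrane-spanning segment carries the chain to the other side, so an odd number of segments separates the termini and an even number keeps them together. A small sketch of that rule follows; the function names and the side labels "A" and "B" are placeholders, since the text does not say which side the N-terminus faces.

```python
def termini_on_opposite_sides(n_transmembrane: int) -> bool:
    """True when the N- and C-termini end up on opposite sides.

    Each membrane-spanning segment moves the chain to the other side,
    so only the parity of the count matters.
    """
    if n_transmembrane < 0:
        raise ValueError("segment count cannot be negative")
    return n_transmembrane % 2 == 1


def soluble_segment_sides(n_transmembrane: int, first: str = "A", second: str = "B") -> list:
    """Side of each non-membrane segment, starting with the N-terminal one."""
    return [first if i % 2 == 0 else second for i in range(n_transmembrane + 1)]


print(termini_on_opposite_sides(3))
print(soluble_segment_sides(3))
```

For three segments the soluble regions alternate A, B, A, B, so the N-terminal region is on side A and the C-terminal region on side B.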

Post-translational modifications

Predicted phosphorylation sites (NetPhos 2.0):^[15]

Amino Acid	Position
Serine (S)	28, 32, 42, 51, 61, 62, 72, 91, 93, 94, 96, 101, 102, 107, 120, 123, 124, 125, 126, 127, 134, 148, 165, 173, 194, 215, 217, 218, 221, 224, 229, 270, 279, 305, 313, 329, 344, 347, 350, 365, 393, 421, 454, 461, 464, 477, 500, 509, 524, 526, 548, 551, 557, 573, 578, 580, 614, 634, 638, 660, 663, 680, 681
Threonine (T)	89, 100, 110, 121, 128, 152, 174, 183, 202, 211, 250, 269, 356, 362, 370, 467, 487, 493, 512, 563, 628, 651
Tyrosine (Y)	167, 172, 206, 210, 239, 271, 274, 295, 363, 398, 434, 451, 452, 453, 468, 523, 542, 608, 620, 623, 636, 656, 659

Predicted sumoylation sites^[16] (in each peptide, the central four-residue group set apart by spaces contains the candidate lysine):

No.	Position	Group	Score
1	619	ESEKL LKYD HYLVP	0.91
2	262	NMINF IKGG LKIRT	0.77
3	302	EFEGG VKLG SGAFN	0.76
4	133	STKVD LKSG LEECA	0.73
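To see where the predicted sumoylation sites fall relative to the annotated domains, the positions in the table can be compared with the domain coordinates given in the Domains and motifs section. This is an illustrative lookup only; the dictionary layout and the `domains_at` helper are assumptions made for the example.

```python
# Domain coordinates (inclusive) as quoted in the Domains and motifs section.
DOMAINS = {
    "DUF3808": (142, 568),
    "TPR_12": (600, 658),
    "TPR1": (393, 426),
    "TPR2": (626, 659),
}

# Predicted sumoylation positions and scores from the table above.
SUMO_SITES = {619: 0.91, 262: 0.77, 302: 0.76, 133: 0.73}


def domains_at(position: int) -> list:
    """Names of all annotated domains that contain the given residue."""
    return [name for name, (start, end) in DOMAINS.items() if start <= position <= end]


for pos, score in sorted(SUMO_SITES.items()):
    hits = domains_at(pos) or ["outside annotated domains"]
    print(f"K{pos} (score {score}): {', '.join(hits)}")
```

Under these coordinates, positions 262 and 302 lie in DUF3808, position 619 lies in TPR_12, and position 133 falls just before the start of DUF3808.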

There is one possible N-glycosylation site at amino acid 391. However, because the TTC39B protein does not contain a signal peptide, it is unlikely that this glycosylation actually occurs.

Secondary structure

According to an analysis of the secondary protein structure, TTC39B is most likely to localize to the endoplasmic reticulum, mitochondria, and Golgi apparatus.^[14]

Tertiary and quaternary structure

The TTC39B protein is predicted to fold into an alpha-alpha superhelix. 40% of its structure matches d1w3ba, the superhelical domain of O-linked GlcNAc transferase. O-GlcNAc couples metabolic status to the regulation of a wide variety of cellular signaling pathways by acting as a nutrient sensor.^[17]

Expression

Promoter and transcription start site

The promoter for TTC39B starts at base pair 15,307,109 and ends at base pair 15,307,858. It has a length of 750 base pairs. The transcription start site for TTC39B protein isoform 1 is located from base pairs 15,307,340 to 15,307,389 and has a length of 50 bp.
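The quoted lengths follow from inclusive base-pair coordinates, and the transcription start site window sits inside the promoter interval. A short sketch verifying both points is below; the helper names are illustrative.

```python
# Inclusive chromosome 9 coordinates as quoted in the text.
PROMOTER = (15_307_109, 15_307_858)
TSS = (15_307_340, 15_307_389)


def span_length(interval: tuple) -> int:
    """Length in bp of an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def contains(outer: tuple, inner: tuple) -> bool:
    """True if the inner interval lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print("promoter length:", span_length(PROMOTER))
print("TSS window length:", span_length(TSS))
print("TSS inside promoter:", contains(PROMOTER, TSS))
print("TSS offset from promoter start:", TSS[0] - PROMOTER[0])
```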

Expression profile

TTC39B is well expressed in muscles, internal organs, secretory organs, reproductive organs, the immune system, and the nervous system.^[6] It has been detected in a multitude of tissues, including testis, lung, islets of Langerhans, pancreas, kidney, pooled germ cell tumors, and breast carcinoma.^[6]

Transcript variants

There are five different transcript variants for the TTC39B gene. Variant 1 is the longest transcript and encodes the longest isoform, isoform 1. Variant 2 uses an alternate in-frame splice site in the central coding region compared to variant 1, which results in a shorter protein. Variants 3 and 4 have multiple differences in the central coding region but maintain the reading frame compared to variant 1. Variant 5 differs in the 5' UTR and has multiple coding region differences compared to variant 1. These differences cause translation initiation at an in-frame downstream AUG and result in isoform 5 having a shorter N-terminus than isoform 1.^[18]

Interacting proteins

Binding transcription factors

Transcription Factor Binding Sites:^[19]

Matrix Family	Detailed Family Information	From	To	Strand	Similarity	Sequence (CAPITALS: core sequence)
V$PLAG	Pleomorphic adenoma gene	51	73	(+)	1.000	taGGGGgaagtagaggagttcca
V$TALE	TALE homeodomain class recognizing TG motifs	157	173	(+)	1.000	ggtggtgtGTCAgaggc
V$ZF02	C2H2 zinc finger transcription factors 2	294	316	(-)	1.000	cagcgCCCCacctggggtccgtg
V$MIZ1	Myc-interacting Zn finger protein 1	417	427	(-)	1.000	cacgcCCTCtg
O$TF2B	RNA polymerase II transcription factor II B	517	523	(-)	1.000	ccgCGCC
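In the table above, each site's From and To coordinates are inclusive, so the sequence length should equal To - From + 1, and the core sequence is the run of capital letters. The sketch below checks that consistency and extracts the cores; the tuple layout and helper names are assumptions made for the example and do not reflect the output format of any particular tool.

```python
# (matrix family, from, to, strand, sequence) rows copied from the table.
SITES = [
    ("V$PLAG", 51, 73, "+", "taGGGGgaagtagaggagttcca"),
    ("V$TALE", 157, 173, "+", "ggtggtgtGTCAgaggc"),
    ("V$ZF02", 294, 316, "-", "cagcgCCCCacctggggtccgtg"),
    ("V$MIZ1", 417, 427, "-", "cacgcCCTCtg"),
    ("O$TF2B", 517, 523, "-", "ccgCGCC"),
]


def core(sequence: str) -> str:
    """Core sequence, i.e. the capitalised bases of a binding site."""
    return "".join(base for base in sequence if base.isupper())


def site_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


for family, start, end, strand, seq in SITES:
    consistent = len(seq) == site_length(start, end)
    print(f"{family} ({strand}) {start}-{end}: core {core(seq)}, length ok: {consistent}")
```

All five rows are internally consistent: each sequence has exactly as many bases as its coordinate range implies.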

Cellular Proteins

TTC39B interacts with ubiquitin C (UBC), a polyubiquitin precursor. Conjugation of ubiquitin monomers or polymers leads to different effects within a cell. Ubiquitination has been associated with protein degradation, DNA repair, cell cycle regulation, kinase modification, endocytosis, and regulation of other cell signaling pathways.^[20]

Clinical significance

Disease association

At a locus on chromosome 9p22 associated with high-density lipoprotein cholesterol (HDL-C), TTC39B was the only one of several genes in the locus to have an eQTL in liver, with the allele associated with decreased expression correlating with increased HDL-C. Knockdown of the mouse ortholog Ttc39b via a viral vector (50% knockdown) resulted in significantly higher plasma HDL-C levels at 4 days and 7 days. These data indicate that TTC39B is a causal gene for lipid regulation.^[21]

References

[1] GRCh38: Ensembl release 89: ENSG00000155158 – Ensembl, May 2017
[2] GRCm38: Ensembl release 89: ENSMUSG00000038172 – Ensembl, May 2017
[3] "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[4] "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[5] "LOC100419056 chloride channel, voltage-sensitive 3 pseudogene". NCBI. Retrieved 13 May 2013.
[6] "TTC39B, a comprehensive annotation of human, mouse, and worm genes with mRNAs or ESTs". AceView. Retrieved 13 May 2013.
[7] "TTC39A, a comprehensive annotation of human, mouse, and worm genes with mRNAs or ESTs". AceView. Retrieved 13 May 2013.
[8] "TTC39C, a comprehensive annotation of human, mouse, and worm genes with mRNAs or ESTs". AceView. Retrieved 13 May 2013.
[9] "Tetratricopeptide repeat protein 39B isoform 1 [Homo sapiens] - Protein - NCBI".
[10] "NCBI". Retrieved 9 May 2013. [permanent dead link]
[11] "NCBI". Retrieved 9 May 2013.
[12] Blatch GL, Lässle M (November 1999). "The tetratricopeptide repeat: a structural motif mediating protein-protein interactions". BioEssays. 21 (11): 932–9. doi:10.1002/(SICI)1521-1878(199911)21:11<932::AID-BIES5>3.0.CO;2-N. PMID 10517866.
[13] "NP_689787.2: TTC39B gene product [Homo sapiens]". NCBI. Retrieved 13 May 2013.
[14] "Biology Workbench". SDSC Biology Workbench. Retrieved 13 May 2013.
[15] "NetPhos 2.0 Server". Center for Biological Sequence Analysis. Retrieved 13 May 2013.
[16] "SUMOsp 2.0 - SUMOylation Site Prediction". The CUCKOO Workgroup. Archived from the original on 10 May 2013. Retrieved 13 May 2013.
[17] Lazarus MB, Nam Y, Jiang J, Sliz P, Walker S (January 2011). "Structure of human O-GlcNAc transferase and its complex with a peptide substrate". Nature. 469 (7331): 564–7. Bibcode:2011Natur.469..564L. doi:10.1038/nature09638. PMC 3064491. PMID 21240259.
[18] "TTC39B tetratricopeptide repeat domain 39B [Homo sapiens (human)]". NCBI. Retrieved 13 May 2013.
[19] "GEMS Launcher: MatInspector: Search for transcription factor binding sites". Genomatix Software Suite. Retrieved 13 May 2013. [permanent dead link]
[20] "UBC Gene - GeneCards". GeneCards. Retrieved 13 May 2013.
[21] Teslovich TM, Musunuru K, Smith AV, et al. (August 2010). "Biological, clinical and population relevance of 95 loci for blood lipids". Nature. 466 (7307): 707–13. Bibcode:2010Natur.466..707T. doi:10.1038/nature09270. PMC 3039276. PMID 20686565.
