HSP47: Species Variation

The procollagen-specific Hsp47 belongs to the serpin superfamily and is encoded by a single-copy gene in both the human genome (SERPINH1) 37 and the mouse genome (Serpinh1) 38. The SERPINH1 gene is conserved in mammals, including chimpanzee, rhesus monkey, cow, dog, mouse, and rat, as well as in chicken, frog, and zebrafish (Danio rerio). However, SERPINH1 is absent from invertebrate genomes 112. Orthologs of human SERPINH1 have been identified in 226 organisms. Single-copy SERPINH1 genes are found in tetrapods, coelacanth, and lampreys, whereas fishes harbor a variable number of SERPINH1 genes. For instance, ray-finned fishes possess three copies of the SERPINH1 gene, whereas Amazon molly (Poecilia formosa) has four copies and Tetraodon has only one partial copy. Two copies each are found in the genomes of Takifugu and Japanese medaka (Oryzias latipes) 113. Notably, all SERPINH1 genes analyzed share a characteristic gene structure of four exons and three introns, with variable syntenic organization and sequence identities 112. Kumar et al. identified intron insertions in the SERPINH1 genes of ray-finned fishes, with two introns inserted in the largest exon, eI 112. The group of Chandan Goswami identified the ancestral genomic loci of SERPINH1 in the genome of the Japanese lamprey (Lethenteron japonicum) 113. Ancestral SERPINH1 from L. japonicum is a functionally active gene encoding a protein (LjaHsp47/Serpin H1) of 470 amino acids that bears the non-inhibitory reactive center loop (RCL) and the ER-retention signal HDEL at the C-terminus 113. LjaHsp47/Serpin H1 shows 47% sequence identity to human Hsp47/Serpin H1.
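A percent-identity figure such as the 47% quoted for the lamprey and human proteins is derived from a pairwise alignment. The following is a minimal sketch of one common way to compute it, assuming the two sequences are already aligned and that gapped columns are excluded from the denominator. The source does not state which convention was used, and other definitions (for example, dividing by the full alignment length) give different values. The function name and the toy sequences are hypothetical and are not real Hsp47 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns in which neither sequence has a gap.

    Assumption: gaps are written as '-' and both strings come from the same
    pairwise alignment, so they have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without a gap in either sequence
    matches = 0   # columns where both residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (invented for illustration only).
toy_a = "MKT-LLHDEL"
toy_b = "MRTALL-DEL"
print(percent_identity(toy_a, toy_b))
```

Because the choice of denominator changes the result, identity values from different studies are comparable only when the same convention and alignment method were used.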

Although invertebrate genomes lack SERPINH1, they encode a variable number of other serpins. While the genome of the fruit fly Drosophila melanogaster has been reported to encode 29 serpins 114, the nematode Caenorhabditis elegans contains nine SERPIN genes, five of whose products serve as proteinase inhibitors 115. One of them, SRP-6, was found to play a pivotal role in pro-survival pathways by blocking necrosis 116. Arthropod genomes characteristically harbor 10 to 40 SERPIN genes, with functional divergence enhanced by alternative splicing of the sequence encoding the reactive center loop (RCL) 117. Arthropod serpins are secreted into the hemolymph, where they serve as key mediators of innate immune responses 118. Correspondingly, Serpin 27A from D. melanogaster has been shown to be involved in regulating immune responses in insects 119, 120. Consensus analyses performed by Irving and colleagues revealed a close association between horseshoe crab and insect serpins, since both organisms share a common ancestor in the Protostomia branch of the Coelomata 121.

Plant serpins, which were among the first reported members of the SERPIN superfamily, form a coherent and diverse evolutionary unit 122. Because plant and animal serpins show no orthology, it has been proposed that only a single serpin gene existed at the time of the plant-animal divergence 121. Although numerous animal serpins have been functionally characterized to date, the functions of serpins in plants remain largely unclear. In the genome of japonica rice (Oryza sativa), 14 serpin-coding genes have been identified, while the genome of the model plant Arabidopsis thaliana has eight genes encoding full-length serpins 123. Amongst the Viridiplantae, SERPIN genes are also present in bryophytes, gymnosperms, and flowering plants 123. Almost all plant serpins analyzed so far are potent inhibitors of individual mammalian serine proteinases 124-126. The barley serpin protein Zx, one of the main protein components in beer and highly abundant in barley grain, is a potent inhibitor of trypsin and chymotrypsin 125. The endogenous cysteine proteinase metacaspase-9 (AtMC9) has been reported to be targeted in vitro by the suicide inhibitor AtSerpin1 from Arabidopsis 127. AtSerpin1 was recently shown to target the 'Responsive to Desiccation-21' (RD21) papain-like cysteine protease, thereby controlling key mechanisms in the regulation of apoptosis 128. Two further Arabidopsis serpins, AtSerpin2 and AtSerpin3, are putatively associated with DNA damage responses 129. Inhibitory serpins found at high concentrations in seeds are assumed to block exogenous proteinases from insects and microbes that attack the endosperm and other seed tissues 123, 130.

Only limited information is available on the function of prokaryotic serpins. Serpins are sparsely distributed among prokaryotes, and most serpin-expressing prokaryotes harbor only a single SERPIN gene 131. Sequence-based analysis revealed that the prokaryotic serpins studied so far are mostly inhibitory 131. In contrast to mammalian serpins, prokaryotic serpins have been found to exhibit elevated thermostability 131-133. Notably, serpins from Clostridium thermocellum that reside in the cellulosome, an extracellular multiprotein complex involved in cellulose degradation, may protect the cellulosome against unwanted protease activity 134.

Viral serpins have been identified only in Poxviridae, where they facilitate virus replication in the infected host 135. DNA viruses, such as members of the Poxviridae family, bear multiple copies of functional serpins that play a critical role in viral pathogenesis 136. Serpins from Poxviridae have attracted much attention because of their possible therapeutic use in immune-mediated inflammatory diseases and in transplant medicine 137, 138. Compared to mammalian serpins, viral serpins contain significant deletions in their secondary structure elements. The cowpox virus serpin CrmA (cytokine response modifier A), one of the smallest members of the SERPIN superfamily, lacks the D-helix together with substantial parts of the A- and E-helices 139. Similarly, the crystal structure of the prokaryotic serpin thermopin from the bacterium Thermobifida fusca revealed the lack of the H-helix 132. Other branches of the Poxviridae family are less well characterized, as serpins from myxoma virus (Leporipoxvirus) and swinepox virus (Suipoxvirus) are, with one exception, orphans 121.


Table 3: Hsp47/Serpin H1 of various vertebrates

Species | Protein | UniProt ID | Aliases | Gene names | Gene ID
Homo sapiens (Human) | Serpin H1 | P50454 | 47 kDa heat shock protein (Hsp47), arsenic-transactivated protein 3 (AsTP3), cell proliferation-inducing gene 14 protein, collagen-binding protein 1/-2 (CBP-1/-2), colligin-1/-2, rheumatoid arthritis-related antigen RA-A47 | HSP47, PIG14, SERPINH1, CBP1, CBP2, SERPINH2 | 871
Mesocricetus auratus (Golden hamster) | Serpin H1 | A0A1U7QK99 | Hsp47; serpin peptidase inhibitor, clade H, member 1 | Serpinh1 | 101829082
Alligator mississippiensis (American alligator) | Serpin H1 | A0A151M3U0 | Hsp47; serpin peptidase inhibitor, clade H, member 1 | SERPINH1, HSP47 | 102561378
Amazona aestiva (Blue-fronted Amazon parrot) | Serpin H1 | A0A0Q3X7T8 | Serpin family H member 1 | AAES_10215 | ?
Oryzias latipes (Japanese medaka) |  | H2MH48 | Hsp47, serpin family H member 1 | Serpinh1, hsp47, cb216, fc56b02 | 100529194
Poecilia formosa (Amazon molly) | Serpin H1a | A0A087X795 | Hsp47; serpin peptidase inhibitor, clade H, member 1a | Serpinh1 | 103145372
Danio rerio (Zebrafish) | Serpin H1a | A8WFU6 | Hsp47; serpin peptidase inhibitor, clade H, member 1a | Serpinh1a | 555328
Ceratitis capitata (Mediterranean fruit fly) | Serpin H1 | W8BRM5 | Heparin cofactor 2, serine protease inhibitor 27A | LOC101457426 |