The Eukaryotic Linear Motif resource for
Functional Sites in Proteins
Functional site class:
VCP (P97, TERA) N-terminal domain binding motifs
Functional site description:
VCP (P97, TERA) is an essential and abundant AAA-ATPase that mediates vital cellular activities in cooperation with many cofactors. VCP complexes are involved in many cellular processes, particularly endoplasmic reticulum (ER)-associated degradation (ERAD) for protein quality control, membrane trafficking, and the DNA damage response. The N-terminal domain of VCP acts as a binding site for a group of adaptor proteins through their Arg/Lys-rich peptide motifs. Three motifs are known to bind to the N-terminal domain of VCP: the SHP box, VIM (VCP-Interacting Motif), and VBM (VCP-Binding Motif). They help direct VCP into different cellular pathways. The helical VIM and VBM motifs bind to the same groove but through different key residues. Although VCP and its binding partners are conserved in eukaryotes, the sequences that mediate their interactions differ significantly across organisms, showing that evolution has established more than one way for these proteins to interact.
ELMs with same func. site: LIG_VCP_SHPBox_1  LIG_VCP_VBM_3  LIG_VCP_VIM_2 
ELM Description:
The SHP box forms a short, antiparallel β-strand that interacts by β-augmentation with the β-sheet of the C-terminal part of the NTD subdomain of VCP, at a site distinct from that to which other VCP ligands bind. Proteins with SHP boxes include p47, Ufd1-Npl4, DVC1, ASPL/TUG and Derlin1. The motif is characterized by a short hydrophobic sequence stretch with two invariant Gly residues. Structural studies of the Derlin1-VCP (5GLF) and UFD1-VCP (5C1B) complexes show that the motif binds to three highly conserved binding pockets in the VCP Nc subdomain. The N-terminal loop region of the SHP motif interacts with the first pocket, while the central β-strand of the motif makes key interactions with the second binding pocket. The motif pairs in an antiparallel fashion with the β12 strand of the VCP N domain, forming a stable four-stranded β-sheet in the region. The β-strand of the motif is further stabilized by the α4 helix of the VCP Nc subdomain. The C-terminal region of the motif is stabilized by a loop-loop interaction at the third binding pocket. Mutations of the C-terminal residues impair the interaction. Leu is always present and Arg is preferred in the preceding position. In the UFD1-VCP complex (5C1B), the hydrophobic interactions are mediated by four residues of the Ufd1 SHP box (Phe225, Phe228, Asn233 and Leu235), whose side chains point toward the VCP Nc subdomain. The two invariant glycine residues generate a sharp kink in the middle of the SHP box and enable it to bend upon binding to the VCP N domain. Some adaptors, such as p47 and UFD1, bind to VCP in a bipartite manner through a UBX domain and an SHP motif (Conicella,2020; Hanzelmann,2016). The interaction between p47 and VCP is tightly regulated by the nucleotide state of VCP, that is, whether it is ATP- or ADP-loaded. p47 binding to apo or ATP-bound VCP can be either bipartite (UBX + SHP_C) or tripartite (UBX + SHP_C + SHP_N), while binding to ADP-bound VCP is bipartite; the stoichiometry is 3:6 for tripartite and 6:6 for bipartite binding (Conicella,2020).
Pattern: .((F.)|(W))G.G[^P].L.
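As a minimal sketch of how the pattern above can be applied, the Python snippet below compiles the regular expression exactly as listed and scans two subsequences taken from the instance table on this page, plus a made-up negative control. The function name find_shp_box and the control sequence are illustrative assumptions, not part of ELM.

```python
import re

# Pattern copied verbatim from the ELM entry for LIG_VCP_SHPBox_1.
SHP_BOX = re.compile(r".((F.)|(W))G.G[^P].L.")

def find_shp_box(sequence):
    """Return (start, end, matched_text) for each non-overlapping match.

    Coordinates are 1-based and inclusive, relative to the string passed in.
    """
    return [(m.start() + 1, m.end(), m.group(0)) for m in SHP_BOX.finditer(sequence)]

examples = {
    "UFD1 fragment": "ELGFRAFSGSGNRLDGKKKG",
    "W-containing fragment": "DQNGGGGRHNWGQGFRLGDQ",
    "negative control (hypothetical)": "AAAAAAAAAAAAAAAAAAAA",
}
for label, seq in examples.items():
    print(label, find_shp_box(seq))
```

The coordinates returned here are relative to the fragment, not to the full-length protein; to reproduce database positions the full UniProt sequence would have to be scanned instead.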
Pattern Probability: 0.0000091
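The listed pattern probability can be put in context with a rough back-of-the-envelope estimate. The sketch below assumes that the value reflects the chance of the pattern matching at one arbitrary sequence position, and it assumes a uniform frequency of 1/20 for every amino acid. Both are simplifying assumptions made here, not ELM's documented procedure, so the result should only be expected to agree with the listed value in order of magnitude.

```python
from fractions import Fraction

# Assumption: every amino acid occurs with frequency 1/20 (real proteomes differ).
P_ANY_ONE = Fraction(1, 20)   # one specific residue, e.g. F, W, G or L
P_NOT_PRO = Fraction(19, 20)  # [^P]: any residue except proline
# Wildcard positions (".") contribute a factor of 1 and are omitted.

def branch_probability(first_residue_prob):
    """Probability of one alternation branch: (F or W), G, G, [^P], L."""
    return first_residue_prob * P_ANY_ONE * P_ANY_ONE * P_NOT_PRO * P_ANY_ONE

# The "F." and "W" branches require different residues at the same position,
# so they are mutually exclusive and their probabilities add.
uniform_estimate = branch_probability(P_ANY_ONE) + branch_probability(P_ANY_ONE)
print(float(uniform_estimate))
```

Replacing the uniform 1/20 with observed residue frequencies (Trp, for example, is rarer than average) would shift the estimate.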
Present in taxon: Eukaryota
Interaction Domain:
CDC48, N-terminal subdomain (IPR003338): The CDC48 N-terminal domain is a protein domain found in AAA ATPases including cell division protein 48 (CDC48), VCP-like ATPase (VAT) and N-ethylmaleimide sensitive fusion protein. (Stoichiometry: 1 : 1)
Abstract:
The Endoplasmic Reticulum (ER) is an important eukaryotic cell organelle that has various functions, including the synthesis of proteins for export and quality control of nascent proteins. Newly synthesized proteins undergo folding and post-translational modifications in the ER. However, some proteins may not reach their native folded state. The ERAD (ER-Associated Degradation) mechanism acts as a protein quality control system and removes these misfolded proteins. ERAD distinguishes properly from improperly folded proteins in the ER lumen and extracts the latter through membrane channels (dislocation or retrotranslocation) in an energy-dependent manner for delivery to cytosolic proteasomes. Nearly all ERAD substrates are ubiquitinated prior to their degradation and these ubiquitin chains provide a binding site for VCP (Valosin-Containing Protein). By degrading misfolded proteins, ERAD is thus essential for ER homeostasis and correct functioning (Hwang,2018).
Vertebrate VCP (also known as p97 or TERA for Transitional endoplasmic reticulum ATPase; Ter94 in fly, CDC48 in yeast) is a hexameric multidomain protein belonging to the functionally highly diverse AAA+ (ATPases Associated with diverse cellular Activities) superfamily of proteins. This large group of proteins drives numerous cell biological processes by converting chemical energy into mechanical energy (Khan,2022). As noted in PAXdb, VCP is a highly expressed protein, routinely observed to be amongst the top 5% of cellular proteins. VCP is likely an essential protein in all eukaryotes (Muller,2007). It is reported to be involved in a plethora of intracellular processes with the help of various cofactor proteins that specifically recruit ubiquitylated substrates. Tight control of VCP cofactor specificity and diversity, as well as the assembly of higher-order VCP-cofactor complexes, is accomplished by various regulatory mechanisms, which include bipartite binding, binding site competition, changes in oligomeric assemblies, and nucleotide-induced conformational changes (Hanzelmann,2017). More than 40 cofactor proteins have been identified so far. Most of them are multidomain proteins composed of specific VCP binding modules and additional domains that function in the recognition of ubiquitylated target proteins, or that are catalytic or transmembrane domains (Buchberger,2015). Based on their functions, cofactors can be divided into three major classes: (i) Substrate-recruiting cofactors, such as the UFD1/NPL4 complex, link substrates to VCP and contain VCP interacting motifs and an additional ubiquitin binding domain that targets ubiquitylated substrates; (ii) Substrate-processing cofactors, like ubiquitin (E3) ligases, deubiquitinases (DUBs) and cytosolic peptide N-glycanases (PNGase), process ubiquitylated and N-glycosylated substrates; (iii) Regulatory cofactors, like UBXD4, ASPL and SVIP, sequester or recycle VCP hexamers.
A few cofactors bind via their PUB or PUL domain to the unstructured C-terminal tail of VCP while the majority of the cofactors interact with the N-terminal VCP domain (CDC48_N; PF02359), often termed P97N, either via a UBX/UBXL globular domain or any one of three linear motifs, called VCP-Interacting Motif (VIM), VCP-Binding Motif (VBM), and SHP Box (named after yeast protein Shp1) (Hanzelmann,2017). In the nucleus, VCP is recruited for DNA damage repair by the SHP box protein Spartan (SPRTN) which specifically cleaves DNA-protein cross-links (Kroning,2022).

VIM and VBM are arginine-rich motifs found in several VCP cofactors with diverse functions (Buchberger,2015). The VCP CDC48_N domain has two subdomains or “lobes”. The interdomain cleft between the Nn and Nc lobes of CDC48_N provides a sterically unopposed interface for the interaction of the various VCP cofactor proteins. Despite the absence of significant sequence similarity, the VBM and VIM motifs bind partially overlapping sites at the interdomain cleft of the N domain. Hence, one N domain can only interact with one of these motifs at a time, reducing the complexity of cofactor interactions to a combinatorial problem of six N domains per VCP hexamer. The SHP box motif interacts with the C-terminal Nc/NTD subdomain of VCP CDC48_N, a site distinct from that to which the other VCP ligands bind (Lim,2016). Competition for N domain binding has been experimentally verified for various combinations of cofactors possessing different binding modules, e.g. SHP/UBX-VIM (p47-UBXD1; Kern,2009), VIM-VBM (SVIP-HRD1; Liu,2013), and VIM-SHP/UBXL (gp78-UFD1-NPL4; Ballar,2006). Among them, SVIP is the only cofactor that binds with high affinity to all six N domains through its VIM motif, giving a 6:6 stoichiometry. It is an efficient competitor of other N domain cofactors and acts as a negative regulator of the ERAD pathway.

ERAD is necessary to preserve cell integrity, since the accumulation of defective proteins results in more than 60 diseases including neurological dysfunction, cancer and cystic fibrosis (Guerriero,2012). Mutations in VCP also cause three protein aggregation diseases: Multisystem Proteinopathy (MSP), Familial Amyotrophic Lateral Sclerosis (FALS) and Charcot-Marie-Tooth Disease Type 2Y (CMT2Y) (Ye,2017).

Many viruses exploit ERAD processes to promote their replication and to avoid detection by the immune response. The herpesviruses manipulate the immune response by degrading Major Histocompatibility Complex class I (MHC-I) molecules through retrotranslocation induced by the viral proteins US2 and US11. Likewise, the accessory protein Vpu of HIV induces CD4 degradation through the ERAD process, helping to promote HIV infection. Many bacterial toxins also use ERAD to invade host cells, e.g., cholera toxin employs ERAD to enter the cytosol (Morito,2015). Different viruses of the Nidovirales order, including the coronaviruses, use ER-derived tuning vesicles (EDEMosomes) and double-membrane vesicles (DMVs) to sequester their double-stranded RNA from cytosolic sensors that would otherwise trigger interferon production and innate immunity (Zhang,2020; Noack,2014). These observations suggest that bacterial and viral proteins might harbour VCP interacting motifs to interfere with ERAD processes.
17 Instances for LIG_VCP_SHPBox_1:
Acc., Gene Name                   Start  End  Subsequence           Logic  #Ev.  Organism
-                                 252    261  KGAFKAFTGEGQKLGSTAPQ  TP     2     Homo sapiens (Human)
-                                 241    249  DQNGGGGRHNWGQGFRLGDQ  TP     3     Homo sapiens (Human)
-                                 230    238  PLPEERPGGFAWGEGQRLGG  TP     1     Homo sapiens (Human)
Q92890 UFD1                       292    301  GGRFVAFSGEGQSLRKKGRK  U      1     Homo sapiens (Human)
Q92890 UFD1                       227    236  ELGFRAFSGSGNRLDGKKKG  TP     4     Homo sapiens (Human)
O74926 rbd2                       241    250  IAEPLSTFSSFPGKGTRLGG  TP     4     Schizosaccharomyces pombe 972h-
Q0KL01 Ubxn2b                     214    223  RLRFKAFSGEGQKLGSLTPE  TP     2     Mus musculus (House mouse)
O35987 Nsfl1c                     252    261  KGAFKAFTGEGQKLGSTAPQ  TP     4     Rattus norvegicus (Norway rat)
-                                 252    261  AQLVIPFSGKGYVLGETSNL  TP     3     Homo sapiens (Human)
Q8VBT9-1 Aspscr1                  259    268  SAPFVPFSGGGQRLGGPSAS  TP     3     Mus musculus (House mouse)
Q9CZ44 Nsfl1c                     149    158  TSKPRPFAGGGYRLGAAPEE  TP     3     Mus musculus (House mouse)
Q9CZ44 Nsfl1c                     252    261  KGAFKAFTGEGQKLGSTAPQ  TP     3     Mus musculus (House mouse)
Q12743 DFM1                       283    292  QRETRTFSGRGQRLGTAPAT  TP     2     Saccharomyces cerevisiae S288c
P34223 SHP1                       305    314  TRKLGGFSGQGQRLGSPIPG  TP     1     Saccharomyces cerevisiae S288c
P38838 WSS1                       151    160  RGLYDTFLGNGQRLGGRANL  TP     1     Saccharomyces cerevisiae S288c
P84994 Villin-like protein ABP41  -      -    -                     -      -     -
Q8N729 NPW                        131    140  EPESLDFSGAGQRLRRDVSR  U      1     Homo sapiens (Human)
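
The Start and End columns can be cross-checked against the pattern. The sketch below uses four rows copied from the instance table and assumes that End - Start + 1 should equal the length of the regular-expression match found inside the displayed subsequence, which also contains flanking residues. The helper name motif_length is hypothetical.

```python
import re

# Pattern copied verbatim from the ELM entry for LIG_VCP_SHPBox_1.
SHP_BOX = re.compile(r".((F.)|(W))G.G[^P].L.")

# (start, end, subsequence) copied from the instance table.
rows = [
    (252, 261, "KGAFKAFTGEGQKLGSTAPQ"),
    (241, 249, "DQNGGGGRHNWGQGFRLGDQ"),
    (230, 238, "PLPEERPGGFAWGEGQRLGG"),
    (227, 236, "ELGFRAFSGSGNRLDGKKKG"),
]

def motif_length(subsequence):
    """Length of the first pattern match in the subsequence, or None if absent."""
    match = SHP_BOX.search(subsequence)
    return len(match.group(0)) if match else None

# Compare the annotated span length with the length of the regex match.
checks = [(start, end, motif_length(seq) == end - start + 1) for start, end, seq in rows]
for start, end, ok in checks:
    print(start, end, "consistent" if ok else "inconsistent")
```

Rows with a 9-residue span correspond to the W branch of the pattern, and rows with a 10-residue span to the "F." branch.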
Please cite: The Eukaryotic Linear Motif resource: 2022 release. (PMID:34718738)

ELM data can be downloaded & distributed for non-commercial use according to the ELM Software License Agreement