The Eukaryotic Linear Motif resource for
Functional Sites in Proteins
Accession:
Functional site class:
ARS2 C-terminal leg domain ligand
Functional site description:
ARS2 is a central hub in RNA processing and decay. It is a large modular protein with several well-described interaction surfaces and partners. It forms a stable complex with CBC through its C-terminal disordered tail and interacts with RNA, proteins and multi-subunit complexes. The C-terminal arm is a ZnF domain that can bind RNA and protein partners through adjacent surfaces. It has a positively charged patch on its surface that is bound by diverse protein partners involved in different steps of RNA processing through the acidic EDGEI motif. The interaction is conserved from yeast to human and it enables the mutually exclusive binding of diverse partners to ARS2.
ELM Description:
The acidic EDGEI motif is located within disordered regions and binds ARS2 with low-micromolar affinity (Foucher,2022). It adopts an extended U-shaped structure when bound to the highly conserved lysine and hydrophobic residues on the surface of the ARS2 C-terminal leg domain (7QY5). Triple mutation of the three key lysines in this ZnF domain, or mutation of the negatively charged residues of the motif, abolishes the interaction (Schulze,2018; Dobrev,2021; Foucher,2022). Proteins of diverse families harbour the motif, often in two or three adjacent copies (Foucher,2022), enabling mutually exclusive binding of different RNA decay/processing/splicing factors to the CBC-ARS2 complex (Schulze,2018). In most proteins the motif is highly conserved among vertebrates, and in some cases also in, for example, invertebrates and yeasts (Foucher,2022). Furthermore, the EDGEI-mediated interaction of PHAX with ARS2 seems to be conserved from plants to humans, as the N-terminal EDGEI motif of A. thaliana PHAX is likely to mediate binding to the plant ARS2 homolog SERRATE (Giacometti,2017).

In alignments of the validated motif instances, glutamic acid is strictly retained in the 1st position, as is glycine in the 3rd position. The 2nd position varies between glutamic and aspartic acid. There are patches of conserved negatively charged residues in the flanking regions of the motif, especially in the C-terminal flank, that may contact positively charged residues on the ARS2 surface.

Aspartic acid in the 1st position is only seen in the first two motifs of ZC3H18; however, the protein has a third motif, “EEGEV”, and the motifs have not been studied individually, so the first two might bind only weakly (Rouviere,2023). The only motif with an aspartic acid in the 4th position is that of HnRNPC, but it was seen to bind ARS2 in only one high-throughput assay and the motif was not studied in detail (Schulze,2018). We therefore settled on a stricter motif definition for ELM.
Pattern: E[ED]G[EQ][ILVM].{0,2}[DE]
Pattern Probability: 0.0000092
Present in taxon: Eukaryota
Interaction Domain:
SERRATE/Ars2, C-terminal (IPR007042): this domain is found at the C terminus of SERRATE (SE) from plants and its homologue, Ars2, from animals (Stoichiometry: 1 : 1)
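
The pattern given above can be applied to a protein sequence with an ordinary regular-expression engine. The following is a minimal sketch in Python, not the ELM server's own implementation. The function name `find_edgei_motifs` and the use of a lookahead to report overlapping hits are illustrative assumptions. The test sequences are 20-residue instance fragments listed in this entry, so the reported coordinates are relative to the fragment rather than to the full-length protein.

```python
import re

# Pattern copied verbatim from the entry.
ELM_PATTERN = r"E[ED]G[EQ][ILVM].{0,2}[DE]"

# Wrapping the pattern in a zero-width lookahead lets adjacent, overlapping
# copies of the motif be reported; a plain finditer would skip a copy that
# starts inside the previous match.
_OVERLAPPING = re.compile(f"(?=({ELM_PATTERN}))")


def find_edgei_motifs(sequence):
    """Return (start, end, match) tuples, 1-based and inclusive, for each hit."""
    hits = []
    for m in _OVERLAPPING.finditer(sequence.upper()):
        matched = m.group(1)
        start = m.start() + 1            # convert 0-based index to 1-based
        end = m.start() + len(matched)   # inclusive end position
        hits.append((start, end, matched))
    return hits


if __name__ == "__main__":
    fragments = {
        "CASP8AP2 fragment": "SPLDELEEGEIRSDSETSKP",
        "ZC3H4 fragment": "REDGELEEGELEDDGAEETQ",
    }
    for label, seq in fragments.items():
        print(label, find_edgei_motifs(seq))
```

The ZC3H4 fragment contains two copies of the motif whose matches overlap by two residues, which is why the lookahead form is used here. A scan without it would report only the first copy.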
o Abstract
Eukaryotic genomes are pervasively transcribed (Villa,2023), which can lead to an over-abundance of RNAs. To ensure cell integrity, there are several RNA surveillance, processing and degradation mechanisms that remove specific and aberrant RNAs. Degradation of nuclear RNAs is carried out by the RNA exosome (Zinder,2017). To selectively target RNAs to the nuclease machinery, eukaryotic cells have multi-subunit targeting complexes that direct specific classes of RNAs to the exosome for degradation.

In the fission yeast S. pombe, the targeting complex, MTREC, consists of a core complex with the zinc-finger protein, Red1 (Q9UTR8), and an RNA-helicase, Mtl1. This complex is responsible for targeting several classes of mRNAs to the exosome, including meiotic mRNAs (24713849) and unstable and unspliced mRNAs (Zhou,2015). The selection of RNAs is determined by binding of protein sub-modules to the MTREC core complex. Interactions with sub-modules are driven by the multiple interaction surfaces of Red1 (Dobrev,2021). The interaction of Red1 with Ars2 (O94326) is mediated by the Red1 EDGEI motif, which recruits the Ars2-Cbc sub-module containing the cap-binding proteins Cbc1 (O14253) and Cbc2 (Q9P383). While studied more extensively in humans, this Red1/Mtl1-associated sub-module is thought to be responsible for processing, localization and degradation of RNA polymerase II transcripts containing a 5’ m7G cap. Ars2 contains a C-terminal zinc-finger domain that interacts with the EDGEI motif in Red1.

In humans, the RNA-binding protein ARS2 (Q9BXP5) has been implicated in transcriptional and post-transcriptional activities as well as in early transcription termination. Accordingly, the ARS2-CBC complex interacts with diverse RNA degradation complexes, processing factors and splicing factors, many of which employ one or more EDGEI motifs to interact with ARS2. Thus, this motif-mediated interaction is conserved from yeast to humans and ensures the mutually exclusive recruitment of distinct RNA-processing machineries to ARS2. The ARS2 interactors described so far that employ the EDGEI motif include the PolyA tail exosome targeting (PAXT) complex subunit ZFC3H1, the mRNA Transcription-Export (TREX) complex subunit THOC1, the ZC3H18 protein that links to the Nuclear Exosome Targeting (NEXT) complex, the ZC3H4 restriction factor in transcription termination, splicing factor THRAP3, 3’ end processing factor CPSF2, FLASH, PHAX and the mRNA export factor NCBP3 (Schulze,2018; Foucher,2022; Rouviere,2023). ZC3H6, RTF1 and PRPF4B are also likely to mediate interactions with ARS2 through this motif (Schulze,2018; Foucher,2022). Therefore, EDGEI-motif-mediated interactions play a major role in the moonlighting potential of the ARS2-CBC complex.

The A. thaliana ortholog of ARS2, SERRATE (Q9ZVD0), was shown to be central to the formation of nuclear miRNA-processing dicing bodies through liquid-liquid phase separation (Xie,2021). Because ARS2 partners often harbour multiple copies of the EDGEI motif, and are thus likely to mediate multivalent interactions with ARS2, this phenomenon could also be conserved in animals, including humans.

o 21 Instances for LIG_ARS2_EDGEI_1
(Notes column: number of switches or interactions)

| Acc. | Gene | Name | Start | End | Subsequence | Logic | #Ev. | Organism | Notes |
|---|---|---|---|---|---|---|---|---|---|
| Q9UKL3 | CASP8AP2 | C8AP2_HUMAN | 934 | 941 | SPLDELEEGEIRSDSETSKP | TP | 6 | Homo sapiens (Human) | 1 |
| Q9UTR8 | red1 | RED1_SCHPO | 32 | 39 | SNDSDKEDGEISEDDPVIDQ | TP | 11 | Schizosaccharomyces pombe 972h- | 1 |
| Q86VM9 | ZC3H18 | ZCH18_HUMAN | 206 | 212 | IDDDDLEEGEVKDPSDRKVR | TP | 4 | Homo sapiens (Human) | 1 |
| Q9UPT8 | ZC3H4 | ZC3H4_HUMAN | 60 | 67 | REDGELEEGELEDDGAEETQ | TP | 6 | Homo sapiens (Human) | 1 |
| Q9UPT8 | ZC3H4 | ZC3H4_HUMAN | 55 | 61 | PLPDDREDGELEEGELEDDG | TP | 6 | Homo sapiens (Human) | 1 |
| Q9H814 | PHAX | PHAX_HUMAN | 9 | 15 | LEVGDMEDGQLSDSDSDMTV | TP | 5 | Homo sapiens (Human) | 1 |
| Q0E5P5 | CG1677 | Q0E5P5_DROME | 400 | 407 | VEEGELEEGEVSDEDEKRPE | TP | 1 | Drosophila melanogaster (Fruit fly) | 1 |
| Q0E5P5 | CG1677 | Q0E5P5_DROME | 395 | 401 | SSQEKVEEGELEEGEVSDED | TP | 1 | Drosophila melanogaster (Fruit fly) | 1 |
| Q9LTP9 | At3g20430 | Q9LTP9_ARATH | 29 | 36 | VEMVDVEEGEIVVDHDLDSG | TP | 1 | Arabidopsis thaliana (Thale cress) | 1 |
| Q92541 | RTF1 | RTF1_HUMAN | 151 | 157 | AESSAPEEGEVSDSDSNSSS | TP | 1 | Homo sapiens (Human) | 1 |
| P61129 | ZC3H6 | ZC3H6_HUMAN | 17 | 23 | REDGELEDGEIDDAGFEEIQ | TP | 1 | Homo sapiens (Human) | 1 |
| P61129 | ZC3H6 | ZC3H6_HUMAN | 12 | 18 | HAGHDREDGELEDGEIDDAG | TP | 1 | Homo sapiens (Human) | 1 |
| O60293 | ZFC3H1 | ZC3H1_HUMAN | 23 | 30 | KEEGELEDGEISDDDNNSQI | TP | 7 | Homo sapiens (Human) | 1 |
| O60293 | ZFC3H1 | ZC3H1_HUMAN | 18 | 24 | SGLSPKEEGELEDGEISDDD | TP | 7 | Homo sapiens (Human) | 1 |
| Q8WVV9 | HNRNPLL | HNRLL_HUMAN | 28 | 33 | QAKRLKTEEGEIDYSAEEGE | TP | 4 | Homo sapiens (Human) | 1 |
| Q9P2I0 | CPSF2 | CPSF2_HUMAN | 649 | 656 | DTGVILEEGELKDDGEDSEM | TP | 4 | Homo sapiens (Human) | 1 |
| Q13523 | PRPF4B | PRP4B_HUMAN | 146 | 152 | YESGSEEEGEIHEKARNGNR | TP | 4 | Homo sapiens (Human) | 1 |
| Q9Y2W1 | THRAP3 | TR150_HUMAN | 930 | 937 | HDKFSGEEGEIEDDESGTEN | TP | 4 | Homo sapiens (Human) | 1 |
| Q96FV9 | THOC1 | THOC1_HUMAN | 213 | 220 | EEGMDVEEGEMGDEEAPTTC | TP | 4 | Homo sapiens (Human) | 1 |
| Q53F19 | NCBP3 | NCBP3_HUMAN | 216 | 223 | SDDDEAEEGEVEDENSSDVE | TP | 7 | Homo sapiens (Human) | 1 |
| Q53F19 | NCBP3 | NCBP3_HUMAN | 43 | 48 | EPEPMEVEEGELEIVPVRRS | TP | 6 | Homo sapiens (Human) | 1 |

Please cite: The Eukaryotic Linear Motif resource: 2022 release. (PMID:34718738)

ELM data can be downloaded & distributed for non-commercial use according to the ELM Software License Agreement