P1 peptidase of Pea seed-borne mosaic virus contains non-canonical C2H2 zinc finger and may act in a truncated form

Background: Pea seed-borne mosaic virus (PSbMV) is a member of the genus Potyvirus that can be transmitted by seeds, causing serious losses in legume crop yields. Its genome encodes three peptidases: P1, HC-Pro, and NIa. Interestingly, only the P1 peptidase, located in the N-terminal portion of the viral polyprotein, still has no concrete function assigned in viral infection. Results: P1 peptidase contains a C2H2 zinc finger motif, described here for the first time. This newly revealed characteristic can help further clarify its true activity during viral infection. The truncated portion of P1 (Δ33-143) was expressed in bacteria and used for production of a polyclonal antibody in rabbit. Using this antibody, it was shown that P1 (42 kDa) can also be present in infected plants in a shorter form (30 kDa). Immunocytochemical localization of P1 in protoplasts obtained from infected pea plants demonstrated the presence of P1 in the cytoplasm as well as in the nucleus. Moreover, according to an in silico model of the P1 molecule, the newly found C2H2 motif can form an active cavity. Conclusions: We describe a C2H2 zinc finger motif (ZnF) that is non-standard compared to others present in the eukaryotic world. Moreover, protein models of the ZnF found were predicted in silico. A similar ZnF was also found in the P1 of five other viruses (3 Potyviruses and 2 Tritimoviruses).


Introduction
Pea seed-borne mosaic virus (PSbMV) is a member of the genus Potyvirus, family Potyviridae. Infection by PSbMV belongs to an important group of legume diseases that are transmitted via seeds and cause mosaics, stunting, and serious crop losses (varying between 11-82%) [17]. The P1 protein of the family Potyviridae is a serine peptidase located in the N-terminal part of the viral polyprotein; it cleaves both the HC-Pro and itself from the remainder of the polyprotein during viral infection. Other attributes of the P1 peptidase are its RNA binding activity and its ability to enhance HC-Pro activity in the suppression of silencing [8,18].
Peptidases are not uncommon in the microbial world. Their importance lies in their involvement in viral protein processing, including in viruses that can ultimately cause fatal diseases such as AIDS and cancer. Serine, aspartic, and cysteine peptidases are found in various viruses [9]. This study focuses on plant viral peptidases. This approach not only allows the problematic nature of viral peptidases affecting host plants to be investigated safely; the homologies found can also be applied to better understand the activity of RNA viruses, not only in plants but also, more widely, in those viruses that affect and are of concern to humans.
To date, the real function(s) of the potyviral P1 peptidase remain unclear, and its influence on the infectious cycle of the virus has not been established. The results presented here, together with in silico analysis and modelling, strongly suggest a novel biochemical property, a sub-cellular localization, and a likely function of this mysterious potyviral protein.

Ethical statement
All animals used in this study were housed in accordance with the guidelines specified in the legislative rules of the European Union, and they were also supervised by the competent affiliated control authority of the Czech Republic.

Viral isolate and P1 peptidase
The P1 coding region was taken from the local isolate PSB204CZ of PSbMV, previously identified by our laboratory in the Czech Republic (NP734419). The complete P1 coding region was amplified with forward (5´-ATGTCAACACTAGTTTGC-3´) and reverse (5´-ATAGTGGTCAATTTGAAA-3´) primers using GoTaq DNA polymerase (Sigma). The P1 coding region was 1141 bp in length; it was cloned into a pGEM-T vector (Promega) and then transformed into E. coli JM109 (Promega) competent cells. The coding region was checked by sequencing. The cycle sequencing reaction was performed using pGEM-T/M13 forward and reverse primers (Promega), which anneal within the vector. For the cycle sequencing reaction, an ABI BigDye (v. 1.1) sequencing terminator kit (Applied Biosystems) was used. The sequences obtained were analyzed and assembled with SeqMan™ software (DNA Star).

Bacterial expression of P1 Δ33-143 peptide
The peptide P1 Δ33-143 (predicted size 14 kDa) was expressed for production of an antibody against P1. This region was chosen because it had the highest predicted antigenicity within the whole P1 amino acid sequence. The P1 Δ33-143 coding region was amplified with forward (5´-GAGCTCTGTACTCAGTGTGACATGG-3´) and reverse (5´-GGTACCAGCTTTCCGCATGAAAA-3´) primers carrying the restriction sites SacI (forward) and KpnI (reverse), located at the 5´ end of each primer, for re-cloning into the bacterial expression vector pQE30 (QIAGen). Competent E. coli M15[pRep4] cells (QIAGen) were transformed with the resulting constructs. The cell culture, grown in Luria-Bertani medium supplemented with the antibiotics ampicillin (100 mg.ml⁻¹) and kanamycin (25 mg.ml⁻¹), was induced in the exponential phase of growth with isopropyl-1-thio-β-D-galactopyranoside (IPTG) at a final concentration of 1 mmol.l⁻¹ and then grown for an additional 4 hours at 37°C. Next, a total cell lysate was prepared: cells were collected from the medium by centrifugation (4000 x g; 20 min), and the pelleted cells were disrupted by repeated freeze/thaw cycles. The disrupted cells were then re-suspended in lysis buffer (50 mmol.l⁻¹ NaH₂PO₄.2H₂O, 300 mmol.l⁻¹ NaCl, and 10 mmol.l⁻¹ imidazole at pH 8.0) with lysozyme added to a final concentration of 1 mg.ml⁻¹. This mixture was incubated for 1 hour at 4°C, and cell debris was then removed by centrifugation (10 000 x g; 30 min). The resulting supernatant was used as the cell lysate for further experiments.
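The primer design described above can be checked with a short script. The following is a minimal sketch, not part of the original protocol: it assumes the standard recognition sequences GAGCTC (SacI) and GGTACC (KpnI), takes the primer sequences exactly as printed, and uses hypothetical helper names (`split_primer`, `primer_report`) to separate each primer into its restriction site and its gene-specific part.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"SacI": "GAGCTC", "KpnI": "GGTACC"}

# Primer sequences (5'->3') as printed in the text, with the enzyme each carries.
PRIMERS = {
    "forward": ("SacI", "GAGCTCTGTACTCAGTGTGACATGG"),
    "reverse": ("KpnI", "GGTACCAGCTTTCCGCATGAAAA"),
}


def split_primer(primer, site):
    """Split a primer into (site, gene-specific part).

    Raises ValueError if the primer does not start with the expected site.
    """
    if not primer.startswith(site):
        raise ValueError(f"primer {primer} does not start with site {site}")
    return site, primer[len(site):]


def primer_report():
    """Return, for each primer, the enzyme, its site, and the remaining sequence."""
    report = {}
    for name, (enzyme, seq) in PRIMERS.items():
        site, rest = split_primer(seq, SITES[enzyme])
        report[name] = {"enzyme": enzyme, "site": site, "gene_specific": rest}
    return report


if __name__ == "__main__":
    for name, info in primer_report().items():
        print(name, info["enzyme"], info["site"], info["gene_specific"])
```

Both primers start with the expected site, so the script runs without raising an error and prints the gene-specific part of each primer.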

Protein purification and antibody production
Two ml of total cell lysate was loaded onto a 0.5 cm thick 10% SDS-polyacrylamide gel [5]. After separation, the gel was stained with GelCode® Blue Stain Reagent (Thermo Scientific) and the 14 kDa protein band was excised. The gel slice was minced into small pieces and inserted into a dialysis tube (Spectra/Por, Spectrum Laboratories) with a molecular cut-off of 6000-8000 Da. Electro-elution was carried out in a dialysis buffer (50 mmol.l⁻¹ NaH₂PO₄ and 1% SDS) [21] at 100 mA for over 8 hours at 4°C. The protein from the dialysis buffer was concentrated using Microcon columns (Millipore) with a molecular cut-off of 10,000 Da. This method also allows for de-salting of the samples, as well as removal of SDS from the samples, which were afterwards used for rabbit immunization. The purified protein, diluted in saline to a concentration of 1 mg.ml⁻¹, was emulsified in an equal volume of Freund's complete adjuvant. An initial volume of 1 ml was applied intramuscularly twice over a period of 7 days. Afterwards, three immunizing applications were given to the rabbits intradermally as 0.2 mg of protein homogenate in Freund's adjuvant at days 14 and 28. Lastly, an immunizing injection containing 0.5 mg of purified protein was applied on the 42nd day from the start. Seven days after the last immunization, the rabbits were bled. The rabbit polyclonal serum produced was preserved by adding 0.02% sodium azide. Isolation of the rabbit IgG was conducted using a Montage® Antibody Purification Kit (Millipore), according to the manufacturer's recommendations.

Protein amino acid sequencing
Bands containing proteins of the expected size were excised from the Coomassie-stained SDS-PAGE gels. In-gel digestion by raffinose-modified trypsin was conducted according to a published protocol [14], with an additional reduction/alkylation step (2-mercaptoethanol/iodoacetamide) prior to the addition of the protease [13]. A total of 5 ml digest and an equal volume of matrix solution were premixed in a test tube. Then 1 ml of the mixture was pipetted onto an MSP AnchorChip 600/96 target (Bruker Daltonik) and allowed to dry at ambient temperature. The positive-ion mass spectra were measured in reflectron mode on a Microflex LRF20 MALDI-TOF mass spectrometer (Bruker Daltonik) equipped with a microScout ion source and a 337-nm nitrogen laser. The mass spectra were accumulated from 100-200 shots at a laser repetition rate of 10 Hz; the examined m/z range was 500-5000. The instrument was externally calibrated using a mixture of standard peptides.
The acquired spectra were processed by flexAnalysis 2.4 and Biotools 3.0 software (Bruker Daltonik). Database searches were run against the Swiss-Prot (ver. 56.6) and NCBInr protein sequence databases, using the Mascot Server 2.2 program (Matrix Science, London, UK) installed on a local PC. Oxidation of methionine (as an optional modification) and one missed enzyme cleavage were allowed for all of the searches performed, without taxonomic restriction; a mass tolerance of 150 ppm was allowed.
doi: 10.7243/2050-2389-3-1
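To make the 150 ppm mass tolerance concrete, the sketch below shows how a relative (ppm) tolerance translates into an absolute m/z window. It is an illustration only: the function names are hypothetical, the example m/z values are invented, and this is not a description of how Mascot applies the tolerance internally.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def tolerance_window(theoretical_mz, tol_ppm=150.0):
    """Absolute m/z window (low, high) corresponding to a ppm tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=150.0):
    """True if the observed peak lies within the ppm tolerance of the theoretical mass."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Invented example values: a peptide expected at m/z 1000.0.
    print(tolerance_window(1000.0))          # window of about +/- 0.15 around 1000
    print(within_tolerance(1000.10, 1000.0))  # about 100 ppm off
    print(within_tolerance(1000.20, 1000.0))  # about 200 ppm off
```

Because the tolerance is relative, the absolute window widens with mass: at the upper end of the examined range (m/z 5000), 150 ppm corresponds to 0.75 m/z units.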

Immunoprecipitation
Leaves from healthy and infected pea plants were ground in liquid nitrogen. The denaturing buffer (50 mmol.l⁻¹ Tris-HCl pH 8, 150 mmol.l⁻¹ NaCl, 1% NP-40, 0.5% sodium deoxycholate, and 0.1% SDS), containing a 2x concentrated cocktail of inhibitors of endogenous proteases (from a 50x concentrated stock solution; complete Mini, EDTA-free, Protease Inhibitor Cocktail, Roche), was added immediately. The samples were additionally sonicated. For protein immunoprecipitation, rabbit immune serum as well as purified IgG was used. Leaf lysate was incubated with the antibody, with rotation, overnight at 4°C. Sepharose beads with protein A were added (in the amount specified by the manufacturer) and the samples were incubated for an additional 1 hour at 4°C. The supernatant was then discarded, the beads were washed twice with the lysis buffer, and the precipitated proteins were eluted with a sample buffer (50 mmol.l⁻¹ Tris-HCl pH 6.8, 2% SDS, 10% glycerol, 1% β-mercaptoethanol, 12.5 mmol.l⁻¹ EDTA, 0.02% bromophenol blue). The isolated proteins were loaded on a 10% SDS-polyacrylamide gel [5] and analyzed by MALDI-TOF. For immunodetection of the precipitated P1 peptidase, the proteins were transferred onto a nitrocellulose membrane (Amersham Bioscience). The P1 peptidase was detected either by immune serum or by isolated IgG.

In silico secondary structure modeling
The secondary structure was calculated computationally from the 277 amino acid-long P1 coding region (ΔP1), using I-TASSER Online (http://zhanglab.ccmb.med.umich.edu/I-TASSER) [10,11,23], an internet service for protein structure and function predictions. The 3D-models were constructed based on multiple-threading alignments by LOMETS and iterative TASSER assembly simulations. The display models were constructed with the use of PyMOL (DeLano Scientific LLC 2006).

Protoplast isolation and Immunocytochemistry
Immunocytochemistry was performed using isolated and purified protoplasts from healthy and PSbMV-infected pea plants. A total of 100 mg of leaves was chopped into small pieces and macerated overnight at 27°C, with gentle agitation, in an enzyme solution (1% Cellulase Onozuka R-10 and 0.25% Macerozyme R-10; both Duchefa) diluted in protoplast washing medium. The protoplasts were separated by filtration through a 0.45 µm nylon membrane. Released protoplasts were sedimented by centrifugation (1000 x g; 10 min, 4°C). After aspiration of the supernatant, the pellet of protoplasts was resuspended in 50 μl of washing medium (0.2 mmol.l⁻¹ KH₂PO₄, 1 mmol.l⁻¹ KNO₃, 10 mmol.l⁻¹ CaCl₂.2H₂O, 1 mmol.l⁻¹ MgSO₄.7H₂O, and 0.5 mol.l⁻¹ mannitol, pH 5.6). Next, 100 μl of fixing solution (1% paraformaldehyde, 0.2 mol.l⁻¹ HEPES, 0.15% Triton X-100, and protease inhibitors) was dropped directly onto the cover slips. A 20-μl aliquot of protoplasts was placed on each slide and incubated for 2 hours in a humid chamber. Then, the slides were gently rinsed with injection water and air dried. The cover slips were stored for 1 month at -80°C without loss of antigenicity. For immunocytochemical detection of the P1 peptidase, the protoplasts were blocked in 1% bovine serum albumin (BSA). The samples were incubated with the primary polyclonal antibody against P1 Δ33-143 (aP1[33-134]; 1:100 dilution in 1% BSA in PBS-T), and afterwards with the secondary antibody Alexa Fluor® 555 donkey anti-rabbit IgG (Invitrogen). To block non-specific interactions of the antibody with Rubisco, slides were pre-incubated with 1% bovine serum albumin containing diluted chicken polyclonal antibody against Rubisco (1:400, AbCam) without secondary antibody. Images were acquired using a Leica DMI 6000 with CS TCS SP5 Spectral Scanning Confocal System.

P1 ∆33-143 bacterial expression and antibody production
A cDNA fragment encoding P1 ∆33-143 was inserted into a bacterial expression vector. The expressed protein fragment of the predicted size (14 kDa) was purified via electro-elution and used for immunization of a rabbit. For subsequent immunodetection of P1, the immune serum as well as purified IgG were used. Immune serum was used for western blotting of a nitrocellulose membrane carrying proteins from a plant protein extract taken at 7, 14, and 21 days post inoculation (DPI). By 7 DPI, the antibody had clearly detected a 42 kDa P1 peptidase. Surprisingly, the antibody also detected a 30 kDa protein at 14 DPI (Figure 1). However, the serum showed non-specific labeling of plant Rubisco. Purified antibody was used for the immunoprecipitation. The precipitated proteins obtained were analyzed by MALDI-TOF. The antibody against P1 Δ33-143 (aP1[33-134]) precipitated the following proteins: P1 peptidase (42 kDa), a 30 kDa protein, Rubisco (55 kDa), and the CI viral protein (65 kDa) (Figure 2A). The sequence of the 30 kDa protein was found to cover the region between amino acids (aa) 80-277 of the viral P1 peptidase ΔP1 (Figure 2). This fragment covered 49.8% of the total P1 of PSbMV. To exclude non-specific labeling of plant protein with aP1[33-134], the sequences found were analyzed using BLASTp (Basic Local Alignment Search Tool) to compare their homology with all non-redundant GenBank CDS translations from available databases (PDB, Swiss Prot, PIR, PRF) and the reference database of the Arabidopsis thaliana genome. Completely identical protein containing  However, the size of the covered region (80-277 aa) was predicted to be 22 kDa based on the amino acid sequence. The difference (8 kDa) can possibly be explained by posttranslational modifications, such as phosphorylation of the 40 Ser/Thr residues predicted to be located in the 1-280 aa region of the peptide sequence. The modification sites were predicted with a high probability score using GPS 2.0 [22].
The presence of these or any other modifications of the P1 peptidase, and their influence on the size of the peptide, need to be determined experimentally in a future study.
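The size figures quoted above can be reproduced approximately with a rule-of-thumb calculation. The sketch below is an illustration under a stated assumption, not the authors' method: it uses an average residue mass of 110 Da (a common approximation; the true mass depends on the actual amino acid composition) to estimate the mass of the MALDI-covered region (aa 80-277) and of the 1-277 region that the text later uses for modelling. The function names are hypothetical.

```python
# Assumption: average mass of one amino acid residue in a protein, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def region_length(first, last):
    """Number of residues in an inclusive range of positions first..last."""
    return last - first + 1


def estimated_mass_kda(n_residues):
    """Approximate polypeptide mass in kDa from its length."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Region covered by the MALDI-TOF peptides (aa 80-277).
covered_len = region_length(80, 277)           # 198 residues
covered_kda = estimated_mass_kda(covered_len)  # 21.78 kDa, i.e. about 22 kDa

# Region used for the 3D model in the text (aa 1-277).
model_kda = estimated_mass_kda(region_length(1, 277))  # 30.47 kDa, i.e. about 30.5 kDa

# Gap between the band seen on the gel (30 kDa) and the covered region.
observed_kda = 30.0
gap_kda = observed_kda - round(covered_kda)    # 8 kDa

if __name__ == "__main__":
    print(covered_len, covered_kda, model_kda, gap_kda)
```

Under this approximation the covered region comes out at about 22 kDa and the 1-277 region at about 30.5 kDa, matching the values given in the text; the 8 kDa gap is the difference that the authors tentatively attribute to posttranslational modification.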

P1 localization in infected pea plant protoplasts
Immunofluorescence localization of P1 peptidase using the polyclonal rabbit antibody aP1 was performed on protoplasts isolated from infected pea plants. A non-specific interaction of rabbit IgG with plant Rubisco was observed as a diffuse signal spread over the entire cytoplasm (Figure 3A). This diffuse signal precluded identification of specific P1 localizations. To confirm the P1 peptidase localization pattern, fixed coverslips with plant protoplasts were blocked with diluted antibody against plant Rubisco. Blocking of the non-specific interaction of aP1[33-134] with Rubisco was confirmed by western blot (data not shown) and was afterwards used for immunofluorescence (Figure 3C). The signal missing in comparison to the non-blocked IgG immunofluorescence was identified as the P1-specific localization pattern. P1 occupied the cell cytoplasm, with no concrete localization to membranous structures such as the endoplasmic reticulum or Golgi apparatus (white arrows, Figure 3B). There was no evidence of localization to the cytoplasmic membrane. Surprisingly, a clear signal was detected in the cell nucleus (Figure 3B). So far, there is no evidence of any nuclear localization signal in the P1 coding sequence. Further experiments will be required to confirm either a concrete P1 nuclear localization or trafficking (possibly via a host protein).

Figure 3. P1 localization in infected protoplasts.
We localized the P1 peptidase in protoplasts isolated from infected pea plants. We found Rubisco interacting nonspecifically with rabbit polyclonal immunoglobulin (A). To block this non-specific interaction, we used a polyclonal antibody against Rubisco. The remaining localization of the P1 peptidase (green) was designated as specific (B). P1 was spread throughout the cytoplasm and was clearly present in the cell nucleus (B, C).
its amino acid sequence and it was detected in lysate taken from infected plants at 7, 14, and 21 DPI on western blot with aP1. From 14 DPI onwards, it was possible to detect and precipitate a 30 kDa protein. The short fragments found for this precipitated 30 kDa protein are identical to sequences spread between aa 80 and 277 of the P1 coding region. This finding could be explained by alternative initiation of translation or by proteolytic cleavage of the P1 peptidase immediately after its auto-cleavage from the viral polypeptide during viral infection. However, there is no evidence of any alternative translation initiation site or proteolytic cleavage site that could explain the truncation of P1 with regard to the fragments found or to the final size of 30 kDa.
Additionally, SMART, a sequence analysis database [13,14,6], was used to predict active domains in the truncated P1, ΔP1 (1-277), thereby avoiding detection of the serine peptidase domain responsible for cleaving P1 itself from the rest of the viral polyprotein. Surprisingly, a Cys2His2 zinc finger motif (ZnF-C2H2; SMART accession No. SM00355) was localized in the N-terminal portion of the defined amino acid sequence. This identification was made with an E-value near 1.22e+03 (Hidden Markov Model). The ZnF in P1 was predicted with the formula C[32]-X2-C[35]-X3-F-X10-H[50]-X20-H[70], where the numbers in brackets give the positions within the P1 aa sequence. This motif most closely resembles the eukaryotic form of the C2H2 ZnF. After identification of this P1 ZnF motif, all available potyviral P1 sequences were screened. Interestingly, several homologues were identified in other potyviral P1s (Figure 5) belonging to two genera (Potyvirus and Tritimovirus). They share aa sequence homologies, and each of them contains this motif in proximity to the N-terminus.
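A motif formula of this kind can be turned into a simple pattern search, which is one way a set of P1 sequences could be screened. The sketch below is an assumption-laden illustration, not the SMART/HMM procedure used in the study: it reads the spacer lengths literally from the printed formula (the bracketed residue numbers are not used), the function names are hypothetical, and the test sequence is synthetic rather than a real P1 sequence.

```python
import re


def znf_pattern(x1=2, x2=3, x3=10, x4=20):
    """Compile a regex for C-X(x1)-C-X(x2)-F-X(x3)-H-X(x4)-H.

    The defaults are the spacer lengths printed in the text; they can be
    changed to explore variants with different Cys/His spacing.
    """
    return re.compile(rf"C.{{{x1}}}C.{{{x2}}}F.{{{x3}}}H.{{{x4}}}H")


def find_motifs(sequence, pattern=None):
    """Return (1-based start position, matched segment) for each motif hit."""
    if pattern is None:
        pattern = znf_pattern()
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Synthetic toy sequence (NOT a real P1 sequence): filler alanines around a
# motif whose first cysteine sits at position 32.
TOY_SEQUENCE = (
    "M" + "A" * 30      # positions 1-31
    + "CAACAAAF"        # C-X2-C-X3-F
    + "A" * 10 + "H"    # X10-H
    + "A" * 20 + "H"    # X20-H
    + "AAAAA"
)

if __name__ == "__main__":
    for start, segment in find_motifs(TOY_SEQUENCE):
        print(start, len(segment), segment)
```

Because the text notes that the Cys/His spacing varies between different P1 proteins, a fixed-spacer pattern like this would miss some homologues; loosening the spacers (for example by passing other values to `znf_pattern`) or using a profile-based search would be needed for a real screen.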

In silico 3D molecule modelling
The topology of the possible alpha helices, beta sheets, and coiled regions in the entire amino acid sequence was predicted by Protean (Lasergene DNASTAR®). This software indicated the regions likely to contain an alpha helix or beta sheet with intercalated coils. Based on this prediction, a sequence was chosen that correlated with the MALDI-covered motif, the final size of the protein found in vivo (30 kDa), and the topology of the polypeptide. Consequently, for the 3D modelling of the 30 kDa P1, the aa sequence in region 1-277 was used, with a predicted polypeptide size of 30.5 kDa (Figure 4A). I-TASSER software [10,12] generated five possible models. Finally, Model No. 4, with the lowest C-score (-3.59), was selected. The C-score is a confidence score for estimating the quality of the predicted models created by I-TASSER (calculated from the significance of threading template alignments and the convergence parameters of the structure assembly simulations). The complete molecular model also confirmed the position of the ZnF C2H2 motif (Figure 4B).

P1 localizes to the cytoplasm and nucleus
P1 peptidase was localized throughout the cytoplasm using aP1. Confocal microscopy also detected the signal of this protein very clearly in the nucleus. Interestingly, relatively strong labelling was detected within the nucleolar vacuole. The nucleolar vacuole has been described as a possible site of some general UsnRNA functions, such as recycling, storage, or post-transcriptional processing [2]. A very interesting possibility for further research is the claim that the nucleolar vacuole is the location for posttranscriptional processing of snoRNAs before their assembly into their respective rRNP complexes. Whether localization of P1 in this structure has any importance in the aforementioned process has yet to be ascertained. Interestingly, a supporting property of P1 has already been described: its ability to bind short ss- as well as ds-RNA [16]. However, this hypothesis needs to be verified by further experimentation, primarily by co-immunoprecipitations from nuclear fractions. The evidence of P1 nuclear localization might be connected with previously reported speculation about P1's possible activity in overcoming the host defence mechanisms (P1/HC-Pro), possibly with other functions affecting the expression of host genes during early viral infection and subsequent spread, acting as an indirect stimulation of viral RNA replication through the suppression of the host's transcriptional activity [20]. To date, this localization experiment has only been done once [1]. That immunogold localization confirmed that P1 spreads into the cytoplasm but is excluded from the cell wall and plasmodesmata. Surprisingly, no significant signal was detected in the cell nucleus in that experiment.
In immunoprecipitations, the viral CI protein was repeatedly pulled down (data not shown). A possible interaction between P1 and CI has also been described previously [1].
That finding was explained by the suggestions that 1) all viral proteins are synthesized near cytoplasmic inclusions, which are usually associated with the rough endoplasmic reticulum; and 2) cytoplasmic inclusions are indeed sites of potyviral RNA replication. The aforementioned interaction may be consistent with the hypothesis of Verchot and Carrington (1995b), which states that P1 peptidase could participate in virus replication, acting as a trans accessory or regulatory factor: stimulating genome amplification directly in the RNA replication process by interacting with either enzymatic components of the replication complexes or with the RNA sequence, as well as indirectly by stimulating translation of viral RNA [20]. In the protein lysate used in these experiments, cytoplasmic inclusions were sufficiently disrupted and most of the membranous structures were disaggregated because of the denaturing conditions (for details see Materials and Methods). This fact mostly supports the theory concerning RNA replication. The identified ZnF motif, which possibly binds to RNA, also supports this view. Nevertheless, the true interaction(s) between P1 and CI have yet to be established and proven.

P1 contains non-canonical C2H2 ZnF
Recently, the first purported prokaryotic C2H2 ZnF domain was identified in a transcriptional regulator (Ros protein) from Agrobacterium tumefaciens [3]. This suggests that the classical ZnF domain, previously thought to be confined exclusively to eukaryotic organisms, could be widespread among living organisms, from eukaryotes (both plants and animals) to prokaryotes. A single ZnF domain by itself is not sufficient for high-affinity binding to a specific DNA target sequence. Nevertheless, there are some indications that a single ZnF domain is capable of sequence-specific DNA binding when flanked by a basic region [3,4,7]. The bacterial ZnF contains three histidine residues, and the 9 aa region between the second cysteine and the first histidine is shorter than the canonical 12 aa spacer observed in eukaryotic ZnF. After identification of this P1 ZnF motif, all available potyviral P1 sequences were screened. A number of homologues in other potyviral P1s were found (Figure 5). They share aa sequence homology, and each of them contains this motif in proximity to the N-terminus. The finding of conserved Cys and His residues was previously mentioned in [18] (Figure 4). However, it is not possible to define a strict formula for the novel viral ZnF, because the spacer between Cys and His varies between different P1 proteins. These findings suggest that the P1 peptidase contains some form of ZnF, possibly also detectable in other viral proteins. Additional experimentation will be required to confirm the actual activity of this motif and/or its involvement in regulatory processes during viral infection.