Characterization of plasma sEVs
We extracted plasma sEVs from 21 patients with HCC and 15 controls with a polyethylene glycol (PEG) precipitation-based sEV extraction kit (Ribo Bio, China), following the workflow shown in Fig. 1. The extracted sEVs were characterized by TEM, which showed their round, cup-like morphology (Fig. 2A), and by NTA, which measured their diameter and size distribution. The average size of the sEVs was 147.5 nm, ranging from 17.5 to 300 nm (Fig. 2B). Western blotting (WB), with HepG2 cell lysates as the comparison, detected three positive sEV markers, CD63, CD81, and TSG101, and showed the absence of the negative marker Calnexin in the sEVs (Fig. 2C). Together, the WB, TEM, and NTA results indicate that the extraction kit isolated sEVs effectively.
Proteomic profiling of plasma sEVs
After confirming the identity of the extracted sEVs, we performed a label-free nano-LC–MS/MS analysis to profile the sEV proteins comprehensively and compare them between the control and HCC groups. To reduce interindividual variability and cost, the samples were pooled: the sEV samples from the 15 controls were randomly divided into three pools of five samples each, and the sEV samples from the 21 patients with HCC were randomly divided into three pools of seven samples each. The three pooled specimens from each group were then subjected to proteomic analysis. The characteristics of the participants in the HCC and control groups are listed in Supplementary Table 1. The analysis yielded 850,808 MS/MS spectra, of which 77,850 were matched to the UniProt public database. Using the criteria for protein identification and contamination described in the “Methods” section, 4167 peptides were identified, of which 3872 were unique, and 335 proteins were identified, of which 281 were quantified.
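The pooling step can be illustrated with a minimal sketch, assuming a simple shuffle-and-split procedure; the text does not state how the random assignment was actually carried out. The sample labels and the `make_pools` helper are hypothetical.

```python
import random


def make_pools(sample_ids, n_pools, seed=None):
    """Randomly split sample_ids into n_pools pools of equal size."""
    if len(sample_ids) % n_pools != 0:
        raise ValueError("samples do not divide evenly into pools")
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_pools
    return [shuffled[i * size:(i + 1) * size] for i in range(n_pools)]


# Hypothetical labels; the study's real sample identifiers are not given.
controls = [f"CTRL_{i:02d}" for i in range(1, 16)]  # 15 controls
patients = [f"HCC_{i:02d}" for i in range(1, 22)]   # 21 patients with HCC

control_pools = make_pools(controls, 3, seed=0)  # three pools of five
hcc_pools = make_pools(patients, 3, seed=0)      # three pools of seven
```

Each sample ends up in exactly one pool, so the three pooled specimens per group together cover every participant.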
The overlap of proteins was visualized in Venn diagrams to assess the reproducibility of protein identification. A total of 293 and 296 sEV proteins were identified in the control and HCC groups, respectively, and 272 proteins were shared by the two groups (Fig. 3C). Within each group, 229 (78.2%) proteins in the control group (Fig. 3A) and 236 (79.7%) proteins in the HCC group (Fig. 3B) were detected in all three replicates.
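As a quick check of the arithmetic, the reported percentages and the group-specific protein counts follow directly from the numbers in the text. This is only a worked calculation; the variable names are illustrative.

```python
def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Counts taken from the text (Fig. 3A-C).
control_total, hcc_total, shared = 293, 296, 272
control_all_reps, hcc_all_reps = 229, 236

control_reproducibility = percent(control_all_reps, control_total)
hcc_reproducibility = percent(hcc_all_reps, hcc_total)

# Proteins identified in only one of the two groups.
control_only = control_total - shared
hcc_only = hcc_total - shared
```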
A protein was defined as a differentially expressed protein (DEP) if its fold change in HCC sEVs relative to healthy controls was ≥ 1.5 or < 0.66 [14, 15]; by this criterion, the quantitative analysis identified 54 DEPs in HCC sEVs. Of these 54 proteins, 27 also reached statistical significance (p < 0.05): 13 were upregulated and 14 were downregulated in HCC sEVs, as shown in the volcano plot (Fig. 4A). Hierarchical clustering of these 27 significantly differentially expressed proteins (SDEPs) across all six samples is shown in Fig. 4B. Information on all 54 DEPs is provided in Supplementary Table 2.
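The two-step selection (fold-change cutoff, then significance filter) can be sketched as follows. The thresholds come from the text; the `classify` function, its labels, and the example values are hypothetical, and the statistical test that produces the p-value is not specified here.

```python
def classify(fold_change, p_value, up=1.5, down=0.66, alpha=0.05):
    """Label a protein by the fold-change and p-value criteria.

    fold_change is the HCC/control ratio. A protein is a DEP if the ratio
    is >= up or < down; it is an SDEP if, in addition, p_value < alpha.
    """
    if fold_change >= up:
        direction = "up"
    elif fold_change < down:
        direction = "down"
    else:
        return "unchanged"
    if p_value < alpha:
        return f"SDEP_{direction}"
    return f"DEP_{direction}"


# Made-up values for illustration only.
examples = {
    "protein_a": classify(2.1, 0.01),   # passes both criteria
    "protein_b": classify(0.50, 0.20),  # fold change only
    "protein_c": classify(1.2, 0.001),  # significant but small change
}
```

Note that the lower bound is strict (< 0.66) while the upper bound is inclusive (≥ 1.5), mirroring the definition given in the text.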
GO and KEGG pathway analyses
We performed a Gene Ontology (GO) analysis of the aforementioned proteins to understand the functions, locations, and biological pathways of the sEV proteins identified in plasma from patients with HCC. GO is a functional classification system that provides a standardized, dynamically updated vocabulary describing the properties of genes and their products in living organisms. The GO functional annotation is divided into three categories: biological process (BP), molecular function (MF), and cellular component (CC).
We tested all 27 SDEPs for enrichment of GO functional annotations, using the total proteins of the reference species as the background. The GO analysis of biological processes revealed an enrichment of SDEPs in the “acute inflammatory response,” “protein activation cascade,” “protein maturation,” “humoral immune response,” and “protein processing” (Fig. 5A). A chord diagram showing the links between SDEPs and biological processes highlighted the SDEPs involved in the “acute inflammatory response,” namely AHSG, C1QB, C1QC, C4BPA, C4BPB, CFP, IGHV3-7, IGLV3-25, ORM1, ORM2, PROS1, and VTN (Fig. 5B). The GO analysis of molecular functions revealed an enrichment of SDEPs in “enzyme inhibitor activity,” “peptidase regulator activity,” “endopeptidase regulator activity,” “peptidase inhibitor activity,” and “endopeptidase inhibitor activity” (Fig. 5C). A chord diagram showing the links between SDEPs and molecular functions highlighted the SDEPs involved in “endopeptidase inhibitor activity,” namely AHSG, PROS1, SERPINA10, SERPINA6, and SERPIND1 (Fig. 5D). The GO analysis of cellular components revealed an enrichment of SDEPs in “blood microparticle,” “collagen-containing extracellular matrix,” “endoplasmic reticulum lumen,” “vesicle lumen,” and “cytoplasmic vesicle lumen” (Fig. 5E). A chord diagram showing the links between SDEPs and cellular components highlighted the SDEPs involved in the “collagen-containing extracellular matrix,” namely AHSG, AZGP1, C1QB, C1QC, CFP, fibrinogen alpha chain (FGA), fibrinogen beta chain (FGB), fibrinogen gamma chain (FGG), ORM1, ORM2, TGFBI, and VTN (Fig. 5F).
In addition, the SDEPs were analyzed using the KEGG pathway database to predict the related pathways and the proteins involved. The SDEPs were enriched in “complement and coagulation cascades,” “pertussis,” and “Staphylococcus aureus infection” (Fig. 6A). The clustering analysis of KEGG pathways indicated that the SDEPs from HCC plasma sEVs were mainly involved in the complement and coagulation cascades (Fig. 6B), including C1QB, C1QC, C4BPA, C4BPB, F13B, FGA, FGB, FGG, SERPIND1, PROS1, and VTN. Taken together, the GO and KEGG analyses suggest that the complement (C1QB, C1QC, C4BPA, and C4BPB) and coagulation (F13B, FGA, FGB, and FGG) pathways are the major pathways in which SDEPs are involved, and that the dysregulated proteins of these pathways may be potential molecular signatures for HCC. Detailed information on the GO and KEGG analyses is provided in Supplementary Table 3.
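The text does not state which statistical test underlies the GO and KEGG enrichment results. A common choice for this kind of over-representation analysis is a one-sided hypergeometric test, and the sketch below assumes that test. Only the list size (27 SDEPs) and the hit count (12 SDEPs in one term) come from the text; the background size and the number of annotated background proteins are made-up placeholders.

```python
from math import comb


def enrichment_p(background, annotated, drawn, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    background: proteins in the reference set
    annotated:  background proteins carrying the term
    drawn:      proteins in the list being tested
    hits:       proteins in the list carrying the term
    """
    upper = min(annotated, drawn)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(background, drawn)


# 27 SDEPs with 12 hits are from the text; 20000 and 100 are hypothetical.
p_example = enrichment_p(background=20000, annotated=100, drawn=27, hits=12)
```

In practice the p-values for all tested terms would also be corrected for multiple testing before terms are reported as enriched.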
Protein network analysis
We constructed a protein–protein interaction (PPI) network using the Search Tool for Retrieval of Interacting Genes/Proteins (STRING) database to obtain a better understanding of the relationships among these SDEPs and their closely related proteins. These relationships were analyzed on the basis of previous studies examining these proteins, and the SDEPs and their closely related proteins were grouped into four clusters, which are labeled in different colors (Fig. 7). Information on these previous studies is included in Supplementary Table 4. In Fig. 7, F13B, FGA, FGB, and FGG act as network hubs in the red cluster, and C1QB, C1QC, C4BPA, and C4BPB act as network hubs in the yellow cluster. These data are consistent with the aforementioned results indicating that F13B, FGA, FGB, FGG, C1QB, C1QC, C4BPA, and C4BPB were upregulated in plasma sEVs from patients with HCC. The raw data from the PPI network are included in Supplementary Tables 5 and 6. Detailed information on the clustering analysis is provided in Supplementary Table 7.
Validation of complement proteins in sEVs
We verified the findings from the proteomic analysis by measuring the expression levels of SDEPs involved in the complement cascade using WB in the aforementioned pooled samples. The protein concentrations in sEVs were quantified using the BCA method, and each lane was loaded with the same amount of protein. The SDEPs involved in the complement cascade, namely C1QB, C1QC, C4BPA, and C4BPB, were increased in plasma sEVs from patients with HCC (Fig. 8A). The band densities of C1QB (p < 0.05), C1QC (p < 0.05), C4BPA (p < 0.05), and C4BPB (p < 0.01) in the HCC group were significantly higher than those in the control group (Fig. 8B–E). These results were consistent with the data from the proteomic analysis.
In addition to the pooled samples, we measured the protein expression of C1QB, C1QC, C4BPA, and C4BPB in plasma sEVs from individual participants, using samples from seven randomly selected patients with HCC and seven randomly selected normal controls. The results showed increased C1QB, C1QC, C4BPA, and C4BPB levels in plasma sEVs from patients with HCC (Fig. 8F). The band densities of C1QB (p < 0.0001), C1QC (p < 0.0001), C4BPA (p < 0.0001), and C4BPB (p < 0.001) in the HCC group were significantly higher than those in the control group (Fig. 8G–J).