
Exploring best-matched embedding model and classifier for charging-pile fault diagnosis


Abstract

The continuous growth in the number of electric vehicles is driving the large-scale deployment of distributed charging piles. Keeping charging piles in normal operation is crucial, which makes diagnosing charging-pile faults important. Existing fault-diagnosis approaches rely on physical fault data such as mechanical log data and sensor data streams. However, other types of fault data exist that these approaches cannot use for diagnosis. This paper aims to fill this gap and considers 8 types of faults for diagnosis: physical installation error fault, charging-pile mechanical fault, charging-pile program fault, user personal fault, signal fault (offline), pile compatibility fault, charging platform fault, and other faults. We investigate how to combine existing feature-extraction and machine learning techniques to achieve better diagnosis by conducting experiments on a realistic dataset. Four word embedding models are investigated for feature extraction from fault data: N-gram, GloVe, Word2vec, and BERT. We then classify the word embedding results using 10 machine learning classifiers: Random Forest (RF), Support Vector Machine, K-Nearest Neighbor, Multilayer Perceptron, Recurrent Neural Network, AdaBoost, Gradient Boosted Decision Tree, Decision Tree, Extra Tree, and VOTE. Compared with the original fault record dataset, a paraphrasing-based data augmentation method improves classification accuracy by up to 10.40%. Our extensive experimental results reveal that the RF classifier combined with the GloVe embedding model achieves the best accuracy with acceptable training time. In addition, we discuss the interpretability of RF and GloVe.


Introduction

Recently, with the acceleration of global warming, people have realized that unrestricted use of fossil energy harms the earth. Electric vehicles (EVs), which are environmentally friendly and energy efficient, are expected to replace traditional fuel vehicles (Yan et al. 2019). As the number of EVs increases, distributed charging piles have become an essential infrastructure (Chen et al. 2020). A large number of charging piles are located outdoors and exposed to uncontrollable environmental factors, which causes frequent charging-pile faults. Therefore, it is crucial to keep charging piles working effectively (Zhang et al. 2022; Wei et al. 2021).

Charging-pile service companies have introduced a series of measures to keep charging piles working effectively. For example, they offer a service hotline and a WeChat (Hao et al. 1087) mini program through which customers who encounter problems can submit emergency work orders. We now explain why a service provider needs to predict charging-pile faults in order to improve the efficiency of its repair service. A charging-pile work order may be caused by a mechanical fault or by a cyber security issue. Consider a mechanical-fault scenario: (a) a customer describes a charging-pile fault over the service hotline; (b) the staff receive the fault work order, record the fault description, and dispatch maintenance workers to repair the pile; (c) the maintenance workers complete the work order and submit the fault category to the service system. However, dispatching maintenance workers wastes human and material resources if the fault lies in the software platform or the online electric system. Moreover, from the perspective of cyber security, security analysis and protection mechanisms are needed to improve the communication security between EVs and charging piles (Li et al. 2021). These considerations emphasize the importance of predicting charging-pile faults.

Recently, machine learning (ML) and deep learning (DL) techniques have played a crucial role in charging-pile fault diagnosis (Shuai et al. 2022; Du et al. 2021) and anomaly detection (Li et al. 2021). In particular, Li et al. (2021) used a Random Forest (RF) classifier for anomaly detection. However, existing studies on charging-pile fault diagnosis focus on mechanical log data or sensor data streams (Gao et al. 2020, 2018; Wang et al. 2021; Yong and Ji 1650). In contrast, we concentrate on the fault descriptions that staff record in work orders, which differ from mechanical log data and sensor data streams, and classify 8 types of faults: installation error fault, charging-pile mechanical fault, charging-pile program fault, user personal fault, signal fault (offline), pile compatibility fault, charging platform fault, and other faults.

Figure 1 presents a simplified workflow of our paper. We first collect raw data from real-world electric service work orders to build a fault record dataset. Then, we preprocess the data by using the Jieba (Junyi 2022) tokenizer to tokenize the Chinese fault descriptions. After that, we extract fault features from the fault descriptions with widely used word embedding models, namely N-gram (Suen 1979), Word2vec (Mikolov et al. 2013), GloVe (Pennington et al. 2014), and BERT (Devlin et al. 2018). Finally, we use 10 ML or DL classifiers, namely RF (Breiman 2001), Support Vector Machine (SVM) (Cortes and Vapnik 1995), K-Nearest Neighbor (KNN) (Sebastiani 2002), Multilayer Perceptron (MLP) (Rumelhart et al. 1986), Recurrent Neural Network (RNN) (Elman 1990), AdaBoost (AB) (Freund and Schapire 1997), Gradient Boosted Decision Tree (GBDT) (Friedman 2001), Decision Tree (DT) (Breiman et al. 2017), Extra Tree (ET) (Geurts et al. 2006), and VOTE, to classify the word embedding features for fault diagnosis.

Fig. 1 The flow of paperwork

We summarize the following main contributions:

  • We create a dataset of realistic charging-pile faults. Specifically, we collect original long-term real-world electric service work orders from June to December 2021. Moreover, we select fault description and category to build a structured fault record dataset. “Fault record dataset” section details the building of the dataset.

  • We carry out extensive experiments to explore the best-matched combination between 4 fault description feature extraction models and 10 classifiers for effective fault diagnosis. To the best of our knowledge, we are the first to achieve all types of charging-pile fault diagnoses using fault descriptions (“Experimental result and discussion” section).

The rest of the paper is organized as follows. “Preliminary” section gives an overview of the word embedding approaches and classifiers. “Fault record dataset” section describes the fault record dataset. Experimental results and discussion are provided in “Experimental result and discussion” section. “Conclusion” section concludes the paper.


Preliminary

Word embedding is a crucial feature extraction approach: the word vectors it produces can be accumulated into a sentence embedding on which ML operations are performed. This section first introduces the 4 word embedding approaches investigated in this paper: TF-IDF N-gram, Word2vec, GloVe, and BERT. Then the 10 ML/DL classifiers are presented.
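As a minimal sketch of how word vectors can be pooled into a sentence embedding, the snippet below averages the vectors of the known tokens. Averaging is an assumption made here for illustration (summation would work similarly), and the function name `sentence_embedding` and the toy vectors are hypothetical.

```python
from typing import Dict, List


def sentence_embedding(tokens: List[str],
                       vectors: Dict[str, List[float]],
                       dim: int) -> List[float]:
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Toy 2-dimensional word vectors (hypothetical values, not from the paper).
toy_vectors = {"charging": [1.0, 0.0], "pile": [0.0, 1.0], "offline": [1.0, 1.0]}

# The out-of-vocabulary token "unknown" is simply skipped.
emb = sentence_embedding(["charging", "pile", "unknown"], toy_vectors, dim=2)
```

The resulting fixed-length vector is what a classifier would consume, regardless of how many tokens the fault description contains.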

Word embedding approaches

Four word embedding approaches are discussed.

N-gram (Suen 1979)

N-gram is a well-established feature extraction method for language. Because it handles sequence information well, it has been widely and successfully used in text feature extraction and classification. N-gram slides a window over a sequence to divide it into slices of n consecutive items. After computing the term frequency-inverse document frequency (TF-IDF) weights and the one-hot encoding of these slices, we obtain a sequence embedding. As illustrated in Fig. 2, the red box is a sliding window of size 2, 3, or 4. Each Chinese Word (CW) sentence of our corpus is then mapped into a vector.

Fig. 2 The flow of TF-IDF N-gram embedding
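The sliding-window and TF-IDF steps can be sketched as follows. This is an illustrative sketch only: it uses the plain weighting tf × log(N/df), whereas libraries often apply smoothed variants, and the helper names `ngrams` and `tfidf` as well as the English stand-in tokens are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Slide a window of size n over the token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs: List[List[str]], n: int) -> List[Dict[Tuple[str, ...], float]]:
    """Return one {n-gram: tf * log(N / df)} mapping per document."""
    grams_per_doc = [Counter(ngrams(d, n)) for d in docs]
    # Document frequency: iterating a Counter yields each distinct n-gram once.
    df = Counter(g for grams in grams_per_doc for g in grams)
    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        total = sum(grams.values()) or 1
        weights.append({g: (c / total) * math.log(n_docs / df[g])
                        for g, c in grams.items()})
    return weights


# English stand-ins for tokenized fault descriptions (hypothetical examples).
docs = [["pile", "cannot", "charge"], ["pile", "cannot", "start"]]
weights = tfidf(docs, n=2)
```

An n-gram shared by every document, such as ("pile", "cannot") here, receives weight 0, while n-grams that distinguish a document receive a positive weight.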

Word2vec (Mikolov et al. 2013)

It is a neural network-based algorithm for training word vectors. It has two types of architecture. One is the Continuous Bag-Of-Words (CBOW) model, and the other is the continuous skip-gram model. CBOW is similar to Feedforward Neural Net Language Model (Bengio et al. 2000), where the non-linear hidden layer is removed, and the projection layer is shared for all words. After the training converges, words with similar meanings are mapped to a similar position in the vector space (illustrated in Fig. 3).

Fig. 3 The flow of Word2vec embedding
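To make the CBOW idea concrete, the sketch below builds the (context, target) training pairs from which CBOW learns to predict a word given its surrounding words. It only prepares the training pairs and does not train a network; the function name `cbow_pairs`, the window size, and the example tokens are hypothetical.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int) -> List[Tuple[List[str], str]]:
    """Pair each target word with the words within `window` positions of it."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


# English stand-ins for a tokenized fault description (hypothetical example).
pairs = cbow_pairs(["pile", "cannot", "start", "charging"], window=1)
```

The skip-gram architecture uses the same pairs in the opposite direction, predicting each context word from the target word.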

GloVe (Pennington et al. 2014)

GloVe (Global Vectors) was proposed in 2014 as a word embedding model. It combines the advantages of global matrix factorization and local context window methods and efficiently leverages the statistical information of a large corpus. By training on the non-zero elements of the word-word co-occurrence matrix, GloVe produces a meaningful vector space of fixed dimension. Figure 4 shows the flow of GloVe training. We take the corpus as input, count CW term frequencies, and compute the co-occurrence matrix, on which GloVe is trained with suitable hyper-parameters. Finally, we obtain word embeddings of a specified dimension.

Fig. 4 The flow of GloVe embedding
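The co-occurrence counting step can be sketched as below. The symmetric window and the 1/distance weighting are assumptions made for illustration; the paper does not state the settings it used, and the function name `cooccurrence` and the example tokens are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def cooccurrence(sentences: List[List[str]],
                 window: int) -> Dict[Tuple[str, str], float]:
    """Accumulate symmetric, distance-weighted word-word co-occurrence counts."""
    counts: Dict[Tuple[str, str], float] = defaultdict(float)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Look back at most `window` positions; closer words weigh more.
            for j in range(max(0, i - window), i):
                weight = 1.0 / (i - j)
                counts[(word, tokens[j])] += weight
                counts[(tokens[j], word)] += weight
    return dict(counts)


# English stand-ins for a tokenized fault description (hypothetical example).
X = cooccurrence([["pile", "offline", "again"]], window=2)
```

Only the non-zero entries are stored, which matches the observation above that GloVe trains on the non-zero elements of the matrix.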


BERT

Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al. 2018) considers bidirectional contexts and pre-trains the model with a denoising autoencoding objective. It performs better than pre-training methods based on autoregressive language modeling (Yang et al. 2019). As illustrated in Fig. 5, when we input our corpus, each CW obtains a token embedding, a sentence embedding, and a position embedding. These are then fed into a two-layer bidirectional Transformer. After that, the contextual representation is output as a vector of a specified dimension for the subsequent training.

Fig. 5 The flow of BERT embedding


Classifiers

RF (Breiman 2001)

This classifier is based on ensemble learning and consists of many independent decision trees. It draws bootstrap samples as the input of each tree and obtains the final classification result by majority voting over the predictions of all trees. By aggregating multiple predictions, it overcomes the over-fitting of a single tree.
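The two ingredients described above, bootstrap sampling and majority voting, can be sketched in a few lines. This is an illustrative sketch rather than the implementation used in the experiments; the function names and the example predictions are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence


def bootstrap_sample(n_samples: int, rng: random.Random) -> List[int]:
    """Draw n_samples indices with replacement (one bootstrap sample per tree)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def majority_vote(predictions: Sequence[int]) -> int:
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
indices = bootstrap_sample(5, rng)       # training rows for one tree
label = majority_vote([1, 4, 1, 2, 1])   # labels predicted by five trees
```

Because sampling is with replacement, some rows appear several times in a tree's training set and others not at all, which is what makes the trees differ from one another.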

SVM (Cortes and Vapnik 1995)

SVM maps input vectors non-linearly into a high-dimensional feature space, in which it constructs a separating hyperplane. It aims to maximize the margin between the two sides of this hyperplane.

KNN (Sebastiani 2002)

KNN is a widely used text classifier due to its simplicity and efficiency. It classifies each point by a majority vote among its nearest neighbors.
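A minimal sketch of this decision rule is given below, assuming Euclidean distance (the paper does not state the distance metric). The function names and the toy points are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x: List[Sequence[float]], train_y: List[int],
                query: Sequence[float], k: int) -> int:
    """Label the query by a majority vote among its k nearest training points."""
    distances = sorted((euclidean(x, query), y) for x, y in zip(train_x, train_y))
    nearest_labels = [y for _, y in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 2-dimensional embeddings with two classes (hypothetical values).
train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_y = [0, 0, 1, 1]
pred_a = knn_predict(train_x, train_y, [0.2, 0.4], k=3)
pred_b = knn_predict(train_x, train_y, [5.5, 5.0], k=3)
```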

MLP (Rumelhart et al. 1986)

MLP is a feedforward artificial neural network model. Given a set of features, MLP can learn a non-linear function approximator for classification.

RNN (Elman 1990)

RNN is a kind of neural network that is effective for classifying sequential text data. Unlike feedforward neural networks, an RNN has recurrent connections that feed its state back into the network, which yields a better representation of sequences.

AB (Freund and Schapire 1997)

A new weak classifier is added in each AB training round until the predetermined error rate is reached. Each training sample is assigned a weight indicating the probability that it is selected into the training set by a classifier.

GBDT (Friedman 2001)

The GBDT classifier is composed of multiple decision trees, and the outputs of all trees are added up to give the final classification result. Notably, each new decision tree is trained on the residual left by the previous trees.

DT (Breiman et al. 2017)

DT is a non-parametric supervised learning method used for classification. It learns a set of if-else decision rules from the data. Therefore, DT is simple and easy to understand and interpret.

ET (Geurts et al. 2006)

This classifier implements many randomized decision trees on various sub-samples and uses averaging to improve the predictive accuracy and control over-fitting.


VOTE

The VOTE classifier is an ensemble model that trains several base classifiers and aggregates their predictions. It predicts either the class chosen by the majority of the base classifiers or the class with the highest aggregated probability. Instead of building separate dedicated models and evaluating the accuracy of each one, VOTE builds a single model that is trained with these base models and predicts according to their combined vote.
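The two aggregation rules mentioned above are commonly called hard voting (majority of predicted labels) and soft voting (highest mean predicted probability). The sketch below contrasts them; which rule the experiments used is not stated here, and the function names and probabilities are hypothetical.

```python
from collections import Counter
from typing import List


def hard_vote(labels: List[int]) -> int:
    """Return the label predicted by most base classifiers."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities: List[List[float]]) -> int:
    """Return the class with the highest mean predicted probability."""
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / len(probabilities)
            for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


# Class probabilities from three base classifiers (hypothetical values).
probs = [[0.60, 0.40], [0.55, 0.45], [0.10, 0.90]]
labels = [max(range(2), key=lambda c: p[c]) for p in probs]  # [0, 0, 1]
hard = hard_vote(labels)
soft = soft_vote(probs)
```

In this toy case the two rules disagree: two weakly confident classifiers outvote the third under hard voting, while the third classifier's high confidence dominates the mean under soft voting.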

Fault record dataset

In this section, we first introduce one example of raw data. Then, we conduct raw data analysis, including work order source, top 10 cities or provinces of fault recordings, and the relationship between month and fault record amount. At last, we build a fault record dataset for subsequent studies.

Raw data sample

We collect 8,481 raw records from an actual Internet of Vehicles platform service center, covering June to December 2021. For illustration, Table 1 gives one example of the raw data, which includes pile number, work order source, work date, work city, work order number, client type, fault description, work order state, fault category, and fault reason. Notably, we use ‘xxx’ in place of actual numbers to protect data privacy.

Table 1 One example of raw data

Raw data analysis

We first analyze the source of the work orders. We observe that 53.8% of work orders come from the national service hotline, 21.9% from the EV fixed line, and 8.1% from the WeChat mini program. The detailed plot is given in Fig. 6 and indicates that most charging-pile users report faults via traditional trouble calls.

Fig. 6 Work order source

Moreover, we analyze the regional distribution of the work orders. The top 10 provinces by number of fault records are denoted as P1, P2,…, and P10, respectively. Fig. 7 shows the relationship between region and number of fault records. A larger number of fault records suggests that more EV charging piles are deployed in that region. P1–P5 are all developed areas of China and have more EVs than the other, still developing provinces.

Fig. 7 Top 10 provinces of fault records

We collect fault records from June to December 2021 (shown in Fig. 8) and observe that the number of reported fault records increases month by month.

Fig. 8 The fault record number of each month

Fault record dataset

After the raw data analysis, we turn to the concrete use of the fault records. We establish a fault record dataset containing the label, fault category, and fault description. As stated in Table 2, labels 0 to 7 correspond to the fault categories: installation error fault, charging-pile mechanical fault, charging-pile program fault, user personal fault, signal fault (offline), pile compatibility fault, charging platform fault, and other faults, respectively. Moreover, Table 2 gives one fault description sample for each label and category.

Table 2 Fault record dataset

Table 3 gives the amount of data in the fault record dataset. Notably, in this paper we classify and predict on the Chinese text directly in order to preserve the characteristics of the original data.

Table 3 Data amount of fault record dataset

Experimental result and discussion

In this section, we concentrate on the experimental settings and results. First, data preprocessing and data splitting are described. Then, we introduce the ML and DL classifiers and the experimental dependencies used in this paper. In addition, the evaluation metrics are presented. Finally, we give the experimental results and discuss interpretability.

Data preprocessing

To extract features from the Chinese text, we use Jieba (Junyi 2022) as the tokenizer to cut each Chinese sentence into word segments. Table 4 lists the ten most frequent words in our dataset.

Table 4 The 10 most frequent words in the dataset
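Counting the most frequent words over tokenized descriptions can be sketched as follows. The token lists below are hypothetical stand-ins for tokenizer output (for example, what `jieba.lcut(sentence)` would return for each description); they are not taken from the dataset, and `top_words` is a hypothetical helper name.

```python
from collections import Counter
from typing import List, Tuple


def top_words(tokenized: List[List[str]], k: int) -> List[Tuple[str, int]]:
    """Return the k most frequent tokens with their counts."""
    counts = Counter(token for sentence in tokenized for token in sentence)
    return counts.most_common(k)


# Hypothetical tokenized fault descriptions.
tokenized = [
    ["充电桩", "无法", "充电"],
    ["充电桩", "离线"],
    ["无法", "启动", "充电桩"],
]
top = top_words(tokenized, 2)
```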

After that, we use N-gram, GloVe, Word2vec (CBOW model), and BERT as embedding approaches to convert each Chinese Word segmentation (CW) into a multi-dimensional vector. Table 5 gives a sample of the Word2vec output. In addition, we split our dataset into training and testing sets for classifier training. The training set contains 80% of the data and the testing set the remaining 20%, as described in Table 6. Finally, we convert the CWs of our corpus into vectors of 300 dimensions with GloVe, 20 with Word2vec, and 768 with BERT. We use different dimensions in order to discuss the relationship between model performance and word embedding dimension.

Table 5 The sample of Word2vec
Table 6 Splitting result of dataset
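An 80/20 split of this kind can be sketched as below. The shuffling, the seed, and the rounding rule are assumptions made for illustration, so the resulting sizes need not match Table 6 exactly; the record count 8,481 is used only as an illustrative dataset size, and `split_dataset` is a hypothetical helper name.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(samples: Sequence, train_ratio: float,
                  seed: int) -> Tuple[List, List]:
    """Shuffle sample indices reproducibly and cut them at train_ratio."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(len(samples) * train_ratio)
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


# Integers stand in for (embedding, label) pairs.
train, test = split_dataset(list(range(8481)), train_ratio=0.8, seed=42)
```

Fixing the seed keeps the split identical across runs, so that differences between classifier results are not caused by different test sets.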

Data augmentation

As mentioned in “Raw data sample” section, we collect 8,481 raw samples. However, a limited data scale causes a higher error rate for ML models. The paraphrasing-based method is one of the effective data augmentation approaches in NLP (Geurts et al. 2006). In this paper, we use the Python library synonyms (Bengio et al. 2000) to find synonyms for the tokens of each tokenized fault description and replace them. In total, we use 16,962 samples for ML training.

Specifically, we obtain all replaceable words for each sample and randomly select a few of them to replace. The more similar a synonym is to the original word, the more likely it is to be selected. Table 7 lists synonym examples. Notably, our fault descriptions are recorded in Chinese, so we give the Chinese version of each synonym to demonstrate the high similarity between the token and the synonym.

Table 7 The synonym examples
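The replacement procedure described above can be sketched as a similarity-weighted random choice. This is an illustrative sketch under stated assumptions: the synonym table and its similarity scores are hypothetical stand-ins for the output of a synonym lookup, and `augment` is a hypothetical helper, not the code used in the experiments.

```python
import random
from typing import Dict, List, Tuple


def augment(tokens: List[str],
            synonyms: Dict[str, List[Tuple[str, float]]],
            n_replace: int,
            rng: random.Random) -> List[str]:
    """Replace up to n_replace tokens, favoring more similar synonyms."""
    replaceable = [i for i, t in enumerate(tokens) if synonyms.get(t)]
    chosen = rng.sample(replaceable, min(n_replace, len(replaceable)))
    out = list(tokens)
    for i in chosen:
        words, scores = zip(*synonyms[tokens[i]])
        # Higher similarity score means higher selection probability.
        out[i] = rng.choices(words, weights=scores, k=1)[0]
    return out


# Hypothetical synonym table: token -> [(synonym, similarity score)].
toy_synonyms = {
    "无法": [("不能", 0.9), ("没法", 0.6)],
    "启动": [("开启", 0.8)],
}
tokens = ["充电桩", "无法", "启动"]
new_tokens = augment(tokens, toy_synonyms, n_replace=1, rng=random.Random(0))
```

Generating one augmented copy per original sample in this way doubles the dataset, which is consistent with going from 8,481 to 16,962 samples.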

Experimental goal

In this paper, we utilize 4 embedding models, including N-gram, GloVe, Word2vec, and BERT, and use 10 ML or DL classifiers, including Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Multilayer Perceptron (MLP), Recurrent Neural Network (RNN), AdaBoost (AB), Gradient Boosted Decision Tree (GBDT), Decision Tree (DT), Extra Tree (ET), and VOTE, to classify our corpus.

We carry out extensive experiments to explore the best-matched combination of word embedding model (N-gram, GloVe, Word2vec, or BERT) and classifier. The experiments are guided by the following goals.

  • Goal 1: The training time (including embedding training time and classifier training time) must be kept within several seconds.

  • Goal 2: The combination of the embedding model and the classifier can achieve high accuracy in a real-world dataset.

  • Goal 3: The embedding model and classifier should be interpretable.

Experimental configuration

This subsection gives the configuration of our experiments. The experiments run on an AMD R7 5800X platform (an eight-core CPU) with 32 GB of RAM, and we run the RNN on an NVIDIA GeForce RTX 3080 to accelerate neural network training.

As described in Table 8, we use Python 3.7.13 together with a number of Python libraries for model training. In addition, we use standard GloVe (Yang et al. 2019) on Ubuntu 18.04 LTS to train the word vectors on our corpus. The hyper-parameters of each classifier are given in Table 9. Note that, since hyper-parameters have a large impact on each model, we use the default parameters in Scikit-learn wherever possible.

Table 8 The environment and corresponding libraries
Table 9 The hyper-parameters of classifiers


Metrics

We use 4 standard ML metrics, precision, recall, accuracy, and F1-score, to evaluate the performance of each model combination. For each sample in the dataset, there are four possible outcomes:

  • TP (True Positive): Number of samples belonging to and classified as a positive class;

  • FP (False Positive): Number of samples belonging to a negative category and classified as a positive category;

  • FN (False Negative): Number of samples belonging to a positive category and classified as a negative category;

  • TN (True Negative): Number of samples in the negative category and classified as negative.

Then the precision (Eq. (1)), recall (Eq. (2)), accuracy (Eq. (3)), and F1-score (Eq. (4)) of each class can be calculated respectively as follows:

$$\text{precision} = \frac{TP}{TP + FP} \tag{1}$$
$$\text{recall} = \frac{TP}{TP + FN} \tag{2}$$
$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$
$$\text{F1-score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{4}$$
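As a worked example of Eqs. (1) to (4), the sketch below computes the four metrics for one class treated as positive and all other classes treated as negative. The function name and the toy labels are hypothetical, and returning 0 when a denominator is zero is a convention chosen here for illustration.

```python
from typing import Dict, List


def per_class_metrics(y_true: List[int], y_pred: List[int],
                      positive: int) -> Dict[str, float]:
    """Compute precision, recall, accuracy, and F1 for one class vs. the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Toy labels for three classes (hypothetical values).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
m = per_class_metrics(y_true, y_pred, positive=1)
```

For class 1 this gives TP = 2, FP = 1, FN = 0, and TN = 3, so precision is 2/3, recall is 1, accuracy is 5/6, and the F1-score is 0.8.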

Experimental results

In this section, we follow the experimental goals (in “Experimental goal” section) to explore the best match of embedding model and classifier. First, we give the training time for different embedding models and classifiers. Then, we discuss the effectiveness of imbalance learning. In addition, we provide the accuracy, precision, recall, F1-score, and average training time of different combinations to evaluate their performance. Finally, we discuss interpretability to give the final analysis.

Training time comparison

Table 10 gives the training time for the 4 embedding models. The N-gram (n = 2) training time is 1544.36 s, while GloVe and Word2vec take 7.50 s and 0.22 s, respectively. For BERT, we use the pre-trained model ‘chinese_wwm_ext_pytorch’.

Table 10 The time in training embedding models (s)

We then compare the training time and accuracy of the 16 combinations of 4 embedding models and 4 classifiers. The results are given in Tables 11 and 12, respectively. The N-gram approach not only consumes more training time but also has lower accuracy with the RF, KNN, and DT classifiers. MLP + N-gram achieves an accuracy of 76%. However, its training time is unacceptable under Goal 1, and its accuracy does not reach Goal 2. Therefore, we only use GloVe, Word2vec, and BERT in the remaining experiments of “Experimental results” section.

Table 11 Training time of 16 combinations of 4 embedding models and 4 classifiers (s)
Table 12 Accuracy (%) of 16 combinations of 4 embedding models and 4 classifiers

Imbalance learning comparison

Looking at our fault record dataset (Table 3), we find large differences in the number of samples across the fault classes, which means the dataset is imbalanced. With this in mind, we use the Python library imbalanced-learn to reduce the effect of the imbalance. Table 13 records the accuracy of the 10 classifiers with and without imbalance learning.

Table 13 The accuracy result with and without imbalance learning (%)
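To illustrate what rebalancing does, the sketch below performs random oversampling: minority classes are duplicated at random until every class is as large as the largest one. Random oversampling is an assumption chosen for illustration, since the specific resampling strategy is not stated here, and the function name and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import List, Sequence, Tuple


def random_oversample(samples: Sequence, labels: Sequence[int],
                      rng: random.Random) -> Tuple[List, List[int]]:
    """Duplicate random minority samples until all classes are equally large."""
    by_label = defaultdict(list)
    for s, y in zip(samples, labels):
        by_label[y].append(s)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for y, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([y] * target)
    return out_samples, out_labels


# Toy imbalanced data: class 0 has four samples, class 1 has one.
samples = ["a", "b", "c", "d", "e"]
labels = [0, 0, 0, 0, 1]
new_samples, new_labels = random_oversample(samples, labels, random.Random(0))
```

The larger resampled training set is one reason why training with rebalancing can take longer than training on the original data.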

After comparing the experimental results with and without imbalance learning across classifiers and embedding models, we observe that imbalance learning brings little improvement. Moreover, as recorded in Table 14, imbalance learning consumes more training time. Hence, to satisfy Goal 1, we do not apply imbalance learning in the following experiments.

Table 14 The training time with and without imbalance learning (s)

Performance comparisons

This subsection will give the overall performance comparisons from the perspective of accuracy rate, precision rate, recall rate, F1-score, and average training time.

First, we report the accuracy of the different classifiers under the GloVe, Word2vec, and BERT embedding models. As shown in Fig. 9, four classifiers, RF, RNN, DT, and ET, achieve better accuracy than the others under all three embedding models. In particular, RF and RNN achieve the top two accuracies: RF + GloVe 79.67%, RF + Word2vec 81.26%, RF + BERT 80.32%, RNN + GloVe 82.91%, RNN + Word2vec 78.26%, and RNN + BERT 81.85%. However, RNN has the highest average training time, more than 172 s, which does not satisfy Goal 1. The detailed precision, recall, and F1-score results are shown in Table 15, where metrics above 79% are in bold to highlight classifier performance.

Fig. 9 The accuracy and average training time result from different combinations

Table 15 Precision, recall, and F1-score result (%)

Hence, to satisfy Goal 1 and Goal 2, we select RF as the most appropriate classifier for the fault diagnosis task. Figure 10 shows the confusion matrix of RF + GloVe.

Fig. 10 The confusion matrix of RF + GloVe

Performance with data augmentation

As mentioned in “Data augmentation” section, we apply data augmentation to expand our data scale for better ML performance. With more training samples, model performance improves. Fig. 11 illustrates the accuracy and average training time. Table 16 compares the results with and without data augmentation, with positive average improvements in bold. Notably, all classifiers improve except SVM. A possible reason is that we keep the hyper-parameters of each classifier the same as in Table 9, and SVM is more sensitive to the choice of hyper-parameters.

Fig. 11 The accuracy and average training time result with data augmentation

Table 16 Accuracy result (%)

Interpretability discussion

To achieve Goal 3, we need to analyze the model's interpretability. Compared with black box DL models, such as RNN, most traditional ML models have better interpretability. In addition, Word2vec and BERT are neural network-based word embedding models, while GloVe utilizes the co-occurrence matrix and term frequency of corpus to train the embedding vector. In other words, GloVe does not involve a neural network and has better interpretability.

We use the Python library pydotplus to visualize the RF classification. To illustrate the interpretability of RF + GloVe, Fig. 12 shows one of the 500 RF trees, trained on 200 samples with the GloVe embedding model.

Fig. 12 One of the RF trees when the number of training samples is 200, and the embedding model is GloVe

In brief, to satisfy Goal 1, Goal 2, and Goal 3, we select the RF classifier and GloVe word embedding model to finish the fault diagnosis task.

Result discussion

Why is the accuracy on the preprocessed raw data low?

We believe there are two main reasons: the small scale of the raw data and the irregular, manually written fault descriptions. A larger data scale helps the model learn more features. Besides, the real-world dataset is recorded by customer service staff in an irregular and casual style, which strongly affects classification performance. Indeed, when we augment the data by replacing some tokens with synonyms, which are more regular descriptions, accuracy improves considerably.

What can we learn from the interpretability result?

The interpretability of a model reflects the logic of its classification, and Fig. 12 shows the logic of one RF tree. A model with worse interpretability, such as RNN, is a black box throughout the training process, which is unacceptable for critical infrastructure.

What can we conclude from the different performances of models?

The extensive experimental results show that RNN, RF, DT, and ET have superior performance. Apart from RNN, which has worse interpretability, RF, DT, and ET are all tree-based methods, which indicates that tree structures perform well for fault classification and fault diagnosis. Besides, compared with other classification models, the hyper-parameters of tree-based models are easier to tune to achieve the best result.


Conclusion

With the development of electric vehicles (EVs), many charging piles have been deployed as supporting facilities. This paper focuses on fault diagnosis to keep charging piles working effectively. Specifically, we vectorize the fault descriptions of a real-world fault record dataset using the N-gram, GloVe, Word2vec, and BERT embedding models. Then we use ten machine learning or deep learning classifiers, namely Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Multilayer Perceptron (MLP), Recurrent Neural Network (RNN), AdaBoost (AB), Gradient Boosted Decision Tree (GBDT), Decision Tree (DT), Extra Tree (ET), and VOTE, to explore the best-matched embedding model and classifier for charging-pile fault diagnosis.

Our extensive experiments reveal that the RF classifier working with the GloVe embedding model achieves the best accuracy with low training time on the real-world dataset. Finally, we discuss the interpretability of RF and GloVe.

Availability of data and materials

Not applicable.



Abbreviations

EV: Electric vehicle
ML: Machine learning
DL: Deep learning
TF-IDF: Term frequency-inverse document frequency
CW: Chinese Word Segmentation
RF: Random Forest
SVM: Support Vector Machine
KNN: K-Nearest Neighbor
MLP: Multilayer Perceptron
RNN: Recurrent Neural Network
AB: AdaBoost
GBDT: Gradient Boosted Decision Tree
DT: Decision Tree
ET: Extra Tree
CBOW: Continuous Bag-Of-Words




This work was supported by the State Grid Technology Project “Research on Interaction between Large-scale Electric Vehicles and Power Grid and Charging Safety Protection Technology” (5418-202071490A-0-0-00) from State Grid Corporation of China.



Author information




Drafting the manuscript: JW and XC. Revising the manuscript critically for important intellectual content: WW, XC, YY, CX, SY, MW, and LW. Experiments deployment: JW and LL. All authors read and approved the final manuscript.

Authors’ Information

Wen Wang is currently working as the Deputy General Manager of State Grid Electric Vehicle Service Company. He is Professorial Senior Engineer; Expert of China National Key R&D Program Review Group; Secretary-General of IEEE PES EV Satellite Committee (China); Senior Member of National Electrical System Security Protection Expert Group; Specialist in the field of power system automation, power trading and information security.

Jianhua Wang received the B.S. and M.S. degrees in software engineering from Taiyuan University of Technology in 2017 and 2020, respectively. He is currently pursuing a Ph.D. degree in cyberspace security at Beijing Jiaotong University. His research interests include adversarial machine learning and federated learning.

Xiaofeng Peng is a Senior Engineer and currently the head of the V2G department of State Grid EV Service Co. His research interests include V2G and load aggregation technology.

Ye Yang is a Senior Engineer and currently an R&D scientist at State Grid EV Service Co. His research interests include AI, blockchain, and smart grid control technology.

Chun Xiao is currently working as the senior engineer of State Grid Shanxi Marketing Service Center, and the specialist in marketing service.

Shuai Yang is currently working as the senior engineer of State Grid Shanxi Marketing Service Center, and the specialist in marketing service.

Mingcai Wang is currently working as the senior engineer of State Grid Electric Vehicle Service Company, Ltd. He is the specialist in the field of power system automation.

Lingfei Wang is currently working as the senior engineer of State Grid Electric Vehicle Service Company, Ltd. He is the specialist in power trading and control of electric power system.

Lin Li is currently an Associate Professor with the School of Computer and Information Technology, Beijing Jiaotong University. Her current research interests include cryptographic protocols, privacy preserving, and federated learning.

Xiaolin Chang (Member, IEEE) is a professor at the School of Computer and Information Technology, Beijing Jiaotong University. Her current research interests include Edge/Cloud computing, Network security, security and privacy in machine learning. She is a senior member of IEEE.

Corresponding authors

Correspondence to Wen Wang or Xiaolin Chang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

No potential conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Wang, W., Wang, J., Peng, X. et al. Exploring best-matched embedding model and classifier for charging-pile fault diagnosis. Cybersecurity 6, 7 (2023).
