Detecting fake reviewers in heterogeneous networks of buyers and sellers: a collaborative training-based spammer group algorithm
Cybersecurity volume 6, Article number: 26 (2023)
Abstract
It is not uncommon for malicious sellers to collude with fake reviewers (also called spammers) to write fake reviews for multiple products to either demote competitors or promote their products’ reputations, forming a gray industry chain. To detect spammer groups in a heterogeneous network with rich semantic information from both buyers and sellers, researchers have conducted extensive research using Frequent Item Mining (FIM)-based and graph-based methods. However, these methods cannot detect spammer groups with cross-product attacks and do not jointly consider structural features, attribute features, and the structure-attribute correlation, resulting in poorer detection performance. Therefore, we propose a collaborative training-based spammer group detection algorithm that constructs a heterogeneous induced subnetwork based on the target product set to detect cross-product attack spammer groups. To jointly consider all available features, we use the collaborative training method to learn the feature representations of nodes. In addition, we use the DBSCAN clustering method to generate candidate groups, exclude innocent ones, and rank them to obtain spammer groups. The experimental results on real-world datasets indicate that the overall detection performance of the proposed method is better than that of the baseline methods.
Introduction
The convenience of e-commerce has made online shopping increasingly common. Because information in e-commerce is asymmetric and quality control centers are lacking, consumers tend to browse reviews related to products or services before purchasing them online. This makes online reviews an essential reference for consumers when making purchasing decisions. According to a Harvard University study, every 1-star increase in a product’s rating on Yelp creates a 5–9% increase in revenue for that product (Luca 2016). Motivated by potential financial gain, some malicious sellers collude with spammers to either demote competitors or promote their own businesses by posting many fake reviews. The proliferation of fake reviews makes it impossible for consumers to judge the actual quality of products from the review information. This seriously affects consumers’ shopping experience and destroys fair competition among merchants. It also has an extremely negative impact on the development of the e-commerce industry.
Spammers constantly change their spam strategies to escape detection by spammer identification models. They often work in groups to camouflage their behavior and improve attack efficiency. A spammer group is a group of reviewers who write fake reviews for one or several products in an organized and coordinated manner (Mukherjee et al. 2012). A spammer group is more covert, destructive, and influential than an individual spammer, because it can evade the platform’s detection by avoiding certain relationships with other group members or by developing multiple relationships with genuine reviewers (Shehnepoor et al. 2022). In addition, its members can mislead consumers by imitating the behavior and language of genuine reviewers and writing fake reviews about target products within a short period. Therefore, effectively identifying spammer groups on e-commerce platforms and ensuring the credibility of product reviews has become an urgent issue in network information security.
Since the pioneering work of Jindal and Liu (2008), most efforts have been aimed at detecting fake reviews (Jindal and Liu 2008; Ott et al. 2011; Li et al. 2011; Cao et al. 2020, 2022) or individual spammers (Wang et al. 2012, 2020; Mukherjee et al. 2013). In recent years, several researchers (Mukherjee et al. 2012; Shehnepoor et al. 2022, 2021; Ji et al. 2020; Xu et al. 2013; Zhang et al. 2021, 2022a, 2020a, 2022b; Wang et al. 2016, 2018; Li et al. 2017; Hu 2021; Akoglu et al. 2013; Ye and Akoglu 2015; Zheng et al. 2018; Zhu et al. 2019; Chao et al. 2022) have attempted to detect spammers with collusive fraudulent behaviors at the group level. The existing work on detecting spammer groups can be roughly divided into two categories, i.e., FIM-based and graph-based methods. FIM-based methods (Mukherjee et al. 2012; Shehnepoor et al. 2022, 2021; Xu et al. 2013; Zhang et al. 2021) typically identify candidate groups based on the co-review hypothesis, whereas graph-based methods (Wang et al. 2016, 2018; Li et al. 2017; Zhang et al. 2022a, 2020a, 2022b; Hu 2021; Akoglu et al. 2013; Ye and Akoglu 2015; Zheng et al. 2018; Zhu et al. 2019; Chao et al. 2022) apply techniques such as graph partitioning or clustering to the constructed reviewer relationship network to discover candidate groups. Both categories use a set of spam indicators to measure the spamming behavior of each candidate group and then identify spammer groups among the candidates. Although most existing methods have shown excellent performance, they have two major limitations. First, existing methods detect spammer groups only from the viewpoint of reviewers or of a single product, ignoring the fact that spammer groups launch cross-product attacks (i.e., attacks on multiple target products) as camouflage to evade detectors. Secondly, most existing methods focus only on structural features of the review network or on attribute features of nodes (e.g., features describing the behavior of reviewers or products) when detecting spammer groups, without jointly considering structural features, attribute features, and the structure-attribute correlation. A method is therefore needed that comprehensively exploits all the available information to learn the feature representation of nodes.
Therefore, we propose a spammer group detection algorithm based on collaborative training for heterogeneous networks, named SGDCTH. In particular, we first calculate the suspiciousness of each product based on the Network Footprint Score (NFS) metric to filter target products and then construct a heterogeneous induced subnetwork based on all target products, in which we can detect spammer groups that commit cross-product attacks. Subsequently, we use a collaborative training method to model both the intra-partition and inter-partition proximity of the heterogeneous induced subnetwork. In this process, we consider each node’s structural and attribute information and the structure-attribute correlation to learn the feature representation of nodes effectively. Furthermore, we use the DBSCAN clustering method to generate candidate groups in the embedding space of reviewers. Finally, we obtain spammer groups by the group purification and ranking method. The contributions of this paper are summarized as follows:

(1) Unlike most existing methods that detect spammer groups from the viewpoint of reviewers (Mukherjee et al. 2012; Shehnepoor et al. 2022, 2021; Xu et al. 2013; Zhang et al. 2021, 2022a, 2020a, 2022b; Wang et al. 2016, 2018; Li et al. 2017; Hu 2021; Akoglu et al. 2013; Ye and Akoglu 2015; Zheng et al. 2018; Zhu et al. 2019; Chao et al. 2022) or of a single product (Ji et al. 2020), we propose a new heterogeneous network-based method for identifying spammer groups from the cross-product viewpoint. We first filter target products, then construct a heterogeneous network based on the target product set, and finally discover spammer groups by learning feature representations of nodes in the heterogeneous network. This enables our method to detect groups that attack multiple target products more accurately.
(2) Unlike most existing methods that focus only on structural features of the review network (Shehnepoor et al. 2022; Ye and Akoglu 2015; Zheng et al. 2018; Zhu et al. 2019; Zhang et al. 2022b) or on attribute features of nodes (Mukherjee et al. 2012; Ji et al. 2020; Xu et al. 2013; Li et al. 2017; Hu 2021; Zhang et al. 2020a), we jointly consider structural features, attribute features, and the structure-attribute correlation to detect spammer groups. We first extract the raw structural and attribute features of nodes, then use the collaborative training method to model both the intra-partition and inter-partition proximity of a heterogeneous network. In the training process, we take all available information into account to capture suspicious spammer groups in terms of structure and attributes.
(3) We conduct experiments on real-world review datasets and compare our method with four baseline methods. The experimental results indicate that our method can accurately and efficiently detect active spammer groups on e-commerce websites.
The rest of the paper is organized as follows. Section “Related work” reviews the related work on spammer group detection. Section “The spammer group detection algorithm based on collaborative training for heterogeneous networks” details our SGDCTH method. Section “Experiments” describes the experimental results. Finally, we summarize this paper in section “Conclusion”.
Related work
Existing work on spammer group detection can be categorized along four dimensions. First, depending on the strategy used to generate candidate spammer groups, existing work can be divided into FIM-based and graph-based methods. Secondly, according to the features considered in group mining, existing work can be divided into three categories: methods based on group behavior and content analysis (B&C), methods based on group structure analysis (S), and methods combining group behavior and structure analysis (B&C+S). Thirdly, according to the coupling degree of the discovered group members, existing work can be divided into tightly coupled and loosely coupled methods. Fourthly, according to the concentration of the attacked target products, existing work can be divided into single-product and cross-product methods. It is worth noting that tightly coupled and loosely coupled methods are designed from the reviewers’ viewpoint, and existing work detects spammer groups almost entirely from that viewpoint. In contrast, single-product and cross-product methods are designed from the products’ viewpoint. To the best of our knowledge, Ji et al. (2020) were the first to propose detecting spammer groups from the viewpoint of products. The following subsections review existing work according to the strategy used to generate candidate spammer groups.
FIM-based methods
FIM-based methods first use the FIM method to generate candidate groups based on the co-review hypothesis and then rank or classify them to obtain spammer groups. Mukherjee et al. (2012) were the first to study the problem of spammer group detection. They use the FIM method to treat reviewers who co-review the same set of products as a candidate group and propose a relationship-based model to detect spammer groups. Later, Xu et al. (2013) proposed a KNN-based and a graph-based classification method to predict whether a candidate group’s members are suspicious. Zhang et al. (2021) first use the FIM method to discover candidate groups and then propose a method that fuses behavioral and structural feature reasoning to detect spammer groups. After obtaining the candidate groups based on the concept of FIM, Shehnepoor et al. (2022, 2021) use deep learning methods to gradually refine the reviewers’ representation, remove abnormal members from candidate groups based on the refined representation, and finally classify candidate groups. However, in the process of mining groups, the FIM method may incorrectly assign genuine reviewers who happen to post reviews on the same products to spammer groups. In addition, the method is very sensitive to the setting of support thresholds. Therefore, FIM methods are suitable for detecting tightly coupled groups (i.e., group members need to review all target products) but not for detecting loosely coupled groups (i.e., group members do not need to review every target product, which conceals the group’s spamming behavior) (Wang et al. 2016, 2018; Zhang et al. 2022b).
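The co-review hypothesis and the sensitivity to the support threshold can be illustrated with a minimal sketch. It is not the mining procedure of any cited work: the function name `fim_candidate_groups`, the fixed `group_size`, and the toy review list are assumptions made for illustration, and real FIM implementations (e.g., Apriori-style miners) enumerate itemsets of all sizes far more efficiently.

```python
from collections import defaultdict
from itertools import combinations


def fim_candidate_groups(reviews, group_size=2, min_support=2):
    """Return reviewer tuples of `group_size` that co-reviewed at least
    `min_support` products, mapped to the products they share."""
    reviewers_by_product = defaultdict(set)
    for reviewer, product in reviews:
        reviewers_by_product[product].add(reviewer)

    # support[group] = set of products reviewed by every member of the group
    support = defaultdict(set)
    for product, reviewers in reviewers_by_product.items():
        for group in combinations(sorted(reviewers), group_size):
            support[group].add(product)

    return {g: sorted(p) for g, p in support.items() if len(p) >= min_support}


# Hypothetical (reviewer, product) pairs.
reviews = [
    ("r1", "p1"), ("r2", "p1"), ("r3", "p1"),
    ("r1", "p2"), ("r2", "p2"),
    ("r3", "p3"),
]
groups = fim_candidate_groups(reviews, group_size=2, min_support=2)
loose_groups = fim_candidate_groups(reviews, group_size=2, min_support=1)
```

With `min_support=2` only the pair (`r1`, `r2`) survives, whereas `min_support=1` also admits every pair that merely coincided on `p1`, which is the kind of accidental co-review that pulls genuine reviewers into candidate groups.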
Graph-based methods
Graph-based methods use graph partitioning, clustering, community detection, and similar techniques to generate candidate groups on the review network and then rank or classify them to obtain spammer groups. Depending on how the graph is constructed, they can be further divided into homogeneous and heterogeneous graph-based methods.
Homogeneous graph-based methods
Among homogeneous graph-based methods, researchers generally detect spammer groups in a reviewer relationship network constructed from the similarity between reviewers. Wang et al. (2016) adopt the divide-and-conquer idea to detect loosely coupled spammer groups on the reviewer projection network. On this basis, Wang et al. (2018) propose a top-down framework, GSBC, which uses the min-cut method to discover spammer groups on a bi-connected reviewer network. Li et al. (2017) use the graph clustering method to obtain spammer groups on the co-burst network. Zhang et al. (2022a) detect spammer groups in three steps. First, they construct a reviewer relationship network and use an improved label propagation method to discover candidate groups. Secondly, they adopt a combination of subjective and objective indicator weighting strategies to evaluate the spamicity of each candidate group. Finally, they rank the candidate groups according to spamicity scores to obtain spammer groups. Hu (2021) uses a community mining method based on network representation learning on the constructed reviewer similarity network to detect tightly connected groups, from which true spammer groups are then identified. Zhang et al. (2020a) use an improved label propagation method to obtain candidate groups and propose a new ranking method to find collusive spammers. However, homogeneous graph-based methods do not deeply examine the implicit relationships among reviewers when constructing the reviewer relationship network and therefore miss some spammers with collusive fraudulent behaviors. Moreover, these methods cannot capture the highly nonlinear relationships between nodes in the network (Zheng et al. 2018).
Heterogeneous graph-based methods
Among heterogeneous graph-based methods, researchers detect spammer groups in a reviewer-object heterogeneous network constructed from the reviewers’ review behavior. Akoglu et al. (2013) obtain spammer groups by the graph clustering method on an induced subgraph containing highly suspicious reviewers and corresponding products. Ye and Akoglu (2015) propose a two-step method to discover review spammer groups. They first identify target products vulnerable to spammer attacks and then use the agglomerative hierarchical clustering method to detect spammer groups on an induced subgraph. Zheng et al. (2018) first utilize the deep network embedding method to jointly learn the feature representation of nodes in a bipartite review network and then use the DBSCAN clustering method to detect dense blocks in the latent space. Zhu et al. (2019) first embed explicit and implicit relations in a bipartite network to obtain the representation of reviewers and then use a k-dimensional tree-based fast density subgraph mining method to obtain multiple collaborative groups. Chao et al. (2022) first construct a heterogeneous network based on the idea of the meta-graph and use an improved DeepWalk method to learn the feature representation of nodes. Then, they utilize the Canopy and K-means clustering methods to generate candidate groups and treat the top k most suspicious groups as spammer groups. Zhang et al. (2022b) first construct a reviewer-product bipartite network as the agent’s interactive environment and use an improved reinforcement learning method to generate candidate groups. Next, they exploit the Doc2Vec model to obtain the embedding vector of each candidate group and devise an adversarial autoencoder-based one-class classification model for detecting collusive spammers. The above heterogeneous graph-based methods neglect to exclude innocent individuals from candidate groups. Furthermore, these methods utilize only structural or only attribute information when detecting spammer groups and do not jointly consider structural information, attribute information, and the structure-attribute correlation.
Summary
Table 1 summarizes the existing work along these four dimensions. Although most existing FIM-based and graph-based methods for detecting spammer groups are generally effective, they have some limitations. Specifically, FIM-based methods are prone to misjudging genuine reviewers as spammers in the process of mining groups. Furthermore, FIM methods are suitable only for detecting tightly coupled spammer groups. Among graph-based methods, homogeneous ones do not take full advantage of the implicit relationships between reviewers when constructing the reviewer relationship network, while heterogeneous ones omit the step of group purification. Moreover, existing heterogeneous graph-based methods do not jointly consider the structural features of the review network, the attribute features of nodes, and the structure-attribute correlation.
The spammer group detection algorithm based on collaborative training for heterogeneous networks
To address the limitations of existing methods, we propose a new unsupervised spammer group detection algorithm, SGDCTH, as shown in Algorithm 1. Figure 1 shows the overall framework of our method, which consists of four steps. First, we filter target products based on the NFS metric and then construct a heterogeneous induced subnetwork based on the set of target products (see Algorithms 2 and 3 for details). Secondly, we use a collaborative training method to model the intra-partition and inter-partition proximity of the heterogeneous induced subnetwork to obtain low-dimensional vector representations of nodes (see Algorithm 4 for details). Thirdly, the candidate spammer groups are generated based on the DBSCAN clustering method (see Algorithm 5 for details). Fourthly, innocent reviewers are excluded from the candidate groups, and the purified groups are ranked to obtain spammer groups (see Algorithm 6 for details). Each step is implemented as an algorithm. We describe the preliminaries in subsection “Preliminary”, and the subsequent subsections describe the implementation details of each step.
Preliminary
This subsection defines several important concepts that are relevant to our work.
Definition 1
Heterogeneous Information Network (Wang et al. 2022). A Heterogeneous Information Network (HIN) is defined as a network \({\mathcal{G}} = ({\mathcal{V}},{\mathcal{E}})\), where \(\mathcal{V}\) and \(\mathcal{E}\) denote the set of nodes and the set of edges, respectively. Each node \(v \in \mathcal{V}\) and each edge \(e \in \mathcal{E}\) is associated with a node type mapping function \(\phi (v):\mathcal{V} \to {\mathcal{A}}\) and an edge type mapping function \(\varphi (e):\mathcal{E} \to {\mathcal{R}}\), respectively, where \({\mathcal{A}}\) and \({\mathcal{R}}\) denote the sets of node types and edge types, and \(\left| {\mathcal{A}} \right| + \left| {\mathcal{R}} \right| > 2\).
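As a small illustration of Definition 1, the sketch below stores the two type mapping functions as dictionaries and checks the heterogeneity condition. The class name `HIN`, its methods, and the node and edge type labels are hypothetical choices for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class HIN:
    """Toy network with a node type map (phi) and an edge type map (varphi)."""
    node_type: dict = field(default_factory=dict)  # phi: node -> node type
    edge_type: dict = field(default_factory=dict)  # varphi: (u, v) -> edge type

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, u, v, etype):
        if u not in self.node_type or v not in self.node_type:
            raise KeyError("both endpoints must be typed nodes")
        self.edge_type[(u, v)] = etype

    def is_heterogeneous(self):
        # Definition 1: |A| + |R| > 2
        num_node_types = len(set(self.node_type.values()))
        num_edge_types = len(set(self.edge_type.values()))
        return num_node_types + num_edge_types > 2


# Reviewer-product network: two node types, one edge type.
review_net = HIN()
review_net.add_node("r1", "reviewer")
review_net.add_node("p1", "product")
review_net.add_edge("r1", "p1", "reviews")

# Single node type and single edge type: a homogeneous network.
friend_net = HIN()
friend_net.add_node("u1", "user")
friend_net.add_node("u2", "user")
friend_net.add_edge("u1", "u2", "follows")
```

The reviewer-product network satisfies the condition with two node types and one edge type, while the user-only network does not.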
Definition 2
Meta-path (Wang et al. 2021a). A meta-path \({\mathcal{P}}\) is defined as a path in the form of \(A_{1} \mathop{\longrightarrow}\limits^{{R_{1} }}A_{2} \mathop{\longrightarrow}\limits^{{R_{2} }} \cdots \mathop{\longrightarrow}\limits^{{R_{l} }}A_{l + 1}\) (abbreviated as \(A_{1} A_{2} \cdots A_{l + 1}\)), which describes a composite relation \(R = R_{1} \circ R_{2} \circ \cdots \circ R_{l}\) between node types \(A_{1}\) and \(A_{l + 1}\), where \(\circ\) denotes the composition operator on relations.
The target product filtration method and the heterogeneous induced subnetwork construction method
The target product filtration method
Inspired by Ji et al. (2020), who detect spammer groups based on review bursts from the products’ viewpoint, we adopt the NFS metric (Ye and Akoglu 2015) to quantify the likelihood of a product being attacked. NFS leverages two key observations about real-world networks, i.e., neighbor diversity and self-similarity. The former refers to the local diversity of node importance within the neighborhood of a node, and the latter refers to the distributional similarity between node importance at the local and global levels. In this work, we use degree and PageRank centrality (Ye and Akoglu 2015) to measure node importance in a network.

(1) Neighbor diversity of nodes
To measure the diversity of neighborhood centralities of a given product \(\widetilde{p} \in \mathcal{V}\) with degree \(deg(\widetilde{p})\), we proceed in three steps. First, a list of buckets \(f = \{ 0,1,...\}\) is created so that the bucket boundary values grow exponentially as \(a \cdot b^{f}\). Then, the reviewers of the product are placed in the \(F\) buckets according to their centrality, and the reviewers in each bucket are counted and normalized to obtain a discrete probability distribution \(S^{(i)}\) with values \([s_{1}^{(i)} ,...,s_{F}^{(i)} ]\). Finally, by calculating the Shannon entropy of \(S^{(i)}\), the product \(\widetilde{p}\) obtains two neighbor diversity scores \(H_{deg} \left( {\widetilde{p}} \right)\) and \(H_{pr} \left( {\widetilde{p}} \right)\) for degree and PageRank, respectively. The lower these scores are, the more suspicious the product is.

(2) Self-similarity in real-world networks
To calculate the self-similarity for a given product \(\widetilde{p}\), we compute the \(KL\)-divergence between the histogram density \(S^{(i)} = [s_{1}^{(i)} ,...,s_{F}^{(i)} ]\) of the centrality of its reviewers and the histogram density \(T\) of the centrality of all reviewers in the network. \(T\) is calculated in the same way as \(S\), except that \(T\) divides the centrality values of all reviewers in the network into buckets. The product \(\widetilde{p}\) thus obtains two separate scores \(KL_{deg} \left( {\widetilde{p}} \right)\) and \(KL_{pr} \left( {\widetilde{p}} \right)\) that measure its deviation from self-similarity. The higher these scores are, the more suspicious the product is.

(3) NFS metric
Finally, each product receives four suspiciousness scores: two based on neighbor diversity, i.e., \(H_{deg}\) and \(H_{pr}\), and two based on self-similarity, i.e., \(KL_{deg}\) and \(KL_{pr}\). We use the Cumulative Distribution Function (CDF) to unify them on a standard scale. Let \(H = \{ H(1),H(2),...\}\) be the list of entropy values calculated for a set of products (based on degree or PageRank centrality). To quantify how extreme \(H\left( {\widetilde{p}} \right)\) is, an empirical CDF over \(H\) is used: the probability that a value in the set \(H = \{ H(1),H(2),...\}\) is less than or equal to \(H\left( {\widetilde{p}} \right)\), i.e., \(P\left( {H \le H\left( {\widetilde{p}} \right)} \right)\), is calculated as follows (Ye and Akoglu 2015).
On the other hand, for the \(KL\)-divergence, the empirical probability that a value in the set \(KL = \left\{ {KL(1),\;KL(2),...} \right\}\) is greater than \(KL\left( {\widetilde{p}} \right)\), i.e., \(1 - P\left( {KL \le KL\left( {\widetilde{p}} \right)} \right)\), is calculated as follows (Ye and Akoglu 2015).
Our ultimate goal is to flag low values of \(H\left( {\widetilde{p}} \right)\) and high values of \(KL\left( {\widetilde{p}} \right)\), and to obtain the NFS value of a product \(\widetilde{p}\) by combining them. A higher value of \(NFS\left( {\widetilde{p}} \right) \in [0,1]\) indicates that a product is more suspicious, calculated as follows (Ye and Akoglu 2015).
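The three stages above can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the reference implementation: the bucket parameters `a` and `b`, the smoothing constant `eps` in the KL-divergence, the toy centrality values, and all function names are hypothetical, and the combining rule in `nfs` (one minus the root mean square of the four CDF-scaled scores) is an assumed form chosen so that low entropy and high divergence yield a score close to 1.

```python
import math


def bucket_histogram(values, a=1.0, b=2.0, num_buckets=8):
    """Normalized histogram over buckets with upper boundaries a * b**f."""
    counts = [0] * num_buckets
    for v in values:
        f = 0
        while f < num_buckets - 1 and v > a * b ** f:
            f += 1
        counts[f] += 1
    total = sum(counts)
    return [c / total for c in counts]


def shannon_entropy(dist):
    return -sum(p * math.log(p) for p in dist if p > 0)


def kl_divergence(s, t, eps=1e-9):
    # eps avoids division by zero for empty global buckets (an assumption).
    return sum(p * math.log((p + eps) / (q + eps)) for p, q in zip(s, t) if p > 0)


def ecdf(sample, x):
    """Empirical P(sample <= x)."""
    return sum(1 for y in sample if y <= x) / len(sample)


def nfs(f_values):
    """Assumed combination: 1 - root mean square of the scaled scores."""
    return 1.0 - math.sqrt(sum(f * f for f in f_values) / len(f_values))


# Hypothetical centralities of the reviewers of three products.
neighbors = {
    "pA": {"deg": [3, 3, 3, 3], "pr": [0.02, 0.02, 0.02, 0.02]},
    "pB": {"deg": [1, 2, 5, 20], "pr": [0.01, 0.03, 0.08, 0.30]},
    "pC": {"deg": [1, 3, 6, 9], "pr": [0.01, 0.02, 0.05, 0.12]},
}
params = {"deg": dict(a=1.0, b=2.0), "pr": dict(a=0.01, b=2.0)}

scores = {name: {} for name in neighbors}
for key in ("deg", "pr"):
    all_values = [v for p in neighbors.values() for v in p[key]]
    T = bucket_histogram(all_values, **params[key])     # global distribution
    for name, p in neighbors.items():
        S = bucket_histogram(p[key], **params[key])     # local distribution
        scores[name]["H_" + key] = shannon_entropy(S)
        scores[name]["KL_" + key] = kl_divergence(S, T)


def nfs_of(name):
    f = []
    for key in ("deg", "pr"):
        H_all = [s["H_" + key] for s in scores.values()]
        KL_all = [s["KL_" + key] for s in scores.values()]
        f.append(ecdf(H_all, scores[name]["H_" + key]))          # P(H <= H(p))
        f.append(1.0 - ecdf(KL_all, scores[name]["KL_" + key]))  # P(KL > KL(p))
    return nfs(f)
```

In this toy data the reviewers of `pA` all share the same centrality, so its entropy is zero and its divergence from the global distribution is the largest, which gives it the highest score of the three products.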
The heterogeneous induced subnetwork construction method
Based on the HIN and the set of target products, we first give the definition of the heterogeneous induced subnetwork.
Definition 3
Heterogeneous Induced Subnetwork. The heterogeneous induced subnetwork (HISN) is defined as a network \({\mathcal{S}\mathcal{G}} = ({\mathcal{V}}_{{{\mathcal{S}\mathcal{G}}}}, {\mathcal{E}}_{{{\mathcal{S}\mathcal{G}}}} )\), where \(\mathcal{V}_{{\mathcal{S}\mathcal{G}} }\) and \(\mathcal{E}_{{\mathcal{S}\mathcal{G}} }\) denote the set of nodes and the set of edges, respectively. The subnetwork consists of the target product set \(\widetilde{P}\), all reviewers \(R\) who reviewed target products in \(\widetilde{P}\), and all products \(P \supseteq \widetilde{P}\) reviewed by these reviewers. In other words, it is the subnetwork of \({\mathcal{G}}\) induced by the nodes within two hops of the target products in \(\widetilde{P}\).
Based on the above description, we design a method for filtering target products in Algorithm 2. For each product in the review network, its NFS value is calculated. If the NFS value exceeds a given threshold \(\delta_{{\widetilde{p}}}\), the product is added to the target product set \(\widetilde{P}\). In addition, we design a method for constructing the HISN in Algorithm 3, which builds the HISN from all of the target products.
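The filtering and construction steps can be sketched as below. This is only a reading of the prose description, not a transcription of Algorithms 2 and 3: the function names, the precomputed `nfs_scores` dictionary, the threshold value, and the toy review list are hypothetical.

```python
def filter_target_products(nfs_scores, threshold):
    """Keep products whose NFS value exceeds the threshold."""
    return {p for p, score in nfs_scores.items() if score > threshold}


def build_hisn(reviews, targets):
    """Induce the subnetwork on nodes within two hops of the target products.

    reviews: iterable of (reviewer, product) pairs.
    Returns (reviewers, products, edges).
    """
    # Hop 1: reviewers who reviewed at least one target product.
    reviewers = {r for r, p in reviews if p in targets}
    # Hop 2: every review written by those reviewers, targets included.
    edges = {(r, p) for r, p in reviews if r in reviewers}
    products = {p for _, p in edges}
    return reviewers, products, edges


# Hypothetical inputs.
nfs_scores = {"p1": 0.9, "p2": 0.4, "p3": 0.2, "p4": 0.1}
reviews = [
    ("r1", "p1"), ("r1", "p2"),
    ("r2", "p1"), ("r2", "p4"),
    ("r3", "p3"),
]

targets = filter_target_products(nfs_scores, threshold=0.5)
hisn_reviewers, hisn_products, hisn_edges = build_hisn(reviews, targets)
```

Here only `p1` passes the threshold, yet `p2` and `p4` enter the subnetwork because reviewers of `p1` also reviewed them, while `r3` and `p3` are left out.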
The collaborative training-based feature representation learning method
In real life, a spammer often has a close relationship with a series of manipulated target products, i.e., a reviewer-product relationship. To increase the concealment of a group, its members often collaborate to co-review multiple target products, i.e., a reviewer-reviewer relationship. The target products under attack will have overlapping spammers, i.e., a product-product relationship. To capture these relationships, we first use a collaborative training method to model both the intra-partition proximity and inter-partition proximity of the HISN. Then, we model the structure-attribute correlation using a latent correlation training strategy to learn the feature representation of nodes.
Intra-partition proximity modeling
The intra-partition proximity captures the relationships between nodes within the same partition in terms of both structure and attributes (i.e., implicit relationships, including the reviewer-reviewer relationship and the product-product relationship). On the one hand, nodes with similar “interaction” behaviors with nodes in the other partition should have high proximity (i.e., structural proximity). On the other hand, nodes sharing similar attributes tend to exhibit similar behaviors in the network (i.e., attribute proximity) (Huang et al. 2020). Specifically, we first extract the raw structural features and attribute features of nodes in the HISN. Subsequently, we partition the HISN and input the structural and attribute features of nodes within the same partition into two independent autoencoders to learn their compact representations. Finally, we jointly model the structural and attribute proximity to preserve the first-order proximity of nodes.
The raw feature extraction method

(1) The raw structural feature extraction method
NFS measures how unusually similar the reviewers who target a product are, and such groups of highly similar reviewers are likely to work together in spam campaigns (Ye and Akoglu 2015). Therefore, the behavior of reviewers within each subgraph (i.e., a bipartite subgraph consisting of a target product and its corresponding reviewers) is highly consistent. However, Wang et al. (2021b) found an inconsistency between a node’s behavior and its label semantics. Inspired by Wang et al. (2021b), we propose to contrast node representations with subgraph representations to reduce the impact of the inconsistency caused by different behaviors across subgraphs.
We first perform feature decomposition of the normalized adjacency matrix to obtain the initial feature matrix \(X\), where \(x_{i} \in {\mathcal{R}}^{{d_{0} }}\) represents the \(d_{0}\)-dimensional initial feature vector of a node \(v_{i}\). Then, \(X\) is fed into a GNN to learn the structural features of nodes. In our work, we adopt GIN (Xu et al. 2018) in Eq. (4), a state-of-the-art graph neural network, to learn the structural features of each node by means of a sum-like neighborhood aggregation function.
where \({\mathbf{x}}_{i}^{(l)} \in {\mathcal{R}}^{d}\) is the embedding of node \(v_{i}\) at the \(l\)-th layer, and \({\mathbf{x}}_{i}^{(0)} = x_{i}\). \(N(v_{i} )\) is the set of neighbors of node \(v_{i}\). \({\text{MLP}}\) denotes a multi-layer perceptron. \(\lambda^{(l)}\) is either a learnable parameter or a fixed scalar. We stack \(L\) layers to obtain the higher-order structural features \({\mathbf{x}}_{i}^{(L)}\) of each node in the HISN.
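In the notation defined above, the standard GIN update of Xu et al. (2018) reads as follows; identifying it with Eq. (4) is an assumption, and the superscript on \({\text{MLP}}\) (one perceptron per layer) follows the usual GIN formulation rather than anything stated here.

\[
{\mathbf{x}}_{i}^{(l)} = {\text{MLP}}^{(l)} \left( \left( 1 + \lambda^{(l)} \right) {\mathbf{x}}_{i}^{(l-1)} + \sum\limits_{v_{j} \in N(v_{i})} {\mathbf{x}}_{j}^{(l-1)} \right), \quad l = 1, \ldots, L
\]

The unweighted sum over \(N(v_{i})\) is the sum-like aggregation, and \(\lambda^{(l)}\) controls how strongly a node’s own previous embedding is weighted relative to its neighbors.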
For each subgraph \(C_{k}\), we compute a subgraph-level representation \({\varvec{s}}_{k}\) to summarize the behavior of most of its nodes.
where \(n_{k}\) denotes the number of nodes in \(C_{k}\).
The subgraph-level representation is encoded into the node representations by optimizing the loss function \({\mathcal{L}}_{Stru}^{k}\), and the final loss function \({\mathcal{L}}_{Stru}\) is the average of the \(K\) subgraph losses.
where \({\mathcal{D}}\) is a discriminator that outputs the affinity score for each node-subgraph pair. A subgraph \(\widetilde{C}_{k}\) is generated by a row-wise shuffling of the initial feature matrix of \(C_{k}\), so that the node representation \(\widetilde{{\mathbf{x}}}_{i}^{(L)}\) can be paired with the subgraph representation \({\varvec{s}}_{k}\) as a negative sample.

(2) The raw attribute feature extraction method
Each node in the HISN is associated with a set of attributes. In this paper, we extract 23 behavioral features from the literature (Zhang et al. 2020b) as the raw attribute features of a reviewer. In addition, we extract 6 behavioral features from the literature (Rayana and Akoglu 2015) together with the proportion of each rating level of a product, for a total of 11 behavioral features, as the raw attribute features of a product. In particular, the numerical attributes are normalized, the categorical attributes are one-hot encoded, and all of them are concatenated as the raw attribute features of a node.
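The assembly of a raw attribute vector can be sketched as follows. The attribute names (`review_count`, `avg_rating`, `account_level`) are hypothetical placeholders rather than the behavioral features of the cited works, and min-max scaling is an assumed choice, since the text only states that numerical attributes are normalized.

```python
def min_max_normalize(column):
    """Scale a numeric column to [0, 1]; constant columns map to 0."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]


def one_hot(column):
    """One-hot encode a categorical column using sorted category order."""
    categories = sorted(set(column))
    return [[1.0 if v == c else 0.0 for c in categories] for v in column]


def build_attribute_features(numeric, categorical):
    """Concatenate normalized numeric and one-hot categorical attributes.

    numeric, categorical: dicts mapping attribute name -> per-node values.
    Returns one feature vector per node.
    """
    num_nodes = len(next(iter({**numeric, **categorical}.values())))
    features = [[] for _ in range(num_nodes)]
    for name in sorted(numeric):
        for row, value in zip(features, min_max_normalize(numeric[name])):
            row.append(value)
    for name in sorted(categorical):
        for row, encoding in zip(features, one_hot(categorical[name])):
            row.extend(encoding)
    return features


# Three hypothetical reviewers.
numeric = {"review_count": [2, 10, 6], "avg_rating": [5.0, 1.0, 3.0]}
categorical = {"account_level": ["new", "elite", "new"]}
Z = build_attribute_features(numeric, categorical)
```

Each row of `Z` holds the two scaled numeric attributes (in sorted name order) followed by the one-hot block for `account_level`.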
Partitioning and compact representation learning method
After obtaining the raw structural features \({\mathbf{x}}\) and attribute features \({\mathbf{z}}\), we divide the HISN into a reviewer partition and a product partition based on the meta-paths “RPR” and “PRP”, where two nodes connected within a partition are each other’s intra-partition neighbors. The same applies to the product partition as to the reviewer partition.
We feed the features \({\mathbf{x}}\) and \({\mathbf{z}}\) into two independent autoencoders to obtain the encodings \({\mathbf{x}}^{\prime}\) and \({\mathbf{z}}^{\prime}\) as well as the reconstructed vectors \({\hat{\mathbf{x}}}\) and \({\hat{\mathbf{z}}}\). We capture the attribute information and structural information of a reviewer \(v_{i}\) by minimizing the following reconstruction loss function.
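The meta-path-based partitioning can be illustrated with a short sketch that derives intra-partition neighbors from reviewer-product edges. The function name `metapath_neighbors`, its `side` parameter, and the toy edges are assumptions for illustration.

```python
from collections import defaultdict


def metapath_neighbors(edges, side="reviewer"):
    """Intra-partition neighbors under the meta-path RPR (side="reviewer")
    or PRP (side="product").

    edges: iterable of (reviewer, product) pairs.
    """
    products_of = defaultdict(set)
    reviewers_of = defaultdict(set)
    for reviewer, product in edges:
        products_of[reviewer].add(product)
        reviewers_of[product].add(reviewer)

    if side == "reviewer":
        source, bridge = products_of, reviewers_of
    else:
        source, bridge = reviewers_of, products_of

    neighbors = {}
    for node, intermediates in source.items():
        reached = set()
        for mid in intermediates:
            reached |= bridge[mid]   # second hop back into the same partition
        reached.discard(node)        # a node is not its own neighbor
        neighbors[node] = reached
    return neighbors


# Hypothetical reviewer-product edges.
edges = [("r1", "p1"), ("r2", "p1"), ("r2", "p2"), ("r3", "p2")]
rpr = metapath_neighbors(edges, side="reviewer")
prp = metapath_neighbors(edges, side="product")
```

In this example `r1` and `r3` are both RPR neighbors of `r2` but not of each other, because they share no product.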
Joint modeling
To bring two nodes with similar review behavior closer together in the embedding space, after obtaining the encodings \({\mathbf{x}}^{\prime}\) and \({\mathbf{z}}^{\prime}\), we jointly model the attribute encoding and the structure encoding to preserve the first-order proximity between nodes by optimizing the following loss function.
where \(a_{mn}\) denotes an element of the adjacency matrix of the synthesized intra-partition network, and \(\Omega_{n} (v)\) denotes the negative sampling distribution.
Finally, \({\mathbf{x}}^{\prime}\) and \({\mathbf{z}}^{\prime}\) are concatenated to obtain the final embedding \({\mathbf{h}}\), which is used for inter-partition proximity modeling.
Inter-partition proximity modeling
Inter-partition proximity captures the relationship between reviewers and products (i.e., the explicit relationship). For an edge \(\mathcal{E}_{{{\mathcal{S}\mathcal{G}}_{mn} }}\) between \(r_{m}\) and \(p_{n}\), we take the joint probability as the inter-partition proximity between them.
where \(\sigma\) denotes the sigmoid function, and \({\mathbf{h}}_{m}\) and \({\mathbf{h}}_{n}\) denote the final embeddings of \(r_{m}\) and \(p_{n}\), respectively.
The likelihood function of the joint probability is maximized by minimizing the following loss function.
where \(\Omega_{n} (v)\) denotes the negative sampling distribution.
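As a rough illustration of the inter-partition objective, the sketch below assumes the common form in which the joint probability of an observed reviewer-product edge is \(\sigma({\mathbf{h}}_{m}^{\top} {\mathbf{h}}_{n})\) and the loss is a negative log-likelihood with negative samples. The function names (`edge_probability`, `negative_sampling_loss`) and the exact loss form are assumptions made for illustration, not the authors' implementation.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def edge_probability(h_m, h_n):
    """Assumed joint probability of a reviewer-product edge: sigmoid(h_m . h_n)."""
    return sigmoid(dot(h_m, h_n))

def negative_sampling_loss(h_m, h_n, negatives):
    """Negative log-likelihood of one observed edge plus sampled non-edges.

    `negatives` holds embeddings of products drawn from the negative
    sampling distribution (a hypothetical input for this sketch).
    """
    loss = -math.log(edge_probability(h_m, h_n))
    for h_k in negatives:
        loss -= math.log(sigmoid(-dot(h_m, h_k)))
    return loss

h_reviewer = [0.5, 1.0]
h_product = [1.0, 0.5]      # product on an observed edge
h_negative = [-1.0, -0.5]   # product sampled as a non-edge
print(edge_probability(h_reviewer, h_product))
print(negative_sampling_loss(h_reviewer, h_product, [h_negative]))
```

Minimizing this quantity pushes the embeddings of linked reviewers and products together and those of sampled non-linked pairs apart.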
Latent correlation training strategy
Since structural information and attribute information are two different modalities, they provide complementary information. Moreover, they both describe the same network, implying that they have potential consistency. Therefore, we comprehensively consider their complementarity and consistency, which is called structure-attribute correlation (Huang et al. 2020).
To effectively preserve the attribute-structure correlation, two auxiliary space transformation kernels are used to transform the encodings into a new latent space and project them to obtain the latent representations \(\widetilde{{\mathbf{x}}}\) and \(\widetilde{{\mathbf{z}}}\) (Huang et al. 2020). The attribute-structure correlation of any two nodes is defined as the joint probability of their latent representations.
The likelihood function of the joint probability is maximized by minimizing the following loss function.
where \(\widetilde{p}(m,n)\) denotes the dynamic positive sampling distribution.
Ultimately, we combine all the optimization functions as a final objective function to optimize the embedding vector jointly.
We summarize the process of collaborative training in Algorithm 4. Lines 1–19 model the intra-partition proximity. In particular, lines 1–9 extract the raw structural and attribute features of nodes. Line 10 divides HISN into two partitions based on the meta-paths. Lines 13–14 perform compact feature learning. Line 15 performs joint modeling to preserve the first-order proximity between nodes within the same partition. Lines 16–19 model the correlation between attribute and structural information. Lines 20–21 model the inter-partition proximity.
The DBSCAN-based candidate group generation method
After obtaining the feature representation of nodes, we utilize the DBSCAN clustering method to find candidate spammer groups in the reviewers’ embedding space. The reasons for choosing the DBSCAN clustering method are that: (1) it can generate groups adaptively without the need for an artificially predefined number of groups; (2) it can discover groups of arbitrary shapes; and (3) it can find abnormal points in the process of mining groups (Ester et al. 1996). Algorithm 5 describes the specific process of the method.
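The three properties listed above can be seen in a minimal, pure-Python DBSCAN sketch. It is a generic textbook version, not a reproduction of Algorithm 5; the toy two-dimensional embeddings, the helper names, and the parameter values are assumptions chosen for illustration, and `min_pts` counts the point itself.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point (0, 1, ... or -1 for noise)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # may later be claimed as a border point
            continue
        cluster += 1                     # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels

def candidate_groups(labels):
    """Group point indices by cluster label, dropping noise points."""
    groups = {}
    for idx, label in enumerate(labels):
        if label != -1:
            groups.setdefault(label, []).append(idx)
    return groups

# Toy reviewer embeddings: two tight groups and one isolated reviewer.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
labels = dbscan(embeddings, eps=0.5, min_pts=2)
print(candidate_groups(labels))
```

The number of groups is not fixed in advance, and the isolated reviewer is reported as noise rather than being forced into a group.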
The group purification and ranking method
As some genuine reviewers who coincidentally post reviews may be mixed into the detected candidate groups and misjudged as spammers, we should filter out these innocent individuals. Therefore, we use the group purification method adopted by Zhang et al. (2022a), which can be applied to HIN. The basic steps of the method are as follows. First, we calculate the contrast suspiciousness metric. Specifically, we extract the reviewer-product bipartite graph of each candidate group from the original review dataset. Based on the heterogeneous structure graph of candidate groups, we calculate the contrast suspiciousness metric in terms of the structural characteristics of groups, rating time characteristics, and rating distribution characteristics. Secondly, we purify and rank the candidate groups. In particular, we define the spamicity (degree of spam) of an individual reviewer and the spamicity of a group according to the contrast suspiciousness metric. We then rank the candidate groups according to their spamicities to obtain spammer groups.
Contrast suspiciousness metric calculation method
Based on the generated candidate groups, we first construct a heterogeneous structure graph of candidate groups, which is defined as follows.
Definition 4
Heterogeneous structure graph of candidate groups.
The heterogeneous structure graph of candidate groups is defined as \(BiG = (U,V,E)\), where \(U\) denotes all members of the candidate groups and \(V\) denotes the set of products reviewed by these members from the original review dataset. Notably, if a member writes multiple reviews on a product, there are multiple edges between them, each of which is associated with a rating and a timestamp.
In real life, to reduce the cost of attacks (e.g., time), a group of suspicious reviewers \(A \subset U\) tends to collectively and actively write reviews on a set of products \(B \subset V\) in a short period. Therefore, the density score \(D(A,B)\) can be used to measure the extent to which \(A\) collectively reviews the set of products \(B\) (Liu et al. 2018).
where \(f_{A} (v_{i} )\) denotes the total edge frequency from \(A\) to a product \(v_{i}\) in \(B\), \(\sigma_{ji}\) denotes the global suspiciousness of an edge, and \(e_{ji}\) refers to the number of edges between \(u_{j}\) and \(v_{i}\).
To maximize \(D(A,B)\), \(A\) and \(B\) are mutually dependent. As a result, we introduce the definition of contrast suspiciousness.
Definition 5
Contrast suspiciousness (Liu et al. 2018).
The contrast suspiciousness, denoted as \(P(v_{i} \mid A)\), is defined as the conditional probability that a node \(v_{i}\) belongs to \(B\), given \(A\). The values of contrast suspiciousness are proportional to \(q(\alpha_{i} )\), \(q(\beta_{i} )\) and \(q(\gamma_{i} )\). These values are calculated as follows.

(1) Topology
A product is suspicious if it is only reviewed by members in \(A\) and rarely by other members (Liu et al. 2018). From the topology perspective, the contrast suspiciousness satisfies Eq. (16).
where \(\alpha_{i} \in [0,1]\) is the involvement ratio of members in \(A\) in the spam activity of a product \(v_{i}\), and \(f_{U} (v_{i} )\) is the weighted in-degree of \(v_{i}\), defined similarly to \(f_{A} (v_{i} )\), with the edges weighted by global suspiciousness. \(q( \cdot )\) is a scale function chosen in the exponential form \(q(x) = b^{x - 1}\), where \(b = 32\).

(2) Temporal bursts and drops
Let the time series of a product \(v_{i}\) be \({\text{T}} = \{ (t_{0} ,c_{0} ),(t_{1} ,c_{1} ),...,(t_{e} ,c_{e} )\}\), where \(c_{i}\) is the number of timestamps in the time box \([t_{i} - \Delta t/2,t_{i} + \Delta t/2)\) and \(\Delta t\) is the box size. The point with the maximum value \(c_{m}\) is set as the bursting point, i.e., \((t_{m} ,c_{m} )\). The awakening point of the burst is defined as the point along the time series \({\text{T}}\) whose distance from \(l\) (the auxiliary straight line from the start point to the bursting point) is greatest. This paper uses the MultiBurst method (Liu et al. 2018) to find the sub-burst points and associated awakening points of multiple bursts. From the perspective of rating time, the contrast suspiciousness satisfies Eq. (17).
where \(\beta_{i} \in [0,1]\) is the involvement ratio of members in \(A\) in multiple bursts, \(T_{A}\) is the collection of timestamps from members in \(A\) to \(v_{i}\), \(T_{U}\) is the collection of timestamps from all members to \(v_{i}\), \(\Delta c_{am}\) is the height difference of a burst-awakening point pair, and \(s_{am}\) is the slope of a burst-awakening point pair.
To capture the sudden drop pattern after an attack, we draw another auxiliary straight line from the highest point \((t_{m} ,c_{m} )\) to the last point \((t_{e} ,c_{e} )\). The point of death \((t_{d} ,c_{d} )\) (i.e., the end of the drop) is found by maximizing the distance to this straight line. We use the MaxDrop method (Liu et al. 2018) to find the maximum drop and its slope. The product of the maximum drop and its slope is used in Eq. (15) to measure the global suspiciousness of an edge.
where \(\Delta c_{bd}\) is the fall of maximum drop, and \(s_{bd}\) is the slope of the maximum drop.

(3) Rating deviation and aggregation
For each product, we use the \(KL\)-divergence between the rating distribution of members in \(A\) and that of the other members in \(U\backslash A\) to calculate the rating deviation and weight it by a balancing factor. From the perspective of rating distribution, the contrast suspiciousness satisfies Eq. (19).
where \(k\) denotes the rating category, \(F_{k} (v_{i} )\) denotes the frequency with which members in \(A\) rated product \(v_{i}\) with category \(k\) scores, and \(F_{k}^{^{\prime}} (v_{i} )\) denotes the frequency with which other members \(U\backslash A\) rated product \(v_{i}\) with category \(k\) scores.
Ultimately, we use joint probability to aggregate the three signals above to obtain the contrast suspiciousness metric.
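The building blocks of the three signals can be sketched as follows: the exponential scale function \(q(x) = b^{x - 1}\), the search for a bursting point and its awakening point, and the \(KL\)-divergence between two rating-frequency vectors. This is an illustrative sketch only. The additive smoothing in `kl_divergence` and the product form used in `contrast_suspiciousness` are assumptions, the balancing factor for the rating deviation is omitted, and all function names are hypothetical.

```python
import math

def scale(x, b=32.0):
    """Scale function q(x) = b**(x - 1); maps [0, 1] onto [1/b, 1]."""
    return b ** (x - 1.0)

def burst_and_awakening(series):
    """Return (burst_index, awakening_index) for a time-sorted list of (t, c).

    The bursting point has the maximum count. The awakening point is the
    point up to the burst that lies farthest from the straight line joining
    the first point and the bursting point.
    """
    m = max(range(len(series)), key=lambda i: series[i][1])
    (t0, c0), (tm, cm) = series[0], series[m]
    norm = math.hypot(tm - t0, cm - c0) or 1.0   # guard against a zero-length line

    def distance(i):
        t, c = series[i]
        return abs((cm - c0) * (t - t0) - (tm - t0) * (c - c0)) / norm

    return m, max(range(m + 1), key=distance)

def kl_divergence(freq_a, freq_other, smoothing=1e-6):
    """KL(P_A || P_other) over rating categories, from raw frequency lists.

    Additive smoothing (an assumption of this sketch) avoids log(0).
    """
    p = [f + smoothing for f in freq_a]
    q = [f + smoothing for f in freq_other]
    sp, sq = sum(p), sum(q)
    return sum((x / sp) * math.log((x / sp) / (y / sq)) for x, y in zip(p, q))

def contrast_suspiciousness(alpha, beta, gamma, b=32.0):
    """Assumed aggregation: product of the three scaled signals."""
    return scale(alpha, b) * scale(beta, b) * scale(gamma, b)

# Toy review-count series for one product: quiet, then a sharp burst.
series = [(0, 1), (1, 1), (2, 2), (3, 9), (4, 3)]
print(burst_and_awakening(series))
# Group members give only 5-star ratings; other reviewers are spread out.
print(kl_divergence([0, 0, 0, 0, 10], [3, 4, 6, 5, 2]))
print(contrast_suspiciousness(1.0, 0.5, 0.5))
```

With \(b = 32\), a product whose three ratios all equal 1 receives the maximum value 1, and each ratio that drops to 0 reduces the value by a factor of 32.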
The candidate group purification and ranking method
Based on the contrast suspiciousness metric, the spamicity for a reviewer can be calculated according to Eq. (21).
where \(\sigma_{ji}\) is the global suspiciousness of an edge, \(e_{ji}\) is the number of edges between \(u_{j}\) and \(v_{i}\), and \(P\left( v_{i} \mid A \right)\) is the contrast suspiciousness.
To increase the association of a candidate group \(A\) with the set of reviewed products \(B\), we use the expectation of the density score \(D\left( {A,B} \right)\) over the probability \(P(v_{i} \mid A)\) as the spamicity of a group. The objective function is defined according to Eq. (22).
We describe the specific steps for group purification and ranking in Algorithm 6. The algorithm takes one candidate group \(A\) in \(Candidate\_Group\) at a time as the input, and uses a priority tree to efficiently find the reviewer with the lowest spamicity in \(A\) and remove it. After each removal, the contrast suspiciousness changes, and the reviewers’ spamicities are updated. The algorithm shrinks \(A\) until \(A\) is empty and keeps the subset \(A^{*}\) that maximizes the value of the objective function. Each \(A^{*}\) with a group size greater than or equal to 2 is placed into \(Spammer\_Group\). Based on spamicities, we rank the groups in \(Spammer\_Group\). Finally, the algorithm returns the top 300 most suspicious spammer groups.
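The greedy peeling idea behind the purification step can be sketched as below. This is a simplified illustration rather than Algorithm 6 itself: `spamicity` and `objective` are caller-supplied, hypothetical stand-ins for Eqs. (21) and (22), a linear `min` scan replaces the priority tree, and the loop stops at one member instead of the empty set so that the toy objective is always defined.

```python
def purify_group(group, spamicity, objective):
    """Greedy peeling: repeatedly drop the least suspicious reviewer.

    `spamicity(u, members)` scores reviewer u within the current members,
    and `objective(members)` scores the whole group. Both are hypothetical
    callables supplied by the caller.
    Returns the best subset seen and its objective value.
    """
    members = set(group)
    best, best_value = set(members), objective(members)
    while len(members) > 1:
        # Recomputing scores on every pass reflects that spamicities change
        # after each removal; the reviewer id breaks ties deterministically.
        weakest = min(members, key=lambda u: (spamicity(u, members), u))
        members.remove(weakest)
        value = objective(members)
        if value > best_value:
            best, best_value = set(members), value
    return best, best_value

# Toy data: total suspicious edge weight contributed by each reviewer.
weights = {"u1": 5.0, "u2": 4.0, "u3": 0.5}

def toy_spamicity(u, members):
    return weights[u]

def toy_objective(members):
    # Density-like toy score: total weight over (|A| + 1 product).
    return sum(weights[u] for u in members) / (len(members) + 1)

purified, value = purify_group(weights, toy_spamicity, toy_objective)
print(sorted(purified), value)
```

In this toy run the low-weight reviewer `u3` is peeled off, and the remaining pair is kept because removing a further member would lower the objective.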
Experiments
Dataset and human labeling
As there is no ground truth for spammer groups in the e-commerce field, we need to label the datasets to compare the performance of the spammer group detection methods. In this subsection, we first introduce the dataset used in the experiments and then detail the method for manually labeling the dataset.
Dataset
In our experiments, we use the unlabeled AmazonBooks review dataset. AmazonBooks is a dataset of book reviews from 1993 to 2014, which includes 22,507,155 reviews, 8,026,324 reviewers, and 2,330,066 products. Due to the large amount of review data, we extracted only the data from 2013 for our experiments, following the GSDB method (Ji et al. 2020). This yields 6,990,316 reviews, 2,998,380 reviewers, and 1,079,741 products. Table 2 shows the statistics of the dataset.
Human labeling
The problem of spammer group detection is very challenging because no standard datasets with group labels are available for model building or method evaluation. Although our SGDCTH method is completely unsupervised and does not require any labels in its implementation, we need to obtain labels for the final groups to analyze the impact of parameter values on group detection performance and to compare performance with the baseline methods. Therefore, we hired three graduate student experts in the e-commerce environment to manually label the top 300 spammer groups generated by each detection method and take these labels as ground truth.
Specifically, we use five individual spam indicators used by Ji et al. (2020) to label groups output by Algorithm 5, including Rating deviation (RD), Ratio of Extreme Rating (EXR), The Most Reviews One-day (MRO), Account Duration (AD), and Active time interval reviews (ATR). The human labeling method is divided into three steps. First, each group member is assigned 1 point for each spam judgment, 0.5 points for each borderline judgment, and 0 points for each non-spam judgment. Secondly, we calculate each group’s total spamicity and average spamicity score according to the labels of its members. Thirdly, if the average spamicity score of a group is greater than or equal to 2/3, then the group is labeled as a spammer group.
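The three labeling steps can be sketched as follows. The sketch assumes that each member's points are averaged over the annotators so that a member score lies in [0, 1], and that the group average is taken over members; this normalization, the judgment strings, and the function names are assumptions, since the text does not spell them out.

```python
POINTS = {"spam": 1.0, "borderline": 0.5, "non-spam": 0.0}

def member_score(judgments):
    """Mean of the points one reviewer received from the annotators."""
    return sum(POINTS[j] for j in judgments) / len(judgments)

def label_group(group_judgments, threshold=2 / 3):
    """Label a group from its members' annotator judgments.

    `group_judgments` maps each member to the list of judgments it received.
    Returns (total_spamicity, average_spamicity, is_spammer_group).
    """
    scores = [member_score(j) for j in group_judgments.values()]
    total = sum(scores)
    average = total / len(scores)
    return total, average, average >= threshold

# Hypothetical group of three reviewers, each judged by three annotators.
group = {
    "r1": ["spam", "spam", "spam"],
    "r2": ["spam", "borderline", "spam"],
    "r3": ["borderline", "non-spam", "spam"],
}
print(label_group(group))
```

Under these assumptions the example group has an average score of about 0.78, which is above the 2/3 threshold, so it would be labeled as a spammer group.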
Baselines, evaluation metrics, and experimental setting
Baselines
To evaluate the performance of our method, we compare it with four classical unsupervised spammer group detection methods.

(1) GSDB (Ji et al. 2020). A review burst-based spammer group detection method. From the viewpoint of a single product, GSDB uses the Kernel Density Estimation (KDE) method to generate candidate groups from the review bursts of target products and further purifies and classifies the candidate groups to obtain spammer groups. The similarity between SGDCTH and GSDB is that both detect spammer groups from the viewpoint of products. The difference is that the SGDCTH method detects spammer groups for cross-product attacks and considers the structure-attribute correlation.

(2) GSBC (Wang et al. 2018). A graph-based spammer group detection method that introduces a top-down computational framework to identify spammer groups using the topology of a reviewer graph. The similarity between SGDCTH and GSBC is that both are graph-based methods. The difference is that the SGDCTH method takes products as the entry point and constructs a heterogeneous network to detect spammer groups.

(3) GroupStrainer (Ye and Akoglu 2015). A two-step method for discovering target products and spammer groups in a heterogeneous network. The similarity between SGDCTH and GroupStrainer is that both detect spammer groups in heterogeneous networks. The difference is that the SGDCTH method considers the structure-attribute correlation and uses the group purification method to further improve detection performance.

(4) HoloScope (Liu et al. 2018). A method that uses the Singular Value Decomposition (SVD) method in a heterogeneous network to detect dense subgraphs. The similarity between SGDCTH and HoloScope is that both detect spammer groups in heterogeneous networks. The difference is that the SGDCTH method works from the viewpoint of products and considers the structure-attribute correlation.
Evaluation metrics
As in previous work (Ji et al. 2020; Zhang et al. 2021, 2022a; Wang et al. 2016, 2018), we use precision, recall, and F1 values as evaluation criteria, which are defined according to Eqs. (23), (24), and (25).
where \(TP\) (True Positive) represents the number of spammer groups that are correctly detected, \(FP\) (False Positive) represents the number of genuine groups that are misjudged as spammer groups, and \(FN\) (False Negative) represents the number of spammer groups that are not identified.
Precision reflects the number of correctly detected spammer groups as a percentage of the total number of groups predicted to be spammer groups, with larger values indicating better detection precision. Recall indicates the number of correctly detected spammer groups as a proportion of the total number of actual spammer groups, with larger values indicating better detection performance. The F1 value is the harmonic mean of precision and recall, reflecting the overall performance of the spammer group detection algorithm.
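The three metrics follow their standard definitions, which can be computed directly from the counts \(TP\), \(FP\), and \(FN\). The counts in the example are made up for illustration and are not results from this paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Illustrative counts only: 240 correct detections among 300 predicted groups,
# with 80 spammer groups missed.
p, r, f1 = precision_recall_f1(tp=240, fp=60, fn=80)
print(p, r, f1)
```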
Experimental settings
We designed three sets of experiments based on the 2013 portion of the AmazonBooks dataset. The first set of experiments aims to analyze the impact of parameter values on the group detection performance of our method. The second set of experiments aims to evaluate the performance of our method by comparison with baseline methods, and includes two analyses. The first is an analysis of the precision, recall, and F1 values of the group detection methods, and the second is a comparative analysis of the time complexity of SGDCTH and the baseline methods. Together they aim to verify the effectiveness and efficiency of our method. In the third set of experiments, we designed seven variants of SGDCTH to verify the necessity of considering all available information and of the group purification step.
The SGDCTH method involves three parameters that need to be verified, i.e., the target product filtration threshold, the neighborhood radius threshold, and the minimum number of sample points threshold. The target product filtration threshold \(\delta_{{\widetilde{p}}}\) is between 0.5 and 0.7. To generate a suitable number of candidate groups, we select the neighborhood radius threshold ϵ from {0.4, 0.5, 0.6, 0.7, 0.8} and the minimum number of sample points threshold \(\phi\) from {2, 3, 4} for experimental verification. In addition, we randomly initialize model parameters with a standard Xavier normal distribution (Glorot and Bengio 2010) and optimize the model with Adam (Kingma and Ba 2014). The number of GNN layers \(L\) is 2, the learning rate \(lr\) is 0.01, \(\lambda\) is set as 0 in Eq. (4), and the dimension of the raw structural features \({\mathbf{x}}\) is set as 128. We list the parameters of the structural autoencoder and the attribute autoencoder for the two partitions in Table 3. The transformation kernel is set as 64–16. The number of epochs \(t\) is set as 50. The number of dynamic samples is set as 5. The dimension of the final embedding vector \(d\) is set as 128.
For the GSDB method, we set the target product filtration threshold \(\delta_{p}\) as 0.1, the individual spamicity threshold \(\delta_{I}\) as 0.43, and the group spamicity threshold \(\delta_{G}\) as 0.57 to obtain a total of 320 groups. For the GSBC method, we set the co-review time window size \(\tau\) as 30, the edge weight threshold \(\delta\) as 0.1, the user-specified parameter \(MP\) as 1000, and the group spamicity threshold \(\delta_{G}\) as 0.53 to obtain a total of 325 groups. For the GroupStrainer method, the target product filtration threshold \(\delta_{p}\) is set as 0.75 to filter out target products with high suspicion. For the HoloScope method, the scaling base \(b\) is set to 32. In summary, we detail the parameter settings for each method in Table 4.
Results and analysis of parameter selection
Based on the parameter settings listed in Table 4, we perform the first set of experiments for parameter selection. In this section, we mainly explore the impacts of the target product filtration threshold, the neighborhood radius threshold, and the minimum number of sample points threshold on the detection performance of our method. Notably, when discussing one parameter, other parameters will be set to their best value.
Results and analysis of the target product filtration threshold
We draw a histogram of the NFS value distribution of products, as shown in Fig. 2. The frequencies of products with different NFS values show a skewed distribution, with the NFS values of most products concentrated between about 0.5 and 0.7. Since the Amazon dataset is relatively dense with reviews, too small a target product filtration threshold \(\delta_{{\tilde{p}}}\) will increase the time complexity of our method, and too large a value will discard some reviewers in the process of constructing the graph, which negatively affects the algorithm’s detection performance. In our experiments, we find through interpolation analysis that products are vulnerable to attack when the target product filtration threshold \(\delta_{{\widetilde{p}}} > 0.65\). Therefore, we set it as 0.65 and finally obtain 7027 target products.
Results and analysis of the neighborhood radius threshold
Our experiments found that the number of detected spammer groups decreases gradually as the value of ϵ increases. To generate a number of groups comparable to that of the baseline methods, the neighborhood radius threshold ϵ is set as {0.4, 0.5, 0.6, 0.7, 0.8}. The impact of ϵ on the detection performance of our method is explored below.
Figure 3 shows that the precision and F1 values of our method gradually improve as the value of ϵ increases. When the neighborhood radius ϵ is set to 0.6, the precision and F1 values of the SGDCTH method are the highest for the top 300 groups, and the recall curve changes more gently. Beyond that, the precision of our method gradually decreases, as does the number of groups generated. When the neighborhood radius ϵ is set to 0.8, only 232 spammer groups are generated.
Results and analysis of the minimum number of sample points threshold
Consistent with the neighborhood radius threshold ϵ, the larger the value of the minimum number of sample points threshold, the smaller the number of discovered groups. Therefore, we set the minimum number of sample points threshold \(\phi\) as {2, 3, 4} to further explore the impact of parameter \(\phi\) on the detection performance of our method.
Figure 4 shows that the precision of SGDCTH gradually decreases as the value of \(\phi\) increases. The precision of SGDCTH is lowest when \(\phi\) is set as 4, and only 215 spammer groups are obtained. When \(\phi\) is set as 2, the F1 value is lower for approximately the top 160 groups, but after that it is higher than when \(\phi\) is set as 3 or 4. On the whole, the detection performance of our method for the top 300 groups is better when the minimum number of sample points threshold \(\phi\) is set as 2.
Results and comparison analysis for the group detection method
We implemented a second set of experiments to compare the performance of our method with baseline methods. The analysis is mainly carried out in two aspects, i.e., the comparative analysis of the precision, recall, and F1 values and the time complexity.
Results and comparison analysis of the precision, recall, and F1 values
Based on the manual labeling of the top 300 groups detected by the GSDB, GSBC, GroupStrainer, HoloScope, and SGDCTH methods, we compare the precision, recall, and F1 values of SGDCTH with those of the baseline methods, as shown in Fig. 5.
Figure 5a shows that the precision curves of SGDCTH, GroupStrainer, and HoloScope consistently outperform that of the GSBC method, indicating that the heterogeneous graph-based method is capable of digging deeper into the implicit relationships among reviewers than the homogeneous graph-based method to capture spam groups with suspected collusion. Moreover, the precision curve of SGDCTH consistently outperforms those of GroupStrainer and HoloScope, indicating the effectiveness of comprehensively considering the structural and attribute features of nodes. The precision curves of SGDCTH and GSDB cross at about the 185th group, before which GSDB outperforms SGDCTH, but after which SGDCTH outperforms GSDB. This is because GSDB only detects spammer groups in review bursts of a single target product, while SGDCTH can detect spammer groups that attack across multiple products, which is closer to the attack method of spammer groups in real life.
As can be seen in Fig. 5b, for about the top 130 groups, the recall curve of SGDCTH is slightly lower than that of GroupStrainer, which may be because our method did not detect some spam groups that evaded detection through camouflage. However, after that, the recall curve of SGDCTH is better than those of GroupStrainer and HoloScope. Overall, the SGDCTH method seems to have the smoothest recall curve.
Figure 5c shows the F1 value obtained by combining precision and recall. The F1 value curve of each method is monotonically increasing. After about the 120th group, the F1 value of SGDCTH is better than those of the GSBC, GroupStrainer, and HoloScope methods. Finally, it surpasses the GSDB method, which illustrates the superiority of cross-product detection and collaborative training methods.
From this, we can draw two conclusions. (1) The SGDCTH, GroupStrainer, and HoloScope methods based on the heterogeneous network are better than the GSBC method based on the homogeneous network. (2) The SGDCTH method, which detects spammer groups in cross-product attacks, is more consistent with real-life spammer group attacks than the GSDB method, which detects spammer groups from the review bursts of a single product.
Comparison analysis of the time complexity
We compare and analyze the time complexity of the GSDB, GSBC, GroupStrainer, HoloScope, and SGDCTH methods, as shown in Table 5.
The GSDB method uses the KDE method to generate candidate groups in the review bursts of a single product with time complexity \(O(n^{2} )\). The time complexity of the target product filtration and of the group purification and classification are both \(O(n)\), resulting in the total time complexity \(O(n^{2} )\). The GSBC method uses three loop levels to construct a reviewer relationship graph with time complexity \(O(n^{{3}} )\). The group generation and detection stage uses the min-cut method in one level of loops with a time complexity of \(O(n^{{3}} )\), so the total time complexity is \(O(n^{{3}} )\). The GroupStrainer method consists of two stages, i.e., target product detection and spammer group generation. Each stage has a time complexity \(O(n^{2} )\), so the total time complexity is \(O(n^{2} )\). For the HoloScope method, the time complexity of constructing the heterogeneous graph is \(O(n^{2} )\), and the time complexity of detecting dense blocks using SVD is \(O(n\log n)\), so the total time complexity is \(O(n^{2} )\). For the SGDCTH method, the time complexity of the target product filtration is \(O(n^{2} )\), the time complexity of constructing the heterogeneous graph is \(O(n^{2} )\), the time complexity of node feature representation learning is \(O(n\log n)\), the time complexity of candidate group generation is \(O(n^{2} )\), and the time complexity of the group purification and ranking stage is \(O(n)\), so the total time complexity is \(O(n^{2} )\).
Overall, the total time complexity of GSBC is \(O(n^{{3}} )\), while the total time complexity of each of the GSDB, GroupStrainer, HoloScope, and SGDCTH methods is \(O(n^{2} )\). In addition, since the SGDCTH method first filters target products vulnerable to attack by spammers, it focuses on the review data closely related to target products, which greatly improves the algorithm’s efficiency.
Analysis of ablation
We conduct an ablation analysis to evaluate our method and configure SGDCTH to the following settings.

(1) SGDCTH_TP. A variant of our method that filters target products using the behavioral metric of Ji et al. (2020), which combines the abnormal distributions of product rating and product average rating.

(2) SGDCTH_Stru. A variant of our method that only utilizes structural features, but ignores attribute and structure-attribute correlation features.

(3) SGDCTH_Attr. A variant of our method that only utilizes attribute features, but ignores structural and structure-attribute correlation features.

(4) SGDCTH_Stru+Attr. A variant of our method that utilizes structural and attribute features, but ignores structure-attribute correlation features.

(5) SGDCTH_DPC. A variant of our method that utilizes the Density Peaks Clustering (DPC) method to discover candidate groups in the vector space of reviewers.

(6) SGDCTH_KMeans. A variant of our method that utilizes the K-Means clustering method to discover candidate groups in the vector space of reviewers.

(7) SGDCTH_No purification. A variant of our method that skips the group purification step.
We analyze the precision, recall, and F1 values for SGDCTH and its seven variants, as shown in Fig. 6. The precision curve in Fig. 6a shows that SGDCTH achieves the best performance. This is because it comprehensively considers structure, attribute, and structure-attribute correlation features when detecting spammer groups. Furthermore, it uses a more robust NFS metric to filter target products and a group purification method to filter innocent members of candidate groups generated by the DBSCAN method, which further improves the performance of SGDCTH. SGDCTH_TP shows inferior performance to SGDCTH, indicating that the NFS metric is more robust to evasion than behavioral metrics (Ye and Akoglu 2015). In the precision curve between the 115th and 240th groups, SGDCTH_Stru and SGDCTH_Attr show inferior performance to SGDCTH_Stru+Attr, which indicates the necessity of considering structure and attribute features comprehensively. The performance of SGDCTH_Stru+Attr is inferior to that of SGDCTH, indicating the importance of considering the structure-attribute correlation features. The detection performance of SGDCTH_DPC and SGDCTH_KMeans is inferior to that of SGDCTH, which indicates that DBSCAN is better than DPC and K-Means at discovering spammer groups that attack target products separately. In addition, SGDCTH_No purification shows inferior performance to SGDCTH, indicating the necessity of group purification.
The recall curve in Fig. 6b shows that SGDCTH_Attr and SGDCTH_Stru+Attr have a higher recall, while SGDCTH_Stru has a lower recall. This indicates that the attribute features of nodes are somewhat adversarial. Spammers are prone to disguising their relationship with other members of groups (i.e., structural features) to evade the detector.
The F1 curve in Fig. 6c shows that SGDCTH_Attr, SGDCTH_Stru, and SGDCTH_KMeans have the worst performance, indicating that all available information and a better clustering method should be considered when designing the detector to enhance its robustness.
Conclusion
Online fake reviews have increasingly become a real threat to e-commerce evaluation and reputation systems, and detecting spammer groups is key to ensuring the credibility of review information on e-commerce websites. This paper proposes a collaborative training-based algorithm for detecting spammer groups in a heterogeneous network, called SGDCTH. It greatly reduces the algorithm’s time complexity by filtering target products vulnerable to spammer attacks from the products’ viewpoint. To effectively learn low-dimensional vector representations of nodes, the SGDCTH method uses a collaborative training method to model the intra-partition and inter-partition proximity of a heterogeneous network, which considers the structure-attribute correlation. This enables our method to detect suspicious spammer groups in e-commerce in terms of both structure and attributes. Since genuine reviewers are easily mixed into the detected candidate groups and misjudged by the detector as spammers, SGDCTH utilizes the group purification method to filter the innocent individuals in candidate groups. This further improves the performance of our SGDCTH method.
Although our SGDCTH method achieves good performance, there is still room for improvement. For example, we use the DBSCAN clustering method to generate candidate groups, but we need to manually set two thresholds. We will explore a method to automatically learn these two thresholds to generate higher-quality groups. Future work also includes designing methods to learn node features more efficiently in heterogeneous networks, as well as simulating the attack patterns of spammer groups to write fake reviews and injecting these data into real datasets to evaluate the performance of the detection method.
Availability of data and materials
When certain data sharing requirements are met, the data is available upon request. Such requests should be sent to the first author of this paper.
Abbreviations
FIM: Frequent item mining
NFS: Network footprint score
B&C: Group behavior and content
S: Group structure
B&C+S: Group behavior and structure
HIN: Heterogeneous information network
HISN: Heterogeneous induced subnetwork
KDE: Kernel density estimation
SVD: Singular value decomposition
RD: Rating deviation
EXR: Ratio of extreme rating
MRO: Most reviews one-day
AD: Account duration
ATR: Active time interval reviews
References
Akoglu L, Chandy R, Faloutsos C (2013) Opinion fraud detection in online reviews by network effects. In: Proceedings of the international AAAI conference on web and social media, vol 7, 1st edn. pp 2–11
Cao N, Ji S, Chiu DK, He M, Sun X (2020) A deceptive review detection framework: combination of coarse and fine-grained features. Expert Syst Appl 156:113465
Cao N, Ji S, Chiu DK, Gong M (2022) A deceptive reviews detection model: separated training of multi-feature learning and classification. Expert Syst Appl 187:115977
Chao J, Zhao C, Zhang F (2022) Network embedding-based approach for detecting collusive spamming groups on E-commerce platforms. In: Security and communication networks, pp 1–13
Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD, vol 96, 34th edn. pp 226–231
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, pp 249–256
Hu Y (2021) Unsupervised learning for spammer group detection based on network representation. Univ Electron Sci Technol China. https://doi.org/10.27005/d.cnki.gdzku.2021.000829
Huang W, Li Y, Fang Y, Fan J, Yang H (2020) BiANE: Bipartite attributed network embedding. In: Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval. pp 149–158
Ji SJ, Zhang Q, Li J, Chiu DK, Xu S, Yi L, Gong M (2020) A burst-based unsupervised method for detecting review spammer groups. Inf Sci 536:454–469
Jindal N, Liu B (2008) Opinion spam and analysis. In: Proceedings of the 2008 International Conference on Web Search and Data Mining. pp 219–230
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Li FH, Huang M, Yang Y, Zhu X (2011) Learning to identify review spam. In: Twenty-second international joint conference on artificial intelligence
Li H, Fei G, Wang S, Liu B, Shao W, Mukherjee A, Shao J (2017) Bimodal distribution and co-bursting in review spam detection. In: Proceedings of the 26th international conference on World Wide Web. pp 1063–1072
Liu S, Hooi B, Faloutsos C (2018) A contrast metric for fraud detection in rich graphs. IEEE Trans Knowl Data Eng 31(12):2235–2248
Luca M (2016) Reviews, reputation, and revenue: the case of Yelp.com. (March 15, 2016). Harvard Business School NOM Unit Working Paper, (12-016)
Mukherjee A, Liu B, Glance N (2012) Spotting fake reviewer groups in consumer reviews. In: Proceedings of the 21st International Conference on World Wide Web. pp 191–200
Mukherjee A, Kumar A, Liu B, Wang J, Hsu M, Castellanos M, Ghosh R (2013) Spotting opinion spammers using behavioral footprints. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining. pp 632–640
Ott M, Choi Y, Cardie C, Hancock JT (2011) Finding deceptive opinion spam by any stretch of the imagination. arXiv preprint arXiv:1107.4557
Rayana S, Akoglu L (2015) Collective opinion spam detection: Bridging review networks and metadata. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining. pp 985–994
Shehnepoor S, Togneri R, Liu W, Bennamoun M (2021) HIN-RNN: a graph representation learning neural network for fraudster group detection with no handcrafted features. In: IEEE transactions on neural networks and learning systems. pp 1–14
Shehnepoor S, Togneri R, Liu W, Bennamoun M (2022) Spatiotemporal graph representation learning for fraudster group detection. In: IEEE transactions on neural networks and learning systems. pp 1–15
Wang G, Xie S, Liu B, Yu PS (2012) Identify online store review spammers via social review graph. ACM Trans Intell Syst Technol (TIST) 3(4):1–21
Wang Z, Hou T, Song D, Li Z, Kong T (2016) Detecting review spammer groups via bipartite graph projection. Comput J 59(6):861–874
Wang Z, Gu S, Zhao X, Xu X (2018) Graph-based review spammer group detection. Knowl Inf Syst 55(3):571–597
Wang J, Guo Y, Wen X, Wang Z, Li Z, Tang M (2020) Improving graph-based label propagation algorithm with group partition for fraud detection. Appl Intell 50(10):3291–3300
Wang X, Liu N, Han H, Shi C (2021a) Self-supervised heterogeneous graph neural network with co-contrastive learning. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining. pp 1726–1736
Wang Y, Zhang J, Guo S, Yin H, Li C, Chen H (2021b) Decoupling representation learning and classification for GNN-based anomaly detection. In: Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval. pp 1239–1248
Wang X, Bo D, Shi C, Fan S, Ye Y, Philip SY (2022) A survey on heterogeneous graph embedding: methods, techniques, applications and sources. IEEE Trans Big Data 9(2):415–436
Xu C, Zhang J, Chang K, Long C (2013) Uncovering collusive spammers in Chinese review websites. In: Proceedings of the 22nd ACM international conference on information & knowledge management. pp 979–988
Xu K, Hu W, Leskovec J, Jegelka S (2018) How powerful are graph neural networks? arXiv preprint arXiv:1810.00826
Ye J, Akoglu L (2015) Discovering opinion spammer groups by network footprints. In: Machine learning and knowledge discovery in databases: European conference, ECML PKDD 2015, Porto, Portugal, September 7–11, 2015, Proceedings, Part I 15. pp 267–282
Zhang F, Hao X, Chao J, Yuan S (2020a) Label propagation-based approach for detecting review spammer groups on e-commerce websites. Knowl-Based Syst 193:105520
Zhang Y, Li Y, Gu X, Ji S (2021) A group spam detection algorithm combining behavior and structural feature reasoning. Comput Eng Sci 43(05):926–935
Zhang Q, Ji S, Zhang W et al (2022a) Group spam detection algorithm considering structure and behavior characteristics. Appl Res Comput 39(05):1374–1379
Zhang F, Yuan S, Wu J, Zhang P, Chao J (2022b) Detecting collusive spammers on e-commerce websites based on reinforcement learning and adversarial autoencoder. Expert Syst Appl 203:117482
Zhang S, Yin H, Chen T, Hung QVN, Huang Z, Cui L (2020b) GCN-based user representation learning for unifying robust recommendation and fraudster detection. In: Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval. pp 689–698
Zheng M, Zhou C, Wu J, Pan S, Shi J, Guo L (2018) FraudNE: a joint embedding approach for fraud detection. In: 2018 international joint conference on neural networks (IJCNN). IEEE, pp 1–8
Zhu C, Zhao W, Li Q, Li P, Da Q (2019) Network embedding-based anomalous density searching for multi-group collaborative fraudsters detection in social media. Comput Mater Continua 60(1):317–333
Acknowledgements
The authors would like to thank the editor and anonymous referees for their constructive comments.
Funding
This paper is supported in part by the Natural Science Foundation of China (Nos. 71772107, 62072288) and the Shandong Natural Science Foundation of China (Grant Nos. ZR2019MF003, ZR2020MF044).
Author information
Contributions
QZ completed the writing and experiments for the manuscript, ZL and BX examined and validated experiments, and SJ and DKWC provided guidance and suggestions for revision. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, Q., Liang, Z., Ji, S. et al. Detecting fake reviewers in heterogeneous networks of buyers and sellers: a collaborative training-based spammer group algorithm. Cybersecurity 6, 26 (2023). https://doi.org/10.1186/s42400-023-00159-8
DOI: https://doi.org/10.1186/s42400-023-00159-8