Reversible data hiding based on histogram and prediction error for sharing secret data
Cybersecurity volume 6, Article number: 12 (2023)
Abstract
With the advancement of communication technology, a large amount of data is constantly transmitted through the internet for various purposes, and these data are prone to illegal access by third parties. Therefore, securing such data is crucial to protect the transmitted information from falling into the wrong hands. Among data protection schemes, Secret Image Sharing is one of the most popular methods. It protects critical messages or data by embedding them in an image and sharing it with some users. Furthermore, it combines the security concepts in that private data are embedded into a cover image and then secured using the secret-sharing method. Despite its advantages, this method may produce noise, making the resulting stego file much different from its cover. Moreover, the size of private data that can be embedded is limited. This research works on these problems by utilizing prediction-error expansion and histogram-based approaches to embed the data. To recover the cover image, the SS method based on the Chinese remainder theorem is used. The experimental results indicate that this proposed method performs better than similar methods in several cover images and scenarios.
Introduction
The vast integration of the Internet of Things (IoT) in recent years has resulted in many aspects of people's activities being recorded and transmitted on the internet (Shambour and Gutub 2022). This technology is useful for people's daily lives, business, and health and helps create opportunities to solve problems that were previously impossible to overcome (Namasudra et al. 2020). Despite its positive impact, this technology has one weakness: it can invite potential disruption to the transmitted data and communications. That is why proper information security must be implemented to prevent any possibility of the data being accessed, stolen, or edited by illegal parties. Generally, there are two approaches to information security: cryptography and steganography. Both serve the same purpose but achieve it in different ways. In cryptography, the main idea is to change the data into an incomprehensible and unreadable form (Kumar et al. 2021; Pavithran et al. 2022). Protecting data using this method typically signals the importance of the encrypted data, creating a risk of disruption. However, this risk is minimized in the latter method. In steganography, also known as data hiding, the confidential data are embedded into cover media (Kadhim et al. 2019); it does not change the data's format but keeps the data's presence secret (Ardiansyah et al. 2017). The nature of this approach can reduce the risk of attempts to disrupt the data because only the sender and the receiver know the significance of the transmitted data. A suitable data hiding method must prevent undesirable parties from realizing the data's presence. To that end, there are several essential aspects of suitable data hiding: imperceptibility, security, data capacity, and robustness, which must be kept at an optimum level.
In data hiding, the protected data can be any binary file, while the cover is either a digital image, audio, video, or text file. Digital images are a popular medium for covering confidential data (Kamal and Islam 2019; Yao et al. 2020; Hassan and Gutub 2022) because of their relatively small size. For this reason, however, the maximum amount of embedded data is quite limited compared to audio and video. The payload size stored in the cover image affects the number of changes in the pixels of this cover. It reduces the quality of the produced stego image. While some degradation types are inevitable, they should be kept to a minimum and not be easily identified, especially by the human eye (Suresh and Sam 2022). Research may focus on different aspects of data hiding, like improving the payload capacity of the stego images and maintaining their imperceptibility (Kumar and Agrawal 2016; Chang et al. 2015) or increasing the quality of the stego images while still having a decent payload size (Islamy and Ahmad 2019). In some circumstances, the embedding method only focuses on improving the capacity of the embedded data (Yu et al. 2022a). This approach is gaining popularity because of the widespread use of cloud storage (Yu et al. 2022b). Generally, these methods are applied alongside cryptography, where the payload is embedded in the encrypted image.
The embedded data must be fully extractable from the cover image, which can be disposed of without returning it to its original state. However, in some conditions, we may need to restore the cover image after reconstructing the embedded data; this scheme is called Reversible Data Hiding (RDH) (Cheddad et al. 2010; Kar et al. 2018). The RDH method is established around the expansion of pixel values. It can be divided into three major types: Differential Extension (DE), Histogram Shifting (HS), and Prediction-Error (PE) (Rad et al. 2016).
The DE method (Tian 2003) employs the difference and the integer average in a pixel block of an image to embed the data. This scheme is further improved by Dragoi and Coltuc (2014) by adding a local prediction-based DE. In that scheme, PE values are generated by calculating the least-squares predicted value of the pixel block, and then DE extends the PE value and embeds the data. The applications of the DE method can be observed further in (Al Huti et al. 2016; Niu et al. 2017; Prabowo and Ahmad 2018). On the other hand, the PE focuses on finding the predicted error value of the pixels. It is then utilized as the space for the payload (Thodi and Rodríguez 2007). This method is paired with the HS-based method to improve the quality of the stego images (Hong et al. 2009). In recent years, much research has used the prediction function to calculate the prediction error values, from which a histogram can be generated to embed the secret data (Hong et al. 2009; Rad et al. 2014; Luo et al. 2015; Kumar and Agrawal 2016; Yao et al. 2017; Kamal and Islam 2019). PE can also be implemented on the encrypted image (Tang et al. 2021), where the secret data are embedded using PE after the image has been encrypted.
Despite all the progress in information security methods, cryptography and steganography have a potential flaw in which only one party can access the secured data. This disadvantage can lead to misuse of the information or loss of the key in the case of encrypted data (Al-Shaarani and Gutub 2021). In order to minimize those problems, Shamir's Secret Sharing method (1979) can be implemented to split secured data into parts and allocate them to different participants. In the case of image-based data hiding, it produces several shadows or shared images, known as Secret Image Sharing (SIS), which the corresponding participants retrieve. It is applicable in numerous cases and is responsible for securing the distribution and storage of digital images in a cloud environment. However, previous research (Islamy and Ahmad 2022) shows that those generated stego images have dropped in quality, affecting their imperceptibility.
Based on their robustness, data-hiding approaches can be categorized into two groups. The first is robust, meaning it can withstand modifications such as compression. The second is non-robust, meaning the stego image is damaged (unrecoverable) if it is changed. Both have advantages and disadvantages, and either can be applied depending on the purpose. The non-robust approach is typically used in the spatial approach. It is intended to maintain the integrity of the stego image (Kadhim et al. 2019). That is, if the receiver can extract the payload and restore the cover to its original state, then the stego image has certainly not experienced an active attack. Meanwhile, passive attacks are dealt with, as in other research, by making the stego image as similar as possible to the cover. This is also one of the objectives to be achieved in this research.
Considering those issues, this research aims to tackle the imperceptibility problem affecting the stego image by investigating both PE and HS methods, and utilizing the SIS technique. The remaining sections of this study are described as follows. "Related works" section discusses related research around data hiding and secret sharing. "Proposed method" section explains the proposed method, whose experimental results are analyzed in "Results and discussion" section. Finally, "Conclusion" section is presented to conclude the proposed method.
Related works
Secret sharing
The secret sharing technique divides data into \(n\) shares and circulates them among participants (Shamir 1979). In order to recover those original data, the dealer must retrieve at least \(k\) of them, where \(k\le n\). With that in mind, the original data can be restored if \(k\) or more shares are collected. However, it is not feasible to retrieve the original data if fewer than \(k\) shares are collected, since that information is not enough to recover the original data. This process can be calculated using the polynomial function in Eq. (1). In this function, \(h\) is the original data, while \(c\) is the random coefficient. When implemented on an image, \(h\) can be substituted for a pixel of the image or \(I\).
The Lagrange function is utilized to restore the original data (\(h\)) and \({c}_{1}\), …, \({c}_{k-1}\) of \(F\left(x\right)\), after \(k\) or more shares are collected.
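As an illustration of the sharing and recovery just described, the following is a minimal sketch rather than the authors' implementation. It assumes the usual Shamir form \(F(x) = h + c_1 x + \dots + c_{k-1}x^{k-1} \bmod p\) with the prime \(p = 257\); the modulus, the share indices \(x = 1, \dots, n\), and the names `make_shares` and `recover_secret` are assumptions made for this example. Only \(h\) is recovered here, by interpolating at \(x = 0\).

```python
import random

PRIME = 257  # assumed modulus: smallest prime above the 8-bit pixel range


def make_shares(h, k, n, prime=PRIME, rng=random):
    """Split the secret h into n shares; any k of them recover h."""
    # Constant term is the secret, the other k - 1 coefficients are random.
    coeffs = [h] + [rng.randrange(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of F(x) mod prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime=PRIME):
    """Lagrange interpolation at x = 0, which yields the constant term h."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # Modular inverse of den via Fermat's little theorem (prime modulus).
        secret = (secret + yi * num * pow(den, prime - 2, prime)) % prime
    return secret
```

For example, `recover_secret(make_shares(200, 3, 6)[:3])` returns 200. With a modulus of 257 a share value can be 256, which does not fit in 8 bits; practical image schemes handle this case separately.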
Histogram-based method
As the name suggests, this method is developed based on the utilization of image histograms (Ni et al. 2006). This histogram contains information that can help embed private data. Overall, there are three steps required to embed those data. The first step is searching for the most and least frequent pixels on the cover image, data that are easily obtainable from the histogram. The histogram's most frequent pixel can be recognized as the peak point, while the least frequent pixel is the lowest or the zero point. Then, all pixels positioned between the lowest and the peak points are 'shifted' to a new position leaving one pixel empty. It means that the shifted pixel's value is changed according to the location of the peak and lowest point pixels. Next, the empty pixel is utilized where the embedding process takes place. To embed the data, the empty pixel is gradually filled by the neighboring pixels, which indicates the total amount of private data in the number of bits. To fully understand these processes, let us define them in Eqs. (2) and (3), where the former is the pixel-shifting process, and the latter is the embedding process. In both equations, \(P\) and \(L\) are the peak and the lowest pixel points, respectively; \(I\) and \(I^{\prime }\) are the pixel of the cover image before and after being shifted. The \(i\) and \(j\) notations indicate the pixel's position in a block; and lastly, \(b\left(n\right)\) denotes the secret bits with \(n\) as the index.
From those equations, notice that the embedding process can only occur as much as the number of peak pixels. It shows that the capacity of the data is tied to peak pixel frequencies. This is a weak aspect of this method compared to others like the LSB and DE. For that reason, cover images with a high peak pixel frequency, such as medical images, are preferable when using this method.
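The shifting and embedding steps can be sketched as follows. This is a hedged illustration of the classic histogram-shifting idea, not a transcription of Eqs. (2) and (3): it assumes the lowest point \(L\) lies to the right of the peak \(P\) and is a genuinely empty bin, and the function names `hs_embed` and `hs_extract` are hypothetical.

```python
from collections import Counter


def hs_embed(pixels, bits):
    """Embed bits into a flat list of 8-bit pixels; returns (stego, P, L)."""
    hist = Counter(pixels)
    P = max(range(256), key=lambda v: hist[v])          # peak point
    if P == 255:
        raise ValueError("sketch assumes room to the right of the peak")
    L = min(range(P + 1, 256), key=lambda v: hist[v])   # lowest point right of P
    if hist[L] != 0:
        raise ValueError("sketch assumes an empty bin to the right of the peak")
    if len(bits) > hist[P]:
        raise ValueError("payload exceeds the peak frequency")
    it = iter(bits)
    stego = []
    for v in pixels:
        if P < v < L:
            stego.append(v + 1)            # shift right to free the bin P + 1
        elif v == P:
            stego.append(v + next(it, 0))  # bit 1 moves the pixel to P + 1
        else:
            stego.append(v)
    return stego, P, L


def hs_extract(stego, P, L, n_bits):
    """Read n_bits back and undo the shift, restoring the cover pixels."""
    bits, cover = [], []
    for v in stego:
        if v == P or v == P + 1:
            if len(bits) < n_bits:
                bits.append(v - P)
            cover.append(P)
        elif P + 1 < v <= L:
            cover.append(v - 1)
        else:
            cover.append(v)
    return bits, cover
```

The capacity check in `hs_embed` makes the limitation noted above explicit: at most one bit can be carried per peak-valued pixel.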
Still related to this method, an improvement is proposed by Islamy and Ahmad (2021) to increase the quality of the stego image and also enhance the payload capacity. They use PE to expand the embedding capacity of HS, and the histogram is generated after the image pixels are transformed into an error value. The error value is categorized according to the corresponding histogram partitions, each with the peak and lowest error values. They also implement the payload distribution to increase the embedding capacity, increasing the embedded bits in an error value. Their experimental results show better image quality and capacity performance than previous research.
The combination of data hiding and secret image sharing
Wu et al. (2018) presented a combination of data hiding and SIS to protect data in the cloud computing environment. First, they use SS to encode the cover image and HS and DE to embed it. This method can significantly increase the embedded data size but decrease the stego images' quality. The SIS method is also utilized by Ahmad et al. (2014) to protect medical data inside medical images. In that algorithm, the cover image is separated using SS; then, the medical data are embedded into the share images using 1-bit LSB and 2-bit LSBs. Based on the experimental results, implementing 1-bit LSB yields better image quality, but the drawback is that it has lower data capacity. It needs a more extensive cover image to match the capacity of 2-bit LSBs, but a more extensive cover image size requires more bandwidth and storage.
Yuan et al. (2016) proposed adjusting the threshold (\(k\)), which is beneficial if the security policy is changed or it is impossible to retrieve the required \(k\) shares. For instance, the remaining shares are useless if some participants are lost. To alleviate this problem, the proposed scheme has \(N\) potential thresholds \({t}_{1}\), \({t}_{2}\), …, \({t}_{N}\). Then they use the two-variable one-way function to create the identification value. The experimental results indicate that the quality of the stego images is reasonable, and the threshold can be safely changed.
Another SIS-based method has been proposed by Yan et al. (2020). Instead of hiding information by applying Visual Secret Sharing (VSS) to polynomial-based SIS using screening operations, they implement an SIS scheme with different shadow authentication capabilities. That proposed scheme is less complex in generation and recovery (authentication) and does not have pixel extensions with different shadow authentication capabilities. In addition, lossless recovery is achieved without additional encryption.
Later, Meng et al. (2021) introduced a reversible extended secret image sharing (RESIS) scheme to secure data. That scheme is designed based on the implementation of the secret-sharing method by employing the Chinese Remainder Theorem (CRT) for a polynomial ring to turn confidential data into pieces of information. First, they define \({m}_{0}\left(x\right)\), which can be described in Eq. (4). The dealer picks four pixels in a block of 4 × 4 pixels and then utilizes 2-bit LSBs in those pixels to construct a polynomial \(C\left(x\right)\) in Eq. (5). A pixel value of \(I\) is used in Eq. (6) to construct \(D\left(x\right)\). A sharing function \(F\left(x\right)\) is calculated in Eq. (7) to obtain the share values. The quality of the generated shared image, measured by the Peak Signal-to-Noise Ratio (PSNR), is over 40 dB.
Here, \({a}_{i,1}\) and \({a}_{i,2}\) are the least and the second least significant bits of one of the pixels in a block.
In this case, \({b}_{j}\) is the \(j\)th LSB of \(I\) started from 8.
Proposed method
The proposed method is outlined in Fig. 1 and consists of three phases: initialization, embedding, and extraction. The figure also describes the complete flow of the proposed scheme, from the initialization of the SS implementation to obtaining the stego data.
Initialization
This initialization phase aims to prepare the cover image for embedding and set the parameter for the Secret Sharing method, which can be described in the following steps:

1.
First, transform the pixel values of the cover image into PE values by using a predictor. Prediction error refers to the difference between a specific pixel and the value that was estimated based on its surroundings. For this purpose, the median edge detector (MED) is utilized as the predictor to calculate the prediction value specified in Eq. (8). In that formula, \(\hat{I}_{i,j}\) represents the prediction value of a pixel.
$$\hat{I}_{i,j} = \begin{cases} \min\left( I_{i,j-1}, I_{i-1,j} \right) & \text{if } I_{i-1,j-1} \ge \max\left( I_{i,j-1}, I_{i-1,j} \right) \\ \max\left( I_{i,j-1}, I_{i-1,j} \right) & \text{if } I_{i-1,j-1} \le \min\left( I_{i,j-1}, I_{i-1,j} \right) \\ I_{i,j-1} + I_{i-1,j} - I_{i-1,j-1} & \text{otherwise} \end{cases}$$(8)
2.
Calculate the difference between the original pixel and the generated prediction value using Eq. (9) to obtain the PE value (\({E}_{i,j}\)).
$$E_{i,j} = I_{i,j} - \hat{I}_{i,j}$$(9)
3.
The dealer must set the total number of participants (\(n\)) and the minimum number required to restore the image (\(k\)).

4.
The \(\mathrm{n}\) predicted images are generated based on the total number of participants, which is set in step 3. Furthermore, these images contain the SS pixels from the embedding phase.
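Steps 1 and 2 can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and leaving the first row and column with a zero error is an assumption, since the text does not state how border pixels without a full neighbourhood are treated.

```python
def med_predict(a, b, c):
    """MED predictor of Eq. (8).

    a = left neighbour I[i][j-1], b = upper neighbour I[i-1][j],
    c = upper-left neighbour I[i-1][j-1].
    """
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def prediction_errors(img):
    """Return E[i][j] = I[i][j] - I_hat[i][j] (Eq. 9) for interior pixels.

    Assumption: border pixels (first row and column) keep an error of 0.
    """
    h, w = len(img), len(img[0])
    E = [[0] * w for _ in range(h)]
    for i in range(1, h):
        for j in range(1, w):
            pred = med_predict(img[i][j - 1], img[i - 1][j], img[i - 1][j - 1])
            E[i][j] = img[i][j] - pred
    return E
```

On smooth regions most errors cluster near zero, which is what gives the PE histogram the sharp peak that the embedding phase relies on.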
Embedding phase
In the embedding phase, the secret bits are categorized into groups containing several bits, and each group has a different embedding position. In general, this phase scans the four leftmost secret bits, looks for the group having the same bit values, and then embeds the data by changing the value of the previously determined PE value. The embedding phase is depicted in Fig. 2.
In detail, the steps proposed in the embedding phase are as follows:

1.
First, search for the peak (\(P\)) of the PE or the most frequent PE value.

2.
The secret bits are categorized into 16 groups, each consisting of four bits. The groups are formed based on all the possible combinations of those four bits. Each group has its embedding position, explained in step 5. The list of the group and its corresponding position can be seen in Table 1.

3.
Scan the secret bits (\({b}_{n}\)) and pick the four leftmost: \({b}_{1}, {b}_{2}, {b}_{3}\), and \({b}_{4}\).

4.
The number of embedding positions \({P}_{ep}\) can be calculated using Eq. (10), where \(d\) is the number of bits in each secret bit group, as already described in step 2; so \(d=4\).
$$P_{ep} = 2^{d}$$(10)
5.
Check the scan results in step 3 and compare them to the conditions presented in Eq. (11), then pick the embedding position or \(ep\). The \(ep\) is the neighbouring PE value of \(P\), and it corresponds to the secret bit groups. For example, if the scan results are \({b}_{1}=1\), \({b}_{2}=1\), \({b}_{3}=1\), \({b}_{4}=1\), then search the PE value of \(P + 1\).
$$ep = \begin{cases} P+1 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,1,1,1) \\ P+3 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,1,1,0) \\ P+5 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,1,0,0) \\ P+7 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,0,0,0) \\ P+9 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,0,0,0) \\ P+11 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,0,0,1) \\ P+13 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,0,1,1) \\ P+15 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,1,1,1) \\ P-1 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,0,1,0) \\ P-3 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,0,0,1) \\ P-5 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,0,1,1) \\ P-7 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (1,1,0,1) \\ P-9 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,0,1,0) \\ P-11 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,1,0,0) \\ P-13 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,1,1,0) \\ P-15 & \text{if } (b_{1}, b_{2}, b_{3}, b_{4}) = (0,1,0,1) \end{cases}$$(11)
6.
After the \(ep\) has been found, the secret bits can be embedded. Compare \(ep\) with \(P\); if it is higher than \(P\), increase it by 1; if it is lower than \(P\), reduce it by 1. Here, we take the same instance as the previous step and use \(P + 1\), so add 1, and it becomes \(P + 1 + 1\). Another example: if the embedding position is lower than \(P\), for instance \(P - 3\), decrease it by 1 to embed the secret bits, giving \(P - 3 - 1\). These steps can also be described in Eq. (12).
$$ep^{\prime} = \begin{cases} ep + 1 & \text{if } ep \in \{P+1, P+3, P+5, P+7, P+9, P+11, P+13, P+15\} \\ ep - 1 & \text{if } ep \in \{P-1, P-3, P-5, P-7, P-9, P-11, P-13, P-15\} \end{cases}$$(12)
7.
Scan the next four leftmost bits, and repeat steps 5 and 6 until all bits have been embedded.

8.
The location map is used to save the location of \(ep^{^{\prime}}\) and it can be presented as \(LM_{i}\), where \(i\) is the index of the \(LM_{i}\). Each \(ep\) location is saved and used in the data retrieval process later.

9.
The SS is implemented on the embedding position on \(ep^{\prime}\) by using polynomials in Eqs. (4), (5), (6), and (7). Polynomial \(D\left( x \right)\) stores the 8 bits of the \(ep^{\prime}\), while 2 bits of LSB of the surrounding PE value of \(ep^{\prime}\) are stored in \(C\left( x \right)\). Finally, sharing polynomial \(F\left( x \right)\) in Eq. (7) generates \(n\) share PE values.

10.
Let the output of Eq. (7) be \(ep_{i}^{\prime \prime }\), the share PE value of the \(i\)th participant. The difference between \(ep_{i}^{\prime \prime }\) and \(ep^{\prime}\) is often quite large, and it can cause major distortion to the stego image if we substitute the value of \(ep\) with \(ep_{i}^{\prime \prime }\). To mitigate this problem, \(ep_{i}^{\prime \prime }\) is embedded by utilizing 2 bits of LSB: the PE value is converted to binary bits, which are then split into four groups, each consisting of 2 bits. For instance, if the binaries are 11101011, the groups are (1, 1), (1, 0), (1, 0), and (1, 1).

11.
Those binaries are embedded into the neighbours of the \(ep^{\prime}\) of each predicted image, located on the top, left, bottom and right of the \(ep^{\prime}\). The positions are \(\left( i-1, j \right)\), \(\left( i, j-1 \right)\), \(\left( i+1, j \right)\), and \(\left( i, j+1 \right)\). The four neighbouring PE values are calculated using Eq. (13), with \(b_{0}, \ldots, b_{7}\) as the bits of \(ep_{i}^{\prime\prime}\), \(b_{0}\) being the least significant; here \(t = 2\), matching the 2-bit LSB substitution.
$$\left. \begin{array}{l} E_{i-1,j}^{\prime} = E_{i-1,j} - \left( E_{i-1,j} \bmod 2^{t} \right) + b_{0} + b_{1} \times 2 \\ E_{i,j-1}^{\prime} = E_{i,j-1} - \left( E_{i,j-1} \bmod 2^{t} \right) + b_{2} + b_{3} \times 2 \\ E_{i+1,j}^{\prime} = E_{i+1,j} - \left( E_{i+1,j} \bmod 2^{t} \right) + b_{4} + b_{5} \times 2 \\ E_{i,j+1}^{\prime} = E_{i,j+1} - \left( E_{i,j+1} \bmod 2^{t} \right) + b_{6} + b_{7} \times 2 \end{array} \right\}$$(13)
12.
Repeat steps 8–9 until all of the \(ep^{\prime}\) are embedded into \(E_{i-1,j}\), \(E_{i,j-1}\), \(E_{i+1,j}\), and \(E_{i,j+1}\).

13.
Next, return the PE values of each predicted image to pixel form by using Eq. (14), where \(I_{i,j}^{\prime}\) is the pixel value of the stego image, and \(E_{i,j}^{\prime}\) is the PE value after the embedding and sharing process.
$$I_{i,j}^{\prime} = \hat{I}_{i,j} + E_{i,j}^{\prime}$$(14)
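The group-to-position lookup of Eq. (11) and the marking rule of Eq. (12) can be sketched as a pair of dictionaries. This is a hedged sketch: the names `GROUP_OFFSET`, `embedding_positions`, and `group_from_marked` are hypothetical, the payload is assumed to be a string of '0'/'1' characters whose length is a multiple of 4, and the search for an actual PE value equal to \(ep\) in the image, the location map, and the sharing steps are all left out.

```python
# Offsets relative to the peak P for each 4-bit group b1 b2 b3 b4, read from Eq. (11).
GROUP_OFFSET = {
    "1111": 1,  "1110": 3,   "1100": 5,   "1000": 7,
    "0000": 9,  "0001": 11,  "0011": 13,  "0111": 15,
    "1010": -1, "1001": -3,  "1011": -5,  "1101": -7,
    "0010": -9, "0100": -11, "0110": -13, "0101": -15,
}
OFFSET_GROUP = {off: grp for grp, off in GROUP_OFFSET.items()}


def embedding_positions(bits, P):
    """Yield (ep, ep') for each consecutive 4-bit group of the bit string."""
    if len(bits) % 4:
        raise ValueError("sketch assumes the payload length is a multiple of 4")
    for n in range(0, len(bits), 4):
        ep = P + GROUP_OFFSET[bits[n:n + 4]]
        ep_marked = ep + 1 if ep > P else ep - 1   # Eq. (12)
        yield ep, ep_marked


def group_from_marked(ep_marked, P):
    """Invert Eq. (12), then Eq. (11), to read back the 4-bit group."""
    ep = ep_marked - 1 if ep_marked > P else ep_marked + 1
    return OFFSET_GROUP[ep - P]
```

Because every \(ep\) sits at an odd distance from \(P\) and every \(ep^{\prime}\) at an even one, a marked value can always be mapped back to exactly one group.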
Extraction phase
The extraction phase restores the embedded secret data from the share images. To extract the secret data and restore the original cover image, at least \(k\) shared images are needed. In general, this process is the reverse of the embedding process. It is described in detail in the following steps:

1.
First, transform the image to PE values using Eq. (15) and identify \(P\).
$$E_{i,j}^{\prime} = I_{i,j}^{\prime} - \hat{I}_{i,j}$$(15)
2.
With the help of \(LM_{i}\) obtained earlier, scan the PE value and compare it to \(LM_{i}\). If the location of the scanned PE values matches \(LM_{i}\), then identify the neighbour PE values.

3.
To obtain \(ep_{i}^{\prime\prime}\), extract the share PE values of the stego image by employing the surrounding PE values using Eq. (16).
$$ep_{i}^{\prime\prime} = \left( E_{i-1,j} \bmod 4 \right) \times 2^{0} + \left( E_{i,j-1} \bmod 4 \right) \times 2^{2} + \left( E_{i+1,j} \bmod 4 \right) \times 2^{4} + \left( E_{i,j+1} \bmod 4 \right) \times 2^{6}$$(16)
4.
Determine the value of \(F\left( {x_{1} } \right)\), \(F\left( {x_{2} } \right)\), \(F\left( {x_{3} } \right)\), …, \(F\left( {x_{k} } \right)\) by collecting at least \(k\) share images.

5.
Next, Eq. (7) is implemented to retrieve \(ep^{\prime}\) and its surrounding 2 bits of the LSB of the PE value.

6.
To obtain the embedded bits, check their position related to \(P\). For this purpose, use Eq.Â (11) to help understand the extracted bits.

7.
Steps 2–5 are repeated until all secret bits have been extracted.

8.
After the data extraction and the PE value recovery processes have finished, the cover image is obtained using Eq. (17).
$$I_{i,j} = \hat{I}_{i,j} + E_{i,j}$$(17)
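The 2-bit LSB round trip between Eq. (13) of the embedding phase and Eq. (16) of the extraction phase can be checked with a short sketch. It assumes \(t = 2\) and a share value that fits in 8 bits; the function names are hypothetical, and the neighbours are passed in the order top, left, bottom, right.

```python
def split_share(value):
    """Split an 8-bit value into four 2-bit chunks, least significant first."""
    if not 0 <= value <= 255:
        raise ValueError("sketch assumes an 8-bit share value")
    return [(value >> shift) & 0b11 for shift in (0, 2, 4, 6)]


def embed_in_neighbours(neighbours, value, t=2):
    """Eq. (13): overwrite the t low bits of the four neighbouring PE values."""
    # Python's % is non-negative for a positive modulus, so negative PE values work too.
    return [e - (e % 2 ** t) + chunk
            for e, chunk in zip(neighbours, split_share(value))]


def extract_from_neighbours(neighbours):
    """Eq. (16): reassemble the 8-bit value from the four neighbours."""
    return sum((e % 4) << (2 * idx) for idx, e in enumerate(neighbours))
```

For instance, embedding 0b11101011 (235) into the neighbours `[5, -3, 0, 12]` gives `[7, -2, 2, 15]`, and `extract_from_neighbours` returns 235 again. Each neighbour changes by at most 3, which is why spreading the share value over four PE values distorts the image far less than substituting it directly.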
Results and discussion
Experimental environment
In the experiment, ten general images and ten medical images are used as the test images. They are acquired from (USC-SIPI 2021) and (National Library of Medicine 2022), respectively, and have a resolution of 512 × 512 pixels. The experiments are run using MATLAB 2017a on an AMD Ryzen 5 3600 CPU with 16 GB memory.
In analyzing the quality of the stego images, the Peak Signal-to-Noise Ratio (PSNR) is used as a measurement. It works by calculating the noise level of an image; in this case, the noise is the difference between the original image and the stego image. The calculation of PSNR is carried out using Eqs. (18) and (19), where \(I_{{{\text{MAX}}}}\) is the image's highest pixel value, MSE is the mean square error, and \(W\) and \(H\) are the width and height of the image in pixels, respectively.
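A minimal sketch of this measurement is given below, assuming the standard definitions \(\text{MSE} = \frac{1}{W H}\sum_{i,j}\left( I_{i,j} - I_{i,j}^{\prime} \right)^{2}\) and \(\text{PSNR} = 10\log_{10}\left( I_{\text{MAX}}^{2} / \text{MSE} \right)\). The function name `psnr` and the default \(I_{\text{MAX}} = 255\) for 8-bit greyscale images are assumptions made for this example.

```python
import math


def psnr(cover, stego, i_max=255):
    """PSNR in dB between two equally sized greyscale images (lists of rows)."""
    h, w = len(cover), len(cover[0])
    mse = sum((cover[i][j] - stego[i][j]) ** 2
              for i in range(h) for j in range(w)) / (w * h)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(i_max ** 2 / mse)
```

Identical images have an MSE of zero, so the sketch returns infinity in that case.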
Result analysis
The first experiment scenario tests the impact of different \(k\) on the stego images, in which there are four tested \(k\): 3, 4, 5, and 6, while the number of participants is 6. It is essential to notice that in all scenarios, we divide the results based on the type of cover images, general and medical; each type has different characteristics that can impact the results. For the secret data, we generate 131,072 bits, which is the maximum payload size that can be held by (Yuan et al. 2016; Meng et al. 2021). It matches the number of bits of a 128 × 128 pixel greyscale image, that is, 128 × 128 × 8 bits (Islamy 2022). Table 2 shows the average PSNR value of stego images of each \(k\) tested in general cover images, while Table 3 provides it for the medical images. Data presented in these tables show that the average PSNR values of the stego images are higher than 30 dB, which renders the stego images visually indistinguishable from their covers; differences could become visible if the values were below 30 dB (Kyriakopoulos and Parish 2007). The tables also suggest that the threshold \(\left( k \right)\) does not significantly impact the quality of the resulting stego images. Then, a one-way analysis of variance (ANOVA) is performed to test whether or not \(k\) affects the quality of the resulting stego image. We pick the probability value or \(p\)-value and compare it to the significance level (\(\alpha\)). The primary interpretation of the \(p\)-value is whether or not there is enough evidence to reject the null hypothesis, which, in this case, is that \(k\) has no significant impact. This test's significance level is 0.05 because it is considered conventional and the most commonly used. From the statistical test, the \(p\)-values of the results in Tables 2 and 3 are 0.9999 and 0.9994, respectively. Both \(p\)-values are greater than \(\alpha\); this means the null hypothesis cannot be rejected, and statistically, there is no significant difference between different groups of \(k\).
There is no significant impact on the results because of the utilization of 2-bit LSB. So, the level of change in the PE value caused by the sharing process can be reduced. It can also be observed that the general image has a lower PSNR value than the medical images. It is found that medical images have more black pixels and a smaller overall variety of pixel colours; all these traits have helped them become more tolerant of change.
For the second scenario, the experiment was performed on a different number of participants (\(n\)), and then the average PSNR of each \(n\) was measured. In this scenario, the utilized \(k\) is 4, tested on \(n\) of 7, 8, 9, and 10 using the same secret bits as the previous scenario. It is performed in order to understand the effect of utilizing various \(n\) values. The result is presented in Figs. 3 and 4, where the former represents the general images while the latter is for medical images. It is found that the use of different \(n\) has minimal impact on the quality of the stego images. Again, to test this statement, those results are analyzed using one-way ANOVA. The \(p\)-values for Figs. 3 and 4 are both 0.9999, which is greater than \(\alpha\). These results agree with the previous scenario and show that there is no significant difference. Similar to the previous scenario, 2-bit LSBs help reduce the sharing process's impact.
Given the results from those two scenarios, one thing that should be noticed is how to decide the optimal combination of \(k\) and \(n\). From the image quality standpoint, the dealer can use as many participants as possible and choose higher thresholds, as there is no noticeable impact on quality as shown in the previous analysis. Furthermore, higher \(k\) and \(n\) make it harder for third parties to obtain the original data. Nevertheless, both values affect the complexity of the algorithm. A higher number of participants involved in the sharing process increases the computation, with linear growth in complexity (\(O\left( n \right)\)). Therefore, theoretically, despite the PSNR results of \(n = 4\) and \(n = 20\) being similar, the lower \(n\) can produce a faster execution time.
In the third scenario, the proposed method is tested with various secret data sizes. There are eleven sizes of the secret data: 1 Kb, 10 Kb, 20 Kb, 30 Kb, 40 Kb, 50 Kb, 60 Kb, 70 Kb, 80 Kb, 90 Kb, and 100 Kb, obtained from (Islamy 2022). This scenario aims to understand the relationship between the embedding capacity and the quality of the stego images. In this scenario, \(n\) is 10 and \(k\) is 4. The average PSNR is measured for each data size, whose results can be found in Figs. 5 and 6 for general and medical images, respectively. Based on the results, it is found that the more data embedded in the cover image, the more the quality of the stego image decreases. Nevertheless, the quality reduction is getting smaller along with the rising payload size. For example, the PSNR of the general and medical images started to degrade less when embedded with more than 60 Kb of data. This means that the proposed method is suitable for embedding large amounts of data.
The stego image quality of the proposed method is compared with earlier work (Yuan et al. 2016; Meng et al. 2021) and is shown in Tables 4 and 5. The secret data used in this fourth scenario is the same as in the first and second scenarios. The results indicate that the proposed method has a better PSNR value than existing methods with the same amount of payload. It is worth noting that in the proposed method, we calculate the PE value first and then embed the data. The sharing process is implemented within the embedded PE value. It has been found that instead of the actual image pixels, the sharing process occurs within the PE value of the cover image. Afterwards, the value generated from the sharing process is embedded into its neighbour through 2-bit LSBs. Because of that, the embedding space of the proposed method does not depend on the cover image size. In contrast, the cover image size plays a significant role in dictating the embedding space of the related methods (Yuan et al. 2016; Meng et al. 2021). The larger the size, the more embedding space is provided, which can cause a bandwidth issue if the cover image has a high resolution.
Table 6 compares the proposed scheme with (Yuan et al. 2016; Meng et al. 2021) from a functionality perspective. It indicates that the proposed scheme maintains the essential functionality: both the secret data and the original cover image can be recovered losslessly. Also, in the embedding phase of the proposed scheme, the payloads are divided into different categories, so each PE value used for embedding can carry more than one bit. This increases the number of secret bits that can be embedded and reduces the number of PE values that need to be changed. For this reason, the stego image quality is better than when only one bit is embedded per PE value. The proposed method utilizes LSB substitution, which generally makes it difficult to recover the original pixels of the cover image. Implementing CRT-based SIS eliminates the side information otherwise needed to recover the original cover image. The method includes the original cover PE value in the calculation of the design's sharing polynomial \(\left( x \right)\). Thus, when \(\left( x \right)\) is recovered from \(k\) stego images (the minimum number), both the secret image and the cover image can be losslessly restored.
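To make the idea of recovering a value from any \(k\) shares concrete, the following is a toy sketch of a Mignotte-style \((k, n)\) threshold scheme built on the Chinese remainder theorem. It is an assumption chosen for illustration and is not the exact construction of the proposed scheme; the moduli, the secret value, and the function names are hypothetical.

```python
from math import prod

def crt(residues, moduli):
    """Solve x = r_i (mod m_i) for pairwise coprime moduli; result lies in [0, prod(moduli))."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total

def make_shares(secret, moduli):
    """Each participant receives the secret reduced modulo its own modulus."""
    return [secret % m for m in moduli]

# Toy parameters: n = 4 participants, threshold k = 2.
moduli = [11, 13, 17, 19]   # pairwise coprime, increasing
k = 2
# Mignotte condition: product of the largest k-1 moduli (19) < secret
# < product of the smallest k moduli (11 * 13 = 143).
secret = 100
shares = make_shares(secret, moduli)

# Any k shares recover the secret exactly.
recovered = crt(shares[1:3], moduli[1:3])
print(shares, recovered)
```

With fewer than \(k\) shares, only a residue of the secret is known, so the value is not uniquely determined within the allowed range.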
Validity and security analysis
To validate the proposed method, we implement the threshold \(k = 1\) without 2-bit LSBs and check whether the scheme is valid. With this threshold, the result has to be the same as that of the normal process without dividing (sharing) the stego image. The generated stego image is precisely the same as the stego image generated without the sharing process, as shown by the PSNR between them, which is ∞.
An apparent concern in the proposed method is that a third party may access the confidential information or destroy the stego images because of the weak LSB substitution. This issue is addressed by implementing SS after the embedding phase, so that the only way for a third party to access or modify the protected data is to collect at least \(k\) images. The unwanted party has to destroy or modify at least \(k + 1\) share images to completely remove the possibility of recovering the secret data, where \(k + 1 > n/2\). To put it simply, the higher the number of participants and the threshold, the harder it is for unwanted parties to obtain the protected data. Therefore, both of them directly influence the method's security.
The histogram is the distribution of pixel values of an image and can be used as an indication of a visually secure stego image (AlShaarani and Gutub 2021). In an image histogram, the \(x\)-axis is the pixel value of the image, while the \(y\)-axis is the number of pixels with that value. Generally, the stego image histogram has to be similar to that of the original cover image. Figure 7 compares the six share stego images of 'Airport' with the original image. The embedded data are 50 Kb, and \(n\) and \(k\) are 8 and 4, respectively. The result shows that all the histograms are quite similar to each other; this characteristic is also present in the other test images. So the proposed method can produce a secure stego image in terms of the histogram. This is also emphasized in Fig. 8, where the histograms of the stego images are presented in one chart and compared with each other. Based on that figure, the differences between the stego image histograms are minimal, and the histograms appear nearly identical.
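The histogram comparison above is visual. As a rough numerical counterpart, the sketch below computes 256-bin histograms and a normalized difference between them. The difference measure (half the L1 distance divided by the pixel count) is an assumption introduced here for illustration and is not a metric reported in the experiments; the function names and pixel values are hypothetical.

```python
def histogram(image, levels=256):
    """Count how many pixels take each grey level (x-axis: value, y-axis: count)."""
    counts = [0] * levels
    for row in image:
        for p in row:
            counts[p] += 1
    return counts

def histogram_difference(image_a, image_b):
    """Half the L1 distance between two histograms, divided by the pixel count.

    Assumes both images have the same number of pixels.
    0.0 means identical histograms; 1.0 means no overlap at all.
    """
    hist_a, hist_b = histogram(image_a), histogram(image_b)
    total = sum(hist_a)
    return sum(abs(x - y) for x, y in zip(hist_a, hist_b)) / (2 * total)

# Toy 2x2 images (illustrative values only).
cover = [[10, 10], [20, 30]]
stego = [[10, 11], [20, 30]]
print(histogram_difference(cover, cover))  # 0.0
print(histogram_difference(cover, stego))  # 0.25: one of four pixels changed bin
```

A value close to zero for every share corresponds to the near-identical histograms observed in the figures.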
Another way to measure security is to compare the PSNR of the stego images with the original (Kadhim et al. 2019), as discussed in the previous section, because the PSNR value of a stego image represents its similarity to the original image. The higher the PSNR value, the harder it is to distinguish the stego image from the original image, which lowers the chance of an attack.
Conclusion
This research is motivated by hiding private data in a cover medium to secure them. We consider SS based on CRT and use it alongside the HS-based scheme. Before the image is divided using CRT-SS, the embedding process is carried out on the PE value. We implement 2-bit LSBs to minimize the distortion of the stego image. Several thresholds and numbers of participants are evaluated in the experiment, showing minimal changes to the stego images. The use of LSB would cause the cover image to be lost after the extraction process, but using CRT-SS helps prevent this. The experimental results show that the proposed method provides better results than the previous ones.
In the future, this research can be extended to include some possibilities, for instance, how the dealer selects the cover image and ensures that it is safe and free from malicious software. Although it is out of the data hiding research scope, that selection can improve the security of the whole system.
References
Ahmad T, Studiawan H, Ahmad HS, Ijtihadie RM, Wibisono W. Shared secret-based steganography for protecting medical data. In: International Conference on Computer, Control, Informatics and Its Applications. pp 87–92; 2014.
Al Huti MHA, Ahmad T, Djanali S. Increasing the capacity of the secret data using DE pixels blocks and adjusted RDE-based on grayscale images. In: International Conference on Information and Communication Technology and Systems. pp 225–230; 2016.
AlShaarani F, Gutub A (2021) Securing matrix counting-based secret-sharing involving crypto steganography. J King Saud Univ Comput Inf Sci 34(9):6909–6924. https://doi.org/10.1016/j.jksuci.2021.09.009
Ardiansyah G, Sari CA, Setiadi DRIM, Rachmawanto EH. Hybrid method using 3DES, DWT and LSB for secure image steganography algorithm. In: 2nd International conferences on Information Technology, Information Systems and Electrical Engineering. pp 249–254; 2017.
Chang IC, Hu YC, Chen WL, Lo CC (2015) High capacity reversible data hiding scheme based on residual histogram shifting for block truncation coding. Signal Process 108:376–388. https://doi.org/10.1016/j.sigpro.2014.09.036
Cheddad A, Condell J, Curran K, Mc Kevitt P (2010) Digital image steganography: Survey and analysis of current methods. Signal Process 90(3):727–752. https://doi.org/10.1016/j.sigpro.2009.08.010
Dragoi IC, Coltuc D (2014) Local-prediction-based difference expansion reversible watermarking. IEEE Trans Image Process 23(4):1779–1790. https://doi.org/10.1109/TIP.2014.2307482
Hassan FS, Gutub A (2022) Novel embedding secrecy within images utilizing an improved interpolation-based reversible data hiding scheme. J King Saud Univ Comput Inf Sci 34(5):2017–2030. https://doi.org/10.1016/j.jksuci.2020.07.008
Hong W, Chen T, Shiu C (2009) Reversible data hiding for high quality images using modification of prediction errors. J Syst Softw 82(11):1833–1842. https://doi.org/10.1016/j.jss.2009.05.051
Islamy CC, Ahmad T (2019) Improving the quality of stego image using prediction error and histogram modification. Int J Intell Eng Syst 12(5):95–103. https://doi.org/10.22266/ijies2019.1031.10
Islamy CC, Ahmad T (2021) Enhancing quality of the stego image by using histogram partition and prediction error. Int J Intell Eng Syst 14(2):511–520. https://doi.org/10.22266/ijies2021.0430.46
Islamy CC, Ahmad T (2022) Analyzing the impact of the secret sharing on stego images. ICIC Exp Lett 16(3):307–315. https://doi.org/10.24507/icicel.16.03.307
Islamy CC. Payload. https://github.com/chaidirchalaf/payload. Accessed 20 Apr 2022; 2022.
Kadhim IJ, Premaratne P, Vial PJ, Halloran B (2019) Comprehensive survey of image steganography: techniques, evaluations, and trends in future research. Neurocomputing 335:299–326. https://doi.org/10.1016/j.neucom.2018.06.075
Kamal AHM, Islam MM (2019) A prediction error based histogram association and mapping technique for data embedment. J Inf Secur Appl 48:102368. https://doi.org/10.1016/j.jisa.2019.102368
Kar N, Mandal K, Bhattacharya B (2018) Improved chaos-based video steganography using DNA alphabets. ICT Express 4(1):6–13. https://doi.org/10.1016/j.icte.2018.01.003
Kumar M, Agrawal S (2016) Reversible data hiding based on prediction error expansion using adjacent pixels. Secur Commun Netw 9(16):3703–3712. https://doi.org/10.1002/sec.1575
Kumar A, Abhishek K, Shah K, Namasudra S, Kadry S (2021) A novel elliptic curve cryptography-based system for smart grid communication. Int J Web Grid Serv 17(4):321–342. https://doi.org/10.1504/IJWGS.2021.118398
Kyriakopoulos K, Parish DJ. A live system for wavelet compression of high speed computer network measurements. In: International Conference on Passive and Active Network Measurement. Berlin, Heidelberg, pp 241–244; 2007.
Luo T, Jiang G, Yu M, Gao W (2015) Novel prediction error based reversible data hiding method using histogram shifting. Int J Comput Theory Eng 7(5):332–336. https://doi.org/10.7763/IJCTE.2015.V7.981
Meng K, Miao F, Xiong Y, Chang CC (2021) A reversible extended secret image sharing scheme based on Chinese remainder theorem. Signal Process Image Commun 95:116221. https://doi.org/10.1016/j.image.2021.116221
Namasudra S, Devi D, Kadry S, Sundarasekar R, Shanthini A (2020) Towards DNA based data security in the cloud computing environment. Comput Commun 151:539–547. https://doi.org/10.1016/j.comcom.2019.12.041
National Library of Medicine eMicrobes Digital Library (2022) http://www.idimages.org/images/browse/ImageTechnique/. Accessed 1 Jun 2022
Ni Z, Shi YQ, Ansari N, Su W (2006) Reversible data hiding. IEEE Trans Circuits Syst Video Technol 16(3):354–362. https://doi.org/10.1109/TCSVT.2006.869964
Niu X, Yin Z, Zhang X, Tang J, Luo B. Reversible data hiding in encrypted AMBTC compressed images. In: Digital Forensics and Watermarking. Cham, pp 436–445; 2017.
Pavithran P, Mathew S, Namasudra S, Srivastava G (2022) A novel cryptosystem based on DNA cryptography, hyperchaotic systems and a randomly generated Moore machine for cyber physical systems. Comput Commun 188:1–12. https://doi.org/10.1016/j.comcom.2022.02.008
Prabowo HE, Ahmad T (2018) Adaptive pixel value grouping for protecting secret data in public computer networks. J Commun 13(6):325–332. https://doi.org/10.12720/jcm.13.6.325332
Rad RM, Wong K, Guo JM (2014) A unified data embedding and scrambling method. IEEE Trans Image Process 23(4):1463–1475. https://doi.org/10.1109/TIP.2014.2302681
Rad RM, Wong KS, Guo JM (2016) Reversible data hiding by adaptive group modification on histogram of prediction errors. Signal Process 125:315–328. https://doi.org/10.1016/j.sigpro.2016.02.001
Shambour MK, Gutub A (2022) Progress of IoT research technologies and applications serving Hajj and Umrah. Arab J Sci Eng 47(2):1253–1273. https://doi.org/10.1007/s13369021058387
Shamir A (1979) How to share a secret. Commun ACM 22(11):612–613. https://doi.org/10.1145/359168.359176
Suresh M, Shatheesh Sam I (2022) Optimized interesting region identification for video steganography using fractional grey wolf optimization along with multi-objective cost function. J King Saud Univ Comput Inf Sci 34(6, Part B):3489–3496. https://doi.org/10.1016/j.jksuci.2020.08.007
Tang Z, Pang M, Yu C, Fan G, Zhang X (2021) Reversible data hiding for encrypted image based on adaptive prediction error coding. IET Image Process 15(11):2643–2655. https://doi.org/10.1049/ipr2.12252
Thodi DM, Rodríguez JJ (2007) Expansion embedding techniques for reversible watermarking. IEEE Trans Image Process 16(3):721–730. https://doi.org/10.1109/TIP.2006.891046
Tian J (2003) Reversible data embedding using a difference expansion. IEEE Trans Circuits Syst Video Technol 13(8):890–896. https://doi.org/10.1109/TCSVT.2003.815962
USC-SIPI SIPI image database (2021) http://sipi.usc.edu/database/database.php?volume=misc. Accessed 1 Mar 2021
Wu X, Weng J, Yan WQ (2018) Adopting secret sharing for reversible data hiding in encrypted images. Signal Process 143:269–281. https://doi.org/10.1016/j.sigpro.2017.09.017
Yan X, Gong Q, Li L, Yang G, Lu Y, Liu J (2020) Secret image sharing with separate shadow authentication ability. Signal Process Image Commun 82:115721. https://doi.org/10.1016/j.image.2019.115721
Yao H, Qin C, Tang Z, Tian Y. Guided filtering based color image reversible data hiding. J Vis Commun Image Represent. 2017;43(Supplement C):152–163. https://doi.org/10.1016/j.jvcir.2017.01.004
Yao H, Mao F, Tang Z, Qin C. High-fidelity dual-image reversible data hiding via prediction-error shift. Signal Process. 2020;170:107447. https://doi.org/10.1016/j.sigpro.2019.107447
Yu C, Zhang X, Li G, Zhan S, Tang Z (2022a) Reversible data hiding with adaptive difference recovery for encrypted images. Inf Sci 584:89–110. https://doi.org/10.1016/j.ins.2021.10.050
Yu C, Zhang X, Zhang X, Li G, Tang Z (2022b) Reversible data hiding with hierarchical embedding for encrypted images. IEEE Trans Circuits Syst Video Technol 32(2):451–466. https://doi.org/10.1109/TCSVT.2021.3062947
Yuan L, Li M, Guo C, Hu W (2016) Secret image sharing scheme with threshold changeable capability. Math Probl Eng 1:9576074. https://doi.org/10.1155/2016/9576074
Acknowledgements
The authors would like to thank all lab and research group members who have supported this research, and all institutions that have funded it.
Authors' information
Chaidir Chalaf Islamy is a Ph.D. student in the Department of Informatics, Institut Teknologi Sepuluh Nopember (ITS), Indonesia, focusing on shared-secret data hiding. His related research is available at https://www.scopus.com/authid/detail.uri?authorId=57210750661. Tohari Ahmad received the Bachelor degree in computer science from Institut Teknologi Sepuluh Nopember (ITS), Indonesia, the master degree in information technology from Monash University, Australia, and the Ph.D. degree in computer science from RMIT University, Australia. He was a consultant for some international companies. In 2003, he moved to ITS, where he is now a professor. His research interests include network security, information security, data hiding and computer networks. He is a reviewer for a number of journals. Prof. Ahmad's awards and honors include the Hitachi Research Fellowship and the JICA Research Program to conduct research in Japan. His research is available at https://www.scopus.com/authid/detail.uri?authorId=35241970700. Royyana Muslim Ijtihadie received bachelor and master degrees from Institut Teknologi Sepuluh Nopember (ITS), Indonesia, and a Ph.D. from Kumamoto University, Japan. His research interests include computer networks and computer security. He is now a senior lecturer at ITS and is responsible for managing the computer network infrastructure and computer security at his university. His research can be found at https://www.scopus.com/authid/detail.uri?authorId=36975529900
Funding
This research was supported by the Ministry of Education, Culture, Research and Technology, The Republic of Indonesia, Institut Teknologi Sepuluh Nopember, and Universitas 17 Agustus 1945 Surabaya.
Contributions
CCI: Conceptualization, methodology, software, formal analysis, investigation, writing original draft, visualization. TA: Conceptualization, methodology, writing review and editing, supervision, project administration, funding acquisition. RMI: Conceptualization, methodology, supervision. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
All authors have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Islamy, C.C., Ahmad, T. & Ijtihadie, R.M. Reversible data hiding based on histogram and prediction error for sharing secret data. Cybersecurity 6, 12 (2023). https://doi.org/10.1186/s4240002300147y
DOI: https://doi.org/10.1186/s4240002300147y