Please use this identifier to cite or link to this item: https://idr.nitk.ac.in/jspui/handle/123456789/16840
Title: Precision and Privacy Preserving Multi-Keyword Search over Encrypted Data
Authors: Siva Kumar, D Venkata Naga.
Supervisors: Thilagam, P Santhi.
Keywords: Department of Computer Science & Engineering;Searchable Encryption;Frequency of ciphertext values;Rank-orden information;Search pattern;Dynamic updates
Issue Date: 2020
Publisher: National Institute of Technology Karnataka, Surathkal
Abstract: A growing number of data owners are increasingly using cloud storage services of various Cloud Service Providers (CSPs) for storing and managing their information/documents. The cloud storage services provide numerous benefits to the data owners that include cost savings, greater reliability, ubiquitous access, and better performance. In spite of these benefits, the stored data at the remote cloud servers are vulnerable to the attacks initiated by the untrustworthy CSPs. Such data risks and associated privacy concerns have recently been in the limelight from revelations of extensive surveillance by several national security agencies and other government entities. The primary concerns of storing documents at cloud servers are confidentiality and privacy due to the loss of control over who accesses and manages the outsourced documents. In order to address these concerns, sensitive data is required to be outsourced in encrypted form to the cloud servers. Although the encryption guarantees confidentiality, it makes the retrieval process more complex. Searchable Encryption (SE) is a technique that guarantees confidentiality and privacy by storing documents in encrypted form at the cloud servers and allows search over encrypted data without decrypting it. In SE, the data owners store their documents, and the corresponding indexes in encrypted form, and the data users retrieve the documents by sending encrypted queries (trapdoors) to the cloud server. Despite its privacy and confidential guarantee, the privacy of trapdoor keywords and index keywords could be compromised due to the information leakages that are caused by the vulnerabilities in the adopted schemes used for encrypting indexes and queries. The information leakages include frequency of ciphertext values, rank-order information, and search pattern. The cloud servers exploit these leakages to infer plaintext information through various information disclosure attacks such as frequency analysis attack, rank-order exploitation attack, and scale analysis attack. Hence, this work aims at preventing these leakages and thereby mitigates the attacks. The cloud server uses frequency analysis attack to infer index keywords based on the frequency leakage of ciphertext values (repetition of the same encrypted keywords’ relevance scores) in indexes. This leakage occurs due to the insufficient randomness in the order preserving encryption (OPE) schemes that are used for encrypting keywords’ relevance scores in indexes. The existing OPE schemes leak frequency information when there are same plaintext scores for two or more keywords within the same document. In this work, an Enhanced One-to-Many order-preserving mapping techniqueis developed with improved randomness to mitigate the frequency leakage. The experimental study confirms that the proposed technique reduces not only the frequency leakage of keywords but also the co-occurring keywords. The cloud server returns the relevant documents in descending order for a given trapdoor based on the ranks of the documents. However, the cloud server uses the rankorder exploitation attack to infer the plaintext information of frequently issuing query keywords or frequently occurring keywords of the dataset based on the rank information leakage. Scale analysis attack also occurs when the users issue the same trapdoor again and again to retrieve the same documents. This attack enables the cloud server to infer plaintext keywords of trapdoors based on the search pattern leakage, which can be determined from the given set of trapdoors. The existing approaches prevent search pattern leakage by adding random keywords to a list of query keywords in a trapdoor generation approach, but precision gets affected due to the random keywords. These approaches cannot prevent rank-order information leakage completely since the random values of random keywords follow the distribution of actual keywords’ relevance scores. Therefore, it is highly essential to prevent these attacks by preserving the privacy of both rank information and the search pattern. In this work, a Pseudo-Ranking approach is developed to address this issue with the help of two servers, i.e., cloud server (CS) and the intermediate server (IS). The CS assigns pseudo-ranks to the documents instead of actual ranks, and the IS would nullify the impact of random keywords of a trapdoor for achieving higher precision. The experimental results confirm that the proposed approach preserves the privacy of both rank information and search pattern without affecting precision. Besides preventing these attacks, it is also essential to provide the latest relevant documents to the users to enable them to choose timely decisions based on updated information. To provide the latest relevant documents, there should be a provision in SE for allowing the data owners to perform the dynamic updates efficiently over the existing encrypted indexes. The existing tree-based indexing schemes cannot perform dynamic operations efficiently since the trees’ size is larger in terms of height and breadth. This causes a delay in performing dynamic updates and retrieving top-k relevant documents. In this work, a Max-heap tree based index structure is developed to address this issue. The experimental results of the proposed tree index confirm that it improves the time efficiency of top-k document retrieval and dynamic updates.
URI: http://idr.nitk.ac.in/jspui/handle/123456789/16840
Appears in Collections:1. Ph.D Theses

Files in This Item:
File Description SizeFormat 
155119CS15FV04.pdf2.24 MBAdobe PDFThumbnail
View/Open


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.