Saturday, 7 July 2012

Fuzzy Keyword Search over Encrypted Data in Cloud Computing

To get source code, video, ppt, documentation of this project please mail us:
info@ocularsystems.in

or visit us:
http://blog.ocularsystems.in/blog


Abstract:
                As Cloud Computing becomes prevalent, more and more sensitive information are being centralized into the cloud. Although traditional searchable encryption schemes allow a user to securely search over encrypted data through keywords and selectively retrieve files of interest, these techniques support only exact keyword search. In this paper, for the first time we formalize and solve the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability by returning the matching files when users’ searching inputs exactly match the predefined keywords or the closest possible matching files based on keyword similarity semantics, when exact match fails. In our solution, we exploit edit distance to quantify keywords similarity and develop two advanced techniques on constructing fuzzy keyword sets, which achieve optimized storage and representation overheads. We further propose a brand new symbol-based trie-traverse searching scheme, where a multi-way tree structure is built up using symbols transformed from the resulted fuzzy keyword sets. Through rigorous security analysis, we show that our proposed solution is secure and privacy-preserving, while correctly realizing the goal of fuzzy keyword search. Extensive experimental results demonstrate the efficiency of the proposed solution.
Algorithm / Technique used:
                         String Matching Algorithm
Algorithm Description:
    The approximate string matching algorithms among them can be classified into two categories: on-line and off-line. The on-line techniques, performing search without an index, are unacceptable for their low search efficiency, while the off-line approach, utilizing indexing techniques, makes it dramatically faster. A variety of indexing algorithms, such as suffix trees, metric trees and q-gram methods, have been presented. At the first glance, it seems possible for one to directly apply these string matching algorithms to the context of searchable encryption by computing the trapdoors on a character base within an alphabet. However, this trivial construction suffers from the dictionary and statistics attacks and fails to achieve the search privacy. An instance M of the data type string-matching is an object maintaining a pattern and a string. It provides a collection of different algorithms for computation of the exact string matching problem. Each function computes a list of all starting positions of occurrences of the pattern in the string.
System Architecture:
Fuzzy Keyword Search over Encrypted Data in Cloud Computing
Existing System:
This straightforward approach apparently provides fuzzy keyword search over the encrypted files while achieving search privacy using the technique of secure trapdoors. However, this approaches serious efficiency disadvantages. The simple enumeration method in constructing fuzzy key-word sets would introduce large storage complexities, which greatly affect the usability.
For example, the following is the listing variants after a substitution operation on the first character of keyword
                       CASTLE:  {AASTLE, BASTLE, DASTLE, YASTLE, ZASTLE}.
Proposed System:
Main Modules:
            1. Wildcard – Based Technique
            2. Gram – Based Technique
            3. Symbol – Based Trie – traverse Search Scheme
1. Wildcard – Based Technique:
       In the above straightforward approach, all the variants of the keywords have to be listed even if an operation is performed at the same position. Based on the above observation, we proposed to use an wildcard to denote edit operations at the same position. The wildcard-based fuzzy set edits distance to solve the problems.
For example, for the keyword CASTLE with the pre-set edit distance 1, its wildcard based fuzzy keyword set can be constructed as
SCASTLE, 1 = {CASTLE, *CASTLE,*ASTLE, C*ASTLE, C*STLE, CASTL*E, CASTL*, CASTLE*}.
Edit Distance:
  1. Substitution
  2. Deletion
  3. Insertion
a)      Substitution :  changing one character to another in a  word;
b)     Deletion :  deleting one character from a word;
c)      Insertion:  inserting a single character into a word.
2. Gram – Based Technique:
Another efficient technique for constructing fuzzy set is based on grams. The gram of a string is a substring that can be used as a signature for efficient approximate search. While gram has been widely used for constructing inverted list for approximate string search, we use gram for the matching purpose. We propose to utilize the fact that any primitive edit operation will affect at most one specific character of the keyword, leaving all the remaining characters untouched. In other words, the relative order of the remaining characters after the primitive operations is always kept the same as it is before the operations.
For example, the gram-based fuzzy set SCASTLE, 1 for keyword CASTLE can be constructed as
              {CASTLE, CSTLE, CATLE, CASLE, CASTE, CASTL, ASTLE}.
3. Symbol – Based Trie – traverse Search Scheme
             To enhance the search efficiency, we now propose a symbol-based trie-traverse search scheme, where a multi-way tree is constructed for storing the fuzzy keyword set over a finite symbol set. The key idea behind this construction is that all trapdoors sharing a common prefix may have common nodes. The root is associated with an empty set and the symbols in a trapdoor can be recovered in a search from the root to the leaf that ends the trapdoor. All fuzzy words in the trie can be found by a depth-first search.
In this section, we consider a natural extension from the previous single-user setting to multi-user setting, where a data owner stores a file collection on the cloud server and allows an arbitrary group of users to search over his file collection.
System Requirements:
Hardware Requirements:
•         System                     : Pentium IV 2.4 GHz.
•         Hard Disk                : 40 GB.
•         Floppy Drive          : 1.44 Mb.
•         Monitor                   : 15 VGA Colour.
•         Mouse                      : Logitech.
•         Ram                         : 512 Mb.
Software Requirements:
•         Operating system      : – Windows XP.
•         Coding Language       : DOT NET
•         Data Base                     : SQL Server 2005
Conclusion:
1. In this paper, for the first time we formalize and solve the problem of supporting efficient yet privacy-preserving fuzzy search for achieving effective utilization of remotely stored encrypted data in Cloud Computing.
2. We design two advanced techniques (i.e., wildcard-based and gram- based techniques) to construct the storage-efficient fuzzy keyword sets by exploiting two significant observations on the similarity metric of edit distance.
3. Based on the constructed fuzzy keyword sets, we further propose a brand new symbol-based trie-traverse searching scheme, where a multi-way tree structure is built up using symbols transformed from the resulted fuzzy keyword sets.
4. Through rigorous security analysis, we show that our proposed solution is secure and privacy- preserving, while correctly realizing the goal of fuzzy keyword search. Extensive experimental results demonstrate the efficiency of our solution.

No comments:

Post a Comment