Essay Example on Privacy-Preserving Data Mining









A marked increase in network complexity, wider access to the Internet, more information sharing, and the growing impact of the Internet have made security and privacy major concerns for research. Data mining is a technique for extracting knowledge automatically and intelligently from large amounts of data. During this process, sensitive information about individuals can also be disclosed, compromising their right to privacy. Privacy-preserving data mining (PPDM) refers to protecting the privacy of personal data or sensitive information without losing the usefulness of the data. PPDM has drawn growing attention in recent years with the rapid development of Internet, data processing, and data storage technologies. An individual's privacy is not considered violated unless that person feels that his or her private information is being used unfavorably. Once personal information has been disclosed, however, no one can prevent it from being misused. Several methods have been put forward to address privacy concerns, but this branch of research is still in its infancy.

A number of techniques and methods have been developed for privacy-preserving data mining. They allow one to extract the relevant knowledge from large amounts of data while hiding sensitive data from disclosure or inference. The ultimate goal of PPDM is to develop efficient algorithms that meet both of these requirements. Research on PPDM follows these approaches:

(i) Data hiding: Sensitive data such as name, address, and contact number are replaced, blocked, or trimmed from the database. This prevents users of the data from compromising other individuals' personal information.
(ii) Rule hiding: Sensitive information or rules extracted by the data mining process are blocked from use, so the private information uncovered by mining cannot be exploited.
(iii) Secure multiparty computation (SMC): The data is encrypted before being shared for computation, so that it is not leaked.

Privacy-preserving data mining techniques are classified along the following dimensions: data distribution, data or rule hiding, data modification, data mining algorithm, and privacy preservation.

1. Data distribution

On the basis of data distribution, PPDM algorithms are categorized as centralized or distributed. In a centralized database system, all of the data is stored in a single database. In a distributed database, the data may reside in different databases at different locations. Distributed data is further classified as horizontally or vertically distributed. In horizontal distribution, different records reside at different locations; in vertical distribution, the values of different attributes reside at different locations.

2. Data or rule hiding

On the basis of the purpose of hiding, PPDM algorithms are classified as data hiding or rule hiding. In the data hiding approach, sensitive data such as name, address, and contact number are replaced, blocked, or trimmed from the database. This prevents users of the data from compromising other individuals' personal information. Most procedures use data hiding, keeping information from being revealed by modifying the data so that specific patterns are hidden.

3. Data modification

Data is modified in order to attain a high level of privacy. It can be modified by perturbation, blocking, aggregation or merging, swapping, or sampling, or by a combination of these techniques.

(i) Perturbation: Replacing the original value with a new value, for example replacing 1 with 0 or 0 with 1, i.e. adding noise.
(ii) Blocking: Preventing data from being disclosed by substituting the current attribute value with a placeholder symbol.
(iii) Aggregation or merging: Combining several values into a coarser group.
(iv) Swapping: Interchanging the values of particular data items between records.
(v) Sampling: Releasing data only for a sample of the population.

4. Data mining algorithms

Data modification is carried out with a particular data mining algorithm in mind, and that algorithm provides the basis for the analysis and design of data hiding algorithms. Some of the important algorithms are:

(i) Classification
(ii) Decision tree inducers
(iii) Association rule mining algorithms
(iv) Clustering algorithms
(v) Rough sets
(vi) Bayesian networks

At present, PPDM techniques mainly use classification, association rule mining, and clustering.
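The five modification operations listed under item 3 can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than something the essay prescribes: the function names, the record fields, the "?" placeholder used for blocking, and the 10-year bucket used for aggregation.

```python
import random


def perturb(bits, flip_prob, rng):
    """Perturbation: flip each 0/1 value with probability flip_prob (noise)."""
    return [1 - b if rng.random() < flip_prob else b for b in bits]


def block(records, sensitive_fields, placeholder="?"):
    """Blocking: replace sensitive attribute values with a placeholder.

    The "?" placeholder is an assumed choice for this sketch.
    """
    return [
        {k: (placeholder if k in sensitive_fields else v) for k, v in r.items()}
        for r in records
    ]


def aggregate_age(age, width=10):
    """Aggregation: map an exact age to a coarser range (assumed bucket width)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def swap_column(records, field, rng):
    """Swapping: shuffle the values of one attribute across records."""
    values = [r[field] for r in records]
    rng.shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]


def sample(records, k, rng):
    """Sampling: release only k randomly chosen records."""
    return rng.sample(records, k)


# Hypothetical records used only to exercise the functions above.
rng = random.Random(0)
records = [
    {"name": "Alice", "age": 34, "salary": 52000},
    {"name": "Bob", "age": 47, "salary": 61000},
    {"name": "Carol", "age": 29, "salary": 58000},
]
blocked = block(records, {"name"})
swapped = swap_column(records, "salary", rng)
released = sample(records, 2, rng)
```

Each function returns a new structure and leaves its input untouched, so the original data can still be compared against the modified release when measuring how much utility was lost.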

Association rule mining refers to detecting rules among items that frequently occur together. Cluster analysis is the task of dividing a data set into different groups. Classification refers to finding a set of models that predict an outcome, the data class, from the input provided.

5. Privacy preservation

PPDM techniques modify the data selectively, which is required to achieve high utility for the modified data while ensuring that privacy is not lost. The techniques used for centralized data include sanitization, blocking, distortion, and generalization. Secure multi-party computation deals with computing a function over a set of inputs where each participant holds one input and no private information is disclosed to any other participant during the computation. For data hiding, data distortion is used most often, followed by data sanitization and then generalization.

Thus a trade-off between privacy and accuracy has to be struck, since improving one usually comes at the cost of the other. Better methods should be developed to balance disclosure, computation, and communication costs. To keep confidential data private, better algorithms are needed, since the privacy of data and information is one of today's major concerns.
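The secure multi-party computation idea described under item 5 can be illustrated with a secure sum built on additive secret sharing. This is one well-known building block, chosen here for illustration and not necessarily the method the essay has in mind. The sketch simulates all parties in a single process and assumes they follow the protocol honestly. The names make_shares and secure_sum and the modulus value are hypothetical.

```python
import secrets

# Assumed public modulus; it must be larger than any possible true sum.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split value into n_parties random shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_inputs, modulus=MODULUS):
    """Compute the sum of all inputs without any party revealing its own input."""
    n = len(private_inputs)
    # Row i holds the shares created by party i; column j is what party j receives.
    distributed = [make_shares(v, n, modulus) for v in private_inputs]
    # Each party adds the shares it received and publishes only that partial sum.
    partial_sums = [sum(row[j] for row in distributed) % modulus for j in range(n)]
    # Combining the published partial sums yields the total.
    return sum(partial_sums) % modulus


total = secure_sum([10, 20, 30])  # three parties, one private input each
```

Any n - 1 of a party's shares are uniformly random and reveal nothing about its input on their own. Only the combined total becomes known, which matches the requirement that no private input is disclosed during the computation.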

