This section presents the concepts of rough set theory. A rough set approach to attribute generalization in data mining. Comparative analysis between rough set theory and data mining. After 15 year of pursuing rough set theory and its application the theory has reached a certain degree of maturity. Granular computing is an emerging computing paradigm of information processing. Comparative analysis between rough set theory and data.
This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering general computer security to server, network, and cloud security. The authors have broken the discussion into two sections, each with a specific theme. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles. Analysis of imprecise data is an edited collection of research chapters on the most recent developments in rough set theory and data mining. A rough set is a formal approximation of a crisp set in terms of a pair of sets that give the lower and upper approximation of the original set learn more in. The approach is based on the combination of generalization distribution table gdt and the rough set methodology.
We can use rough set approach to discover structural relationship within imprecise and noisy data. Theory and application on rough set, fuzzy logic, and. Clustering in data mining algorithms of cluster analysis. A partition of u is a family of mutually disjoint nonempty subsets of u, called blocks, such that the union of all blocks is u. A gdt is a table in which the probabilistic relationships between concepts and instances over discrete domains are represented. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical analysis of data.
Based on rough sets 4 and the concept of lower and upper boundary sets 5, we introduce a method for updating approximations by considering adding and. In this perspective, granular computing has a position of centrality in data mining. A crucial concept in the rough set approach to machine learning is that of. Rough set approach, fuzzy set approachs, prediction, linear and multipleregression. This paper, introduces the fundamental concepts of rough set theory and other aspects of data mining, a. A rough set knowledge discovery framework is formulated for the analysis of. Introduction modern organizations use several types of decision support systems to facilitate decision support. A rough set approach to mining incomplete data is presented in this paper. In this thesis, a rulebased rough set decision system is. Additionally, the rough set approach to lower and upper approximations and certain possible rule sets concepts are introduced. Chapter 1 gives an overview of data mining, and provides a description of the data mining process. Granular computing is an emerging computing paradigm of information processing and an approach for knowledge representation and data mining. Download data mining tutorial pdf version previous page print page. Real life data are frequently imperfect, erroneous, incomplete, uncertain and vague.
For the rough set theory, in the process of data mining, there are still a large number. Due to the xml document is a kind of semistructured data, using the traditional data mining methods for mining of xml data is not applicable. Rough set theory rst is a mathematical approach that handles uncertainty and is capable of discovering knowledge from a database. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach. The rough set theory, which originated in the early 1980s, provides an alternative approach to the fuzzy set theory, when dealing with uncertainty, vagueness or inconsistence often. A characteristic set, a generalization of the elementary set wellknown in rough set. This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering. This overview provides a description of some of the most common data mining algorithms in use today. Decision rule induction for service sector using data. This paper introduces a new approach for mining ifthen rules in databases with uncertainty and incompleteness. Parallel computing of approximations in dominancebased rough sets approach, knowledgebased systems, 87. In its abstract form, it is a new area of uncertainty mathematics closely related to fuzzy theory. Data mining and knowledge discovery in real life applications 36 outset, rough set theory has been a methodology of database mining or knowledge discovery in relational databases. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical.
It is useful for dealing with indiscernibility of objects caused by. Approximation can further be applied to data mining related task, e. Summarization providing a more compact representation of the data set, including visualization and report. This paper discusses the basic concepts of rough set theory and point out some rough set based research directions and applications. The chapters in this work cover a range of topics that focus on discovering dependencies among data, and reasoning about vague, uncertain and imprecise information.
Mining incomplete dataa rough set approach jerzy w. Visualization of data through data mining software is addressed. The general experimental procedure adapted to data mining problems involves the following steps. Fuzzyrough data mining with weka aberystwyth university. It demonstrates this process with a typical set of data. Rule generation from raw data is a very effective and most widely used tool of data mining. Rough set approach to the analysis of the structureactivity relationship of quaternary imidazolium compounds. Seminal description of datamining approaches with reference. The dominancebased rough set approach drsa is an extension of rough set theory for multicriteria decision analysis mcda, introduced by greco, matarazzo and slowinski.
Data mining, data tables, distributed data mining ddm, rough sets. A characteristic set, a generalization of the elementary set wellknown in rough set theory, may be computed using such blocks. A rough set approach for the discovery of classification rules in. Data mining, decision tables, rough set, rule extraction. The rough set approach 7 to data analysis has many important advantages that. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach in the context of data analysis and modelbuilding applications. The main aim is to show how rough set and rough set analysis can be effectively used to extract knowledge from large databases. Risk assessment is very important for safe and reliable investment. Rough set theory provides a simple and elegant method for analyzing data. A rough set approach to attribute generalization in data. The approach is based on the combination of generalization distribution table gdt. A rough set based method for updating decision rules on attribute values coarsening and refining, ieee transactions on knowledge and data engineering, 2612. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test.
The concept of rough, or approximation, set s was introduced by pawlak, and is based on the single assumption that information is associated with. Rough set theory fundamental concepts, principals, data. Dominancebased rough set approach for group decisions, european journal of operational research, 2511. Data mining is a process of discovering various models, summaries, and derived values from a given collection of data.
New directions in rough sets, data mining, and granularsoft computing, lnai. Rough set theory and zadehs fuzzy set theory are two independent approaches to deal with uncertainty. Rough set theory, data mining, decision table, decision rule, data representation. Rough set theory was proposed by pawlak 17 in 1982, which o ers a mathematical approach to data analysis and data mining 11,1821. Data mining technology has emerged as a means for identifying patterns and trends from large quantities o. Data mining is an emerging powerful tool for analysis and prediction. A decisiontheoretic rough set approach for dynamic data mining. A survey on rough set theory and its applications sciencedirect. Chapter 1 basic concepts contains general formulation of basic ideas of rough set theory together with brief discussion of its place in classical set theory. Relationships exist between rough set theory and dempstershafers theory of. Pdf a decisiontheoretic rough set approach for dynamic. Pdf a decisiontheoretic rough set approach for dynamic data.
Sets, fuzzy sets and rough sets our digital library. Mining incomplete dataa rough set approach springerlink. The chapter is focused on the data mining aspect of the applications of rough set theory. And combining with probability logic, random truth degree of rough logic can be studied in the future. Based on the rough set theory, the rough logic and its deduction theory system can be established. A convenient way to present equivalence relations is through partitions. Chapter 2 rough sets and reasoning from data presents the application of rough set concept to reason from data data mining. Relationships exist between rough set theory and dempstershafers theory of evidence. Rough set theory was introduced by zdzislaw pawlak in 1982. Pawlak as a mathematical ap proach to deal with vagueness and uncertainty in data analysis 50, 51. Recently, the rough set and fuzzy set theory have generated a great deal of interest among more and more researchers. For the rough set theory, in the process of data mining, there are still a large number of problems need to be discussed, such as large data sets, efficient reduction algorithm, parallel computing, hybrid algorithm, etc. Rough set theory and its applications ua computer science.
A gdt is a table in which the probabilistic relationships between concepts and instances over discrete domains are. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. Rough set theory is relativly new to area of soft computing to handle the uncertain big data efficiently. Puts forward a xml mining model based on rough set theory. Through indepth study on the existing rough set and data mining technologies, for the shortcomings of the existing data mining algorithms based on rough set, this paper presents an improved algorithm. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Rough set theory has been a methodology of database mining or knowledge discovery in relational databases. Research on data mining algorithm based on rough set. Discriminant versus rough set approach to vague data analysis. It is a new mathematical tool to deal with partial information. An overview of useful business applications is provided.
However, the clustering algorithms for categorical data are few and are unable to handle uncertainty. This paper, introduces the fundamental concepts of rough set theory and other aspects of data mining, a discussion of data representation with rough set theory including pairs of attributevalue blocks, information tables reducts, indiscernibility relation and. Finally, some description about applications of the data mining system with. Rough set can be used as a tool to generate rules form decision table in data mining. Data mining, data tables, distributed data mining ddm. There are so many approaches for handling missing attribute values. The data mining technology instead of classic statistical analysis is developed to help the people to discover the knowledge inside of the data. Book description practical applications of data mining emphasizes both theory and applications of data mining algorithms. Rule induction from a decision table using rough sets theory. Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory, and logical analysis of data lad. Inhibitory rules in data analysis a rough set approach.
Chapter 2 presents the data mining process in more detail. Also, this method locates the clusters by clustering the density. Pdf this article comments on data mining and rough set theory, regarding the article myths about rough set. Another methodology which has high relevance to data mining and plays a central role in this volume is that of. Rough set theory 7 is a new mathematical approach to data analysis and data mining. A rough set approach for generation and validation of rules for missing attribute values of a data set. It is useful for dealing with indiscernibility of objects caused by incomplete or limited information. In this data mining clustering method, a model is hypothesized for each cluster to find the best fit of data for a given model. It also provides a powerful way to calculate the importance degree of vague and uncertain big data to help in decision making. The customer related data are categorical in nature. Rough sets theory is a new mathematical approach used in the intelligent data analysis and data mining if data is uncertain or incomplete. The monograph can be used under the creation of courses for graduate students and for ph.
Risk analysis technique on inconsistent interview big data. In recent years we witnessed a rapid grow of interest in rough set theory and its application, world wide. Reduce cost of mailing by targeting a set of consumers likely to buy a new cell phone product approach. It also provides a powerful way to calculate the importance degree of vague and uncertain big data to. The authors express that rough data set theory is not the only discipline in which discretization is. Performance analysis and prediction in educational data. On rough set based approaches to induction of decision.
And combining with probability logic, random truth. Hongmei chen, tianrui li, ieee senior member, chuan luo, shijinn horng, ieee member. The rough set theory offers a viable approach for decision rule extraction from data. The rough set approach 7 to data analysis has many important advantages that provide efficient algorithms for finding hidden patterns in data, finds minimal sets of data, evaluates significance of data, generates. For the purposes of analysis and decision support in the business area in many cases data mining using rough set theory is used. A rough set approach for generation and validation of rules. Intrusion detection a data mining approach nandita.
577 1349 446 1209 1326 1171 1050 1579 1554 607 560 1156 527 596 1488 538 74 996 1274 1442 637 1168 565 424 76 625 1569 83 1011 1028 1268 1285 75 803 925 1134 676 1470 142 416 624 197 796 974 649