Chapter 12: Discovering New Knowledge - Data Mining


 Loreen McLaughlin
1 Chapter 12: Discovering New Knowledge - Data Mining. Becerra-Fernandez, et al., Knowledge Management 1/e, Prentice Hall. Additional material 2007 Dekai Wu
2 Chapter Objectives
- Introduce the student to the concept of Data Mining (DM), also known as Knowledge Discovery in Databases (KDD).
  - How it is different from knowledge elicitation from experts
  - How it is different from extracting existing knowledge from databases
- The objectives of data mining
  - Explanation of past events (descriptive DM)
  - Prediction of future events (predictive DM)
(continued)
3 Chapter Objectives (cont.)
- Introduce the student to the different classes of statistical methods available for DM
  - Classical statistics (e.g., regression, curve fitting)
  - Induction of symbolic rules
  - Neural networks (a.k.a. connectionist models)
- Introduce the student to the details of some of the methods described in the chapter.
4 Historical Perspective
DM, a.k.a. KDD, arose at the intersection of three independently evolved research directions:
- Classical statistics and statistical pattern recognition
- Machine learning (from symbolic AI)
- Neural networks
5 Objectives of Data Mining
- Descriptive DM seeks patterns in past actions or activities in order to affect those actions or activities, e.g., seeking patterns indicative of fraud in past records
- Predictive DM looks at past history to predict future behavior
- Classification assigns a new instance to one of a set of discrete predefined categories
- Clustering groups the items in the data set into different categories
- Affinity (or association) finds items that are closely associated in the data set
6 Classical statistics & statistical pattern recognition
Provide a survey of the most important statistical methods for data mining:
- Curve fitting with the least squares method
- Multivariate correlation
- K-means clustering
- Market basket analysis
- Discriminant analysis
- Logistic regression
7 Figure: 2-D input data plotted on a graph (axes x and y)
8 Figure: data and deviations from the best-fitting equation (axes x and y)
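The least squares idea behind the two figures above can be sketched in a few lines of Python. This is a minimal illustration for a straight line only; the sample points and the function names `fit_line` and `sum_squared_deviations` are hypothetical and do not come from the chapter.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution that minimizes the sum of squared vertical deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_deviations(xs, ys, slope, intercept):
    """Sum of squared deviations between the data and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample points (assumed for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
best = sum_squared_deviations(xs, ys, slope, intercept)
# Any other line, such as this slightly tilted one, has larger total deviation.
worse = sum_squared_deviations(xs, ys, slope + 0.1, intercept)
print(f"y = {slope:.3f} * x + {intercept:.3f}")
print(f"squared deviations: best fit {best:.4f}, perturbed line {worse:.4f}")
```

The "best fitting equation" is simply the line whose squared deviations from the data are smallest; perturbing the slope or the intercept can only increase that sum.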
9 Induction of symbolic rules
- Present a detailed description of the symbolic approach to data mining: rule induction by learning decision trees
- Present the main algorithm for rule induction
  - C5.0 and its ancestors, ID3 and CLS (from machine learning)
  - CART (Classification And Regression Trees) and CHAID, very similar algorithms for rule induction (independently developed in statistics)
- Present several example applications of rule induction
10 Table 12.1 decision tables (if ordered, then decision lists)
Name            Outlook   Temperature   Humidity   Class
Data sample 1   Sunny     Mild          Dry        Enjoyable
Data sample 2   Cloudy    Cold          Humid      Not Enjoyable
Data sample 3   Rainy     Mild          Humid      Not Enjoyable
Data sample 4   Sunny     Hot           Humid      Not Enjoyable
Note: DS = Data Sample
11 Figure 12.1 decision trees (a.k.a. classification trees)
Is the stock's price/earnings ratio > 5? (root node)
  No:  Don't buy (leaf node)
  Yes: Has the company's quarterly profit increased over the last year by 10% or more?
    No:  Don't buy
    Yes: Is the company's management stable?
      Yes: Buy
      No:  Don't buy
12 Induction trees
- An induction tree is a decision tree holding the data samples (of the training set)
- Built progressively by gradually segregating the data samples
13 Figure 12.2 simple induction tree (step 1)
{DS1, DS2, DS3, DS4}
Outlook
  = Cloudy: DS2 - Not Enjoyable
  = Rainy:  DS3 - Not Enjoyable
  = Sunny:  DS1 - Enjoyable, DS4 - Not Enjoyable
14 Figure 12.3 simple induction tree (step 2)
{DS1, DS2, DS3, DS4}
Outlook
  = Cloudy: DS2 - Not Enjoyable
  = Rainy:  DS3 - Not Enjoyable
  = Sunny:  DS1 - Enjoyable, DS4 - Not Enjoyable
    Temperature
      = Cold: None
      = Mild: DS1 - Enjoyable
      = Hot:  DS4 - Not Enjoyable
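The progressive segregation shown in the two induction-tree figures can be sketched as a short recursive procedure. This is a minimal sketch, assuming the attributes are tried in a fixed, caller-supplied order (Outlook, then Temperature, then Humidity); the function name `build_tree` and the dictionary layout are illustrative choices, not part of the chapter. The data are the four samples of Table 12.1.

```python
from collections import defaultdict

# Training set from Table 12.1: (name, attribute values, class).
SAMPLES = [
    ("DS1", {"Outlook": "Sunny", "Temperature": "Mild", "Humidity": "Dry"}, "Enjoyable"),
    ("DS2", {"Outlook": "Cloudy", "Temperature": "Cold", "Humidity": "Humid"}, "Not Enjoyable"),
    ("DS3", {"Outlook": "Rainy", "Temperature": "Mild", "Humidity": "Humid"}, "Not Enjoyable"),
    ("DS4", {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "Humid"}, "Not Enjoyable"),
]


def build_tree(samples, attributes):
    """Segregate samples attribute by attribute until every node holds one class."""
    classes = {cls for _, _, cls in samples}
    if len(classes) == 1 or not attributes:
        # Leaf: all samples agree on the class (or no attributes are left).
        return {"samples": [name for name, _, _ in samples], "classes": sorted(classes)}
    attribute, remaining = attributes[0], attributes[1:]
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[1][attribute]].append(sample)
    return {
        "split_on": attribute,
        "branches": {value: build_tree(group, remaining) for value, group in groups.items()},
    }


def show(node, indent=0):
    """Print the tree with one level of indentation per split."""
    pad = " " * indent
    if "split_on" not in node:
        print(f"{pad}{', '.join(node['samples'])}: {', '.join(node['classes'])}")
        return
    print(f"{pad}{node['split_on']}")
    for value, child in node["branches"].items():
        print(f"{pad}  = {value}")
        show(child, indent + 4)


tree = build_tree(SAMPLES, ["Outlook", "Temperature", "Humidity"])
show(tree)
```

Unlike the figure, this sketch only creates branches for attribute values that actually occur among the samples reaching a node, so the empty "Cold" branch under Sunny does not appear.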
15 Writing the induced tree as rules
Rule 1. If the Outlook is cloudy, then the Weather is not enjoyable.
Rule 2. If the Outlook is rainy, then the Weather is not enjoyable.
Rule 3. If the Outlook is sunny and Temperature is mild, then the Weather is enjoyable.
Rule 4. If the Outlook is sunny and Temperature is hot, then the Weather is not enjoyable.
16 Learning decision trees for classification into multiple classes
- In the previous example, we were learning a function to predict a boolean (enjoyable = true/false) output.
- The same approach can be generalized to learn a function that predicts a class (when there are multiple predefined classes/categories).
- For example, suppose we are attempting to select a knowledge-based system (KBS) shell for some application
  - with the following as our options: ThoughtGen, OffSite, Genie, SilverWorks, XS, MilliExpert
  - using the following attributes and ranges of values:
    - Development language: { Java, C++, Lisp }
    - Reasoning method: { forward, backward }
    - External interfaces: { dbase, SpreadsheetXL, ASCII file, devices }
    - Cost: any positive number
    - Memory: any positive number
17 Table 12.2 collection of data samples (training set) described as vectors of attributes (feature vectors)
Language   Reasoning method   Interface method   Cost   Memory   Classification
Java       Backward           SpreadsheetXL             MB       MilliExpert
Java       Backward           ASCII                     MB       MilliExpert
Java       Backward           dbase                     MB       ThoughtGen
Java       *                  Devices                   MB       OffSite
C++        Forward            *                         MB       Genie
Lisp       Forward            *                         GB       SilverWorks
C++        Backward           *                         MB       XS
Lisp       Backward           *                         MB       XS
18 Figure 12.4 decision tree resulting from selection of the language attribute
Language
  Java: MilliExpert, ThoughtGen, OffSite
  C++:  Genie, XS
  Lisp: SilverWorks, XS
19 Figure 12.5 decision tree resulting from addition of the reasoning method attribute
Language
  Java: MilliExpert, ThoughtGen, OffSite
    Backward: MilliExpert, ThoughtGen, OffSite
    Forward:  OffSite
  C++: Genie, XS
    Backward: XS
    Forward:  Genie
  Lisp: SilverWorks, XS
    Backward: XS
    Forward:  SilverWorks
20 Figure 12.6 final decision tree
Language
  Java: MilliExpert, ThoughtGen, OffSite
    Backward: MilliExpert, ThoughtGen, OffSite
      ASCII:         MilliExpert
      Devices:       OffSite
      SpreadsheetXL: MilliExpert
      dbase:         ThoughtGen
    Forward:  OffSite
  C++: Genie, XS
    Backward: XS
    Forward:  Genie
  Lisp: SilverWorks, XS
    Backward: XS
    Forward:  SilverWorks
21 Order of choosing attributes
Note that the decision tree that is built depends greatly on which attributes you choose first.
25 Figure 12.7
{DS1, DS2, DS3, DS4}
Humidity
  = Humid: DS2 - Not Enjoyable, DS3 - Not Enjoyable, DS4 - Not Enjoyable
  = Dry:   DS1 - Enjoyable
26 Order of choosing attributes (cont.)
- One sensible objective is to seek the minimal tree, i.e., the smallest tree required to classify all training set samples correctly
  - Occam's Razor principle: the simplest explanation is the best
- In what order should you choose attributes so as to obtain the minimal tree?
  - Finding the optimal order is often too complex to be feasible
  - Heuristics are used instead
  - Information gain, computed using information-theoretic quantities, is a heuristic that works well in practice (it is the splitting criterion used by ID3)
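To make the information gain heuristic concrete, the sketch below computes it for each attribute of Table 12.1 using the standard entropy-based definition (entropy of the class labels minus the weighted entropy after the split). The function names `entropy` and `information_gain` and the tuple layout are illustrative assumptions, not taken from the chapter.

```python
from collections import Counter
from math import log2

# Table 12.1 as tuples: (Outlook, Temperature, Humidity, Class).
SAMPLES = [
    ("Sunny", "Mild", "Dry", "Enjoyable"),
    ("Cloudy", "Cold", "Humid", "Not Enjoyable"),
    ("Rainy", "Mild", "Humid", "Not Enjoyable"),
    ("Sunny", "Hot", "Humid", "Not Enjoyable"),
]
ATTRIBUTES = {"Outlook": 0, "Temperature": 1, "Humidity": 2}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(samples, index):
    """Entropy of the class labels minus the weighted entropy after splitting on one attribute."""
    labels = [s[-1] for s in samples]
    remainder = 0.0
    for value in {s[index] for s in samples}:
        subset = [s[-1] for s in samples if s[index] == value]
        remainder += len(subset) / len(samples) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(SAMPLES, index) for name, index in ATTRIBUTES.items()}
best = max(gains, key=gains.get)
for name, gain in gains.items():
    print(f"{name}: {gain:.4f} bits")
print("Split first on:", best)
```

On this training set Humidity has the highest gain, because splitting on it separates the classes completely, which is why the one-level tree of Figure 12.7 is smaller than the Outlook-first tree of Figure 12.3.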
27 Artificial Neural Networks
- Provide a detailed description of the connectionist approach to data mining: neural networks
- Present the basic neural network architecture: the multi-layer feed-forward neural network
- Present the main supervised learning algorithm: backpropagation
- Present the main unsupervised neural network architecture: the Kohonen network
28 Figure 12.8 simple model of a neuron: inputs x1, x2, ..., xk are multiplied by weights W1, W2, ..., Wn, summed (Σ), and passed through an activation function f() to produce the output y
29 Figure 12.9 three common activation functions: threshold function, piecewise linear function, sigmoid function
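The neuron model and the three activation functions can be sketched as follows. The exact shapes are assumptions, since the figures do not survive in this transcription: the threshold is taken to switch at 0, the piecewise linear function to rise with unit slope between -0.5 and 0.5, and the sigmoid to be the logistic function. The input and weight values are hypothetical.

```python
from math import exp


def threshold(v):
    """Outputs 1 once the summed input reaches 0 (assumed switching point)."""
    return 1.0 if v >= 0 else 0.0


def piecewise_linear(v):
    """Linear between -0.5 and 0.5 (assumed range), clipped to [0, 1] outside it."""
    return max(0.0, min(1.0, v + 0.5))


def sigmoid(v):
    """Logistic function: a smooth, S-shaped curve between 0 and 1."""
    return 1.0 / (1.0 + exp(-v))


def neuron(inputs, weights, activation):
    """Weighted sum of the inputs followed by the activation function f()."""
    v = sum(x * w for x, w in zip(inputs, weights))
    return activation(v)


# Hypothetical inputs and weights (assumed for illustration only).
inputs = [1.0, 0.5]
weights = [0.4, -0.2]
outputs = {f.__name__: neuron(inputs, weights, f) for f in (threshold, piecewise_linear, sigmoid)}
for name, y in outputs.items():
    print(f"{name}: {y:.4f}")
```

All three functions map the same weighted sum to a value between 0 and 1; they differ only in how abruptly the output changes around the switching point.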
30 Figure: simple single-layer neural network (inputs, outputs)
31 Figure: two-layer neural network
32 Supervised Learning: Back Propagation
An iterative learning algorithm with three phases:
1. Presentation of the examples (input patterns with outputs) and feed-forward execution of the network
2. Calculation of the associated errors, by comparing the output of the previous step with the expected output, and back propagation of this error
3. Adjustment of the weights
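The three phases above can be sketched for a very small network. This is a minimal illustration under stated assumptions, not the chapter's implementation: two inputs, one hidden layer of two sigmoid units, one sigmoid output, squared error, and per-example weight updates. The learning rate, the number of epochs, the random seed, and the logical-OR training data are all hypothetical choices.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(inputs, w_hidden, w_out):
    """Feed-forward pass; a constant 1.0 is appended to each layer's input as a bias term."""
    x = list(inputs) + [1.0]
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    hb = hidden + [1.0]
    output = sigmoid(sum(w * hi for w, hi in zip(w_out, hb)))
    return x, hb, output


def train(data, n_hidden=2, epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    errors = []
    for _ in range(epochs):
        epoch_error = 0.0
        for inputs, target in data:
            # Phase 1: present the example and execute the network feed-forward.
            x, hb, y = forward(inputs, w_hidden, w_out)
            # Phase 2: compare with the expected output and propagate the error back.
            error = target - y
            epoch_error += error ** 2
            delta_out = error * y * (1.0 - y)
            delta_hidden = [delta_out * w_out[j] * hb[j] * (1.0 - hb[j]) for j in range(n_hidden)]
            # Phase 3: adjust the weights in proportion to their share of the error.
            for j in range(n_hidden + 1):
                w_out[j] += rate * delta_out * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] += rate * delta_hidden[j] * x[i]
        errors.append(epoch_error)
    return w_hidden, w_out, errors


# Hypothetical training set: the logical OR of two binary inputs.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w_hidden, w_out, errors = train(DATA)
print(f"squared error: first epoch {errors[0]:.4f}, last epoch {errors[-1]:.4f}")
for inputs, target in DATA:
    print(inputs, "target", target, "output", round(forward(inputs, w_hidden, w_out)[2], 3))
```

The hidden-layer deltas are computed before the output weights are changed, so the error is propagated back through the same weights that produced the output.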
33 Unsupervised Learning: Kohonen Networks
- Clustering by an iterative competitive algorithm
- Note relation to case-based reasoning (CBR)
34 Figure: clusters of related data in 2-D space (axes: Variable A, Variable B; Cluster #1 and Cluster #2)
35 Figure: Kohonen self-organizing map (inputs connected to the map units through weights Wi)
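The iterative competitive step at the heart of a Kohonen network can be sketched as below. This is a deliberately simplified sketch: it implements plain winner-take-all competitive learning with two units and omits the neighborhood update that a full self-organizing map applies to the units near the winner. The sample points, initial weights, learning rate, and function names are hypothetical.

```python
def nearest_unit(weights, sample):
    """Index of the unit whose weight vector is closest to the sample (the 'winner')."""
    distances = [sum((w - s) ** 2 for w, s in zip(unit, sample)) for unit in weights]
    return distances.index(min(distances))


def train_competitive(samples, weights, rate=0.3, epochs=20):
    """Repeatedly move the winning unit's weights a fraction of the way toward each sample."""
    weights = [list(unit) for unit in weights]
    for _ in range(epochs):
        for sample in samples:
            winner = nearest_unit(weights, sample)
            weights[winner] = [w + rate * (s - w) for w, s in zip(weights[winner], sample)]
    return weights


# Hypothetical 2-D data forming two groups (assumed for illustration only).
samples = [(0.1, 0.1), (0.9, 0.9), (0.2, 0.1), (0.8, 0.9), (0.1, 0.2), (0.9, 0.8)]
initial_weights = [(0.4, 0.4), (0.6, 0.6)]

weights = train_competitive(samples, initial_weights)
clusters = [nearest_unit(weights, s) for s in samples]
print("unit weights:", [[round(w, 2) for w in unit] for unit in weights])
print("cluster of each sample:", clusters)
```

No class labels are used: each unit drifts toward the group of samples it wins, and the resulting weight vectors act as cluster centers.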
36 When to use what
Provide useful guidelines for determining which technique to use for specific problems
37 Table 12.3
Goal: Find the linear combination of predictors that best separates the population
  Input variables (predictors): Continuous
  Output variables (outcomes): Discrete
  Statistical technique: Discriminant Analysis
  Examples [SPSS, 2000]:
    - Predict instances of fraud
    - Predict whether customers will remain or leave (churners or not)
    - Predict which customers will respond to a new product or offer
    - Predict outcomes of various medical procedures
Goal: Predict the probability of the outcome being in a particular category
  Input variables (predictors): Continuous
  Output variables (outcomes): Discrete
  Statistical technique: Logistic and Multinomial Regression
  Examples [SPSS, 2000]:
    - Predicting insurance policy renewal
    - Predicting fraud
    - Predicting which product a customer will buy
    - Predicting that a product is likely to fail
38 Table 12.3 (cont.)
Goal: Output is a linear combination of input variables
  Input variables (predictors): Continuous
  Output variables (outcomes): Continuous
  Statistical technique: Linear Regression
  Examples [SPSS, 2000]:
    - Predict expected revenue in dollars from a new customer
    - Predict sales revenue for a store
    - Predict waiting time on hold for callers to an 800 number
    - Predict length of stay in a hospital based on patient characteristics and medical condition
Goal: For experiments and repeated measures of the same sample
  Input variables (predictors): Most inputs must be Discrete
  Output variables (outcomes): Continuous
  Statistical technique: Analysis of Variance (ANOVA)
  Examples [SPSS, 2000]:
    - Predict which environmental factors are likely to cause cancer
Goal: To predict future events whose history has been collected at regular intervals
  Input variables (predictors): Continuous
  Output variables (outcomes): Continuous
  Statistical technique: Time Series Analysis
  Examples [SPSS, 2000]:
    - Predict future sales data from past sales records
39 Table 12.4
Goal: Predict outcome based on values of nearest neighbors
  Input (predictor) variables: Continuous, Discrete, and Text
  Output (outcome) variables: Continuous or Discrete
  Statistical technique: Memory-based Reasoning (MBR)
  Examples [SPSS, 2000]:
    - Predicting medical outcomes
Goal: Predict by splitting data into subgroups (branches)
  Input (predictor) variables: Continuous or Discrete (different techniques used based on data characteristics)
  Output (outcome) variables: Continuous or Discrete (different techniques used based on data characteristics)
  Statistical technique: Decision Trees
  Examples [SPSS, 2000]:
    - Predicting which customers will leave
    - Predicting instances of fraud
Goal: Predict outcome in complex nonlinear environments
  Input (predictor) variables: Continuous or Discrete
  Output (outcome) variables: Continuous or Discrete
  Statistical technique: Neural Networks
  Examples [SPSS, 2000]:
    - Predicting expected revenue
    - Predicting credit risk
40 Table 12.5
Goal: Predict by splitting data into more than two subgroups (branches)
  Input (predictor) variables: Continuous, Discrete, or Ordinal
  Output (outcome) variables: Discrete
  Statistical technique: Chi-square Automatic Interaction Detection (CHAID)
  Examples [SPSS, 2000]:
    - Predict which demographic combinations of predictors yield the highest probability of a sale
    - Predict which factors are causing product defects in manufacturing
Goal: Predict by splitting data into more than two subgroups (branches)
  Input (predictor) variables: Continuous
  Output (outcome) variables: Discrete
  Statistical technique: C5.0
  Examples [SPSS, 2000]:
    - Predict which loan customers are considered a good risk
    - Predict which factors are associated with a country's investment risk
41 Table 12.5 (cont.)
Goal: Predict by splitting data into binary subgroups (branches)
  Input (predictor) variables: Continuous
  Output (outcome) variables: Continuous
  Statistical technique: Classification and Regression Trees (CART)
Goal: Predict by splitting data into binary subgroups (branches)
  Input (predictor) variables: Continuous
  Output (outcome) variables: Discrete
  Statistical technique: Quick, Unbiased, Efficient, Statistical Tree (QUEST)
Examples [SPSS, 2000]:
  - Predict which factors are associated with a country's competitiveness
  - Discover which variables are predictors of increased customer profitability
  - Predict who needs additional care after heart surgery
42 Table 12.6
Goal: Find large groups of cases in large data files that are similar on a small set of input characteristics
  Input variables (predictor): Continuous or Discrete
  Output variables (outcome): No outcome variable
  Statistical technique: K-means Cluster Analysis
  Examples [SPSS, 2000]:
    - Customer segments for marketing
    - Groups of similar insurance claims
Goal: To create large cluster memberships
  Statistical technique: Kohonen Neural Networks
  Examples [SPSS, 2000]:
    - Cluster customers into segments based on demographics and buying patterns
Goal: Create small set associations and look for patterns between many categories
  Input variables (predictor): Logical
  Output variables (outcome): No outcome variable
  Statistical technique: Market Basket or Association Analysis with Apriori
  Examples [SPSS, 2000]:
    - Identify which products are likely to be purchased together
    - Identify which courses students are likely to take together
43 Errors and their significance in DM
- Discuss the importance of errors in data mining studies
- Define the types of errors possible in data mining studies
44 Table 12.7 Confusion Matrix: Heart Disease Diagnostic
                             Predicted No Disease   Predicted Presence of Disease
Actual No Disease            118 (72%)              46 (28%)
Actual Presence of Disease   43 (30.9%)             96 (69.1%)
45 Table 12.7 Confusion Matrix: false negatives
False negatives are the cases with Actual Presence of Disease that were Predicted No Disease: 43 (30.9%)
46 Table 12.7 Confusion Matrix: false positives
False positives are the cases with Actual No Disease that were Predicted Presence of Disease: 46 (28%)
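The percentages in Table 12.7 are row percentages, which a short calculation can reproduce from the four counts. The variable names below are illustrative, and the overall accuracy figure is an additional quantity derived from the same counts rather than a number stated in the table.

```python
# Counts from Table 12.7 (rows = actual condition, columns = predicted condition).
true_negatives = 118   # actual no disease, predicted no disease
false_positives = 46   # actual no disease, predicted presence of disease
false_negatives = 43   # actual presence of disease, predicted no disease
true_positives = 96    # actual presence of disease, predicted presence of disease

actual_negatives = true_negatives + false_positives
actual_positives = false_negatives + true_positives
total = actual_negatives + actual_positives

# Each rate is a share of its own row, which is how the table's percentages are formed.
true_negative_rate = true_negatives / actual_negatives
false_positive_rate = false_positives / actual_negatives
false_negative_rate = false_negatives / actual_positives
true_positive_rate = true_positives / actual_positives
accuracy = (true_negatives + true_positives) / total

print(f"true negative rate:  {true_negative_rate:.1%}")
print(f"false positive rate: {false_positive_rate:.1%}")
print(f"false negative rate: {false_negative_rate:.1%}")
print(f"true positive rate:  {true_positive_rate:.1%}")
print(f"overall accuracy:    {accuracy:.1%}")
```

Which error type matters more depends on the application: in a diagnostic setting a false negative (a missed case of disease) is usually the costlier mistake.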
47 Conclusions
You should know when to use:
- Curve-fitting algorithms.
- Statistical methods for clustering.
- The C5.0 algorithm to capture rules from examples.
- Basic feed-forward neural networks with supervised learning.
- Unsupervised learning, clustering techniques and the Kohonen networks.
- Other statistical techniques.
More informationData Mining Techniques and its Applications in Banking Sector
Data Mining Techniques and its Applications in Banking Sector Dr. K. Chitra 1, B. Subashini 2 1 Assistant Professor, Department of Computer Science, Government Arts College, Melur, Madurai. 2 Assistant
More information1. Classification problems
Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification
More informationSTATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and
Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Webbased Analytics Table
More informationPredictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD
Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,
More information203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
More informationData quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
More informationMachine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer
Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next
More informationMachine Learning with MATLAB David Willingham Application Engineer
Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the
More informationWelcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA
Welcome Xindong Wu Data Mining: Updates in Technologies Dept of Math and Computer Science Colorado School of Mines Golden, Colorado 80401, USA Email: xwu@ mines.edu Home Page: http://kais.mines.edu/~xwu/
More informationFoundations of Artificial Intelligence. Introduction to Data Mining
Foundations of Artificial Intelligence Introduction to Data Mining Objectives Data Mining Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees Present
More informationMore Data Mining with Weka
More Data Mining with Weka Class 5 Lesson 1 Simple neural networks Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz Lesson 5.1: Simple neural networks Class
More informationAnalytics on Big Data
Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis
More informationData Mining Jargon. Bob Muenchen The Statistical Consulting Center
Data Mining Jargon Bob Muenchen The Statistical Consulting Center Data mining is the automated search for useful patterns in data. It uses tools from many different disciplines, each of which uses its
More informationData Mining. SPSS Clementine 12.0. 1. Clementine Overview. Spring 2010 Instructor: Dr. Masoud Yaghini. Clementine
Data Mining SPSS 12.0 1. Overview Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Types of Models Interface Projects References Outline Introduction Introduction Three of the common data mining
More informationA Basic Guide to Modeling Techniques for All Direct Marketing Challenges
A Basic Guide to Modeling Techniques for All Direct Marketing Challenges Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview
More informationKeywords data mining, prediction techniques, decision making.
Volume 5, Issue 4, April 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analysis of Datamining
More informationLecture 6  Data Mining Processes
Lecture 6  Data Mining Processes Dr. Songsri Tangsripairoj Dr.Benjarath Pupacdi Faculty of ICT, Mahidol University 1 CrossIndustry Standard Process for Data Mining (CRISPDM) Example Application: Telephone
More informationApplication of Predictive Analytics for Better Alignment of Business and IT
Application of Predictive Analytics for Better Alignment of Business and IT Boris Zibitsker, PhD bzibitsker@beznext.com July 25, 2014 Big Data Summit  Riga, Latvia About the Presenter Boris Zibitsker
More informationNeural network models: Foundations and applications to an audit decision problem
Annals of Operations Research 75(1997)291 301 291 Neural network models: Foundations and applications to an audit decision problem Rebecca C. Wu Department of Accounting, College of Management, National
More informationIntroduction to Artificial Intelligence G51IAI. An Introduction to Data Mining
Introduction to Artificial Intelligence G51IAI An Introduction to Data Mining Learning Objectives Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees
More informationArtificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing Email Classifier
International Journal of Recent Technology and Engineering (IJRTE) ISSN: 22773878, Volume1, Issue6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing
More informationANALYTICS IN BIG DATA ERA
ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyr i g ht 2012, SAS Ins titut
More informationData Mining: Overview. What is Data Mining?
Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,
More informationMedical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu
Medical Information Management & Mining You Chen Jan,15, 2013 You.chen@vanderbilt.edu 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation:  Feature vector X,  qualitative response Y, taking values in C
More informationDecision Trees. Andrew W. Moore Professor School of Computer Science Carnegie Mellon University. www.cs.cmu.edu/~awm awm@cs.cmu.
Decision Trees Andrew W. Moore Professor School of Computer Science Carnegie Mellon University www.cs.cmu.edu/~awm awm@cs.cmu.edu 422687599 Copyright Andrew W. Moore Slide Decision Trees Decision trees
More informationEMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract One
More informationCOLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics
ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE: COSSTAT747 Principles of Statistical Data Mining
More informationEvent driven trading new studies on innovative way. of trading in Forex market. Michał Osmoła INIME live 23 February 2016
Event driven trading new studies on innovative way of trading in Forex market Michał Osmoła INIME live 23 February 2016 Forex market From Wikipedia: The foreign exchange market (Forex, FX, or currency
More informationVTT INFORMATION TECHNOLOGY. Overview of Data Mining for Customer Behavior Modeling
VTT INFORMATION TECHNOLOGY RESEARCH REPORT TTE1200118 LOUHI Overview of Data Mining for Customer Behavior Modeling Version 1 29 June, 2001 by Catherine Bounsaythip and Esa RintaRunsala Version history
More information