Data Mining
Data Mining is also referred to as “Machine Learning”.
Data Mining is divided into two subcategories:
1. Unsupervised Learning
2. Supervised Learning
Unsupervised Technique:
If the output (Y) is not known, we go for an unsupervised technique.
A few unsupervised data mining techniques are:
• Association Rules
• Recommendation system
• Clustering
• Dimension Reduction Techniques
• Network Analysis
Association Rules:
Association Rules are also known as Market Basket Analysis & Affinity Analysis.
“IF” part = Antecedent = A
“THEN” part = Consequent = C
Example: in the rule “IF bread, THEN milk”, bread is the antecedent (A) and milk is the consequent (C).
Apriori Algorithm:
• Set a minimum support criterion
• Generate the list of one-item sets that meet the support criterion
• Use the list of one-item sets to generate the list of two-item sets that meet the support criterion
• Use the list of two-item sets to generate the list of three-item sets that meet the support criterion
• Continue up through k-item sets
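The steps above can be illustrated with a minimal Python sketch (a hypothetical, unoptimized illustration; it assumes each transaction is a set of items and that min_support is given as a fraction of transactions):

```python
def apriori(transactions, min_support):
    """Return all frequent item sets and their support (fraction of transactions)."""
    n = len(transactions)
    support = lambda s: sum(s <= t for t in transactions) / n
    # one-item sets that meet the support criterion
    current = {frozenset([item]) for t in transactions for item in t}
    current = {s for s in current if support(s) >= min_support}
    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        # use the k-item sets to generate candidate (k+1)-item sets
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        current = {c for c in candidates if support(c) >= min_support}
        k += 1
    return frequent

# Hypothetical shopping baskets
baskets = [frozenset(b) for b in [{"bread", "milk"},
                                  {"bread", "milk", "eggs"},
                                  {"bread"},
                                  {"eggs"}]]
print(apriori(baskets, min_support=0.5))
# e.g. {bread}: 0.75, {milk}: 0.5, {eggs}: 0.5, {bread, milk}: 0.5
```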
a. Support:
• Consider only combinations that occur with high frequency in the database
• Support is the criterion based on this frequency
Formula:
Percentage (number) of transactions in which both the IF (antecedent) and THEN (consequent) item sets appear in the data
Mathematically:
Support = (# transactions in which A & C appear together) / (total # of transactions)
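A small worked example, reusing the hypothetical baskets from the Apriori sketch and the rule “IF bread, THEN milk”:

```python
# Support(bread -> milk) = (# transactions with both bread and milk) / (total # of transactions)
baskets = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread"}, {"eggs"}]
both = sum({"bread", "milk"} <= b for b in baskets)  # A & C appear together in 2 baskets
support = both / len(baskets)                        # 2 / 4 = 0.5
print(support)
```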
b. Confidence:
Formula: percentage of IF (antecedent) transactions that also contain the THEN (consequent) item set
Mathematically:
Confidence = P(Consequent | Antecedent) = P(C & A) / P(A)
= (# transactions in which A & C appear together) / (# transactions with A)
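Continuing the same hypothetical baskets, the confidence of “IF bread, THEN milk”:

```python
# Confidence(bread -> milk) = (# transactions with bread and milk) / (# transactions with bread)
baskets = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread"}, {"eggs"}]
with_a = sum("bread" in b for b in baskets)                  # 3 transactions contain the antecedent
with_a_and_c = sum({"bread", "milk"} <= b for b in baskets)  # 2 contain antecedent and consequent
confidence = with_a_and_c / with_a                           # 2 / 3 ≈ 0.67
print(confidence)
```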
Confidence – Weakness:
If the antecedent and consequent each have high support on their own:
High Support => High / Biased Confidence (the rule looks strong even when there is no real association between the items)
c. Lift Ratio:
Lift = Confidence / Benchmark confidence
The benchmark assumes independence between the antecedent and consequent, so that P(C & A) = P(C) × P(A):
Benchmark confidence:
P(C | A) = P(C & A) / P(A) = P(C) × P(A) / P(A) = P(C)
= (# transactions with the consequent item set) / (# transactions in the database)
Interpreting Lift:
Lift > 1 indicates a rule that is useful in finding consequent item sets (the antecedent raises the probability of the consequent above its standalone rate)
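Using the same hypothetical baskets, the lift of “IF bread, THEN milk” comes out above 1, i.e. a useful rule:

```python
# Lift(bread -> milk) = Confidence(bread -> milk) / Support(milk)
baskets = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread"}, {"eggs"}]
confidence = sum({"bread", "milk"} <= b for b in baskets) / sum("bread" in b for b in baskets)  # 2/3
benchmark = sum("milk" in b for b in baskets) / len(baskets)   # P(C) = 2/4 = 0.5
lift = confidence / benchmark                                  # (2/3) / 0.5 ≈ 1.33 > 1
print(lift)
```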