A frequent pattern is a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set. Frequent pattern mining is used to find inherent regularities in data.

Example: Which products are often purchased together? Beer and diapers (fact)!

The usual explanation is that blue-collar workers often buy beer on Friday night after work, and their wives ask them to pick up some diapers on the way.

Frequent Patterns and Association Rules

Itemset

Find all rules X -> Y that satisfy the minimum support and minimum confidence thresholds.

  • Support s: the probability that a transaction contains X ∪ Y (that is, both X and Y). Support indicates how popular the rule is.
  • Confidence c: the conditional probability that a transaction containing X also contains Y.

Example:

Transaction-id   Items bought
10               A, B, D
20               A, C, D
30               A, D, E
40               B, E, F
50               B, C, D, E, F

Frequent Patterns: { A:3, B:3, D:4, E:3, AD:3 }
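The counts above can be checked with a small brute-force sketch. It assumes a minimum support count of 3 (out of 5 transactions), which is inferred from the listed patterns rather than stated in the notes; the names `transactions`, `MIN_COUNT`, and `frequent_itemsets` are illustrative only. Enumerating every itemset is only practical for a toy example like this one.

```python
from itertools import combinations

# Transactions from the example table (transaction id -> items bought).
transactions = {
    10: {"A", "B", "D"},
    20: {"A", "C", "D"},
    30: {"A", "D", "E"},
    40: {"B", "E", "F"},
    50: {"B", "C", "D", "E", "F"},
}

MIN_COUNT = 3  # assumed threshold, inferred from the listed counts


def frequent_itemsets(transactions, min_count):
    """Brute force: count every possible itemset, keep those meeting min_count."""
    items = sorted(set().union(*transactions.values()))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Number of transactions that contain every item of the candidate.
            count = sum(1 for t in transactions.values() if set(candidate) <= t)
            if count >= min_count:
                result["".join(candidate)] = count
    return result


patterns = frequent_itemsets(transactions, MIN_COUNT)
print(patterns)
```

C and F each appear in only two transactions, so they fall below the assumed threshold and are left out.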

We Define

Association rules:

  • A -> D: ( s = 3/5, c = 3/3 ) = ( 60% , 100% )
  • D -> A: ( s = 3/5, c = 3/4 ) = ( 60% , 75% )
  • Therefore, A and D are strongly associated.
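The support and confidence values for these two rules can be reproduced from raw transaction counts. The following is a minimal sketch; the helper names `count` and `rule_stats` are hypothetical and not part of any library.

```python
# Transactions from the example table.
transactions = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "D", "E"},
    {"B", "E", "F"},
    {"B", "C", "D", "E", "F"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def rule_stats(x, y, transactions):
    """Return (support, confidence) for the rule x -> y."""
    both = count(x | y, transactions)
    support = both / len(transactions)          # P(X and Y together)
    confidence = both / count(x, transactions)  # P(Y | X)
    return support, confidence


a_to_d = rule_stats({"A"}, {"D"}, transactions)
d_to_a = rule_stats({"D"}, {"A"}, transactions)
print("A -> D:", a_to_d)
print("D -> A:", d_to_a)
```

Both rules share the same support because support depends only on the combined itemset {A, D}; confidence differs because it is divided by the count of the left-hand side (3 for A, 4 for D).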

The downward closure property of frequent patterns:
Any subset of a frequent itemset must be frequent. For example, since {A, D} is frequent, {A} and {D} must be frequent too.
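The property follows from the fact that an itemset can never occur more often than any of its subsets. A short sketch on the example transactions illustrates this, along with the contrapositive that makes pruning possible; the helper name `count` and the threshold `MIN_COUNT = 3` are assumptions for illustration.

```python
# Transactions from the example table.
transactions = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "D", "E"},
    {"B", "E", "F"},
    {"B", "C", "D", "E", "F"},
]

MIN_COUNT = 3  # assumed minimum support count


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


# A superset never occurs more often than its subsets.
ad = count({"A", "D"})
a, d = count({"A"}), count({"D"})
print(ad, a, d)

# Contrapositive: C is infrequent, so no superset of C can be frequent.
c = count({"C"})
cd = count({"C", "D"})
print(c, cd, cd >= MIN_COUNT)
```

This contrapositive is what lets a miner discard any candidate that contains an infrequent subset without counting it.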

Major scalable mining methods:

  1. Apriori Algorithm (Agrawal & Srikant, 1994)
  2. Frequent pattern growth, FP-growth (Han, Pei & Yin, 2000)
  3. Vertical data format approach (Zaki & Gouda, 2003)
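To give a feel for the first method, here is a simplified level-wise sketch in the spirit of Apriori. It is not the algorithm exactly as published: candidate generation is done by naive pairwise unions, and the function name `apriori` and the threshold of 3 are assumptions chosen to match the example above.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise search: build size-k candidates from frequent (k-1)-itemsets."""
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            # Prune step (downward closure): every (k-1)-subset must be frequent.
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count the surviving candidates against the data.
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_count:
                current[cand] = n
        frequent.update(current)
        k += 1
    return frequent


transactions = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "D", "E"},
    {"B", "E", "F"},
    {"B", "C", "D", "E", "F"},
]
result = apriori(transactions, min_count=3)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), n)
```

On the example data only six 2-item candidates are ever counted, because C and F are dropped at level 1 and never appear in a larger candidate.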