Existing association rule mining algorithms generate redundant candidate frequent itemsets and repeat computations. This paper proposes a locating-order mining algorithm based on sequence numbers, which is suitable for mining long frequent itemsets. To find long frequent itemsets quickly, the algorithm combines the traditional downward search with a method that locates the order of subsets to generate candidate frequent itemsets. It differs from traditional downward-search mining algorithms in two respects. First, it locates the order of the subsets of non-frequent itemsets during the downward search. Second, it uses the properties of attribute sequence numbers to compute support, so the database is scanned only once. By locating the order of subsets, the algorithm efficiently removes the repeated L-candidate frequent itemsets generated from (L+1)-non-frequent itemsets, which improves its efficiency. Experimental results indicate that the algorithm is suitable for mining long frequent itemsets and that it is faster and more efficient than existing algorithms for mining long frequent itemsets.
Definition 1 Binary Transaction (BT): a transaction T expressed as a binary character string. Example: let $I = \{1, 2, 3, 4, 5, 6\}$ be an itemset; if a transaction is $T_i = \{2, 5, 6\}$, then $BT_i = (010011)$.
Definition 2 Digital Transaction (DT): an integer whose value is the decimal value of the binary transaction. Example: if $BT = (010010)$, then $DT = 18$.
Definition 3 Digital Item (DI): an integer; it is the simplest digital transaction, expressing only a single item attribute. Example: let $I = \{i_1, i_2, \ldots, i_m\}$; then $DI_1 = 1, \ldots, DI_m = 2^{m-1}$.
Definition 4 Digital Transaction Length (DTL): an integer equal to the number of 1s in the binary transaction.
Definition 5 Let $DT_1$ denote the digital transaction of $T_1$ and $DT_2$ the digital transaction of $T_2$. If $T_1 \subseteq T_2$, then $DT_1 \subseteq DT_2$; $DT_1$ is regarded as a subset of $DT_2$, and $DT_2$ as a superset of $DT_1$.
Definition 6 Sequence Number (SN): an ordered group of numbers, in which numbers may be repeated; each number is called a sub-item of the sequence number. Example: let $SN = \{46, 124, 65, 125, 79, 62, 112\}$ be a sequence number; then 124 is a sub-item of the sequence number.
Definition 7 Sub Item Dimension (SID): an integer equal to the number of 1s in the binary code of a sub-item. Example: let 58 be a sub-item; then $SID(58) = SID((111010)_2) = 4$.
Definition 8 Sequence Number Dimension (SND): an integer equal to the sum of the SIDs of the items contained by
978-1-4244-6527-9/10/$26.00 © 2010 IEEE
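
The encodings in Definitions 1, 2, 4, 5 and 7 can be illustrated with a short Python sketch. It is not the paper's implementation: the function names are hypothetical, the bit order follows the example in Definition 1 (item 1 is the leftmost character), and the bitwise AND used for the subset test of Definition 5 is an assumption about how the relation could be checked on integers. Digital items (Definition 3) are left out because the bit position assigned to each item would have to be assumed.

```python
def binary_transaction(transaction, num_items):
    """BT: character i (1-based, left to right) is '1' iff item i is in the transaction."""
    return "".join("1" if i in transaction else "0" for i in range(1, num_items + 1))


def digital_transaction(bt):
    """DT: decimal value of the binary transaction string."""
    return int(bt, 2)


def ones_count(value):
    """Number of 1 bits in an integer: the DTL of a DT, or the SID of a sub-item."""
    return bin(value).count("1")


def is_subset(dt1, dt2):
    """Assumed check for Definition 5: every 1 bit of dt1 is also set in dt2."""
    return dt1 & dt2 == dt1


bt = binary_transaction({2, 5, 6}, 6)   # "010011", as in Definition 1
dt = digital_transaction("010010")      # 18, as in Definition 2
sid = ones_count(58)                    # 4, as in Definition 7
```

Under these assumptions, the transaction {2, 5} (DT = 18) is a subset of {2, 5, 6} (DT = 19), so `is_subset(18, 19)` holds while `is_subset(19, 18)` does not.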