PMID- 33817031
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20210405
IS  - 2376-5992 (Electronic)
IS  - 2376-5992 (Linking)
VI  - 7
DP  - 2021
TI  - TKFIM: Top-K frequent itemset mining technique based on equivalence classes.
PG  - e385
LID - 10.7717/peerj-cs.385 [doi]
LID - e385
AB  - Frequent itemset mining is a significant subject of data mining studies. In 
      the last ten years, the quantity of data has grown exponentially, which imposes 
      new challenges for mining frequent itemsets (FIs). Recent algorithms, both 
      threshold-based and size-based, may yield misconceived information. The 
      threshold value plays a central role in 
      generating frequent itemsets from the given dataset. Selecting a support 
      threshold value is very complicated for those unaware of the dataset's 
      characteristics. The performance of algorithms for finding FIs without the 
      support threshold is, however, deficient due to heavy computation. Therefore, we 
      have proposed a method to discover FIs without the support threshold, called 
      Top-k frequent itemsets mining (TKFIM). It uses equivalence-class and set-theory 
      concepts for mining FIs. The proposed procedure does not miss any FIs; thus, 
      accurate frequent patterns are mined. Furthermore, the results are compared with 
      state-of-the-art techniques such as Top-k miner and Build Once and Mine Once 
      (BOMO). The proposed TKFIM outperforms these approaches in terms of execution 
      and performance, achieving gains of 92.70, 35.87, 28.53, and 81.27 percent over 
      Top-k miner on the Chess, Mushroom, Connect, and T10I4D100K datasets, 
      respectively. Similarly, it achieves performance gains of 97.14, 100, 78.10, 
      and 99.70 percent over BOMO on the Chess, Mushroom, Connect, and T10I4D100K 
      datasets, respectively. Therefore, it is argued that the proposed procedure may 
      be adopted on large datasets for better performance.
CI  - (c)2021 Iqbal et al.
FAU - Iqbal, Saood
AU  - Iqbal S
AD  - Institute of Computing, Kohat University of Science & Technology, Kohat, Kohat, 
      KPK, Pakistan.
FAU - Shahid, Abdul
AU  - Shahid A
AD  - Institute of Computing, Kohat University of Science & Technology, Kohat, Kohat, 
      KPK, Pakistan.
FAU - Roman, Muhammad
AU  - Roman M
AD  - Institute of Computing, Kohat University of Science & Technology, Kohat, Kohat, 
      KPK, Pakistan.
FAU - Khan, Zahid
AU  - Khan Z
AD  - Robotics and Internet of Things Lab, Prince Sultan University, Riyadh, Saudi 
      Arabia.
FAU - Al-Otaibi, Shaha
AU  - Al-Otaibi S
AD  - Information Systems Department, College of Computer and Information Sciences, 
      Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.
FAU - Yu, Lisu
AU  - Yu L
AD  - School of Information Engineering, Nanchang University, Jiangxi, China.
AD  - State Key Laboratory of Computer Architecture, Institute of Computing Technology, 
      Chinese Academy of Sciences, Beijing, China.
LA  - eng
PT  - Journal Article
DEP - 20210308
PL  - United States
TA  - PeerJ Comput Sci
JT  - PeerJ. Computer science
JID - 101660598
PMC - PMC7959650
OTO - NOTNLM
OT  - Algorithm Analysis
OT  - Artificial Intelligence
OT  - Frequent Itemsets
OT  - Support Threshold
OT  - Top-k Frequent Itemsets
COIS- The authors declare there are no competing interests.
EDAT- 2021/04/06 06:00
MHDA- 2021/04/06 06:01
PMCR- 2021/03/08
CRDT- 2021/04/05 06:13
PHST- 2020/12/16 00:00 [received]
PHST- 2021/01/16 00:00 [accepted]
PHST- 2021/04/05 06:13 [entrez]
PHST- 2021/04/06 06:00 [pubmed]
PHST- 2021/04/06 06:01 [medline]
PHST- 2021/03/08 00:00 [pmc-release]
AID - cs-385 [pii]
AID - 10.7717/peerj-cs.385 [doi]
PST - epublish
SO  - PeerJ Comput Sci. 2021 Mar 8;7:e385. doi: 10.7717/peerj-cs.385. eCollection 2021.