Using machine learning algorithms to study the smoking behavior of Iraqi students
Abstract
Many machine learning studies have analyzed smoking behavior through the relationship between this complex behavior and smokers' inherited genes; other studies have tried to predict the negative impact of smoking by monitoring body movements (such as arm movement and breathing patterns). All of these studies have limited ability to analyze the reasons that lead to this behavior, especially among teenagers. In this study, we present a methodology that uses five predictive models to analyze the smoking behavior of school students in Iraq. The data were obtained from the 2019 National Youth Tobacco Survey (NYTS) data set, which contains self-reported questionnaire responses from 2,560 individuals arranged into 99 attributes. Naïve Bayes (NB), K*, PART, LogitBoost, and REPTree were used to analyze student smoking status along three vectors: overall smoking behavior, smoking cessation, and school environment impact. The results were good for all three vectors: K* was the best predictor of overall smoking behavior with 91.02% accuracy, LogitBoost scored 84.6611% accuracy for smoking cessation, and the best result for school influence was 78.383%. These results indicate that machine learning models have a promising ability to predict student smoking behavior.