Implementation of machine learning and data mining to improve cyber security and limit vulnerabilities to cyber attacks

Of the many challenges that continue to make detection of cyber-attack detection elusive, lack of training data remains the biggest one. Even though organizations and business turn to known network monitoring tools such as Wireshark, millions of people are still vulnerable because of lack of information pertaining to website behaviors and features that can amount to an attack. In fact, most of the attacks do not occur because of threat actors’ resort to complex coding and evasion techniques but because victims lack the basic tools to detect and avoid the attacks. Despite these challenges, machine learning is proving to revolutionize the understanding of the nature of cyber-attacks, and this study implemented machine learning techniques to Phishing Website data with the objective of comparing five algorithms and providing insight that the general public can use to avoid phishing pitfalls. The findings of the study suggest that Neural Network is the best performing algorithm and the model suggest that inclusion of an IP address in the domain name, longer URL, use of URL shortening services, inclusion of “@” symbol in the URL, inclusion of “−” symbol in the URL, use of non-trusted SSL certificates with expiry duration less than 6 months, domains registered for less than one year, and favicon redirecting from other URLs as the leading features of phishing websites. Neural Network is based on multi-layer perceptron and is the basis of intelligence so that in future, phishing detection will be automated and rendered an artificial intelligence task.