
Learn how machine learning can help reduce commercial aviation fatalities by detecting dangerous cognitive states in aircrews using physiological data. Compare logistic regression, decision tree, random forest, and XGB classifiers on accuracy. Follow along with the project to see how hyperparameters are tuned to improve accuracy and reduce errors.

Reducing Commercial Aviation Fatalities with Machine Learning (XGBoost)

Most aviation-related fatalities stem from a loss of "airplane state awareness". This loss of awareness can result from distraction, drowsiness, or other dangerous cognitive states (1). The goal of this project is to detect aberrations in the attention of aircrews based on physiological data collected in test situations. Detecting them can then help develop measures to pre-empt dangerous cognitive states. The data for this project comes from the former Kaggle competition 'Reducing Commercial Aviation Fatalities'. The machine learning problem here is a classification problem, so logistic regression, decision tree, random forest, and XGB Classifier models were built and tested for accuracy.

1. Introduction

Commercial aviation is widely regarded as the safest mode of transport. Yet fatal incidents are not uncommon, owing to several factors. Numerous analyses have examined aviation fatalities and their causes, and most of them point to pilot error as the leading cause of fatal aviation incidents (2), (3), (4).

Human error can be caused by mental factors such as distraction, poor judgement, or lack of training, or by other factors such as cockpit intrusion. In this project, the goal is to predict when an airline pilot enters a distracted mental state.

To solve this problem with machine learning, several models are built and tested. Hyperparameters specific to each type of model are tuned to further improve accuracy. The evaluation metric for this problem is 'multiclass log loss', which can be calculated using functions in the sklearn library.
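To make the metric concrete, here is a minimal sketch of what multiclass log loss measures: the average negative log of the probability the model assigned to the true class. It is written in plain Python so the arithmetic is visible; in practice `sklearn.metrics.log_loss` computes the same quantity. The function name `multiclass_log_loss`, the four-class setup, and the labels and probabilities are all hypothetical values chosen for illustration, not data from the project.

```python
import math


def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices, one per sample.
    y_prob: list of per-sample probability lists (each row should sum to 1).
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        # Clip so that a confident wrong prediction does not hit log(0).
        p = min(max(probs[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)


# Hypothetical labels and predicted probabilities (illustrative only).
y_true = [0, 2, 1, 3]
y_prob = [
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.20, 0.50, 0.10],
    [0.10, 0.60, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]

loss = multiclass_log_loss(y_true, y_prob)
# A model that always predicts uniform probabilities is a useful baseline.
uniform_loss = multiclass_log_loss(y_true, [[0.25] * 4 for _ in y_true])
print(f"log loss: {loss:.4f}, uniform baseline: {uniform_loss:.4f}")
```

Lower is better: the loss shrinks as the model puts more probability on the correct class, and it penalizes confident wrong predictions heavily. That is why it is a stricter measure than plain accuracy for this kind of problem.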

Let's install the Jovian library. We can use jovian.commit() to save our work periodically.