Titanic Machine Learning Study from Disaster
Author(s) | Cao, Emma Yiqin | |
Author(s) | Xie, Weitao | |
Author(s) | Dong, Chunzhi | |
Author(s) | Qiu, Jing | |
Date Accessioned | 2020-07-17T20:37:51Z | |
Date Available | 2020-07-17T20:37:51Z | |
Publication Date | 2020-05 | |
Abstract | Machine learning plays an important role in data science today, and its methods can be used for classification problems. In this project, we use several machine learning methods to understand what kinds of people were more likely to survive the sinking of the Titanic. Given a set of predictors describing each passenger, we predicted each passenger's chance of survival from these covariates using five machine learning methods: conventional Logistic Regression, Random Forest, K-Nearest Neighbor, Support Vector Machine (SVM), and Gradient Boosting. Grid-search cross-validation was used to calibrate the methods and assess their prediction accuracy. With nine predictors, the SVM model performs best, with a prediction accuracy of about 83%. With six predictors, the Random Forest model performs best, also with a prediction accuracy of about 83%. We used Python for the entire analysis, including data cleaning, visualization, validation, and modeling. | en_US |
Sponsor | We thank Dr. Jing Qiu and Dr. Thomas Ilvento for their research assistance. | en_US |
URL | http://udspace.udel.edu/handle/19716/27322 | |
Publisher | Department of Applied Economics and Statistics, University of Delaware, Newark, DE. | en_US |
Part of Series | APEC Research Reports; RR20-01 | |
Keywords | Machine learning | en_US |
Keywords | Titanic | en_US |
Keywords | Survival rate | en_US |
Keywords | Prediction accuracy | en_US |
Title | Titanic Machine Learning Study from Disaster | en_US |
Type | Working Paper | en_US |
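
The abstract mentions grid-search cross-validation without showing what it involves. The following is a minimal sketch of the idea, not the authors' code: it uses only the Python standard library, a synthetic two-feature dataset standing in for the passenger data, and K-Nearest Neighbor as the single model being tuned. All names here (`make_toy_data`, `knn_predict`, `cv_accuracy`, `grid_search`) and the candidate values of `k` are hypothetical and chosen for illustration. In practice, a library routine such as scikit-learn's `GridSearchCV` would typically do this work.

```python
import random
from collections import Counter


def make_toy_data(n=120, seed=0):
    """Synthetic stand-in for passenger records: ((x1, x2), survived) pairs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Two numeric features whose means shift with the label.
        x = (rng.gauss(2.0 * label, 1.0), rng.gauss(-1.5 * label, 1.0))
        rows.append((x, label))
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cv_accuracy(rows, k, n_folds=5):
    """Mean held-out accuracy of k-NN across n_folds cross-validation folds."""
    folds = [rows[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / n_folds


def grid_search(rows, k_grid, n_folds=5):
    """Score every candidate k by cross-validation and return the best one."""
    scores = {k: cv_accuracy(rows, k, n_folds) for k in k_grid}
    best_k = max(scores, key=scores.get)
    return best_k, scores


data = make_toy_data()
best_k, scores = grid_search(data, k_grid=[1, 3, 5, 7, 9])
print("best k:", best_k, "cross-validated accuracy:", round(scores[best_k], 3))
```

The same loop extends to the other methods named in the abstract by swapping the prediction function and the grid of candidate settings. The candidate values of `k` are odd so that a two-class majority vote cannot tie.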