
Competition description: The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships.
One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class.
In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy.
1. Load libraries and read the data
1.1 Load libraries
# Python libraries

# Classic, data manipulation and linear algebra
import pandas as pd
import numpy as np

# Plots
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.offline as py
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.tools as tls
import plotly.figure_factory as ff
py.init_notebook_mode(connected=True)
import squarify

# Data processing, metrics and modeling
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split, RandomizedSearchCV
from sklearn.metrics import precision_score, recall_score, confusion_matrix, roc_curve, precision_recall_curve, accuracy_score, roc_auc_score
import lightgbm as lgbm

# Stats
import scipy.stats as ss
from scipy.stats import randint as sp_randint
from scipy.stats import uniform as sp_uniform

# Time (the timer helper below needs the time module)
import time
from contextlib import contextmanager

@contextmanager
def timer(title):
    # Print how long the wrapped block took, in whole seconds
    t0 = time.time()
    yield
    print("{} - done in {:.0f}s".format(title, time.time() - t0))

# Ignore warning messages
import warnings
warnings.filterwarnings('ignore')
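The timer helper defined in this cell is a context manager, so it is meant to wrap a block of code in a with statement. A minimal sketch of how it could be used follows; the helper is repeated (with the import time it relies on) so the snippet runs on its own, and the workload inside the with block is a hypothetical placeholder, not part of the original analysis.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(title):
    # Same helper as in the cell above, repeated so this sketch is self-contained
    t0 = time.time()
    yield
    print("{} - done in {:.0f}s".format(title, time.time() - t0))


# Hypothetical workload: any block of code can go inside the `with`
with timer("Sum of squares"):
    total = sum(i * i for i in range(100_000))
```

When the block finishes, the helper prints the title followed by the elapsed time rounded to whole seconds, which is convenient for timing slow steps such as a hyperparameter search.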
1.2 Read data
# Reading dataset
train = pd.read_csv("../input/train.csv")
test = pd.read_csv("../input/test.csv")
2. Overview
2.1 Head
# Head train and test
display(train.head())
display(test.head())
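Beyond looking at the first rows, a quick check of the shape and the missing values per column is a common next step in an overview. The sketch below is only an illustration: quick_overview is a hypothetical helper, the rows are made up rather than taken from train.csv, and the column names (Survived, Pclass, Sex, Age) are assumed to match the standard competition files.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from train.csv
toy = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 1],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, None, 54.0],
})


def quick_overview(df):
    """Return (shape, per-column missing-value counts) for a DataFrame."""
    return df.shape, df.isnull().sum()


shape, missing = quick_overview(toy)
print(shape)
print(missing)
```

The same hypothetical helper could be called on train and test once they are loaded, for example quick_overview(train).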

The rest of this analysis is available at the following link: https://bit.ly/2GKjWUA