Predicting mortality in heart failure patients using machine learning (PyCaret)

In this post, I’m going to develop machine learning models to predict death in heart failure patients.

Dataset

The description of the dataset can be found here.

import pandas as pd

df = pd.read_csv('heart_failure_clinical_records_dataset.csv')
df.head()
    age  anaemia  creatinine_phosphokinase  diabetes  ejection_fraction  high_blood_pressure  platelets  serum_creatinine  serum_sodium  sex  smoking  time  DEATH_EVENT
0  75.0        0                       582         0                 20                    1  265000.00               1.9           130    1        0     4            1
1  55.0        0                      7861         0                 38                    0  263358.03               1.1           136    1        0     6            1
2  65.0        0                       146         0                 20                    0  162000.00               1.3           129    1        1     7            1
3  50.0        1                       111         0                 20                    0  210000.00               1.9           137    1        0     7            1
4  65.0        1                       160         1                 20                    0  327000.00               2.7           116    0        0     8            1
df.shape
(299, 13)

Here I’m going to build classifiers using the machine learning library PyCaret.

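from pycaret.classification import setup, compare_models

# Initialize the experiment; 'time' (the follow-up period) is excluded from the predictors
cls = setup(
    df,
    target='DEATH_EVENT',
    categorical_features=['anaemia', 'diabetes', 'high_blood_pressure', 'sex', 'smoking'],
    numeric_features=['age', 'creatinine_phosphokinase', 'ejection_fraction',
                      'platelets', 'serum_creatinine', 'serum_sodium'],
    ignore_features=['time'],
)
# Train every available classifier with cross-validation and rank them by accuracy
compare_models()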
          Model                            Accuracy       AUC    Recall     Prec.        F1     Kappa       MCC  TT (Sec)
lr        Logistic Regression                0.7560    0.7323    0.4571    0.7388    0.5476    0.3967    0.4265    0.0050
ridge     Ridge Classifier                   0.7371    0.0000    0.4429    0.6872    0.5214    0.3556    0.3816    0.0040
lda       Linear Discriminant Analysis       0.7371    0.7643    0.4429    0.6872    0.5214    0.3556    0.3816    0.0050
rf        Random Forest Classifier           0.7274    0.7681    0.5000    0.6433    0.5475    0.3598    0.3749    0.0330
catboost  CatBoost Classifier                0.7226    0.7690    0.4429    0.6788    0.5151    0.3332    0.3607    0.4300
nb        Naive Bayes                        0.7033    0.6956    0.2857    0.6483    0.3909    0.2312    0.2686    0.0040
et        Extra Trees Classifier             0.6986    0.7374    0.3714    0.6421    0.4317    0.2536    0.2849    0.0310
ada       Ada Boost Classifier               0.6943    0.6952    0.4857    0.6104    0.5160    0.2997    0.3207    0.0090
xgboost   Extreme Gradient Boosting          0.6933    0.7191    0.4571    0.5673    0.4902    0.2798    0.2918    0.1980
lightgbm  Light Gradient Boosting Machine    0.6888    0.7279    0.4143    0.5753    0.4604    0.2570    0.2736    0.1200
gbc       Gradient Boosting Classifier       0.6798    0.6988    0.4857    0.5406    0.4998    0.2684    0.2774    0.0090
qda       Quadratic Discriminant Analysis    0.6795    0.6440    0.3286    0.5483    0.4027    0.2044    0.2206    0.0070
dt        Decision Tree Classifier           0.6557    0.6173    0.5000    0.5048    0.4900    0.2338    0.2423    0.0040
svm       SVM - Linear Kernel                0.6317    0.0000    0.1000    0.0333    0.0500    0.0000    0.0000    0.0040
knn       K Neighbors Classifier             0.6126    0.4562    0.1286    0.3083    0.1756   -0.0139   -0.0085    0.0060
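
To make the columns of this table easier to read, here is a minimal sketch of how Accuracy, Recall, Prec., F1, Kappa and MCC can be derived from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from the experiment above, and PyCaret computes its own figures internally by averaging over cross-validation folds. As far as I know, an AUC of 0.0000 (the Ridge and SVM rows) means the estimator does not expose predicted probabilities, not that it ranks patients badly.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "positive" meaning DEATH_EVENT == 1.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # share of deaths that were caught
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # share of predicted deaths that were real
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: agreement beyond what the class frequencies alone would give.
    p_expected = (((tp + fp) / n) * ((tp + fn) / n)
                  + ((fn + tn) / n) * ((fp + tn) / n))
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    # Matthews correlation coefficient: correlation between prediction and truth.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "recall": recall, "precision": precision,
            "f1": f1, "kappa": kappa, "mcc": mcc}


# Hypothetical counts, for illustration only (not results from this post).
metrics = classification_metrics(tp=10, fp=4, fn=12, tn=34)
for name, value in metrics.items():
    print(f"{name:>9}: {value:.4f}")
```

Because only about half of the deaths are caught even by the top-ranked models (Recall around 0.45 to 0.50), ranking by Accuracy alone can hide how many high-risk patients a model misses; Recall, Kappa and MCC give a fuller picture.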
