Python

Handling datetime in Pandas DataFrame

Create a Year-Month column: year_month = pd.date_range(start='2020-04', end='2021-09', freq='M'); df = pd.DataFrame({...
R

Merging consecutive or overlapping time periods in R

I had a table that looked something like this: patient_id <- c(rep(100,5), rep(101,6)); drug <- c(rep("A",3), rep("B",2...
Python

How to split data into training, validation, and test sets

In this post, I am going to introduce several ways to split data into training, validation, and test sets for your mach...
Python

Handling a large tabular dataset using Vaex

When you have to handle a large tabular dataset (for example, 100 GB dataset with one billion rows), which cannot fit i...
R

Visualizing prescription drugs using a Dumbbell chart in R

In this post, I am going to show you how to visualize prescription periods of drugs using a Dumbbell chart in R. lib...
dataset

Publicly available datasets in healthcare

This post introduces publicly available datasets in healthcare for those who are interested in learning statistical ana...
Python

Plotting functions in Python

Here I'm going to plot several machine learning-related functions in Python. First, import necessary modules: imp...
dataset

Heart failure dataset

The data was collected from 299 patients (105 women and 194 men) who were admitted to the Institute of Cardiology and Allie...
Python

Predicting mortality in heart failure patients using machine learning (PyCaret)

In this post, I'm going to develop machine learning models to predict death in heart failure patients. Dataset: Th...
Python

Analyzing emails in Python

Open EML files: Create a list of paths to .eml files in a directory and extract their file names and texts. from p...