In this post, I am going to introduce several ways to split data into training, validation, and test sets for your machine learning project.
Assume that you have lists of paths to images (`image_paths`) and corresponding masks (`mask_paths`), and that you want to split each list into training (70%), validation (15%), and test (15%) sets.
### sklearn v.0.24.2

    from sklearn.model_selection import train_test_split

    # First, split data into train and test sets
    train_image_paths, test_image_paths, train_mask_paths, test_mask_paths = train_test_split(
        image_paths, mask_paths, test_size=0.3, random_state=0)

    # And then split the test set into validation and test sets
    val_image_paths, test_image_paths, val_mask_paths, test_mask_paths = train_test_split(
        test_image_paths, test_mask_paths, test_size=0.5, random_state=0)
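As a second way to get the same 70/15/15 split, here is a minimal sketch that uses only the Python standard library. It is an illustration under stated assumptions, not a drop-in replacement for `train_test_split`: the helper name `split_paired_paths`, its parameters, and the example file names are all hypothetical. The idea is to shuffle the image/mask pairs together, so every image keeps its mask, and then slice the shuffled list.

```python
import random


def split_paired_paths(image_paths, mask_paths, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle image/mask pairs together and slice them into train/val/test.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if len(image_paths) != len(mask_paths):
        raise ValueError("image_paths and mask_paths must have the same length")

    # Shuffle the pairs (not the two lists separately) so each image keeps its mask.
    pairs = list(zip(image_paths, mask_paths))
    random.Random(seed).shuffle(pairs)

    # Size the validation and test sets first; training gets the remainder.
    n_total = len(pairs)
    n_val = round(n_total * val_frac)
    n_test = round(n_total * test_frac)
    n_train = n_total - n_val - n_test

    return {
        "train": pairs[:n_train],
        "val": pairs[n_train:n_train + n_val],
        "test": pairs[n_train + n_val:],
    }


# Hypothetical file names, used only for illustration.
image_paths = [f"images/{i:03d}.png" for i in range(100)]
mask_paths = [f"masks/{i:03d}.png" for i in range(100)]

splits = split_paired_paths(image_paths, mask_paths)
for name, subset in splits.items():
    print(name, len(subset))
```

Each entry in `splits` is a list of `(image_path, mask_path)` tuples. Fixing `seed` plays the same role as `random_state` in the scikit-learn version: the split is reproducible across runs.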
to be updated