Much of data science is cleaning the data set you are given. You have to detect values that may cause trouble for certain algorithms later on, and you have to deal with missing values. Sometimes you have to drop a column or feature entirely because it has too many missing values. Some data sets mark missing entries with NaNs, while others use placeholders like ‘?’ or a default value. It is important to check for all of these.
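As a rough illustration of these checks, here is a minimal sketch using pandas. The toy data, the column names (`age`, `city`, `notes`), the use of `-1` as a default value, and the 50% drop threshold are all assumptions made up for this example. Real data will need its own choices.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: NaN, '?' and -1 all stand for "missing" here.
df = pd.DataFrame({
    "age": [25, np.nan, 40, -1, 31],
    "city": ["Paris", "?", "Lima", "Oslo", "?"],
    "notes": [np.nan, np.nan, np.nan, "ok", np.nan],
})

# Convert the placeholders into real missing values so they get counted.
cleaned = df.copy()
cleaned["age"] = df["age"].mask(df["age"] == -1)
cleaned["city"] = df["city"].mask(df["city"] == "?")

# Share of missing values in each column.
missing_share = cleaned.isna().mean()

# Drop columns where more than half the values are missing
# (the 0.5 threshold is an arbitrary choice for this sketch).
threshold = 0.5
kept = cleaned.loc[:, missing_share <= threshold]

# Fill what is left: median for the numeric column, a label for the text one.
filled = kept.assign(
    age=kept["age"].fillna(kept["age"].median()),
    city=kept["city"].fillna("unknown"),
)

print(missing_share)
print(filled)
```

Converting placeholders first matters: until ‘?’ and the default value are turned into NaN, `isna()` will not count them, and a column can look more complete than it really is.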