Missing Data Analysis with MICE

Outliers and missing values are two of the most important problems any data science engineer needs to deal with, and we have already discussed outliers. Before talking about how to deal with missing values, let’s talk about the types of missing values: Missing at Random (MAR), Missing Completely at Random (MCAR), and Missing Not at Random (MNAR). Let’s take one example, … Continue reading “Missing Data Analysis with MICE”
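
To make the three types a little more concrete, here is a minimal simulation sketch. It is not the example from the full post; the `age` and `income` columns, the drop probabilities, and the function name `simulate_missingness` are all assumptions chosen for illustration. The only point is what the chance of a value going missing depends on in each case.

```python
import random


def simulate_missingness(n=1000, seed=0):
    """Return (rows, mcar, mar, mnar) for a hypothetical age/income table."""
    rng = random.Random(seed)
    # Hypothetical complete data: (age, income) pairs.
    rows = [(rng.randint(20, 70), rng.gauss(50_000, 15_000)) for _ in range(n)]

    # MCAR: every income has the same 20% chance of being dropped.
    mcar = [None if rng.random() < 0.2 else inc for _, inc in rows]

    # MAR: the drop chance depends only on the observed age column.
    mar = [None if rng.random() < (0.4 if age < 35 else 0.1) else inc
           for age, inc in rows]

    # MNAR: the drop chance depends on the income value that goes missing.
    mnar = [None if rng.random() < (0.4 if inc > 60_000 else 0.1) else inc
            for _, inc in rows]
    return rows, mcar, mar, mnar


def observed_mean(values):
    """Mean of the values that are still present (not None)."""
    kept = [v for v in values if v is not None]
    return sum(kept) / len(kept)


rows, mcar, mar, mnar = simulate_missingness()
full_mean = sum(inc for _, inc in rows) / len(rows)
print("full:", round(full_mean), "MCAR:", round(observed_mean(mcar)),
      "MNAR:", round(observed_mean(mnar)))
```

In this sketch the MNAR column drops high incomes more often, so the mean of what remains is pulled below the true mean, which is why the type of missingness matters before choosing an imputation method.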

Synthetic Minority Over-sampling Technique (SMOTE)

Imbalanced data is one of the main issues in classification problems. Why do we end up with imbalanced data? Let’s say I have 100 customers who hold a credit card: at most 2 or 3% of them may be defaulters, and the remaining 97 to 98% are perfect payers (the defaulters are what is called the minority class), … Continue reading “Synthetic Minority Over-sampling Technique (SMOTE)”
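
As a rough illustration of the idea behind SMOTE, the sketch below creates synthetic minority points by interpolating between a minority sample and one of its nearest minority neighbours. It is a simplified, standard-library-only sketch and not a library implementation; the function name `smote_sketch`, the two features, and the three defaulter rows are hypothetical values made up for this example.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority feature tuples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbour)))
    return synthetic


# Hypothetical data: 3 defaulters out of 100 customers,
# described by (credit utilisation, missed payments).
defaulters = [(0.92, 4.0), (0.85, 3.0), (0.97, 5.0)]
n_payers = 97

# Create enough synthetic defaulters to match the majority class size.
new_points = smote_sketch(defaulters, n_new=n_payers - len(defaulters))
print(len(defaulters) + len(new_points), "defaulters vs", n_payers, "payers")
```

Because every synthetic point lies between two real minority samples, the new rows stay inside the region the defaulters already occupy instead of being exact duplicates, which is the main difference from plain random over-sampling.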