From the course: Data Cleaning in Python Essential Training

Deleting bad data

- One of the easiest ways to deal with bad rows is to delete them. Say we have some rides data and we would like to ignore or delete the rows that are not valid. So here is our data. We have the name, the plate, and the distance, and we would like to ignore rows where either the distance is less than zero or the name is missing. So here is our code. First, let's load the CSV. And we can see that here the distance is lower than zero, and here we do not have a name. So I'm going to create a mask using the eval method of the data frame, which is either the name is null or the distance is smaller than or equal to zero. If I run it, I get a true or false per row, and I see that rows three and four are the bad ones. Now I'm going to say the data frame is the data frame, and I'm going to negate the mask using the tilde sign, meaning I want all the rows that are not considered bad. When I'm running…
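The mask-and-negate steps described above can be sketched roughly as follows. This is a minimal illustration, not the course's actual code: the CSV contents are made up for the example (only the column names name, plate, and distance come from the transcript), and the mask is built with plain boolean operators rather than the eval method mentioned in the video, which should select the same rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for the rides CSV; the values are invented for illustration.
csv_data = """name,plate,distance
Alice,1XZ2,3.7
Bob,1XZ2,13.6
Carol,3AB5,9.1
Dave,3AB5,-2.0
,1XZ2,4.4
"""

df = pd.read_csv(io.StringIO(csv_data))

# A row is "bad" if the name is missing or the distance is zero or negative.
mask = df["name"].isnull() | (df["distance"] <= 0)

# The tilde negates the mask, keeping only the rows that are not bad.
clean = df[~mask]

print(mask)
print(clean)
```

With this made-up data, the rows at index 3 and 4 are flagged, and the cleaned frame keeps the other three. Assigning the result back to df, as the transcript describes, would discard the bad rows from that point on; keeping it in a separate variable such as clean leaves the original available for inspection.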
