
Removing columns with too many missing values using dropna() in Pandas


Pandas: Data Cleaning and Preprocessing Exercise-13 with Solution


Write a Pandas program to remove columns with too many missing values.

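The following exercise removes columns that contain too many missing values using dropna(). The thresh parameter sets the minimum number of non-missing values a column must have to be kept, and axis=1 applies the check to columns rather than rows.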

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with missing values
df = pd.DataFrame({
    'Name': ['Selena', 'Annabel', 'Caeso'],
    'Age': [25, None, 22],
    'Salary': [None, None, 70000]
})

# Keep only columns with at least 2 non-missing values (thresh=2).
# With 3 rows, this removes columns that are more than 50% missing.
df_cleaned = df.dropna(thresh=2, axis=1)

# Output the result
print(df_cleaned)

Output:

      Name   Age
0   Selena  25.0
1  Annabel   NaN
2    Caeso  22.0

Explanation:

  • Created a DataFrame with multiple columns containing missing values.
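  • Used dropna(thresh=2, axis=1) to keep only columns with at least 2 non-missing values. With 3 rows, this removes columns with more than 50% missing values: 'Salary' has only 1 non-missing value and is dropped, while 'Age' has 2 and is kept.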
  • Returned the DataFrame with only columns that have sufficient data.
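
The hard-coded thresh=2 only corresponds to "more than 50% missing" because the sample DataFrame has 3 rows. The sketch below is one possible way to derive the threshold from a fraction instead. The name max_missing_fraction is a hypothetical variable introduced for illustration and is not part of the original solution. It also shows an alternative that filters on the share of missing values per column with isna().mean().

```python
import math

import pandas as pd

# Same sample DataFrame as in the solution above
df = pd.DataFrame({
    'Name': ['Selena', 'Annabel', 'Caeso'],
    'Age': [25, None, 22],
    'Salary': [None, None, 70000]
})

# Hypothetical setting: drop columns whose missing share exceeds this fraction
max_missing_fraction = 0.5

# Approach 1: convert the fraction into a minimum count of non-missing values.
# For 3 rows and 0.5 this gives ceil(1.5) = 2, matching thresh=2 above.
min_non_null = math.ceil(len(df) * (1 - max_missing_fraction))
kept_by_thresh = df.dropna(axis=1, thresh=min_non_null)

# Approach 2: compute the share of missing values per column and
# keep the columns whose share does not exceed the limit.
missing_share = df.isna().mean()
kept_by_ratio = df.loc[:, missing_share <= max_missing_fraction]

# Both approaches keep 'Name' and 'Age' and drop 'Salary'
print(kept_by_thresh.columns.tolist())
print(kept_by_ratio.columns.tolist())
```

The second approach avoids converting a fraction into a row count, which can be easier to read when the number of rows changes between datasets.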
