Removing columns with too many missing values using dropna() in Pandas
Pandas: Data Cleaning and Preprocessing Exercise-13 with Solution
Write a Pandas program to remove columns with too many missing values.
The following exercise removes columns that contain too many missing values using dropna().
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with missing values
df = pd.DataFrame({
    'Name': ['Selena', 'Annabel', 'Caeso'],
    'Age': [25, None, 22],
    'Salary': [None, None, 70000]
})
# Keep columns with at least 2 non-missing values; with 3 rows,
# this removes columns with more than 50% missing values
df_cleaned = df.dropna(thresh=2, axis=1)
# Output the result
print(df_cleaned)
Output:
      Name   Age
0   Selena  25.0
1  Annabel   NaN
2    Caeso  22.0
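To see why Salary is dropped while Age is kept, it can help to count the non-missing values in each column, since thresh is compared against that count. A minimal sketch that rebuilds the same sample DataFrame (the variable name non_null_counts is only illustrative):

```python
import pandas as pd

# Same sample data as in the solution
df = pd.DataFrame({
    'Name': ['Selena', 'Annabel', 'Caeso'],
    'Age': [25, None, 22],
    'Salary': [None, None, 70000]
})

# Number of non-missing values in each column
non_null_counts = df.notna().sum()
print(non_null_counts)

# Name has 3, Age has 2 and Salary has 1 non-missing value,
# so only Salary falls below thresh=2 and is removed.
```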
Explanation:
- Created a DataFrame with multiple columns containing missing values.
- Used dropna(thresh=2, axis=1) to keep only the columns that have at least 2 non-missing values. With 3 rows, this removes columns with more than 50% missing values (here, Salary).
- Returned the DataFrame with only columns that have sufficient data.
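The value thresh=2 matches the 50% rule only because the sample has 3 rows. One possible way to express the rule as a fraction, so it does not depend on the row count, is sketched below. It uses a boolean column mask built from isna().mean() rather than dropna(), and both the helper name drop_columns_by_missing_fraction and its max_missing parameter are hypothetical, introduced here only for illustration:

```python
import pandas as pd

def drop_columns_by_missing_fraction(df, max_missing=0.5):
    """Hypothetical helper: keep columns whose share of missing
    values is at most max_missing (a fraction between 0 and 1)."""
    # isna().mean() gives the fraction of missing values per column
    missing_fraction = df.isna().mean()
    # Select only the columns that do not exceed the allowed fraction
    return df.loc[:, missing_fraction <= max_missing]

df = pd.DataFrame({
    'Name': ['Selena', 'Annabel', 'Caeso'],
    'Age': [25, None, 22],
    'Salary': [None, None, 70000]
})

df_cleaned = drop_columns_by_missing_fraction(df, max_missing=0.5)
print(list(df_cleaned.columns))
# ['Name', 'Age']
```

On the sample data this gives the same columns as dropna(thresh=2, axis=1): Age is 1/3 missing and is kept, while Salary is 2/3 missing and is removed.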