Pandas DataFrame: Count the NaN values in one or more columns of a DataFrame
35. Count NaN Values
Write a Pandas program to count the NaN values in one or more columns of a DataFrame.
Sample data:
Original DataFrame
   attempts       name qualify  score
0         1  Anastasia     yes   12.5
1         3       Dima      no    9.0
2         2  Katherine     yes   16.5
3         3      James      no    NaN
4         2      Emily      no    9.0
5         3    Michael     yes   20.0
6         1    Matthew     yes   14.5
7         1      Laura      no    NaN
8         2      Kevin      no    8.0
9         1      Jonas     yes   19.0
Number of NaN values in one or more columns:
2
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
df = pd.DataFrame(exam_data)
print("Original DataFrame")
print(df)
print("\nNumber of NaN values in one or more columns:")
print(df.isnull().values.sum())
Sample Output:
Original DataFrame
   attempts       name qualify  score
0         1  Anastasia     yes   12.5
1         3       Dima      no    9.0
2         2  Katherine     yes   16.5
3         3      James      no    NaN
4         2      Emily      no    9.0
5         3    Michael     yes   20.0
6         1    Matthew     yes   14.5
7         1      Laura      no    NaN
8         2      Kevin      no    8.0
9         1      Jonas     yes   19.0

Number of NaN values in one or more columns:
2
Explanation:
The above code creates a pandas DataFrame ‘df’ from a dictionary ‘exam_data’ containing information about some exam scores.
df.isnull().values.sum(): This expression uses the isnull() function to check which values in the DataFrame are null or NaN. It returns a DataFrame with the same shape as ‘df’, holding True for missing values and False for non-missing values. The values attribute extracts the underlying array of that Boolean DataFrame, and sum() is applied to the array (each True counts as 1) to get the total number of missing values across all columns of the original DataFrame.
Finally, the print() function prints the total number of missing values in the DataFrame.
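The solution above returns a single total for the whole DataFrame. As a minimal sketch of the "one or more columns" case, the example below counts NaN values per column, in a single column, and in a chosen subset of columns. The small DataFrame and the variable names (per_column, one_column, selected, total) are made up for illustration and are not part of the original exercise; isna() is used here as the equivalent of isnull().

```python
import numpy as np
import pandas as pd

# Illustrative data only: a shortened frame with the same column names
# as exam_data, plus one missing 'qualify' entry.
df = pd.DataFrame({
    'name': ['Anastasia', 'Dima', 'Katherine', 'James'],
    'score': [12.5, 9, np.nan, np.nan],
    'attempts': [1, 3, 2, 3],
    'qualify': ['yes', None, 'yes', 'no'],
})

# Count per column: summing the Boolean frame column-wise gives a Series.
per_column = df.isna().sum()

# Count in a single column.
one_column = df['score'].isna().sum()

# Count across a chosen subset of columns (sum per column, then sum those).
selected = df[['score', 'qualify']].isna().sum().sum()

# Count across the whole frame, as in the sample solution.
total = df.isna().to_numpy().sum()

print(per_column)
print("NaN in 'score':", one_column)
print("NaN in 'score' and 'qualify':", selected)
print("NaN in whole DataFrame:", total)
```

Selecting the columns first and then calling isna() keeps the count limited to those columns, while calling it on the full DataFrame reproduces the total from the sample solution.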
For more Practice: Solve these Related Problems:
- Write a Pandas program to count the number of NaN values in each column and then display the result as a Series.
- Write a Pandas program to compute the total NaN count across the entire DataFrame and then output the percentage per column.
- Write a Pandas program to count NaN values in selected columns and then plot these counts using a bar chart.
- Write a Pandas program to identify columns with NaN counts exceeding a threshold and then remove those columns.
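As a starting point for the per-column percentage and threshold problems above, here is one possible sketch, not a reference solution. The column names a, b, c and the max_nans threshold are hypothetical values chosen only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column 'a' is mostly missing, 'b' has one gap, 'c' has none.
df = pd.DataFrame({
    'a': [1.0, np.nan, np.nan, np.nan],
    'b': [1.0, 2.0, np.nan, 4.0],
    'c': [1, 2, 3, 4],
})

# NaN count per column, returned as a Series indexed by column name.
nan_counts = df.isna().sum()

# Percentage of NaN per column: the mean of a Boolean column is the
# fraction of True values.
nan_percent = df.isna().mean() * 100

# Drop every column whose NaN count exceeds an assumed threshold.
max_nans = 2  # hypothetical cutoff, adjust to the data at hand
cols_to_drop = nan_counts[nan_counts > max_nans].index
cleaned = df.drop(columns=cols_to_drop)

print(nan_counts)
print(nan_percent)
print(cleaned.columns.tolist())
```

Filtering on the counts Series keeps the threshold logic explicit. For the bar chart problem, the same nan_counts Series could be passed to a plotting call.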