w3resource

Pandas DataFrame: Select the rows where the score is missing


9. Selecting Rows with Missing Score

Write a Pandas program to select the rows where the score is missing, i.e. is NaN.

Sample DataFrame:
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

Sample Solution :

Python Code :

import pandas as pd
import numpy as np

# Sample data: np.nan marks a missing score
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
        'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
        'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

# Build the DataFrame, using the labels as the row index
df = pd.DataFrame(exam_data, index=labels)
print("Rows where score is missing:")
# Keep only the rows whose 'score' is NaN
print(df[df['score'].isnull()])

Sample Output:

Rows where score is missing:
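   attempts   name qualify  score
d         3  James      no    NaN
h         1  Laura      no    NaN

Note: The column order shown above comes from an older pandas release that sorted dictionary keys alphabetically. Recent versions keep the dictionary's insertion order (name, score, attempts, qualify), so the same two rows, d and h, appear with the columns in that order.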

Explanation:

In the above code -

df = pd.DataFrame(exam_data, index=labels): This line creates a pandas DataFrame called 'df' from the dictionary 'exam_data', using the list 'labels' as the row index. Each dictionary key ('name', 'score', 'attempts', 'qualify') becomes a column holding the corresponding list of values.

print(df[df['score'].isnull()]): The expression df['score'].isnull() returns a boolean Series that is True wherever 'score' is NaN (not a number), the marker pandas uses for a missing value. Passing that Series to df[...] (boolean indexing) keeps only the rows where it is True, and print() displays the resulting subset.
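The boolean indexing described above can be broken into two explicit steps. The following is a minimal sketch, not part of the original solution; the variable names mask and missing are illustrative only. It also shows why a plain equality test against np.nan does not work for this task.

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no',
                         'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# Step 1: build a boolean mask that is True where 'score' is NaN.
# isna() is an alias of isnull(); both give the same result.
mask = df['score'].isna()

# Step 2: use the mask to keep only the matching rows.
missing = df[mask]
print(missing.index.tolist())  # ['d', 'h']

# NaN never compares equal to anything, including itself,
# so an equality test finds no rows at all.
print((df['score'] == np.nan).any())  # False
```

Naming the mask separately makes it easy to reuse, for example to count the missing rows or to invert the selection with ~mask.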


For more Practice: Solve these Related Problems:

  • Write a Pandas program to select rows where the 'score' column is NaN and then count these rows.
  • Write a Pandas program to find the rows with missing scores and then fill these missing values with the mean score.
  • Write a Pandas program to select rows with a missing 'score' and then update the 'qualify' column to 'unknown'.
  • Write a Pandas program to extract rows with NaN in 'score' and then display only the 'name' and 'score' columns.
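
As a starting point for the problems above, here is one possible sketch that reuses the same NaN mask for each task. It is only one approach among several; the names mask, n_missing, filled, and updated are illustrative, and the copies are made so that the original df stays unchanged.

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no',
                         'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# True for every row whose 'score' is NaN
mask = df['score'].isna()

# Count the rows with a missing score (True counts as 1)
n_missing = int(mask.sum())
print(n_missing)  # 2

# Show only the 'name' and 'score' columns for those rows
print(df.loc[mask, ['name', 'score']])

# Fill missing scores with the mean of the non-missing scores.
# Series.mean() skips NaN by default.
filled = df.copy()
filled['score'] = filled['score'].fillna(filled['score'].mean())

# Set 'qualify' to 'unknown' wherever the score is missing
updated = df.copy()
updated.loc[mask, 'qualify'] = 'unknown'
print(updated.loc[mask])
```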


