w3resource

Pandas: Keep a DataFrame with valid entries

Pandas Handling Missing Values: Exercise-10 with Solution

Write a Pandas program to keep the valid entries of a given DataFrame.

Test Data:

     ord_no  purch_amt    ord_date  customer_id
0       NaN        NaN         NaN          NaN
1       NaN     270.65  2012-09-10       3001.0
2   70002.0      65.26         NaN       3001.0
3       NaN        NaN         NaN          NaN
4       NaN     948.50  2012-09-10       3002.0
5   70005.0    2400.60  2012-07-27       3001.0
6       NaN    5760.00  2012-09-10       3001.0
7   70010.0    1983.43  2012-10-10       3004.0
8   70003.0    2480.40  2012-10-10       3003.0
9   70012.0     250.45  2012-06-27       3002.0
10      NaN      75.29  2012-08-17       3001.0
11      NaN        NaN         NaN          NaN

Sample Solution:

Python Code :

import pandas as pd
import numpy as np
pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)
df = pd.DataFrame({
'ord_no':[np.nan,np.nan,70002,np.nan,np.nan,70005,np.nan,70010,70003,70012,np.nan,np.nan],
'purch_amt':[np.nan,270.65,65.26,np.nan,948.5,2400.6,5760,1983.43,2480.4,250.45, 75.29,np.nan],
'ord_date': [np.nan,'2012-09-10',np.nan,np.nan,'2012-09-10','2012-07-27','2012-09-10','2012-10-10','2012-10-10','2012-06-27','2012-08-17',np.nan],
'customer_id':[np.nan,3001,3001,np.nan,3002,3001,3001,3004,3003,3002,3001,np.nan]})
print("Original Orders DataFrame:")
print(df)
print("\nKeep the said DataFrame with valid entries:")
result = df.dropna(inplace=False)
print(result)

Sample Output:

Original Orders DataFrame:
     ord_no  purch_amt    ord_date  customer_id
0       NaN        NaN         NaN          NaN
1       NaN     270.65  2012-09-10       3001.0
2   70002.0      65.26         NaN       3001.0
3       NaN        NaN         NaN          NaN
4       NaN     948.50  2012-09-10       3002.0
5   70005.0    2400.60  2012-07-27       3001.0
6       NaN    5760.00  2012-09-10       3001.0
7   70010.0    1983.43  2012-10-10       3004.0
8   70003.0    2480.40  2012-10-10       3003.0
9   70012.0     250.45  2012-06-27       3002.0
10      NaN      75.29  2012-08-17       3001.0
11      NaN        NaN         NaN          NaN

Keep the said DataFrame with valid entries:
    ord_no  purch_amt    ord_date  customer_id
5  70005.0    2400.60  2012-07-27       3001.0
7  70010.0    1983.43  2012-10-10       3004.0
8  70003.0    2480.40  2012-10-10       3003.0
9  70012.0     250.45  2012-06-27       3002.0

Python Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to drop those rows from a given DataFrame in which specific columns have missing values.
Next: Write a Pandas program to calculate the total number of missing values in a DataFrame.

What is the difficulty level of this exercise?

Test your Programming skills with w3resource's quiz.



Become a Patron!

Follow us on Facebook and Twitter for latest update.

It will be nice if you may share this link in any developer community or anywhere else, from where other developers may find this content. Thanks.

https://198.211.115.131/python-exercises/pandas/missing-values/python-pandas-missing-values-exercise-10.php