Replacing missing values with column mean in Pandas DataFrame
Python Pandas NumPy: Exercise-15 with Solution
Replace missing values in a Pandas DataFrame with the mean of the column.
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
# Create a sample DataFrame with missing values
data = {'A': [1, 2, np.nan, 4, 5],
        'B': [10, np.nan, 30, 40, 50],
        'C': [100, 200, 300, np.nan, 500],
        'D': [1000, 2000, 3000, 4000, np.nan]}
df = pd.DataFrame(data)
# Replace missing values with the mean of each column
df_filled = df.fillna(df.mean())
# Display the DataFrame with missing values replaced
print(df_filled)
Output:
     A     B      C       D
0  1.0  10.0  100.0  1000.0
1  2.0  32.5  200.0  2000.0
2  3.0  30.0  300.0  3000.0
3  4.0  40.0  275.0  4000.0
4  5.0  50.0  500.0  2500.0
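As a quick sanity check on the output, the sketch below rebuilds the same sample DataFrame and confirms that no missing values remain and that each filled cell equals the mean of the other four values in its column. The variable names `remaining` and `filled_cells` are illustrative and not part of the original solution.

```python
import numpy as np
import pandas as pd

# Same sample data as in the solution
data = {'A': [1, 2, np.nan, 4, 5],
        'B': [10, np.nan, 30, 40, 50],
        'C': [100, 200, 300, np.nan, 500],
        'D': [1000, 2000, 3000, 4000, np.nan]}
df = pd.DataFrame(data)
df_filled = df.fillna(df.mean())

# Count the missing values left after filling
remaining = int(df_filled.isna().sum().sum())

# For each column, pick the cell that was originally NaN and read its filled value
filled_cells = {
    col: float(df_filled.loc[df[col].isna(), col].iloc[0])
    for col in df.columns
}

print(remaining)
print(filled_cells)
```

For example, column B has the non-missing values 10, 30, 40 and 50, so its fill value is (10 + 30 + 40 + 50) / 4 = 32.5.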
Explanation:
In the exercise above,
- A sample DataFrame (df) is created with one missing value (np.nan) in each column.
- df.mean() calculates the mean of each column; missing values are skipped by default.
- df.fillna(df.mean()) replaces the missing values in each column with that column's mean.
- The result is a new DataFrame (df_filled) with the missing values replaced; the original df is left unchanged.
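The steps above can be broken apart to see the intermediate result. The following is a minimal sketch, not part of the original solution: it uses a smaller DataFrame with a hypothetical text column named `label` to show that `df.mean()` returns a Series indexed by column name, and that `fillna` matches those names to columns. Passing `numeric_only=True` is an assumption added here so that the text column is skipped when the means are computed.

```python
import numpy as np
import pandas as pd

# Smaller illustrative frame; the 'label' column is hypothetical
df = pd.DataFrame({'A': [1, 2, np.nan, 4, 5],
                   'B': [10, np.nan, 30, 40, 50],
                   'label': ['a', 'b', None, 'd', 'e']})

# Series of per-column means, indexed by column name.
# NaN values are skipped; numeric_only=True ignores the text column.
col_means = df.mean(numeric_only=True)
print(col_means)

# fillna aligns the Series index with the column names, so each numeric
# column receives its own mean; columns absent from the Series are untouched.
df_filled = df.fillna(col_means)
print(df_filled)

# fillna returns a new DataFrame, so the original still has its gaps
missing_before = int(df[['A', 'B']].isna().sum().sum())
missing_after = int(df_filled[['A', 'B']].isna().sum().sum())
```

Because the text column has no entry in `col_means`, its missing value is left as it was; it would need its own fill strategy, such as a fixed placeholder string.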