Replacing missing values with column mean in Pandas DataFrame
Replace missing values in a Pandas DataFrame with the mean of the column.
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
# Create a sample DataFrame with missing values
data = {'A': [1, 2, np.nan, 4, 5],
        'B': [10, np.nan, 30, 40, 50],
        'C': [100, 200, 300, np.nan, 500],
        'D': [1000, 2000, 3000, 4000, np.nan]}
df = pd.DataFrame(data)
# Replace missing values with the mean of each column
df_filled = df.fillna(df.mean())
# Display the DataFrame with missing values replaced
print(df_filled)
Output:
     A     B      C       D
0  1.0  10.0  100.0  1000.0
1  2.0  32.5  200.0  2000.0
2  3.0  30.0  300.0  3000.0
3  4.0  40.0  275.0  4000.0
4  5.0  50.0  500.0  2500.0
Explanation:
In the exercise above:
- A sample DataFrame (df) is created with one missing value (np.nan) in each column.
- df.mean() calculates the mean of each column, skipping missing values by default.
- df.fillna(df.mean()) replaces the missing values in each column with that column's mean.
- The result is a new DataFrame (df_filled); the original df is left unchanged.
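The steps above rely on fillna() matching the index of the Series returned by df.mean() against the DataFrame's column names. The sketch below makes that alignment visible on a slightly different, hypothetical DataFrame that also contains a text column (the 'label' column and the name col_means are assumptions for illustration, not part of the original exercise). Passing numeric_only=True restricts the mean to numeric columns, so the text column is neither averaged nor filled.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two numeric columns plus a text column
# (the 'label' column is an assumption added for illustration)
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4, 5],
    'B': [10, np.nan, 30, 40, 50],
    'label': ['x', 'y', None, 'z', 'w'],
})

# Compute the mean of the numeric columns only; NaN values are skipped by default
col_means = df.mean(numeric_only=True)

# fillna() matches the Series index ('A', 'B') against column names,
# so each column is filled with its own mean; 'label' has no entry and stays as is
df_filled = df.fillna(col_means)

print(col_means)
print(df_filled)
```

Columns without an entry in the Series passed to fillna() are not filled, so the missing value in 'label' remains and would need its own strategy, such as a fixed placeholder string.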