w3resource

Replacing missing values with column mean in Pandas DataFrame


Replace missing values in a Pandas DataFrame with the mean of the column.

Sample Solution:

Python Code:

import pandas as pd
import numpy as np

# Create a sample DataFrame with missing values
data = {'A': [1, 2, np.nan, 4, 5],
        'B': [10, np.nan, 30, 40, 50],
        'C': [100, 200, 300, np.nan, 500],
        'D': [1000, 2000, 3000, 4000, np.nan]}

df = pd.DataFrame(data)

# Replace missing values with the mean of each column
df_filled = df.fillna(df.mean())

# Display the DataFrame with missing values replaced
print(df_filled)

Output:

     A     B      C       D
0  1.0  10.0  100.0  1000.0
1  2.0  32.5  200.0  2000.0
2  3.0  30.0  300.0  3000.0
3  4.0  40.0  275.0  4000.0
4  5.0  50.0  500.0  2500.0
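
The filled values in the output can be checked by hand. As a minimal sketch, taking column B as the example, the mean is computed over the four non-missing entries only, which gives the 32.5 that appears in row 1. The names b and manual_mean are illustrative and not part of the original solution.

```python
import numpy as np
import pandas as pd

# Column B from the sample data, with one missing value
b = pd.Series([10, np.nan, 30, 40, 50])

# Mean over the four non-missing entries only
manual_mean = (10 + 30 + 40 + 50) / 4
print(manual_mean)  # 32.5

# Series.mean() skips NaN by default, so it gives the same value
print(b.mean())     # 32.5
```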

Explanation:

In the exercise above,

  • A sample DataFrame (df) is created with some missing values (represented by np.nan).
  • df.mean() calculates the mean of each column, skipping missing values by default.
  • df.fillna(df.mean()) replaces the missing values in each column with the mean of that column.
  • The result is a new DataFrame (df_filled) with missing values replaced by the mean of each column; the original df is left unchanged.
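
The steps above assume every column is numeric. As a hedged sketch of one way to apply the same idea when a DataFrame also holds text columns, the mean can be restricted to numeric columns with numeric_only=True, so that only those columns are filled. The DataFrame df_mixed and its column names are made up for illustration and do not come from the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with one numeric and one text column
df_mixed = pd.DataFrame({
    "score": [1.0, np.nan, 3.0],
    "label": ["x", None, "z"],
})

# Compute means for numeric columns only; "label" is skipped
col_means = df_mixed.mean(numeric_only=True)

# fillna matches the means to columns by name, so only "score" is filled
filled = df_mixed.fillna(col_means)
print(filled)
```

The missing entry in the text column stays missing here, since a mean is not defined for it; it would need a separate strategy, such as a placeholder string.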

Flowchart:

Flowchart: Replacing missing values with column mean in Pandas DataFrame.
