Pandas: Fill missing values in time series data
Write a Pandas program to fill missing values in time series data.
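From Wikipedia: in the mathematical field of numerical analysis, interpolation is a type of estimation, a method of constructing new data points within the range of a discrete set of known data points. In Pandas, DataFrame.interpolate() fills NaN values this way; by default it uses linear interpolation and treats the rows as equally spaced.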
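As a minimal sketch of the idea, linear interpolation estimates a missing point on the straight line between its two known neighbours. The example below uses NumPy's `np.interp`; the sample points and variable names are illustrative assumptions, not part of the exercise.

```python
import numpy as np

# Known data points (illustrative values): x positions and their y values.
x_known = np.array([0.0, 2.0])
y_known = np.array([7.0, 10.0])

# Estimate y at x = 1.0, halfway between the two known points.
x_new = 1.0
y_est = np.interp(x_new, x_known, y_known)

# The same estimate from the linear interpolation formula:
# y = y0 + (x - x0) * (y1 - y0) / (x1 - x0)
x0, x1 = x_known
y0, y1 = y_known
y_manual = y0 + (x_new - x0) * (y1 - y0) / (x1 - x0)

print(y_est, y_manual)  # both are 8.5
```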
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
sdata = {"c1":[120, 130 ,140, 150, np.nan, 170], "c2":[7, np.nan, 10, np.nan, 5.5, 16.5]}
df = pd.DataFrame(sdata)
# Index the rows by six consecutive business days starting 2000-01-03
df.index = pd.bdate_range("2000-01-03", periods=6)
print("Original DataFrame:")
print(df)
print("\nDataFrame after interpolate:")
print(df.interpolate())
Sample Output:
Original DataFrame:
               c1    c2
2000-01-03  120.0   7.0
2000-01-04  130.0   NaN
2000-01-05  140.0  10.0
2000-01-06  150.0   NaN
2000-01-07    NaN   5.5
2000-01-10  170.0  16.5

DataFrame after interpolate:
               c1     c2
2000-01-03  120.0   7.00
2000-01-04  130.0   8.50
2000-01-05  140.0  10.00
2000-01-06  150.0   7.75
2000-01-07  160.0   5.50
2000-01-10  170.0  16.50
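The default linear method ignores the index, so the weekend gap between 2000-01-07 and 2000-01-10 does not affect the estimate for c1. The sketch below shows two alternatives that may suit time series better: `method="time"`, which weights each estimate by the actual time gaps in the index, and `ffill()`, which carries the last known value forward. The variable names `by_time` and `carried` are illustrative, and the index is built with `pd.bdate_range` on the assumption that it should match the business days shown in the sample output.

```python
import numpy as np
import pandas as pd

sdata = {"c1": [120, 130, 140, 150, np.nan, 170],
         "c2": [7, np.nan, 10, np.nan, 5.5, 16.5]}
# Six business days starting 2000-01-03 (assumed to match the sample output).
df = pd.DataFrame(sdata, index=pd.bdate_range("2000-01-03", periods=6))

# Time-weighted interpolation: 2000-01-07 lies one day into the four-day
# gap between 2000-01-06 (150) and 2000-01-10 (170).
by_time = df.interpolate(method="time")
print("Time-weighted interpolation:")
print(by_time)

# Forward fill: each NaN takes the most recent non-missing value above it.
carried = df.ffill()
print("\nForward fill:")
print(carried)
```

With these inputs the time-weighted estimate for c1 on 2000-01-07 is 155.0 rather than 160.0, because that date is closer to the earlier observation than to the later one. The c2 estimates are unchanged, since their neighbours are exactly one day away on each side.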