Mastering Pandas: 100 Exercises with solutions for Python data analysis
Welcome to w3resource's 100 Pandas exercises collection! This comprehensive set of exercises is designed to help you master the fundamentals of Pandas, a powerful data manipulation and analysis library in Python. Whether you're a beginner or an experienced user looking to improve your skills, these exercises cover a wide range of topics. They provide practical challenges to enhance your Pandas understanding.
Exercise 1:
Create a DataFrame from a dictionary of lists.
Solution:
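A minimal sketch of one way to do this; the column names and values are assumptions chosen to match the output shown.

```python
import pandas as pd

# Assumed input: each dictionary key becomes a column, each list its values.
data = {"X": [1, 2, 3, 4], "Y": [5, 6, 7, 8]}
df = pd.DataFrame(data)
print(df)
```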
Output:
   X  Y
0  1  5
1  2  6
2  3  7
3  4  8
Exercise 2:
Select the first 3 rows of a DataFrame.
Solution:
Output:
   X  Y
0  1  5
1  2  6
2  3  7
Exercise 3:
Select the 'X' column from a DataFrame.
Solution:
Output:
0    1
1    2
2    3
3    4
Name: X, dtype: int64
Exercise 4:
Filter rows based on a column condition.
Solution:
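A sketch using boolean indexing; the threshold `X > 2` and the input values are assumptions that reproduce the output shown.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4], "Y": [5, 6, 7, 8]})

# The comparison builds a boolean mask; indexing with it keeps the True rows.
filtered_df = df[df["X"] > 2]
print(filtered_df)
```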
Output:
   X  Y
2  3  7
3  4  8
Exercise 5:
Add a new column to an existing DataFrame.
Solution:
Output:
   X  Y   Z
0  1  5   6
1  2  6   8
2  3  7  10
3  4  8  12
Exercise 6:
Remove a column from a DataFrame.
Solution:
Output:
   X  Y
0  1  5
1  2  6
2  3  7
3  4  8
Exercise 7:
Sort a DataFrame by a column.
Solution:
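A sketch with `sort_values`; the unsorted input is an assumption inferred from the row labels in the output shown.

```python
import pandas as pd

# Assumed input, stored in descending order of X.
df = pd.DataFrame({"X": [4, 3, 2, 1], "Y": [8, 7, 6, 5]})

# sort_values returns a new DataFrame and keeps the original row labels.
sorted_df = df.sort_values(by="X")
print(sorted_df)
```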
Output:
   X  Y
3  1  5
2  2  6
1  3  7
0  4  8
Exercise 8:
Group a DataFrame by a column and calculate the mean of each group.
Solution:
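A sketch with `groupby(...).mean()`; the input values are assumptions picked so the group means match the output shown.

```python
import pandas as pd

# Assumed input: two groups in X, two Y values per group.
df = pd.DataFrame({"X": [1, 1, 2, 2], "Y": [5, 7, 6, 8]})

# Group rows by X, then average the remaining column within each group.
grouped_df = df.groupby("X").mean()
print(grouped_df)
```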
Output:
     Y
X
1  6.0
2  7.0
Exercise 9:
Replace missing values in a DataFrame.
Solution:
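A sketch with `fillna`; the positions of the missing values and the fill value 0 are assumptions consistent with the output shown.

```python
import numpy as np
import pandas as pd

# Assumed input with one missing value per column.
df = pd.DataFrame({"X": [1, 2, np.nan, 4], "Y": [5, np.nan, 7, 8]})

# fillna returns a copy in which every NaN is replaced by the given value.
filled_df = df.fillna(0)
print(filled_df)
```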
Output:
     X    Y
0  1.0  5.0
1  2.0  0.0
2  0.0  7.0
3  4.0  8.0
Exercise 10:
Convert a column to datetime.
Solution:
Output:
           X
0 2020-01-01
1 2020-01-02
2 2020-01-03
Exercise 11:
Create a DataFrame with specific column names.
Solution:
Output:
   col1  col2
0     1     4
1     2     5
2     3     6
Exercise 12:
Calculate the sum of values in each column.
Solution:
Output:
X     6
Y    15
dtype: int64
Exercise 13:
Calculate the mean of values in each row.
Solution:
Output:
0    2.5
1    3.5
2    4.5
dtype: float64
Exercise 14:
Concatenate two DataFrames.
Solution:
Output:
   X  Y
0  1  4
1  2  5
2  3  6
Exercise 15:
Merge two DataFrames on a key.
Solution:
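A sketch with `pd.merge`; the two input frames, including the non-matching keys `Z` and `W`, are assumptions that yield the output shown under the default inner join.

```python
import pandas as pd

# Assumed inputs: only the keys "X" and "Y" appear in both frames.
df1 = pd.DataFrame({"key": ["X", "Y", "Z"], "value1": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["X", "Y", "W"], "value2": [4, 5, 6]})

# The default how="inner" keeps only keys present in both frames.
merged_df = pd.merge(df1, df2, on="key")
print(merged_df)
```

Passing `how="left"`, `how="right"`, or `how="outer"` would keep the unmatched keys and fill the missing side with NaN.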
Output:
  key  value1  value2
0   X       1       4
1   Y       2       5
Exercise 16:
Create a pivot table from a DataFrame.
Solution:
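A sketch with `pd.pivot_table`; the long-format input and the value column name `Z` are assumptions that reproduce the output shown.

```python
import pandas as pd

# Assumed long-format input: one row per (X, Y) combination.
df = pd.DataFrame({
    "X": ["foo", "foo", "bar", "bar"],
    "Y": ["one", "two", "one", "two"],
    "Z": [1, 2, 3, 4],
})

# The default aggregation is the mean, which is why the result holds floats.
pivot_table = pd.pivot_table(df, values="Z", index="X", columns="Y")
print(pivot_table)
```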
Output:
Y    one  two
X
bar  3.0  4.0
foo  1.0  2.0
Exercise 17:
Reshape a DataFrame from long to wide format.
Solution:
Output:
Y    one  two
X
bar    3    4
foo    1    2
Exercise 18:
Calculate the correlation between columns in a DataFrame.
Solution:
Output:
     X    Y
X  1.0 -1.0
Y -1.0  1.0
Exercise 19:
Iterate over rows in a DataFrame using iterrows().
Solution:
Output:
0 1 4
1 2 5
2 3 6
Exercise 20:
Apply a function to each element in a DataFrame.
Solution:
Output:
   X   Y
0  2   8
1  4  10
2  6  12
Exercise 21:
Create a DataFrame from a list of dictionaries.
Solution:
Output:
   X  Y
0  1  2
1  3  4
Exercise 22:
Rename columns in a DataFrame.
Solution:
Output:
   X  Y
0  1  4
1  2  5
2  3  6
Exercise 23:
Filter rows by multiple conditions.
Solution:
Output:
   X  Y
2  3  6
Exercise 24:
Calculate the cumulative sum of a column.
Solution:
Output:
   X  Cumulative_Sum
0  1               1
1  2               3
2  3               6
3  4              10
Exercise 25:
Drop rows with missing values.
Solution:
Output:
     X    Y
0  1.0  4.0
1  2.0  5.0
Exercise 26:
Replace values in a DataFrame based on a condition.
Solution:
Output:
   X  Y
0  1  5
1  2  6
2  3  0
3  4  0
Exercise 27:
Create a DataFrame with a MultiIndex.
Solution:
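A sketch that builds the index with `pd.MultiIndex.from_arrays`; the level names and values are read off the output shown, and the construction method itself is an assumption.

```python
import pandas as pd

# Two parallel arrays, one per index level.
arrays = [["X", "X", "Y", "Y"], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=("Group", "Number"))

df = pd.DataFrame({"Value": [10, 20, 30, 40]}, index=index)
print(df)
```

A single row can then be selected with a tuple of labels, for example `df.loc[("Y", 1)]`.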
Output:
              Value
Group Number
X     1          10
      2          20
Y     1          30
      2          40
Exercise 28:
Calculate the rolling mean of a column.
Solution:
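A sketch with `rolling(...).mean()`; the window size of 3 is an assumption inferred from the two leading NaN values in the output shown.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4, 5, 6]})

# Each value is the mean of the current row and the two before it;
# the first two rows have too few observations and stay NaN.
df["Rolling_Mean"] = df["X"].rolling(window=3).mean()
print(df)
```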
Output:
   X  Rolling_Mean
0  1           NaN
1  2           NaN
2  3           2.0
3  4           3.0
4  5           4.0
5  6           5.0
Exercise 29:
Create a DataFrame from a list of tuples.
Solution:
Output:
   X  Y
0  1  2
1  3  4
2  5  6
Exercise 30:
Add a row to a DataFrame.
Solution:
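A sketch using `pd.concat`, which works on current pandas releases (`DataFrame.append` was removed in pandas 2.0). The input values are assumptions matching the output shown, and the method used by the original solution is not known.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2], "Y": [3, 4]})

# Wrap the new row in a one-row DataFrame, then stack it under the original.
new_row = pd.DataFrame({"X": [5], "Y": [6]})
df = pd.concat([df, new_row], ignore_index=True)  # renumber rows 0..n-1
print(df)
```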
Output:
   X  Y
0  1  3
1  2  4
2  5  6
Exercise 31:
Create a DataFrame with random values.
Solution:
Output:
          X         Y         Z
0  0.688292  0.950264  0.665916
1  0.497719  0.840536  0.923938
2  0.285218  0.091178  0.722034
3  0.037824  0.248689  0.584696
Exercise 32:
Calculate the rank of values in a DataFrame.
Solution:
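A sketch with `Series.rank`; the input values are taken from the output shown, and ranking column `X` is an assumption that matches the `Rank` values there.

```python
import pandas as pd

df = pd.DataFrame({"X": [3, 1, 4, 1], "Y": [2, 3, 1, 4]})

# The default method="average" gives tied values the mean of their ranks,
# so the two 1s share rank (1 + 2) / 2 = 1.5.
df["Rank"] = df["X"].rank()
print(df)
```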
Output:
   X  Y  Rank
0  3  2   3.0
1  1  3   1.5
2  4  1   4.0
3  1  4   1.5
Exercise 33:
Change the data type of a column.
Solution:
Output:
   X
0  1
1  2
2  3
Exercise 34:
Filter rows based on string matching.
Solution:
Output:
     X
1  bar
2  baz
Exercise 35:
Create a DataFrame with specified row and column labels.
Solution:
Output:
      col1  col2  col3
row1     1     2     3
row2     4     5     6
row3     7     8     9
Exercise 36:
Transpose a DataFrame.
Solution:
Output:
   0  1  2
X  1  2  3
Y  4  5  6
Exercise 37:
Set a column as the index of a DataFrame.
Solution:
Output:
   Y
X
1  4
2  5
3  6
Exercise 38:
Reset the index of a DataFrame.
Solution:
Output:
   X  Y
0  1  4
1  2  5
2  3  6
Exercise 39:
Add a prefix or suffix to column names.
Solution:
Output:
   col_X  col_Y
0      1      4
1      2      5
2      3      6
Exercise 40:
Filter rows based on datetime index.
Solution:
Output:
            X
2020-01-03  3
2020-01-04  4
2020-01-05  5
Exercise 41:
Create a DataFrame with duplicate rows and remove duplicates.
Solution:
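A sketch with `drop_duplicates`; the input, with row 2 repeating row 1, is an assumption inferred from the missing label 2 in the output shown.

```python
import pandas as pd

# Assumed input: row 2 is an exact copy of row 1.
df = pd.DataFrame({"X": [1, 2, 2, 3], "Y": [4, 5, 5, 6]})

# The default keep="first" retains the first occurrence and leaves
# the surviving row labels unchanged.
df_unique = df.drop_duplicates()
print(df_unique)
```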
Output:
   X  Y
0  1  4
1  2  5
3  3  6
Exercise 42:
Create a DataFrame with hierarchical index.
Solution:
Output:
              Value
Group Number
X     1          10
      2          20
Y     1          30
      2          40
Exercise 43:
Calculate the difference between consecutive rows in a DataFrame.
Solution:
Output:
    X  Difference
0   1         NaN
1   3         2.0
2   6         3.0
3  10         4.0
Exercise 44:
Create a DataFrame with hierarchical columns.
Solution:
Output:
Group  X       Y
Type  C1  C2  C1  C2
0      1   2   3   4
1      5   6   7   8
2      9  10  11  12
Exercise 45:
Filter rows based on the length of strings in a column.
Solution:
Output:
Empty DataFrame
Columns: [X]
Index: []
Exercise 46:
Calculate the percentage change between rows in a DataFrame.
Solution:
Output:
   X  Pct_Change
0  1         NaN
1  2    1.000000
2  3    0.500000
3  4    0.333333
Exercise 47:
Create a DataFrame from a dictionary of Series.
Solution:
Output:
   X  Y
0  1  4
1  2  5
2  3  6
Exercise 48:
Filter rows based on whether a column value is in a list.
Solution:
Output:
   X  Y
1  2  6
2  3  7
Exercise 49:
Calculate the z-score of values in a DataFrame.
Solution:
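A sketch that computes the z-score by hand. The values in the output shown are consistent with the population standard deviation (`ddof=0`), so that is assumed here; the column name `zscore_A` is copied from that output.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4], "Y": [4, 5, 6, 7]})

# z = (value - mean) / standard deviation, using the population
# standard deviation (ddof=0) rather than pandas' default ddof=1.
df["zscore_A"] = (df["X"] - df["X"].mean()) / df["X"].std(ddof=0)
print(df)
```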
Output:
   X  Y  zscore_A
0  1  4 -1.341641
1  2  5 -0.447214
2  3  6  0.447214
3  4  7  1.341641
Exercise 50:
Create a DataFrame with random integers and calculate descriptive statistics.
Solution:
Output:
               X          Y          Z
count   5.000000   5.000000   5.000000
mean   60.600000  71.800000  42.600000
std    38.435661  13.971399  12.218838
min     5.000000  53.000000  28.000000
25%    40.000000  64.000000  34.000000
50%    69.000000  72.000000  41.000000
75%    91.000000  82.000000  55.000000
max    98.000000  88.000000  55.000000
Exercise 51:
Calculate the rank of values in each column of a DataFrame.
Solution:
Output:
   X  Y  Rank_A  Rank_B
0  3  2     3.0     2.0
1  1  3     1.5     3.0
2  4  1     4.0     1.0
3  1  4     1.5     4.0
Exercise 52:
Filter rows based on multiple string conditions.
Solution:
Output:
     X
1  bar
2  baz
3  qux
Exercise 53:
Create a DataFrame with random values and calculate the skewness.
Solution:
Exercise 54:
Create a DataFrame and calculate the kurtosis.
Solution:
Output:
X    2.958407
Y   -2.639654
Z    2.704430
dtype: float64
Exercise 55:
Calculate the cumulative product of a column in a DataFrame.
Solution:
Output:
   X  Cumulative_Product
0  1                   1
1  2                   2
2  3                   6
3  4                  24
Exercise 56:
Create a DataFrame and calculate the rolling standard deviation.
Solution:
Output:
   X  Rolling_Std
0  1          NaN
1  2          NaN
2  3          1.0
3  4          1.0
4  5          1.0
5  6          1.0
Exercise 57:
Create a DataFrame and calculate the expanding mean.
Solution:
Output:
   X  Expanding_Mean
0  1             1.0
1  2             1.5
2  3             2.0
3  4             2.5
4  5             3.0
5  6             3.5
Exercise 58:
Create a DataFrame with random values and calculate the covariance matrix.
Solution:
Output:
          X         Y         Z
X  0.054079  0.007398 -0.031403
Y  0.007398  0.053211 -0.020480
Z -0.031403 -0.020480  0.048057
Exercise 59:
Create a DataFrame with random values and calculate the correlation matrix.
Solution:
Output:
          X         Y         Z
X  1.000000 -0.258187  0.541044
Y -0.258187  1.000000 -0.432419
Z  0.541044 -0.432419  1.000000
Exercise 60:
Create a DataFrame and calculate the rolling correlation between two columns.
Solution:
Output:
   X  Y  Rolling_Corr
0  1  6           NaN
1  2  5           NaN
2  3  4          -1.0
3  4  3          -1.0
4  5  2          -1.0
5  6  1          -1.0
Exercise 61:
Create a DataFrame and calculate the expanding variance.
Solution:
Output:
   X  Expanding_Var
0  1            NaN
1  2       0.500000
2  3       1.000000
3  4       1.666667
4  5       2.500000
5  6       3.500000
Exercise 62:
Create a DataFrame with datetime index and resample by month.
Solution:
Output:
               X
2020-01-31   465
2020-02-29  1305
2020-03-31  2325
2020-04-30   855
Exercise 63:
Create a DataFrame and calculate the exponential moving average.
Solution:
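A sketch with `ewm(...).mean()`. The parameters `span=3` and `adjust=False` are assumptions: they give a smoothing factor of 2 / (3 + 1) = 0.5, which reproduces the values in the output shown.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4, 5, 6]})

# With adjust=False the recursion is
#   ema[0] = x[0]
#   ema[t] = alpha * x[t] + (1 - alpha) * ema[t - 1],  alpha = 2 / (span + 1)
df["EMA"] = df["X"].ewm(span=3, adjust=False).mean()
print(df)
```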
Output:
   X      EMA
0  1  1.00000
1  2  1.50000
2  3  2.25000
3  4  3.12500
4  5  4.06250
5  6  5.03125
Exercise 64:
Create a DataFrame with random integers and calculate the mode.
Solution:
Output:
   X    Y    Z
0  2  1.0  2.0
1  3  3.0  7.0
2  5  NaN  NaN
3  6  NaN  NaN
4  9  NaN  NaN
Exercise 65:
Create a DataFrame and calculate the z-score of each column.
Solution:
Output:
   X  Y  zscore_A  zscore_B
0  1  4 -1.341641 -1.341641
1  2  5 -0.447214 -0.447214
2  3  6  0.447214  0.447214
3  4  7  1.341641  1.341641
Exercise 66:
Create a DataFrame with random values and calculate the median.
Solution:
Output:
X    0.787042
Y    0.477837
Z    0.696911
dtype: float64
Exercise 67:
Create a DataFrame and apply a custom function to each column.
Solution:
Output:
   X  Y
0  2  5
1  3  6
2  4  7
Exercise 68:
Create a DataFrame with hierarchical index and calculate the mean for each group.
Solution:
Output:
       Value
Group
X       15.0
Y       35.0
Exercise 69:
Create a DataFrame and calculate the percentage of missing values in each column.
Solution:
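A sketch that combines `isnull` and `mean`; the input, with one missing value in each four-row column, is an assumption matching the 25% figures in the output shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, np.nan, 4], "Y": [5, np.nan, 7, 8]})

# isnull() yields True/False per cell; the column mean of those booleans
# is the fraction of missing values, scaled here to a percentage.
missing_percentage = df.isnull().mean() * 100
print(missing_percentage)
```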
Output:
X    25.0
Y    25.0
dtype: float64
Exercise 70:
Create a DataFrame and apply a custom function to each row.
Solution:
Output:
   X  Y  Sum
0  1  4    5
1  2  5    7
2  3  6    9
Exercise 71:
Create a DataFrame with random values and calculate the quantiles.
Solution:
Output:
             X         Y         Z
0.25  0.174265  0.184036  0.520573
0.50  0.468040  0.315593  0.644571
0.75  0.767870  0.436426  0.771297
Exercise 72:
Create a DataFrame and calculate the interquartile range (IQR).
Solution:
Output:
X    0.354244
Y    0.329573
Z    0.245520
dtype: float64
Exercise 73:
Create a DataFrame with datetime index and calculate the rolling mean.
Solution:
Output:
            X  Rolling_Mean
2020-01-01  0           NaN
2020-01-02  1           NaN
2020-01-03  2           1.0
2020-01-04  3           2.0
2020-01-05  4           3.0
2020-01-06  5           4.0
2020-01-07  6           5.0
2020-01-08  7           6.0
2020-01-09  8           7.0
2020-01-10  9           8.0
Exercise 74:
Create a DataFrame and calculate the cumulative maximum.
Solution:
Output:
   X  Cumulative_Max
0  1               1
1  2               2
2  3               3
3  2               3
4  1               3
Exercise 75:
Create a DataFrame and calculate the cumulative minimum.
Solution:
Output:
   X  Cumulative_Min
0  1               1
1  2               1
2  3               1
3  2               1
4  1               1
Exercise 76:
Create a DataFrame with random values and calculate the cumulative variance.
Solution:
Output:
          X         Y         Z  Cumulative_Var
0  0.315669  0.900791  0.404858             NaN
1  0.462000  0.463257  0.922495        0.010706
2  0.328968  0.200027  0.967625        0.006548
3  0.630370  0.992849  0.231884        0.021460
4  0.574397  0.968600  0.926893        0.020023
5  0.204077  0.889864  0.589022        0.027130
6  0.386806  0.630882  0.242157        0.022759
7  0.319831  0.935747  0.829739        0.020630
8  0.786435  0.377739  0.879458        0.034407
9  0.523467  0.077937  0.764476        0.031194
Exercise 77:
Create a DataFrame and apply a custom function to each element.
Solution:
Output:
   X   Y
0  2   8
1  4  10
2  6  12
Exercise 78:
Create a DataFrame with random values and calculate the z-score for each element.
Solution:
Output:
          X         Y         Z
0  1.027393  0.656858  1.032853
1  0.674079 -1.277904 -0.220065
2 -0.996641 -0.298841  0.475217
3 -0.704831  0.919887 -1.288005
Exercise 79:
Create a DataFrame and calculate the cumulative sum for each group.
Solution:
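A sketch with `groupby(...).cumsum()`; the input values are read off the output shown.

```python
import pandas as pd

df = pd.DataFrame({"X": ["foo", "bar", "foo", "bar"], "Y": [1, 2, 3, 4]})

# cumsum restarts for each group in X and returns a Series aligned
# with the original row order, so it can be assigned back directly.
df["Cumulative_Sum"] = df.groupby("X")["Y"].cumsum()
print(df)
```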
Output:
     X  Y  Cumulative_Sum
0  foo  1               1
1  bar  2               2
2  foo  3               4
3  bar  4               6
Exercise 80:
Create a DataFrame with random values and calculate the rank for each element.
Solution:
Output:
     X    Y    Z
0  4.0  3.0  3.0
1  3.0  2.0  2.0
2  1.0  4.0  1.0
3  2.0  1.0  4.0
Exercise 81:
Create a DataFrame and calculate the cumulative product for each group.
Solution:
Output:
     X  Y  Cumulative_Product
0  foo  1                   1
1  bar  2                   2
2  foo  3                   3
3  bar  4                   8
Exercise 82:
Create a DataFrame with random values and calculate the expanding sum.
Solution:
Output:
          X         Y         Z  Expanding_Sum
0  0.815750  0.062819  0.699743       0.815750
1  0.128772  0.843222  0.411903       0.944522
2  0.857516  0.219424  0.234460       1.802038
3  0.011010  0.774375  0.259412       1.813048
Exercise 83:
Create a DataFrame and calculate the expanding minimum for each group.
Solution:
Output:
     X  Y  Expanding_Min
0  foo  1            1.0
1  bar  2            2.0
2  foo  3            1.0
3  bar  4            2.0
Exercise 84:
Create a DataFrame with random values and calculate the expanding maximum for each group.
Solution:
Output:
          X         Y         Z  Expanding_Max
0  0.751392  0.015856  0.313990       0.015856
1  0.812436  0.701808  0.069307       0.701808
2  0.148614  0.838726  0.290646       0.838726
3  0.764419  0.586510  0.470466       0.586510
Exercise 85:
Create a DataFrame and calculate the expanding variance for each group.
Solution:
Output:
     X  Y  Expanding_Var
0  foo  1            NaN
1  bar  2            NaN
2  foo  3            2.0
3  bar  4            2.0
Exercise 86:
Create a DataFrame with random values and calculate the expanding standard deviation.
Solution:
Output:
          X         Y         Z  Expanding_Std
0  0.693184  0.088273  0.109510            NaN
1  0.031186  0.163005  0.803467       0.468103
2  0.294881  0.409395  0.278145       0.333272
3  0.918778  0.854961  0.791329       0.397322
Exercise 87:
Create a DataFrame and calculate the expanding covariance.
Solution:
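A sketch with `expanding().cov(...)`; the input values are read off the output shown, and computing the covariance of `X` against `Y` is an assumption that matches the values there.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4], "Y": [4, 3, 2, 1]})

# At each row, the sample covariance of X and Y over all rows so far.
# A single observation has no sample covariance, so the first row is NaN.
df["Expanding_Cov"] = df["X"].expanding().cov(df["Y"])
print(df)
```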
Output:
   X  Y  Expanding_Cov
0  1  4            NaN
1  2  3      -0.500000
2  3  2      -1.000000
3  4  1      -1.666667
Exercise 88:
Create a DataFrame with random values and calculate the expanding correlation.
Solution:
Output:
          X         Y         Z  Expanding_Corr
0  0.094026  0.320246  0.044218             NaN
1  0.422531  0.002172  0.995907       -1.000000
2  0.265459  0.391239  0.589878       -0.751147
3  0.118812  0.061489  0.837821       -0.372750
Exercise 89:
Create a DataFrame and calculate the expanding median.
Solution:
Output:
   X  Expanding_Median
0  1               1.0
1  2               1.5
2  3               2.0
3  4               2.5
4  5               3.0
5  6               3.5
Exercise 90:
Create a DataFrame with datetime index and calculate the expanding mean for each group.
Solution:
Output:
              X  Y  Expanding_Mean
2020-01-01  foo  0             0.0
2020-01-02  bar  1             1.0
2020-01-03  foo  2             1.0
2020-01-04  bar  3             2.0
2020-01-05  foo  4             2.0
2020-01-06  bar  5             3.0
2020-01-07  foo  6             3.0
2020-01-08  bar  7             4.0
2020-01-09  foo  8             4.0
2020-01-10  bar  9             5.0
Exercise 91:
Create a DataFrame with random values and calculate the rolling sum for each group.
Solution:
Output:
          X         Y         Z  Rolling_Sum
0  0.342706  0.579330  0.902681          NaN
1  0.182432  0.163406  0.156607          NaN
2  0.983085  0.052785  0.588865          NaN
3  0.756982  0.123991  0.704262          NaN
4  0.876875  0.710953  0.923588          NaN
5  0.359818  0.135520  0.277327          NaN
6  0.693156  0.590918  0.985834          NaN
7  0.892253  0.633529  0.169000          NaN
8  0.084238  0.007579  0.076730          NaN
9  0.663869  0.780832  0.644874          NaN
Exercise 92:
Create a DataFrame and calculate the rolling mean for each group.
Solution:
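A sketch that chains `groupby` and `rolling`; the alternating `foo`/`bar` input and the window of 3 are assumptions inferred from the output shown.

```python
import pandas as pd

df = pd.DataFrame({"X": ["foo", "bar"] * 5, "Y": list(range(10))})

# The grouped rolling result is indexed by (group, original row label).
# Dropping the group level restores the original labels so the values
# align with df when assigned back.
rolling_mean = df.groupby("X")["Y"].rolling(window=3).mean()
df["Rolling_Mean"] = rolling_mean.reset_index(level=0, drop=True)
print(df)
```

Within each group the first two rows lack a full window, which is why rows 0 to 3 are NaN.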
Output:
     X  Y  Rolling_Mean
0  foo  0           NaN
1  bar  1           NaN
2  foo  2           NaN
3  bar  3           NaN
4  foo  4           2.0
5  bar  5           3.0
6  foo  6           4.0
7  bar  7           5.0
8  foo  8           6.0
9  bar  9           7.0
Exercise 93:
Create a DataFrame with random values and calculate the rolling standard deviation for each group.
Solution:
Output:
          X         Y         Z  Rolling_Std
0  0.154838  0.162793  0.808882          NaN
1  0.740167  0.920318  0.650240          NaN
2  0.033449  0.007883  0.249656          NaN
3  0.983601  0.261995  0.399816          NaN
4  0.883155  0.051084  0.125735          NaN
5  0.986930  0.470328  0.612276          NaN
6  0.981338  0.016731  0.627210          NaN
7  0.670522  0.247346  0.530971          NaN
8  0.978909  0.752500  0.903401          NaN
9  0.185614  0.362602  0.541459          NaN
Exercise 94:
Create a DataFrame and calculate the rolling variance for each group.
Solution:
Output:
     X  Y  Rolling_Var
0  foo  0          NaN
1  bar  1          NaN
2  foo  2          NaN
3  bar  3          NaN
4  foo  4          4.0
5  bar  5          4.0
6  foo  6          4.0
7  bar  7          4.0
8  foo  8          4.0
9  bar  9          4.0
Exercise 95:
Create a DataFrame with random values and calculate the rolling correlation for each group.
Solution:
Output:
          X         Y         Z Group  Rolling_Corr
0  0.374540  0.950714  0.731994     A           NaN
1  0.598658  0.156019  0.155995     A           NaN
2  0.058084  0.866176  0.601115     A      0.992633
3  0.708073  0.020584  0.969910     A     -0.095420
4  0.832443  0.212339  0.181825     A     -0.180021
5  0.183405  0.304242  0.524756     B           NaN
6  0.431945  0.291229  0.611853     B           NaN
7  0.139494  0.292145  0.366362     A     -0.869948
8  0.456070  0.785176  0.199674     B     -0.984073
9  0.514234  0.592415  0.046450     B     -0.788379
Exercise 96:
Create a DataFrame and calculate the rolling covariance for each group.
Solution:
Output:
     X  Y   Z  Rolling_Cov
0  foo  0  10          NaN
1  bar  1  11          NaN
2  foo  2  12          NaN
3  bar  3  13          NaN
4  foo  4  14          4.0
5  bar  5  15          4.0
6  foo  6  16          4.0
7  bar  7  17          4.0
8  foo  8  18          4.0
9  bar  9  19          4.0
Exercise 97:
Create a DataFrame with random values and calculate the rolling skewness for each group.
Solution:
Output:
          X         Y         Z  Rolling_Skew
0  0.808397  0.304614  0.097672           NaN
1  0.684233  0.440152  0.122038           NaN
2  0.495177  0.034389  0.909320           NaN
3  0.258780  0.662522  0.311711           NaN
4  0.520068  0.546710  0.184854           NaN
5  0.969585  0.775133  0.939499           NaN
6  0.894827  0.597900  0.921874           NaN
7  0.088493  0.195983  0.045227           NaN
8  0.325330  0.388677  0.271349           NaN
9  0.828738  0.356753  0.280935           NaN
Exercise 98:
Create a DataFrame and calculate the rolling kurtosis for each group.
Solution:
Output:
     X  Y  Rolling_Kurt
0  foo  0           NaN
1  bar  1           NaN
2  foo  2           NaN
3  bar  3           NaN
4  foo  4           NaN
5  bar  5           NaN
6  foo  6           NaN
7  bar  7           NaN
8  foo  8           NaN
9  bar  9           NaN
Exercise 99:
Create a DataFrame with random values and calculate the rolling median for each group.
Solution:
Output:
          X         Y         Z  Rolling_Median
0  0.542696  0.140924  0.802197             NaN
1  0.074551  0.986887  0.772245             NaN
2  0.198716  0.005522  0.815461             NaN
3  0.706857  0.729007  0.771270             NaN
4  0.074045  0.358466  0.115869             NaN
5  0.863103  0.623298  0.330898             NaN
6  0.063558  0.310982  0.325183             NaN
7  0.729606  0.637557  0.887213             NaN
8  0.472215  0.119594  0.713245             NaN
9  0.760785  0.561277  0.770967             NaN
Exercise 100:
Create a DataFrame and calculate the expanding sum for each group.
Solution:
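A sketch that chains `groupby` and `expanding`; the alternating `foo`/`bar` input is an assumption read off the output shown.

```python
import pandas as pd

df = pd.DataFrame({"X": ["foo", "bar"] * 5, "Y": list(range(10))})

# The grouped expanding sum is indexed by (group, original row label);
# dropping the group level lets the result align with df on assignment.
expanding_sum = df.groupby("X")["Y"].expanding().sum()
df["Expanding_Sum"] = expanding_sum.reset_index(level=0, drop=True)
print(df)
```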
Output:
     X  Y  Expanding_Sum
0  foo  0            0.0
1  bar  1            1.0
2  foo  2            2.0
3  bar  3            4.0
4  foo  4            6.0
5  bar  5            9.0
6  foo  6           12.0
7  bar  7           16.0
8  foo  8           20.0
9  bar  9           25.0