Calculating correlation matrix for DataFrame in Python
Python Pandas Numpy: Exercise-11 with Solution
Calculate the correlation matrix for a Pandas DataFrame.
Sample Solution:
Python Code:
import pandas as pd
# Create a sample DataFrame
data = {'Age': [25, 30, 22, 35, 28],
        'Salary': [50000, 60000, 45000, 70000, 55000],
        'Experience': [2, 5, 1, 8, 4]}
df = pd.DataFrame(data)
# Calculate the correlation matrix
correlation_matrix = df.corr()
# Display the correlation matrix
print(correlation_matrix)
Output:
                 Age    Salary  Experience
Age         1.000000  0.997791    0.995910
Salary      0.997791  1.000000    0.996616
Experience  0.995910  0.996616    1.000000
Explanation:
In the exercise above:
- First, we create a sample DataFrame (df) with the columns 'Age', 'Salary', and 'Experience'.
- The df.corr() method calculates the pairwise correlation between the DataFrame's columns, using the Pearson method by default. All three columns here are numeric.
- The resulting correlation_matrix is then printed to the console.
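The sample DataFrame contains only numeric columns. When a DataFrame also holds text columns, df.corr() may raise an error or skip those columns, depending on the pandas version. One way to get consistent behavior is to select the numeric columns explicitly before calling corr(). The sketch below assumes a hypothetical 'Name' column that is not part of the original exercise:

```python
import pandas as pd

# Same sample data as above, plus a hypothetical text column ('Name')
# added only to illustrate the mixed-type case.
mixed_df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000],
    'Experience': [2, 5, 1, 8, 4],
})

# Keep only the numeric columns, then compute the correlation matrix
numeric_corr = mixed_df.select_dtypes(include='number').corr()

print(numeric_corr)
```

The result has the same three rows and columns as the matrix in the exercise, because 'Name' is dropped before the calculation.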
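The correlation matrix shows the pairwise correlation between every pair of columns. Values range from -1 to 1, where 1 indicates a perfect positive linear correlation, -1 indicates a perfect negative linear correlation, and 0 indicates no linear correlation. In this sample, all off-diagonal values are above 0.99, so Age, Salary, and Experience rise together almost perfectly. The diagonal is always 1 because each column is perfectly correlated with itself.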
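Since this exercise series pairs Pandas with NumPy, the result can also be cross-checked with NumPy. The following is a minimal sketch, not part of the original solution: it rebuilds the matrix with np.corrcoef and recomputes the Age/Salary coefficient by hand from the Pearson formula. The variable names (numpy_matrix, manual_r, and so on) are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000],
    'Experience': [2, 5, 1, 8, 4],
})

pandas_matrix = df.corr()

# np.corrcoef treats each row as a variable by default;
# rowvar=False tells it that the variables are in the columns.
numpy_matrix = np.corrcoef(df.to_numpy(dtype=float), rowvar=False)

# Pearson coefficient for Age vs. Salary, computed by hand:
# sum of products of deviations divided by the square root of
# the product of the summed squared deviations.
age_dev = df['Age'].to_numpy(dtype=float) - df['Age'].mean()
salary_dev = df['Salary'].to_numpy(dtype=float) - df['Salary'].mean()
manual_r = (age_dev * salary_dev).sum() / np.sqrt(
    (age_dev ** 2).sum() * (salary_dev ** 2).sum()
)

# True if NumPy and pandas agree within floating-point tolerance
matrices_match = np.allclose(numpy_matrix, pandas_matrix.to_numpy())

print(matrices_match)
print(manual_r)
```

The hand-computed value should agree with the Age/Salary entry in the matrix printed by df.corr().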