Merging Pandas DataFrames on multiple columns
Merge two DataFrames based on multiple columns.
Sample Solution:
Python Code:
import pandas as pd
# Create two sample DataFrames
data1 = {'ID': [1, 2, 3, 4],
'Name': ['Imen', 'Karthika', 'Cosimo', 'Cathrine'],
'Department': ['HR', 'IT', 'Finance', 'IT']}
data2 = {'ID': [1, 2, 3, 5],
'Salary': [50000, 60000, 45000, 70000],
'Department': ['HR', 'IT', 'Finance', 'Marketing']}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
# Merge DataFrames based on 'ID' and 'Department'
merged_df = pd.merge(df1, df2, on=['ID', 'Department'], how='inner')
# Display the merged DataFrame
print(merged_df)
Output:
ID Name Department Salary 0 1 Imen HR 50000 1 2 Karthika IT 60000 2 3 Cosimo Finance 45000
Explanation:
Here's a breakdown of the above code:
- First we create two sample DataFrames (df1 and df2) with columns 'ID', 'Name', 'Department' and 'ID', 'Salary', 'Department', respectively.
- The pd.merge(df1, df2, on=['ID', 'Department'], how='inner') line merges the DataFrames based on the common columns 'ID' and 'Department'.
- The how='inner' parameter specifies that only the rows with matching values in both DataFrames will be included in the result.
- The resulting "merged_df" DataFrame contains the merged data.
Flowchart:
Python Code Editor:
Previous: Transposing DataFrame: Pandas data manipulation.
Next: Aggregating data in Pandas: Multiple functions example.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics