Ensuring Consistent Data Types in a Pandas DataFrame
Pandas: Data Validation Exercise-2 with Solution
Write a Pandas program to check and ensure that the data types of all columns are consistent.
The following exercise demonstrates how to check whether the data types of all columns are consistent by inspecting the DataFrame's dtypes attribute.
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with different data types
df = pd.DataFrame({
    'Name': ['Orville', 'Arturo', 'Ruth', None],
    'Age': [25, '30', 22, 35],
    'Salary': [50000, 60000, '70000', 80000]
})
# Check the data types of each column
data_types = df.dtypes
# Output the result
print(data_types)
Output:
Name      object
Age       object
Salary    object
dtype: object
Explanation:
- Created a DataFrame with mixed data types (strings and integers).
- Used dtypes to check the data type of each column.
- Name: The Name column contains strings and a None, which pandas treats as a missing value, so its data type is object.
- Age: The Age column contains mixed data types: integers (25, 22, 35) and a string ('30'). Pandas stores the whole column as object because a numeric dtype cannot hold a string.
- Salary: Similarly, the Salary column contains integers (50000, 60000, 80000) and a string ('70000'), so pandas stores this column as object as well.
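The solution above only checks the data types. One possible way to go further and make the columns consistent is sketched below. It assumes that Age and Salary are meant to be numeric (the `numeric_cols` list is an illustrative choice, not part of the original exercise) and uses `pd.to_numeric` to parse the string entries. The loop before the conversion lists the Python types found in each column, which helps show why dtypes reported object.

```python
import pandas as pd

# Same sample DataFrame as in the solution above
df = pd.DataFrame({
    'Name': ['Orville', 'Arturo', 'Ruth', None],
    'Age': [25, '30', 22, 35],
    'Salary': [50000, 60000, '70000', 80000]
})

# List the Python types that actually occur in each column
for col in df.columns:
    type_names = sorted({type(value).__name__ for value in df[col]})
    print(col, type_names)

# Assumption: these two columns are supposed to hold numbers
numeric_cols = ['Age', 'Salary']

# Parse each column as numeric; errors='raise' (the default) stops with an
# exception if a value cannot be parsed, instead of silently hiding it
for col in numeric_cols:
    df[col] = pd.to_numeric(df[col], errors='raise')

# Age and Salary should now report an integer dtype
print(df.dtypes)
```

If some values might not be parseable (for example 'thirty'), passing errors='coerce' instead turns them into NaN, at the cost of the column becoming a float dtype.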