w3resource

Generate and analyze synthetic data with NumPy and SciPy


NumPy: Integration with SciPy Exercise-16 with Solution


Write a NumPy program that generates synthetic data and uses SciPy's stats module to run two statistical tests: a two-sample t-test and a chi-square goodness-of-fit test.

Sample Solution:

Python Code:

import numpy as np
from scipy import stats

# Set the random seed for reproducibility
np.random.seed(42)

# Generate synthetic data: two samples from normal distributions
sample1 = np.random.normal(loc=50, scale=5, size=100)
sample2 = np.random.normal(loc=52, scale=5, size=100)

# Perform an independent two-sample t-test (assumes equal variances by default)
t_stat, t_p_value = stats.ttest_ind(sample1, sample2)

# Define observed and expected category counts for the chi-square test
# (both arrays should sum to the same total, here 100)
observed = np.array([40, 30, 20, 10])
expected = np.array([25, 25, 25, 25])

# Perform a chi-square goodness-of-fit test
chi2_stat, chi2_p_value = stats.chisquare(f_obs=observed, f_exp=expected)

# Print the results
print("Two-sample t-test results:")
print(f"t-statistic: {t_stat:.3f}, p-value: {t_p_value:.3f}")

print("\nChi-square test results:")
print(f"chi2-statistic: {chi2_stat:.3f}, p-value: {chi2_p_value:.3f}")

Output:

Two-sample t-test results:
t-statistic: -3.995, p-value: 0.000

Chi-square test results:
chi2-statistic: 20.000, p-value: 0.000
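
The p-values above print as 0.000 only because they are rounded to three decimals; they are small, not exactly zero. As a cross-check, the sketch below recomputes the chi-square statistic from its definition, the sum of (observed - expected)^2 / expected, and prints the p-values in scientific notation. The names chi2_manual, dof, and p_manual are illustrative and not part of the original solution.

```python
import numpy as np
from scipy import stats

observed = np.array([40, 30, 20, 10])
expected = np.array([25, 25, 25, 25])

# Chi-square statistic from its definition: sum((O - E)^2 / E)
chi2_manual = np.sum((observed - expected) ** 2 / expected)

# Degrees of freedom for a goodness-of-fit test: number of categories - 1
dof = observed.size - 1

# p-value: upper-tail probability of the chi-square distribution
p_manual = stats.chi2.sf(chi2_manual, dof)

# Same test through SciPy's convenience function
chi2_stat, chi2_p_value = stats.chisquare(f_obs=observed, f_exp=expected)

# Scientific notation keeps small p-values visible
print(f"manual: chi2 = {chi2_manual:.3f}, p = {p_manual:.3e}")
print(f"scipy:  chi2 = {chi2_stat:.3f}, p = {chi2_p_value:.3e}")
```

Both lines should report the same statistic and p-value, since stats.chisquare uses categories - 1 degrees of freedom by default.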

Explanation:

  • Import libraries:
    • Import NumPy as np and the stats module from SciPy.
  • Set random seed:
    • Call np.random.seed(42) so that the same random samples are drawn on every run.
  • Generate synthetic data:
    • Draw two samples of 100 values each from normal distributions with means 50 and 52 and a common standard deviation of 5.
  • Perform t-test:
    • Use SciPy's ttest_ind function to run an independent two-sample t-test, which checks whether the two sample means differ significantly.
  • Define data for chi-square test:
    • Create observed and expected frequency arrays for the chi-square test; both sum to 100.
  • Perform chi-square test:
    • Use SciPy's chisquare function to run a goodness-of-fit test that compares the observed counts with the expected counts.
  • Print results:
    • Print the test statistic and p-value of each test. A p-value shown as 0.000 is below 0.0005 after rounding, not exactly zero.
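
The t-test step can be extended in two ways, sketched below as one possible variation rather than part of the original solution. First, passing equal_var=False to ttest_ind runs Welch's t-test, which does not assume equal variances. Second, the p-value is compared with a significance level to reach a decision. The value alpha = 0.05 is an assumed, conventional choice and does not come from the exercise.

```python
import numpy as np
from scipy import stats

# Recreate the two samples from the solution
np.random.seed(42)
sample1 = np.random.normal(loc=50, scale=5, size=100)
sample2 = np.random.normal(loc=52, scale=5, size=100)

alpha = 0.05  # assumed significance level, not specified in the exercise

# Student's t-test (pooled variance) and Welch's t-test (unequal variances)
student = stats.ttest_ind(sample1, sample2)
welch = stats.ttest_ind(sample1, sample2, equal_var=False)

for name, res in (("Student", student), ("Welch", welch)):
    # Reject the null hypothesis of equal means when p < alpha
    decision = "reject H0" if res.pvalue < alpha else "fail to reject H0"
    print(f"{name}: t = {res.statistic:.3f}, p = {res.pvalue:.3e} -> {decision}")
```

Because both samples contain 100 observations, the Student and Welch t-statistics coincide; only the degrees of freedom, and therefore the p-values, differ slightly.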
