w3resource

NumPy: Compute the mean, standard deviation, and variance of a given array along the second axis


Write a NumPy program to compute the mean, standard deviation, and variance of a given array along the second axis.

From Wikipedia: There are several kinds of means in various branches of mathematics (especially statistics).
For a data set, the arithmetic mean, also called the mathematical expectation or average, is the central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values. The arithmetic mean of a set of numbers $x_1, x_2, \ldots, x_n$ is typically denoted by $\bar{x}$, pronounced "x bar". If the data set is based on a series of observations obtained by sampling from a statistical population, the arithmetic mean is called the sample mean (denoted $\bar{x}$) to distinguish it from the mean of the underlying distribution.
In probability and statistics, the population mean, or expected value, is a measure of the central tendency either of a probability distribution or of the random variable characterized by that distribution. In the case of a discrete probability distribution of a random variable $X$, the mean is equal to the sum over every possible value weighted by the probability of that value; that is, it is computed by taking the product of each possible value $x$ of $X$ and its probability $p_x$, and then adding all these products together, giving $\mu = \sum x \, p_x$. An analogous formula applies to the case of a continuous probability distribution. Not every probability distribution has a defined mean; see the Cauchy distribution for an example. Moreover, for some distributions the mean is infinite.
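
The probability-weighted sum described above can be sketched in NumPy. The fair six-sided die and the names values, probs, mu, and mu_avg are assumptions chosen for illustration; they are not part of the exercise.

```python
import numpy as np

# Hypothetical example: a fair six-sided die
values = np.arange(1, 7)       # possible outcomes x: 1, 2, ..., 6
probs = np.full(6, 1 / 6)      # probability p_x of each outcome

# Population mean: sum of each value multiplied by its probability
mu = np.sum(values * probs)

# np.average with a weights argument computes the same weighted sum,
# normalized by the sum of the weights (which is 1 here)
mu_avg = np.average(values, weights=probs)

print("Expected value:", mu)
print("Via np.average:", mu_avg)
```

Because every outcome has the same probability here, the result also matches the plain arithmetic mean np.mean(values).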

Sample Solution:

Python Code:

# Importing the NumPy library
import numpy as np

# Creating an array 'x' using arange with 6 elements
x = np.arange(6)

# Displaying the original array 'x'
print("\nOriginal array:")
print(x)

# Calculating the mean of the array 'x' using np.mean()
r1 = np.mean(x)

# Calculating the average of the array 'x' using np.average()
r2 = np.average(x)

# Asserting if the results from np.mean() and np.average() are close
assert np.allclose(r1, r2)

# Displaying the calculated mean of the array 'x'
print("\nMean: ", r1)

# Calculating the standard deviation of the array 'x' using np.std()
r1 = np.std(x)

# Calculating the standard deviation manually
r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2))

# Asserting if the results from np.std() and manual calculation are close
assert np.allclose(r1, r2)

# Displaying the calculated standard deviation of the array 'x'
print("\nstd: ", r1)

# Calculating the variance of the array 'x' using np.var()
r1 = np.var(x)

# Calculating the variance manually
r2 = np.mean((x - np.mean(x)) ** 2)

# Asserting if the results from np.var() and manual calculation are close
assert np.allclose(r1, r2)

# Displaying the calculated variance of the array 'x'
print("\nvariance: ", r1) 

Sample Output:

Original array:
[0 1 2 3 4 5]

Mean:  2.5

std:  1.707825127659933

variance:  2.9166666666666665
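
By default, np.std() and np.var() divide by N (ddof=0), so they return population statistics. If the array is treated as a sample, passing ddof=1 divides by N - 1 instead. A minimal sketch on the same array, with the variable names chosen here for illustration:

```python
import numpy as np

x = np.arange(6)

# Default ddof=0: divide by N (population variance and standard deviation)
pop_var = np.var(x)
pop_std = np.std(x)

# ddof=1: divide by N - 1 (sample variance and standard deviation)
sample_var = np.var(x, ddof=1)
sample_std = np.std(x, ddof=1)

print("Population variance:", pop_var)
print("Sample variance:    ", sample_var)
print("Population std:     ", pop_std)
print("Sample std:         ", sample_std)
```

The sample values are always at least as large as the population values because the same sum of squared deviations is divided by a smaller number.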

Explanation:

In the above code –

  • x = np.arange(6): This line creates a NumPy array x containing the numbers from 0 to 5.
  • r1 = np.mean(x): This line calculates the mean of the numbers in x.
  • r2 = np.average(x): This line calculates the weighted average of the numbers in x. Because no weights argument is passed, every number gets an equal weight, so np.average(x) is equivalent to np.mean(x).
  • assert np.allclose(r1, r2): This assertion tests whether r1 and r2 are equal within the default tolerances of np.allclose() (rtol=1e-05, atol=1e-08). If they are not, an AssertionError is raised.
  • r1 = np.std(x): This line calculates the standard deviation of the numbers in x. With the default ddof=0, this is the population standard deviation.
  • r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2)): This line calculates the same standard deviation manually using the formula sqrt(mean((x - mean(x))**2)): the mean of the squared differences from the mean is calculated first, and then the square root is taken.
  • assert np.allclose(r1, r2): This assertion tests whether np.std() and the manual calculation agree within tolerance. If they do not, an AssertionError is raised.
  • r1 = np.var(x): This line calculates the variance of the numbers in x. With the default ddof=0, this is the population variance.
  • r2 = np.mean((x - np.mean(x)) ** 2): This line calculates the same variance manually using the formula mean((x - mean(x))**2), that is, the mean of the squared differences from the mean.
  • assert np.allclose(r1, r2): This assertion tests whether np.var() and the manual calculation agree within tolerance. If they do not, an AssertionError is raised.
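
The sample solution works on a one-dimensional array, so no axis is involved. To compute the statistics along the second axis, as the exercise title asks, the same functions accept axis=1 on a two-dimensional array. The following is a minimal sketch; the 2 x 3 input and the variable names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 2-D input: 2 rows, 3 columns -> [[0 1 2], [3 4 5]]
x = np.arange(6).reshape(2, 3)

# axis=1 reduces along the second axis, giving one result per row
row_mean = np.mean(x, axis=1)
row_std = np.std(x, axis=1)
row_var = np.var(x, axis=1)

# Manual check: keepdims=True keeps the row means as a column of shape (2, 1)
# so they broadcast against x when subtracting
manual_var = np.mean((x - np.mean(x, axis=1, keepdims=True)) ** 2, axis=1)
assert np.allclose(row_var, manual_var)
assert np.allclose(row_std, np.sqrt(manual_var))

print("Mean along axis 1:    ", row_mean)
print("Std along axis 1:     ", row_std)
print("Variance along axis 1:", row_var)
```

Each result has shape (2,), one value per row. Both rows contain consecutive integers, so they share the same standard deviation and variance and differ only in their means.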
