NumPy: Compute the mean, standard deviation, and variance of a given array along the second axis

Last update on May 05 2025 11:15:50 (UTC/GMT +8 hours)

7. Mean, Standard Deviation, and Variance Along Second Axis

Write a NumPy program to compute the mean, standard deviation, and variance of a given array along the second axis.

From Wikipedia: There are several kinds of means in various branches of mathematics (especially statistics).
For a data set, the arithmetic mean, also called the mathematical expectation or average, is the central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values. The arithmetic mean of a set of numbers $x_{1}, x_{2}, . . . . ., x_{n}$ is typically denoted by $\bar{x}$ , pronounced " $x$ bar". If the data set were based on a series of observations obtained by sampling from a statistical population, the arithmetic mean is the sample mean (denoted $\bar{x}$ ) to distinguish it from the mean of the underlying distribution.
In probability and statistics, the population mean, or expected value, are a measure of the central tendency either of a probability distribution or of the random variable characterized by that distribution. In the case of a discrete probability distribution of a random variable $X$ , the mean is equal to the sum over every possible value weighted by the probability of that value; that is, it is computed by taking the product of each possible value $x$ of $X$ and its probability $p (x)$ , and then adding all these products together, giving $μ = \sum_{} x p (x)$ . An analogous formula applies to the case of a continuous probability distribution. Not every probability distribution has a defined mean; see the Cauchy distribution for an example. Moreover, for some distributions the mean is infinite.

Sample Solution:

Python Code:

# Importing the NumPy library
import numpy as np

# Creating an array 'x' using arange with 6 elements
x = np.arange(6)

# Displaying the original array 'x'
print("\nOriginal array:")
print(x)

# Calculating the mean of the array 'x' using np.mean()
r1 = np.mean(x)

# Calculating the average of the array 'x' using np.average()
r2 = np.average(x)

# Asserting if the results from np.mean() and np.average() are close
assert np.allclose(r1, r2)

# Displaying the calculated mean of the array 'x'
print("\nMean: ", r1)

# Calculating the standard deviation of the array 'x' using np.std()
r1 = np.std(x)

# Calculating the standard deviation manually
r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2))

# Asserting if the results from np.std() and manual calculation are close
assert np.allclose(r1, r2)

# Displaying the calculated standard deviation of the array 'x'
print("\nstd: ", r1)

# Calculating the variance of the array 'x' using np.var()
r1 = np.var(x)

# Calculating the variance manually
r2 = np.mean((x - np.mean(x)) ** 2)

# Asserting if the results from np.var() and manual calculation are close
assert np.allclose(r1, r2)

# Displaying the calculated variance of the array 'x'
print("\nvariance: ", r1)

Sample Output:

Original array:
[0 1 2 3 4 5]

Mean:  2.5

std:  1

variance:  2.9166666666666665

Explanation:

In the above code –

x = np.arange(6): This line creates a NumPy array x containing the numbers from 0 to 5.
r1 = np.mean(x): This line calculates the mean of the numbers in x.
r2 = np.average(x): This line calculates the weighted average of the numbers in x, where each number has an equal weight. Since all the weights are equal, np.average(x) is equivalent to np.mean(x).
assert np.allclose(r1, r2): This assertion tests whether the values of r1 and r2 are close enough (within a certain tolerance) to be considered equal. If the assertion fails, it will raise an error.
r1 = np.std(x): This line calculates the standard deviation of the numbers in x.
r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2 )): This line calculates the standard deviation of the numbers in x using the formula sqrt(mean((x - mean(x))**2)). This is another way to calculate the standard deviation, where the mean of the squared differences from the mean is calculated first and then the square root is taken.
assert np.allclose(r1, r2): This assertion tests whether the values of r1 and r2 are close enough (within a certain tolerance) to be considered equal. If the assertion fails, it will raise an error.
r1= np.var(x): This line calculates the variance of the numbers in x.
r2 = np.mean((x - np.mean(x)) ** 2 ): This line calculates the variance of the numbers in x using the formula mean((x - mean(x))**2). This is another way to calculate the variance, where the mean of the squared differences from the mean is calculated directly.
assert np.allclose(r1, r2): This assertion tests whether the values of r1 and r2 are close enough (within a certain tolerance) to be considered equal. If the assertion fails, it will raise an error.

For more Practice: Solve these Related Problems:

Write a function that computes the mean, standard deviation, and variance for each row of a 2D array using np.mean, np.std, and np.var along axis 1.
Create a program that returns these statistical measures as a tuple of arrays and verifies them against manual calculations.
Implement a solution that normalizes each row by subtracting the mean and dividing by the standard deviation, then computes the variance of the normalized rows.
Develop a method that applies these statistics to a subset of columns and compares the results with full-row statistics.

Go to:

PREV : Weighted Average of an Array
NEXT : Covariance Matrix of Two Arrays

Python-Numpy Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.