NumPy: Compute the mean, standard deviation, and variance of a given array along the second axis
7. Mean, Standard Deviation, and Variance Along Second Axis
Write a NumPy program to compute the mean, standard deviation, and variance of a given array along the second axis.
From Wikipedia: There are several kinds of means in various branches of mathematics (especially statistics).
For a data set, the arithmetic mean, also called the mathematical expectation or average, is the central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values. The arithmetic mean of a set of numbers x1, x2, ..., xn is typically denoted by x̄, pronounced "x bar". If the data set is based on a series of observations obtained by sampling from a statistical population, the arithmetic mean is termed the sample mean (denoted x̄) to distinguish it from the mean of the underlying distribution.
In probability and statistics, the population mean, or expected value, is a measure of the central tendency either of a probability distribution or of the random variable characterized by that distribution. In the case of a discrete probability distribution of a random variable X, the mean is equal to the sum over every possible value weighted by the probability of that value; that is, it is computed by taking the product of each possible value x of X and its probability p(x), and then adding all these products together, giving μ = ∑ x p(x). An analogous formula applies to the case of a continuous probability distribution. Not every probability distribution has a defined mean; see the Cauchy distribution for an example. Moreover, for some distributions the mean is infinite.
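As a small illustration of the two definitions above, the sketch below computes the arithmetic mean of a data set and the expected value μ = ∑ x p(x) of a discrete distribution. The fair six-sided die and the variable names are assumptions chosen for this example and are not part of the original exercise.

```python
import numpy as np

# Arithmetic mean: sum of the values divided by the number of values.
data = np.array([0, 1, 2, 3, 4, 5])
arithmetic_mean = data.sum() / data.size

# Expected value of a discrete random variable: sum of x * p(x).
# Assumed example: a fair six-sided die, each face with probability 1/6.
faces = np.arange(1, 7)
probs = np.full(6, 1 / 6)
expected_value = np.sum(faces * probs)

print("Arithmetic mean:", arithmetic_mean)
print("Expected value:", expected_value)
```

For the die, the expected value is 3.5 up to floating-point rounding, a value the die can never actually show.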
Sample Solution:
Python Code:
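The listing is missing from this copy of the page. The sketch below is reconstructed from the line-by-line explanation in the Explanation section; the print calls and their labels are assumptions inferred from the sample output, not the original listing.

```python
import numpy as np

x = np.arange(6)
print("Original array:", x)

# Mean, checked against np.average (equal weights by default).
r1 = np.mean(x)
r2 = np.average(x)
assert np.allclose(r1, r2)
print("Mean:", r1)

# Standard deviation, checked against its definition.
r1 = np.std(x)
r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2))
assert np.allclose(r1, r2)
print("std:", r1)

# Variance, checked against its definition.
r1 = np.var(x)
r2 = np.mean((x - np.mean(x)) ** 2)
assert np.allclose(r1, r2)
print("variance:", r1)
```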
Sample Output:
Original array: [0 1 2 3 4 5]
Mean: 2.5
std: 1.707825127659933
variance: 2.9166666666666665
Explanation:
In the above code –
- x = np.arange(6): This line creates a NumPy array x containing the numbers from 0 to 5.
- r1 = np.mean(x): This line calculates the mean of the numbers in x.
- r2 = np.average(x): This line calculates the average of the numbers in x. np.average accepts an optional weights argument; with no weights given, every number has equal weight, so np.average(x) is equivalent to np.mean(x).
- assert np.allclose(r1, r2): This assertion checks that r1 and r2 are equal within a tolerance (by default rtol=1e-05 and atol=1e-08). If they are not, the assert statement raises an AssertionError.
- r1 = np.std(x): This line calculates the standard deviation of the numbers in x. With the default ddof=0 this is the population standard deviation, which divides by N rather than N - 1.
- r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2 )): This line calculates the same standard deviation from its definition, sqrt(mean((x - mean(x))**2)): the mean of the squared differences from the mean is calculated first and then the square root is taken.
- assert np.allclose(r1, r2): This assertion checks that the two standard deviation values agree within the tolerance of np.allclose. If they do not, it raises an AssertionError.
- r1 = np.var(x): This line calculates the variance of the numbers in x. With the default ddof=0 this is the population variance, which divides by N rather than N - 1.
- r2 = np.mean((x - np.mean(x)) ** 2 ): This line calculates the same variance from its definition, mean((x - mean(x))**2): the mean of the squared differences from the mean.
- assert np.allclose(r1, r2): This assertion checks that the two variance values agree within the tolerance of np.allclose. If they do not, it raises an AssertionError.
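The walkthrough above uses a one-dimensional array, so no axis is involved. To compute the statistics along the second axis, as the exercise title asks, pass axis=1 to each function. A minimal sketch, assuming a 2x3 array chosen purely for illustration:

```python
import numpy as np

# Assumed example input: 2 rows, 3 columns -> [[0 1 2], [3 4 5]].
a = np.arange(6).reshape(2, 3)

# axis=1 reduces across the columns, giving one value per row.
row_mean = np.mean(a, axis=1)
row_std = np.std(a, axis=1)
row_var = np.var(a, axis=1)

# Check against the definitions. keepdims=True keeps the row means as
# a column so that they broadcast against each row of a.
centered = a - a.mean(axis=1, keepdims=True)
assert np.allclose(row_var, np.mean(centered ** 2, axis=1))
assert np.allclose(row_std, np.sqrt(row_var))

print("Mean:", row_mean)
print("std:", row_std)
print("variance:", row_var)
```

Each result has shape (2,), one entry per row. Using axis=0 instead would reduce down the rows and return one entry per column.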
For more practice, solve these related problems:
- Write a function that computes the mean, standard deviation, and variance for each row of a 2D array using np.mean, np.std, and np.var along axis 1.
- Create a program that returns these statistical measures as a tuple of arrays and verifies them against manual calculations.
- Implement a solution that normalizes each row by subtracting the mean and dividing by the standard deviation, then computes the variance of the normalized rows.
- Develop a method that applies these statistics to a subset of columns and compares the results with full-row statistics.
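As a starting point for the row-normalization problem in the list above, here is one possible sketch. The helper name normalize_rows and the sample data are hypothetical, and the code assumes that no row is constant, since a constant row has zero standard deviation.

```python
import numpy as np

def normalize_rows(a):
    """Subtract each row's mean and divide by its standard deviation."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=1, keepdims=True)
    std = a.std(axis=1, keepdims=True)
    # Assumption: no row is constant, so std is never zero.
    return (a - mean) / std

# Assumed sample data with rows on very different scales.
a = np.array([[1.0, 2.0, 4.0], [10.0, 20.0, 60.0]])
z = normalize_rows(a)

# After normalization each row has mean close to 0 and variance close to 1.
print("Row means:", z.mean(axis=1))
print("Row variances:", np.var(z, axis=1))
```

Because np.std and np.var both default to ddof=0, the normalized rows have variance 1 under the same convention; mixing ddof values between the two steps would not give exactly 1.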