## Z-score standardization in statistics.

A standard score or a z-score is used for standardizing scores on the same scale by dividing a score's deviation by the standard deviation std in a dataset. The result is a standard score thet measures the number of standard deviations that a given data point is from the data center - the mean.

Standardized data set has mean 0 and standard deviation 1, and retains the shape properties of the original data set (same skewness and kurtosis).

## Creating data:

```
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
data = np.random.poisson(3,1000)**2
## compute the mean and std
datamean = np.mean(data)
datastd = np.std(data,ddof=1)
# the previous two lines are equivalent to the following two lines
#datamean = data.mean()
#datastd = data.std(ddof=1)
plt.plot(data,'s',markersize=3)
plt.xlabel('Data index')
plt.ylabel('Data value')
plt.title(f'Mean = {np.round(datamean,2)}; std = {np.round(datastd,2)}')
plt.show()
```

## Z-scoring and visualisation:

```
# z-score is data minus mean divided by stdev
dataz = (data-datamean) / datastd
# can also use Python function
dataz = stats.zscore(data)
# compute the mean and std
dataZmean = np.mean(dataz)
dataZstd = np.std(dataz,ddof=1)
plt.plot(dataz,'s',markersize=3)
plt.xlabel('Data index')
plt.ylabel('Data value')
plt.title(f'Mean = {np.round(dataZmean,2)}; std = {np.round(dataZstd,2)}')
plt.show()
```