
## SOCIAL STATISTICS

Renáta Németh, Dávid Simon

ELTE


## The shape of the distribution

A distribution of an interval-ratio variable can be either symmetrical or skewed, depending on whether there are a few outliers at one end of the distribution.

Terminology: ends of distributions are called tails; right tail / left tail.

A distribution is symmetrical if the frequencies at the right and left tails of the distribution are identical.

A hypothetical example:

In a symmetrical unimodal distribution, the mode, the mean and the median are identical.

A hypothetical example:

The median and the mean are identical in this case as well.
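The claims about symmetrical distributions can be checked on small made-up data sets. The sketch below uses Python's standard `statistics` module; both data sets are hypothetical illustrations and are not the examples referred to in the text. The second one, a symmetrical bimodal case, is an assumption about what a symmetrical non-unimodal distribution could look like.

```python
from statistics import mean, median, multimode

# Hypothetical symmetrical, unimodal data: values mirror around 5.
unimodal = [3, 4, 4, 5, 5, 5, 6, 6, 7]

# Hypothetical symmetrical, bimodal data: values also mirror around 5.
bimodal = [3, 3, 3, 4, 5, 6, 7, 7, 7]

for name, data in [("unimodal", unimodal), ("bimodal", bimodal)]:
    # multimode returns every most-frequent value, so it can report two modes.
    print(name, "mean:", mean(data), "median:", median(data), "mode(s):", multimode(data))
```

In the unimodal data set all three measures coincide at 5. In the bimodal one the mean and the median still coincide at 5, but the modes sit at 3 and 7.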

In skewed distributions, there are a few outliers on one side of the distribution. A negatively skewed distribution has a few extremely low values, so its left tail is longer than its right tail. Conversely, a distribution with a few extremely high values is called positively skewed, and its right tail is longer than its left tail.

The mean is sensitive to outliers, while the median is not. In a negatively skewed distribution, the mean is therefore pulled down, below the median; in a positively skewed distribution, it is pulled up, above the median. For example, income data are practically always positively skewed.

A simple rule to identify the shape of the distribution:

• If the mean is greater than the median, the distribution is positively skewed.

• If the mean is lower than the median, the distribution is negatively skewed.
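The rule above can be sketched in a few lines of Python. The function name `skew_direction`, its return labels and the two sample data sets are hypothetical and chosen only for illustration. Note that the rule is a heuristic: equal mean and median do not by themselves prove that a distribution is symmetrical.

```python
from statistics import mean, median


def skew_direction(values):
    """Apply the mean-versus-median rule of thumb to a list of numbers."""
    diff = mean(values) - median(values)
    if diff > 0:
        return "positively skewed"
    if diff < 0:
        return "negatively skewed"
    # The rule gives no indication of skew when mean and median coincide.
    return "no skew indicated"


# Hypothetical incomes: one very high value pulls the mean up.
incomes = [20, 22, 25, 27, 30, 250]

# Hypothetical test scores: one very low value pulls the mean down.
scores = [10, 80, 85, 90, 95]

print(skew_direction(incomes))
print(skew_direction(scores))
```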

Example: mean and median hours worked weekly (in hours, sorted by the mean):

| Country | Mean | Median |
|---|---|---|
| NL-Netherl | 35.30 | 36.00 |
| CA-Canada | 37.27 | 40.00 |
| IE-Ireland | 37.40 | 39.00 |
| GB-Great B | 37.47 | 39.00 |
| CH-Switzer | 37.82 | 42.00 |
| NZ-New Zea | 37.88 | 40.00 |
| FI-Finland | 38.23 | 38.00 |
| FR-France | 38.54 | 38.00 |
| SE-Sweden | 38.59 | 40.00 |
| DK-Denmark | 38.61 | 37.00 |
| NO-Norway | 38.62 | 40.00 |
| DE-Germany | 38.90 | 40.00 |
| HU-Hungary | 39.98 | 40.00 |
| ZA-South A | 40.52 | 40.00 |
| AU-Austral | 40.85 | 40.00 |
| VE-Venezue | 40.96 | 40.00 |
| PT-Portuga | 41.21 | 40.00 |
| ES-Spain | 41.40 | 40.00 |
| IL-Israel | 41.77 | 40.00 |
| RU-Russia | 41.82 | 40.00 |
| US-United | 42.32 | 40.00 |
| LV-Latvia | 42.36 | 40.00 |
| SI-Sloveni | 42.75 | 40.00 |
| UY-Uruguay | 42.80 | 44.00 |
| HR-Croatia | 43.50 | 40.00 |
| PL-Poland | 44.05 | 40.00 |
| CL-Chile | 44.24 | 45.00 |
| JP-Japan | 44.51 | 45.00 |
| CZ-Czech R | 45.42 | 43.00 |
| DO-Dominic | 45.52 | 45.00 |
| PH-Philipp | 47.19 | 48.00 |
| KR-South K | 48.71 | 48.00 |
| TW-Taiwan | 49.49 | 48.00 |

In which countries is the mean much greater than the median?

In which countries is the median much greater than the mean?

What do these imply regarding the shape of the distributions?

What do these imply regarding working conditions in the particular country? Compare, for example, the USA and Switzerland.
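One possible way to approach the questions above is to compute the gap between mean and median for each country and sort by it. The sketch below copies the figures from the table, keeping the country labels as printed there. The variable names and the choice to list the three largest gaps in each direction are illustrative assumptions.

```python
# (label, mean, median) rows copied from the table; labels are kept as printed.
rows = [
    ("NL-Netherl", 35.30, 36.00), ("CA-Canada", 37.27, 40.00),
    ("IE-Ireland", 37.40, 39.00), ("GB-Great B", 37.47, 39.00),
    ("CH-Switzer", 37.82, 42.00), ("NZ-New Zea", 37.88, 40.00),
    ("FI-Finland", 38.23, 38.00), ("FR-France", 38.54, 38.00),
    ("SE-Sweden", 38.59, 40.00), ("DK-Denmark", 38.61, 37.00),
    ("NO-Norway", 38.62, 40.00), ("DE-Germany", 38.90, 40.00),
    ("HU-Hungary", 39.98, 40.00), ("ZA-South A", 40.52, 40.00),
    ("AU-Austral", 40.85, 40.00), ("VE-Venezue", 40.96, 40.00),
    ("PT-Portuga", 41.21, 40.00), ("ES-Spain", 41.40, 40.00),
    ("IL-Israel", 41.77, 40.00), ("RU-Russia", 41.82, 40.00),
    ("US-United", 42.32, 40.00), ("LV-Latvia", 42.36, 40.00),
    ("SI-Sloveni", 42.75, 40.00), ("UY-Uruguay", 42.80, 44.00),
    ("HR-Croatia", 43.50, 40.00), ("PL-Poland", 44.05, 40.00),
    ("CL-Chile", 44.24, 45.00), ("JP-Japan", 44.51, 45.00),
    ("CZ-Czech R", 45.42, 43.00), ("DO-Dominic", 45.52, 45.00),
    ("PH-Philipp", 47.19, 48.00), ("KR-South K", 48.71, 48.00),
    ("TW-Taiwan", 49.49, 48.00),
]

# Gap = mean - median. By the rule of thumb, a positive gap suggests positive
# skew and a negative gap suggests negative skew.
gaps = sorted((mean - median, label) for label, mean, median in rows)

print("Median furthest above the mean:")
for gap, label in gaps[:3]:
    print(f"  {label:<12}{gap:+.2f}")

print("Mean furthest above the median:")
for gap, label in reversed(gaps[-3:]):
    print(f"  {label:<12}{gap:+.2f}")
```

The sign of each gap only indicates the likely direction of skew; interpreting what it says about working conditions still requires looking at the underlying distribution of hours.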