SOCIAL STATISTICS

Renáta Németh, Dávid Simon

ELTE

The shape of the distribution

The distribution of an interval-ratio variable is either symmetrical or skewed. It is skewed when a few outliers lie at one end of the distribution.

Terminology: the ends of a distribution are called tails (the right tail and the left tail).

A distribution is symmetrical if the frequencies at the right and left tails of the distribution are identical.

A hypothetical example:

In the case of symmetrical unimodal distributions, the mode, the mean and the median are identical.
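As a quick check of this statement, here is a minimal sketch in Python. The data are hypothetical and were chosen only because their frequencies mirror each other around the value 3; they do not come from the course material.

```python
from statistics import mean, median, mode

# Hypothetical symmetrical, unimodal data:
# the frequencies 1, 2, 3, 2, 1 mirror each other around the value 3.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5]

center_mean = mean(values)
center_median = median(values)
center_mode = mode(values)

# All three measures of central tendency coincide for these data.
print(center_mean, center_median, center_mode)
```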

What about symmetrical bimodal distributions?

A hypothetical example:

The median and the mean are identical in this case as well.
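The same kind of check can be sketched for a symmetrical bimodal case. The data below are again hypothetical, with two equally frequent peaks at 1 and 5; `multimode` from Python's standard `statistics` module (Python 3.8 or later) lists every most frequent value.

```python
from statistics import mean, median, multimode

# Hypothetical symmetrical, bimodal data: two equal peaks at 1 and 5.
values = [1, 1, 1, 2, 3, 4, 5, 5, 5]

center_mean = mean(values)
center_median = median(values)
modes = multimode(values)  # every value sharing the highest frequency

# The mean and the median still coincide at the center,
# while the two modes sit on either side of it.
print(center_mean, center_median, modes)
```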

In skewed distributions, there are a few outliers on one side of the distribution. A negatively skewed distribution has a few extremely low values, so its left tail is longer than its right tail. Conversely, a distribution with a few extremely high values is called positively skewed, and its right tail is longer than its left tail.

In a negatively skewed distribution, the outliers pull the mean down, below the median; in a positively skewed distribution, they pull it up, above the median. For example, income data are practically always positively skewed.
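The pull that an outlier exerts on the mean can be illustrated with a small sketch. The incomes below are hypothetical (arbitrary units), not real data; the point is only to compare how the mean and the median react when one extremely high value is added.

```python
from statistics import mean, median

# Hypothetical incomes in arbitrary units, without any outlier.
incomes = [20, 22, 25, 27, 30]
mean_before = mean(incomes)
median_before = median(incomes)

# Add a single extremely high income (a right-tail outlier).
incomes_with_outlier = incomes + [500]
mean_after = mean(incomes_with_outlier)
median_after = median(incomes_with_outlier)

# The mean jumps upward, while the median barely moves.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```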

A simple rule to identify the shape of the distribution:

  • If the mean is greater than the median, the distribution is positively skewed.

  • If the mean is lower than the median, the distribution is negatively skewed.
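The rule above can be written as a short function. This is only a sketch: the function name and the wording of the returned labels are assumptions made for illustration, and the rule itself is a rule of thumb rather than a formal test of skewness.

```python
def shape_from_mean_and_median(mean_value, median_value):
    """Apply the mean-versus-median rule of thumb (hypothetical helper)."""
    if mean_value > median_value:
        return "positively skewed"
    if mean_value < median_value:
        return "negatively skewed"
    # Equal mean and median: the rule gives no sign of skew.
    return "no skew indicated"


# Hypothetical summary statistics, for illustration only.
print(shape_from_mean_and_median(30.0, 25.0))
print(shape_from_mean_and_median(25.0, 30.0))
print(shape_from_mean_and_median(25.0, 25.0))
```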

Example: mean and median weekly hours worked, by country (sorted by the mean):

Country       Mean    Median

NL-Netherl    35.30   36.00
CA-Canada     37.27   40.00
IE-Ireland    37.40   39.00
GB-Great B    37.47   39.00
CH-Switzer    37.82   42.00
NZ-New Zea    37.88   40.00
FI-Finland    38.23   38.00
FR-France     38.54   38.00
SE-Sweden     38.59   40.00
DK-Denmark    38.61   37.00
NO-Norway     38.62   40.00
DE-Germany    38.90   40.00
HU-Hungary    39.98   40.00
ZA-South A    40.52   40.00
AU-Austral    40.85   40.00
VE-Venezue    40.96   40.00
PT-Portuga    41.21   40.00
ES-Spain      41.40   40.00
IL-Israel     41.77   40.00
RU-Russia     41.82   40.00
US-United     42.32   40.00
LV-Latvia     42.36   40.00
SI-Sloveni    42.75   40.00
UY-Uruguay    42.80   44.00
HR-Croatia    43.50   40.00
PL-Poland     44.05   40.00
CL-Chile      44.24   45.00
JP-Japan      44.51   45.00
CZ-Czech R    45.42   43.00
DO-Dominic    45.52   45.00
PH-Philipp    47.19   48.00
KR-South K    48.71   48.00
TW-Taiwan     49.49   48.00

In which countries is the mean much greater than the median?

In which countries is the median much greater?

What do these imply regarding the shape of the distributions?

What do these imply regarding the particular country’s working conditions? Compare e.g. the USA and Switzerland.
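One way to start on these questions is to compute the gap between mean and median for each country. The sketch below does this for a handful of rows copied from the table (country labels kept exactly as they appear there); the variable names are assumptions made for illustration, and the sign of the gap is only the rule-of-thumb indicator described earlier, not a full description of the distribution.

```python
# (mean, median) weekly hours worked, copied from a few rows of the table.
hours = {
    "NL-Netherl": (35.30, 36.00),
    "CA-Canada": (37.27, 40.00),
    "CH-Switzer": (37.82, 42.00),
    "HU-Hungary": (39.98, 40.00),
    "US-United": (42.32, 40.00),
    "HR-Croatia": (43.50, 40.00),
    "PL-Poland": (44.05, 40.00),
}

# Gap between mean and median; its sign is what the rule of thumb looks at.
gaps = {
    country: round(mean_value - median_value, 2)
    for country, (mean_value, median_value) in hours.items()
}

# List the countries from the most negative gap to the most positive one.
ranked = sorted(gaps.items(), key=lambda item: item[1])
for country, gap in ranked:
    if gap > 0:
        label = "mean > median"
    elif gap < 0:
        label = "mean < median"
    else:
        label = "mean = median"
    print(f"{country:<12}{gap:+6.2f}  {label}")
```

A gap close to zero, as in the HU-Hungary row, gives little indication of skew either way.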