SOCIAL STATISTICS

Renáta Németh, Dávid Simon

ELTE

Median

The median is used with variables measured at least at the ordinal level. It represents the middle of the distribution:

  • half the cases are above the median,

  • half the cases are below it.

For example, according to Hungarian ISSP data from 1992:

  • the answers to the question “How much do you think a cabinet minister in the national government earns?” have a median of 116,000 Ft,

  • the answers to the question “How much do you think a cabinet minister in the national government should earn?” have a median of 80,000 Ft.

Determine the level of measurement for the two variables whose median we identified.

Finding the median in sorted data (few observations)

An odd number of observations, high level of measurement:

  1. Sort the observations according to the variable.

  2. Find the middle observation; the value associated with it is the median.

Example

Suicide rate by region (Társadalmi helyzetkép 2002, Central Bureau of Statistics)

The suicide rate is defined here as the number of suicides per 100,000 inhabitants.

Region                  1990
Western-Transdanubia    26.1
Southern-Transdanubia   34.5
Central-Hungary         35.6
Northern-Hungary        37.4
Central-Transdanubia    37.8
Northern-Great Plain    51.2
Southern-Great Plain    53.1

What is the unit of analysis of the variable?

What is the possible range of the variable?

Determine the level of measurement for suicide rate.

Identify the median.

The table below shows the data for 2001. How did the median change?

Region                  2001
Western-Transdanubia    19.9
Southern-Transdanubia   24.2
Central-Hungary         24.7
Northern-Hungary        27.5
Central-Transdanubia    27.9
Northern-Great Plain    37.0
Southern-Great Plain    41.5

Odd number of observations, ordinal variable:

Example: The sample consists of 5 respondents; the median category is “Neither satisfied, nor dissatisfied”.

Question: Are you satisfied with your GP?

Answer                                 Respondent
Very satisfied                         János
Very satisfied                         Júlia
Neither satisfied, nor dissatisfied    Péter
Dissatisfied                           Mária
Very dissatisfied                      József

(Note that the median is always an answer category, not the corresponding observation (here: Péter)!)
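The procedure for an ordinal variable with an odd number of observations can be sketched in Python as follows. This is only an illustration: the names `SCALE` and `ordinal_median` are made up for this sketch, and the category “Satisfied” is an assumption about the full answer scale (no respondent in the example chose it).

```python
# Answer categories ordered from most to least satisfied.
# ASSUMPTION: "Satisfied" is part of the scale, although nobody chose it.
SCALE = [
    "Very satisfied",
    "Satisfied",
    "Neither satisfied, nor dissatisfied",
    "Dissatisfied",
    "Very dissatisfied",
]


def ordinal_median(answers, scale):
    """Return the median category for an odd number of ordinal answers."""
    if len(answers) % 2 == 0:
        raise ValueError("this sketch only handles an odd number of observations")
    # Map each category to its position on the scale so answers can be sorted.
    rank = {category: position for position, category in enumerate(scale)}
    ordered = sorted(answers, key=rank.__getitem__)
    # The middle observation carries the median category.
    return ordered[len(ordered) // 2]


# Data from the example above (respondent -> answer).
answers = {
    "János": "Very satisfied",
    "Júlia": "Very satisfied",
    "Péter": "Neither satisfied, nor dissatisfied",
    "Mária": "Dissatisfied",
    "József": "Very dissatisfied",
}

median_category = ordinal_median(list(answers.values()), SCALE)
print(median_category)
```

The function returns a category, not a respondent, which mirrors the note above.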

Small, even number of observations:

If the variable is measured at a high level, the median can be defined as the mean of the values associated with the two middle observations.

Returning to our previous example on the suicide rate and omitting Southern-Great Plain, the median in 1990 is

(35.6 + 37.4) / 2 = 36.5,

while in 2001 it is (24.7 + 27.5) / 2 = 26.1.
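The same calculation can be checked with a short Python sketch. It relies on `statistics.median` from the standard library, which averages the two middle values when the number of observations is even; the helper name `median_without` is made up for this illustration.

```python
from statistics import median

# Suicides per 100,000 inhabitants, taken from the two tables above.
rates_1990 = {
    "Western-Transdanubia": 26.1,
    "Southern-Transdanubia": 34.5,
    "Central-Hungary": 35.6,
    "Northern-Hungary": 37.4,
    "Central-Transdanubia": 37.8,
    "Northern-Great Plain": 51.2,
    "Southern-Great Plain": 53.1,
}
rates_2001 = {
    "Western-Transdanubia": 19.9,
    "Southern-Transdanubia": 24.2,
    "Central-Hungary": 24.7,
    "Northern-Hungary": 27.5,
    "Central-Transdanubia": 27.9,
    "Northern-Great Plain": 37.0,
    "Southern-Great Plain": 41.5,
}


def median_without(rates, omitted):
    """Median of the rates after dropping one region (leaves an even count here)."""
    values = [rate for region, rate in rates.items() if region != omitted]
    return median(values)


m1990 = median_without(rates_1990, "Southern-Great Plain")
m2001 = median_without(rates_2001, "Southern-Great Plain")
print(round(m1990, 1), round(m2001, 1))
```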

The mean is not appropriate for ordinal variables, so the two middle categories cannot be averaged:

Question: Are you satisfied with your GP?

Answer                                 Respondent
Very satisfied                         János
Very satisfied                         Júlia
Neither satisfied, nor dissatisfied    Péter
Dissatisfied                           István
Very dissatisfied                      Mária
Very dissatisfied                      József

Finding the median in a frequency distribution (large number of observations)

  • We have to find the observation located at the middle of the distribution.

  • To do so, we construct a cumulative percentage distribution (see page 61).

  • The observation located at the middle of the distribution is the one with a cumulative percentage of exactly 50%.

  • If no observation has a cumulative percentage of exactly 50%, then (following our rule of thumb) choose the lowest category whose cumulative percentage is greater than 50%.
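The steps above can be sketched in Python. The function name `median_from_frequencies` is made up for this illustration, and the data are the five answers from the earlier GP satisfaction example, written as a frequency table so that the result can be checked by hand.

```python
def median_from_frequencies(frequencies):
    """Return the median category of a frequency distribution.

    `frequencies` is a list of (category, count) pairs that is already
    sorted by category (lowest to highest, or the reverse).
    """
    total = sum(count for _, count in frequencies)
    if total == 0:
        raise ValueError("the frequency distribution has no observations")
    running = 0
    for category, count in frequencies:
        running += count
        cumulative_pct = 100 * running / total
        # First category whose cumulative percentage reaches or passes 50%.
        if cumulative_pct >= 50:
            return category


# GP satisfaction answers of the five respondents, as (category, count).
gp_satisfaction = [
    ("Very satisfied", 2),
    ("Neither satisfied, nor dissatisfied", 1),
    ("Dissatisfied", 1),
    ("Very dissatisfied", 1),
]

gp_median = median_from_frequencies(gp_satisfaction)
print(gp_median)
```

Here the cumulative percentages are 40%, 60%, 80% and 100%, so the second category is the first one to pass 50%.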

Example: In Japan (ISSP, 2006) the median number of hours worked weekly is 45: the cumulative percentage is 45.0% at 44 hours and first exceeds 50% (51.6%) at 45 hours.

Hours worked weekly   Frequency   Percentage   Cumulative percentage
2.0                   1           0.1          0.1
3.0                   2           0.3          0.4
4.0                   3           0.4          0.9
5.0                   3           0.4          1.3
6.0                   4           0.6          1.8
7.0                   2           0.3          2.1
8.0                   6           0.9          3.0
9.0                   10          1.4          4.4
10.0                  5           0.7          5.1
11.0                  1           0.1          5.2
12.0                  9           1.3          6.5
13.0                  2           0.3          6.8
15.0                  5           0.7          7.5
16.0                  5           0.7          8.2
17.0                  2           0.3          8.5
18.0                  7           1.0          9.5
19.0                  2           0.3          9.8
20.0                  21          3.0          12.8
21.0                  3           0.4          13.2
22.0                  2           0.3          13.5
23.0                  2           0.3          13.8
24.0                  4           0.6          14.3
25.0                  12          1.7          16.0
26.0                  1           0.1          16.2
27.0                  1           0.1          16.3
28.0                  3           0.4          16.7
29.0                  1           0.1          16.9
30.0                  27          3.8          20.7
31.0                  2           0.3          21.0
32.0                  3           0.4          21.4
33.0                  2           0.3          21.7
34.0                  1           0.1          21.8
35.0                  17          2.4          24.3
36.0                  6           0.9          25.1
37.0                  3           0.4          25.5
38.0                  5           0.7          26.2
39.0                  1           0.1          26.4
40.0                  100         14.2         40.6
41.0                  2           0.3          40.9
42.0                  19          2.7          43.5
43.0                  7           1.0          44.5
44.0                  3           0.4          45.0
45.0                  47          6.7          51.6
46.0                  5           0.7          52.3
47.0                  2           0.3          52.6
48.0                  46          6.5          59.1
50.0                  95          13.5         72.6
51.0                  4           0.6          73.2
52.0                  4           0.6          73.8
54.0                  6           0.9          74.6
55.0                  25          3.5          78.2
56.0                  11          1.6          79.7
57.0                  3           0.4          80.1
58.0                  2           0.3          80.4
59.0                  1           0.1          80.6
60.0                  60          8.5          89.1
61.0                  1           0.1          89.2
62.0                  2           0.3          89.5
63.0                  2           0.3          89.8
65.0                  8           1.1          90.9
66.0                  4           0.6          91.5
67.0                  1           0.1          91.6
68.0                  1           0.1          91.8
70.0                  16          2.3          94.0
72.0                  7           1.0          95.0
75.0                  4           0.6          95.6
76.0                  1           0.1          95.7
78.0                  2           0.3          96.0
80.0                  8           1.1          97.2
84.0                  2           0.3          97.4
85.0                  2           0.3          97.7
90.0                  2           0.3          98.0
91.0                  1           0.1          98.2
95.0                  1           0.1          98.3
96 or more            12          1.7          100.0
Total                 705         100.0

Finding the median of ordinal variables goes the same way. (Remember to sort the categories!)

Example: ISSP 2006, USA. “On the whole, do you think it should or should not be the government’s responsibility to…”

                           …provide a job for everyone     …provide health care
                           who wants one?                  for the sick?
                           Freq.    %        Cum. %        Freq.    %        Cum. %
Definitely should be       239      15.9     15.9          850      56.4     56.4
Probably should be         356      23.7     39.6          502      33.3     89.8
Probably should not be     521      34.6     74.2          116      7.7      97.5
Definitely should not be   388      25.8     100.0         38       2.5      100.0
Total                      1504     100.0                  1506     100.0

Find the median of both variables and interpret the difference between them.

An application: detecting a trend

USA, General Social Survey, 1991 and 1994, Government spending on the military.

               1991                      1994
               %        Cumulative %     %        Cumulative %
Too low        14.5     14.5             16.5     16.5
About right    57.6     72.1             49.3     65.8
Too much       27.9     100.0            34.2     100.0
Total          100.0                     100.0

The median was “About right” in both years. According to the median, public opinion did not change between the two years.

Note that the unchanged median masks the fact that the percentage of “Too much” rose by nearly a quarter (from 27.9% to 34.2%).
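A short Python sketch makes the comparison explicit. It works directly on the published percentages; the names `median_category` and `relative_change` are made up for this illustration.

```python
from itertools import accumulate

# Categories in sorted order, with the percentages from the table above.
CATEGORIES = ["Too low", "About right", "Too much"]
pct_1991 = [14.5, 57.6, 27.9]
pct_1994 = [16.5, 49.3, 34.2]


def median_category(categories, percentages):
    """First category whose cumulative percentage reaches or passes 50%."""
    for category, cumulative in zip(categories, accumulate(percentages)):
        if cumulative >= 50:
            return category
    raise ValueError("percentages do not reach 50%")


median_1991 = median_category(CATEGORIES, pct_1991)
median_1994 = median_category(CATEGORIES, pct_1994)

# Relative growth of the "Too much" share between the two surveys.
relative_change = (pct_1994[2] - pct_1991[2]) / pct_1991[2]

print(median_1991, "|", median_1994)
print(f"{relative_change:.1%}")  # 22.6%
```

The median category is the same in both years, while the share of “Too much” grows by 6.3 percentage points, so the median alone would hide the shift.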