
## SOCIAL STATISTICS

Renáta Németh, Dávid Simon

ELTE


## Median

The median is used with variables measured at least at the ordinal level. It represents the middle of the distribution:

• half the cases are above,

• half the cases are below the median.

For example, according to Hungarian ISSP data from 1992:

• the answers to the question “How much do you think a cabinet minister in the national government earns?” have a median of 116,000 Ft,

• the answers to the question “How much do you think a cabinet minister in the national government should earn?” have a median of 80,000 Ft.

Determine the level of measurement for the two variables whose median we identified.

### Finding the median in sorted data (few observations)

An odd number of observations, high level of measurement:

1. Sort the observations according to the variable.

2. Find the middle observation; the value or category associated with it is the median.

Example

Suicide rate by region (Társadalmi helyzetkép 2002, Central Bureau of Statistics)

The suicide rate is defined here as the number of suicides per 100,000 inhabitants.

| Region | Suicide rate, 1990 |
|---|---|
| Western-Transdanubia | 26.1 |
| Southern-Transdanubia | 34.5 |
| Central-Hungary | 35.6 |
| Northern-Hungary | 37.4 |
| Central-Transdanubia | 37.8 |
| Northern-Great Plain | 51.2 |
| Southern-Great Plain | 53.1 |

What is the unit of analysis of the variable?

What is the possible range of the variable?

Determine the level of measurement for suicide rate.

Identify the median.

The table below shows the data for 2001. How did the median change?

| Region | Suicide rate, 2001 |
|---|---|
| Western-Transdanubia | 19.9 |
| Southern-Transdanubia | 24.2 |
| Central-Hungary | 24.7 |
| Northern-Hungary | 27.5 |
| Central-Transdanubia | 27.9 |
| Northern-Great Plain | 37 |
| Southern-Great Plain | 41.5 |

Odd number of observations, ordinal variable:

Example: The sample consists of 5 respondents; the median category is “Neither satisfied, nor dissatisfied”.

Question: Are you satisfied with your GP?

| Answer | Respondent |
|---|---|
| Very satisfied | János |
| Very satisfied | Júlia |
| Neither satisfied, nor dissatisfied | Péter |
| Dissatisfied | Mária |
| Very dissatisfied | József |

(Note that the median is always an answer category, never the observation that holds it. Here the median is “Neither satisfied, nor dissatisfied”, not Péter.)
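The odd-count ordinal case can be sketched in Python as follows. This is only an illustration: the function name `ordinal_median_odd` and the `SCALE` list are hypothetical, and the intermediate "Satisfied" category is an assumption about the answer scale, since nobody in the sample chose it.

```python
# Answer categories in their natural order (most to least satisfied).
# ASSUMPTION: "Satisfied" is on the scale even though no respondent chose it.
SCALE = [
    "Very satisfied",
    "Satisfied",
    "Neither satisfied, nor dissatisfied",
    "Dissatisfied",
    "Very dissatisfied",
]


def ordinal_median_odd(answers, scale=SCALE):
    """Return the median category for an odd number of ordinal answers."""
    if len(answers) % 2 == 0:
        raise ValueError("this sketch only handles an odd number of observations")
    # Sort the answers by their position on the scale, then take the middle one.
    ranked = sorted(answers, key=scale.index)
    return ranked[len(ranked) // 2]


# The five respondents from the example above.
answers = {
    "János": "Very satisfied",
    "Júlia": "Very satisfied",
    "Péter": "Neither satisfied, nor dissatisfied",
    "Mária": "Dissatisfied",
    "József": "Very dissatisfied",
}

median_category = ordinal_median_odd(list(answers.values()))
print(median_category)  # Neither satisfied, nor dissatisfied
```

The function returns a category label, not a respondent, which mirrors the note above.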

Small, even number of observations:

If the variable is measured at a high level, the median can be defined as the mean of the values associated with the two middle observations.

Returning to the suicide rate example and omitting Southern-Great Plain, six regions remain. The median in 1990 is

(35.6 + 37.4)/2 = 36.5,

while in 2001 it is (24.7 + 27.5)/2 = 26.1.
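The procedure for a small number of observations measured at a high level can be sketched in Python as below. The function name `median_of_values` is hypothetical; the data are the six regional rates that remain once Southern-Great Plain is omitted.

```python
def median_of_values(values):
    """Median of a small list of numeric observations.

    Odd count: the middle value.
    Even count: the mean of the two middle values.
    """
    if not values:
        raise ValueError("no observations")
    ordered = sorted(values)          # step 1: sort the observations
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # step 2 (odd): the middle observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # step 2 (even): mean of the middle pair


# Suicide rates per 100,000 inhabitants, Southern-Great Plain omitted.
rates_1990 = [26.1, 34.5, 35.6, 37.4, 37.8, 51.2]
rates_2001 = [19.9, 24.2, 24.7, 27.5, 27.9, 37.0]

print(round(median_of_values(rates_1990), 1))  # 36.5
print(round(median_of_values(rates_2001), 1))  # 26.1
```

Python's standard library offers `statistics.median`, which follows the same rule; the explicit version is shown only to make the two steps visible.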

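Taking the mean of the two middle values is not possible for ordinal variables, because categories cannot be averaged. In the sample below the two middle observations fall into different categories: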

Question: Are you satisfied with your GP?

| Answer | Respondent |
|---|---|
| Very satisfied | János |
| Very satisfied | Júlia |
| Neither satisfied, nor dissatisfied | Péter |
| Dissatisfied | István |
| Very dissatisfied | Mária |
| Very dissatisfied | József |
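A short Python sketch makes the problem with an even count visible. It uses `statistics.median_low` and `statistics.median_high` from the standard library on the rank of each answer. The `SCALE` list is hypothetical, and its "Satisfied" category is an assumption about the answer scale.

```python
import statistics

# ASSUMPTION: "Satisfied" is on the scale even though no respondent chose it.
SCALE = [
    "Very satisfied",
    "Satisfied",
    "Neither satisfied, nor dissatisfied",
    "Dissatisfied",
    "Very dissatisfied",
]

# The six respondents from the table above.
answers = [
    "Very satisfied",
    "Very satisfied",
    "Neither satisfied, nor dissatisfied",
    "Dissatisfied",
    "Very dissatisfied",
    "Very dissatisfied",
]

# Replace each answer by its position on the scale so it can be sorted.
ranks = [SCALE.index(answer) for answer in answers]

# With an even count there are two middle observations.
lower_middle = SCALE[statistics.median_low(ranks)]
upper_middle = SCALE[statistics.median_high(ranks)]

print(lower_middle)  # Neither satisfied, nor dissatisfied
print(upper_middle)  # Dissatisfied
```

The two middle categories differ, and there is no meaningful category halfway between them, so averaging is not an option here.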

### Finding the median in a frequency distribution (large number of observations)

• We have to find the observation located at the middle of the distribution.

• To do so, we construct a cumulative percentage distribution (see page 61).

• The observation located at the middle of the distribution is the one that has a cumulative percentage value equal to 50%.

• If there is no observation with a cumulative percentage precisely equal to 50%, then (following our rule of thumb) choose the lowest category that has a cumulative percentage greater than 50%.
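The rule above can be sketched in Python as follows. The function name `median_from_frequencies` and the small frequency table are hypothetical and serve only to illustrate the rule; the comparison is done on counts to avoid rounding issues with percentages.

```python
def median_from_frequencies(frequencies):
    """Median category of a frequency distribution.

    `frequencies` is a list of (category, count) pairs that is
    already sorted by category. The function returns the first
    category whose cumulative percentage reaches or exceeds 50%.
    """
    total = sum(count for _, count in frequencies)
    if total == 0:
        raise ValueError("no observations")
    cumulative = 0
    for category, count in frequencies:
        cumulative += count
        # cumulative / total >= 50%  is the same as  2 * cumulative >= total
        if 2 * cumulative >= total:
            return category
    raise AssertionError("unreachable: the last category always reaches 100%")


# HYPOTHETICAL data: 20 observations in four sorted categories.
# Cumulative percentages: 15%, 40%, 80%, 100%.
example = [(1, 3), (2, 5), (3, 8), (4, 4)]
median_category = median_from_frequencies(example)
print(median_category)  # 3
```

Category 2 only reaches 40%, so the median is category 3, the lowest category whose cumulative percentage exceeds 50%.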

Example: In Japan (ISSP, 2006) the median number of hours worked per week is 45. The cumulative percentage is 45.0% at 44 hours and first exceeds 50% at 45 hours (51.6%):

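| Hours worked weekly | Frequency | Percentage | Cumulative percentage |
|---|---|---|---|
| 2.0 | 1 | 0.1 | 0.1 |
| 3.0 | 2 | 0.3 | 0.4 |
| 4.0 | 3 | 0.4 | 0.9 |
| 5.0 | 3 | 0.4 | 1.3 |
| 6.0 | 4 | 0.6 | 1.8 |
| 7.0 | 2 | 0.3 | 2.1 |
| 8.0 | 6 | 0.9 | 3.0 |
| 9.0 | 10 | 1.4 | 4.4 |
| 10.0 | 5 | 0.7 | 5.1 |
| 11.0 | 1 | 0.1 | 5.2 |
| 12.0 | 9 | 1.3 | 6.5 |
| 13.0 | 2 | 0.3 | 6.8 |
| 15.0 | 5 | 0.7 | 7.5 |
| 16.0 | 5 | 0.7 | 8.2 |
| 17.0 | 2 | 0.3 | 8.5 |
| 18.0 | 7 | 1.0 | 9.5 |
| 19.0 | 2 | 0.3 | 9.8 |
| 20.0 | 21 | 3.0 | 12.8 |
| 21.0 | 3 | 0.4 | 13.2 |
| 22.0 | 2 | 0.3 | 13.5 |
| 23.0 | 2 | 0.3 | 13.8 |
| 24.0 | 4 | 0.6 | 14.3 |
| 25.0 | 12 | 1.7 | 16.0 |
| 26.0 | 1 | 0.1 | 16.2 |
| 27.0 | 1 | 0.1 | 16.3 |
| 28.0 | 3 | 0.4 | 16.7 |
| 29.0 | 1 | 0.1 | 16.9 |
| 30.0 | 27 | 3.8 | 20.7 |
| 31.0 | 2 | 0.3 | 21.0 |
| 32.0 | 3 | 0.4 | 21.4 |
| 33.0 | 2 | 0.3 | 21.7 |
| 34.0 | 1 | 0.1 | 21.8 |
| 35.0 | 17 | 2.4 | 24.3 |
| 36.0 | 6 | 0.9 | 25.1 |
| 37.0 | 3 | 0.4 | 25.5 |
| 38.0 | 5 | 0.7 | 26.2 |
| 39.0 | 1 | 0.1 | 26.4 |
| 40.0 | 100 | 14.2 | 40.6 |
| 41.0 | 2 | 0.3 | 40.9 |
| 42.0 | 19 | 2.7 | 43.5 |
| 43.0 | 7 | 1.0 | 44.5 |
| 44.0 | 3 | 0.4 | 45.0 |
| 45.0 | 47 | 6.7 | 51.6 |
| 46.0 | 5 | 0.7 | 52.3 |
| 47.0 | 2 | 0.3 | 52.6 |
| 48.0 | 46 | 6.5 | 59.1 |
| 50.0 | 95 | 13.5 | 72.6 |
| 51.0 | 4 | 0.6 | 73.2 |
| 52.0 | 4 | 0.6 | 73.8 |
| 54.0 | 6 | 0.9 | 74.6 |
| 55.0 | 25 | 3.5 | 78.2 |
| 56.0 | 11 | 1.6 | 79.7 |
| 57.0 | 3 | 0.4 | 80.1 |
| 58.0 | 2 | 0.3 | 80.4 |
| 59.0 | 1 | 0.1 | 80.6 |
| 60.0 | 60 | 8.5 | 89.1 |
| 61.0 | 1 | 0.1 | 89.2 |
| 62.0 | 2 | 0.3 | 89.5 |
| 63.0 | 2 | 0.3 | 89.8 |
| 65.0 | 8 | 1.1 | 90.9 |
| 66.0 | 4 | 0.6 | 91.5 |
| 67.0 | 1 | 0.1 | 91.6 |
| 68.0 | 1 | 0.1 | 91.8 |
| 70.0 | 16 | 2.3 | 94.0 |
| 72.0 | 7 | 1.0 | 95.0 |
| 75.0 | 4 | 0.6 | 95.6 |
| 76.0 | 1 | 0.1 | 95.7 |
| 78.0 | 2 | 0.3 | 96.0 |
| 80.0 | 8 | 1.1 | 97.2 |
| 84.0 | 2 | 0.3 | 97.4 |
| 85.0 | 2 | 0.3 | 97.7 |
| 90.0 | 2 | 0.3 | 98.0 |
| 91.0 | 1 | 0.1 | 98.2 |
| 95.0 | 1 | 0.1 | 98.3 |
| 96 or more | 12 | 1.7 | 100.0 |
| Total | 705 | 100.0 | |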

The median of an ordinal variable is found in the same way. (Remember to sort the categories first!)

Example: ISSP 2006, USA. “On the whole, do you think it should or should not be the government’s responsibility to…”

| Answer | …provide a job for everyone who wants one? Freq. | % | Cum. % | …provide health care for the sick? Freq. | % | Cum. % |
|---|---|---|---|---|---|---|
| Definitely should be | 239 | 15.9 | 15.9 | 850 | 56.4 | 56.4 |
| Probably should be | 356 | 23.7 | 39.6 | 502 | 33.3 | 89.8 |
| Probably should not be | 521 | 34.6 | 74.2 | 116 | 7.7 | 97.5 |
| Definitely should not be | 388 | 25.8 | 100.0 | 38 | 2.5 | 100.0 |
| Total | 1504 | 100.0 | | 1506 | 100.0 | |

Find the median of both variables and interpret their difference.

An application: detecting a trend

USA, General Social Survey, 1991 and 1994, Government spending on the military.

| Answer | 1991 % | 1991 Cumulative % | 1994 % | 1994 Cumulative % |
|---|---|---|---|---|
| Too low | 14.5 | 14.5 | 16.5 | 16.5 |
| About right | 57.6 | 72.1 | 49.3 | 65.8 |
| Too much | 27.9 | 100.0 | 34.2 | 100.0 |
| Total | 100.0 | | 100.0 | |

The median was “About right” in both years. According to the median, public opinion did not change between the two years.

Note that the unchanged median masks the fact that the share of “Too much” grew from 27.9% to 34.2%, an increase of almost a quarter of its 1991 value.
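The comparison can be reproduced with a few lines of Python. This is a minimal sketch: the function name `median_from_percentages` is hypothetical, and the percentages are the ones in the General Social Survey table above.

```python
CATEGORIES = ["Too low", "About right", "Too much"]  # sorted answer categories

# Percentage distributions for the two survey years.
percent = {
    1991: [14.5, 57.6, 27.9],
    1994: [16.5, 49.3, 34.2],
}


def median_from_percentages(categories, percentages):
    """Return the first category whose cumulative percentage reaches 50%."""
    cumulative = 0.0
    for category, pct in zip(categories, percentages):
        cumulative += pct
        if cumulative >= 50:
            return category
    raise ValueError("percentages do not reach 50%")


medians = {
    year: median_from_percentages(CATEGORIES, values)
    for year, values in percent.items()
}
print(medians)  # {1991: 'About right', 1994: 'About right'}

# Relative change in the share of "Too much" between 1991 and 1994.
relative_change = (percent[1994][2] - percent[1991][2]) / percent[1991][2]
print(f"{relative_change:.1%}")  # 22.6%
```

The median is identical in both years even though the last category grew by more than a fifth, which is why the median alone can miss shifts that stay on one side of the middle.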