
## SOCIAL STATISTICS

Renáta Németh, Dávid Simon

ELTE


## The relationship of nominal or ordinal variables with crosstabs

Crosstab: shows the joint distribution of two nominal or ordinal variables in a single table

Joint distribution: the distribution of the cases over every combination of the categories of the two variables, from which the distribution of each variable within the categories of the other can be read

E.g.: the distribution of the origins of best friendships by settlement type (From: Social Report, 2002, KSH)

| Origin of best friendship \ Settlement type | Capital | County capital | Other city or town | Village | Total |
|---|---|---|---|---|---|
| Childhood | 22.2 | 20.0 | 22.2 | 29.5 | 24.2 |
| School | 33.6 | 24.9 | 22.9 | 18.4 | 24.0 |
| Work | 21.1 | 24.7 | 23.5 | 16.9 | 21.1 |
| Family | 5.7 | 5.3 | 6.7 | 8.1 | 6.7 |
| Neighbours | 8.7 | 13.1 | 13.5 | 15.3 | 13.0 |
| Other | 8.6 | 11.9 | 11.2 | 11.8 | 11.0 |
| Total | 100% | 100% | 100% | 100% | 100% |

Level of measurement? What do the rows and columns stand for?

What kind of percentage data does the table give for the joint distribution?

(Question: To what extent are you attached to your country of residence? ISSP 1995)

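| | USA % | USA N | H % | H N | SK % | SK N |
|---|---|---|---|---|---|---|
| Very much | 35.4% | 463 | 79.6% | 794 | 41.6% | 567 |
| Considerably | 45.6% | 596 | 16.8% | 168 | 47.7% | 650 |
| Not very much | 15.3% | 200 | 2.8% | 28 | 7.6% | 103 |
| Not at all | 3.7% | 48 | 0.8% | 8 | 3.1% | 42 |
| Total | 100% | 1307 | 100% | 998 | 100% | 1362 |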

% within country

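| | D-E | H | CZ | SLO | PL | BG | RUS | LV | SK |
|---|---|---|---|---|---|---|---|---|---|
| Very much | 27.7% | 79.6% | 47.5% | 49.3% | 54.6% | 72.1% | 41.7% | 41.3% | 41.6% |
| Considerably | 53.6% | 16.8% | 44.2% | 43.7% | 39.3% | 20.6% | 40.1% | 45.1% | 47.7% |
| Not very much | 16.6% | 2.8% | 7.0% | 6.0% | 5.1% | 4.3% | 12.3% | 10.9% | 7.6% |
| Not at all | 2.1% | 0.8% | 1.4% | 1.1% | 0.9% | 3.1% | 5.9% | 2.7% | 3.1% |
| Total | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |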

What's the difference?

Row percentage

Column percentage

Cell percentage
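The three kinds of percentage differ only in what each cell is divided by. A minimal Python sketch, assuming the counts for USA, H and SK from the ISSP 1995 table above (the variable and function names are illustrative only, not part of the source):

```python
# Counts from the ISSP 1995 table above: rows = answer, columns = country.
answers = ["Very much", "Considerably", "Not very much", "Not at all"]
countries = ["USA", "H", "SK"]
counts = [
    [463, 794, 567],
    [596, 168, 650],
    [200, 28, 103],
    [48, 8, 42],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)


def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Row percentage: each cell divided by its row total (each row sums to 100%).
row_pct = [[pct(n, row_totals[i]) for n in row] for i, row in enumerate(counts)]

# Column percentage: each cell divided by its column total (each column sums to 100%).
col_pct = [[pct(n, col_totals[j]) for j, n in enumerate(row)] for row in counts]

# Cell percentage: each cell divided by the grand total (the whole table sums to 100%).
cell_pct = [[pct(n, grand_total) for n in row] for row in counts]

print(col_pct[0])  # "Very much" within each country: [35.4, 79.6, 41.6]
```

Here the column percentages are the "% within country" figures, because the countries are the columns.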

### Dependent and independent variables

If our hypothesis is that the origin of best friendships varies by type of settlement (that is, where you live affects where you make friends), then in this model 'origin of friendship' is the dependent variable and 'settlement type' is the independent variable.

Crosstabulation: terminology (in the above example)

Row variable: origin of friendship

Column variable: type of settlement

Cell: the overlap of a given row with a given column

Marginal: the distribution of the row variable or the column variable on its own, not broken down by the other variable (here: the last column)
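A marginal is obtained by summing the counts over the categories of the other variable. The friendship table reports only percentages, so the sketch below assumes the ISSP 1995 counts for USA, H and SK shown earlier instead; pooling the three country samples is done purely for illustration, and all names are hypothetical:

```python
# Counts from the ISSP 1995 table (USA, H, SK): rows = answer, columns = country.
counts = {
    "Very much": {"USA": 463, "H": 794, "SK": 567},
    "Considerably": {"USA": 596, "H": 168, "SK": 650},
    "Not very much": {"USA": 200, "H": 28, "SK": 103},
    "Not at all": {"USA": 48, "H": 8, "SK": 42},
}

# Marginal of the row variable: sum each row over all columns.
row_marginal = {answer: sum(by_country.values())
                for answer, by_country in counts.items()}

# Marginal of the column variable: sum each column over all rows.
col_marginal = {}
for by_country in counts.values():
    for country, n in by_country.items():
        col_marginal[country] = col_marginal.get(country, 0) + n

# Express the row marginal as percentages of all respondents.
n_total = sum(row_marginal.values())
row_marginal_pct = {answer: round(100 * n / n_total, 1)
                    for answer, n in row_marginal.items()}

print(col_marginal)  # {'USA': 1307, 'H': 998, 'SK': 1362}
print(row_marginal_pct)
```

The column marginal reproduces the "Total" row of counts in the ISSP table; the row marginal is the distribution of answers with the countries no longer distinguished.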

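In the example above, it was more practical to give the column percentages, since the column variable was the independent one: each settlement type sums to 100%, so the origins of friendship can be compared across settlement types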

If the data can be presented both ways (with either the row or the column variable being dependent), we can give both the row and the column percentages

### Example for investigating dependency

A fictitious example: the relationship between financial status and mental health

The direction of the relationship is not obvious. Why?

How can you interpret the tables below? Which table presents which type of influence?

Column percentage:

| Mental health: having a medical condition \ Financial status | Relatively bad | Relatively good | Total |
|---|---|---|---|
| Yes | 46% | 43% | 44% |
| No | 54% | 57% | 56% |
| Total | 100% | 100% | 100% |

Row percentage:

| Mental health: having a medical condition \ Financial status | Relatively bad | Relatively good | Total |
|---|---|---|---|
| Yes | 44% | 56% | 100% |
| No | 42% | 58% | 100% |
| Total | 43% | 57% | 100% |
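Both tables can come from one and the same table of counts; only the direction of percentaging differs. The source gives no counts for this fictitious example, so the sketch below uses hypothetical counts, chosen only so that the rounded percentages match the two tables above:

```python
# HYPOTHETICAL counts (not from the source), chosen so that the rounded
# percentages match the two tables above.
# Rows: having a medical condition (Yes / No).
# Columns: financial status (relatively bad / relatively good).
counts = {
    "Yes": {"bad": 196, "good": 246},
    "No": {"bad": 232, "good": 326},
}


def column_percentages(table):
    """Percentage within each column: condition, given financial status."""
    col_totals = {}
    for row in table.values():
        for col, n in row.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {r: {c: round(100 * n / col_totals[c]) for c, n in row.items()}
            for r, row in table.items()}


def row_percentages(table):
    """Percentage within each row: financial status, given condition."""
    return {r: {c: round(100 * n / sum(row.values())) for c, n in row.items()}
            for r, row in table.items()}


print(column_percentages(counts))
# {'Yes': {'bad': 46, 'good': 43}, 'No': {'bad': 54, 'good': 57}}
print(row_percentages(counts))
# {'Yes': {'bad': 44, 'good': 56}, 'No': {'bad': 42, 'good': 58}}
```

Under the convention of percentaging within the categories of the presumed independent variable, the column percentages treat financial status as independent (compare 46% with 43%), while the row percentages treat mental health as independent (compare 44% with 42%).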