Renáta Németh, Dávid Simon

ELTE

Rows: How close do you feel to the town where you live? Columns: How close do you feel to Europe?

| Town where you live | Europe: Very close | Europe: Close | Europe: Not very close | Total |
| --- | --- | --- | --- | --- |
| Very close | 521 (89.5%) | 41 (7.0%) | 20 (3.4%) | 582 (100.0%) |
| Close | 123 (50.4%) | 106 (43.4%) | 15 (6.1%) | 244 (100.0%) |
| Not very close | 100 (63.7%) | 36 (22.9%) | 21 (13.4%) | 157 (100.0%) |
| Total | 744 (75.7%) | 183 (18.6%) | 56 (5.7%) | 983 (100.0%) |
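The percentages in the table can be reproduced from the counts. The following is a minimal sketch, assuming each percentage is taken within its row (the row totals show 100.0%); the names `table` and `row_percentages` are illustrative and not part of the original material.

```python
# Counts from the table: rows = closeness to town, columns = closeness to Europe.
# Category order in both directions: very close, close, not very close.
table = [
    [521, 41, 20],
    [123, 106, 15],
    [100, 36, 21],
]


def row_percentages(counts):
    """Return every cell as a percentage of its row total (1 decimal place)."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([round(100 * cell / total, 1) for cell in row])
    return result


row_pct = row_percentages(table)
for label, pct in zip(["Very close", "Close", "Not very close"], row_pct):
    print(label, pct)
```

Comparing the rows of `row_pct` with one another is the usual way to read such a table.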

Based on the percentages, which variable is the independent one?

Is there a connection?

What level of measurement are the variables?

How could we use PRE (proportional reduction in error) here?

This time the respondents come in pairs. Let’s try to guess, for each respondent, whether or not they feel closer to Europe than the other member of their pair, given that they feel closer to their town of residence than the other member.

Let’s repeat the procedure, this time also knowing the percentage of pairs in which the one who feels closer to Europe also feels closer to their town. How can we do this?

How could we formulate the improvement?

How many pairs are there where the one who feels closer to Europe feels closer to their town as well?

How can we calculate this?

Let’s proceed from cell to cell, starting from the bottom right corner. Multiply each cell by the sum of the cells above and to the left of it. Do this for every cell where it is possible, and add up the products.

| Town of residence | Europe: Very close | Europe: Close | Europe: Not very close | Total |
| --- | --- | --- | --- | --- |
| Very close | 521 (89.5%) | 41 (7.0%) | 20 (3.4%) | 582 (100.0%) |
| Close | 123 (50.4%) | 106 (43.4%) | 15 (6.1%) | 244 (100.0%) |
| Not very close | 100 (63.7%) | 36 (22.9%) | 21 (13.4%) | 157 (100.0%) |
| Total | 744 (75.7%) | 183 (18.6%) | 56 (5.7%) | 983 (100.0%) |

$$N_s = 21\cdot(521+41+123+106) + 15\cdot(521+41) + 36\cdot(521+123) + 106\cdot 521 = 103\,451$$
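The cell-by-cell rule for $N_s$ can be sketched in a few lines of Python. This is only an illustration of the counting rule described above; the function name `concordant_pairs` and the list-of-lists layout are assumptions made for the example.

```python
# Counts from the table: rows = town of residence, columns = Europe.
# Category order in both directions: very close, close, not very close.
table = [
    [521, 41, 20],
    [123, 106, 15],
    [100, 36, 21],
]


def concordant_pairs(counts):
    """Count pairs that are ordered the same way on both variables.

    Each cell is multiplied by the sum of all cells strictly above
    and strictly to the left of it; the products are added up.
    """
    total = 0
    for i, row in enumerate(counts):
        for j, cell in enumerate(row):
            above_left = sum(counts[k][l] for k in range(i) for l in range(j))
            total += cell * above_left
    return total


n_s = concordant_pairs(table)
print(n_s)  # 103451
```

Cells in the first row or first column have nothing above and to the left of them, so they contribute zero, which is why the hand calculation only lists four products.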

How many pairs are there where the one who feels closer to Europe feels less close to their town?

How can we calculate this?

Let’s proceed from cell to cell, starting from the bottom left corner. Multiply each cell by the sum of the cells above and to the right of it. Do this for every cell where it is possible, and add up the products.

| Town of residence | Europe: Very close | Europe: Close | Europe: Not very close | Total |
| --- | --- | --- | --- | --- |
| Very close | 521 (89.5%) | 41 (7.0%) | 20 (3.4%) | 582 (100.0%) |
| Close | 123 (50.4%) | 106 (43.4%) | 15 (6.1%) | 244 (100.0%) |
| Not very close | 100 (63.7%) | 36 (22.9%) | 21 (13.4%) | 157 (100.0%) |
| Total | 744 (75.7%) | 183 (18.6%) | 56 (5.7%) | 983 (100.0%) |

$$N_d = 100\cdot(41+20+106+15) + 123\cdot(41+20) + 36\cdot(15+20) + 106\cdot 20 = 29\,083$$
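The same kind of sketch works for $N_d$, this time summing the cells above and to the right. The function name `discordant_pairs` is hypothetical and chosen only for this example.

```python
# Counts from the table: rows = town of residence, columns = Europe.
# Category order in both directions: very close, close, not very close.
table = [
    [521, 41, 20],
    [123, 106, 15],
    [100, 36, 21],
]


def discordant_pairs(counts):
    """Count pairs that are ordered in opposite ways on the two variables.

    Each cell is multiplied by the sum of all cells strictly above
    and strictly to the right of it; the products are added up.
    """
    total = 0
    for i, row in enumerate(counts):
        for j, cell in enumerate(row):
            above_right = sum(
                counts[k][l] for k in range(i) for l in range(j + 1, len(row))
            )
            total += cell * above_right
    return total


n_d = discordant_pairs(table)
print(n_d)  # 29083
```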

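Gamma is the name for the following associational index:

$$\gamma = \frac{N_s - N_d}{N_s + N_d}$$

In this specific case:

$$\gamma = \frac{103\,451 - 29\,083}{103\,451 + 29\,083} = \frac{74\,368}{132\,534} \approx 0.56$$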

**Characteristics of gamma**

symmetrical

it’s between -1 and +1

it’s 0 in case of independence

meaning: among all the pairs that can be ordered on both variables, the proportion by which the probability of a prediction error decreases compared with guessing by chance (by chance, the expected number of errors is $(N_s+N_d)/2$)
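The PRE reading of gamma can be checked numerically. The sketch below assumes the usual definition $\gamma = (N_s - N_d)/(N_s + N_d)$ and takes the pair counts from the calculations above; the variable names are illustrative only.

```python
n_s = 103451  # pairs ordered the same way on both variables
n_d = 29083   # pairs ordered in opposite ways on the two variables

# Gamma under the assumed definition.
gamma = (n_s - n_d) / (n_s + n_d)

# PRE reading: guessing at random, we expect to be wrong for half of the
# pairs that can be ordered on both variables.
errors_by_chance = (n_s + n_d) / 2

# Always predicting the more frequent ordering is wrong only for the
# pairs with the less frequent ordering.
errors_with_rule = min(n_s, n_d)

pre = (errors_by_chance - errors_with_rule) / errors_by_chance

print(round(gamma, 3), round(pre, 3))
```

Both numbers coincide here because the same-order pairs are the more frequent ones; in general the PRE value equals the absolute value of gamma.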

Another possible associational index: *Somers’ d.*

Let’s count the pairs that cannot be ordered on the dependent variable because both members have the same value on it, although they differ on the independent variable ($N_{ty}$).

How can we calculate this?

Let’s start with the smallest value of the dependent variable and, within it, the cell with the smallest value of the independent variable. Multiply the number of cases in this cell by the total number of cases that have the same value of the dependent variable and any higher value of the independent variable. Repeat this for every cell where it is possible, and add up the products.

| Town of residence | Europe: Very close | Europe: Close | Europe: Not very close | Total |
| --- | --- | --- | --- | --- |
| Very close | 521 (89.5%) | 41 (7.0%) | 20 (3.4%) | 582 (100.0%) |
| Close | 123 (50.4%) | 106 (43.4%) | 15 (6.1%) | 244 (100.0%) |
| Not very close | 100 (63.7%) | 36 (22.9%) | 21 (13.4%) | 157 (100.0%) |
| Total | 744 (75.7%) | 183 (18.6%) | 56 (5.7%) | 983 (100.0%) |

$$N_{ty} = 21\cdot(15+20) + 15\cdot 20 + 36\cdot(106+41) + 106\cdot 41 + 100\cdot(123+521) + 123\cdot 521 = 139\,156$$
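The count of pairs tied on the dependent variable can be sketched as follows. The example assumes, as in the calculation above, that the dependent variable (Europe) is in the columns; the function name `tied_on_columns` is hypothetical.

```python
# Counts from the table: rows = town of residence, columns = Europe.
# Category order in both directions: very close, close, not very close.
table = [
    [521, 41, 20],
    [123, 106, 15],
    [100, 36, 21],
]


def tied_on_columns(counts):
    """Count pairs with the same column value but different row values.

    Within each column, every cell is multiplied by the sum of the
    cells below it in the same column; the products are added up.
    """
    n_rows = len(counts)
    total = 0
    for j in range(len(counts[0])):
        for i in range(n_rows):
            below = sum(counts[k][j] for k in range(i + 1, n_rows))
            total += counts[i][j] * below
    return total


n_ty = tied_on_columns(table)
print(n_ty)  # 139156
```

The loop walks each column from the top, whereas the hand calculation starts from the bottom; every pair of cells within a column is still multiplied exactly once, so the total is the same.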

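Somers’ d can be calculated using the following formula:

$$d = \frac{N_s - N_d}{N_s + N_d + N_{ty}}$$

In this specific case:

$$d = \frac{103\,451 - 29\,083}{103\,451 + 29\,083 + 139\,156} = \frac{74\,368}{271\,690} \approx 0.27$$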
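As a quick numerical check, the sketch below plugs the three pair counts into the asymmetric form of Somers’ d, assumed here to be $d = (N_s - N_d)/(N_s + N_d + N_{ty})$. The function name `somers_d` is illustrative and not taken from any library.

```python
def somers_d(n_s, n_d, n_ty):
    """Somers' d from pair counts (assumed asymmetric definition).

    n_s:  pairs ordered the same way on both variables
    n_d:  pairs ordered in opposite ways on the two variables
    n_ty: pairs tied on the dependent variable only
    """
    return (n_s - n_d) / (n_s + n_d + n_ty)


# Pair counts taken from the calculations above.
d = somers_d(103451, 29083, 139156)
print(round(d, 3))  # 0.274
```

Because the pairs tied on the dependent variable are added to the denominator, the value depends on which variable is treated as dependent.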

Formula:

where

x, y: the ordinal variables

N: the number of cases

symmetrical

it’s between -1 and +1

it’s 0 if independent

For the same set of data, which is larger, gamma or Somers’ d?

What index can we use for ordinal variables if we don’t know which is the independent variable?