SOCIAL STATISTICS

Renáta Németh, Dávid Simon

ELTE

Linear relationship

Introductory definition:

  • The relationship between two variables of high measurement level is linear if a one-unit increase in the independent variable is expected to change the dependent variable by the same amount and in the same direction in every case.

  • Such a relationship can be described by the line that the values define, and by the properties of that line.

What line do the values define in the graph below?

The line can be characterised by two parameters:

  1.  its steepness (slope)

  2.  the point where it intercepts the y-axis

The general equation for a straight line:

y = a + bx

where

a is the intercept: the point where the line intercepts the y-axis (the value of y when x = 0)

b is the steepness of the line (how much y changes when x increases by one unit)
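The two parameters can be computed from data. The following is a minimal sketch, assuming the line is fitted by ordinary least squares (the text above does not name a fitting method). The function name `fit_line` and the data are hypothetical and chosen only for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Return (a, b) of the least-squares line y = a + b*x.

    Assumption: ordinary least squares; the notes do not specify the method.
    """
    x_mean, y_mean = mean(x), mean(y)
    # Sum of cross-products of deviations and sum of squared x deviations.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx            # steepness: change in y per one unit of x
    a = y_mean - b * x_mean  # intercept: value of y when x = 0
    return a, b

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

a, b = fit_line(x, y)
print(f"a = {a:.2f}, b = {b:.2f}")  # prints: a = 1.00, b = 2.00
```

Because the made-up points lie exactly on a line, the fitted a and b reproduce that line; with real survey data the points scatter around the line and a and b describe the best-fitting one.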

Steepness

It describes the direction and extent of the relationship:

  • if it is negative, the relationship is inverse (the higher the independent variable, the lower the dependent variable)

  • if it is positive, the relationship is direct (the higher the independent variable, the higher the dependent variable)

  • if it is 0, there is no linear relationship between the two variables

  • its absolute value describes the strength of the relationship, expressed in the units of the two variables

Even though the input data were the same, we got different values for b, which indicates the strength of the relationship. The only difference was the unit of measurement applied. Conclusion: the value of b depends on the units of measurement used for both variables.

Thus, if we want to compare the strength of the relationship across different sources (e.g. data from different countries), we need to take the units of measurement into account. The same goes for comparing the effects of several independent variables on a given dependent variable (e.g. if we want to know whether income is affected more by age than by the number of years spent in education).
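The dependence of b on the units of measurement can be checked numerically. The sketch below uses invented age and income figures (not data from the course) and assumes a least-squares slope; the helper `slope` is a hypothetical name. The same observations are expressed in different units and the slope is recomputed each time.

```python
from statistics import mean

def slope(x, y):
    """Least-squares slope b of the line y = a + b*x (assumed fitting method)."""
    x_mean, y_mean = mean(x), mean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data, invented for illustration only.
age_years = [25, 35, 45, 55]
income_thousand = [200, 240, 300, 340]  # income in thousands of a currency unit

# The same observations in other units.
income_units = [v * 1000 for v in income_thousand]  # single currency units
age_months = [v * 12 for v in age_years]            # age in months

b_thousand_per_year = slope(age_years, income_thousand)
b_units_per_year = slope(age_years, income_units)
b_thousand_per_month = slope(age_months, income_thousand)

print(b_thousand_per_year)   # thousands per year of age
print(b_units_per_year)      # 1000 times larger: currency units per year
print(b_thousand_per_month)  # 12 times smaller: thousands per month of age
```

The relationship in the data is identical in all three cases; only the number that expresses it changes, which is why slopes from different sources cannot be compared without first aligning the units.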

Similarly, b depends on the standard deviations of the two variables.
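One way to see the role of the standard deviations, sketched under the assumption of a least-squares line: the slope then equals the Pearson correlation coefficient r multiplied by the ratio of the standard deviation of y to that of x. The correlation coefficient is not introduced in this section, so r, the helper names and the data below are all illustrative assumptions.

```python
from statistics import mean, stdev

def slope(x, y):
    """Least-squares slope b of the line y = a + b*x (assumed fitting method)."""
    x_mean, y_mean = mean(x), mean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx

def pearson_r(x, y):
    """Pearson correlation coefficient r, which does not depend on units."""
    x_mean, y_mean = mean(x), mean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b = slope(x, y)
r = pearson_r(x, y)
b_from_r = r * stdev(y) / stdev(x)  # slope rebuilt from r and the two standard deviations

print(b, b_from_r)  # the two values agree up to rounding error
```

With r held fixed, a larger spread in y or a smaller spread in x gives a steeper line, so two data sets with the same underlying association can still produce different values of b.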

The intercept

The intercept is the value the dependent variable is expected to take when the independent variable equals 0.

Is this of any use in social science? It depends on the variables in question. The intercept makes no sense when looking at income and age, because we can’t sensibly assign an income value to age 0. However, this is not always the case.

Revision:

What can we say about the relationship between age and income so far?