Scientists commonly look for patterns in their data, trying to identify relationships between variables. Whilst their goal might be a full explanation in terms of predictions derived from theory, it is unlikely that cause and effect can be isolated and determined that easily. To do that, every other variable that might have an influence would have to be kept constant (controlled for). Often the best we can do is to look for correlations between a particular factor and an outcome.
Correlation describes a consistent relationship between two factors. The two could rise and fall together, such as the amount someone smokes and the amount of money smoking costs them (high correlation), or how much they smoke and the likelihood that they will get lung or throat cancer (a lower but still significant correlation). These are positive correlations.
Or one factor could rise whilst the other falls: increasing the amount of fresh fruit and vegetables in your diet is correlated with a decrease in the risk of heart disease. This is a negative correlation: the variables change in opposite directions.
Correlation is not the same as causation
Just because two variables are correlated, it does not mean that one caused the other. In Australia, sales of ice cream are highly correlated with shark attacks on swimmers. This does not indicate that the more ice cream you eat the more likely you are to be attacked, just that the incidence of both rises during the summer. Incidence of malaria used to correlate highly with living near swamps and, with our usual tendency to read more into patterns like these than they really imply, people believed malaria was caused by bad air. It wasn’t until the role of the mosquito as a carrier was discovered at the end of the nineteenth century that people stopped trying to shut “bad air” out of their houses.
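The ice cream and shark example can be imitated with a small simulation. This is only a sketch on invented numbers: the temperature range, the multipliers, the noise levels and the variable names (temps, ice_cream, shark_risk) are all assumptions made for illustration, not real Australian data. Temperature is made to drive both quantities, while neither one has any effect on the other.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(1)  # fixed seed so the sketch is repeatable

# Invented model: daily temperature drives BOTH quantities.
# Ice cream sales never enter the shark formula, and vice versa.
temps = [random.uniform(10, 35) for _ in range(5000)]
ice_cream = [20 * t + random.gauss(0, 50) for t in temps]
shark_risk = [0.1 * t + random.gauss(0, 0.5) for t in temps]

# Correlation across all simulated days.
r_all = pearson_r(ice_cream, shark_risk)

# Hold the third variable roughly constant: keep only days at 24-26 degrees.
band = [i for i, t in enumerate(temps) if 24 <= t <= 26]
r_band = pearson_r([ice_cream[i] for i in band],
                   [shark_risk[i] for i in band])

print(f"all days: r = {r_all:.2f}")
print(f"days at 24-26 degrees only: r = {r_band:.2f}")
```

Across all days the two series are clearly correlated, but once temperature is held nearly constant most of that correlation disappears. That is what controlling for a variable means in practice.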
Describing correlation visually
Scattergrams (x-y plots) are used to show correlation. David Howell shows clearly how the size of a correlation relates to the way it looks on a scattergram.
The scattergram below shows the ozone levels and minimum temperatures in the Arctic Circle from 1979 to 1998. The thickness of the ozone layer is measured in Dobson units (DU).
In which year was the ozone layer the thinnest?
What is the relationship between the minimum temperature and ozone layer thickness?
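Alternatively, a correlation coefficient can be calculated to see whether two sets of measurements are related.
The coefficient ranges from -1 to +1. The nearer it is to +1 or -1, the stronger the relationship; a value near 0 indicates little or no linear relationship.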
A positive correlation shows that high scores in the first group of measurements are associated with high scores in the second group.
A negative correlation shows that high scores in the first group are associated with low scores in the second group.
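As a rough illustration of the points above, the coefficient can be computed by hand in a few lines. The sketch assumes Pearson's coefficient, the most common choice; the text does not say which coefficient is meant. The function name pearson_r and all four data lists are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each measurement from its group mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one group never varies")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Made-up illustrative data, not real measurements.
cigarettes_per_day = [0, 5, 10, 15, 20]
weekly_cost = [0.0, 17.5, 35.0, 52.5, 70.0]  # exactly proportional to smoking
fruit_portions = [1, 2, 3, 4, 5]
risk_score = [9, 8, 6, 5, 2]                 # falls as fruit intake rises

r_positive = pearson_r(cigarettes_per_day, weekly_cost)
r_negative = pearson_r(fruit_portions, risk_score)

print(f"smoking vs cost: r = {r_positive:.2f}")
print(f"fruit vs risk:   r = {r_negative:.2f}")
```

Because the cost figures are exactly proportional to the amount smoked, the first coefficient comes out at 1 (up to floating-point rounding), a perfect positive correlation. The second is about -0.98: high scores in the first group go with low scores in the second.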