Authors
Clive Seale

Pub Date: December 2011
Pages: 648

19 Statistical reasoning: from one to two variables
Alice Bloch

1. Using the data matrix in Table 19.1, draw a frequency distribution for the variable Working. Do the same for Age, recoding it into three broad categories. Draw a bar chart of each of these distributions. Calculate the mean, median and mode for each variable.

Construct contingency tables that show the relationship between: Sex and Working; Sex and Jobsat; Working and Jobsat. Ensure that each cell contains a count and column and row percentages. Describe the character of the relationships which you find.

Draw a scattergram, plotting Age against Jobsat. Describe the character of this relationship.

Using the recoded version of Age construct contingency tables showing the relationship between this variable and each of the other three variables. Describe the character of the relationships you find.

If you are learning SPSS or another statistical package try inputting these data. You will find it easier to get the computer to do the analyses specified above. You can also generate tests of association and significance and consider the meaning of these. Try using the software to produce output in the form of graphs (e.g., pie charts, histograms).

Table 19.1. A data matrix

                Variables or questions

           Sex       Age    Working      Jobsat

Case 1     Male      66     No           Missing
Case 2     Female    34     Full time    1
Case 3     Female    25     Part time    2
Case 4     Female    44     Full time    5
Case 5     Male      78     No           Missing
Case 6     Male      40     Full time    2
Case 7     Male      33     Full time    1
Case 8     Male      16     No           Missing
Case 9     Female    35     Full time    1
Case 10    Female    45     Full time    2
Case 11    Male      30     No           Missing
Case 12    Female    56     Part time    4
Case 13    Male      79     No           Missing
Case 14    Male      60     Part time    4
Case 15    Female    55     Part time    4
Case 16    Female    54     Part time    5
Case 17    Male      55     Full time    1
Case 18    Male      17     No           Missing
Case 19    Female    23     Full time    3
Case 20    Female    20     No           Missing
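
For readers following the suggestion to use a statistical package, the sketch below enters the data from Table 19.1 in Python (standard library only). It produces a frequency distribution for Working, a recoded Age variable, summary statistics for Age, and a Working by Sex contingency table with counts and column percentages. The three age bands (under 30, 30 to 59, 60 and over) and all function and variable names are illustrative assumptions, not part of the exercise.

```python
from collections import Counter
from statistics import mean, median, multimode

# Data matrix from Table 19.1: (sex, age, working, jobsat); None marks "Missing".
CASES = [
    ("Male", 66, "No", None), ("Female", 34, "Full time", 1),
    ("Female", 25, "Part time", 2), ("Female", 44, "Full time", 5),
    ("Male", 78, "No", None), ("Male", 40, "Full time", 2),
    ("Male", 33, "Full time", 1), ("Male", 16, "No", None),
    ("Female", 35, "Full time", 1), ("Female", 45, "Full time", 2),
    ("Male", 30, "No", None), ("Female", 56, "Part time", 4),
    ("Male", 79, "No", None), ("Male", 60, "Part time", 4),
    ("Female", 55, "Part time", 4), ("Female", 54, "Part time", 5),
    ("Male", 55, "Full time", 1), ("Male", 17, "No", None),
    ("Female", 23, "Full time", 3), ("Female", 20, "No", None),
]


def recode_age(age):
    """Recode age into three broad bands (the cut-points are an assumption)."""
    if age < 30:
        return "Under 30"
    if age < 60:
        return "30-59"
    return "60 and over"


def crosstab(rows, cols):
    """Return {(row, col): (count, column percentage)} for two variables."""
    counts = Counter(zip(rows, cols))
    col_totals = Counter(cols)
    return {
        (r, c): (n, round(100 * n / col_totals[c], 1))
        for (r, c), n in counts.items()
    }


sexes = [case[0] for case in CASES]
ages = [case[1] for case in CASES]
working = [case[2] for case in CASES]

working_freq = Counter(working)
age_band_freq = Counter(recode_age(a) for a in ages)
age_summary = (mean(ages), median(ages), multimode(ages))
# Rows are Working and columns are Sex, so percentages are within each sex.
working_by_sex = crosstab(working, sexes)

print(working_freq)
print(age_band_freq)
print(age_summary)
for cell, (count, col_pct) in sorted(working_by_sex.items()):
    print(cell, count, f"{col_pct}%")
```

Row percentages could be added in the same way by dividing each count by its row total instead of its column total.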

2. Table 19.2 consists of four contingency tables demonstrating different types of relationship between the two variables of social class and home ownership. Below each is a p-value and the result of a test of association (Q). For each table, describe the character of the relationship and explain why the p-values and tests of association vary.

Table 19.2. Tables showing different relationships between social class and home ownership (column %)

(a)
                           Social class
Home ownership        Lower    Middle    Upper
Owner                    20        30       50
Private, rented          30        40       30
Council, rented          50        30       20

p < 0.01, Q = 0.6

(b)
                           Social class
Home ownership        Lower    Middle    Upper
Owner                    60        40        3
Private, rented          35        35       45
Council, rented           5        25       52

p < 0.01, Q = -0.8

(c)
                           Social class
Home ownership        Lower    Middle    Upper
Owner                    33        32       36
Private, rented          30        28       33
Council, rented          37        40       31

p < 0.05, Q = 0.04

(d)
                           Social class
Home ownership        Lower    Middle    Upper
Owner                    56        10       59
Private, rented          23        20       22
Council, rented          21        70       19

p < 0.01, Q = -0.02
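
One reason p-values and measures of association can diverge is that a significance test depends on sample size as well as on the pattern of percentages. The sketch below illustrates this with the column percentages of table (c). Because the exercise does not report the underlying counts, it assumes two hypothetical sample sizes (100 and 1,000 cases per social class). It computes Pearson's chi-square and, for this 3 x 3 table (4 degrees of freedom), the p-value from the closed form exp(-x/2)(1 + x/2). All names are illustrative.

```python
import math

# Column percentages from Table 19.2(c): rows are tenure, columns are class.
PERCENTAGES_C = [
    [33, 32, 36],  # Owner
    [30, 28, 33],  # Private, rented
    [37, 40, 31],  # Council, rented
]


def counts_from_percentages(percentages, n_per_column):
    """Turn column percentages into counts for a hypothetical column size."""
    return [[p * n_per_column / 100 for p in row] for row in percentages]


def chi_square(table):
    """Pearson's chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df4(stat):
    """Upper-tail chi-square probability, valid only for 4 degrees of freedom."""
    return math.exp(-stat / 2) * (1 + stat / 2)


chi_small = chi_square(counts_from_percentages(PERCENTAGES_C, 100))
chi_large = chi_square(counts_from_percentages(PERCENTAGES_C, 1000))
p_small = p_value_df4(chi_small)
p_large = p_value_df4(chi_large)

print(f"n = 100 per class:  chi-square = {chi_small:.2f}, p = {p_small:.3f}")
print(f"n = 1000 per class: chi-square = {chi_large:.2f}, p = {p_large:.5f}")
```

With the percentages held fixed, the statistic grows in proportion to the sample size, so the same weak pattern can be non-significant in a small sample and significant in a large one.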

Table 20.1. Demonstration of the elaboration paradigm: the relationship between education and income, as affected by gender

(a)(i) Zero-order table showing an association

                 Educational achievement
Income           High           Low
High             120 (60%)      80 (40%)
Low              80 (40%)       120 (60%)
Total            200 (100%)     200 (100%)

p = 0.00006; phi = 0.20; gamma = 0.38
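
The statistics reported under table (a)(i) can be reproduced from its four cell counts. The sketch below is a minimal Python illustration; the function names are assumptions made for this example. Phi is computed from the cross-products and the marginal totals, gamma (equivalent to Yule's Q in a 2 x 2 table) from the cross-products alone, and the p-value from an uncorrected chi-square test with one degree of freedom.

```python
import math


def phi(a, b, c, d):
    """Phi coefficient for the 2 x 2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


def gamma(a, b, c, d):
    """Goodman and Kruskal's gamma for a 2 x 2 table."""
    return (a * d - b * c) / (a * d + b * c)


def chi_square_p(a, b, c, d):
    """p-value of the uncorrected chi-square test (1 degree of freedom)."""
    n = a + b + c + d
    chi2 = n * phi(a, b, c, d) ** 2
    return math.erfc(math.sqrt(chi2 / 2))


# Table 20.1(a)(i): rows are income (high, low), columns are education (high, low).
cells = (120, 80, 80, 120)
print(f"phi = {phi(*cells):.2f}")
print(f"gamma = {gamma(*cells):.2f}")
print(f"p = {chi_square_p(*cells):.5f}")
```

Phi takes the marginal totals into account while gamma does not, which is why the two measures differ in size for the same table.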

(a)(ii) Replication

Men
                 Educational achievement
Income           High           Low
High             40 (61%)       26 (39%)
Low              26 (39%)       40 (61%)
Total            66 (100%)      66 (100%)

p = 0.01481; phi = 0.21; gamma = 0.41

Women
                 Educational achievement
Income           High           Low
High             80 (60%)       54 (40%)
Low              54 (40%)       80 (60%)
Total            134 (100%)     134 (100%)

p = 0.00149; phi = 0.19; gamma = 0.37

(a)(iii) Spurious or intervening

Men
                 Educational achievement
Income           High           Low
High             112 (78%)      64 (76%)
Low              32 (22%)       20 (24%)
Total            144 (100%)     84 (100%)

p = 0.78290; phi = 0.02; gamma = 0.04

Women
                 Educational achievement
Income           High           Low
High             8 (14%)        16 (14%)
Low              48 (86%)       100 (86%)
Total            56 (100%)      116 (100%)

p = 0.93038; phi = 0.01; gamma = 0.02

(a)(iv) Specification

Men
                 Educational achievement
Income           High           Low
High             90 (60%)       50 (33%)
Low              60 (40%)       100 (67%)
Total            150 (100%)     150 (100%)

p < 0.00000; phi = 0.27; gamma = 0.50

Women
                 Educational achievement
Income           High           Low
High             30 (60%)       30 (60%)
Low              20 (40%)       20 (40%)
Total            50 (100%)      50 (100%)

p = 1.00000; phi = 0.00; gamma = 0.00

(b)(i) Zero-order table showing no association

                 Educational achievement
Income           High           Low
High             120 (60%)      120 (60%)
Low              80 (40%)       80 (40%)
Total            200 (100%)     200 (100%)

p = 1.00000; phi = 0.00; gamma = 0.00

(b)(ii) Suppressor

Men
                 Educational achievement
Income           High           Low
High             20 (67%)       20 (20%)
Low              10 (33%)       80 (80%)
Total            30 (100%)      100 (100%)

p < 0.00000; phi = 0.43; gamma = 0.78

Women
                 Educational achievement
Income           High           Low
High             100 (59%)      100 (100%)
Low              70 (41%)       0 (0%)
Total            170 (100%)     100 (100%)

p < 0.00000; phi = -0.45; gamma = -1.00
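
The suppressor pattern in (b)(ii) can be checked numerically: adding the men's and women's partial tables cell by cell gives a combined table, and gamma can be computed for each of the three. The Python sketch below does this; the helper names are assumptions for illustration.

```python
def gamma(table):
    """Gamma for a 2 x 2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / (a * d + b * c)


def add_tables(first, second):
    """Add two tables of the same shape cell by cell."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(first, second)]


# Partial tables from (b)(ii): rows are income (high, low),
# columns are educational achievement (high, low).
men = [[20, 20], [10, 80]]
women = [[100, 100], [70, 0]]

combined = add_tables(men, women)
print("Combined:", combined, "gamma =", round(gamma(combined), 2))
print("Men:      gamma =", round(gamma(men), 2))
print("Women:    gamma =", round(gamma(women), 2))
```

The associations among men and among women run in opposite directions and cancel out when the two groups are combined, which is why a table for the whole sample can show no association.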

3. This is a structured exercise in reading a statistical table that aims to give you a general strategy for perceiving the main messages of such tables. You could apply this approach to Table 17.1 in the book, or find tables as suggested in the fourth workshop and discussion exercise associated with Chapter 17. You will find that not all of the questions are relevant to every table, but experience has shown that these steps, if followed carefully, enable a deeper understanding of any statistical table.

    (a) Read the title before you look at any numbers. What does this reveal about the content of the table?
    (b) Look at the source: Who produced the data, with what purpose? Was it a census or a sample?
    (c) Look at any notes above or below the table. How will they influence its scope and your interpretation?
    (d) Read the column and row titles. They indicate which variables are applied to the data.
    (e) How many variables are there and what are they? Can any be considered independent or dependent?
    (f) How are the variables measured? Are there any omissions or peculiarities in the measurement scale? How else might such a measure have been constructed?
    (g) What units are used: percentages, thousands, millions? If you are dealing with percentages, in which direction (across rows or down columns) do they add up to 100%?
    (h) Look at the 'All' or 'Total' column. These are usually found in the right-hand column and/or the bottom row (the 'margins' of a table). What do variations in the row or column tell you about the variables concerned?
    (i) Now look at some rows and/or columns inside the table. What do these tell you about the relationships between variables? What social processes might have generated the trends you find?
    (j) Is it possible to make causal statements about the relationship between variables? If so, do any of these involve the interaction of more than two variables?
    (k) What are the shortcomings of the data in drawing conclusions about social processes?
    (l) What other enquiries could be conducted to take this analysis further?
    (m) Finally, consider the issue of whether the table reveals something about social reality, or creates a particular way of thinking about reality.
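
Step (g) can be made concrete with a small check of which way a percentage table adds up. The sketch below, in Python with illustrative names and an assumed rounding tolerance of 1.5 percentage points, applies the check to the figures of Table 19.2(b).

```python
def sums_to_100(values, tolerance=1.5):
    """True if the values add up to 100 within a rounding tolerance."""
    return abs(sum(values) - 100) <= tolerance


def percentage_direction(table):
    """Report whether rows, columns, both or neither add up to 100%."""
    rows_ok = all(sums_to_100(row) for row in table)
    cols_ok = all(sums_to_100(col) for col in zip(*table))
    if rows_ok and cols_ok:
        return "both"
    if cols_ok:
        return "columns"
    if rows_ok:
        return "rows"
    return "neither"


# Table 19.2(b): rows are tenure, columns are social class.
table_b = [
    [60, 40, 3],   # Owner
    [35, 35, 45],  # Private, rented
    [5, 25, 52],   # Council, rented
]
direction = percentage_direction(table_b)
print(direction)
```

Knowing the direction matters for interpretation: column percentages here describe the tenure of people within each social class, not the class composition of each tenure type.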