Imputing the number of incidents
Date imputation
Relevance imputation
Offence code imputation
Duplicated incidents
Heavy victimisation cutoff
Comparable imputations
Because some respondents are heavily victimised, a proportion of incidents will have no victim forms. In fact, most incidents have none: victim forms were completed for only 33% of the incidents reported in the NZCASS. Most of the data collected in victim forms were analysed using the incident weights described in Chapter 8, which provide an effective method of analysis when each victim form can be viewed in isolation. However, direct calculation of incidence and prevalence rates required information about all the incidents experienced[31] by each survey participant, including three items collected on each victim form: whether the incident occurred during the year 2005; whether the incident was an offence within the scope of the survey (termed being relevant); and which detailed offence codes the incident fell under. Another item that can now be derived from victim form data (subject to some assumptions) is how likely the incident was to have been reported a second time at another screener question.
Information on these items was required for all incidents, including the 67% of incidents without victim forms, to enable the direct calculation of incidence and prevalence figures. However, these data are missing for the incidents without victim forms, so some form of imputation[32] was needed. The process of imputation also affects the sampling error of the results, although for some imputation methods the size of this effect is hard to quantify. Multiple imputation (Rubin, 1987) was used in the NZCASS to quantify this effect, via Lumley’s (2004) ‘mitools’ package. Ten imputations were used throughout.
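The way multiple imputation carries imputation uncertainty into the sampling error can be illustrated with Rubin's (1987) combining rules. The following is a minimal Python sketch, not the survey's own code (which used the 'mitools' package in R). The function name `pool_rubin` and the example figures are hypothetical; the sketch assumes each of the ten imputed datasets has already yielded a point estimate and a within-imputation variance.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Combine completed-data results using Rubin's (1987) rules.

    estimates        -- one point estimate per imputed dataset
    within_variances -- the sampling variance of each of those estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need matching results from at least two imputations")
    q_bar = mean(estimates)            # pooled point estimate
    w_bar = mean(within_variances)     # average within-imputation variance
    b = variance(estimates)            # between-imputation variance (m - 1 divisor)
    total = w_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, total


# Hypothetical results from ten imputations (illustrative values only).
estimates = [0.30, 0.32, 0.31, 0.29, 0.33, 0.30, 0.31, 0.32, 0.30, 0.32]
within = [0.0004] * 10
pooled, total_var = pool_rubin(estimates, within)
print(pooled, total_var ** 0.5)
```

The between-imputation term is what a single imputation cannot supply: it measures how much the estimate moves when the missing items are filled in differently.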
Imputing the number of incidents
If the number of incidents at any screener question was missing (i.e. the respondent said "Don't know / Can't remember" or "Don't wish to answer"), a value of 1 was imputed. In other words, it was assumed that the respondent had been a victim of exactly one incident. This is likely to be a poor assumption in many of these cases, but it was hoped that it provided a reasonable compromise between overcounting for those who were not victims and undercounting for those who experienced more than one such incident. It was suspected, however, that most of these responses would be from victims, since non-victims would presumably not have trouble remembering the answer and would have less reason to be averse to answering the question. This suggests that the approach used here probably underestimates the true level of victimisation, and that other common imputation methods (such as a random hot-deck) would suffer from a similar problem.[33] Hot-deck imputation would also introduce more variability. The primary reasons for imputing a value of 1 were that this approach was used in the 2001 survey, and that no clearly superior method was identified.
The number of incidents was missing in 100 places for the screener questions in the main questionnaire, coming from 72 respondents. The missing values were not uniformly distributed across these 15 screener questions; roughly a third of them (35) affected Q31 (attempted break-ins), accounting for 10% of the positive responses to this screener question. Missing information was more common in the self-completion sections, with 281 missing values accounting for approximately one third of the 837 non-negative responses. Missing information was most prevalent at the sexual victimisation screener questions, where slightly more respondents failed to provide information than provided a specific positive number of incidents. Even within this section there was substantial variation between the screener questions, with missing information being twice as likely as complete positive information for the first screener question (which asked about forced sexual intercourse, the most serious crime covered in this section).
If mean imputation or random hot-deck imputation had been used instead of imputing a value of 1, still assuming that all the missing responses were from victims (and restricting the mean or the donor pool to victims accordingly), this would have roughly doubled the estimated incidence of rape. The estimated prevalence would also have been somewhat higher, though not by as much. The estimated incidence and prevalence of all offences would have increased slightly.
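The difference between the two approaches can be seen in a small Python sketch. The function `impute_missing_counts` and the example counts are hypothetical and purely illustrative; they do not reproduce the survey figures. Missing counts are represented by `None`, and the alternative method fills them with the mean count among respondents who reported at least one incident.

```python
def impute_missing_counts(counts, method="one"):
    """Fill missing incident counts (None) at one screener question.

    method="one"         -- impute a single incident (the approach described above)
    method="victim_mean" -- impute the mean count among reporting victims
    """
    victims = [c for c in counts if c is not None and c > 0]
    if method == "one":
        fill = 1
    elif method == "victim_mean":
        if not victims:
            raise ValueError("no victims with a known count to average over")
        fill = sum(victims) / len(victims)
    else:
        raise ValueError("unknown method: " + method)
    return [fill if c is None else c for c in counts]


# Hypothetical responses: two respondents gave no count.
counts = [0, 0, 1, 3, None, None, 2]
print(sum(impute_missing_counts(counts, "one")))          # 8
print(sum(impute_missing_counts(counts, "victim_mean")))  # 10.0
```

When missing counts are common relative to known positive counts, as at the sexual victimisation screeners, the choice of fill value has a correspondingly larger effect on the estimated incidence.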
Another cause of missing information for the self-completion screener questions was the complete refusal of some 6% of respondents to answer the self-completion questionnaire. No overt imputation has been conducted to correct for this, i.e. it is effectively assumed that these people experienced no offences of the types covered by the self-completion screener questions. This will have led to underestimation of the true victimisation rates for these offence types, although the bias will not have been large due to (1) the small level of self-completion non-response, and (2) its skew towards older respondents.
Date imputation
For each incident without a victim form from the main screeners, the calendar year in which the incident occurred was imputed randomly, assuming that it had an equal chance of occurring on each day between 1 January 2005 and the date of the interview. That is, the year each incident occurred was imputed as being in 2005 with probability equal to 365 divided by the number of days between 1 January 2005 and the date of the interview. This was done independently across incidents and for each of the ten imputations conducted per incident. This is the same method as used in the 1996 and 2001 surveys.
For self-completion incidents, the same method was used, except when the incident with date information from that section occurred during 2005. Since that incident was the last in that section to have occurred, all the others must have occurred earlier, and were therefore imputed as occurring during 2005. This is the same method as used for the first self-completion section in the 2001 survey (dates were not gathered in the other sections). The assumption of even spread is not ideal even when the last incident occurred in 2006, because knowing this provides some additional information about when the other incidents are likely to have occurred, but the 2001 method has been continued for consistency.
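The date rules described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the survey's code: the function names are hypothetical, the day count is taken as the simple difference between the interview date and 1 January 2005 (whether the original count was inclusive is not stated), and the probability is capped at 1.

```python
import random
from datetime import date

PERIOD_START = date(2005, 1, 1)


def prob_in_2005(interview_date):
    """Probability that an evenly spread incident falls in calendar year 2005."""
    days = (interview_date - PERIOD_START).days
    if days <= 0:
        raise ValueError("interview must take place after 1 January 2005")
    return min(1.0, 365 / days)


def impute_in_2005(interview_date, n_imputations=10, rng=None,
                   last_incident_in_2005=False):
    """Return one True/False draw per imputation for a single incident.

    last_incident_in_2005 applies the self-completion rule: when the dated
    (last) incident in the section occurred in 2005, every draw is True.
    """
    rng = rng or random.Random()
    if last_incident_in_2005:
        return [True] * n_imputations
    p = prob_in_2005(interview_date)
    return [rng.random() < p for _ in range(n_imputations)]


# An interview on 1 March 2006 is 424 days after 1 January 2005.
print(prob_in_2005(date(2006, 3, 1)))   # 365 / 424
print(impute_in_2005(date(2006, 3, 1), rng=random.Random(1)))
```

Each incident receives its own independent set of draws, so two incidents reported by the same respondent can be assigned to different years within the same imputation.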
The assumption of even spread also does not account for recall bias. An investigation of the known incident dates (described in Appendix A6) suggested that this is likely to have had a substantial effect on the victimisation risk estimates from this survey, and even stronger effects in the previous surveys.[34] However, no easy method of correcting for this has been apparent.
Relevance imputation
Different types of offences have widely varying relevance rates (and varying proportions of missing data). These are shown in the following tables, broken down by source question (i.e. the screener question at which that incident was enumerated). Here "relevant" means having an offence code other than 85, 86 or 87.
Table 9.1 Missing forms and relevance rates by main screener question
| Source | Description | Percentage of … | Relevance rate (for … |
|---|---|---|---|
| Q28 | Vehicle Theft | 39 | 87 |
| Q29 | Theft from Vehicle | 34 | 90 |
| Q30 | Damage to Vehicle | 37 | 72 |
| Q31 | Attempt to break in | 42 | 63 |
| Q32 | Burglary | 54 | 90 |
| Q34 | Theft from property | 49 | 84 |
| Q35 | Theft from inside home | 53 | 87 |
| Q35.41 | Household damage | 42 | 84 |
| Q35.40 | Theft from person | 40 | 80 |
| Q35.41 | Other theft | 47 | 64 |
| Q36 | Assault | 46 | 81 |
| Q37 | Threat of assault | 55 | 66 |
| Q38 | Other Damage | 48 | 55 |
| Q39 | Threat of damage | 79 | 51 |
| Q43 | Other Incidents (Main Questionnaire) | 68 | 38 |
| Q167.420 | Assault (by current partner) | 88 | 92 |
| Q167.422 | Threat of assault (by current partner) | | |
| Q167.424 | Damage (by current partner) | | |
| Q167.426 | Threat of damage (by current partner) | | |
| Q228 | Assault (by person well known) | 86 | 90 |
| Q230 | Threats (by person well known) | | |
| Q232 | Damage (by person well known) | | |
| Q234 | Threat of damage (by person well known) | | |
| Q434 | Forced sexual intercourse | 84 | -[36] |
| Q436 | Attempted forced sexual intercourse | | |
| Q438 | Distressing sexual touching | | |
| Q440 | Other sexual violence, incl. threats | | |
The relevance rates for the main screeners vary from 38% to 90%, so the screener question should be a useful predictor of relevance status. An imputation model was chosen by stepwise selection, starting with a model that included (for incidents from the main questionnaire) the screener question, household composition, household size, tenure/landlord, gender, age, marital status, employment status, ethnicity, urbanisation, NZSEI, and the NZDep2001 score. This reduced to a model with the screener question, age, ethnicity, household size, tenure/landlord, NZSEI, and the NZDep2001 score as predictors. Details of the model are shown in Table 9.2. According to the le Cessie-van Houwelingen normal test statistic, there was some lack of fit (Z = -2.5), but once the complex sample design is allowed for, this is unlikely to be statistically significant. (The associated Brier score was 0.16, while the Somers' D and gamma statistics were both 0.41.) This model was used to multiply impute relevance status for incidents from the main screener questions without victim forms.
Table 9.2 Relevance imputation for incidents from main screeners
| Predictor variable | Level (relative to base level, … | Parameter estimate | Standard error |
|---|---|---|---|
| (Intercept) | | -0.6001 | 0.5819 |
| Screener | Things stolen from/off vehicle | -0.3883 | 0.2880 |
| Screener | Vehicle tampering/damage | 0.8152 | 0.2599 |
| Screener | Unsuccessful burglary | 1.3279 | 0.2723 |
| Screener | Successful burglary | -0.3430 | 0.3189 |
| Screener | Theft from property-outside | 0.1737 | 0.2690 |
| Screener | Theft from property-inside | 0.0304 | 0.3014 |
| Screener | Theft from a person | 0.5388 | 0.3311 |
| Screener | Other theft | 1.3038 | 0.2695 |
| Screener | Damage | 1.6720 | 0.3278 |
| Screener | Threatened to damage | 1.8636 | 0.3776 |
| Screener | Assault | 0.5180 | 0.2945 |
| Screener | Threatened to assault | 1.2535 | 0.2632 |
| Screener | Other | 2.2848 | 0.2828 |
| Screener | Damage to HH property | 0.1623 | 0.2683 |
| Age | 25-39 | 0.3451 | 0.1109 |
| Age | 40-59 | 0.3899 | 0.1158 |
| Age | 60 or older | 0.4574 | 0.1551 |
| NZSEI | | -0.0050 | 0.0025 |
| NZDep01 | | -0.0011 | 0.0005 |
| European | | -0.3469 | 0.0935 |
| Māori | | -0.1925 | 0.0935 |
| Household size | | 0.0962 | 0.0456 |
| Tenure and landlord | Rented - private landlord | -0.1863 | 0.0910 |
| Tenure and landlord | Rented - other landlord | -0.4400 | 0.1324 |
| Tenure and landlord | Other | 0.3002 | 0.2126 |
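A model of this kind is typically used for imputation by converting each incident's linear predictor into a probability and then drawing a binary status for each imputation. The Python sketch below illustrates the idea under several assumptions that are not confirmed by this chapter: that the model is a logistic regression, that coefficients are held fixed across the ten imputations, and that covariates left out of the call sit at their base level. It uses only a handful of the Table 9.2 estimates, the covariate values are hypothetical, and because the chapter does not state which outcome level the model codes as 1, the sketch refers only to "the modelled outcome".

```python
import math
import random

# A subset of the Table 9.2 estimates; the dictionary keys are hypothetical names.
COEFFICIENTS = {
    "intercept": -0.6001,
    "age_25_39": 0.3451,
    "household_size": 0.0962,
    "nzsei": -0.0050,
    "nzdep01": -0.0011,
    "rented_private": -0.1863,
}


def linear_predictor(covariates):
    """Intercept plus coefficient * value for each supplied covariate."""
    eta = COEFFICIENTS["intercept"]
    for name, value in covariates.items():
        eta += COEFFICIENTS[name] * value
    return eta


def outcome_probability(covariates):
    """Inverse-logit of the linear predictor (assumes a logistic model)."""
    return 1 / (1 + math.exp(-linear_predictor(covariates)))


def draw_imputations(covariates, n_imputations=10, rng=None):
    """One Bernoulli draw of the modelled outcome per imputation."""
    rng = rng or random.Random()
    p = outcome_probability(covariates)
    return [rng.random() < p for _ in range(n_imputations)]


# Hypothetical incident: respondent aged 25-39, three-person household,
# private rental, with illustrative NZSEI and NZDep01 scores.
incident = {"age_25_39": 1, "household_size": 3, "rented_private": 1,
            "nzsei": 40, "nzdep01": 1000}
print(outcome_probability(incident))
print(draw_imputations(incident, rng=random.Random(0)))
```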
For self-completion incidents, the source screener question for the last incident was not collected. It was imputed randomly within each section, with probability proportional to the number of incidents reported at each screener question. The same process was then used to choose a relevance imputation model, with the same candidate variables (except for the screener questions). Certain screener questions (mainly sexual) had no variation in relevance status, and were omitted from the screener variable.
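Drawing the source screener with probability proportional to the reported counts can be sketched as follows. The function name and the example counts are hypothetical; the question labels are taken from Table 9.1 only for illustration.

```python
import random


def impute_source_screener(counts_by_screener, rng=None):
    """Pick a source screener with probability proportional to its incident count."""
    rng = rng or random.Random()
    screeners = [q for q, n in counts_by_screener.items() if n > 0]
    if not screeners:
        raise ValueError("no incidents were reported in this section")
    weights = [counts_by_screener[q] for q in screeners]
    return rng.choices(screeners, weights=weights, k=1)[0]


# Hypothetical respondent: one incident at Q434, none at Q436, three at Q438.
section_counts = {"Q434": 1, "Q436": 0, "Q438": 3}
print(impute_source_screener(section_counts, rng=random.Random(2)))
```

In this example Q438 would be chosen three times out of four on average, and Q436 never.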
Offence code imputation
Offence codes were imputed using a "hot deck" imputation method (the approximate Bayesian bootstrap[37]), with imputation classes defined by source screener question. This technique will reproduce the distribution of offence codes from each screener, on average. In contrast, mode imputation was used in the 1996 and 2001 surveys. The old technique would have depressed the estimated rates for offences like bicycle theft that do not have dedicated screener questions, and overstated the rates for other offences that did have their own screener question.
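The approximate Bayesian bootstrap can be sketched in Python as a two-stage draw within each imputation class. This is a generic illustration of the technique, not the survey's implementation; the function names and the offence code labels ("A", "B", "C") are hypothetical.

```python
import random


def abb_impute(donor_codes, n_missing, rng=None):
    """Approximate Bayesian bootstrap for one imputation class.

    Stage 1: resample the observed donors with replacement to form a pool
             of the same size as the donor set.
    Stage 2: draw the imputed values with replacement from that pool.
    """
    rng = rng or random.Random()
    if not donor_codes:
        raise ValueError("imputation class has no donors")
    pool = rng.choices(donor_codes, k=len(donor_codes))
    return rng.choices(pool, k=n_missing)


def impute_by_class(observed, missing_counts, n_imputations=10, seed=0):
    """Repeat the draw for every class (source screener) and every imputation."""
    rng = random.Random(seed)
    return [
        {q: abb_impute(observed[q], n, rng) for q, n in missing_counts.items()}
        for _ in range(n_imputations)
    ]


# Hypothetical donors: offence codes observed on victim forms, by source screener.
observed = {"Q28": ["A", "A", "B"], "Q35.41": ["B", "C", "C", "C"]}
missing = {"Q28": 2, "Q35.41": 3}
imputations = impute_by_class(observed, missing)
print(imputations[0])
```

The first-stage resampling is what distinguishes this from a simple random hot deck: it lets the donor pool itself vary between imputations, so that uncertainty about the distribution of offence codes is reflected in the spread of results.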
Duplicated incidents
A new set of questions was added early in the victim form for the 2006 survey to establish whether the current incident was actually the same as one covered by a previous victim form. If so, the respondent had included the same incident in their reported counts at two different screener questions, and the rest of the victim form was skipped. However, these questions only detect duplication between the three (or fewer) incidents for which victim forms were completed. By estimating the rate of duplication per potential clash, and assuming independence, unobserved duplications were imputed for other incidents. The estimated duplication rate per potential clash was 3.75%, and approximately 15% of incidents from the main screeners with no victim form were projected to be duplicates. These were deleted to help avoid over-reporting through failure to follow the "apart from…" instructions.
These new questions were not added to the self-completion questionnaire, so there is no data on duplications here. The rate from main screener incidents was applied to all self-completion incidents, and around 17% were estimated to be duplicates.
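One way to read the independence assumption is that an incident with k potential clashes is a duplicate with probability 1 - (1 - r)^k, where r is the duplication rate per potential clash. The Python sketch below illustrates that reading only; the exact definition of a potential clash and the form of the calculation used in the survey are not given in this chapter, and the function names are hypothetical.

```python
import random

RATE_PER_CLASH = 0.0375  # estimated duplication rate per potential clash


def prob_duplicate(n_potential_clashes, rate=RATE_PER_CLASH):
    """Chance of at least one duplication across independent potential clashes."""
    return 1 - (1 - rate) ** n_potential_clashes


def impute_duplicate(n_potential_clashes, n_imputations=10, rng=None):
    """One True/False duplicate flag per imputation for a single incident."""
    rng = rng or random.Random()
    p = prob_duplicate(n_potential_clashes)
    return [rng.random() < p for _ in range(n_imputations)]


for k in (1, 2, 4, 8):
    print(k, round(prob_duplicate(k), 4))
```

Under this reading, an incident with around four potential clashes would have a duplication probability of about 14%, which is of the same order as the overall 15% projected for main screener incidents without victim forms.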
Heavy victimisation cutoff
After imputation, a cut-off was applied to improve the reliability of the estimated rates. The number of valid offences (after removing out-of-scope or duplicated incidents, and those not in 2005) from the main questionnaire was not allowed to exceed 30. Any further offences above this value were not included in the victimisation estimates. The same cut-off was applied independently to incidents from the self-completion components.[38]
No cut-off was applied in 2001, but the introduction of this cut-off was partly prompted by the easing in 2006 of controls on how many incidents could be reported at each screener question.
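The cut-off amounts to capping each respondent's count of valid offences at 30, separately for the main questionnaire and the self-completion components. A minimal Python sketch, with hypothetical function names and counts:

```python
CUTOFF = 30  # maximum number of valid offences counted per respondent and component


def apply_cutoff(valid_offence_count, cutoff=CUTOFF):
    """Cap the number of valid offences contributing to the estimates."""
    return min(valid_offence_count, cutoff)


def capped_counts(main_count, self_completion_count):
    """Apply the cap independently to the two components."""
    return apply_cutoff(main_count), apply_cutoff(self_completion_count)


# Hypothetical respondent with 45 valid main-questionnaire offences
# and 12 valid self-completion offences in one imputation.
print(capped_counts(45, 12))  # (30, 12)
```

Because relevance, year and duplication status are imputed, the number of valid offences, and hence the effect of the cap, can differ between imputations for the same respondent.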
Comparable imputations
Date, offence code, and relevance imputations were repeated for the 2006 data using 2001 procedures and offence codes, and a new set of results produced. These were meant solely for comparison purposes, since many of the procedures have been improved upon.
Footnotes
31 Strictly speaking, this is most critical for prevalence rates, since incidence figures could be calculated directly from the incident weights. These figures would be less reliable than those based on imputation, however, and since imputed values have been produced to enable calculation of prevalence rates, it makes sense to use the same values to calculate incidence rates as well.
32 Imputation is a commonly used remedy for missing data, which involves filling in the missing values with allowable values for the variable in question. Many imputation methods have been devised; for an overview, see Seastrom et al. (2002) at http://nces.ed.gov/statprog/2002/appendixb3.asp.
33 This assumes the donor pool would consist of all respondents with complete data for that screener question. Another possibility is to restrict the donor pool to those reporting some incidents at that screener, which would probably then err in the other direction.
34 Similar patterns of bias have been observed in the 2005 Irish International Crime Survey (van Dijk, Manchin, van Kesteren, Nevala and Hideg 200; pages 9-11).
35 Questionnaire numbering: Q416 followed Q35; hence it is referred to as Q35.416. This protocol is repeated throughout the survey documentation.
36 Incident descriptions were not collected for any sexual incidents. All the last incidents from this section were coded as valid offences, and the relevance imputation method used here will apply this procedure to the rest of these incidents. This is different from the 2001 method, where it was assumed that these incidents had similar relevance rates to other incidents from the self-completion questionnaire.
38 Averaged across imputations, this cut-off ruled out 5.5% of incidents from the main questionnaire that would otherwise have been counted. These came from 22 respondents. The cut-off had a greater effect on self-completion incidents, ruling out 17.5% of these on average (again restricting consideration just to those incidents that would otherwise have been counted). These came from just 25 respondents, however.