
Data Processing and Management


Computer-Assisted Data Entry

CATI technology may also be used to enter data from source documents (e.g., self-administered questionnaires, records, and interview forms) directly into computer files. It is often possible to combine coding, data entry, and data cleaning into a single operation, bypassing the traditional paper-and-pencil coding and data entry procedures. Computer-assisted data entry on SRC's UNIX operating system has proven less costly than the traditional methods, and it generally produces data files and tapes more rapidly and efficiently.

For more information on computer-assisted data entry, contact Robert L. McCarthy, Manager of Data Services, (510) 642-6596, bobmc@berkeley.edu.


Coding

The Center's data management staff prepares and edits completed interviews, questionnaires, and related documents for computer processing. Its services are available to investigators making separate arrangements for fieldwork as well as to those utilizing the Center's field staff. These services include:

  • laying out and precoding of data collection instruments prior to printing in order to facilitate efficient processing;
  • editing and coding of both check-list and open-ended questions and related documents in preparation for data entry;
  • coding of occupations, industries, and geographic areas to U.S. Census classifications;
  • constructing codes for open-ended survey questions, depth interviews, and other qualitative materials;
  • preparing study codebooks and variable labels to facilitate analysis; and
  • identifying and correcting data errors by reference to source documents.

Since coding costs vary greatly depending on the layout and extent of precoding of survey instruments, clients are urged to consult with the coding staff prior to final drafting of their data collection instruments.

For more information on coding, contact Robert L. McCarthy, Manager of Data Services, (510) 642-6596, bobmc@berkeley.edu.


Data Cleaning, File Construction & Codebook Preparation

Once data have been entered into computer files, SRC staff can assist users in checking, cleaning, recoding, and labeling the data prior to analysis. Such assistance may include the following:

  • running edit programs to check for identification numbers, to identify wild codes, and to resolve inconsistencies;
  • preparing analysis files with appropriate variable and category labels, missing data specifications, and weights;
  • creating new variables by arithmetic, logical, scaling, and recoding operations;
  • combining data from linked or nested files into composite files, e.g., assigning Census tract-level variables to households or individuals within tracts;
  • updating master files and producing routine reports, summary descriptive statistics, and preliminary tabulations; and
  • preparing full documentation on the structure, content, and marginal distributions of datasets, including Microsoft Word and PDF versions of codebooks.
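As an illustration only, the first and fourth items in the list above (edit checks and combining nested files) might look roughly like the following Python sketch. It is not the Center's actual software: the field names (case_id, tract, sex, employed), the valid-code lists, and the sample records are all hypothetical values made up for the example.

```python
from collections import Counter

# Hypothetical valid-code lists; a real study would take these from its codebook.
VALID_CODES = {"sex": {1, 2, 9}, "employed": {0, 1, 9}}


def run_edit_checks(records, id_field="case_id", valid_codes=VALID_CODES):
    """Return (case_id, field, problem) tuples found in one pass over the data."""
    problems = []
    id_counts = Counter(rec.get(id_field) for rec in records)
    for rec in records:
        case_id = rec.get(id_field)
        # Identification-number checks: missing or duplicated IDs.
        if case_id is None:
            problems.append((case_id, id_field, "missing identification number"))
        elif id_counts[case_id] > 1:
            problems.append((case_id, id_field, "duplicate identification number"))
        # Wild-code checks: any value outside the allowed set for that field.
        for field, allowed in valid_codes.items():
            if rec.get(field) not in allowed:
                problems.append((case_id, field, f"wild code {rec.get(field)!r}"))
    return problems


def attach_tract_variables(households, tracts, key="tract"):
    """Copy tract-level variables onto each household record within that tract."""
    merged = []
    for household in households:
        tract_vars = tracts.get(household[key], {})
        merged.append({**household, **tract_vars})
    return merged


# Made-up records for illustration only.
households = [
    {"case_id": 101, "tract": "4001", "sex": 1, "employed": 1},
    {"case_id": 102, "tract": "4001", "sex": 3, "employed": 0},
    {"case_id": 102, "tract": "4002", "sex": 2, "employed": 9},
]
tracts = {
    "4001": {"tract_median_income": 52000},
    "4002": {"tract_median_income": 61000},
}

problems = run_edit_checks(households)
composite = attach_tract_variables(households, tracts)
```

In this sketch the flagged problems would still be resolved by reference to the source documents; the code only locates candidate errors.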

The final deliverables each client should expect are: (1) a clean, raw data file in the format desired; (2) a data definition file describing the raw data; and (3) a codebook containing variable names, titles, category labels, and question text. Typically, these codebooks also show the frequency distribution of all variables as well as summary statistics. Our codebook software also yields a useful byproduct: data definition files for three of the more popular statistical packages, i.e., SAS, SPSS, and Stata. Clients should be able to begin analysis as soon as they receive these deliverables. Codebooks and data files may also be used as interim reports. Our staff will prepare and deliver them by the requested dates.
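To give a rough idea of what a codebook entry with a frequency distribution involves, here is a minimal Python sketch. The function name codebook_entry, the layout of the entry, the variable Q1, and the sample values are all hypothetical; this is not the format the Center actually produces.

```python
from collections import Counter


def codebook_entry(name, title, labels, values):
    """Format one variable as text: name, title, and a frequency table by code."""
    counts = Counter(values)
    total = len(values)
    lines = [f"{name}: {title}"]
    for code in sorted(counts):
        label = labels.get(code, "(unlabeled)")
        percent = 100 * counts[code] / total
        lines.append(f"  {code:>3}  {label:<12} {counts[code]:>5} {percent:6.1f}%")
    return "\n".join(lines)


# Made-up variable and responses for illustration only.
entry = codebook_entry(
    "Q1",
    "Respondent sex",
    {1: "Male", 2: "Female", 9: "Missing"},
    [1, 2, 2, 9, 2],
)
print(entry)
```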

For further information on data cleaning, file construction, and codebook preparation, contact Robert L. McCarthy, Manager of Data Services, (510) 642-6596, bobmc@csm.berkeley.edu.


Data Analysis and Computing Services

Staff members can assist clients in the statistical analysis of their data. In the planning stages of a study, staff members can help frame analysis strategies to guide the choice of data collection procedures. Once data are ready for analysis, staff members can advise on or carry out specific analyses, including the following:

  • data reduction techniques to build indices for subsequent study, e.g., scale construction, clustering, and factor analysis;
  • tabulation techniques for both investigation and final presentation;
  • graphic techniques for picturing univariate distributions, time series, geographic occurrence, and more complex data structures, e.g., bar charts, scatter plots, and maps;
  • estimation of standard errors for sample statistics from complex samples; and
  • multivariate techniques for quantitative and categorical variables, e.g., linear regression and analysis of variance, log-linear analysis of cross-classifications, logistic regression for binary and categorical dependent variables, and structural equation estimation.
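The estimation of standard errors from complex samples, mentioned in the list above, can be illustrated with its simplest case: a stratified simple random sample drawn without replacement. The Python sketch below uses the textbook formula for that design. It is an assumption-laden example, not the Center's procedure: clustered or multistage designs need other methods, and the strata, population sizes, and values are made up.

```python
import math
from statistics import mean, variance


def stratified_mean_and_se(samples, population_sizes):
    """Estimate a population mean and its standard error.

    samples: {stratum: [observed values]}, at least two values per stratum.
    population_sizes: {stratum: N_h}, the number of units in each stratum.
    Assumes simple random sampling without replacement within each stratum.
    """
    total_population = sum(population_sizes.values())
    estimate = 0.0
    estimate_variance = 0.0
    for stratum, values in samples.items():
        n_h = len(values)
        big_n_h = population_sizes[stratum]
        weight = big_n_h / total_population   # W_h = N_h / N
        fpc = 1 - n_h / big_n_h               # finite population correction
        estimate += weight * mean(values)
        # Stratum contribution: W_h^2 * (1 - n_h/N_h) * s_h^2 / n_h
        estimate_variance += weight ** 2 * fpc * variance(values) / n_h
    return estimate, math.sqrt(estimate_variance)


# Made-up strata and observations for illustration only.
samples = {"urban": [2, 4, 6], "rural": [1, 3]}
population_sizes = {"urban": 300, "rural": 100}

estimate, standard_error = stratified_mean_and_se(samples, population_sizes)
```

Ignoring the strata and treating the five observations as one simple random sample would generally give a different standard error, which is why the sample design has to be carried into the analysis.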

Experience with a wide variety of packaged computer programs enables Center staff to help link clients' analytic goals to the capabilities of available hardware and software. In conjunction with data analysis work, Center staff can also prepare written technical summaries of methods used or full reports on study results.

For further information on data analysis services offered by the Survey Services Facility, contact Thomas L. Piazza, Senior Survey Statistician, piazza@csm.berkeley.edu or Yuteh Cheng, Manager of Technical Services, yuteh@hobbes.berkeley.edu.


Last modified: 4 February 2008