Data Centre Serbia

Data Centre Serbia for Social Sciences

Institute of Economic Sciences

Data Documentation

Data documentation explains how data were collected (the context of data collection), what they mean, what their content and structure are, and which manipulations, if any, have been applied to them. Good documentation should be considered best practice when creating, organising, and managing data, and is equally important for data preservation. Data files must be clean and clear, i.e., interpretable and searchable by other researchers.

The data should be accompanied by adequate documentation that explains the context of data collection (history, aims, and hypotheses), the methodology, and any other material relevant to the data. This includes not only general information about the study itself (metadata), compatible with international standards such as DDI so that the metadata can be shared, but also as much supporting material as possible to make the study suitable for secondary use. Such accompanying material could include:

  • Methodological reports
  • Codebooks
  • Questionnaires
  • Coding instructions
  • Interviewer guides
  • Database dictionaries
  • Bibliographies of publications related to the data
  • Links to online tools

Study level

Study-level metadata is a core set of information about the conducted research. It describes the research context, data collection methods, preparation of data for analysis, analytical methods, and summaries of results. Study-level documentation should be in line with the DDI (Data Documentation Initiative) metadata standard, which is used for describing social science data, and should include the following:

  • The context of data collection: project history, aims, objectives, and hypotheses
  • Data collection methods: data collection protocols, sampling design, instruments used, hardware and software used, data scale and resolution, temporal coverage and geographic coverage, and digitization or transcription methods
  • Structure of data files, number of cases, records, variables and relationships between files
  • Data sources used and provenance of materials, e.g., for transcribed or derived data
  • Data validation, checking, proofing, cleaning and other quality assurance procedures carried out, such as checking for equipment and transcription errors, calibration procedures, data capture resolution and repetitions, or editing, proofing or quality control of materials
  • Modifications made to data over time since their original creation and identification of different versions of datasets
  • For time series or longitudinal surveys, changes made to methodology, variable content, question text, variable labeling, measurements or sampling
  • Information on data confidentiality, access, and use conditions, where applicable
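
As a rough illustration of the checklist above, the following Python sketch stores study-level information in a small record and reports which sections are still empty. The class name, field names, and example values are hypothetical and chosen only for illustration; they are not DDI element names, and a real deposit would map them onto the DDI schema.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class StudyMetadata:
    # Hypothetical field names that loosely follow the checklist above;
    # they are NOT DDI element names.
    title: str
    context: str = ""                # project history, aims, hypotheses
    collection_methods: str = ""     # protocols, sampling design, instruments
    file_structure: str = ""         # cases, variables, relationships between files
    data_sources: str = ""           # provenance of materials
    quality_assurance: str = ""      # validation, cleaning, proofing
    version_history: list = field(default_factory=list)
    access_conditions: str = ""      # confidentiality, access, and use conditions


def missing_sections(record):
    """Return the names of the fields that have been left empty."""
    return [name for name, value in asdict(record).items() if not value]


# Invented example study, for illustration only.
study = StudyMetadata(
    title="Example household survey",
    context="Pilot study of household spending (hypothetical example).",
    collection_methods="Face-to-face interviews, stratified random sample.",
)

print(missing_sections(study))
```

A check of this kind could be run before deposit so that gaps in the documentation are noticed while the research team can still fill them.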

Dataset level

Dataset-level metadata can be embedded in the data files themselves, as variable and code descriptions in databases or data tables, or as headers in transcript files. Such information includes:

  • Names, labels and descriptions for variables, records and their values
  • Explanation of codes and classification schemes used
  • Codes of, and reasons for, missing values
  • Derived data created after collection, with code, algorithm or command file used to create them
  • Weighting and grossing variables created and how they should be used
  • Data list describing cases, individuals or items studied, for example for logging qualitative interviews
  • Additional information about the dataset could be recorded in an external structured document if needed
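
A minimal sketch of how such dataset-level information might be held in machine-readable form is shown below. The variable names, codes, and missing-value reasons are invented for illustration, and the dictionary layout is only one possible assumption, not a prescribed format.

```python
# Hypothetical codebook: variable labels, value labels, and documented
# missing-value codes with their reasons.
CODEBOOK = {
    "sex": {
        "label": "Sex of respondent",
        "values": {1: "Male", 2: "Female"},
        "missing": {-9: "Refused"},
    },
    "employ": {
        "label": "Employment status",
        "values": {1: "Employed", 2: "Unemployed", 3: "Inactive"},
        "missing": {-9: "Refused", -8: "Not applicable"},
    },
}


def decode(variable, code):
    """Translate a stored code into its documented meaning."""
    entry = CODEBOOK[variable]
    if code in entry["missing"]:
        return f"missing ({entry['missing'][code]})"
    return entry["values"][code]


def undocumented_codes(rows):
    """List (row index, variable, code) for values absent from the codebook."""
    problems = []
    for index, row in enumerate(rows):
        for variable, code in row.items():
            entry = CODEBOOK.get(variable)
            if entry is None or (
                code not in entry["values"] and code not in entry["missing"]
            ):
                problems.append((index, variable, code))
    return problems


# Invented data rows; the third row contains an undocumented code for "sex".
rows = [
    {"sex": 1, "employ": 2},
    {"sex": 2, "employ": -8},
    {"sex": 3, "employ": 1},
]

print(decode("employ", -8))
print(undocumented_codes(rows))
```

Keeping missing-value codes separate from substantive codes, as in this sketch, lets a secondary user distinguish "refused" from "not applicable" instead of treating both as blank.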

