The Juvenile Justice Professional's Guide to
Human Subjects Protection and the IRB Process
Confidentiality of Secondary Youth Information


The IRB reviews proposals for research projects involving previously collected youth information to ensure that the confidentiality of identifiable private information is maintained.

The original data collectors and juvenile justice researchers who utilize the previously collected youth data have legal and ethical obligations to protect these youth, especially when the information contains identifiable private information.

When researchers collect information from or about juvenile offenders, it is essential that the research protocol include explicit procedures to prevent breaches of confidentiality. This is critical not only when the researcher acquires information directly from youths (i.e., primary data collection), but also when the researcher utilizes previously collected youth data (secondary data) containing information that can be used, either alone or in combination with other information, to identify youth.

Title 28 Code of Federal Regulations Part 22 (28 CFR 22), Confidentiality of Identifiable Research and Statistical Information, is a Federal regulation that governs the use and release of research and statistical information identifiable to a private person. All recipients of funding from the Federal government are subject to the regulations. The regulation addresses the assurance of confidentiality of youth data and details the criteria that must be met and the procedures that researchers must implement to ensure that the confidentiality of identifiable private information is maintained. The purpose of these regulations is to:
  • Protect individuals’ privacy
  • Increase the credibility and reliability of federally supported research and statistics
  • Provide guidance to persons engaged in research on the use of identifiable information
  • Ensure the appropriate balance between individual privacy and essential research needs to advance knowledge in criminal justice
  • Prevent the use of research and statistical information for judicial proceedings
The essence of 28 CFR 22 is that agencies that release youth data files and researchers who utilize them are subject to all of the regulatory requirements of this code that govern the use and release of research and statistical information.

Primary Data and Secondary Data
Juvenile justice researchers must collect relevant youth information. Researchers must decide what information is needed to satisfy their research design, how it will be collected, and how it will be utilized and interpreted. When juvenile justice professionals collect data themselves, for example, by asking youth to complete a questionnaire, the data are called primary data. Secondary data are data that have been previously collected and organized by someone else. For example, juvenile arrest data are collected by the FBI to support FBI publications, but are subsequently used by numerous other investigators for their own research and statistical purposes. Likewise, the National Juvenile Court Data Archive collects automated case-level data files from states across the country, extracts commonly defined data elements from these files, and analyzes millions of case records to prepare the Juvenile Court Statistics series. The Archive also makes these data files available to researchers and policymakers for secondary analysis.

Researchers must collect primary data when secondary data sources are unavailable. Primary data can be generated in either of two formats: unidentified (anonymous) data or identified data. Unidentified data do not reveal specific information about an individual youth. All identifiable personal information (e.g., name, Social Security number, address) was either not collected or, if collected, was stripped from the data file, making it unlikely that a particular youth can be identified.

However, unidentified data can, by virtue of small sample size or a combination of unidentified pieces of information, be reasonably interpreted as referring to a specific youth. For example, when drug violations are reported for certain racial groups in a school, if a probation officer has only one youth on the caseload in a particular racial group at a certain grade level, then the identity of that youth could be accurately inferred by combining race and grade information. Under these circumstances, even when data are thought to be unidentified, unintentional disclosure of confidential information is a potential threat to the rights and well-being of youth.
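The race-and-grade example can be checked mechanically before a table is released. The following is a minimal sketch, not a prescribed procedure: the records are made up, and the function name `small_cells` and the threshold `MIN_CELL_SIZE = 3` are assumptions chosen only for illustration. It counts how many records fall into each combination of values and flags the combinations that are small enough to point to a single youth.

```python
from collections import Counter

# Hypothetical, made-up records with no direct identifiers: (race group, grade).
records = [
    ("A", 9), ("A", 9), ("A", 9),
    ("A", 10), ("A", 10), ("A", 10),
    ("B", 9), ("B", 9), ("B", 9),
    ("B", 10),
]

# Assumed threshold for illustration only; an agency would set its own rule.
MIN_CELL_SIZE = 3


def small_cells(rows, min_size):
    """Return each combination of values that occurs fewer than min_size times."""
    counts = Counter(rows)
    return {cell: n for cell, n in counts.items() if n < min_size}


risky = small_cells(records, MIN_CELL_SIZE)
print(risky)  # cells that would need to be suppressed or combined before release
```

In this made-up file, only one youth is in group "B" at grade 10, so that cell is flagged even though no name or ID number appears anywhere in the data.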

Identified data are data linked to a specific individual through personal information such as name, social security number, address, telephone number, names of parents or other family members, or by a combination of demographic information. Court records, school records, medical records, and employment records are examples of identified data when originally collected. Typically, identified data require more safeguards and protection when collected, maintained, stored, and shared than do unidentified data.

Juvenile justice researchers who work with secondary youth information obtain data sets in one of three formats:
  • Unidentified data contain no identifiable personal information (alone or in combination) and are anonymous.
  • Unlinked data are anonymized data; all identifiable personal information (either alone or in combination) has been removed or disguised (e.g., by replacing names with random numbers), and no link back to the original data is retained.
  • Identified data include personal identifier(s) that can be linked directly to the individual from whom the information was obtained.
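As an illustration of the unlinked format, the sketch below drops direct identifiers, assigns each record a random code, and keeps no table that maps codes back to names. It is a simplified example under stated assumptions: the field names (`name`, `case_number`, `study_id`) and the function `unlink` are hypothetical, and the records are made up.

```python
import random
import secrets


def unlink(records, direct_identifiers):
    """Return copies of the records with direct identifiers removed.

    Each copy gets a random study_id. No mapping from study_id to the
    original identity is stored, so there is no link back to the source.
    """
    used_codes = set()
    unlinked = []
    for rec in records:
        code = secrets.token_hex(8)
        while code in used_codes:  # extremely unlikely, but guarantees uniqueness
            code = secrets.token_hex(8)
        used_codes.add(code)
        cleaned = {k: v for k, v in rec.items() if k not in direct_identifiers}
        cleaned["study_id"] = code
        unlinked.append(cleaned)
    # Shuffle so that row order cannot be matched against the original file.
    random.SystemRandom().shuffle(unlinked)
    return unlinked


# Hypothetical, made-up source records.
source = [
    {"name": "Youth One", "case_number": "0001", "offense": "theft"},
    {"name": "Youth Two", "case_number": "0002", "offense": "vandalism"},
]

released = unlink(source, direct_identifiers={"name", "case_number"})
```

Removing direct identifiers in this way does not address indirect identifiers. Fields that remain in the released file can still identify a youth in combination, so they need a separate check.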
Identified data may include both direct (unique) identifiers and indirect identifiers. Name, student ID number, date of birth, and case number are direct identifiers and are the most commonly recognized form of personally identifiable information. Indirect identifiers include gender, race, and offense, along with other individual youth factors that, when used in combination, may uniquely identify at least one subject in the database.

The need for confidentiality exists in all studies in which data are released about identified youth. The difficult issue for most researchers is maintaining confidentiality when releasing valuable information for research. When confidential data are inappropriately released, disclosure takes place. Identity disclosure occurs when youth can be identified from the released data. In some instances, revealing identity may not violate confidentiality requirements of a research study. However, if identification of individual youth leads to divulging confidential information about that particular youth, then attribute disclosure is the result. Attribute disclosure compromises the guarantee of anonymity and privacy and is a violation of confidentiality.

Juvenile justice researchers are responsible for preventing disclosures of confidential youth data. Procedures must be in place to ensure that youth participating in a research study cannot, under reasonable circumstances, be harmed as a result of the research. Additionally, Federal regulation (28 CFR 22) requires that all statistical information that can be identified to a specific youth, either directly (by name or other personal identifiers) or indirectly (by sample size or a combination of factors), be used only for research and statistical purposes.

Methods to Ensure Confidentiality of Youth Data
Federal, state, and local youth agencies collect information about youth with the assurance of confidentiality. Prior to distributing this information for research or statistical analysis, these agencies use an array of methods to protect youth data and to minimize the risk of disclosure. Statistical disclosure limitation techniques restrict the amount of information released to the public. Once data are in the public domain, there are no restrictions on who may use them or for what purpose. However, the statistical methods that protect the confidentiality of youth information can often make the data unsuitable for detailed statistical analyses. When additional information is needed for research and statistics, agencies may consent to release more detailed data files under highly controlled conditions. These restricted data access measures include imposing conditions on who can access the data, the purposes for which the data can be used, where the data can be used, how the data must be stored, and other features associated with access to the data files.
