Ofqual - Office of Qualifications and Examinations Regulation

The Assessment Validity Programme

Regulated national assessments, public examinations and qualifications in England are used for a variety of purposes, including:

  • the measurement of attainment in different subject areas for individual learners
  • the selection of individuals for learning programmes or higher education and employment
  • the accountability of individual educational professionals and institutions.

The importance placed upon these purposes has led to an ongoing debate about the reliability and validity of assessment results. We have been conducting a research programme, the Reliability of Results Programme, to look at issues associated with the reliability of results from regulated assessments. To expand our work beyond reliability, we have initiated a new research programme, the Assessment Validity Programme, to investigate issues with the validity of regulated assessments in England.

Background

Validity refers to the extent to which inferences made or actions taken on the basis of assessment results are appropriate or supported by theory and empirical evidence. Evidence of validity is used to judge the overall quality of an assessment. The Ofqual Validity Programme aims to address the following issues associated with the validity of regulated assessments:

  • There has been little systematic evaluation of the appropriateness of the use of results from national assessments and public examinations in relation to their current uses in England.
  • There have been concerns expressed about the quality of regulated assessments in terms of qualification relevance and performance standards.
  • There have also been concerns expressed about teaching to the test for high-stakes tests and examinations, which is perceived to result in a narrowing of the curriculum.
  • There is a lack of an integrated and systematic approach to the evaluation of the validity of regulated assessments.

Aims and objectives

The primary aim of this programme is to gather evidence which will help us to establish a systematic validity auditing system for regulated assessments with a view to further improving the national assessment systems and increasing public confidence. The main objectives of this programme include:

  • To discuss assessment validity and validation and the roles of major stakeholders in the assessment development and validation processes.
  • To conduct research into specific aspects of validity for a selection of regulated assessments.
  • To validate a number of regulated assessments against their defined purposes and performance standards using an argument-based validation framework.
  • To investigate the impact of some of the current uses of regulated assessments.
  • To stimulate national debate on the significance of the validity evidence generated from this programme and from other validity and validation studies.
  • To review and revise the regulatory requirements for assessment validity.

Scope of work

The programme will involve five strands of work:

Strand 1: Debating the roles of major stakeholders in the process of assessment development and validation.

Given the curriculum or syllabus, the purposes of an assessment and the constructs to be assessed must be set before the assessment is developed. It is important for major stakeholders to reach consensus on the following aspects of the assessment development and validation processes:

  • The curriculum: Who should be responsible for designing and developing the curriculum, and identifying or defining the learning outcomes?
  • Assessment purposes and performance standards: What are they and who should define them?
  • Assessment construct, design, criteria and methods: Who should be responsible for specifying them?
  • Assessment development, administration, analysis, reporting and interpretation, and evaluation: The responsibility of assessment providers.
  • Conceptualisation of validity and approaches to validation:
    • How should assessment validity be conceptualised?
    • What is the best approach to validation and what kinds of evidence should be included in the validation process?
    • Who should be responsible for carrying out the validation process?
    • Who should be responsible for evaluating the strength of the validity argument and drawing conclusions about validity?
  • How should Ofqual regulate assessment validity?
  • The roles and responsibilities of the major stakeholders: How should they be involved in the assessment development and validation processes?

Strand 2: Conducting a series of validity studies and validation case studies.

Validation is the process of developing a sound argument based on theory and empirical evidence collected from all stages of the assessment process to support the proposed interpretation and intended use of the assessment results. Each proposed interpretation and intended use of results from an assessment will need to be validated because different interpretations or uses may involve different inferences and assumptions.

Interpretations and uses of regulated assessments

For National Curriculum assessments and general qualifications, final results for individuals are reported using performance categories and should therefore be interpreted in terms of the level of attainment in the curriculum required by the pre-defined performance standards. This interpretation of results for individuals applies to the primary purposes (uses) of regulated assessments and will be retained in other uses.

Currently, assessments regulated by Ofqual are primarily used for:

  • Certification/reporting (at individual level: NCAs, GCSEs, GCEs and VQs)
  • Employment/entry to university (at individual level: GCSEs, GCEs and VQs)
  • Accountability (at class, school and LA level: NCAs and GCSEs)
  • National performance monitoring (at national level: NCAs and GCSEs)

In the present study, the validation of the selected assessments will focus on the use of results at individual student level (reporting or certification / university entry / employment).

An argument-based validation framework

Argument-based approaches have been widely used for validating educational assessments. Given the purpose and intended use(s) of the assessment and the proposed interpretation of the results, an argument-based approach to validation generally involves:

  • The development of a set of clear and coherent propositions that support the proposed interpretation and intended use of the results.
  • The evaluation of the strength of the propositions using appropriate evidence collected from various stages of the assessment process (ie to develop a sound validity argument).

In view of the nature of regulated assessments and the regulatory role of Ofqual, an argument-based framework involving the following propositions has been suggested for validating Ofqual regulated assessments for the purpose of individual certification:

  1. Alignment between the assessment and the curriculum: The assessment tasks and criteria are well aligned with the learning outcomes of the curriculum or syllabus.
  2. Accuracy and reliability of scores: The assessment scores (results) are accurate and reliable.
  3. Alignment between performance on the assessment and the performance standards defined for the curriculum or syllabus: The assessment performance category boundary marks are well aligned with the pre-defined performance standards.
  4. Indicator of future performance: Assessment results are a good indicator of future performance (when the results are used for selection to enter university/employment or prediction of future attainment).

These propositions form a theory or interpretive argument that supports the proposed interpretation and intended use of the results. They need to be evaluated using appropriate evidence collected from the assessment process (detailed guidance on how to evaluate these propositions is provided in a document that can be downloaded from the link provided at the end of this page).

Validity studies and validation case studies

A series of empirical studies will be carried out to:

  • Investigate specific aspects of validity for a selection of regulated assessments.
  • Validate a range of regulated assessments at the overall assessment level.

Strand 3: Investigating the impact of the main uses of regulated assessments.

All assessments regulated by Ofqual are high stakes. Investigations will be conducted to explore the impact of the use of regulated assessments:

  • Impact on teaching
  • Impact on learning
  • Resource implications and practical difficulties
  • Impact on the curriculum in general

Strand 4: Debating the significance of the findings from the programme.

Results from the validity studies, validation case studies, and impact investigations will be presented to the major stakeholders, including educational researchers, assessment experts, policy makers and practitioners, using seminars or workshops for further debate on:

  • The strength of evidence for the specific aspects of validity of the assessments investigated.
  • The strength of the validity argument for the assessments investigated in the validation case studies in terms of:
    • meeting the defined performance standards
    • meeting the requirements of the users
  • Impact on learning, teaching and the curriculum or syllabus (and therefore on future assessment).
  • Aspects of the assessment system that need improving.
  • Communication of validity evidence to the public.

Strand 5: Developing Ofqual validity auditing procedures.

This will involve:

  • Review of existing regulatory criteria and requirements.
  • Debate on how Ofqual should regulate assessment validity.
  • Evaluation of findings from this programme and other validity studies and practices used in other countries.
  • The development of Ofqual validity auditing procedures.

Methodology

Methods used in this programme include:

  • Review of literature related to studies of validity, validation and assessment impact.
  • Debate and discussion about various issues with validity and validation using seminars and workshops involving key stakeholders.
  • Development of an argument-based framework for validating regulated assessments.
  • Empirical studies for investigating specific aspects of validity for a selection of assessments.
  • Empirical studies for validating a range of regulated assessments.
  • Investigation of assessment impact using workshops, interviews and questionnaire surveys.
  • Evaluation of findings from this programme and from other studies and practices adopted by qualification regulatory authorities in other countries.

Programme outcomes

  • A series of research reports.
  • Revised Ofqual assessment validity auditing procedures.

For a detailed description of the Ofqual Validity Programme, please download the programme document at .