Validity and reliability of Imagine Learning Assessments

Imagine Learning assessments have been meticulously designed and externally reviewed to ensure they truly measure student learning. Here you can learn more about the validity and reliability of these assessments.

Assessment Types

Imagine Learning offers four different types of assessments to measure student learning:

  1. Diagnostic Assessment occurs at the beginning of each course. It assesses students’ prior knowledge of the content and establishes a customized learning path through that content. Administrators can also enable pretesting, in which students who begin a new lesson are first presented with a 10-question, objective-based assessment. If the student meets a predetermined mastery threshold, they move on to the next pretest. If the student does not meet mastery, they have the opportunity to proceed through the lesson at their own pace.
  2. Formative Assessments embedded within a lesson check understanding of concepts and skills as they are presented. Assignments, which follow the lesson, also serve as formative assessments. By providing corrective feedback, Imagine Learning’s formative assessments help students understand where gaps in their knowledge exist and where additional practice or support is needed.
  3. Interim Assessments occur after students finish an Imagine Learning lesson. The items for these assessments are drawn from an item bank, and each item is aligned to a specific lesson objective. Items are labeled by level of difficulty using Webb’s Depth of Knowledge and Bloom’s Taxonomy. Typically there is a 1:2:1 ratio of easy, medium, and hard items.
  4. Summative Assessments are provided at the end of each unit and/or course to evaluate students’ overall performance.
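
The 1:2:1 difficulty mix described for interim assessments can be illustrated with a short sketch. This is not Imagine Learning's implementation: the function name, the item fields ("id", "difficulty"), and the rule that leftover slots go to medium items are all assumptions made for illustration.

```python
import random

def draw_quiz_items(item_bank, n_items, seed=None):
    """Draw n_items in an approximate 1:2:1 easy:medium:hard ratio.

    Hypothetical helper: the ratio is exact when n_items is a multiple
    of 4; otherwise the leftover slots are assigned to medium items.
    """
    rng = random.Random(seed)
    n_easy = n_items // 4
    n_hard = n_items // 4
    n_medium = n_items - n_easy - n_hard
    quotas = {"easy": n_easy, "medium": n_medium, "hard": n_hard}

    selected = []
    for level, quota in quotas.items():
        # Items in the bank that carry this difficulty label.
        pool = [item for item in item_bank if item["difficulty"] == level]
        # random.sample raises ValueError if the pool is smaller than the quota.
        selected.extend(rng.sample(pool, quota))
    return selected

# Made-up item bank with 10 items per difficulty level.
item_bank = [
    {"id": f"{level}-{i}", "difficulty": level}
    for level in ("easy", "medium", "hard")
    for i in range(10)
]
quiz = draw_quiz_items(item_bank, 8, seed=1)
```

With 8 items requested, the sketch draws 2 easy, 4 medium, and 2 hard items without repeating any item.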

Validity and Reliability of Imagine Learning Assessments

Validity of a test is the degree to which an assessment actually measures what it claims to measure. Imagine Learning measures two types of validity:

  1. Content Validity: Content validity refers to the adequacy with which relevant content has been sampled and represented in the test. Each diagnostic, formative, interim, and summative assessment in Imagine Learning is designed to measure content-area achievement. Items are aligned to Imagine Learning course content material and represent the breadth of content described in current state and Common Core Standards. All targets and distractors are reviewed by experienced classroom teachers and content specialists to align with Haladyna (2006) and the Smarter Balanced Assessment Consortium’s (2012) bias, fairness, and sensitivity standards. Teachers and specialists also ensure that items measure content and objectives presented in each course. If discrepancies are found, items are revised and/or replaced.
  2. Construct Validity: Construct validity assesses the degree to which a test measures the theoretical construct it is designed to measure.

Reliability refers to the degree to which an assessment produces consistent scores. Imagine Learning measures internal consistency reliability, the degree to which items that purport to measure the same general construct produce similar scores.
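
To make internal consistency concrete, here is a minimal sketch of one widely used coefficient, Cronbach's alpha. This article does not name the coefficient Imagine Learning uses, so treating it as Cronbach's alpha is an assumption, and the response data below are invented for illustration.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student item-score lists.

    Every inner list holds one student's scores on the same k items.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    # Variance of each item across students.
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each student's total score.
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Invented data: 4 students, 3 items scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
alpha = cronbach_alpha(responses)  # items agree perfectly, so alpha is 1.0
```

Values closer to 1.0 mean that students who do well on one item tend to do well on the others, which is what "similar scores" refers to in the definition above.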

Example: Algebra 1 Polynomial Quiz Evaluation

In 2011, Imagine Learning evaluated the validity and reliability of the polynomial quiz in the Algebra 1 course. The evaluation focused on 465 high school students from across the country. Results revealed:

  • Content Validity: Six content-area experts reviewed the polynomial quiz for content validity. Results revealed that the overall item-congruency validity was 0.86. This indicates that 86 percent of experts judged the items to be a perfect match to their objectives.
  • Construct Validity: To examine the construct validity of the polynomial quiz, quiz items were correlated to their objectives. A confirmatory factor analysis indicated exceptionally strong construct validity: χ²(72) = 75.23, p = .37; RMSEA = 0.01 (90% CI = .000 to .029). A nonsignificant chi-square and an RMSEA near zero both indicate that the model fits the data well. The standardized component factor correlation was 0.995. Typically, any correlation greater than .80 is considered substantial.
  • Internal Consistency Reliability: The internal consistency reliability coefficient for the polynomial quiz was 0.75, the highest possible value being 1.0. This finding provides strong support that the polynomial quiz is reliable.
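
As a rough consistency check on the construct validity figures, the reported RMSEA can be reproduced from the chi-square statistic, its degrees of freedom, and the sample size. The sketch below uses the common formula sqrt(max(χ² − df, 0) / (df × (N − 1))) and assumes that all 465 students were included in the factor analysis; the article states neither the formula nor the analysis sample size.

```python
import math

def rmsea(chi_square, df, n):
    """Root mean square error of approximation.

    Uses the common (N - 1) form of the formula. Some software divides
    by N instead, which changes the result only slightly.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Values reported in the evaluation; n = 465 is an assumption (see above).
value = rmsea(chi_square=75.23, df=72, n=465)
print(round(value, 2))  # prints 0.01
```

Under these assumptions the result rounds to the reported 0.01.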