Research

Instruments

We are developing several instruments for use in the project.
These instruments will provide feedback to teachers, can be used to assign grades to students, and will supply the data needed to answer the research questions.

Measures of Affect

At three times during the baseline study and field tests (beginning, middle, and end of the school year), students will complete a survey of science affect.
This survey is based on an existing survey that has been validated for use with middle school students.

The survey measures self-efficacy for learning more science and for using science ideas to make sense of new issues, motivation to learn more science, sense of agency related to GCs, interest in various science and socio-scientific topics, and interest in STEM professions.

To explore the potential impact of GC learning experiences on affect, we will first employ partial credit Rasch analysis to locate students’ self-efficacy, motivation, agency, and interest on separate metrics.
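
As a concrete illustration of this first analytic step, the sketch below estimates one student's location on a logit metric under a partial credit model, given item step difficulties. The step values, item count, and response pattern are illustrative assumptions on our part; operational calibration would be done jointly with established IRT software.

```python
# Minimal partial credit Rasch sketch (illustrative values, not project data).
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical step difficulties (logits), one array per survey item.
ITEM_STEPS = [
    np.array([-1.2, 0.3, 1.1]),   # item 1: response categories 0-3
    np.array([-0.5, 0.8]),        # item 2: response categories 0-2
    np.array([0.0, 0.9, 1.7]),    # item 3: response categories 0-3
]

def pcm_category_probs(theta, steps):
    """P(X = k) for one item under the partial credit model."""
    # Category k's numerator is exp of the cumulative sum of (theta - step_j);
    # category 0 has an empty sum, hence the leading 0.0.
    numerators = np.exp(np.concatenate(([0.0], np.cumsum(theta - steps))))
    return numerators / numerators.sum()

def neg_log_likelihood(theta, responses):
    return -sum(
        np.log(pcm_category_probs(theta, steps)[score])
        for steps, score in zip(ITEM_STEPS, responses)
    )

responses = [2, 1, 3]  # one student's hypothetical item scores
result = minimize_scalar(neg_log_likelihood, args=(responses,),
                         bounds=(-5, 5), method="bounded")
print(f"estimated location: {result.x:.2f} logits")
```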

We will then compare survey results from students in the baseline study with those from the field tests at each time point. To better understand how student affect might change over time associated with GC learning, we will compare field test students’ survey responses across the three time points.
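
One plausible way to operationalize these comparisons, sketched here as an assumption rather than a finalized plan, is a linear mixed model with a random intercept per student, so that the time-by-group interaction captures whether affect trajectories differ between the baseline and field-test cohorts.

```python
# Sketch: comparing affect trajectories across three time points.
# Synthetic data stand in for one row per student per survey administration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for group, slope in (("baseline", 0.05), ("field_test", 0.20)):  # assumed effects
    for student in range(60):
        intercept = rng.normal(0.0, 0.5)
        for time in (0, 1, 2):  # start, middle, end of year
            rows.append({"student_id": f"{group}_{student}",
                         "group": group, "time": time,
                         "affect": intercept + slope * time + rng.normal(0, 0.3)})
data = pd.DataFrame(rows)

# Random intercept per student; time:group asks whether affect grows
# differently for field-test students than for baseline students.
model = smf.mixedlm("affect ~ time * group", data, groups=data["student_id"])
print(model.fit().summary())
```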

Measures of Knowledge

In considering learning of science content, we take a multi-level assessment approach. Assessments can be considered at various distances from an intervention. A proximal assessment’s items and tasks are closely related to an intervention’s learning context. More distal assessments connect to an intervention in terms of relevant standards or general expectations but present items and tasks dissimilar from the learning context. Proximal measures are more sensitive to intervention effects; distal measures make it possible to compare results across different interventions and approaches.

We will use three assessments of science knowledge: Curriculum-Aligned Assessments (CAAs) will be the most proximal measure, an Issue-Based Assessment (IBA) will provide a measure at intermediate distance, and student performance on items from the Trends in International Mathematics and Science Study (TIMSS) will serve as the most distal measure.

A CAA is being developed for each GC unit and will be administered after that unit’s enactment. These assessments are being developed along with the units, tested during the pilots, and revised as needed prior to their use in the field tests. Each CAA is accompanied by a scoring rubric. To monitor students’ learning, we will sample items from the individual CAAs to create a curriculum-aligned pretest, administered at the beginning of the school year.

By comparing student performance on these items between the pretest and the post-unit CAAs, with an estimate of enactment fidelity as a covariate in the analytic model, we will monitor student learning over time and analyze how the various GC contexts are associated with learning.
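
The sketch below shows the kind of covariate-adjusted model this implies; the column names, fidelity scale, and OLS specification are our assumptions for illustration.

```python
# Sketch: post-unit CAA scores modeled with pretest performance and
# enactment fidelity as covariates (synthetic stand-in data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 180
scores = pd.DataFrame({
    "unit": rng.choice(["unit_A", "unit_B", "unit_C"], size=n),  # hypothetical labels
    "pretest": rng.uniform(0, 10, size=n),
    "fidelity": rng.uniform(0.4, 1.0, size=n),  # e.g., estimated from teacher logs
})
scores["posttest"] = (2.0 + 0.6 * scores["pretest"]
                      + 3.0 * scores["fidelity"] + rng.normal(0, 1, size=n))

# Unit fixed effects let us contrast GC contexts net of incoming
# knowledge and how faithfully each unit was enacted.
fit = smf.ols("posttest ~ pretest + fidelity + C(unit)", data=scores).fit()
print(fit.params)
```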

For the IBA, we are developing six tasks based on the format developed for the national middle school test in Israel. In contrast to traditional science assessments (including the CAAs), IBA tasks involve more complex scenarios and present students with information beyond what they will have learned during instruction.
In this way, the items mirror real-world situations in which students must use what they know to interpret new information and to connect it to their existing knowledge for constructing explanations, asking questions, evaluating arguments, and making decisions.

Each IBA task is accompanied by a scoring rubric. The new tasks and rubrics will be tested in the Year 1 pilots. Two IBA tasks will be given to students as class activities to familiarize them with the format. At the end of the year, each student will be randomly assigned two of the remaining four tasks.
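
As a small illustration of the end-of-year assignment step (the mechanism below is our sketch, and the task labels are hypothetical), cycling a shuffled list of the six possible pairs of the four remaining tasks over a shuffled student order keeps coverage of each task roughly balanced:

```python
# Sketch: balanced random assignment of two end-of-year IBA tasks per student.
import random
from itertools import combinations

REMAINING_TASKS = ["task_C", "task_D", "task_E", "task_F"]  # hypothetical labels

def assign_tasks(student_ids, seed=0):
    """Cycle a shuffled list of the six unordered task pairs over a
    shuffled student order, so each task is administered about equally often."""
    pairs = list(combinations(REMAINING_TASKS, 2))  # six possible pairs
    rng = random.Random(seed)
    rng.shuffle(pairs)
    order = rng.sample(student_ids, len(student_ids))
    return {sid: pairs[i % len(pairs)] for i, sid in enumerate(order)}

students = [f"S{n:03d}" for n in range(24)]
for sid, tasks in sorted(assign_tasks(students).items()):
    print(sid, tasks)
```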

To better understand how students using GC materials perform on distal achievement tests relative to their peers nationally, we are making use of items and data from TIMSS science assessments. We have identified six TIMSS items assessing knowledge of science concepts addressed in the GC units (two items per unit).

These items will be administered at the end of the field tests. We will compare scores on the TIMSS items with those obtained by the general US and Israeli student populations when the items were originally administered. This will allow us to explore how GC-oriented instruction supports learning in comparison with national performance.
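
One simple form this comparison could take, sketched here with hypothetical numbers standing in for published TIMSS benchmarks, is a one-sample proportion test of our students' percent correct against the national figure for each item:

```python
# Sketch: field-test percent correct on one TIMSS item vs. the published
# national percent correct (all values here are illustrative).
from statsmodels.stats.proportion import proportions_ztest

n_correct = 163      # field-test students answering the item correctly
n_students = 240     # field-test students who attempted the item
national_pct = 0.61  # hypothetical national percent correct for this item

z, p = proportions_ztest(count=n_correct, nobs=n_students, value=national_pct)
print(f"sample = {n_correct / n_students:.2f}, national = {national_pct:.2f}, "
      f"z = {z:.2f}, p = {p:.3f}")
```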

Teacher Logs

We anticipate that the ways in which teachers enact the curriculum will be a significant mediator of student outcomes. We will therefore collect teacher enactment data to track which learning opportunities students were afforded.
All participating teachers will complete instructional logs in which they self-report completion of activities, engagement in practices, use of formative assessments, classroom discussions, and the ways in which culturally relevant teaching principles were enacted.
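
To make the log contents concrete, a minimal record for one lesson might look like the sketch below; the field names are our assumptions, not the instrument's final design.

```python
# Sketch: one instructional-log entry (field names are illustrative).
from dataclasses import dataclass, field

@dataclass
class LogEntry:
    teacher_id: str
    unit: str                      # GC unit being enacted
    lesson_date: str               # ISO date, e.g. "2025-10-02"
    activities_completed: list = field(default_factory=list)
    practices_engaged: list = field(default_factory=list)
    formative_assessment_used: bool = False
    class_discussion_held: bool = False
    culturally_relevant_notes: str = ""

entry = LogEntry(
    teacher_id="T07",              # hypothetical identifiers
    unit="unit_A",
    lesson_date="2025-10-02",
    activities_completed=["investigation_2"],
    practices_engaged=["analyzing data"],
    formative_assessment_used=True,
)
print(entry)
```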

Student Interviews

We will interview a subset of students in the field tests. This sample will be purposefully selected, based on results from the surveys, to include roughly equal numbers of students who demonstrate high affect, low affect, and significant changes in affect over the course of each field test.
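
A sketch of how this stratified selection could be computed from the survey scores follows; the column names and stratum size are illustrative assumptions.

```python
# Sketch: purposeful interview sampling from survey results
# (synthetic scores; column names and cut rules are assumptions).
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
surveys = pd.DataFrame({
    "student_id": [f"S{i:03d}" for i in range(120)],
    "affect_t1": rng.normal(0.0, 1.0, 120),  # start-of-year logit score
    "affect_t3": rng.normal(0.3, 1.0, 120),  # end-of-year logit score
})
surveys["change"] = surveys["affect_t3"] - surveys["affect_t1"]

k = 4  # interviewees per stratum (hypothetical)
high = surveys.nlargest(k, "affect_t3")                          # high affect
low = surveys.nsmallest(k, "affect_t3")                          # low affect
movers = surveys.loc[surveys["change"].abs().nlargest(k).index]  # large shifts

sample = pd.concat([high, low, movers]).drop_duplicates("student_id")
print(sample[["student_id", "affect_t3", "change"]])
```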

The interviews will focus on students’ experiences, interest, motivation, and agency when learning from a GC unit in comparison to learning from a traditional curriculum.

Observations, Student Artifacts, Field Notes, and Teacher Interviews

We will conduct six classroom case studies (three in each country) during the second field test.
Primary data sources for the case studies will be video recordings of classroom enactments of at least two GC units, field notes from classroom observations, artifacts including student work samples, and teacher interviews. 

To analyze data for each case study, we will look across the data sources and inductively code for trends that reflect enactment affordances and barriers, with particular attention to the ways in which assessments (both classroom and end-of-grade) are considered, discussed, and enacted.

Following creation of individual case studies, we will conduct cross-case analyses to explore similarities and differences between teachers within the same national context and between the two different contexts.