Policy Implications of Long-Term Teacher
Effects on Student Achievements

Karen L. Bembry, Heather R. Jordan, Elvia Gomez
Mark C. Anderson, Robert L. Mendro
Dallas Independent School District
Dallas, TX 75204

Longitudinal Teacher Effects Methodology

The analysis of longitudinal teacher effects for this research was conducted on four largely overlapping groups. (A complete discussion of the results summarized here is contained in Mendro, Jordan, Gomez, Bembry, and Anderson (1998), which includes all of the statistical models used, a complete listing of significance tests and results, and all data tables.) The two overall groups were 1) students with 5 years of complete testing data in reading or mathematics from 1993 to 1997 and associated teacher effectiveness data in reading or mathematics for the 4-year period 1994 to 1997, and 2) students with 4 years of complete testing data in reading or mathematics from 1994 to 1997 and associated teacher effectiveness data in reading or mathematics for the 3-year period 1995 to 1997. Each overall group was divided into a group with complete reading data and a group with complete mathematics data. The test used was the survey form of the Iowa Tests of Basic Skills (ITBS). All scores were expressed as normal curve equivalent (NCE) scores using the publisher's norm transformations.

The initial testing score for 1993 or 1994 was treated as a pretest score, and the effects of teachers on students were studied for the remaining years. The data for the groups overlap, since the group with 5 years of complete testing data forms a subset of the group with 4 years of complete data. Further, since most students in the Dallas Independent School District take both tests annually (the testing rate for students eligible to be tested on the ITBS is above .95 on both subtests), the mathematics and reading groups are largely the same within each year configuration. Within each of the four groups, cohorts were determined by grade level in 1997. Thus, in each group with 5 years of data, 4 cohorts were determined for students in grades 5 through 8. For example, the cohort for grade 5 had testing data from grade 1 in 1993 and testing data and a teacher effectiveness score for each grade from grade 2 in 1994 to grade 5 in 1997. Similarly, within each group and subject with 4 years of data, there were 5 cohorts for students in grades 4 through 8.
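
The cohort layout described above can be illustrated with a short sketch. The function name `cohort_timeline` and the role labels are hypothetical, and the sketch assumes that every student advances exactly one grade per year, which is implied by the grade 5 example but not stated as a rule.

```python
def cohort_timeline(grade_in_1997, years_of_data):
    """Return (year, grade, role) tuples for one cohort.

    years_of_data is 5 (testing 1993-1997) or 4 (testing 1994-1997).
    The first year supplies the pretest score; each later year carries
    a teacher effectiveness score.
    Assumption: one grade of progression per year (no retention).
    """
    if years_of_data not in (4, 5):
        raise ValueError("years_of_data must be 4 or 5")
    first_year = 1997 - years_of_data + 1
    timeline = []
    for year in range(first_year, 1998):
        grade = grade_in_1997 - (1997 - year)
        role = "pretest" if year == first_year else "teacher effect"
        timeline.append((year, grade, role))
    return timeline


# The grade 5 cohort in a 5-year group: grade 1 pretest in 1993,
# then teacher effects for grades 2 through 5 in 1994 through 1997.
for entry in cohort_timeline(5, 5):
    print(entry)
```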

For each group of students, a three-character notation was developed to refer to the different cohorts. Those with complete reading data were designated with an R and those with complete mathematics data were designated with an M. If the cohort was from the group with 5 years of data, it was designated as R5 or M5. If from the group with 4 years of data, the designation was R4 or M4. Finally, the grade level of the students in 1997 forms the last number in the notation, so that students in Grade 6 in 1997 are designated R5-6 or M5-6 in the 5-year group and R4-6 or M4-6 in the 4-year group, etc.
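
As a minimal sketch of this notation, the hypothetical helper below builds a cohort label from the subject, the number of years of data, and the 1997 grade level. The function name and the input strings "reading" and "mathematics" are illustrative assumptions; the grade ranges checked are those given in the text (grades 5 through 8 for 5-year groups, grades 4 through 8 for 4-year groups).

```python
def cohort_label(subject, years_of_data, grade_in_1997):
    """Build a cohort label such as 'R5-6' or 'M4-6'.

    subject: 'reading' or 'mathematics' (illustrative input names).
    years_of_data: 5 or 4.
    grade_in_1997: 5-8 for 5-year groups, 4-8 for 4-year groups.
    """
    subject_codes = {"reading": "R", "mathematics": "M"}
    lowest_grade = {5: 5, 4: 4}
    if subject not in subject_codes:
        raise ValueError("subject must be 'reading' or 'mathematics'")
    if years_of_data not in lowest_grade:
        raise ValueError("years_of_data must be 4 or 5")
    if not lowest_grade[years_of_data] <= grade_in_1997 <= 8:
        raise ValueError("grade outside the range covered by this group")
    return f"{subject_codes[subject]}{years_of_data}-{grade_in_1997}"


print(cohort_label("reading", 5, 6))
print(cohort_label("mathematics", 4, 6))
```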

The numbers of students and teachers in each cohort are presented in Table 1. Within each group (4-year or 5-year), the number of students represents an unduplicated count across grade levels. However, there is duplication of teachers across cohorts (but not of effectiveness measures, since they are computed separately for each class). A teacher who teaches at a given grade level across years can be present in several cohorts. For example, a teacher at 5th grade in 1997 can be a teacher in the grade 6 cohort in 1996 and in the grade 7 cohort in 1995, etc.

The data in Table 1 show approximately 12,000 students in each 5-year group and approximately 2,900 students in each 5-year cohort. There are approximately 17,500 students in each 4-year group and at least 3,200 students in each 4-year cohort. The number of teachers associated with each 5-year cohort ranges from nearly 900 to approximately 1,500, and with each 4-year cohort from 450 to 1,300. The total number of teachers in each group, not counting duplications across cohorts and years, is approximately 2,000 to 2,180.

For each of the four years from 1994 to 1997, teacher effectiveness values were computed within classroom assignment. They were computed from regression residuals derived from a two-stage, two-level regression/HLM process outlined in Webster and Mendro (1997). These classroom indices were ranked within each cohort. The ranked indices were divided into 5 equal subsets from least effective (assigned a value of 1) to most effective (assigned a value of 5) for the 4-year cohorts, and into 3 equal subsets from least effective (assigned a value of 1) to most effective (assigned a value of 3) for the 5-year cohorts.
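
The ranking and division step can be sketched as follows. This is an illustration only, not the procedure in Webster and Mendro (1997): the function name `assign_levels` is hypothetical, the input values are made up, and the handling of ties and of counts that do not divide evenly is an assumption, since the text does not specify either.

```python
def assign_levels(indices, n_levels):
    """Rank classroom indices and split the ranks into n_levels subsets.

    Returns a list of levels aligned with the input, where 1 is the
    least effective subset and n_levels is the most effective.
    Assumptions: ties are broken by input order, and subsets are as
    near to equal in size as the count allows.
    """
    n = len(indices)
    # Positions of the classrooms, ordered from lowest to highest index.
    order = sorted(range(n), key=lambda i: indices[i])
    levels = [0] * n
    for rank, position in enumerate(order):
        levels[position] = rank * n_levels // n + 1
    return levels


# Made-up effectiveness indices for six classrooms, split into thirds
# as for a 5-year cohort; use n_levels=5 for a 4-year cohort.
example_indices = [0.3, -1.2, 2.0, 0.1, -0.4, 1.1]
print(assign_levels(example_indices, 3))
```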

Then, within each cohort, students were divided into subgroups based on the level of their teachers' effectiveness over the three- or four-year span. These subgroups are designated numerically based on the level of teacher from 1995 to 1997 for 4-year cohorts and from 1994 to 1997 for 5-year cohorts. A three-digit number, then, is used to describe the subgroups in the 4-year cohorts and a four-digit number the subgroups in the 5-year cohorts. For a subgroup in a 4-year cohort, the first digit represents the level of the subgroup's teachers in 1995, the second the level in 1996, and the third the level in 1997. Similarly, for subgroups in the 5-year cohorts, the four digits correspond to the levels of the teachers in 1994, 1995, 1996, and 1997. For example, the subgroup 352 from a 4-year cohort had teachers in the 3rd quintile in 1995, in the 5th (or highest) quintile in 1996, and in the 2nd quintile in 1997. The subgroup 3121 had teachers in the top third in 1994, the bottom third in 1995, the middle third in 1996, and the bottom third in 1997. Using these procedures, there were 125 subgroups in each 4-year cohort: 5 levels in each of 3 years. Similarly, there were 81 subgroups in each of the 5-year cohorts: 3 levels in each of 4 years.
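
The subgroup codes and their counts can be checked with a brief sketch. The function names `subgroup_code` and `all_subgroups` are hypothetical; the sketch only enumerates the possible codes and says nothing about how many students fell into each one.

```python
from itertools import product


def subgroup_code(levels):
    """Join one teacher level per year into a code such as '352'."""
    return "".join(str(level) for level in levels)


def all_subgroups(n_levels, n_years):
    """List every possible code for n_levels levels over n_years years."""
    level_values = range(1, n_levels + 1)
    return [subgroup_code(combo) for combo in product(level_values, repeat=n_years)]


# 4-year cohorts: quintile levels for 1995, 1996, 1997.
print(subgroup_code((3, 5, 2)))
print(len(all_subgroups(5, 3)))

# 5-year cohorts: levels in thirds for 1994, 1995, 1996, 1997.
print(subgroup_code((3, 1, 2, 1)))
print(len(all_subgroups(3, 4)))
```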

There were four stages to the analysis of longitudinal teacher effects. First, the populations were analyzed for bias in the various distributions. Second, an analysis of covariance (ANCOVA) was conducted to determine effect sizes; because the data failed to meet the conditions of the ANCOVA procedure, they were then analyzed for teacher effects using hierarchical linear modeling (HLM). Third, an analysis of the adjusted effects from the HLM was conducted. Finally, an analysis of the raw NCE means was conducted.

[Table 1: Numbers of students and teachers in each cohort]
