About 1 — Thomas and Collier

The Importance of Thomas and Collier Research: Why Policy Makers Pay Attention

… the pragmatic usefulness of Thomas and Collier findings… allows policy makers to follow the program evaluation maxim “First, find out what really works, and then put your available resources into that.” Thomas and Collier research works toward that goal.

Drs. Collier & Thomas were invited to summarize their life’s work in “Validating the Power of Bilingual Schooling: Thirty-Two Years of Large-Scale, Longitudinal Research,” published in the 2017 edition of the Annual Review of Applied Linguistics. (See the Publications section, under Professional Journal Articles.) This publication summarizes the effects of the following characteristics of Thomas and Collier research.

In their collaborative work, Drs. Thomas and Collier have contributed new theoretical perspectives for the field of bilingual/multicultural education. They are well known for developing the Prism Model, a theory and guide to empirical research. This model makes predictions about program effectiveness from a theoretical perspective. Drs. Thomas and Collier have tested the Prism Model by collecting and analyzing program effectiveness data, and they have refined the model based on empirical findings.

They have also developed unique theoretical perspectives on analyses of longitudinal student data, to demonstrate the importance of following English learners’ achievement over long periods of time, with school policy implications. By following individual student progress over 5-6 years at minimum (instead of the typical 1-2 years), they have shown that the typical short-term finding of “no significant difference across programs” has misled the field and policy makers, whereas long-term findings yield extremely significant differences among school programs. Since students’ education is a long-term effort, not a short-term one, policy makers can see how their program choices compare by the end of elementary, middle, and high school.

While short-term test gains can be achieved by emphasizing short-term instructional objectives and low-cognitive-level instruction as measured by easy tests, these results are not sustained in the long term as the difficulty of both the curriculum and the tests increases toward the end of elementary school. Rather, long-term gains, as measured by more difficult tests, result from substantial instruction provided over a range of low to high cognitive demands in the curriculum over a number of instructional years.

In the analytical phases of their research, Thomas and Collier have endeavored to maximize not only internal validity, but also external validity and statistical conclusion validity. This is important because most studies in education focus on internal validity and may de-emphasize or even ignore the other two. When researchers emphasize only internal validity (control of extraneous variables), external validity (generalizability of findings to other contexts and populations) tends to decrease, leading to a well-conducted study with quite limited applicability to different real-world situations and contexts. An example is using random assignment in the few situations where it is possible in school-based research (legitimately enhancing internal validity under laboratory-like conditions) but then illegitimately applying the findings to school contexts, education conditions, and groups of students that are completely different from those in the study. Since education decision makers need external validity as well as internal validity in the research they use, Thomas and Collier try to arrive at the best possible balance between the two by maximizing both to the greatest extent possible through judicious and well-crafted evaluation designs.

In addition, they give attention to increasing the statistical conclusion validity of their research studies so that decision makers can have greater trust in the findings. Specifically, they analyze only very large data sets (increasing the statistical power of analyses), avoid violating the assumptions of the test statistics employed (e.g., by testing for heterogeneity of variance), and increase precision of measurement (e.g., using high-quality standardized tests with known measurement characteristics rather than custom-designed but unvalidated measures) wherever possible. Together, these practices allow the research to detect real effects as statistically significant where smaller samples would fail to detect them and report “no significant difference” merely because the sample was too small to reach statistical significance at a given significance level. This false “no-significant-difference” problem has been very common in education studies for decades and has led to much confusion and uncertainty for policy makers seeking to use research to make better policy decisions.

In a related practice, Thomas and Collier compute measures of practical significance (“real world significance” of findings) by calculating statistical effect sizes along with measures of statistical significance. Effect sizes can serve to “calibrate” statistical significance tests by indicating how strong the finding is in the context of the study. Thus, policy makers can use effect sizes to better discern which findings are meaningful for their decision-making needs, whereas statistical significance alone depends heavily on sample size.
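The relationship between effect size and sample size can be illustrated with a small sketch. It is not taken from Thomas and Collier's analyses: the function names, the large-sample z approximation, and all of the numbers are assumptions chosen for illustration. The same standardized mean difference (Cohen's d) is evaluated at two hypothetical sample sizes.

```python
import math


def cohens_d(mean_a, mean_b, sd_pooled):
    """Standardized mean difference between two groups (Cohen's d)."""
    return (mean_a - mean_b) / sd_pooled


def two_sided_p_value(mean_a, mean_b, sd_pooled, n_per_group):
    """Two-sided p-value for a mean difference between two equal-size groups.

    Uses a large-sample normal (z) approximation, an assumption made here
    for simplicity; a t-test would be more appropriate for small samples.
    """
    standard_error = sd_pooled * math.sqrt(2.0 / n_per_group)
    z = (mean_a - mean_b) / standard_error
    # P(|Z| > |z|) for a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical scores: a 2-point difference on a scale with SD 20.
d = cohens_d(52.0, 50.0, 20.0)
p_small = two_sided_p_value(52.0, 50.0, 20.0, n_per_group=50)
p_large = two_sided_p_value(52.0, 50.0, 20.0, n_per_group=5000)

print(f"effect size d = {d:.2f}")
print(f"p-value with 50 students per group:   {p_small:.3f}")
print(f"p-value with 5000 students per group: {p_large:.2e}")
```

In this hypothetical case the effect size is identical at both sample sizes, while the p-value moves from well above 0.05 to far below it. The effect size describes how large the difference is; the p-value mostly reflects how much data was available to detect it.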

Thomas and Collier have introduced degree of gap closure as a primary measure of program success, rather than the traditional pre-post score differences among compared groups. They have pointed out that, while it’s good to know which groups score higher, what really matters in education is whether either group substantially closes the achievement gap.

For example, other researchers have consistently found small achievement differences between “English only” and “transitional bilingual education” programs. Thomas and Collier have found that neither closes more than half of English learners’ achievement gap in the long term. Only high-quality, long-term bilingual programs (e.g., dual language) demonstrably close the second half of English learners’ achievement gap after 5-6 years of schooling through two languages.
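One way to make the idea of gap closure concrete is the sketch below. The formula (the fraction of the initial gap that has been eliminated), the function name, and every score are assumptions for illustration only; the text above does not specify how Thomas and Collier compute the measure.

```python
def gap_closure(initial_group, initial_reference, final_group, final_reference):
    """Fraction of the initial achievement gap closed by the final measurement.

    Assumed definition: (initial gap - final gap) / initial gap, where each
    gap is the reference group's score minus the comparison group's score.
    """
    initial_gap = initial_reference - initial_group
    final_gap = final_reference - final_group
    if initial_gap <= 0:
        raise ValueError("there is no initial gap to close")
    return (initial_gap - final_gap) / initial_gap


# Hypothetical scores on a common scale; the reference group stays at 50.
program_a = gap_closure(25.0, 50.0, 36.0, 50.0)  # finishes at 36
program_b = gap_closure(25.0, 50.0, 34.0, 50.0)  # finishes at 34
program_c = gap_closure(25.0, 50.0, 50.0, 50.0)  # finishes at 50

for name, closure in [("A", program_a), ("B", program_b), ("C", program_c)]:
    print(f"program {name}: {closure:.0%} of the gap closed")
```

In these made-up numbers, programs A and B differ by only two points in final score, which is the kind of small difference a traditional comparison would report. Expressed as gap closure, neither closes half of the gap, while program C closes all of it.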

Thomas and Collier have emphasized the analysis of multiple cohorts of students over time, to enhance the generalizability of their findings beyond one group followed over time. This cross-validation of student data sets leads to increased confidence in the findings that are sustained across time and groups, allows for increased accuracy in measuring the true effects of dual language programs, and increases the robustness of their findings. This practice increases statistical conclusion validity and external validity, and thus increases confidence in their findings for use by decision makers.

Another important characteristic of the Thomas and Collier research is that they have extensively disaggregated each very large data set to investigate student achievement of specific groups of interest—e.g. by ethnicity (Latinos, African Americans, Asians, Caucasians, American Indians), by social class (free & reduced lunch), by proficiency in English, and by students with special needs. This is important because different groups can show test performance of different magnitudes and directions, but these policy-relevant differences are obscured when these groups are combined and not examined separately.
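The point about combined groups obscuring differences can be shown with a toy example. The records, field names, and helper function below are hypothetical and are not drawn from any Thomas and Collier data set.

```python
from collections import defaultdict
from statistics import mean


def mean_gain_by(records, key):
    """Average the 'gain' field separately for each value of `key`."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key]].append(record["gain"])
    return {group: mean(gains) for group, gains in grouped.items()}


# Hypothetical score gains for students in two subgroups.
records = [
    {"subgroup": "group_1", "gain": 6.0},
    {"subgroup": "group_1", "gain": 4.0},
    {"subgroup": "group_2", "gain": -5.0},
    {"subgroup": "group_2", "gain": -5.0},
]

overall_gain = mean([record["gain"] for record in records])
gains_by_subgroup = mean_gain_by(records, "subgroup")

print("overall mean gain:", overall_gain)
print("mean gain by subgroup:", gains_by_subgroup)
```

Here the overall mean gain is zero, which would suggest no change, yet one subgroup gained five points on average and the other lost five. Only the disaggregated view shows the two opposite trends.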

Thomas and Collier have utilized data mining to access student performance data from the past, as a supplement to current performance data and data collected during the study. This allows for studies that focus on longer periods of instructional time, providing better and more comprehensive information for decision making.

They also adopt the program evaluation maxim that only programs that are fully mature and well-implemented should be compared in an outcomes analysis. This means that their research and evaluation studies avoid the common problem of confounding and confusing program type with level of program implementation. When researchers fail to examine how well the programs are actually implemented in the schools, policy makers might be falsely led to believe that one program is better than the other when, in reality, that program is merely better installed or better implemented than its comparison program. Thomas and Collier work to avoid this common problem by examining level and quality of program implementation before beginning data analyses, and then comparing well-implemented programs of one type (e.g., dual language) to well-implemented programs of another type (e.g., transitional bilingual education). This serves to separate out test score differences attributable to how well the program is implemented from how effective the program actually is.

Finally, Thomas and Collier have used statistical and research design techniques to control for extraneous variables that might otherwise obscure their findings. Their analyses emphasize the use of blocking to reduce unexplained variation and increase precision of measurement. They also use multiple regression techniques to conduct analysis of covariance and analysis of partial variance studies, in order to statistically control for factors that might otherwise confound and confuse the effects investigated. These strategies acknowledge the reality that full laboratory control of extraneous variables is not typically available in education studies, and that statistical control, properly implemented, is an effective way of providing decision makers with valid and useful information to meet their needs.
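As a rough illustration of statistical control, the sketch below computes covariate-adjusted group means in the style of a one-covariate analysis of covariance, using the pooled within-group regression slope. It is a simplified textbook calculation under assumed data (a hypothetical pretest score as the covariate), not a reproduction of the authors' models, and all names are invented for the example.

```python
from statistics import mean


def adjusted_means(groups):
    """Covariate-adjusted outcome means for each group.

    `groups` maps a group name to a list of (covariate, outcome) pairs.
    Each group's outcome mean is shifted to what it would be at the grand
    mean of the covariate, using the pooled within-group slope.
    """
    grand_x = mean([x for pairs in groups.values() for x, _ in pairs])
    sum_xy = 0.0
    sum_xx = 0.0
    group_means = {}
    for name, pairs in groups.items():
        x_bar = mean([x for x, _ in pairs])
        y_bar = mean([y for _, y in pairs])
        sum_xy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sum_xx += sum((x - x_bar) ** 2 for x, _ in pairs)
        group_means[name] = (x_bar, y_bar)
    slope = sum_xy / sum_xx  # pooled within-group regression slope
    return {
        name: y_bar - slope * (x_bar - grand_x)
        for name, (x_bar, y_bar) in group_means.items()
    }


# Hypothetical (pretest, posttest) pairs; program B starts with higher pretests.
data = {
    "program_A": [(10.0, 15.0), (20.0, 25.0), (30.0, 35.0)],
    "program_B": [(30.0, 32.0), (40.0, 42.0), (50.0, 52.0)],
}

raw = {name: mean([y for _, y in pairs]) for name, pairs in data.items()}
adjusted = adjusted_means(data)

print("raw posttest means:     ", raw)
print("adjusted posttest means:", adjusted)
```

In this constructed data set, program B has the higher raw posttest mean only because its students started with higher pretest scores. After adjusting for the pretest, program A comes out ahead, which is the kind of confounding that statistical control is meant to expose.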

The combination of all of these strategies leads to program evaluations and research studies that tend to more validly and reliably assess the real and unique effects of programs of interest. Factors that might confuse policy makers or obscure their understanding of “what’s really going on” with their education programs are identified, separated out, or isolated from the main analyses of program effects. This set of decision-facilitative strategies greatly increases the pragmatic usefulness of Thomas and Collier findings. It also allows policy makers to follow the program evaluation maxim “First, find out what really works, and then put your available resources into that.” Thomas and Collier research works toward that goal.

EXTRAORDINARY BILINGUAL EDUCATION PIONEERS

Gustavo A. Mellander, Hispanic Outlook on Education Magazine, August 2020

Reflections of a Former College President

At one time children were punished if they spoke the language of their immigrant parents in our public schools, even on the playground. It happened to children who spoke German, Swedish and, of course, Spanish at home. Most teachers, I am sure, meant well. After all, if children were going to live there, it behooved them to master English.

Sink or Swim

But the immersion sink or swim system was horrible. Horrible because it denigrated the child’s first language and consequently their family, their culture, their self-esteem. It did not make education or assimilation any easier. I attended grade school in the Panama Canal Zone, and we were forbidden to speak Spanish. Most teachers were gentle about the mandate but firm nonetheless. It was worse in my native California.

Change

In time, and it took longer than it should have, many realized that to malign a child’s primary language was counterproductive to that child’s education. To change the modus operandi and establish a more effective way of educating children, schools introduced English as a Second Language programs, which were common by the 1960s. Many were successful, some were not.

We should remember that some bilingual programs quickly evolved in many places to classes being taught in Spanish. Thus, students with a smattering of English did not progress in English. Teachers were hired and awarded tenure who did not speak much English. Bilingual morphed into classes being taught in Spanish. Knowledge was transmitted but fluency in English was not achieved. That most assuredly was the case in the 1970s and 80s in New Jersey. How do I know? I was a college president there from 1975 to 1985. I saw that reality firsthand. I was also on the State Board of Education where I opposed the existing practice. Bilingual education, I insisted, should be bilingual. In my four years on the Board the existing practice did not change.

New Age

Others nationwide feared the counter-productive result of many, again not all, bilingual programs. Today, we have evolved into English as a Second Language combined with exciting dual language programs. They are improvements. Dual language programs serve both English speakers and Spanish speakers in the same classroom. Both are introduced to a new language while retaining and improving their original language. Europe has been doing this for centuries. It is a successful model with long term benefits for children.

Bilingual Pioneers

Dr. Virginia Collier and Dr. Wayne Thomas are different from many other scholars, who conduct meaningful research only to have it remain the purview of other scholars, rarely encouraging change. These two data-based, hands-on practitioners have worked directly with policy makers to attain institutional changes.

In 1985, Dr. Collier was a seasoned and impassioned believer in the advantages of bilingual education. Why? Because of her firsthand experiences. Dr. Thomas, a competent professional computer expert, suggested they needed documentation to validate and strengthen their recommendations. First, they studied the history of all programs and documented their successes and shortcomings. Based on that information they devised data-based suggestions and implementation strategies they shared with policy makers to effectuate change.

For over three decades their research on effectiveness for English Language Learners (ELLs) has provided grist and justification for their recommendations. Amazingly they have analyzed more than eight million student records, the largest and longest longitudinal studies ever undertaken on ELL programs, practices and outcomes. Their research has influenced classroom practice, school district and state politics, and school policy in places as diverse as Norway and Uruguay, as well as many states in the Union. ELL educators who battle erroneous perceptions every day welcome the hard data facts that Collier and Thomas have produced. To facilitate change, this dynamic duo has conducted educational leadership training for superintendents, principals and education policy makers. Their award-winning United States national research studies have impacted school policies throughout the world.

Bottom Line

Quality long-term bilingual programs work. Dual language programs are providing outstanding results.

Sink or swim programs, short-term bilingual programs, Spanish-only or English-only programs are inferior and of dubious efficacy. These intensely documented conclusions have been accepted by both scholars and practitioners. The Collier-Thomas hands-on practical data approach has led to productive changes throughout the world.

There is light flickering at the end of the tunnel.