coeNEWS


“Ensuring that a test is fair is a lot more complicated than people might think,” said Karen Samuelsen, an assistant professor in the department of educational psychology and instructional technology. UGA Photo by Dot Paul.

Education Researcher Helps Create a New Framework to Validate Standardized Testing


Karen Samuelsen, an assistant professor of educational psychology, helped create a framework with a new vocabulary and a new way of thinking about validity that makes the validation of educational assessments a less daunting task.

Julie Sartor, BSEd, '00 | Apr 10, 2008



Get plenty of sleep. Eat a healthy breakfast. Every spring, schoolchildren are sent home with these directives to help them score well on standardized tests. Fears of their children being held back or failing to meet graduation requirements can send parents into a panic.

Teachers and school administrators also feel the pressure for students to post high test scores. A provision of No Child Left Behind requires schools to report adequate yearly progress (AYP) based on those scores. If a school does not meet AYP standards in consecutive years, it can be forced to transfer students, replace teachers or undergo a complete reorganization.

What might surprise many is that people at the state level also feel pressure when it comes to high-stakes testing. States spend a great deal of time and money creating the tests, but a team of educational researchers including Karen Samuelsen of the University of Georgia’s College of Education has discovered that many people at the state level feel unsure about how to validate the tests they are developing.

“Ensuring that a test is fair is a lot more complicated than people might think,” said Samuelsen, an assistant professor in the department of educational psychology and instructional technology.

Samuelsen and Robert Lissitz, a professor of education and director of the Maryland Assessment Research Center for Education Success at the University of Maryland, have created a framework with a new vocabulary and a new way of thinking about validity that makes the validation of educational assessments a less daunting task.

The results of their work are featured in the November 2007 issue of Educational Researcher, the official journal of the American Educational Research Association (AERA). The article is followed by commentary from well-known scholars in the field of measurement and by Lissitz and Samuelsen’s response to the comments. The commentary ranges from total agreement to total disagreement, and some of it even expands Lissitz and Samuelsen’s framework.

Their article has raised controversy because some experts in the field view it as an attack on renowned psychometrician Samuel J. Messick. His work at the Educational Testing Service examining construct validity is the basis for the testing standards of the National Council on Measurement in Education, American Educational Research Association and American Psychological Association. While Lissitz and Samuelsen agree with Messick’s insistence on a variety of sources of validity evidence, their conversations with people in state and local education showed that Messick’s unitary theory of validity left them puzzled as to how to provide that evidence.

Samuelsen says she has received only positive feedback since the article was published. State governments have found the new framework useful, and others in educational measurement have reported that it has informed their decision making. Those working in the area of classroom assessment have also shown interest in how to help teachers better understand the validation process. Samuelsen is hopeful that their framework will continue to be adapted by others, with input from those on the front lines of educational assessment.

“In educational measurement, we are always keenly aware that tests cannot tell us the truth about students; they can only provide bits of evidence from which we can make inferences about students…,” said Samuelsen. “So when we talk about validity, we are really talking about making sure (or as sure as we can) that the inferences we make about students are the right ones. That is a complicated process that involves everything from making sure students can't gain an unfair advantage (by cheating as an example), to ensuring that the algebra test appropriately targets the Georgia standards for that content domain, to determining whether students respond to test questions appropriately (and don't read anything into them).”

Prior to joining the UGA faculty in 2006, Samuelsen served as the assistant director of the Center for Integrated Latent Variable Research at the University of Maryland. She received her Ph.D. in Measurement, Statistics and Evaluation from the University of Maryland in 2005.

See article, commentary articles: http://edr.sagepub.com/content/vol36/issue8/#FEATURES


Julie Sartor is an editor in the COE's Office of Communications and Publications.

