
Texas' annual reading test adjusted its difficulty every year, masking whether students are improving
(The Conversation is an independent and nonprofit source of news, analysis and commentary from academic experts.)
Jeanne Sinclair, Memorial University of Newfoundland
(THE CONVERSATION) Texas children's performance on an annual reading test was basically flat from 2012 to 2021, even as the state spent billions of additional dollars on K-12 education.
I recently did a peer-reviewed deep dive into the test design documentation to figure out why the reported results weren't showing improvement. I found the flat scores were at least in part by design. According to policies buried in the documentation, the agency administering the tests adjusted their difficulty level every year. As a result, roughly the same share of students failed the test over that decade, regardless of how much better they performed than students in previous years.
From 2008 to 2014, I was a bilingual teacher in Texas. Most of my students' families hailed from Mexico and Central America and were learning English as a new language. I loved seeing my students' progress.
Yet, no matter how much they learned, many failed the end-of-year tests in reading, writing and math. My hunch was that these tests were unfair, but I could not explain why. This, among other things, prompted me to pursue a Ph.D. in education to better understand large-scale educational assessment.
Ten years later, in 2024, I completed a detailed exploration of Texas' exam, currently known as the State of Texas Assessments of Academic Readiness, or STAAR. I found an unexpected trend: The share of students who correctly answered each test question was extraordinarily steady across years. Where we would expect to see fluctuation from year to year, performance instead appears artificially flat.
The STAAR's technical documents reveal that the test is designed much like a norm-referenced test – that is, assessing students relative to their peers, rather than whether they meet a fixed standard. In other words, a norm-referenced test cannot tell us whether students meet key, fixed criteria or grade-level standards set by the state.
In addition, norm-referenced tests are designed so that a certain share of students always fail, because success is gauged by one's position on the 'bell curve' in relation to other students. Following this logic, STAAR developers use practices like omitting easier questions and adjusting scores to cancel out gains due to better teaching.
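To make the distinction concrete, here is a minimal sketch in Python. It is an illustration only, not the STAAR's actual scoring procedure: the score distribution, the fixed cut score of 53 and the 25% failure target are all hypothetical numbers chosen for the example. It compares two years in which every student in the second year scores five points higher than in the first.

```python
import random


def fail_share_fixed(scores, cut):
    """Criterion-referenced: a student fails if below a fixed cut score."""
    return sum(s < cut for s in scores) / len(scores)


def fail_share_normed(scores, target_share):
    """Norm-referenced: the cut is re-set each year at a percentile
    of that year's own score distribution."""
    ranked = sorted(scores)
    cut = ranked[int(target_share * len(ranked))]
    return sum(s < cut for s in scores) / len(scores)


# Hypothetical data: 10,000 simulated raw scores (assumed mean 60, SD 10).
rng = random.Random(0)
year1 = [rng.gauss(60, 10) for _ in range(10_000)]
year2 = [s + 5 for s in year1]  # every student improves by 5 points

# Fixed standard (hypothetical cut of 53): the failure rate drops in year 2.
fixed_y1 = fail_share_fixed(year1, 53)
fixed_y2 = fail_share_fixed(year2, 53)

# Norm-referenced (hypothetical 25% target): the failure rate stays put.
normed_y1 = fail_share_normed(year1, 0.25)
normed_y2 = fail_share_normed(year2, 0.25)

print(f"fixed cut:  {fixed_y1:.1%} -> {fixed_y2:.1%}")
print(f"normed cut: {normed_y1:.1%} -> {normed_y2:.1%}")
```

Under these assumptions, the fixed cut score registers the improvement as a lower failure rate, while the percentile-based cut reports about the same failure rate in both years even though every simulated student did better.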
Ultimately, the STAAR tests over this time frame – taken by students every year from grade 3 to grade 8 in language arts and math, and less frequently in science and social studies – were not designed to show improvement. Since the test is designed to keep scores flat, it's impossible to know for sure whether the lack of expected learning gains following big increases in per-student spending was because the extra funds failed to improve teaching and learning, or simply because the test hid the improvements.
Why it matters
Ever since the federal education policy known as No Child Left Behind went into effect in 2002 and tied students' test performance to rewards and sanctions for schools, achievement testing has been a primary driver of public education in the United States.
Texas' educational accountability system has been in place since 1980, and it is well known in the state that the stakes and difficulty of Texas' academic readiness tests increase with each new version, which typically comes out every five to 10 years. What the Texas public may not know is that the tests have been adjusted each and every year – at the expense of really knowing who should 'pass' or 'fail.'
The test's design affects not just students but also schools and communities. High-stakes test scores determine school resources, the state's takeover of school districts and accreditation of teacher education programs. Home values are even driven by local schools' performance on high-stakes tests.
Students who are marginalized by racism, poverty or language have historically tended to underperform on standardized tests. STAAR's design makes this problem worse.
On May 28, 2025, the Texas Senate passed a bill that would eliminate the STAAR test and replace it with a different norm-referenced test. As best as I can tell, this wouldn't address the problems I uncovered in my research.
What still isn't known
I plan to investigate whether other states or the federal government use similarly designed tests to evaluate students.
My deep dive into Texas' test focused on STAAR before its 2022 redevelopment. The latest iteration has changed the test format and question types, but there appears to be little change to the way the test is scored. Without substantive revisions to the scoring calculations 'under the hood' of the STAAR test, it is likely Texas will continue to see flat performance.