Compared to a multiple-choice, norm-referenced test, a standards-based test can be recognized by the following:
A cut score is determined for each level of performance. Norm-referenced tests have no cut scores. There is no failing score on the SAT, for example; each college or institution sets its own score standards for admission or awards.
Different levels of performance are set. Typically these are Above Standard, Meets Standard, and Below Standard. These levels are typically set in a benchmarking process, even though such a process does not take into account whether the test items are even appropriate for the grade level.
Tests are graded holistically on free-written responses, often with pictures, rather than marked correct or incorrect among multiple choices.
Tests are more expensive to grade because of this, typically $25 to $30 per test compared to $2 to $5. This does not include the cost of developing the test, which is typically different every year for every state.
Tests are more difficult to grade because they are typically graded against a handful of example papers, with no more than one or two at each scoring level. They cannot be graded by computer.
Tests are less reliable. Agreement between graders may fall between 60 and 80 percent on a 4-point scale and still be considered accurate.
Graders do not need teaching credentials, only a bachelor's degree in any field, and are typically paid $8 to $11 per hour for part-time work.
Failure rates as high as 80 to 95 percent are not unusual; they are fully expected and are announced to the local press as test programs are introduced. Under traditional graduation criteria, African Americans had achieved national graduation rates within a few points of whites. In 2006, three-quarters of African Americans who failed the WASL were told by Superintendent Terry Bergeson that they would not get a diploma if they did not pass retakes of the test in two years, even though she had pledged earlier that "all students" would get a world-class diploma.
Failure rates for minorities and special education students are typically two to four times higher than for majority groups, as extended-response questions are more difficult to answer than multiple-choice questions.
Content is often difficult even for adults to answer quickly, even at grade levels as low as the fourth grade, especially in mathematics. Professor Don Orlich called the WASL a "disaster", with math and science tests falling well above the normal developmental level of students at many grade levels.
Mathematics tests have a high proportion of statistics and geometry, and a low proportion of simple arithmetic.
Schools are scored as zero for students who do not take the test.
Passing such a test at the 10th-grade level is typically planned as a requirement for high school graduation.
Passing such a test, rather than scoring at the 50th percentile, is defined as grade-level performance.
A question with a correct answer may be graded as incorrect if the response does not show how the answer was arrived at. A question with an incorrect numerical conclusion may not necessarily be graded as wrong.
The North Carolina Writing project gave out exemplary '4' scores to less than 1 percent of papers. Such papers employed vocabulary and knowledge on a level sometimes exceeding that of the college graduate graders, and well above the level expected of the intended audience of high school graduates. This level would be even more difficult to reach than an SAT score sufficient for entry into a private Ivy League college.
Scores typically rise much faster than scores on standardized tests such as the NAEP or SAT given over the same time period.
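
Three of the ideas in the list above can be illustrated with a short Python sketch: assigning a performance level from cut scores, computing a percentile rank as a norm-referenced test would, and measuring how often two graders give the same score. The cut scores, the 0 to 100 scale, and all function names are hypothetical, chosen only for illustration. They do not represent any actual testing program.

```python
from bisect import bisect_right

# Hypothetical cut scores on an assumed 0-100 scale; real programs set their own.
CUT_SCORES = [40, 70]
LEVELS = ["Below Standard", "Meets Standard", "Above Standard"]


def performance_level(score):
    """Standards-based view: compare one score with fixed cut scores."""
    # bisect_right counts how many cut scores are at or below the score.
    return LEVELS[bisect_right(CUT_SCORES, score)]


def percentile_rank(score, all_scores):
    """Norm-referenced view: percent of test takers scoring strictly below."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def exact_agreement(grader_a, grader_b):
    """Percent of papers to which two graders gave the same score."""
    matches = sum(1 for a, b in zip(grader_a, grader_b) if a == b)
    return 100.0 * matches / len(grader_a)
```

Under these assumed values, a score of 40 is classified as Meets Standard no matter how other students performed, while its percentile rank depends entirely on the comparison group. Two graders who agree on three of five papers have an exact agreement rate of 60 percent, the low end of the range described above.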