Increasing Value Through High-Stakes Testing

Popular debate surrounding “high-stakes testing” often points in the direction of the federal government’s handling of K-12 education policy, but there exists an equally consequential argument involving standardized tests as they influence career paths an

High-stakes tests are broadly defined as standardized, proctored exams that objectively measure an individual’s knowledge in a specific area and result in a certification, charter or industry title. To be considered high-stakes, the exam must endure the test of time and be offered only at set intervals, which raises the cost of a negative outcome and helps reduce cheating. Positive outcomes add value to both the individual and the organization, such as a job offer, salary increases or, for a business, a boost in profits. Depending on the industry and the current economy, there are also intangible gains for the individual, such as credibility, confidence, differentiation and personal satisfaction. For the employer, intangible gains can come in the form of competitive advantage, heightened productivity and improved consistency.

But before learning executives invest the necessary time and money in certification for their audience, they must know that the test and its preparation will bring about a practical, measurable gain in knowledge, skills or career advancement.

“The important factor across most professional sector high-stakes testing is that the test is designed to assess critical knowledge that has been determined essential to the ability to successfully perform in a particular profession,” said Donald W. Boucher, vice president of global marketing with Prometric, an industry leader in technology-enabled testing and assessment services for IT certification and professional licensure and certifications, which delivers more than 200 standardized tests.

“Ultimately, the market will determine the value of the test or program,” said Bob Johnson, CFA and executive vice president of the CFA program division of the Association for Investment Management and Research (AIMR). “There are many employers out there who are implicitly or explicitly requiring folks to become involved in the CFA program. To me, that is as good as an endorsement as you can get.”

Test design, namely its “validity” and “reliability,” has a profound impact on the value and credibility of a certification. Validity ensures that exams successfully test the knowledge or skills that are representative of the associated job tasks, while reliability provides a consistent way to measure different candidates with predictable results. Testing for validity and reliability in exam design is accomplished by performing job-task analyses and performing psychometric reviews on exam items. A beta phase in exam development helps verify whether questions are clear and easily understood, valid and statistically sound. If not, these questions are discarded or rewritten.

High-stakes tests must be reliable in order to provide all candidates with a fair opportunity to answer each question correctly. Often the best way to accomplish this is to perform stringent job-task analysis, which involves surveying experts who perform the same or a similar job. This helps ensure that the test targets a specific, uniform and valid domain of skills.

High-stakes test designers typically concentrate on five factors when creating a valid and reliable product: clarity of testing purpose, definition of test domain, job-task analyses to determine which skills, knowledge and abilities should be tested, measurable objectives and development of comprehensive test blueprints.

Once these factors are decided, designers begin a secondary analysis, which answers the following questions: What is the purpose of the test? Who is the intended audience? What is the content domain? What is the competency level?

The AIMR, which designs the CFA exam, invests heavily in job-task analysis, spending more than $500,000 every four years to ask practicing charter-holders around the globe what knowledge, skills and abilities they require to do their jobs well.

“[AIMR] has the luxury of 125,000 applicants, so we have economies of scale,” said Johnson. “Our per-candidate cost for job-task analysis is $4.”
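Johnson’s figure follows from simple division, and the same arithmetic shows why scale matters. The sketch below uses the two numbers quoted above; the function name and the smaller 5,000-candidate program are hypothetical, added only for comparison, and the article does not say whether the 125,000 applicants are counted per year or per four-year cycle.

```python
def per_candidate_cost(total_cost: float, candidates: int) -> float:
    """Spread a fixed job-task-analysis budget evenly across candidates."""
    if candidates <= 0:
        raise ValueError("candidates must be positive")
    return total_cost / candidates


# Figures quoted in the article: $500,000 budget, 125,000 applicants.
cfa_cost = per_candidate_cost(500_000, 125_000)

# Hypothetical smaller program with the same budget (an assumption,
# not a figure from the article).
small_program_cost = per_candidate_cost(500_000, 5_000)

print(f"CFA program: ${cfa_cost:.2f} per candidate")
print(f"Hypothetical small program: ${small_program_cost:.2f} per candidate")
```

With the same fixed budget, the hypothetical smaller program would carry a per-candidate cost 25 times higher.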

Finally, high-stakes tests must be fair. Computerized delivery of standardized tests is often a necessary means to this end, offering broader distribution and tighter control over test items and content. Scoring is automated. Tests are less expensive to administer and can be distributed globally.

But designing a test that is valid, reliable and fair is often more easily achieved through less costly, knowledge-based or “low-order thinking skills” testing, as opposed to performance-based, or “high-order thinking skills” testing.

Psychologist Benjamin Bloom’s Taxonomy—one of the most commonly applied theoretical foundations of test development—ranks different levels of thinking skills. The hierarchical scale lists knowledge as the lowest level of thinking, but necessary to build high-level thinking skills such as comprehension, application, analysis, synthesis and evaluation. Test-takers and employers alike confirm that knowledge-based high-stakes tests can encourage low levels of thinking. Unfortunately, developing standardized, performance-based, high-stakes tests often compromises reliability, and these tests are expensive to design and deliver.

AIMR, a not-for-profit that spends zero dollars on marketing, leverages its economy of scale to produce three levels of the CFA exam, the second two offered once a year. Each exam tests different taxonomy hierarchies. Level one tests knowledge and comprehension; level two tests analysis and application; and level three, synthesis and integration.

Whenever the variable of subjective grading is introduced into the equation, it raises the issue of what test developers call “inter-rater reliability.” The more subjective the grading, the more difficult it becomes to maintain reliability.

The CFA exam addresses inter-rater reliability of subjective essay components of the level two and level three exams by employing 650 test graders, all of whom are charter-holders. All tests falling within the middle 50th percentile are re-graded by a staff of senior graders.
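One common way to put a number on inter-rater reliability is Cohen’s kappa, which compares how often two graders agree with how often they would agree by chance. The sketch below is illustrative only: the article does not say which statistic AIMR uses, the pass/fail labels are a simplification of essay scoring, and the grader data is invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Share of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented example: two graders marking the same eight essays.
grader_1 = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
grader_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "fail"]

kappa = cohens_kappa(grader_1, grader_2)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. Here the graders agree on six of eight essays, but half of that agreement would be expected by chance, so kappa comes out at 0.5.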

The American Institute of Certified Public Accountants (AICPA), which designs the CPA exam in conjunction with Prometric, has recently taken measures to move away from paper-and-pencil exams and create a uniform, computer-based exam. The end goal of the test: reflecting the demands of a changing economy. The exam will place greater emphasis on general business knowledge and information technology, as well as understanding of auditing concepts. For parts of the test, the format will change to include new, condensed case studies, referred to as “simulations.” According to AICPA President and CEO Barry C. Melancon, “The computerized CPA exam will better assess the skills that new CPAs must possess in order to carry out their essential charge: protection of the public interest.”

Another factor contributing to the value of a high-stakes test is how passing grades are designated. Do the test designers set a minimum passing score, or a minimum passing rate or percentage of test-takers? With the CFA exam, only one of five candidates who initiate the program actually achieves a charter. “Over time, our pass rates have gone down because candidate performance has declined,” said Johnson.
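The difference between the two approaches can be sketched in a few lines. Under a fixed passing score, the pass rate moves with candidate performance; under a fixed passing rate, the same share passes no matter how the group performs. The scores, the cut score of 70 and the 40 percent rate below are invented for illustration and do not describe any real exam.

```python
def pass_by_cut_score(scores, cut_score):
    """Criterion-style rule: pass everyone at or above a fixed score."""
    return [score >= cut_score for score in scores]


def pass_by_rate(scores, pass_rate):
    """Quota-style rule: pass roughly the top `pass_rate` share of the group.

    Candidates tied with the last passing score also pass.
    """
    n_pass = round(len(scores) * pass_rate)
    if n_pass == 0:
        return [False] * len(scores)
    threshold = sorted(scores, reverse=True)[n_pass - 1]
    return [score >= threshold for score in scores]


# Two invented cohorts: one strong, one weak.
strong_cohort = [82, 75, 91, 68, 88]
weak_cohort = [62, 55, 71, 48, 68]

for name, cohort in [("strong", strong_cohort), ("weak", weak_cohort)]:
    fixed_score = sum(pass_by_cut_score(cohort, 70))
    fixed_rate = sum(pass_by_rate(cohort, 0.4))
    print(f"{name}: {fixed_score} pass at cut score 70, {fixed_rate} pass at 40% rate")
```

With the fixed cut score, four of the strong cohort pass but only one of the weak cohort does. With the fixed rate, two pass in each cohort regardless of how well the group performed.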

By contrast, Johnson notes that programs with guaranteed pass rates and fees for continuing education and certification renewal invite scrutiny of their value. Similarly, how often the tests are offered can also translate into value, especially if they function as a levee to the job market. Some professions may offer certification exams only on a limited calendar basis in order to regulate the net number of new people who can practice in that particular profession.

According to a recent AIMR survey, investment professionals with 10 or more years of experience who hold a CFA charter earn an average of 21 percent more than their non-CFA-holding contemporaries. The same survey reports that the median total compensation of a CFA charter-holder ($140,000) exceeds that of a non-charter-holder ($99,140). The CFA exam, which is a three-year certification process, involves intense levels of both subjective and objective testing.
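The two survey figures measure different things, which a quick calculation makes plain. The sketch below computes the gap implied by the two medians quoted above; the function name is illustrative, and the comparison assumes both medians cover all respondents rather than only those with 10 or more years of experience.

```python
def percent_premium(higher: float, lower: float) -> float:
    """Percentage by which `higher` exceeds `lower`."""
    if lower <= 0:
        raise ValueError("lower must be positive")
    return (higher - lower) / lower * 100


# Median total compensation figures quoted in the article.
median_premium = percent_premium(140_000, 99_140)
print(f"Gap between medians: {median_premium:.1f} percent")
```

The gap between the medians works out to about 41 percent, roughly double the 21 percent average premium reported for professionals with 10 or more years of experience, so the two numbers should not be read as interchangeable.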

In the real estate industry, earning a real estate broker’s license, as opposed to maintaining the title of “agent,” can lead directly to an increase in salary. In 2000, the median annual earnings of salaried real estate agents, including commission, were $27,640, whereas licensed real estate brokers averaged $47,690, according to the Bureau of Labor Statistics. Intangible gains for license holders include work flexibility and a greater ability to set their own hours.

Much like other formal certification processes, obtaining a broker’s license involves rigorous training and commitment. Before obtaining test eligibility status, candidates require 60 to 90 hours of formal training and a minimal amount of experience selling real estate, usually one to three years. Tests themselves, usually issued by the state, are comprehensive examinations on real estate transactions and laws affecting sale of property.

In the vendor-driven IT industry, the sector’s ever-changing technology advancements and employment dynamics have created a shift in values. A recent study by Gartner and Prometric demonstrates that the top reason individuals pursued IT certification in 2002 was to enhance professional credibility (34 percent). This is a contrast to the previous studies, which consistently revealed the top reason to be compensation increase (25 percent in 2002).

For managers, the benefits of certification increasingly reside in the value it brings to the organization, as opposed to its use as a screening tool for potential job candidates.

“In the ’90s, technology companies were thriving,” said Michael Brannick, president and chief executive officer of Prometric. “IT workers sought certification to learn technology and enter a challenging new industry with higher salaries. The IT industry has changed in the 21st century, and IT professionals are expressing different motivations for pursuing certification. While the perceived value of certification remains high, those of us in the industry should heed these results carefully and work to ensure the preservation of certification’s position as a high-stakes/high-reward venture.”

In contrast to the CFA exam, which has been around for 40 years, IT certification is relatively new, with many programs starting within the past 10 to 15 years. In the IT industry, high-stakes testing is a way of measuring an individual’s product knowledge and skill set, while also providing a path for professional and technical development.

Some companies are requiring their IT professionals to move from being specialists to generalists. The reason is the decline in the IT market and the need for individuals to assume a broader scope of skill sets. One consequence is that learning executives are finding more value in cross-functional certifications, which often give a competitive advantage in the marketplace.

As mentioned, implementing performance testing in the IT industry has been an ongoing challenge because of changing technologies and test-development and implementation costs. For one, most IT tests are vendor-driven. Standardization, psychometric testing and simulator budgeting must come from within the organization itself. But there is an effort to combat the knowledge-driven format of IT high-stakes tests.

Tackling this issue is the Performance Testing Council, a coalition of IT certification directors focused on improving the correlation between certification benchmarks and true job ability for the IT industry. The group performs research, shares best practices and sets standards.

A recent case study illuminated the challenges in creating high-stakes performance tests further up Bloom’s Taxonomy. For one, the cost of developing a simulation, including field-testing, QA testing and localization, was around $250,000. But since the design of the simulation accommodated the development tools, the delivery systems and the data management tools, and allowed for initialization of custom data, the study revealed that the benefits outweighed the costs.

Vendor-neutral certifications enhance professional and technical credentials and demonstrate a dedication to pursuing continuing education in the profession. Vendor-neutral certifications, however, do not discern actual skill sets with specific vendors’ products.

As new forms of certification continue to proliferate, new questions will arise over their reliability, their validity and their ability to stand the test of time, the qualities that ultimately determine the value of a high-stakes test.

James DiIanni is the director of Oracle’s certification program. James has been involved with developing Oracle’s certification program since 1997 and managed the design and global implementation of Oracle’s first live application performance test—The Oracle Certified Master DBA Practicum. He can be reached at jdiianni@clomedia.com.

December 2003 Table of Contents