Electronics have become a part of our daily lives; we can hardly live without them. Electronics open and close garage doors remotely. Refrigerators have been designed with interior cameras that let us peek inside from anywhere via our cell phones to determine whether we need to stop at the grocery store on our way home from work. The most recent refrigerator innovations even allow us to order a pizza simply by touching the screen on these novel appliances we used to call refrigerators. In fact, these appliances have become known as “hubs” rather than refrigerators, since they no longer just keep food at the right temperature. Today’s “hubs” allow you to view a calendar, listen to your favorite music, post photos and notes, or even mirror a TV program from a separate room.
When there is a problem with our vehicles, the mechanic hooks them up to a computer for a diagnosis. Self-driving vehicles are being tested on the road. A roofing company in central Ohio uses a drone to give you an estimate for your roofing repairs. Touch-screen laptops, TVs, and electronic whiteboards have been fundamental hardware in classrooms for years.
It’s no secret that standardized tests are scored by computers. A multiple-choice answer is checked against the key and scored in a fraction of a second. That is what makes computer grading, or robo-grading, appealing and efficient. Now, however, computers are being used to score lengthy essay questions. Ohio and Nevada have embraced this idea for several years, but other states remain skeptical.
Proponents of robo-grading point out that computers are already performing many tasks for us through our cell phones and self-driving cars; consequently, computer grading, or robo-grading, of essays is the logical next step. Peter Foltz, PhD, a research professor at the University of Colorado at Boulder, has been studying the use of artificial intelligence (AI) in scoring student essays for the past 25 years. According to Foltz, who has also been employed for the past 14 years by Pearson, an educational services company whose software robo-grades student essays, the time has come for AI scoring of student essays.
Developers insist that computers are already performing complicated tasks such as detecting cancer and carrying on conversations, so it seems natural that the next step should include robo-grading. Computers are “taught” what is considered good writing by analyzing essays that have been graded by humans. The computer programs then scan new essays for the same features that humans look for in good essays: spelling and grammar, cohesiveness, logical conclusions, and complex words and sentences, among others.
The robo-grader determines an overall score based on a compilation of features and then breaks that score down to specific areas such as spelling and grammar.
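The pipeline described above can be sketched in a few lines of Python. To be clear, the feature names and weights here are illustrative stand-ins, not Pearson’s actual model; they only show the shape of the computation:

```python
import re

def extract_features(essay: str) -> dict:
    """Compute simple surface features of the kind robo-graders scan for."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words)
    # Crude proxy for "complex words": seven letters or longer.
    long_words = sum(1 for w in words if len(w) >= 7)
    return {
        "word_count": n_words,
        "avg_sentence_length": n_words / max(len(sentences), 1),
        "long_word_ratio": long_words / max(n_words, 1),
    }

def overall_score(essay: str, weights: dict) -> float:
    """Weighted sum of features. In a real system the weights would be
    fit against a corpus of human-graded essays, not chosen by hand."""
    feats = extract_features(essay)
    return sum(weights[name] * feats[name] for name in weights)
```

An overall score built this way can then be broken back down by feature, which is how a report can attribute points to, say, spelling versus sentence complexity.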
Automated grading saves schools money and allows teachers to receive results in minutes rather than months; however, many English teachers remain unconvinced of its overall accuracy. How, they ask, can an art form of expression be measured by a mathematical algorithm? Others doubt that computers can grade original ideas or the creative expression of individuals. These same English teachers fear that robo-grading will encourage uninspired writing that merely follows the formula of a computer program to obtain a high score. That fear is not far-fetched, according to Les Perelman, PhD, a research affiliate at the Massachusetts Institute of Technology (MIT) who taught writing and composition for more than two decades while serving as Director of Writing Across the Curriculum at MIT.
Perelman set out to challenge essay robo-grading by collaborating with students from MIT and Harvard to develop the BABEL (Basic Automatic B.S. Essay Language) Generator. Through this experiment, Perelman substantiated that “long pretentious incoherent essays could receive higher scores than well written essays.” The BABEL Generator produced essays that made no sense; nonetheless, they received high scores because of their length (500 words) and questionable multisyllable language.
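Perelman’s trick can be illustrated with a toy sketch. This is not the actual BABEL Generator, whose word lists and templates are its own; it simply shows how stitching polysyllabic words into grammatical-looking templates yields fluent-sounding nonsense that surface features like length and vocabulary reward:

```python
import random

# Illustrative jargon vocabulary and sentence templates (hypothetical,
# not taken from BABEL itself).
JARGON = ["paradigm", "epistemology", "juxtaposition", "verisimilitude",
          "obfuscation", "perspicacity", "dialectic", "profundity"]
TEMPLATES = [
    "The {a} of {b} invariably reifies the {c} of {d}.",
    "Without {a}, the {b} cannot transcend its own {c} or {d}.",
]

def babble(n_sentences: int, seed: int = 0) -> str:
    """Generate grammatical-looking but meaningless prose."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_sentences):
        template = rng.choice(TEMPLATES)
        a, b, c, d = rng.sample(JARGON, 4)
        out.append(template.format(a=a, b=b, c=c, d=d))
    return " ".join(out)
```

A grader that rewards long words and complex sentence structure, without modeling meaning, has no way to tell such output from genuine argument.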
Proponents of automated grading continue to point to the low cost, the quick turnaround, and the lack of bias a machine can provide. Critics, on the other hand, argue that computers don’t understand the meaning of words, so students can fool the algorithms by stringing together multisyllable words, complex sentences, and key phrases that make no sense at all, yet still achieve a very high score.
As an educator and/or tax-paying citizen, where do you stand on essay robo-grading?
Wanda Dengel, B.S., M.A.T., is a long time local and Columbus inner-city schools teacher who served on the Diocesan Catholic Schools Advisory Commission in Columbus, Ohio. She can be reached at firstname.lastname@example.org