From Socrates and Plato, to master and apprentice, to brilliant professor and a small circle of students, the next step in higher education could be machine and human. Harvard and MIT are behind the technology that will make it possible, and Stanford is an early adopter.
An article in today’s New York Times described an artificial intelligence program, developed by EdX, a nonprofit founded by Harvard and MIT, that will grade college students’ essays instantly and let students rewrite them to earn a better score. The benefits? Professors are freed up to do other things and students “learn much better with instant feedback,” according to EdX president Dr. Anant Agarwal.
Daphne Koller, founder of an organization that makes a similar system, echoes the sentiment: “It allows students to get immediate feedback on their work, so that learning turns into a game, with students naturally gravitating toward resubmitting the work until they get it right.”
Until they get it right...according to the computer. The problem, of course, is that good writing can never be reduced to a formula. It should be heart-wrenching, eye-opening, condemning, redeeming, transforming. Electrical engineers like Agarwal and computer scientists like Koller – and worse yet, their mechanical spawn – ought not to be the ones teaching our future scholars how to write essays, think critically and creatively, and participate in public discourse.
I prefer to entrust writing instruction to people like Les Perelman, a retired director of writing and current researcher at MIT, who founded a group called Professionals Against Machine Scoring of Student Essays in High-Stakes Assessment. (Thankfully, the critics of this system also come from well-respected universities.) The group’s petition against the use of automated essay grading systems convincingly lays out some of the problems:
Computers cannot “read.” They cannot measure the essentials of effective written communication: accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity, among others. Independent and industry studies show that by its nature computerized essay rating is
- trivial, rating essays only on surface features such as word size, topic vocabulary, and essay length
- reductive, handling extended prose written only at a grade-school level
- inaccurate, missing much error in student writing and finding much error where it does not exist
- undiagnostic, correlating hardly at all with subsequent writing performance
- unfair, discriminating against minority groups and second-language writers
- secretive, with testing companies blocking independent research into their products
While I wholeheartedly agree with this criticism of automated grading systems and believe Perelman makes some crucial arguments, I think he misses a key point. Not only can machines not read or be convinced intellectually; they also cannot feel. Essays are meant to get into the heads and the hearts of their readers. The beauty and the power of written discourse is that it is not black and white, right or wrong. (This is not, of course, to say that there is no wrong way to write an essay. There is. But there are also an infinite number of ways to write a “correct,” or rather an effective, essay.)
These programs are designed to grade essays the way the professor would. The professor first grades 100 essays by hand and enters them into the system, which then “learns” to grade in the professor’s style. It’s bad enough when students learn to write in a particular style in order to please a professor; it will be even worse when students learn to write in order to please a mechanized, programmed version of that professor. No matter how stuck in his ways the professor may be, he is still human – and therefore, a brilliant and creative essay still has the power to make him set aside his coffee mug, pull off his spectacles, and say, “Hot damn! I never thought of it that way.”
The mechanized version of the professor will look at word choice, sentence construction, and essay structure and determine that there was nothing special about this essay. There is no room for surprise, for illumination, for inspiration.
There is no room for genius.
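To make the training process described above concrete, here is a minimal Python sketch. It is not EdX's actual method, which the article does not spell out; it assumes a toy grader that sees only one surface feature, word count, and fits a straight line to a handful of made-up professor grades. Every name and data point here (`fit_grader`, `predict`, the three training essays) is hypothetical.

```python
def surface_features(essay: str) -> float:
    """Return the only thing this toy grader 'sees': the word count."""
    return float(len(essay.split()))


def fit_grader(graded_essays):
    """Fit grade = slope * word_count + intercept by ordinary least squares.

    graded_essays: list of (essay_text, professor_grade) pairs.
    """
    xs = [surface_features(text) for text, _ in graded_essays]
    ys = [float(grade) for _, grade in graded_essays]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("training essays must differ in length")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, essay: str) -> float:
    """Score a new essay using nothing but its word count."""
    slope, intercept = model
    return slope * surface_features(essay) + intercept


# Hypothetical stand-in for the essays a professor grades by hand.
training = [
    ("Short and thin.", 60),
    ("A somewhat longer essay that develops one idea.", 75),
    ("A long essay that develops several ideas with evidence and care throughout.", 90),
]
model = fit_grader(training)

padded = " ".join(["word"] * 12)              # twelve words, no meaning
concise = "I never thought of it that way."   # seven words, one real thought

print(predict(model, padded) > predict(model, concise))
```

In this toy setup the padded string outscores the sentence that actually says something, because length is the only signal the model has. A real system would use many more features, but each of them is still something that can be counted.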
To be sure, genius is a romantic notion. Probably 90% of college students at some point believe themselves to be geniuses, and only 0.009% actually are. But that’s not the point. The point is that education – particularly higher education – should be encouraging moments of genius, flashes of creativity, and true discourse with fellow scholars.
The danger with automatic grading systems is not just that they are unfair to individual students and can negatively impact their academic and career futures. The bigger danger is on a societal level. Instead of educating a generation of independent thinkers, we are training students to approach writing, thinking, and speaking as they would approach a tactical problem with a clear solution: Follow these defined steps, check the correct boxes, and you will be given a gold star.
In a world where party lines, buzzwords, and sound bites are failing to provide solutions, we do not need more check-the-box citizens. We need people who are capable of thinking about issues from multiple angles, feeling deeply the cultural and narrative undercurrents of these issues, and articulating the nuance of their ideas and observations. We need people who know what it means to participate and persuade. We need people who write for people, not for machines.