Bots on trial: testing the promise of the unreal

A veteran journalist puts the new artificial intelligence chat software through its paces.

Since debuting in November, the artificial intelligence tool ChatGPT has been celebrated for acing the U.S. Medical Licensing Examination and producing such high-quality homework that the Philadelphia School District has decided to ban it.

The bot can write music and debug code. It can even pen a morning-after ode to the Philly sports fan:

In the heart of Philadelphia, stands a sports fan so true,
With a love for their teams that shines bright as the dew.
They’ve been there through the losses, and celebrated the wins,
With a spirit unbreakable, and a heart full of sin.

Walt Whitman’s position may be safe. But what about ours?

“Dire Media Threat,” the headline in my morning email from Axios trumpeted the other day. The bot’s ability to sift through vast storehouses of information and produce lightning-fast answers in plain language, Axios suggested, threatens to disrupt the traffic that traditional news sources get from search engines.

I thought I’d test ChatGPT, which Microsoft is planning to add to its search engine Bing, with a subject I know something about.

Back in the mid-1980s, I wrote more than 70 articles about a horrific air disaster. An Arrow Air jet crashed shortly after takeoff from Gander, Newfoundland, killing everyone on board. The passengers were peacekeepers returning from six months in the Sinai and headed for Fort Campbell, Ky. For more than a year and a half, my reporting partner and I at the Louisville Courier-Journal dug into the disaster and its aftermath.

Why did the Arrow Air jet crash? I asked ChatGPT, which stands for Chat Generative Pre-trained Transformer.

Its answer kept unspooling, a prompt and steady flow of factually correct findings about icing and pilot error, and then information about potential mechanical failures. I kept probing — who were the passengers? Where were they headed? When the bot replied that their destination was unclear, I pushed back, asking if any were going to Kentucky, and after a pause, the answer came back: Yes.

Satisfied, and a little worried for Old Media, I asked a final question: Which newspaper won the George Polk Award for its articles? ChatGPT barely hesitated:

“The Knoxville News Sentinel won a George Polk Award for its coverage of the Arrow Air crash in Gander, Newfoundland on December 12, 1985, and its aftermath.”

My mother would attest that this isn’t true at all.

It was the Louisville Courier-Journal, I typed.

And the robot’s response showed me the promise of the unreal; ChatGPT is imperfect, yes, but willing to improve.

“Yes, you are correct. The Louisville Courier-Journal won a George Polk Award for its coverage of the Arrow Air crash in Gander, Newfoundland on December 12, 1985, and its aftermath. I apologize for the mistake in my previous answer.”

Prompt, responsive, appreciative.

Now there’s something we could learn from the competition.