Saturday, August 12, 2023

Still A Few Bugs In The System: "ChatGPT's odds of getting code questions correct are worse than a coin flip"

From The Register, August 7:

But its suggestions are so annoyingly plausible

ChatGPT, OpenAI's fabulating chatbot, produces wrong answers to software programming questions more than half the time, according to a study from Purdue University. That said, the bot was convincing enough to fool a third of participants.

The Purdue team analyzed ChatGPT’s answers to 517 Stack Overflow questions to assess the correctness, consistency, comprehensiveness, and conciseness of ChatGPT’s answers. The US academics also conducted linguistic and sentiment analysis of the answers, and questioned a dozen volunteer participants on the results generated by the model.

"Our analysis shows that 52 percent of ChatGPT answers are incorrect and 77 percent are verbose," the team's paper concluded. "Nonetheless, ChatGPT answers are still preferred 39.34 percent of the time due to their comprehensiveness and well-articulated language style." Among the set of preferred ChatGPT answers, 77 percent were wrong.

OpenAI on the ChatGPT website acknowledges its software "may produce inaccurate information about people, places, or facts." We've asked the lab if it has any comment about the Purdue study....

....MUCH MORE

"An analyzing process must equally have been performed in order to furnish the Analytical Engine with the necessary operative data, and that herein may also lie a possible source of error. Granted that the actual mechanism is unerring in its processes, the cards may give it wrong orders."
—Ada Lovelace on inputting (programming) Babbage's Analytical Engine, 1843.

And:

....Famously, the very first instance of a computer "bug" was recorded at 3:45 pm (15:45) on the 9th of September 1947. This "bug" was an actual real-life moth, well, an ex-moth, that was extracted from the number 70 relay, Panel F, of the Harvard Mark II Aiken Relay Calculator.

This "bug" (which had a wingspan of two inches, or 5 cm) was preserved behind a piece of adhesive tape on the machine's logbook with the now immortalized phrase "[The] first actual case of a bug being found".

So the first "computer bug" was, in fact, a literal bug. 

The bug's appearance seems to have been down to members of the programming team's late-night shift, which included the pioneering computer scientist and former U.S. Navy Rear Admiral Grace Hopper. A team member left the windows of the room open at night. This was more than enough to let in the moth, which was attracted by the lights in the room and the heat of the calculator to nestle in the 'gubbins' of the Harvard Mark II, where it met its unfortunate end.

Here's the fascinating origin of the term "computer bug"
Source: U.S. Naval Historical Center/Wikimedia Commons

Moths and other insects tend to exhibit a behavior called transverse orientation. This is the way in which they tend to navigate by flying at relative angles to a distant light source.

For millions of years, this strategy served nocturnal insects well by allowing them to navigate by the light of the Moon. Of course, with the advent of electricity and artificial lighting, they often become confused. 

On the 9th of September 1947, Hopper traced an error on the Mark II to a dead moth that was trapped in a relay. The insect was carefully removed and taped to the logbook, and the term computer bug was used to describe the incident.

"This logbook, complete with attached moth, is part of the collection of the Smithsonian National Museum of American History, though it is not currently on display.

While it is certain that the Harvard Mark II operators did not coin the term 'bug', it has been suggested that the incident contributed to the widespread use and acceptance of the term within the computer software lexicon." - Graham Cluley

Henceforth the term "bug" entered more general use as a way to describe any errors or glitches in a program. 

However, as Hopper often mentioned, she neither coined the phrase nor found the insect in question herself. That was down to the other engineers on the team....

Both vignettes via Interesting Engineering, and both women are honored with namesake NVIDIA chips.

Our headline is an homage to an homage to an homage. If interested, see 2017's "Still a Few Bugs In the System: 'DeepMind Shows AI Has Trouble Seeing Homer Simpson's Actions'".