Friday, December 9, 2022

"OpenAI’s GPT chat bot enters the uncanny valley"

Here we have former Columbia Uni. research scientist Bob Carpenter using ChatGPT to poke a bit of fun at the proprietor of the SMCISS blog. Among other positions, Carpenter was professor of computational linguistics at Carnegie Mellon University.

From Statistical Modeling, Causal Inference, and Social Science, December 6:

The baseline: Shannon (1948)

The first attempt at computer generation of human language of which I’m aware was by Claude Shannon in his seminal 1948 paper, A Mathematical Theory of Communication. Not only did this paper introduce the entire field of information theory, it also applied a notion of posterior predictive checking to language models. That is, it could generate novel language based on bigram statistics (predicting conditional probability of a word given the previous word). Here’s the example from Shannon’s paper:

THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.

Computing was pretty much all caps until the 1980s.
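
For readers who want to see the bigram idea in action, here is a minimal Python sketch of it: count which words follow which, then repeatedly draw the next word given only the previous one. The tiny corpus and the function names `build_bigram_counts` and `generate` are made up for illustration; this is not Shannon's procedure or data, just one way the technique could be coded.

```python
import random
from collections import Counter, defaultdict


def build_bigram_counts(words):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, length, rng):
    """Sample a word sequence; each word depends only on the previous word."""
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:  # dead end: this word was never followed by anything
            break
        choices, weights = zip(*followers.items())
        # Draw the next word in proportion to its observed bigram count.
        out.append(rng.choices(choices, weights=weights, k=1)[0])
    return out


# Hypothetical toy corpus (an assumption for this sketch), in all caps.
corpus = (
    "the head of the class told the writer that the problem "
    "of the letters is another problem for the writer"
).upper().split()

counts = build_bigram_counts(corpus)
sample = generate(counts, "THE", 12, random.Random(0))
print(" ".join(sample))
```

Every adjacent pair of words in the output appears somewhere in the corpus, which is why the result is locally plausible and globally meaningless, much like the 1948 example.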

The state of the art: OpenAI GPT-3 (2022)

Fast forward 75 years. I’ve recently been hearing a lot about GPT-3, the language model developed by OpenAI. So I signed up for the OpenAI ChatGPT API and then went to their playground to see what ChatGPT could do.

Now I’m getting that same uneasy feeling I felt watching the lifelike movement and reactions of the Boston Dynamics robots. It’s the uncanny valley effect of watching something that’s almost, but not quite fully lifelike.

Here are the first prompts I gave it (bold) and what it responded. I chose the prompts knowing I’d be posting on the blog. Like all the other spell checkers, it broke “poststratification” into two words itself.

Write an introduction for Andrew Gelman giving a seminar on post stratification to R users.

Good morning everyone. It’s a pleasure to be here today to discuss post stratification with all of you R users. My name is Andrew Gelman and I am a statistician, professor of political science and statistics at Columbia University, and co-director of the Applied Statistics Center. I am here to talk to you about post stratification, a widely used statistical technique. Post stratification is used to adjust survey estimates when they are known to be biased due to the sampling design or nonresponse. In this seminar, I will provide an overview of post stratification, discuss how to perform the calculations in R, and provide examples of how post stratification can be applied. I hope that by the end of this seminar, you will have a better understanding of post stratification and be able to use it to improve the accuracy and reliability of your survey estimates.

I had intended to get an introduction of Andrew, not an introduction by Andrew, but prepositions like “for” are ridiculously overloaded in human languages. That is, I was expecting something like, “Professor Gelman needs no introduction,” or maybe, “Andrew is currently a statistics professor at Columbia University in the City of New York. Many of you may know him through his book, BDA3 or his latest stories about regression.”....

....MUCH MORE

When Carpenter isn't yucking it up with chatbots (and noting their limitations) he posts stuff like "BridgeStan: Stan model log densities, gradients, Hessians, and transforms in R, Python, and Julia" at SMCISS.

If interested here are some prior visits with Professor Gelman:

“Using Benford’s Law to Detect Bitcoin Manipulation”

Professor Gelman Is Not Impressed By The "Nudge" People 

Andrew Gelman is Professor of statistics and political science at Columbia Uni., the guy who tells the other social scientists how to get their numbers right so they can at least give the appearance of being a science. He has a very tart tongue which, combined with a high-level intellect, is fun to watch taking on sacred cows and shibboleths. As long as you aren't the target of said intellect and/or sharp tongue.

Here he is looking at Cass Sunstein as Sunstein's new book rolls out....
And May 30, 2020: 
“So the real scandal is: Why did anyone ever listen to this guy?”

An old favorite from 2018:
"Big Oregano Strikes Again"
I can't wait until he gets around to some of the spurious correlations propagated by the International Parsley Cartel....

And many more; use the 'search blog' box, upper left, if interested.

For more on Claude Shannon we have a couple dozen posts:

The subject of this article, Claude Shannon, has a couple interesting connections to finance/investing/trading beyond 'just' creating information theory (along with MIT's Norbert Wiener who was coming in on a different angle of attack), more after the jump....

And finally on the Uncanny Valley, you know what it is supposed to be, and it gets really close but just misses:

Seinfeld, Virtual Reality and Mild Revulsion
The Uncanny Valley, Interior-Design Edition

https://cdn.theatlantic.com/thumbor/bUXDC-RbmcAswgl-vTXxGNJAAiw=/0x4:2358x1330/1952x1098/media/img/mt/2014/03/render_1/original.png

We’ve become pretty good at making CGI look like it’s almost real, but the real challenge is making it seem like it’s normal.

Japanese artists Teruyuki Ishikawa & Yuka Ishikawa — otherwise known as Telyuka — started a project in 2015 to create an extremely realistic computer-generated schoolgirl. Her name is Saya, and she has been improved on since then.

This is the 2016 version:
http://cdn1.tnwcdn.com/wp-content/blogs.dir/1/files/2016/09/big1.png
And these are some pictures of the 2015 version (again, click to enlarge):....MORE


And here's FT editor David Keohane's entry.