Saturday, May 9, 2026

"AI in fraud detection and compliance"

False positives and other challenges plus a bonus look at the Caravaggio we saw in "Are Prediction Markets Good for Anything?". 

From Arpit Gupta's Arpitrage substack, Apr 05:

Finding Needles in Haystacks 

This is the fifth week of my course summaries from teaching AI in Finance at NYU Stern (see lecture slides here; and last week’s summary here). This week focuses on fraud detection and compliance, which are two critical growth areas for AI applications.

The Base Rate Problem

Caravaggio’s The Cardsharps

We start off this week with Caravaggio’s The Cardsharps, in which a young man is cheated at cards: an accomplice peeks at his hand and signals to a partner who has a few cards tucked behind his belt. I like this painting because it features all the everlasting elements of fraud: asymmetric information, coordination among bad actors, and the innocent mark. Technology has since drastically changed the scale and ease of execution of fraud and financial crime. Consumers report $12.5 billion in fraud to the FTC; global reported payment card fraud is maybe $30 billion a year; and global compliance costs for financial crimes are about $60 billion a year.

The core technical problem in fraud detection is the base rate problem. Despite the absolute magnitude of fraud, it’s quite rare at the level of individual transactions: say, fewer than 0.1% are fraudulent.

This means that the most “accurate” fraud detection algorithm is always going to be the classifier which tags all transactions as legitimate. It’s going to be 99.9%+ accurate! But this would be a terrible algorithm to deploy because failing to tag anything as fraudulent will be an open invitation for more crimes.
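To make the arithmetic concrete, here is a minimal sketch in Python. The portfolio size and the exact 0.1% fraud rate are assumptions chosen for illustration (the text only says "less than 0.1%"), and `confusion_counts` is a hypothetical helper, not something from the lecture.

```python
def confusion_counts(labels, predictions):
    """Return (tp, fp, tn, fn) for binary labels where 1 = fraud, 0 = legitimate."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, fp, tn, fn


# Assumed portfolio: 100,000 transactions, of which 0.1% are fraudulent.
n_transactions = 100_000
n_fraud = 100
labels = [1] * n_fraud + [0] * (n_transactions - n_fraud)

# The "always legitimate" classifier never flags anything.
predictions = [0] * n_transactions

tp, fp, tn, fn = confusion_counts(labels, predictions)
accuracy = (tp + tn) / n_transactions
recall = tp / (tp + fn)  # share of actual fraud that was caught

print(f"accuracy = {accuracy:.3%}")  # accuracy = 99.900%
print(f"recall   = {recall:.0%}")    # recall   = 0%
```

Accuracy looks excellent while recall is zero, which is why accuracy alone is a poor yardstick when the positive class is this rare.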

This is the fundamental tradeoff at the heart of fraud detection. Tightening the rules to capture more fraud (lowering false negatives, Type II errors) inevitably means flagging more legitimate transactions as fraudulent (raising false positives, Type I errors). These errors have complicated costs which vary by context. Customers really dislike false positives, so tagging more fraud may lead to more customer churn. But letting real fraud through can also lead to huge losses, regulatory penalties, and reputational costs.
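The tradeoff can be sketched by sweeping an alert threshold over model scores. Everything here is assumed for illustration: the scores are synthetic draws from two beta distributions, the sample sizes are made up, and `error_counts` is a hypothetical helper.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic scores in [0, 1]; fraud scores are assumed to skew higher.
legit_scores = [random.betavariate(2, 8) for _ in range(9_990)]
fraud_scores = [random.betavariate(6, 3) for _ in range(10)]


def error_counts(threshold):
    """Flag every score at or above the threshold; return (false positives, false negatives)."""
    false_positives = sum(s >= threshold for s in legit_scores)
    false_negatives = sum(s < threshold for s in fraud_scores)
    return false_positives, false_negatives


# Lowering the threshold is the code equivalent of "tightening the rules".
thresholds = [0.9, 0.7, 0.5, 0.3]
results = [(t, *error_counts(t)) for t in thresholds]

for t, fp, fn in results:
    print(f"threshold={t:.1f}  false positives={fp:5d}  false negatives={fn:2d}")
```

As the threshold falls, missed fraud can only go down and false alarms can only go up. Where to stop depends on the relative cost of each error, which the text notes varies by context.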

Beneish and Vorst have a nice paper illustrating this tradeoff using a range of fraud prediction models. Even the best models they consider have false positive rates in excess of 100:1, meaning that they tag 100 legitimate transactions as fraudulent for every one true fraud they capture. These tradeoffs are so bad they estimate it’s generally not cost effective to even adopt the fraud detection models.
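A short sketch of what a 100:1 ratio implies. The precision calculation follows directly from the ratio quoted above. The break-even function is a simplified cost framing of my own with a made-up review cost, not the model from the Beneish and Vorst paper.

```python
def precision_from_ratio(false_positives_per_true_positive):
    """Share of flags that are real fraud, given false positives per true positive."""
    return 1 / (1 + false_positives_per_true_positive)


def break_even_loss_per_fraud(false_positives_per_true_positive, cost_per_false_positive):
    """Avoided loss per caught fraud needed just to cover the false-alarm costs.

    Simplifying assumption: every false positive costs the same fixed amount
    and caught fraud has no other handling cost.
    """
    return false_positives_per_true_positive * cost_per_false_positive


precision = precision_from_ratio(100)
print(f"precision = {precision:.2%}")  # precision = 0.99%

# Hypothetical cost of 50 (in any currency) per false alarm.
needed = break_even_loss_per_fraud(100, cost_per_false_positive=50)
print(f"break-even avoided loss per caught fraud = {needed}")  # 5000
```

At 100:1, fewer than one flag in a hundred is real, so the model only pays off when each caught fraud avoids a loss at least a hundred times the cost of a false alarm.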

From Rules to Representation

The broader history of fraud detection is similar to other areas of AI deployment in finance, particularly risk management, which we discussed in week 3. Prior to the 1990s we typically had manual review, which was expensive and slow to scale. In the 1990s and 2000s we had rule-based systems, which were more scalable but rigid and easy to game. The 2010s brought supervised ML techniques, such as random forests, which used labeled data to let the model learn about fraud drivers in more adaptable ways.

The impact of AI here has been to move towards richer representations of transaction data in ways that allow for more sophisticated analysis of fraudulent patterns of behavior. Purda and Skillicorn, for instance, show how even a simple bag-of-words model applied to the management discussions of financial reports can distinguish fraudulent from truthful filings. This shows that deceptive language carries detectable statistical signatures in word patterns which can be carefully mined for detection....
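For readers unfamiliar with the term, a bag-of-words representation reduces a text to word counts and ignores word order. The sketch below shows only that representation step, using made-up sentences rather than real filings. It is not Purda and Skillicorn's pipeline, and `bag_of_words` is a hypothetical helper.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into alphabetic tokens, and count each token."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Made-up snippets for illustration, not taken from any actual filing.
doc_a = "We expect revenue growth and we expect margin growth."
doc_b = "Results were affected by one-time charges."

counts_a = bag_of_words(doc_a)
counts_b = bag_of_words(doc_b)

# A shared vocabulary turns each document into a fixed-length count vector
# that a downstream classifier could take as input.
vocabulary = sorted(set(counts_a) | set(counts_b))
vector_a = [counts_a[word] for word in vocabulary]
vector_b = [counts_b[word] for word in vocabulary]

print(counts_a.most_common(3))
```

A classifier trained on such vectors with fraud labels would then look for words whose frequencies differ between the two groups of filings.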

....MUCH MORE 

Previously with Professor Gupta:

"The End of Market Intelligence and the Last Analyst"

And on Caravaggio, we've actually seen "The Cardsharps" a few times, as well as other visits to the master:

October 2017 - Turns out there’s a bit of Caravaggio in artificial intelligence.

Which backlinks to some earlier posts:

"The Case of the Mafia and the Stolen Caravaggio"
Finance and Art: "When the Caravaggio No One Thinks Is a Caravaggio Becomes the Basis of an Asset Swap"
Questions America Is Asking: "Is there a €120m Caravaggio in your roof?"
Art: "Caravaggio and the Experts: Science v. Connoisseurs"
Banking: Ducats for Caravaggio, The Genoa Connection and Medici Moolah
UPDATE: "Villa Aurora: Rome property fails to sell for €471m at auction"
Following up on December's "Attention Art Fans: Large Caravaggio For Sale, Asking $552 Million".