The advent and spread of machine-readable text in the early part of this century opened new vistas for those "Too lazy to work, too nervous to steal."
And although it is not the focus of this article, creative readers who are denizens of the C-suite can probably glean some insight into how to gun the stock during the earnings call by using certain words and phrases, vocal intonations and wink-and-a-nod speech patterns, while maintaining plausible deniability.
From Columbia Law School's CLS Blue Sky blog, September 1, 2020:
The annual report, like other regulatory filings, is more than a legal requirement; it provides an opportunity for public companies to communicate their financial health, promote their culture and brand, and engage with a full spectrum of stakeholders. How readers process all this information affects their perception of, and hence participation in, the business in significant ways. More and more companies are realizing that the target audience for disclosures is no longer just human analysts and investors, but also robots and algorithms that recommend what shares to buy and sell after processing information with machine learning tools and natural language processing kits.
This development was probably inevitable, given technological progress and the sheer volume of disclosure materials. In any event, companies that wish to communicate and engage with stakeholders need to adjust how they talk about their finances and brands and make forecasts in the age of AI. That means heeding the logic and techniques underlying the language- and sentiment-analysis facilitated by large-scale machine-learning computation. An example of that sort of computation is a process that identifies positive, negative, and neutral opinions in, say, all disclosures by a company, a task that is beyond the processing ability of human brains. While the literature is catching up to and guiding investors’ use of machine learning and computational tools to extract qualitative information from disclosure and news, there has been no analysis of the feedback effect: how companies adjust the way they talk while knowing that machines are listening. Our new paper fills this void.
We start with a diagnostic test that connects the expected extent of AI readership for a company’s SEC filings on EDGAR (measured by Machine Downloads) with how machine-friendly its disclosure is (measured by Machine Readability). The first variable, Machine Downloads, is constructed with historical information by tracking IP addresses that conduct downloads in batches. We deem Machine Downloads a proxy for AI readership, both because a machine request is a necessary condition for machine reading, and because the sheer volume of machine downloads makes it unlikely that human readers alone can process them. The second variable builds on the five elements identified by recent literature as affecting the ease with which a machine can parse, script, and synthesize.
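As a rough illustration of the batch-download idea described above, the sketch below flags an IP address as a "machine" when it requests many distinct filings on the same day, then counts downloads per filing from flagged IPs. The excerpt does not give the paper's actual classification rule, so the threshold, the log layout, and the function name `count_machine_downloads` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical cutoff: an IP requesting more than this many distinct filings
# in one day is treated as a machine. The paper's real rule is not given here.
BATCH_THRESHOLD = 50


def count_machine_downloads(log_rows, threshold=BATCH_THRESHOLD):
    """Count downloads per filing that come from batch-downloading IPs.

    log_rows: iterable of (ip, date, filing_id) tuples (assumed log layout).
    Returns a dict mapping filing_id -> number of machine downloads.
    """
    rows = list(log_rows)  # materialize, since the rows are scanned twice

    # Step 1: collect the distinct filings each IP requested on each day.
    filings_by_ip_day = defaultdict(set)
    for ip, date, filing_id in rows:
        filings_by_ip_day[(ip, date)].add(filing_id)

    # Step 2: flag (ip, date) pairs whose distinct-filing count exceeds the cutoff.
    machine_keys = {
        key for key, filings in filings_by_ip_day.items() if len(filings) > threshold
    }

    # Step 3: count only the requests made by flagged pairs.
    machine_downloads = defaultdict(int)
    for ip, date, filing_id in rows:
        if (ip, date) in machine_keys:
            machine_downloads[filing_id] += 1
    return dict(machine_downloads)
```

Requests from IPs that stay under the cutoff are simply left out of the count, which mirrors the excerpt's distinction between machine downloads and other (non-machine) downloads.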
We show that, in the cross-section of filings, a one standard deviation change in expected machine downloads is associated with a 0.24 standard deviation increase in the Machine Readability of the filing. On the other hand, other (non-machine) downloads do not bear any meaningful correlation with machine readability, validating Machine Downloads as a proxy for machine readership. We further validate that Machine Downloads and Machine Readability are reasonable proxies (for the presence of machine readership and the ease for machines to process) by showing that trades in a company’s shares happen more quickly after a filing becomes public when Machine Downloads is higher, with an even stronger interactive effect with Machine Readability. Such a result also demonstrates the real impact of machine processing on information dissemination.
After establishing a positive association between a high AI reader base and more machine-friendly disclosure documents, we further explore how firms manage “sentiment” and “tone” perceived by machines. It is well-documented that corporate disclosures attempt to strike the right tone with (human) readers by conveying positive sentiments and favorable tones without being explicitly dishonest or noncompliant. Hence, we expect a similar strategy tailored to machine readers. While researchers and practitioners had long relied on the Harvard Psychosociological Dictionary to construct “sentiment” as perceived by (mostly human) readers by counting and contrasting “positive” and “negative” words, the publication of Loughran and McDonald (2011) in the Journal of Finance (“LM” hereafter) presents an instrumental event to test our hypothesis pertaining to machine readers. This is because not only did Loughran and McDonald (2011) present a new, specialized finance dictionary of positive/negative words and words that are informative about liability and uncertainty, but the word lists that came with the paper have also served as a leading lexicon for algorithms to sort out sentiments in both industry and academia.
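The "counting and contrasting" approach described above can be sketched in a few lines. This is a minimal example under stated assumptions: the two word lists are tiny toy stand-ins rather than the actual LM or Harvard lists, and the scoring formula (positive count minus negative count, divided by total words) is one common convention, not necessarily the one the paper uses.

```python
import re

# Toy word lists for illustration only; NOT the actual dictionaries.
POSITIVE = {"improve", "strong", "gain"}
NEGATIVE = {"loss", "decline", "impairment"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive count - negative count) / total words.

    Returns 0.0 for text with no word tokens. The formula is an assumed
    convention chosen for this sketch.
    """
    words = tokenize(text)
    if not words:
        return 0.0
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    return (pos - neg) / len(words)
```

Because the score depends only on which words appear in the lists, swapping a listed negative word for an unlisted synonym raises the score without changing the underlying message, which is the kind of adjustment the excerpt goes on to examine.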
As a first step, we establish that firms which expect many machine downloads avoid LM-negative words but only after 2011 (the year of publication of the LM dictionary). Such a structural change is absent with respect to words deemed negative by the Harvard Dictionary, which was known to human readers for many years. As a result, the difference, LM – Harvard Sentiment, follows the same path as the LM Sentiment, suggesting that the change in disclosure style is indeed driven by the publication of the LM dictionary.
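To make the "LM – Harvard" comparison concrete, the sketch below scores the same text against two negative-word lists and takes the difference. Everything here is an assumption for illustration: the lists are toy stand-ins for the two dictionaries, the measure is the share of negative words, and the sign convention (finance-list share minus general-list share) may differ from the paper's definition.

```python
import re

# Toy stand-ins for the two lexicons; NOT the real word lists.
FINANCE_NEGATIVE = {"loss", "impairment", "litigation"}
GENERAL_NEGATIVE = {"loss", "liability", "tax", "cost"}


def negative_share(text, lexicon):
    """Fraction of word tokens in the text that appear in the lexicon."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(w in lexicon for w in words) / len(words)


def negative_share_gap(text):
    """Finance-list negative share minus general-list negative share.

    The sign convention is an assumption made for this sketch.
    """
    return negative_share(text, FINANCE_NEGATIVE) - negative_share(text, GENERAL_NEGATIVE)
```

A firm that trims only the words on the finance list lowers the first term while leaving the second untouched, so the gap moves together with the finance-list score. That is the pattern the excerpt attributes to the period after 2011.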
Loughran and McDonald (2011) developed multiple additional dictionaries of “tone” words aimed at capturing a richer set of annotations of a financial document, including dictionaries of litigious, uncertain, weak modal, and strong modal words. The authors show that the prevalence of words in each category predicts firm outcomes such as legal liability and reaction from the capital markets. We find that firms with higher expected machine readership became more averse to words from these dictionaries following the Loughran and McDonald (2011) publication. The combined results suggest that managers revise their corporate disclosure in consideration of the multi-dimensional effects of their words in the eyes of machines.
While our analyses thus far focus on textual information, the application of the underlying theme (i.e., “how to talk when a machine is listening”) to the speech setting serves as a test beyond the textual setting. Earlier work found that managers’ vocal expressions can convey incremental information valuable to analysts covering the firm...