From Literary Hub, April 8:
Over the weekend, The New York Times published a long article on how tech companies are trawling and stealing to gather vast amounts of data to build their generative programs. Companies like Google, Meta, and OpenAI are chasing increasingly large amounts of information, especially “high-quality information, such as published books and articles, which have been carefully written and edited by professionals.” Meta considered getting that good data by “paying $10 a book for the full licensing rights to new titles,” or by outright “buying Simon & Schuster.” That Meta was even in a position to consider hoovering up one of the Big Five publishers is a grim indication of the immense power these companies have acquired.
All of this is in the service of what? So a boss can finally fire their writers and replace them with D.0n DrAIper? So you can generate a one-off movie based on your whims to watch alone and share with no one? So you can summarize the entirety of your local library and rapidly microdose the beauty of literature while on the toilet?
It’s so miserable to realize that I’m going to be thinking about AI for the rest of my life.
Another surprising detail was that these companies are now turning to “synthetic data” to train AIs: the snake reading its own tail. This seems like it’s already underway, as 404 Media documented cases of Google indexing books sluiced out of AI chum with titles like “Maximize Your Twitter Presence: 101 Strategies for Marketing Success” and “Bears, Bulls, and Wolves: Stock Trading for the Twenty-Year-Old”....
....MUCH MORE