From the London Review of Books, February 5 edition:
Hyperion is the name that Meta has chosen for a huge AI data centre it is building in Louisiana. In July, a striking image circulated on social media of Hyperion’s footprint superimposed on an aerial view of Manhattan. It covered a huge expanse of the island, from the East River to the Hudson, from Soho to the uptown edge of Central Park. I assumed that the image had been made by someone with misgivings about the sheer scale of the thing, but it turned out to have been posted to Threads by Mark Zuckerberg himself. He was proud of it.
The imperative to increase scale is deeply embedded in the culture of AI. This is partly because of the way the field developed. For decades, the neural networks that are the basis of much AI today – including large language models of the kind that underpin ChatGPT – were considered less promising than ‘symbolic AI’ systems, which apply rules, roughly akin to those of symbolic logic, to systematic bodies of knowledge, often knowledge elicited from human experts. The proponents of neural networks took a different path. They believed that the loose similarity of those networks to the brain’s interconnected neurons and their capacity to learn from examples (rather than simply to apply pre-formulated rules) made them potentially superior as a route to artificial intelligence. Many of their colleagues thought they were wrong, especially Marvin Minsky, MIT’s leading AI expert. There was also the fact that early neural networks tended not to work as well as more mainstream machine-learning techniques.
‘Other methods ... worked a little bit better,’ Geoffrey Hinton, a leading proponent of neural networks, told Wired magazine in 2019. ‘We thought it was ... because we didn’t have quite the right algorithms.’ But as things turned out, ‘it was mainly a question of scale’: early neural networks just weren’t big enough. Hinton’s former PhD student Ilya Sutskever, a co-founder of OpenAI (the start-up that developed ChatGPT), agrees. ‘For the longest time’, he said in 2020, people thought neural networks ‘can’t do anything, but then you give them lots of compute’ – the capacity to perform very large numbers of computations – ‘and suddenly they start to do things.’

Building bigger neural networks wasn’t easy. They learn by making predictions, or guesses. What is the next word in this sentence? Is this image a cat? They then automatically adjust their parameters according to what the word actually turns out to be or whether a human being agrees that the image is indeed a cat. That process of learning requires vast quantities of data – very large bodies of digitally available text, lots of images labelled by human beings and so on – and huge amounts of ‘compute’. Even as late as the early 2000s, AI faced limitations in these respects. A crucial aspect of the necessary computation is the multiplication of large matrices (arrays of numbers). If you do the component operations in those multiplications one after another, even on a fast conventional computer system, it’s going to take a long time, perhaps too long to train a big neural network successfully.
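The learning mechanism described here – guess, check against the right answer, adjust – can be made concrete with a minimal sketch in Python. The toy ‘network’ below has a single parameter, and the data, learning rate and target function are all invented for illustration; it is not anyone’s actual system, only the shape of the loop that real models run with billions of parameters at once.

```python
# Toy illustration of learning by prediction and adjustment.
# The "network" is a single parameter w; after every guess it is
# nudged in the direction that would have made the guess less wrong.

examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # inputs and the answers that turn out to be right
w = 0.0              # the one parameter of our toy network
learning_rate = 0.1

for epoch in range(50):
    for x, target in examples:
        guess = w * x                    # the network's prediction
        error = guess - target           # how wrong the guess was
        w -= learning_rate * error * x   # adjust the parameter to shrink the error

print(f"learned parameter: {w:.3f}")     # settles near 2.0, the rule hidden in the examples
```

Run it and the parameter converges on the pattern in the data; the article’s point is that doing this for billions of parameters over billions of examples is what demands so much data and ‘compute’.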
By about 2010, though, very big data sets were starting to become available. Particularly crucial was ImageNet, a giant digital assemblage of millions of pictures, each labelled by a human being. It was set up by the Stanford University computer scientist Fei-Fei Li and her colleagues, who recruited 49,000 people via Amazon’s Mechanical Turk platform, which enables the hiring of large numbers of online gig workers. Also around 2010, specialists in neural networks began to realise that they could do lots of matrix multiplications fast on graphics chips originally developed for video games, especially by Nvidia.
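The point about matrix multiplication is easy to see in code. The sketch below, with arbitrary matrix sizes chosen only to make the gap visible, compares the one-operation-after-another approach (three nested Python loops) with NumPy’s `@` operator, which hands the same arithmetic to an optimised parallel routine on the CPU; graphics chips push the same idea much further by performing thousands of the component multiplications simultaneously.

```python
# Matrix multiplication done one component operation at a time,
# versus the same multiplication handed to an optimised routine.
import time
import numpy as np

n = 200
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# One operation after another: three nested loops.
start = time.perf_counter()
c = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            c[i, j] += a[i, k] * b[k, j]
sequential = time.perf_counter() - start

# The same work done by an optimised, parallel library routine.
start = time.perf_counter()
d = a @ b
parallel = time.perf_counter() - start

print(f"nested loops: {sequential:.2f}s   optimised routine: {parallel:.4f}s")
assert np.allclose(c, d)   # same answer, vastly different speed
```

The ratio between the two timings, already large on an ordinary laptop, is the gap that graphics chips widened further – which is why the realisation around 2010 mattered so much.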
In 2012, those two developments came together in what can now be seen as the single most important moment in the launching of AI on its trajectory of ever increasing scale. Sutskever, Alex Krizhevsky (another of Hinton’s students) and Hinton himself entered their neural network system, AlexNet, into the annual ImageNet Challenge competition for automated image-recognition systems. Running on just two Nvidia graphics chips in Krizhevsky’s bedroom, AlexNet won hands down: its error rate was 30 per cent lower than the best of its more conventional rivals....
....MUCH MORE