Saturday, August 26, 2017

Where Are The Profits Of Big Data Flowing?

From TechRepublic, March 17:

Why machine learning benefits the rich, and everyone else is toast  
Big data started with cheap hardware and open source software, but the winners in machine learning are the world's richest companies. 
Why is machine learning finally real? It's the data, stupid. Lots (and lots) of data.

That's a key message from Cloudera co-founder Mike Olson's Strata + Hadoop World keynote earlier this week in San Jose, California. As he declared: "The algorithms that early researchers and current practitioners use are ravenous for data and we finally have enough data on the planet to feed them. They also need scale-out computation and storage at low cost."

In fact, the mountains of data that we now enjoy are a direct result of high-quality open source software running on commodity hardware: More applications churning out more data for more people.

A game only the rich can play
Despite this low-cost hardware and software, and its impact on machine learning, let's be clear: Big enterprises are the primary beneficiaries. Why? As Olson went on to explain, among enterprises doing over $1 billion a year in revenue—Cloudera's target customer—"the appetite for these [machine learning] capabilities is insatiable" as they "absolutely have the data at scale."

Data, after all, is necessary to train the machines. A small company could have big plans but without big data to feed those plans, it's a losing battle. As such, large enterprises are in a prime position to use big data to enrich themselves and effectively hold off would-be, smaller competitors.
(As a side note, as useful as open source has been, we really need to have open data sets. Stanford has been exemplary in this, annotating data to make it more readily useful for machine learning. This is a new frontier in "open source," and we need to explore it more.)

Sparking ML
One thing that aids these big companies has come from an egalitarian source: Apache Spark. I've written about Spark's impact on multiple occasions, but it's easy to understate just how important it has been. Indeed, though Cloudera recognized the importance of Apache Spark early on, Olson noted in his keynote, one aspect of it has "taken them by surprise."
"[Spark] allowed people to build and deploy scale-out machine learning applications much faster than they had previously done. [Why?] Its flexibility and ease of programming meant that you could build machine learning apps, train up models on massive data very, very quickly. That has led to huge interest in the ecosystem."....

Coming at the issue on a related tangent is The Verge, July 13:

Robots and AI are going to make social inequality even worse, says new report
Most economists agree that advances in robotics and AI over the next few decades are likely to lead to significant job losses. But what’s less often considered is how these changes could also impact social mobility. A new report from UK charity Sutton Trust explains the danger, noting that unless governments take action, the next wave of automation will dramatically increase inequality within societies, further entrenching the divide between rich and poor. 

There are a number of reasons for this, say the report’s authors, including the ability of richer individuals to re-train for new jobs; the rising importance of “soft skills” like communication and confidence; and the reduction in the number of jobs used as “stepping stones” into professional industries.

For example, the demand for paralegals and similar professions is likely to be reduced over the coming years as artificial intelligence is trained to handle more administrative tasks. In the UK more than 350,000 paralegals, payroll managers, and bookkeepers could lose their jobs if automated systems can do the same work. 

“Traditionally, jobs like these have been a vehicle for social mobility,” Sutton Trust research manager Carl Cullinane tells The Verge. Cullinane says that for individuals who weren’t able to attend university or get particular qualifications, semi-administrative jobs are often a way in to professional industries. “But because they don’t require more advanced skills they’re likely to be vulnerable to automation,” he says.

Similarly, as automation reduces the need for administrative skills, other attributes will become more sought after in the workplace. These include so-called “soft skills” like confidence, motivation, communication, and resilience. “It’s long established that private schools put a lot of effort into making sure their pupils have those sorts of skills,” says Cullinane. “And these will become even more important in a crowded labor market.”....

There are a couple of other aspects of the AI money flows that are becoming apparent; we'll be back with those next week.