Climateer Investing: "Nvidia Welcomes Intel Into AI Era: Fancy a Benchmark Deathmatch?" (NVDA; INTC)

Monday, August 22, 2016

From The Register:

We love your deep learning benchmark 'mistakes'

HPC blog: Nvidia just fired the first salvo in what promises to be a classic and long-lived benchmark death match vs Intel. In a webpage titled "Correcting Intel's Deep Learning Benchmark Mistakes," Nvidia claimed that Intel was using outdated GPU benchmark results and non-current hardware comparisons to show off its new Knights Landing Xeon Phi processors.

Nvidia called out three Intel claims in particular:

"Xeon Phi is 2.3 times faster in training than GPUs." This claim was made in a press presentation delivered at ISC'16 and in an Intel-produced "fact sheet" (PDFs available here and here). It specifically refers to a stat on the left side of slide 12 (and the second page of the fact sheet) where Intel claims Phi is 2.3 times faster at AlexNet image training on a DNN (deep neural network).

Nvidia alleges that Intel is using 18-month-old AlexNet numbers for Nvidia (based on a Maxwell system), while using farm-fresh numbers for the Intel Phi.

According to Nvidia, its Pascal processors in the same four-accelerator configuration outperform Intel's Phi by 1.9 times. It also claims its new 8-GPU DGX-1 dedicated DNN training machine can complete AlexNet in two hours, outshining the four-Phi system by 5.3 times. Ouch.
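
For a rough sense of what those ratios imply in hours, here is a minimal Python sketch. It uses only the figures quoted above and assumes that each "N times" claim is a ratio of wall-clock training times on the same AlexNet workload, which the excerpt does not confirm. The derived hours are back-of-envelope values, not published results, and the function name is purely illustrative.

```python
# Figures quoted in the article (Nvidia's claims, not independent measurements).
DGX1_HOURS = 2.0        # claimed AlexNet training time on the 8-GPU DGX-1
DGX1_VS_PHI = 5.3       # claimed DGX-1 advantage over the four-Phi system
PASCAL4_VS_PHI = 1.9    # claimed four-Pascal advantage over the four-Phi system


def implied_hours(reference_hours: float, speedup: float) -> float:
    """Time for the slower system if the reference system is `speedup` times faster."""
    return reference_hours * speedup


# Assumption: each "N times" figure is a ratio of wall-clock training times.
phi4_hours = implied_hours(DGX1_HOURS, DGX1_VS_PHI)
pascal4_hours = phi4_hours / PASCAL4_VS_PHI

print(f"four-Phi system:    ~{phi4_hours:.1f} h")
print(f"four-Pascal system: ~{pascal4_hours:.1f} h")
```

Under that assumption, the four-Phi system would need roughly 10.6 hours and the four-Pascal configuration roughly 5.6 hours.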

"Xeon Phi offers 38 per cent better scaling than GPUs across nodes." This claim also occurs in both of the Intel documents referenced above. In this case, Intel is saying that their Phi systems scale better than GPU-equipped boxes, namely when it comes to 32-way/accelerator configurations.

According to Nvidia, Intel is using four-year-old numbers from Oak Ridge's Titan machine, which was using the old Jaguar interconnect and old K20 GPUs, as a comparison to Intel's brand-new Omni-Path Architecture-connected Phi processors running deep learning workloads.

Nvidia points to Baidu-published specs from its speech training workload that show near-linear GPU scaling not just to 32 nodes, but to 128 nodes. Ouch again....MORE

Recently:
"Competition For NVIDIA: The Nervana Systems Chip That Will Let Intel Advance Its Deep Learning (INTC; NVDA)"

See also:
Baidu Artificial Intelligence Beats Google, Microsoft In Image Recognition