FPGAs Accelerate Machine Learning and AI Applications

Posted by Danielle Leger

Jul 17, 2017 8:04:00 AM



Machine learning (ML) focuses on the ability of a computer to learn without being programmed to do so explicitly, while Artificial Intelligence (AI) research looks at the development of what might be called human-like behavior.

Amongst the targets for ML/AI are:

  • data centers
  • autonomous vehicles
  • industrial systems
  • analysis of massive data in order to solve unpredictable problems



Intel is among the companies addressing these topics, through acquisitions and internal development, in a move to create ‘end to end’ solutions. Bill Jenkins, senior AI product specialist with Intel’s Programmable Systems Group (PSG, formerly Altera), said: “Intel has formed an AI group, while PSG has a focus on machine learning, so there’s an AI effort across the company; it’s a big topic.”



For example, a car now generates about 4Tbyte of data a day, while a factory might generate an exabyte of data a day. In the US, 94% of data collected is discarded, so we need to be smarter and use that data rather than throw it away.
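
To put those figures in perspective, the daily volumes can be converted to average sustained data rates. The short Python sketch below is only illustrative: it assumes decimal units (1Tbyte = 10^12 bytes, 1 exabyte = 10^18 bytes) and that data is generated evenly over 24 hours, neither of which is stated with the figures quoted above. The function and constant names are made up for this example.

```python
# Rough conversion of "data per day" into an average data rate.
# Assumptions (not from the article): decimal units, even generation over 24 hours.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_bytes_per_s(bytes_per_day: float) -> float:
    """Return the average rate in bytes per second for a daily volume."""
    return bytes_per_day / SECONDS_PER_DAY


CAR_BYTES_PER_DAY = 4e12       # about 4Tbyte a day
FACTORY_BYTES_PER_DAY = 1e18   # about an exabyte a day

car_rate = average_rate_bytes_per_s(CAR_BYTES_PER_DAY)
factory_rate = average_rate_bytes_per_s(FACTORY_BYTES_PER_DAY)

print(f"car:     {car_rate / 1e6:.1f} Mbyte/s")       # car:     46.3 Mbyte/s
print(f"factory: {factory_rate / 1e12:.1f} Tbyte/s")  # factory: 11.6 Tbyte/s
```

Even as averages, these rates suggest why processing data close to where it is generated, rather than storing everything first, is attractive.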

The hardware and software needed for us to be smarter are available. For reasonable power and cost, we can now solve problems that couldn’t be addressed before.

Part of that hardware solution is the FPGA, which is one reason why Intel acquired Altera.

With FPGAs, you create an architecture for the problem; you control the data path. Rather than moving through a CPU and then being offloaded to memory, data can come straight into the FPGA from any source. It’s then processed inline with the lowest latency.

FPGAs fit well with Artificial Intelligence and Machine Learning, offering workload-agnostic computing, sensor fusion, and unified tools and workflows. They can be used to create power-efficient solutions to complicated analytics problems, using only the hardware needed for the problem. That’s the difference: FPGAs are all about system performance and latency.

While FPGAs have been around for decades, it’s the latest generations of the devices that are picking up the AI/ML load. Everything Jenkins has done so far in this field has been done using the Arria 10, because its floating-point capability can maintain accuracy.

Arria 10 devices offer compute performance of up to 1.5TFLOPS, while Stratix 10 parts handle up to 10TFLOPS. Both support arbitrary-precision data types and offer higher on-chip bandwidth.
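
As a rough illustration of what choosing a data type's precision means, the Python sketch below rounds a value to a signed fixed-point format with a chosen number of bits. It is a software model only, not how the devices or Intel's tools implement their data types; the function name `quantize_fixed_point` and the bit widths used are hypothetical choices for this example.

```python
# Software model of signed fixed-point quantization at a chosen bit width.
# Illustrative only: the name and bit widths are assumptions, not Intel's API.


def quantize_fixed_point(value: float, total_bits: int, frac_bits: int) -> float:
    """Round value to a signed fixed-point number and return it as a float.

    total_bits: overall width, including the sign bit.
    frac_bits:  number of bits after the binary point.
    """
    scale = 1 << frac_bits                    # 2 ** frac_bits
    lowest = -(1 << (total_bits - 1))         # most negative raw integer
    highest = (1 << (total_bits - 1)) - 1     # most positive raw integer
    raw = round(value * scale)                # nearest representable step
    raw = max(lowest, min(highest, raw))      # saturate instead of wrapping
    return raw / scale


print(quantize_fixed_point(0.7, total_bits=8, frac_bits=6))  # 0.703125
print(quantize_fixed_point(0.7, total_bits=4, frac_bits=2))  # 0.75
print(quantize_fixed_point(3.0, total_bits=4, frac_bits=2))  # 1.75 (saturated)
```

Fewer bits mean coarser steps and a smaller range, so the width can be traded against accuracy for each problem.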


FPGAs have a huge amount of internal memory bandwidth, 8Tbyte/s, along with megabytes of on-chip memory. So you can get data in very quickly, do the analytics and pass the results back with minimal latency. The approach can also be an order of magnitude better than using a CPU: where a CPU needs to run at GHz clock rates, an FPGA can run at 300MHz and still sustain throughput. Running the device more slowly and keeping the results on the FPGA also saves power.
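
The arithmetic behind sustaining throughput at a lower clock rate can be sketched in a few lines of Python: at a fixed clock, throughput comes from how much work is done in parallel on every cycle. The sketch pairs the 8Tbyte/s and 300MHz figures, and reuses the 1.5TFLOPS figure quoted for Arria 10; the article does not say these numbers were measured together, so treat the pairing as an assumption. The function names are hypothetical.

```python
# How much must happen per clock cycle to reach a given throughput?
# Assumption: the quoted bandwidth, compute and clock figures are combined
# here purely for illustration; they may not describe the same configuration.


def bytes_per_cycle(bandwidth_bytes_per_s: float, clock_hz: float) -> float:
    """Bytes that must move on every clock cycle to reach the bandwidth."""
    return bandwidth_bytes_per_s / clock_hz


def ops_per_cycle(ops_per_s: float, clock_hz: float) -> float:
    """Operations that must complete on every clock cycle."""
    return ops_per_s / clock_hz


CLOCK_HZ = 300e6            # 300MHz
MEMORY_BANDWIDTH = 8e12     # 8Tbyte/s of internal memory bandwidth
COMPUTE_RATE = 1.5e12       # 1.5TFLOPS

print(f"{bytes_per_cycle(MEMORY_BANDWIDTH, CLOCK_HZ):,.0f} bytes per cycle")  # 26,667 bytes per cycle
print(f"{ops_per_cycle(COMPUTE_RATE, CLOCK_HZ):,.0f} operations per cycle")   # 5,000 operations per cycle
```

Under these assumptions, thousands of operations and tens of kilobytes of memory traffic happen on each cycle, which is how a wide, parallel data path can make up for a clock that is several times slower than a CPU's.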

It is possible to perform ML/AI tasks using an ASIC, but an ASIC can’t be changed once it’s built, and it is expensive to develop. While the value proposition changes with time, the fundamental need for FPGAs hasn’t. FPGAs are a cost-effective way to perform ML/AI tasks.


