AMD's Right Moves | Intel's Neuromorphic Chip | Innovation in Number Representation | Federated Learning - Issue #14

July 24 · Issue #11
AI Speech & Language Processing Update
I would like to preface this issue by making a couple of clarifications. My primary goal in this forum is to closely track and cover major technological developments in Deep Learning and Machine Learning that either directly or indirectly impact the underlying hardware that hosts AI workloads. This includes servers, embedded modules, accelerator chips, silicon intellectual property, as well as IoT devices. It also encompasses gear deployed in the cloud, on-prem data centers, edge and extreme edge devices, as well as billions of end devices. Despite this hardware-focused scope, it turns out that a sizable portion of innovation in AI as a whole has a tangible impact on the underlying hardware in one form or another. Models are getting larger, forcing the underlying hardware to support more storage, more processing power, and higher throughput while consuming less power. I am fascinated to witness all of this coming together at such a mind-bending pace, and I am excited to share my findings with you.
In this issue:
  • AMD Making All the Right Moves
  • As for AI chips …
  • New Approach Could Sink Floating Point Computation - Remarkable Innovation in Numerical Representation (Posit Arithmetic)
  • Intel’s Neuromorphic Research Platform Based on Loihi Chip
  • “Federated Learning”: Should You Be Worried About It?
  • Paper: Usage of Reinforcement Learning in Recommender Systems

AMD Making All the Right Moves
Lisa Su, AMD’s CEO, delivered the keynote address at Computex this year and announced a flurry of products and technologies. While AI was not necessarily the cornerstone of her presentation, I found some bits and pieces that will undoubtedly turn into AI-centric announcements in the coming months. I have tried to capture some key points here:
  • First and foremost, AMD seems to be leaning heavily on TSMC’s 7nm process (leapfrogging the competition) since most of the new products are based on this process node. This gives them an edge over both Intel and NVIDIA and will give them a leg up when it comes to power and their ability to drop prices to gain market share
  • I see a direct frontal attack on both Intel (in PC and server CPUs) and NVIDIA (in gaming GPUs). She indicated that the next-generation PlayStation will be powered by a 7nm custom chip containing both a CPU and a GPU
  • Another weapon in their arsenal seems to be their choice of host interface I/O technology. Zen 2 cores (in the Ryzen 3000 series) will come equipped with PCIe Gen 4 ports, clearly intended for high-end machines
  • More relevant to the AI domain, they announced the Radeon RX 5000 family of GPUs, “Navi”. These GPUs are built on AMD’s new GPU architecture (referred to as Radeon DNA or RDNA), implemented in a 7nm node and relying heavily on AMD’s Infinity Fabric (a descendant of HyperTransport)
  • I would be very curious to learn about the evolutionary path of the Infinity Fabric technology and its implications on large datacenter AI training/inference applications
  • I am willing to bet that AMD will soon make substantial announcements in the arena of AI inference and training, challenging NVIDIA’s (V100) and Intel’s (NNP-T, NNP-I) data center offerings
As for AI chips …
GreenWaves Technologies, a fabless semiconductor company based in France, has earned the “Cool Vendor” title from Gartner for its GAP8 IoT Application Processor, which is optimized for battery operation and incorporates acceleration for AI and other signal processing algorithms in support of IoT and consumer applications. Medical wearables and applications dealing with people and object detection are front and center. The company uses variations of RISC-V cores, 9 in total (one for sequential operations and a cluster of 8 for vector operations). They have employed a myriad of techniques, including voltage and clock domain scaling, to reduce power dissipation. GAP8 performance and power metrics for CNN implementation can be found here.
New Approach Could Sink Floating Point Computation
Intel's Neuromorphic Research Platform
Intel introduced a new research platform board (Nahuku) based on its Loihi neuromorphic chip. The following are just a few key points about this announcement:
  • Loihi is a neuromorphic chip capable of learning as well as inference. It contains 128 fully asynchronous neuromorphic cores supporting up to 130k neurons and 130M synapses. The chip is fabricated in Intel’s 14nm process. Additionally, the device contains mesh-based off-chip communication interfaces that allow it to scale to up to 4,096 on-chip cores and up to 16,384 chips in a system
  • Nahuku is an FPGA expansion board that can support up to 32 Loihi chips (16 chips per side). The board has an aggregate of 4,096 neuromorphic cores incorporating a total of 4,194,304 neurons and 4,160,000,000 synapses
  • Intel is also working on a larger scale system (Pohoiki Springs) that supports up to 768 Loihi chips
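The Nahuku totals quoted above follow directly from the per-chip figures. Here is a minimal sketch of that arithmetic. It assumes 1,024 neurons per core, a value inferred from the quoted board totals (4,194,304 neurons across 4,096 cores), which would make the "130k" per-chip figure a rounding of 131,072. The function name is illustrative only.

```python
# Per-chip figures from the announcement summary above.
CORES_PER_CHIP = 128
SYNAPSES_PER_CHIP = 130_000_000  # rounded figure as quoted
# Assumption: inferred from 4,194,304 neurons / 4,096 cores on Nahuku.
NEURONS_PER_CORE = 1024


def system_totals(num_chips):
    """Aggregate cores, neurons and synapses for a system of Loihi chips."""
    cores = num_chips * CORES_PER_CHIP
    return {
        "chips": num_chips,
        "cores": cores,
        "neurons": cores * NEURONS_PER_CORE,
        "synapses": num_chips * SYNAPSES_PER_CHIP,
    }


nahuku = system_totals(32)            # 32-chip Nahuku board
pohoiki_springs = system_totals(768)  # larger system, same per-chip assumptions

for name, totals in (("Nahuku", nahuku), ("Pohoiki Springs", pohoiki_springs)):
    print(name, totals)
```

The same per-chip assumptions give a rough sense of scale for the 768-chip system. Those numbers are an extrapolation, not figures from Intel.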
"Federated Learning" : Should You Be Worried?
If you thought that your edge AI chip only has to worry about inference, think again. “Federated Learning” will require even the very edge devices to have some training capability. The scenario is as follows. Most smartphones either have or will soon have multiple apps that use miniature machine learning models to render local predictions. Take the autofill feature in a Google search running on an Android device as an example. Each user is presented with options that complete the search phrase, and the user can accept or reject the autofill recommendations. In a way, each user’s feedback adds a tiny bit to the training data, and the local model has to be tweaked to reflect the nuance. It is, however, not desirable to push every individual tweak straight uplink to modify the global search machine learning model that is distributed throughout Google’s data centers. For starters, the tweak is user-specific and might not apply to the bulk of other users. Secondly, the local adjustments convey a user preference, and sharing them can be an invasion of someone’s privacy. Lastly, shuttling models up and down a wireless link can be costly for the users. Instead, local tweaks are compressed and sent uplink when the phone is connected to power and has access to a free Wi-Fi connection. Once thousands of tweaks have accumulated, the larger search model is modified using elaborate averaging mechanisms, and the adjustments are compressed and pushed down to individual devices. As you can see, even smartphones have to have some training capability and have to be engineered for it. The good news is that the smartphone’s application processor may have enough juice to handle the local training, so there may be no need to tax the dedicated AI acceleration hardware.
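To make the train-locally, average-centrally loop more concrete, here is a minimal sketch. It is not Google’s implementation. It assumes a toy linear model, plain SGD on each device, and a simple example-count-weighted average of the device updates. Compression and the privacy machinery are left out, and all names and data are hypothetical.

```python
def local_update(global_weights, examples, lr=0.1):
    """Run one SGD pass on a device; return (weight delta, number of examples)."""
    w = list(global_weights)
    for x, y in examples:
        # Prediction error of the toy linear model y = w . x
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    # Only the delta leaves the device, never the raw examples.
    delta = [wi - gi for wi, gi in zip(w, global_weights)]
    return delta, len(examples)


def federated_average(global_weights, updates):
    """Apply the example-count-weighted average of device deltas to the global model."""
    total = sum(n for _, n in updates)
    return [g + sum(delta[i] * n for delta, n in updates) / total
            for i, g in enumerate(global_weights)]


# Hypothetical per-device data generated by y = 2*x0 - 1*x1.
devices = [
    [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)],
    [([1.0, 1.0], 1.0), ([1.0, 0.0], 2.0)],
]

weights = [0.0, 0.0]
for _ in range(200):  # each iteration is one round of local training plus averaging
    updates = [local_update(weights, data) for data in devices]
    weights = federated_average(weights, updates)

print(weights)  # should end up close to [2.0, -1.0]
```

Even in this toy version, every device runs a training step of its own, which is the hardware implication the section is pointing at.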
Paper: Usage of Reinforcement Learning in Recommender Systems
We are all familiar with recommender systems used by the likes of Amazon, Netflix, YouTube and many other eCommerce platforms. The goal of a recommender system is to present users with products and services that are aligned with their profile and preferences (likes and dislikes), maximizing the probability that users consume the recommendations and/or remain engaged with the platform. Existing recommender systems focus on immediate user engagement without paying much attention to the long-term effects of the recommendations on user preferences. This weakness is amplified when users are presented with a list of recommendations (a slate of multiple options) whose items usually have interactive effects on each other. Dealing with the multiplicity of slates poses a huge combinatorial challenge that has to be solved somehow. Fortunately, Reinforcement Learning (RL) has come to the rescue again. Researchers at YouTube and the University of Texas at Austin have proposed a mechanism that uses RL to address the challenges posed by slate recommendations (see the paper here). Truly a remarkable work with outstanding results.
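To get a feel for the combinatorial challenge, here is a quick back-of-the-envelope sketch. It assumes a slate is an ordered list of distinct items drawn from the catalog. The catalog and slate sizes are hypothetical and not taken from the paper.

```python
import math


def ordered_slates(catalog_size, slate_size):
    """Number of ordered slates of distinct items: n! / (n - k)!."""
    return math.perm(catalog_size, slate_size)


# Hypothetical catalog sizes, with a slate of 5 recommendations.
for catalog_size in (10, 100, 1000):
    print(catalog_size, ordered_slates(catalog_size, 5))
```

The count grows roughly as the catalog size raised to the slate size, which is why scoring every possible slate directly quickly becomes impractical.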

Hope you have benefited from this issue. Please forward to others if you find value in this content. I always welcome feedback.
Al Gharakhanian | www | Linkedin | blog | Twitter
