Neuchips announces AI accelerator for deep learning recommendation models in data center servers
California-based Neuchips, a developer of purpose-built AI inference chip platforms, has announced its first chip designed to accelerate deep learning recommendation models (DLRM). The RecAccel N3000 application-specific integrated circuit (ASIC) uses 7nm process technology from TSMC for integration on various form factor modules.
“In 2019, when Facebook open-sourced their deep learning recommendation model and challenged the industry to deliver a balanced AI inference chip platform, we decided to pursue the challenge,” said Dr. Lin, Neuchips CEO, co-founder of Global Unichip Corp, a subsidiary of TSMC, and professor at National Tsing Hua University, Taiwan.
“Our continued improvements in MLPerf DLRM benchmarking and whole-chip emulation gives us confidence that our RecAccel AI hardware architecture co-designed with our software will scale to deliver industry leadership and exceed our target of 20M inferences per second at 20 Watts,” Dr. Lin added.
The company has partnered with semiconductor companies and cloud server ecosystem providers to deliver the ASIC on Dual M.2 modules and PCIe Gen 5 cards for data center servers in the second half of 2022.
The RecAccel N3000 supports 8-bit coefficient quantization with calibration in hardware and is capable of delivering 99.95% of FP32 (single-precision floating-point) accuracy. The RecAccel N3000 inference platform uses an in-house cache design and DRAM traffic optimization that reduce LPDDR5 memory accesses by approximately 50% and increase bandwidth utilization by about 30%.
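To illustrate what 8-bit quantization with calibration means in general, the following is a minimal Python sketch of symmetric quantization, where a calibration pass picks a scale from sample values. It is not Neuchips' implementation, which the announcement does not describe; the function names and sample weights are hypothetical.

```python
def calibrate_scale(samples, num_bits=8):
    """Choose a symmetric scale so the largest calibration value maps to the integer limit."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(x) for x in samples)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floating-point values to clamped signed integers."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Recover approximate floating-point values from the integers."""
    return [q * scale for q in q_values]


# Hypothetical FP32 coefficients used only for illustration.
weights = [0.42, -1.27, 0.05, 0.88, -0.31]
scale = calibrate_scale(weights)
q = quantize(weights, scale)
restored = dequantize(q, scale)

# Rounding error per value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

The accuracy figure quoted above refers to end-to-end model accuracy relative to FP32, which depends on how well the calibration data represents real inputs; this sketch only shows the per-value rounding behavior.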
Focusing on energy efficiency
As data centers push toward greater energy efficiency, SoC manufacturers have invested heavily in designing energy-efficient architectures at the SoC level. Neuchips has integrated dedicated MLP compute engines that consume 1 microjoule per inference.
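The 1 microjoule figure is consistent with the target of 20M inferences per second at 20 Watts quoted by Dr. Lin. A short Python check of that arithmetic, assuming the power figure covers the same workload as the throughput figure:

```python
# Target quoted in the article: 20M inferences per second at 20 W.
target_inferences_per_second = 20_000_000
power_watts = 20.0  # 1 W = 1 joule per second

# Energy per inference = power / throughput.
joules_per_inference = power_watts / target_inferences_per_second
microjoules_per_inference = joules_per_inference * 1e6

print(f"{microjoules_per_inference:.2f} microjoules per inference")  # prints 1.00
```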
The RecAccel N3000 inference platform mainly supports deep learning recommendation models, which work with categorical data used to describe higher-level attributes. Neural networks usually struggle to work efficiently with such sparse data, and a lack of publicly available datasets has slowed progress. A well-known example of deep learning recommendation models in production is in the data centers of Meta (previously known as Facebook).
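A common way recommendation models handle sparse categorical data is to look up a learned embedding vector for each category ID and pool the results into one dense vector. The Python sketch below shows that idea with sum pooling; the table size, embedding dimension, and item IDs are hypothetical and do not reflect the RecAccel N3000 or any specific model configuration.

```python
import random

random.seed(0)
EMBEDDING_DIM = 4  # hypothetical; production models use larger dimensions


def make_table(num_categories, dim=EMBEDDING_DIM):
    """Build a table with one embedding row per category ID (random stand-in for learned values)."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_categories)]


def pooled_lookup(table, indices):
    """Sum the embedding rows for the given category IDs into one dense vector."""
    pooled = [0.0] * len(table[0])
    for idx in indices:
        for d, value in enumerate(table[idx]):
            pooled[d] += value
    return pooled


# A sparse feature: out of 1000 possible items, only three are active for this user.
table = make_table(num_categories=1000)
clicked_item_ids = [3, 17, 512]
dense_vector = pooled_lookup(table, clicked_item_ids)
print(dense_vector)
```

Only three of the 1000 rows are read here, which hints at why such workloads tend to be dominated by irregular memory access rather than arithmetic.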
According to a research paper, DLRMs “are responsible for more than 50% of the training demand in our (Meta) data centers.” This gives an idea of the scale at which DLRMs are employed in today’s data centers for next-generation training platforms.