HPE platform aims to help customers quickly build and train ML models at scale
Hewlett Packard Enterprise has announced the public availability of a new ready-to-use solution, the HPE Machine Learning Development System, an end-to-end system that integrates a machine learning software platform, compute, and accelerators. The system is purpose-built for AI, with the goal of lowering the barriers to entry for enterprises that want to build and train machine learning models at scale.
The foundation of the announcement is a machine learning (ML) platform called HPE Machine Learning Development Environment, which is combined with the company’s high-performance computing offerings. With the new system, users can shorten time to market for AI applications by building and training models faster, cutting the process, as the company claims, “from weeks and months to days.”
As many large organizations are finding out, there is a gap between expectations and reality when it comes to building and operating AI models. Research from McKinsey found that 56 percent of respondents to its 2021 Global Insight Survey have adopted AI in at least one function, up from 50 percent in 2020. On the other hand, estimates from Gartner suggest that 85 percent of AI projects fail due to data and algorithm errors.
One of the early adopters of the new HPE development system is Aleph Alpha, a German startup known for AI technology aimed at transforming human-machine interaction. Aleph Alpha has trained a multimodal AI model that combines image and text processing in five languages, enabling, for example, more sophisticated search and document creation in specialized fields of knowledge. Using the HPE Machine Learning Development System, Aleph Alpha was able to train the model in record time, combining and monitoring hundreds of GPUs.
“We are seeing astonishing efficiency and performance of more than 150 teraflops by using the HPE Machine Learning Development System. The system was quickly set up and we began training our models in hours instead of weeks. While running these massive workloads, combined with our ongoing research, being able to rely on an integrated solution for deployment and monitoring makes all the difference,” says Jonas Andrulis, founder and CEO of Aleph Alpha.
The system is also designed to improve the accuracy of machine learning models through state-of-the-art distributed training, automated hyperparameter optimization, and neural architecture search. Performance figures published by HPE show that, on a small configuration of 32 NVIDIA GPUs, the system delivers approximately 90 percent scaling efficiency for workloads such as NLP and computer vision.
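To put the scaling figure in context, scaling efficiency is commonly defined as the measured speedup divided by the ideal linear speedup for the number of GPUs used. The short Python sketch below works through that arithmetic. The function names and the throughput numbers are illustrative assumptions, not values or methods published by HPE; only the 32-GPU count and the roughly 90 percent efficiency come from the article.

```python
def scaling_efficiency(throughput_1: float, throughput_n: float, n_gpus: int) -> float:
    """Measured speedup divided by the ideal (linear) speedup on n_gpus."""
    if n_gpus < 1 or throughput_1 <= 0:
        raise ValueError("n_gpus must be >= 1 and throughput_1 must be positive")
    speedup = throughput_n / throughput_1
    return speedup / n_gpus


def effective_speedup(efficiency: float, n_gpus: int) -> float:
    """Speedup over a single GPU implied by a given scaling efficiency."""
    return efficiency * n_gpus


# Hypothetical throughputs (samples per second), chosen only for illustration.
single_gpu = 100.0
thirty_two_gpus = 2880.0

eff = scaling_efficiency(single_gpu, thirty_two_gpus, 32)
print(f"scaling efficiency: {eff:.0%}")
print(f"effective speedup at 90% on 32 GPUs: {effective_speedup(0.90, 32):.1f}x")
```

Under this definition, 90 percent efficiency on 32 GPUs corresponds to a little under 29 times the throughput of a single GPU, rather than the ideal 32 times.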
“Enterprises seek to incorporate AI and machine learning to differentiate their products and services, but are often confronted with the complexity in setting up the infrastructure required to build and train accurate AI models at scale,” says Justin Hotard, executive vice president and general manager, HPC and AI, at HPE. “The HPE Machine Learning Development System combines our proven end-to-end HPC solutions for deep learning with our innovative machine learning software platform into one system, to provide a performant out-of-the-box solution to accelerate time to value and outcomes with AI.”