Industrial Applications for PaddlePaddle – IEEE Spectrum

TensorFlow, PyTorch, and Keras: These three deep learning frameworks have dominated AI for years, even as new entrants gain traction. But one framework you don’t hear much about in the West is China’s PaddlePaddle, the most popular Chinese framework in the world’s most populous country.

It is an easy-to-use, efficient, flexible and scalable deep learning platform originally developed by Chinese AI giant Baidu to apply deep learning to many of its own products. Today it is used by over 4.77 million developers and 180,000 companies worldwide. While it’s hard to get comparable numbers for other frameworks, suffice it to say, it’s huge.

Baidu recently announced updates to PaddlePaddle, along with 10 large deep learning models that cover natural language processing, vision, and computational biology. Among the models are a hundred-billion-parameter natural language processing (NLP) model called ERNIE 3.0 Zeus, a pre-trained geography and language model called ERNIE-GeoL, and a pre-trained model for learning representations of compounds called HELIX-GEM.

The company has also created three new large industry-focused models (one for the electric power industry, one for banking and another for aerospace) by refining its ERNIE 3.0 Titan model with industry data and expert knowledge in unsupervised learning tasks.

Software frameworks are packages of supporting programs, compilers, code libraries, toolsets and application programming interfaces (APIs) that together enable the development of a project or a system. Deep learning frameworks bring together everything you need to design, train, and validate deep neural networks through a high-level programming interface. Without these tools, implementing deep learning algorithms would be very time-consuming, because pieces of code that could otherwise be reused would have to be written from scratch.

Baidu started developing such tools as early as 2012, a few months after Geoffrey Hinton’s deep learning breakthrough in the ImageNet competition.

In 2013, a PhD student at the University of California, Berkeley created a framework called Caffe, which supports convolutional neural networks used in computer vision research. Baidu relied on Caffe to develop PaddlePaddle, which supports recurrent neural networks in addition to convolutional neural networks, giving it an edge in the field of NLP.

The name PaddlePaddle is derived from PArallel Distributed Deep Learning, a reference to the framework’s ability to train models on multiple GPUs.
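To make the "parallel distributed" idea concrete, here is a minimal conceptual sketch of synchronous data-parallel training in plain Python. It is not PaddlePaddle code and makes no claim about how the framework is implemented: the function names, the toy one-parameter model, and the simulated "workers" standing in for GPUs are all assumptions made for illustration.

```python
# Conceptual sketch of synchronous data-parallel training.
# NOT PaddlePaddle code: every name here is hypothetical, and each
# simulated "worker" merely stands in for one GPU.

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr):
    """One synchronous step: each worker computes a gradient, then they are averaged."""
    grads = [local_gradient(w, shard) for shard in shards]  # concurrent on real hardware
    avg_grad = sum(grads) / len(grads)                       # the gradient-averaging step
    return w - lr * avg_grad

def train(data, num_workers=4, lr=0.01, steps=200):
    # Split the dataset into equal-sized shards, one per worker.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        w = data_parallel_step(w, shards, lr)
    return w

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data generated by y = 3x
w = train(data)
print(round(w, 3))
```

Because the shards are the same size, averaging the per-worker gradients gives the same update as computing the gradient over the whole dataset on one device. That equivalence is what lets the work be spread across many devices without changing the result.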

Google open-sourced TensorFlow in 2015, and Baidu open-sourced PaddlePaddle the following year. When Eric Schmidt introduced TensorFlow to China in 2017, it turned out China was ahead of him.

While TensorFlow and Meta’s PyTorch, open-sourced in 2017, remain popular in China, PaddlePaddle is geared more towards industrial users.

“We have been putting a lot of effort into lowering the barriers to entry for individuals and businesses,” said Ma Yanjun, general manager of AI technology ecosystem at Baidu.

PyTorch and TensorFlow require more deep learning expertise from users compared to PaddlePaddle, whose toolkits are designed for non-experts in production environments.

“In China, many developers try to use AI in their work, but they don’t have much experience in AI,” Ma explained. “So to increase the use of AI in different industry sectors, we have provided PaddlePaddle with many low-threshold toolkits that are easier to use so that they can be used by a wider community.”

AI engineers don’t normally know much about industry sectors and industry experts don’t know much about AI. But PaddlePaddle’s easy-to-understand code comes with a host of learning materials and tools to help users. It scales easily and has a comprehensive set of APIs to meet various needs.

These developers used PaddlePaddle for a desert robot to automate the process of planting trees. Credit: Baidu

It supports large-scale data training and can train on hundreds of machines in parallel. It provides a neural machine translation system, recommender systems, image classification, sentiment analysis, and semantic role labeling.

Toolkits and libraries are PaddlePaddle’s strong suit, said Ma. For example, PaddleSeg can be used for image segmentation. PaddleDetection can be used for object detection. “We cover the entire AI development pipeline, from data processing, to training, to compressing models, to adapting to different hardware,” Ma said, “and then how to deploy them in different systems, for example, on a Windows or a Linux operating system or on an Intel chip or on an Nvidia chip.”

The platform also hosts toolkits for advanced research purposes, such as Paddle Quantum for quantum computing models and Paddle Graph Learning for graph learning models.

“That’s why PaddlePaddle is very popular in China right now,” he said. “Developers use such toolkits and not just the tool itself.”

Since being open-sourced, PaddlePaddle has evolved rapidly to provide better performance and user experience in industry sectors outside of Baidu, as well as in countries outside of China, with full English documentation. Currently, PaddlePaddle offers over 500 pre-trained algorithms and models to facilitate the rapid development of industrial applications. Baidu has made efforts to reduce the size of the models so that they can be deployed in real applications. Some of the models are very small and fast and can be deployed on a camera or mobile phone.

  • Transportation companies are using PaddlePaddle to deploy AI models that monitor traffic lights and improve traffic efficiency.
  • Manufacturing companies use PaddlePaddle to improve productivity and reduce costs.
  • Recycling companies are using PaddlePaddle to develop an object detection model that can identify different types of waste for a waste sorting robot.
  • Shouguang County in Shandong Province is deploying AI to monitor the growth of different vegetables, advising farmers on the best time to pick and pack them.
  • In Southeast Asia, PaddlePaddle has been used to control AI-powered forestry drones for fire prevention.

PaddlePaddle has parameter server technology to train sparse models that can be used in real-time search and recommendation systems. But it has also merged models into even larger systems that are used for scenarios that don’t require real-time results, like text generation or image generation.
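As a rough illustration of why a parameter server suits sparse models, the sketch below keeps one weight per feature ID and lets a worker pull and push only the features an example actually touches. This is a conceptual toy in plain Python, not PaddlePaddle's parameter server API: the class, the method names, the feature IDs and the squared-error linear model are all hypothetical.

```python
# Conceptual sketch of a parameter server for a sparse model.
# NOT PaddlePaddle's API: the class, methods and feature IDs are hypothetical.
from collections import defaultdict

class ParameterServer:
    """Holds one weight per feature ID; features never seen take up no space."""

    def __init__(self, lr=0.1):
        self.weights = defaultdict(float)
        self.lr = lr

    def pull(self, feature_ids):
        # A worker fetches only the weights needed for the current example.
        return {fid: self.weights[fid] for fid in feature_ids}

    def push(self, grads):
        # A worker sends gradients only for the features it touched.
        for fid, g in grads.items():
            self.weights[fid] -= self.lr * g

def worker_step(server, feature_ids, label):
    """One update of a linear model with binary sparse features and squared error."""
    w = server.pull(feature_ids)
    prediction = sum(w.values())
    error = prediction - label
    server.push({fid: 2 * error for fid in feature_ids})
    return error  # error measured before this update was applied

server = ParameterServer()
examples = [(["user:42", "item:7"], 1.0), (["user:42", "item:9"], 0.0)]
for _ in range(50):
    for feature_ids, label in examples:
        worker_step(server, feature_ids, label)
print(sorted(server.weights))
```

In a search or recommendation setting the feature space can be enormous, but each request touches only a handful of IDs, so each pull and push stays small however many features exist overall.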

Baidu sees large, dense models as another way to lower the barrier to AI adoption, as so-called foundation models can be tailored to specific scenarios. Without a foundation model, you have to develop everything from scratch.

Ma said the research areas are converging, with cross-modal learning across different modalities, like speech and vision. He said Baidu also uses knowledge graphs in the deep learning process. “Previously, a deep learning system processed raw text or raw images without any knowledge input, and the system used self-supervised learning to gather rules out of the data,” Ma said. “But now, we see knowledge graphs as an input.”
