Baidu AI Research Brings Significant Upgrade to PaddleOCR’s Open Source OCR System

This Article Is Based On The Research Article 'PaddleOCR, an Easy-to-Use and Open-Source OCR System, Rolls out Major Upgrade With Improved Accuracy and New Annotation Functions'. All Credit For This Research Goes To The Researchers of This Project 👏👏👏

PaddleOCR, a multilingual optical character recognition (OCR) toolkit, has received a significant upgrade. With more than 80 multilingual recognition models and an easy-to-use interface, PaddleOCR is an open-source OCR repository worth checking out.

The new PP-OCRv3 system improves accuracy by 5% to 11% in English and multilingual scenarios. PPOCRLabelv2 adds annotation functions for tables, irregular text images and key information extraction tasks. “Dive into OCR”, a new interactive e-book, is now available.


In the digital age, OCR has become a key technology for turning printed text and images into searchable digital information. Office automation (OA) systems, online education, factory automation and map making are just a few of its applications. PaddleOCR is an OCR toolkit built for such real-world use.

PP-OCR is an ultra-lightweight OCR system, with a model size of only about 17 MB. It supports more than 80 multilingual recognition models, covering widely spoken languages such as English and Chinese. The semi-automatic annotation tool PPOCRLabel supports table and key information annotation modes. Style-Text, a data synthesis tool, makes it easy to generate large numbers of images that resemble the target scene.

PaddleOCR is easy to use and can be installed with pip. It supports Linux, Windows and macOS. The PaddleOCR GitHub repository has more than 21,000 stars as of this writing. Developers get practical, industry-leading multilingual OCR tools that help them train better models and put them into practice.
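As a rough illustration of the pip-based workflow, the sketch below assumes the package was installed with `pip install paddleocr` (alongside the PaddlePaddle framework). It also assumes that each recognized line arrives as a `[box, (text, confidence)]` entry, as the 2.x releases documented; the exact nesting of the result has varied between versions. The helper names `run_ocr` and `confident_text` are hypothetical and not part of PaddleOCR.

```python
from typing import List, Sequence, Tuple

# Assumed shape of one recognized line: [box, (text, confidence)],
# where box is a list of four [x, y] corner points.
OcrLine = Sequence


def run_ocr(image_path: str, lang: str = "en"):
    """Run PaddleOCR on one image (requires `pip install paddleocr`)."""
    # Imported lazily so the rest of this module works without the package.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(lang=lang)  # fetches default models on first use
    return ocr.ocr(image_path)  # result nesting depends on the version


def confident_text(lines: List[OcrLine],
                   min_conf: float = 0.8) -> List[Tuple[str, float]]:
    """Keep (text, confidence) pairs whose confidence is at least min_conf."""
    kept = []
    for _box, (text, conf) in lines:
        if conf >= min_conf:
            kept.append((text, float(conf)))
    return kept


if __name__ == "__main__":
    # Hand-written sample in the assumed format; not real model output.
    sample = [
        [[[10, 10], [120, 10], [120, 40], [10, 40]], ("PaddleOCR", 0.98)],
        [[[10, 50], [90, 50], [90, 80], [10, 80]], ("b1urry", 0.41)],
    ]
    print(confident_text(sample))
```

Filtering on the confidence score is a common first step before passing recognized text to downstream search or extraction; the 0.8 threshold here is arbitrary and would need tuning per application.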

The PaddleOCR team created PP-OCR, an ultra-lightweight OCR system for industrial use that balances accuracy and speed. PP-OCRv3 builds on PP-OCRv2, optimizing the text detection and recognition models in nine aspects.

At comparable speed, the accuracy of the English models increased by 11% over PP-OCRv2, and that of the Chinese models by 5%. The average recognition accuracy of the more than 80 multilingual models increased by more than 5%. The detection network itself is largely unchanged.


PP-OCRv3 further optimizes both the teacher and student models. For recognition, SVTR_Tiny (a lightweight text recognition network) achieves higher accuracy than the PP-OCRv2 recognizer, but its prediction is about 11 times slower: it takes about 100 milliseconds on a CPU to predict a single line of text.
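To put those figures in perspective, here is a back-of-the-envelope calculation. It takes the roughly 100 ms per text line at face value and reads the factor of 11 as a slowdown relative to PP-OCRv2. It also assumes, purely for illustration, a page with 30 text lines recognized one after another; the function name and the line count are hypothetical.

```python
def page_latency_ms(lines_per_page: int, ms_per_line: float) -> float:
    """Total recognition time when lines are processed sequentially."""
    return lines_per_page * ms_per_line


SVTR_TINY_MS_PER_LINE = 100.0  # figure quoted in the article
SLOWDOWN_VS_V2 = 11.0          # figure quoted in the article
LINES_PER_PAGE = 30            # assumption for illustration only

# Implied PP-OCRv2 cost per line: 100 ms / 11, roughly 9 ms.
v2_ms_per_line = SVTR_TINY_MS_PER_LINE / SLOWDOWN_VS_V2

if __name__ == "__main__":
    svtr_page = page_latency_ms(LINES_PER_PAGE, SVTR_TINY_MS_PER_LINE)
    v2_page = page_latency_ms(LINES_PER_PAGE, v2_ms_per_line)
    print(f"SVTR_Tiny: {svtr_page:.0f} ms per page")
    print(f"PP-OCRv2:  {v2_page:.0f} ms per page")
```

Under these assumptions the unoptimized SVTR_Tiny recognizer would need about 3 seconds per page against roughly 0.27 seconds for PP-OCRv2, which is why recognition speed needed attention.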

PP-OCRv3 uses six optimization strategies to speed up the recognition model, as shown in the figure below.


To detect and recognize text in images automatically, PPOCRLabel has a built-in PP-OCR model. PPOCRLabelv2 adds the following features:

  • New annotation modes for tables, images with irregular text (such as seals and curved text), and key information extraction tasks;
  • Box locking, batch processing, image rotation and dataset splitting;
  • Box rotation is now supported, and the tool can be installed from a whl package.
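Dataset splitting, one of the features listed above, amounts to shuffling the annotated samples and dividing them into subsets. The following is a generic sketch of that idea, not PPOCRLabel's own implementation; the 6:2:2 train/validation/test ratio, the fixed seed and the function name are assumptions chosen for illustration.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(samples: Sequence[str],
                  ratios: Sequence[int] = (6, 2, 2),
                  seed: int = 0) -> Dict[str, List[str]]:
    """Shuffle samples and split them into train/val/test by integer ratios."""
    if len(ratios) != 3 or any(r < 0 for r in ratios) or sum(ratios) == 0:
        raise ValueError("ratios must be three non-negative numbers, not all zero")
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded for a reproducible split

    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],  # remainder goes to the test set
    }


if __name__ == "__main__":
    files = [f"img_{i:03d}.jpg" for i in range(10)]
    for name, subset in split_dataset(files).items():
        print(name, len(subset))
```

Seeding the shuffle keeps the split stable across runs, so a model retrained later is evaluated on the same held-out images.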

“Dive into OCR” is an e-book developed by the PaddleOCR community that combines OCR theory and practice. PaddleOCR supports a wide range of state-of-the-art OCR algorithms and provides industrial models and solutions such as Template OCR, PP-OCR and PP-Structure.


Shruti is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT) Kanpur, India.