Baidu recently announced PLATO-XL, an AI model for dialogue generation, which was trained on over a billion samples collected from social media conversations in both English and Chinese. PLATO-XL achieves state-of-the-art performance on several conversational benchmarks, outperforming currently available commercial chatbots.
The model and several experiments were described in a paper published on arXiv. PLATO-XL is based on a unified transformer architecture, which allows simultaneous learning of language understanding and response generation. The model uses multi-party-aware pre-training to distinguish utterances from different participants in a conversation, which improves the consistency of the bot's responses. When evaluated by human judges in open-domain conversation, PLATO-XL outperformed other chatbot models, including Facebook’s Blender. PLATO-XL also set new performance records on benchmarks for knowledge-grounded dialogue and task-oriented conversation. According to the Baidu team:
PLATO-XL opens up new horizons in open-domain conversations, one of the most difficult tasks in natural language processing. As the largest pre-training model for Chinese and English dialogue, PLATO-XL achieves new levels of conversational consistency and factuality, one step closer to the ultimate goal of human-like learning and conversational skills.
Natural language processing (NLP) AI models have been shown to perform better as they are scaled up. These larger models are pre-trained on massive datasets, often scraped from the web, before being fine-tuned for specific NLP tasks. However, Baidu researchers pointed out that it is currently unclear whether the dialogue generation models used by chatbots continue to benefit from increased scale, citing both Microsoft’s DialoGPT and Facebook’s Blender, where mid-size models outperformed the largest models of those architectures. The key to achieving performance improvements at larger scale, Baidu says, is the pre-training process.
PLATO-XL builds on the original PLATO model and the improved PLATO-2 model released in 2020. The heart of the model is a unified transformer, instead of the more common encoder-decoder architecture; this allows the model to share parameters between the language understanding and response generation tasks, making it more efficient. Like many other chatbots, PLATO-XL is pre-trained using conversations scraped from social media websites; in this case, comments from Reddit. However, because these conversations have multiple participants as well as a hierarchy of threads, models often mix information from different participants, producing inconsistent responses. To resolve this issue, Baidu added type and role embedding components to the training text inputs, which are used to distinguish between different types of responses and participants in the conversation.
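To make the two ideas in the preceding paragraph more concrete, the following is a minimal Python sketch, not Baidu's implementation. It assumes that a unified transformer lets every position attend to the whole dialogue context while response positions attend only to earlier response positions, and that each token carries a role id identifying its speaker. The function names `unified_attention_mask` and `assign_role_ids`, and the convention that the responding speaker gets role id 0, are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def unified_attention_mask(context_len: int, response_len: int) -> List[List[int]]:
    """Build a mask where mask[i][j] == 1 means position i may attend to j.

    Assumption for this sketch: context tokens see the whole context,
    and response tokens see the context plus earlier response tokens.
    """
    total = context_len + response_len
    mask = [[0] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if j < context_len:
                # Every position can read the full dialogue context.
                mask[i][j] = 1
            elif i >= context_len and j <= i:
                # Response tokens read only themselves and earlier response tokens.
                mask[i][j] = 1
    return mask


def assign_role_ids(
    turns: Sequence[Tuple[str, Sequence[str]]], responder: str
) -> List[int]:
    """Return one role id per token so a model can tell speakers apart.

    Hypothetical convention: the responder is role 0; other speakers get
    ids 1, 2, ... in order of first appearance.
    """
    roles = {responder: 0}
    ids: List[int] = []
    for speaker, tokens in turns:
        if speaker not in roles:
            roles[speaker] = len(roles)
        ids.extend([roles[speaker]] * len(tokens))
    return ids


if __name__ == "__main__":
    for row in unified_attention_mask(context_len=2, response_len=2):
        print(row)
    turns = [
        ("alice", ["hi", "there"]),
        ("bob", ["hello"]),
        ("alice", ["how", "are", "you"]),
    ]
    print(assign_role_ids(turns, responder="bob"))
```

In a real model, the role ids would index a learned embedding table whose vectors are added to the token and position embeddings; the sketch stops at the ids to avoid guessing at details the article does not give.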
Using its PaddlePaddle deep-learning platform, Baidu trained PLATO-XL on English and Chinese datasets consisting of context/response pairs; the English data contained 811 million samples, and the Chinese data contained 1.2 billion. To assess its performance, the team collected transcripts of English and Chinese conversations between humans and several different chatbots, including PLATO-XL as well as DialoGPT, Blender, and PLATO-2. Judges rated the conversations for coherence, informativeness, engagingness, inconsistency, and hallucination; PLATO-XL outperformed all the other bots. The team also evaluated PLATO-XL on three benchmark datasets: DuConv, DSTC9-Track1, and MultiWOZ; the bot set new state-of-the-art results, surpassing the previous best models by several percentage points.
InfoQ recently covered Baidu’s ERNIE 3.0 model, which surpassed the human baseline on the SuperGLUE language-understanding benchmark. Several other large NLP models for Chinese have also been developed recently. Earlier this year, Huawei announced its 200B-parameter PanGu-Alpha model, trained on 1.1TB of Chinese data, and cloud company Inspur announced its 245B-parameter Yuan model, trained on 5TB of data, which the company claims is “China’s largest high-quality corpus today.”
Baidu says it plans to release the PLATO-XL source code and the English model “before the end of November 2021” as part of its Knover toolkit, available on GitHub.