NLU Survey

[TOC]

Business scenario: task-oriented dialogue understanding on small (few-shot) datasets.

Dialogue systems fall into three categories:

  1. Question answering (QA)
  2. Task-oriented
  3. Chit-chat

1. Rule-based Methods

1.1 Intent Recognition

  • Dictionary / keyword matching (a toy sketch follows this list)
  • CFG (context-free grammar)
  • JSGF (JSpeech Grammar Format)
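
A minimal sketch of the dictionary/rule idea in Python. The intent names and regex patterns below are invented for illustration; a real system would compile them from curated dictionaries or CFG/JSGF grammars.

```python
import re

# Hypothetical intents and trigger patterns; a real system would load these
# from curated dictionaries or grammar files compiled to regular expressions.
INTENT_PATTERNS = {
    "play_music":    [r"播放.*(歌|音乐)", r"来一首"],
    "set_alarm":     [r"(设|定).*闹钟", r"(\d+)点叫我"],
    "query_weather": [r"天气", r"(会不会|要不要).*(下雨|带伞)"],
}

def match_intent(text):
    """Return the first intent whose pattern matches the input, else None."""
    for intent, patterns in INTENT_PATTERNS.items():
        for pat in patterns:
            if re.search(pat, text):
                return intent
    return None

print(match_intent("明天天气怎么样"))   # -> query_weather
print(match_intent("给我讲个笑话"))     # -> None (falls through to a model or chit-chat)
```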

References:

1.2 Named Entity Recognition

Requires building entity dictionaries (gazetteers).

References:

2. Model-based Methods

A dataset survey about task-oriented dialogue, including recent datasets and SoA results & papers.

2.1 pipeline

The pipeline approach splits intent recognition and slot filling into two independent components that are trained separately.

2.1.1 Intent Recognition

This is essentially a short-text classification task, so any standard text classification algorithm can handle it.

Traditional algorithms:

  • LR
  • SVM
  • KNN
  • RF
  • GBDT

Deep learning methods:

  • Fasttext
  • TextCNN
  • GRU
  • LSTM
  • IDCNN
  • TextRNN

Based on this survey, pre-trained fastText word vectors combined with a single-layer TextCNN give the best trade-off between classification accuracy and speed, and are therefore the preferred choice; a minimal sketch follows the list of TextCNN improvements below.

Improvements on TextCNN:

  • K-max pooling
  • DPCNN
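
A minimal sketch of the preferred setup (pre-trained fastText vectors feeding a single-layer TextCNN), written with Keras purely for illustration. The vocabulary size, kernel widths, and the random placeholder standing in for the fastText matrix are all assumptions.

```python
import numpy as np
from tensorflow.keras import layers, models

# Assumed dimensions: vocab_size x 300 pre-trained fastText matrix, max_len-token inputs.
vocab_size, embed_dim, max_len, num_intents = 20000, 300, 32, 10
pretrained = np.random.rand(vocab_size, embed_dim).astype("float32")  # stand-in for fastText vectors

inputs = layers.Input(shape=(max_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim, weights=[pretrained], trainable=False)(inputs)
# A single convolutional "layer" with several kernel widths, as in the original TextCNN.
convs = [layers.GlobalMaxPooling1D()(layers.Conv1D(128, k, activation="relu")(x)) for k in (2, 3, 4)]
x = layers.Concatenate()(convs)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(num_intents, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```

Whether to freeze the embedding layer (`trainable=False` above) is a tunable choice; fine-tuning the embeddings often helps when enough in-domain data is available.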

2.1.2 Slot Filling

  • CRF
  • RNN/LSTM/CNN+CRF
  • BiLSTM+CRF
  • BiLSTM+CNN+CRF
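
A minimal sketch of CRF-based slot filling with BIO labels, assuming the sklearn-crfsuite package; the toy utterances, features, and slot names are made up.

```python
import sklearn_crfsuite  # pip install sklearn-crfsuite

def token_features(tokens, i):
    """Hand-crafted features for one token; a real system would add POS tags, gazetteer hits, etc."""
    tok = tokens[i]
    return {
        "bias": 1.0,
        "token": tok,
        "is_digit": tok.isdigit(),
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }

# Tiny made-up training set with BIO slot labels.
sentences = [["book", "a", "flight", "to", "beijing", "tomorrow"],
             ["play", "jazz", "music"]]
labels = [["O", "O", "O", "O", "B-city", "B-date"],
          ["O", "B-genre", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, labels)
print(crf.predict([[token_features(["fly", "to", "beijing"], i) for i in range(3)]]))
```

The BiLSTM+CRF variants in the list replace these hand-crafted feature dictionaries with learned sequence representations while keeping the CRF transition layer on top.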

2.2 joint model

The model mentioned in the third item: Convolutional Sequence to Sequence Learning

3. Industry Practice

3.1 AliMe (阿里小蜜)

arXiv: AliMe Assist: An Intelligent Assistant for Creating an Innovative E-commerce Experience

Note: according to people inside the company, this framework is outdated and has been deprecated.

  • Business rule parser: a trie-based matching structure built from a large number of hand-written patterns (a toy trie sketch follows this list)
  • Intention classifier: scenario classification; pre-training uses fastText, classification uses a single-layer CNN
    • requesting for assistance
    • asking for information or solution
    • chatting
  • Semantic Parser: also trie-based, matching entities from the knowledge graph
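
A toy sketch of the trie-based pattern-matching idea (not AliMe's actual code); the patterns and payloads are invented.

```python
class TrieNode:
    __slots__ = ("children", "payload")
    def __init__(self):
        self.children = {}
        self.payload = None  # intent / KG entity id attached to a complete pattern

class Trie:
    """Toy prefix-tree matcher in the spirit of a business rule parser."""
    def __init__(self):
        self.root = TrieNode()

    def insert(self, pattern, payload):
        node = self.root
        for ch in pattern:
            node = node.children.setdefault(ch, TrieNode())
        node.payload = payload

    def match_all(self, text):
        """Return payloads of every pattern occurring anywhere in the text."""
        hits = []
        for start in range(len(text)):
            node = self.root
            for ch in text[start:]:
                node = node.children.get(ch)
                if node is None:
                    break
                if node.payload is not None:
                    hits.append(node.payload)
        return hits

trie = Trie()
trie.insert("退款", "intent:refund")          # hypothetical business patterns
trie.insert("怎么退货", "intent:return_goods")
print(trie.match_all("你好,我想问一下怎么退货和退款"))
# -> ['intent:return_goods', 'intent:refund']
```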

3.2 Meituan (美团)

Reference: 美团对话理解技术及实践 (Meituan's dialogue understanding techniques and practice)

Covers context-free grammars, the tooling, and how rules are written.

4. Data

5. Open-source Tools

5.1 ChatterBot

GitHub: 9.1k stars

There is no NLU module; the approach is retrieval-based matching. The training input is a set of complete conversations, stored in a database.

Output responses are obtained through logic adapters:

  • BestMatch
  • TimeLogicAdapter
  • MathematicalEvaluation

This framework mainly applies similarity matching to the question text to find a predefined answer in its database, which makes it better suited to knowledge-based QA scenarios. A minimal usage sketch is given below.
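
A minimal usage sketch, assuming ChatterBot's classic 1.x API (ChatBot, ListTrainer, logic_adapters); the training dialogue is made up.

```python
# pip install chatterbot  (sketch assumes the ChatterBot 1.x API)
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

bot = ChatBot(
    "demo",
    logic_adapters=[
        "chatterbot.logic.BestMatch",             # similarity-based retrieval
        "chatterbot.logic.MathematicalEvaluation",
        "chatterbot.logic.TimeLogicAdapter",
    ],
)

# Training input is one conversation: a list of alternating utterances.
trainer = ListTrainer(bot)
trainer.train([
    "How do I reset my password?",
    "Open the settings page and click 'Forgot password'.",
])

print(bot.get_response("how to reset password"))
```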

5.2 rasa

Data

Intent classification

  • KeywordIntentClassifier:This classifier is mostly used as a placeholder. It is able to recognize hello and goodbye intents by searching for these keywords in the passed messages.
  • MitieIntentClassifier: This classifier uses MITIE to perform intent classification. The underlying classifier is using a multi-class linear SVM with a sparse linear kernel
  • SklearnIntentClassifier: The sklearn intent classifier trains a linear SVM which gets optimized using a grid search. It requires a featurizer earlier in the pipeline.
  • EmbeddingIntentClassifier: The embedding intent classifier embeds user inputs and intent labels into the same space. Supervised embeddings are trained by maximizing similarity between them. This algorithm is based on StarSpace.

Entity extraction

  • MitieEntityExtractor:The underlying classifier is using a multi class linear SVM with a sparse linear kernel and custom features
  • SpacyEntityExtractor:Using spaCy this component predicts the entities of a message. spacy uses a statistical BILOU transition model.
  • EntitySynonymMapper: Maps synonymous entity values to the same value. The synonyms are supplied through values in the training data.
  • CRFEntityExtractor: spaCy has to be installed; it appears to build on spaCy's implementation.
  • DucklingHTTPExtractor: Duckling lets you extract common entities like dates, amounts of money, distances, and others in a number of languages.

Slot filling

Official docs: using slots

References:

Custom components can be plugged into any part of the pipeline: Enhancing Rasa NLU models with Custom Components. A minimal training/parsing sketch through the Python API follows.
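
A minimal sketch of training and parsing through the Python API, assuming Rasa 1.x module paths (rasa.nlu.model.Trainer / Interpreter); the file paths and pipeline config are placeholders, and newer Rasa versions expose different entry points.

```python
# Sketch assuming the Rasa 1.x Python API; Rasa 2.x+ uses different entry points.
from rasa.nlu import config
from rasa.nlu.model import Trainer, Interpreter
from rasa.nlu.training_data import load_data

training_data = load_data("data/nlu.md")   # intent/entity examples (placeholder path)
cfg = config.load("config.yml")            # pipeline: tokenizer, featurizer, classifier, extractor
trainer = Trainer(cfg)
trainer.train(training_data)
model_dir = trainer.persist("./models")

interpreter = Interpreter.load(model_dir)
print(interpreter.parse("turn on the lights in the kitchen"))
# -> {"intent": {...}, "entities": [...], ...}
```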

5.3 DeepPavlov

deepmipt/DeepPavlov: 3.6k stars

An open source library for deep learning end-to-end dialog systems and chatbots. https://deeppavlov.ai

Supports English and Russian. Feature-rich; useful as a learning reference.

Basic concepts

  • Agent is a conversational agent communicating with users in natural language (text).
  • Skill fulfills the user's goal in some domain. Typically, this is accomplished by presenting information or completing a transaction (e.g. answering an FAQ question, booking tickets, etc.). However, for some tasks the success of the interaction is defined as continuous engagement (e.g. chit-chat).
  • Model is any NLP model that does not necessarily communicate with the user in natural language.
  • Component is a reusable functional part of a Model or Skill.
  • Rule-based Models cannot be trained.
  • Machine Learning Models can only be trained standalone.
  • Deep Learning Models can be trained independently, or end-to-end when joined into a chain.
  • Skill Manager performs selection of the Skill used to generate the response.
  • Chainer builds an agent/model pipeline from heterogeneous components (rule-based/ML/DL), allowing the pipeline to be trained and run for inference as a whole.

Models:

Intent classification

  • BERT classifier (see here) builds the BERT architecture for classification on TensorFlow.
  • Keras classifier (see here) builds a neural network on Keras with a TensorFlow backend.
  • Sklearn classifier (see here) wraps most of the sklearn classifiers.

The set of available classification models is rich.
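
A minimal sketch of DeepPavlov's config-driven usage via build_model; the specific config name (configs.classifiers.intents_snips) is an assumption and may differ across versions.

```python
# Sketch of DeepPavlov's config-driven API; the config name is an assumption.
from deeppavlov import build_model, configs

model = build_model(configs.classifiers.intents_snips, download=True)
print(model(["play some jazz", "what's the weather in Boston"]))
```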

NER

  • Standard RNN-based and BERT-based models
  • Multilingual BERT Zero-Shot Transfer
  • Few-shot Language-Model based

Slot filling

Official docs: Neural Named Entity Recognition and Slot Filling

This model solves the slot-filling task using Levenshtein search combined with different neural network architectures for NER.

The slot filler performs a fuzzy search through all variations of all entity values of the given entity type; the entity type itself is determined by the NER component. A toy sketch of this fuzzy-matching step is given below.
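
A toy sketch of the idea (not DeepPavlov's implementation): map an NER span to the closest gazetteer value of its entity type by Levenshtein distance. The gazetteer contents and the distance threshold are invented.

```python
def levenshtein(a, b):
    """Plain dynamic-programming edit distance."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

# Hypothetical gazetteer: entity type -> known surface values.
GAZETTEER = {"city": ["北京", "上海", "深圳"], "cuisine": ["川菜", "粤菜", "日料"]}

def fill_slot(span, entity_type, max_dist=1):
    """Map a (possibly misspelled) NER span to the closest known value of its type."""
    values = GAZETTEER.get(entity_type, [])
    best = min(values, key=lambda v: levenshtein(span, v), default=None)
    if best is not None and levenshtein(span, best) <= max_dist:
        return best
    return None

print(fill_slot("北亰", "city"))   # -> 北京 (one edit away)
```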

Usage blog posts: DeepPavlov articles with Python code

Rule authoring

Only dialogue rule authoring was found, done through PatternMatchingSkill, where patterns and responses are written as regular expressions; a minimal sketch is shown below.
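
A minimal sketch following DeepPavlov's "hello bot" tutorial; the module paths match the 0.x releases and may have moved since, and the patterns/responses are placeholders.

```python
# Sketch based on DeepPavlov's "hello bot" tutorial (0.x module paths).
from deeppavlov.skills.pattern_matching_skill import PatternMatchingSkill
from deeppavlov.agents.default_agent.default_agent import DefaultAgent
from deeppavlov.agents.processors.highest_confidence_selector import HighestConfidenceSelector

hello = PatternMatchingSkill(responses=["Hello!"], patterns=["hi", "hello", "good day"])
bye = PatternMatchingSkill(responses=["Goodbye!"], patterns=["bye", "see you"])
fallback = PatternMatchingSkill(responses=["I don't understand, sorry"])  # no pattern: default answer

agent = DefaultAgent([hello, bye, fallback], skills_selector=HighestConfidenceSelector())
print(agent(["Hello", "Bye", "Tell me a joke"]))
```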

There is also a Rasa Skill that wraps Rasa.

Known issues with DeepPavlov

  1. Environment dependencies
    • DeepPavlov is built on TensorFlow and Keras, so it cannot reuse model implementations from other frameworks (e.g. PyTorch).
  2. Language support
    • Pre-trained models and evaluation datasets are mainly English and Russian; Chinese is not supported.
  3. Production deployment
    • At runtime DeepPavlov depends on the full framework source, so after modifying the framework in development, the entire framework has to be updated in production.
    • Components cannot be exported directly as standalone services, which makes deployment and release in production inconvenient.

5.4 Snips-nlu

snipsco/snips-nlu: 3k stars

Snips Python library to extract meaning from text https://snips-nlu.readthedocs.io

Chinese is not supported.

Tutorial: both intents and slot values are declared directly in the training data:

# turnLightOn intent
---
type: intent
name: turnLightOn
slots:
  - name: room
    entity: room
utterances:
  - Turn on the lights in the [room](kitchen)
  - give me some light in the [room](bathroom) please
  - Can you light up the [room](living room) ?
  - switch the [room](bedroom)'s lights on please

This parser works in two steps: first it classifies the intent using an IntentClassifier, and once the intent is known it uses a SlotFiller to extract the slots (a usage sketch follows the component breakdown below).

IntentClassifier

  • Logistic Regression
  • Feature extractor for text classification relying on ngrams tfidf and optionally word cooccurrences features
  • scikit-learn TfidfVectorizer
  • Featurizer that takes utterances and extracts ordered word cooccurrence features matrix from them

SlotFiller

  • Linear-Chain Conditional Random Fields
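
A minimal sketch of the documented Python API (SnipsNLUEngine.fit / parse); it assumes the YAML above has already been converted to JSON with the snips-nlu generate-dataset CLI, and the file name is a placeholder.

```python
# Sketch of the snips-nlu Python API; the YAML dataset above would first be
# converted with: snips-nlu generate-dataset en lights.yaml > dataset.json
import io
import json

from snips_nlu import SnipsNLUEngine
from snips_nlu.default_configs import CONFIG_EN

with io.open("dataset.json") as f:
    dataset = json.load(f)

engine = SnipsNLUEngine(config=CONFIG_EN)   # log-reg intent classifier + CRF slot filler
engine.fit(dataset)

print(json.dumps(engine.parse("Turn on the lights in the kitchen"), indent=2))
# -> {"intent": {"intentName": "turnLightOn", ...}, "slots": [{"rawValue": "kitchen", ...}]}
```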

5.5 Others


References:

  1. 2019-06-17问答系统项目落地调研
  2. YSSNLP2019 人机对话研究热点及前沿技术概述
  3. 美团对话理解技术及实践
  4. 对话系统 NLU/DM 任务详解
  5. NLP笔记 - NLU之意图分类
  6. 自然语言理解中的槽位填充