Key concepts
Graph Integrated Language Transformers can improve next action prediction in complex phone call conversations by removing the dependency on external components and by handling grounding issues.
Summary
The paper investigates an approach to predicting the next action in complex phone call conversations without relying on external information extraction (i.e., slot filling and intent classification) or knowledge-based components.
The proposed models, Graph Integrated Language Transformers, learn the co-occurrences of actions and human utterances through a graph component and combine it with language transformers to add language understanding. The models are trained on conversations that followed a Standard Operating Procedure (SOP), without the need to encode the SOP explicitly.
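The graph component here amounts to learning how actions co-occur across SOP-following calls. A minimal sketch of building such an action co-occurrence graph from logged calls is below, assuming each call is available as an ordered list of action labels; the action names and the normalization into transition probabilities are illustrative, not details from the paper.

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# Hypothetical call logs: each call is an ordered list of agent actions.
calls = [
    ["greet", "verify_identity", "ask_reason", "check_account", "resolve"],
    ["greet", "verify_identity", "check_account", "escalate"],
]

# Count directed co-occurrences of consecutive actions across calls.
cooccurrence = Counter()
for actions in calls:
    cooccurrence.update(pairwise(actions))

# Normalize over each action's outgoing edges to approximate
# P(next action | current action).
totals = Counter()
for (src, _), count in cooccurrence.items():
    totals[src] += count
transition_probs = {
    (src, dst): count / totals[src]
    for (src, dst), count in cooccurrence.items()
}
print(transition_probs[("greet", "verify_identity")])  # 1.0 in this toy log
```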
The key highlights are:
- Integrating graph information and combining it with language transformers to remove the dependency on NLU pipelines.
- Adding a graph component (i.e., a history of action co-occurrence) to language transformers to predict the next action as one atomic task, while also overcoming the token limit by removing the need to keep the prior dialogue history (see the sketch after this list).
- Evaluating the proposed next action prediction models in a production setting against a system that relies on an NLU pipeline with an explicitly defined dialogue manager.
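As a concrete illustration of the second bullet, the sketch below (assuming PyTorch and the Hugging Face transformers library) fuses an embedding of the ordered action history with a transformer encoding of only the current utterance. The GRU standing in for the graph embedding layer, the layer sizes, and the concatenation-based fusion are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class GraphIntegratedTransformer(nn.Module):
    """Sketch: classify the next action from the current utterance plus
    an ordered history of prior action IDs (no full dialogue text)."""

    def __init__(self, num_actions: int, action_dim: int = 64,
                 text_model: str = "distilbert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(text_model)
        self.action_emb = nn.Embedding(num_actions, action_dim)
        # A GRU preserves the *order* of prior actions, standing in for
        # the graph embedding layer over the action history.
        self.history_rnn = nn.GRU(action_dim, action_dim, batch_first=True)
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(hidden + action_dim, num_actions)

    def forward(self, input_ids, attention_mask, action_history):
        # Encode only the current utterance; prior turns are summarized
        # by action IDs, which keeps the input under the token limit.
        text = self.encoder(input_ids=input_ids,
                            attention_mask=attention_mask
                            ).last_hidden_state[:, 0]  # first-token vector
        _, h = self.history_rnn(self.action_emb(action_history))
        fused = torch.cat([text, h[-1]], dim=-1)
        return self.classifier(fused)  # logits over all next actions

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = GraphIntegratedTransformer(num_actions=80)
batch = tokenizer(["I need to check my account balance."], return_tensors="pt")
history = torch.tensor([[3, 7, 12]])  # hypothetical prior action IDs
logits = model(batch["input_ids"], batch["attention_mask"], history)
```

Because only the current utterance is tokenized, the input length stays bounded no matter how long the call runs, which is the token-limit benefit described above.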
The analyses indicate that keeping the action history in order through a graph embedding layer, combined with language transformers, generates higher-quality outputs than more complex models that encode the connection details of actions (i.e., graph neural networks). The proposed models improve next action prediction in terms of F1 score as well as product-level metrics and human-centered evaluation.
Statistics
The dataset comprises 593,156 dialogue turns from 21,220 phone calls, with an average of 544.16 tokens per call and 19.47 tokens per turn.
The dataset contains 80 different next actions, with an imbalanced frequency distribution.
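As a quick arithmetic check, the per-turn and per-call token averages are mutually consistent with the turn and call counts:

```python
turns, calls = 593_156, 21_220
avg_tokens_per_turn = 19.47

turns_per_call = turns / calls                          # ~27.95 turns/call
print(round(turns_per_call * avg_tokens_per_turn, 1))   # 544.2, matching the
                                                        # reported 544.16
```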
Quotes
"Integrating graph information and combining with language transformers to remove dependency on NLU pipelines."
"Adding a graph component (i.e., history of action co-occurrence) to language transformers to predict the next action as one atomic task while also overcoming the token limit by removing the need to keep prior dialogue history."
"Evaluating the proposed next action prediction model in a production setting against a system that relies on an NLU pipeline with an explicitly defined dialogue manager."