
Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents

Published 9 May 2021 in cs.CL (arXiv:2105.03887v1)

Abstract: Legal artificial intelligence (LegalAI) aims to benefit legal systems with the technology of artificial intelligence, especially NLP. Recently, inspired by the success of pre-trained language models (PLMs) in the generic domain, many LegalAI researchers devote their effort to apply PLMs to legal tasks. However, utilizing PLMs to address legal tasks is still challenging, as the legal documents usually consist of thousands of tokens, which is far longer than the length that mainstream PLMs can process. In this paper, we release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering. The experimental results demonstrate that our model can achieve promising improvement on tasks with long documents as inputs.

Summary

  • The paper introduces Lawformer as a novel pre-trained model that efficiently handles long Chinese legal texts using both local sliding window and global task-driven full attention.
  • The paper demonstrates significant performance gains in tasks like judgment prediction, legal case retrieval, reading comprehension, and question answering over traditional PLMs.
  • The paper outlines directions for future legal-domain adaptation, suggesting further exploration of knowledge-augmented pre-training and generative models tailored to legal reasoning.

The paper introduces Lawformer, a Longformer-based pre-trained language model designed for the Chinese legal domain, particularly for understanding long legal documents. Lawformer addresses a central challenge in LegalAI: legal documents typically exceed the input-length limit of mainstream pre-trained language models such as BERT and RoBERTa.

Motivation and Model Design

LegalAI leverages AI technologies, notably NLP, to enhance the efficiency of legal systems. Given the complexity and length of legal documents, traditional PLMs, which excel in generic domains but are typically limited to inputs of around 512 tokens, fail to perform adequately on legal texts that extend beyond their processing capacity. Lawformer uses a combination of local sliding-window attention and global task-driven full attention to manage these long sequences efficiently. This architecture yields cost that scales linearly with sequence length, rather than quadratically as in the full self-attention generally used in transformers.
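The attention pattern can be illustrated with a minimal NumPy sketch. This is an illustrative simplification, not the paper's implementation: each position attends only to a fixed local window, while a few designated global positions attend to, and are visible from, the entire sequence, so per-token work stays bounded and total cost grows linearly in length.

```python
import numpy as np

def sliding_window_attention(q, k, v, window=2, global_idx=()):
    """Longformer-style sparse attention (simplified, per-token loop).

    Each token attends to positions within +/- `window` of itself,
    plus any 'global' tokens; global tokens attend to everything.
    """
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        if i in global_idx:
            idx = np.arange(n)  # global token: full attention
        else:
            lo, hi = max(0, i - window), min(n, i + window + 1)
            idx = np.unique(np.concatenate(
                [np.arange(lo, hi), np.array(global_idx, dtype=int)]))
        scores = q[i] @ k[idx].T / np.sqrt(d)  # scaled dot-product
        w = np.exp(scores - scores.max())      # numerically stable softmax
        w /= w.sum()
        out[i] = w @ v[idx]
    return out
```

In practice the paper follows Longformer in assigning global attention to task-relevant positions (e.g., a leading special token), while the window handles local context; the loop above would of course be vectorized and batched in a real implementation.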

Pre-training and Model Evaluation

Lawformer is pre-trained on a vast collection of Chinese criminal and civil case documents. The authors initialize the model from a RoBERTa checkpoint and continue pre-training with the masked language modeling objective, adapting it to legal content. The pre-training corpus is processed to reflect the real-world distribution of legal texts.
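The masked language modeling objective is the standard BERT-style recipe: select ~15% of positions as prediction targets, then replace 80% of those with a mask token, 10% with a random token, and leave 10% unchanged. A hedged sketch follows; the mask id and vocabulary size are placeholder constants, not the paper's actual tokenizer values.

```python
import random

MASK_ID = 103      # placeholder [MASK] token id (assumption)
VOCAB_SIZE = 21128 # placeholder vocabulary size (assumption)

def mlm_mask(token_ids, mask_prob=0.15, seed=0):
    """BERT-style masking for masked language modeling.

    Returns (inputs, labels): selected positions keep their original
    id in `labels` (all others get -100, the conventional ignore index);
    in `inputs`, 80% of selected positions become MASK_ID, 10% a random
    token, and 10% stay unchanged.
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok  # this position is a prediction target
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK_ID
            elif r < 0.9:
                inputs[i] = rng.randrange(VOCAB_SIZE)
            # else: keep the original token
    return inputs, labels
```

The model is then trained to recover the original ids at the target positions via cross-entropy, which is what adapts the general-domain checkpoint to legal vocabulary and phrasing.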

The model is evaluated on multiple LegalAI tasks:

  1. Judgment Prediction: Utilizing newly constructed datasets, Lawformer shows superior performance, especially in representing the long-distance context critical for accurate predictions.
  2. Legal Case Retrieval: Lawformer outperforms traditional models on the LeCaRD dataset, showcasing its capability to retrieve relevant case documents with thousands of tokens.
  3. Reading Comprehension: On the CJRC dataset, Lawformer exhibits performance gains attributed to its in-domain adaptation, although its results are close to those of models pre-trained on non-legal corpora.
  4. Question Answering: Evaluated on the JEC-QA dataset, Lawformer demonstrates notable improvements, attributed to its sophisticated reasoning capabilities over extensive legal texts.
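For the case-retrieval setting, a common recipe, sketched here as an assumption rather than the paper's exact pipeline, is to pool each long document's encoder outputs into a single vector and rank candidates by cosine similarity to the query case:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, top_k=3):
    """Rank candidate case documents by cosine similarity.

    `query_vec` is a pooled encoding of the query case; `doc_vecs`
    is an (n_docs, dim) matrix of pooled candidate encodings.
    The encoder itself (e.g., a long-document model) is omitted here.
    """
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                     # cosine similarity per candidate
    order = np.argsort(-sims)[:top_k]  # indices of most similar docs
    return order, sims[order]
```

The benefit of a long-context encoder in this setting is that the pooled vector can reflect an entire multi-thousand-token case document rather than only a truncated prefix.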

Implications and Future Work

The introduction of Lawformer significantly advances LegalAI by enabling the processing and understanding of long legal documents, a previously daunting challenge. The study also provides a robust framework for future adaptation of large pre-trained models to domain-specific requirements, illustrating the potential of extended context learning.

The authors suggest further exploration in knowledge-augmented legal pre-training and generative models for legal tasks, aiming to embed legal reasoning and domain-specific knowledge more effectively. Such efforts may transform the operational landscape of legal practices by automating and enhancing document comprehension and case analysis.

In conclusion, Lawformer marks an important step in bridging the gap between general NLP advances and domain-specific applications, particularly in the legal sector. This work paves the way for developing more nuanced, context-aware language models tailored to domain-specific challenges.
