Assessing complexity of Russian legal texts: The model’s architecture
DOI:
https://doi.org/10.24412/1811-1629-2022-2-4-13
Abstract
The paper describes a metrics-based model for assessing the complexity of Russian legal texts. The model uses 130 metrics divided into the following categories: “basic metrics”, “readability formulas”, “words of different part-of-speech classes”, “n-grams of part-of-speech tags”, “frequency of lemmas”, “word-building patterns”, “grammemes”, “lexical and semantic features, multi-word expressions”, “syntactic features”, and “cohesion assessments”. Two metrics take into account hypertext links and the presence of vague contexts. The model evaluates structural, conceptual, and hypertextual complexity, combining non-specific metrics traditionally used to predict complexity with style-specific metrics developed to reflect the peculiarities of official texts. When evaluating morphological and syntactic features, the model relies on the annotation layers produced by UDPipe (“rusyntagrus”) and pymorphy2. The model also draws on a number of user dictionaries: a list of lexical means of text deixis, a list of graphic abbreviations (1,500 units), a list of acronyms (2,000 units), a list of legal terms (10,000 units), a list of abstract lemmas (17,000 units), a list of lexical indicators of deontic possibility and necessity, and a list of light verb constructions. The values of the complexity metrics were calculated for all documents of the CorCodex law corpus, the CorDec corpus of Constitutional Court decisions, and the CorRIDA corpus of local acts (about 8 million tokens in total). The annotated legal corpora, complexity metrics, and user dictionaries are available for download from plaindocument.org.
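To illustrate what metrics in the “basic metrics” category might look like, the following is a minimal sketch in Python using only the standard library. It is not the authors' implementation: the function name `basic_metrics`, the regular expressions, and the choice of metrics (average sentence length, average word length, type-token ratio) are assumptions made for illustration. The sentence splitter is deliberately naive and would mis-split at graphic abbreviations such as “ст.”, which is one reason a model of this kind would need a dedicated abbreviation list.

```python
import re

# Hypothetical, simplified patterns; the model's own tokenization is not
# described in the abstract and is assumed here for illustration only.
SENTENCE_END = re.compile(r"[.!?…]+\s+")       # naive sentence boundary
WORD = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")  # letters, optionally hyphenated


def basic_metrics(text: str) -> dict:
    """Compute a few surface-level complexity metrics for a text."""
    sentences = [s for s in SENTENCE_END.split(text.strip()) if s]
    words = [w.lower() for w in WORD.findall(text)]
    if not sentences or not words:
        return {"sentences": 0, "words": 0, "avg_sentence_length": 0.0,
                "avg_word_length": 0.0, "type_token_ratio": 0.0}
    return {
        "sentences": len(sentences),
        "words": len(words),
        # Mean number of words per sentence
        "avg_sentence_length": len(words) / len(sentences),
        # Mean number of characters per word
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Share of distinct word forms among all word tokens
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    sample = "Суд принял решение. Решение вступило в силу."
    for name, value in basic_metrics(sample).items():
        print(f"{name}: {value}")
```

Metrics that depend on lemmas, part-of-speech tags, or syntax would require the morphological and syntactic annotation layers mentioned above and are outside the scope of this sketch.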
Keywords:
Russian legal texts, complexity assessment model, linguistic metrics, readability
License
Articles in "The World of Russian Word" are open access and distributed under the terms of the License Agreement with Saint Petersburg State University, which permits authors unrestricted distribution and self-archiving free of charge.