Analyzing regulatory documents is a continuous challenge for numerous companies, especially
if it is a manual process. Considering the exponential growth in legal acts, legal practitioners
must invest vast amounts of time examining the legal text for relevant information. Nevertheless,
the manual analysis remains susceptible to errors and misinterpretation. This thesis concentrates
on semi-automating this procedure and presents an approach for extracting legal definitions and
their semantic relations from European regulatory documents using natural language processing
techniques. We further visualize the obtained data on the implemented web service, which serves as a
practical application for the approach. Since the existing methodologies addressing legal information
retrieval tasks struggle with interpreting legal text and lack semantic analysis and visualization, our
method intends to cover this research gap and deepen the understanding of regulatory documents.
To identify legal definitions, we first examined the structure of legal acts that regulatory
documents attempt to follow. After recognizing similar formats, we focused on a single article
specifying legal terms, extracted its definitions, and analyzed all semantic relations that occur, such
as hyponymy, meronymy, and synonymy. Depending on the type of semantic
relation, we applied pattern matching and natural language processing techniques, emphasizing
dependency parsing and noun phrase chunking. For visualization, the prototype collected the
data into separate files and extracted sentences mentioning legal definitions for each related term.
To rapidly discover these sentences in the text and obtain an overview of each term’s frequency,
the prototype listed the articles where the definitions occur and counted the number of retrieved
sentences. Additionally, it assigned annotations to the regulatory documents, explaining the legal
definitions in each paragraph to facilitate comprehension of the regulatory documents.
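
As a rough illustration of the pattern-matching step, the following minimal sketch extracts definitions of the assumed surface form 'term' means explanation; and pairs a term with another defined term that opens its explanation. The sample clauses, the regular expression, and the function names extract_definitions and hyponym_candidates are hypothetical and are not taken from the prototype. The regex heuristic is only a stand-in for the dependency parsing and noun phrase chunking that the approach relies on.

```python
import re

# Assumed surface form of a definitions article: 'term' means explanation;
# This pattern is an illustrative guess, not the prototype's actual rule set.
DEFINITION = re.compile(r"['‘](?P<term>[^'’]+)['’]\s+means\s+(?P<body>[^;]+)")


def extract_definitions(text):
    """Return {term: explanation} for every clause matching the assumed pattern."""
    return {
        m.group("term").strip(): m.group("body").strip().rstrip(".")
        for m in DEFINITION.finditer(text)
    }


def hyponym_candidates(definitions):
    """Pair a term with another defined term that opens its explanation.

    Crude heuristic: "'X' means a/an/any/the Y ..." suggests X is a kind of Y.
    """
    pairs = []
    for term, body in definitions.items():
        for other in definitions:
            if other == term:
                continue
            if re.match(rf"(?:an?|any|the)\s+{re.escape(other)}\b", body):
                pairs.append((term, other))
    return pairs


# Invented sample clauses, not quoted from any regulation.
sample = (
    "(1) 'institution' means a legal person established under national law; "
    "(2) 'credit institution' means an institution that takes deposits from the public."
)
definitions = extract_definitions(sample)
relations = hyponym_candidates(definitions)
```

Here relations holds the single pair ("credit institution", "institution"), since the second explanation opens with the first defined term. Meronymy and synonymy would need their own cue patterns, and real definitions with nested clauses are where a parser-based approach becomes necessary.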
The evaluation outcomes demonstrated that the prototype could detect 99.9% of legal definitions
and 96.7% of their semantic relations correctly, thereby delivering accurate results for the introduced
approach. The study further fulfilled the established requirements intended to simplify the platform’s
usage. Consequently, these results demonstrate that natural language processing techniques
perform well in the classification phase and are suitable for definition and relation extraction.