Authors:
Gebendorfer, Christoph; Elnaggar, Ahmed 
Author affiliation:
TUM 
Publisher:
TUM 
Title:
Legal JRC-Acquis - Translation Corpus 
Time of production:
30.01.2018 
Subject area:
DAT Data processing, computer science 
Other subject areas:
Legal Domain 
Resource type:
Text documents 
Data type:
Texts 
Description:
This corpus is derived from the original JRC-Acquis corpus, which contains legislative documents of the European Parliament since 1958. It contains a subset of the original corpus, processed into aligned form (Moses/Giza++). It provides parallel text in 21 language pairs based on 7 languages (cs, de, en, es, fr, it, sv), which can be used directly in data-intensive translation systems. The files are split into training and test sets. Size: ~24 million sentence pairs Testse...
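The figure of 21 language pairs follows from taking every unordered pair of the 7 listed languages. The short sketch below checks that arithmetic. The `xx-yy` pair labels and the function name `language_pairs` are illustrative assumptions, not the naming used in the corpus files.

```python
from itertools import combinations

# The seven language codes listed in the description.
LANGUAGES = ["cs", "de", "en", "es", "fr", "it", "sv"]


def language_pairs(languages):
    """Return every unordered pair of languages as an 'xx-yy' label.

    The label format is an assumption made for illustration only.
    """
    return [f"{a}-{b}" for a, b in combinations(sorted(languages), 2)]


pairs = language_pairs(LANGUAGES)
print(len(pairs))  # 7 choose 2 = 21
```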
 
Method of data assessment:
Derivation of the JRC-Acquis corpus 
Key words:
legal-jrc-acquis; parallel legislative texts; jrc acquis documents 
Technical remarks:
Moses/Giza++ Format
View and download (2.1 GB, 23 files)
The data server also offers downloads with FTP
The data server also offers downloads with rsync (password m1446655):
rsync rsync://m1446655@dataserv.ub.tum.de/m1446655/ 
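Moses-style parallel data is usually stored as two plain-text files, one per language, where line n of one file is the translation of line n of the other. Assuming the downloaded files follow that convention and are UTF-8 encoded, a minimal reader might look like the sketch below. The function name `read_parallel` and the file names in the usage comment are hypothetical; the record does not state the actual file names.

```python
from itertools import zip_longest


def read_parallel(src_path, tgt_path, encoding="utf-8"):
    """Yield (source, target) sentence pairs from two line-aligned files.

    Assumes one sentence per line and that both files have the same
    number of lines; raises ValueError if one file ends early.
    """
    with open(src_path, encoding=encoding) as src, \
            open(tgt_path, encoding=encoding) as tgt:
        for line_no, (s, t) in enumerate(zip_longest(src, tgt), start=1):
            if s is None or t is None:
                raise ValueError(f"files differ in length at line {line_no}")
            yield s.rstrip("\n"), t.rstrip("\n")


# Usage (hypothetical file names):
# for en, de in read_parallel("train.en", "train.de"):
#     ...
```

Reading lazily with a generator avoids loading millions of sentence pairs into memory at once.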
Language:
de 
Rights:
by, http://creativecommons.org/licenses/by/4.0 
Other rights:
Rights implied by original corpus (JRC-Acquis), Commission Decision of 12 December 2011 on the re-use of Commission documents, published in Official Journal of the European Union L330 of 14 December 2011, pages 39 to 42