Document type:
Research data
Responsible:
Zügner, Daniel
Authors:
Zügner, Daniel (1); Kirschstein, Tobias (1); Catasta, Michele (2); Leskovec, Jure (2); Günnemann, Stephan (1)
Institutional affiliation:
1. Technical University of Munich (TUM)
2. Computer Science Department, Stanford University
Publisher:
TUM
Title:
Code Transformer: Pretrained Models and Preprocessed Data
End date of data generation:
31.03.2021
Subject area:
DAT Data processing, computer science
Data sources:
Experiments and observations
Other data sources:
Pre-trained models
Data type:
Multidimensional visualizations or models
Method of data collection:
This repository contains the preprocessed data and pretrained models from the ICLR 2021 paper "Language-Agnostic Representation Learning of Source Code from Structure and Context". The preprocessed data comes from the CodeSearchNet Challenge (https://github.com/github/CodeSearchNet/) as well as from the code2seq paper (Java-small, Java-medium, Java-large; https://github.com/tech-srl/code2seq/). We first use GitHub's semantic tool to construct abstract syntax trees (ASTs) for each code snippet. We then compute d...
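
The method text above is truncated by the record viewer; the paper describes computing pairwise structural relations between AST nodes. A minimal Python sketch of one such relation, shortest-path distance on a toy AST (the node names and tree below are illustrative, not taken from the dataset):

from collections import deque

# Toy AST as parent -> children edges. Node names are illustrative only.
ast_edges = {
    "MethodDeclaration": ["Identifier", "Block"],
    "Identifier": [],
    "Block": ["ReturnStatement"],
    "ReturnStatement": ["BinaryExpression"],
    "BinaryExpression": ["NameA", "NameB"],
    "NameA": [],
    "NameB": [],
}

def undirected(adj):
    # Treat parent-child edges as undirected for path computations.
    g = {node: set(children) for node, children in adj.items()}
    for parent, children in adj.items():
        for child in children:
            g.setdefault(child, set()).add(parent)
    return g

def hop_distances(adj, source):
    # Breadth-first search: hop count from `source` to every node.
    g = undirected(adj)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in g[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Pairwise shortest-path distances over all AST nodes.
distances = {node: hop_distances(ast_edges, node) for node in ast_edges}
print(distances["NameA"]["NameB"])  # 2: NameA -> BinaryExpression -> NameB

The paper computes several such relations per node pair; this sketch covers only the shortest-path one.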
Description:
This repository contains pretrained models as well as preprocessed data from the ICLR 2021 paper "Language-Agnostic Representation Learning of Source Code from Structure and Context". See https://www.in.tum.de/daml/code-transformer/ for additional material such as the code, paper, and poster. The preprocessed data is based on the datasets from the CodeSearchNet Challenge (https://github.com/github/CodeSearchNet/) and the code2seq paper (Java-small, Java-medium, Java-large; https://github.com/tech-srl/code2seq/).
See the original datasets for their licenses and terms of use.
Links:
Additional information: https://openreview.net/forum?id=Xh5eMZVONGF
https://github.com/danielzuegner/code-transformer
https://www.daml.in.tum.de/code-transformer/

Keywords:
machine learning; deep learning; transformer models; source code; representation learning
Technical notes:
View and download (40 GB total, 15 files)
The data server also offers downloads via FTP.
The data server also offers downloads via rsync (password: m1647000):
rsync rsync://m1647000@dataserv.ub.tum.de/m1647000/
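A sample invocation mirroring the dataset into a local directory (the destination path is illustrative; rsync prompts for the password above):
rsync -av --progress rsync://m1647000@dataserv.ub.tum.de/m1647000/ ./code-transformer-data/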
Language:
en
Rights:
CC BY 4.0, http://creativecommons.org/licenses/by/4.0