Title:
Dataset of "Network Traffic Characteristics of Machine Learning Frameworks Under the Microscope"

Document type:
Research data
Publication date:
27.10.2021
Responsible:
Zerwas, Johannes
Authors:
Zerwas, Johannes; Aykurt, Kaan; Schmid, Stefan; Blenk, Andreas
Author affiliation:
TUM
Publisher:
TUM
Identifier:
doi:10.14459/2021mp1632489
End date of data production:
15.06.2021
Subject area:
DAT Data processing, computer science
Resource type:
Experiments and observations
Data type:
Texts; databases
Other data type:
Network traffic traces
Description:
Network traffic collection (PCAP) of three widely used, state-of-the-art Distributed Machine Learning (DML) frameworks (TensorFlow, Horovod, KungFu). The collection contains distributed training runs of four models (MobileNetV2, ResNet50, ResNet101, DenseNet201) with varying framework configurations. The varied parameters are the communication topology and backend, the distributed optimizer, the batch size, and the packet loss in the network.
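For a first look at a trace, the following minimal sketch counts packets, bytes, and capture duration using only the Python standard library. It assumes the traces are uncompressed files in the classic libpcap format (not pcapng), which this record does not confirm; the function names and the file name in the usage note are hypothetical.

```python
import struct
from typing import BinaryIO, Dict, Iterator, Tuple

# Magic numbers of the classic libpcap file format.
PCAP_MAGIC_USEC = 0xA1B2C3D4  # timestamps with microsecond resolution
PCAP_MAGIC_NSEC = 0xA1B23C4D  # timestamps with nanosecond resolution


def read_pcap_records(stream: BinaryIO) -> Iterator[Tuple[float, int, int]]:
    """Yield (timestamp_s, captured_len, original_len) for each packet."""
    header = stream.read(24)  # the global header is 24 bytes long
    if len(header) < 24:
        raise ValueError("file too short for a pcap global header")

    # The magic number reveals the byte order the file was written in.
    for endian in ("<", ">"):
        magic = struct.unpack(endian + "I", header[:4])[0]
        if magic in (PCAP_MAGIC_USEC, PCAP_MAGIC_NSEC):
            break
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")

    divisor = 1e6 if magic == PCAP_MAGIC_USEC else 1e9
    record = struct.Struct(endian + "IIII")  # ts_sec, ts_frac, incl_len, orig_len

    while True:
        raw = stream.read(record.size)
        if len(raw) < record.size:
            return  # end of file (or a truncated trailing record)
        ts_sec, ts_frac, incl_len, orig_len = record.unpack(raw)
        stream.read(incl_len)  # skip the captured packet bytes
        yield ts_sec + ts_frac / divisor, incl_len, orig_len


def summarize_pcap(stream: BinaryIO) -> Dict[str, float]:
    """Return packet count, total on-wire bytes, and capture duration."""
    packets = 0
    total_bytes = 0
    first_ts = last_ts = 0.0
    for ts, _captured, orig_len in read_pcap_records(stream):
        if packets == 0:
            first_ts = ts
        last_ts = ts
        packets += 1
        total_bytes += orig_len
    return {
        "packets": packets,
        "bytes": total_bytes,
        "duration_s": last_ts - first_ts,
    }
```

A call such as `summarize_pcap(open("trace.pcap", "rb"))` (hypothetical file name) would then report the totals for one trace. Reading only the record headers keeps memory use flat, which matters for a collection of this size.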
Method of data assessment:
The traffic was collected in a four-worker testbed setup. The workers were interconnected by a 10G Ethernet network via a single packet switch. Each worker was equipped with an Nvidia Tesla T4 GPU. Traffic traces were captured directly on the worker nodes. The models were trained for 20 epochs on the CIFAR-10 image dataset.
Links:
This dataset relates to the publication: https://doi.org/10.23919/CNSM52442.2021.9615524
Key words:
Distributed Machine Learning; Network Traffic Measurement
Technical remarks:
View and download (151 GB total, 90 files)
The data server also offers downloads with FTP
The data server also offers downloads with rsync (password m1632489):
rsync rsync://m1632489@dataserv.ub.tum.de/m1632489/
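The rsync command above only lists the module contents, since it names no destination. A minimal sketch of a full mirror, wrapped in Python, might look as follows. It assumes a local `rsync` binary is installed; the helper names and the destination directory are hypothetical, while the URL and password are the ones given in this record. The password is passed through the `RSYNC_PASSWORD` environment variable, which rsync reads when authenticating against an rsync daemon.

```python
import os
import subprocess
from typing import List

# URL and password as stated in this record.
RSYNC_URL = "rsync://m1632489@dataserv.ub.tum.de/m1632489/"
RSYNC_PASSWORD = "m1632489"


def build_rsync_command(dest_dir: str, dry_run: bool = False) -> List[str]:
    """Build the rsync argument list for mirroring the dataset."""
    # -a: archive mode, -v: verbose, --partial: keep partially transferred
    # files so an interrupted download of a large trace can be resumed.
    cmd = ["rsync", "-av", "--partial", "--progress"]
    if dry_run:
        cmd.append("--dry-run")  # list what would be transferred, copy nothing
    cmd += [RSYNC_URL, dest_dir]
    return cmd


def download(dest_dir: str, dry_run: bool = False) -> None:
    """Run rsync, supplying the password via the environment."""
    env = dict(os.environ, RSYNC_PASSWORD=RSYNC_PASSWORD)
    subprocess.run(build_rsync_command(dest_dir, dry_run), env=env, check=True)
```

Calling `download("dml-traces/", dry_run=True)` (hypothetical directory name) first is a cheap way to check the file list before committing to the 151 GB transfer.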
Language:
en
Rights:
by-sa, http://creativecommons.org/licenses/by-sa/4.0