The variability of traffic as observed in the Internet and imposed on the distributed infrastructure is determined by many factors that are not well understood. To improve our understanding, we introduce an interdomain traffic model that can capture changes to content, user behavior, routing, and intra- and interdomain traffic. We propose to distinguish two levels of abstraction: (1) publisher demand: the volume of load originating from a Web publisher and destined for a client set; and (2) Web traffic demand: the volume of load originating from a specific Web publisher's server (i.e., an IP address) and destined for a client set. Web traffic demands explicitly consider the infrastructure of the publisher and are therefore appropriate for studying interdomain routing questions. Publisher demands are useful for studying user and/or publisher behavior. This paper introduces for the first time a methodology for deriving a significant subset of such demands based on four key observations. (1) We observe that a sizable fraction of the bytes delivered to clients originate at publishers that utilize content distribution networks (CDNs). (2) We show that it is possible to obtain by extrapolation the overall traffic served by all publishers that utilize CDNs to all client sets. In particular, to estimate the traffic between a publisher and all clients, we combine the logs from the CDN with an estimate of the fraction of the publisher demand that is served by the CDN. (3) While it cannot be presumed that each client set will access the same content, it is fair to presume that the fraction of demand served by the CDN on behalf of a publisher does not change dramatically from client set to client set. Therefore it is sufficient to estimate the fraction by "just" observing a few large and diverse client sets. (4) It is possible to map publishers to IP addresses with the help of the DNS and information available from the interdomain routing system.
Using logs from Akamai, a major CDN, and two different client sets, we discuss our experiences in deriving the interdomain traffic demands, and present a preliminary analysis of the observed dynamics of the demands.
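The extrapolation in observations (2) and (3) can be illustrated with a minimal sketch. It assumes that the per-publisher CDN fraction is estimated by pooling the bytes seen at a few observed client sets, and that the CDN log volume is then divided by that fraction. The function names, the record layout, and the sample publisher and client-set labels are hypothetical and chosen only for illustration; the abstract does not specify these details.

```python
from collections import defaultdict

def estimate_cdn_fraction(client_observations):
    """Estimate, per publisher, the fraction of bytes served by the CDN.

    client_observations: iterable of (publisher, cdn_bytes, total_bytes)
    tuples, one per observed client set (hypothetical record layout).
    Bytes are pooled across client sets before taking the ratio.
    """
    cdn = defaultdict(int)
    total = defaultdict(int)
    for publisher, cdn_bytes, total_bytes in client_observations:
        cdn[publisher] += cdn_bytes
        total[publisher] += total_bytes
    # Skip publishers with no observed CDN bytes: no extrapolation is possible.
    return {p: cdn[p] / total[p] for p in total if total[p] > 0 and cdn[p] > 0}

def extrapolate_publisher_demand(cdn_log_bytes, fractions):
    """Scale CDN-served bytes up to an estimated publisher demand.

    cdn_log_bytes: {(publisher, client_set): bytes served by the CDN}
    fractions: output of estimate_cdn_fraction
    """
    demand = {}
    for (publisher, client_set), served in cdn_log_bytes.items():
        fraction = fractions.get(publisher)
        if fraction is None:
            continue  # publisher not seen at any observed client set
        demand[(publisher, client_set)] = served / fraction
    return demand

# Illustrative numbers only, not measurements.
observations = [
    ("news.example", 200, 1000),  # observed client set 1
    ("news.example", 300, 1000),  # observed client set 2
]
cdn_logs = {
    ("news.example", "AS-A"): 1000,
    ("news.example", "AS-B"): 50,
}
fractions = estimate_cdn_fraction(observations)
demands = extrapolate_publisher_demand(cdn_logs, fractions)
```

With these illustrative inputs the pooled CDN fraction is 0.25, so 1000 bytes in the CDN logs for one client set extrapolate to an estimated publisher demand of 4000 bytes. The sketch relies on observation (3): it applies a single fraction per publisher to every client set.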