False sharing, the unnecessary sharing of cache lines among threads of a program because their private data is located in close proximity in memory, is a performance obstacle. Depending on how frequently this data is accessed and how threads are scheduled onto processor cores, it can cause substantial overhead through the latency of the resulting cache line exchanges. Since processor hardware cannot distinguish these effects from real data exchange (true sharing), all measurement tools have to rely on heuristics for detection. In this paper, we describe an approach that uses dynamic binary instrumentation to estimate the number of unnecessary cache line exchanges caused by false sharing, and we present a tool that helps the programmer identify the data structures involved as well as the code sections triggering false sharing. To evaluate the impact of false sharing, the estimated number of occurrences is translated into a temporal overhead. Results of our tool are presented for two small example codes and a real-world application.
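To make the distinction between false and true sharing concrete, the following is a minimal sketch of a toy classifier that runs on a hand-written access trace. It is not the instrumentation-based approach of this paper. The function names, the 64-byte line size, the 100 ns exchange latency, and the classification rule (a re-fetch counts as true sharing only if another thread wrote the exact address being accessed) are all illustrative assumptions.

```python
from collections import defaultdict

LINE_SIZE = 64           # bytes per cache line (assumed)
EXCHANGE_LATENCY_NS = 100  # cost of one cache line exchange (assumed, hypothetical)


def count_sharing_misses(trace, line_size=LINE_SIZE):
    """Classify coherence misses in a trace of (thread_id, address, is_write).

    Returns (false_sharing_misses, true_sharing_misses). Cold misses,
    i.e. a thread's first access to a line, are not counted.
    """
    valid = defaultdict(set)  # line -> threads currently holding a valid copy
    seen = defaultdict(set)   # line -> threads that have ever held the line
    stale = defaultdict(set)  # (line, thread) -> addresses written by other
                              # threads since this thread lost its copy
    false_misses = 0
    true_misses = 0

    for tid, addr, is_write in trace:
        line = addr // line_size

        # The thread held the line before, but another thread's write
        # invalidated its copy: this is a coherence miss.
        if tid in seen[line] and tid not in valid[line]:
            if addr in stale[(line, tid)]:
                true_misses += 1   # the accessed data really changed
            else:
                false_misses += 1  # only neighbouring data changed

        # The thread now holds an up-to-date copy of the line.
        stale[(line, tid)].clear()
        seen[line].add(tid)
        valid[line].add(tid)

        if is_write:
            # A write invalidates every other thread's copy.
            for other in seen[line]:
                if other != tid:
                    stale[(line, other)].add(addr)
            valid[line] = {tid}

    return false_misses, true_misses


def estimated_overhead_seconds(miss_count, latency_ns=EXCHANGE_LATENCY_NS):
    """Translate a miss count into time, assuming a fixed cost per exchange."""
    return miss_count * latency_ns * 1e-9


# Two threads update private counters 8 bytes apart (same cache line).
adjacent = [(0, 0, True), (1, 8, True), (0, 0, True), (1, 8, True)]
# The same counters placed on separate cache lines.
padded = [(0, 0, True), (1, 64, True), (0, 0, True), (1, 64, True)]
# One thread writes a value that the other thread reads.
shared = [(0, 0, True), (1, 0, False), (0, 0, True), (1, 0, False)]
```

In this model, the `adjacent` trace yields two false-sharing misses, the `padded` trace yields none, and the `shared` trace yields one true-sharing miss. Multiplying a count by a fixed latency is the simplest possible translation into temporal overhead; real exchange costs depend on the machine and on which cores the threads run.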