1 Star 1 Fork 0

jinsheng.zhao / perf-tools

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
perf-stat-hist_example.txt 7.24 KB
一键复制 编辑 原始数据 按行查看 历史
Brendan Gregg 提交于 2014-07-31 15:46 . include USAGE messages in examples
Demonstrations of perf-stat-hist, the Linux perf_events version.
Tracing the net:net_dev_xmit tracepoint, and building a power-of-4 histogram
for the "len" variable, for 10 seconds:
# ./perf-stat-hist net:net_dev_xmit len 10
Tracing net:net_dev_xmit, power-of-4, max 1048576, for 10 seconds...
Range : Count Distribution
0 : 0 | |
1 -> 3 : 0 | |
4 -> 15 : 0 | |
16 -> 63 : 2 |# |
64 -> 255 : 30 |### |
256 -> 1023 : 3 |# |
1024 -> 4095 : 446 |######################################|
4096 -> 16383 : 0 | |
16384 -> 65535 : 0 | |
65536 -> 262143 : 0 | |
262144 -> 1048575 : 0 | |
1048576 -> : 0 | |
This showed that most of the network transmits were between 1024 and 4095 bytes,
with a handful between 64 and 255 bytes.
Cat the format file for the tracepoint to see what other variables are available
to trace. Eg:
# cat /sys/kernel/debug/tracing/events/net/net_dev_xmit/format
name: net_dev_xmit
ID: 1078
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:void * skbaddr; offset:8; size:8; signed:0;
field:unsigned int len; offset:16; size:4; signed:0;
field:int rc; offset:20; size:4; signed:1;
field:__data_loc char[] name; offset:24; size:4; signed:1;
print fmt: "dev=%s skbaddr=%p len=%u rc=%d", __get_str(name), REC->skbaddr, REC->len, REC->rc
That's where "len" came from.
This works by creating a series of tracepoint and filter pairs for each
histogram bucket, and doing in-kernel counts. The overhead should in many cases
be better than user space post-processing, however, this approach is still
not ideal. I've called it a "perf hacktogram". The overhead is relative to
the frequency of events, multiplied by the number of buckets. You can modify
the script to use power-of-2 instead, or whatever you like, but the overhead
for more buckets will be higher.
Histogram of the returned read() syscall sizes:
# ./perf-stat-hist syscalls:sys_exit_read ret 10
Tracing syscalls:sys_exit_read, power-of-4, max 1048576, for 10 seconds...
Range : Count Distribution
0 : 90 |# |
1 -> 3 : 9587 |######################################|
4 -> 15 : 69 |# |
16 -> 63 : 590 |### |
64 -> 255 : 250 |# |
256 -> 1023 : 389 |## |
1024 -> 4095 : 296 |## |
4096 -> 16383 : 183 |# |
16384 -> 65535 : 12 |# |
65536 -> 262143 : 0 | |
262144 -> 1048575 : 0 | |
1048576 -> : 0 | |
Most of our read()s were tiny, between 1 and 3 bytes.
Using power-of-2, and a max of 1024:
# ./perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret
Tracing syscalls:sys_exit_read, power-of-2, max 1024, until Ctrl-C...
^C
Range : Count Distribution
-> -1 : 29 |## |
0 -> 0 : 1 |# |
1 -> 1 : 959 |######################################|
2 -> 3 : 1 |# |
4 -> 7 : 0 | |
8 -> 15 : 2 |# |
16 -> 31 : 14 |# |
32 -> 63 : 1 |# |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 1 |# |
1024 -> : 1 |# |
Specifying custom bucket sizes:
# ./perf-stat-hist -b "10 50 100 5000" syscalls:sys_exit_read ret
Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
^C
Range : Count Distribution
-> 9 : 989 |######################################|
10 -> 49 : 5 |# |
50 -> 99 : 0 | |
100 -> 4999 : 2 |# |
5000 -> : 0 | |
Specifying a single value to bifurcate statistics:
# ./perf-stat-hist -b 10 syscalls:sys_exit_read ret
Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
^C
Range : Count Distribution
-> 9 : 2959 |######################################|
10 -> : 7 |# |
This has the lowest overhead for collection, since only two tracepoint
filter pairs are used.
Use -h to print the USAGE message:
# ./perf-stat-hist -h
USAGE: perf-stat-hist [-h] [-b buckets|-P power] [-m max] tracepoint
variable [seconds]
-b buckets # specify histogram bucket points
-P power # power-of (default is 4)
-m max # max value for power-of
-h # this usage message
eg,
perf-stat-hist syscalls:sys_enter_read count 5
# read() request histogram, 5 seconds
perf-stat-hist syscalls:sys_exit_read ret 5
# read() return histogram, 5 seconds
perf-stat-hist -P 10 syscalls:sys_exit_read ret 5
# ... use power-of-10
perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret 5
# ... use power-of-2, max 1024
perf-stat-hist -b "10 50 100 500" syscalls:sys_exit_read ret 5
# ... histogram based on these bucket ranges
perf-stat-hist -b 10 syscalls:sys_exit_read ret 5
# ... bifurcate by the value 10 (lowest overhead)
See the man page and example file for more info.
1
https://gitee.com/linuxzjs/perf-tools.git
git@gitee.com:linuxzjs/perf-tools.git
linuxzjs
perf-tools
perf-tools
master

搜索帮助