In this post, we discuss about function and function_graph scenario of ftrace usage.
To find out which tracers are available, simply cat the available_tracers file in the tracing directory:
# cat available_tracers
blk function_graph mmiotrace wakeup_rt wakeup function nop
By default, no tracer will be set.
# cat current_tracer
nop
Let set current tracer as function
# echo function > current_tracer
# cat current_tracer
function
Enable the tracing
# echo 1 > tracing_on
Viewing trace log
# cat trace | head -50
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
metacity-1920 [000] 5127.451220: __pollwait <-unix_poll
metacity-1920 [000] 5127.451220: add_wait_queue <-__pollwait
metacity-1920 [000] 5127.451220: _raw_spin_lock_irqsave <-add_wait_queue
metacity-1920 [000] 5127.451221: _raw_spin_unlock_irqrestore <-add_wait_queue
metacity-1920 [000] 5127.451221: fput <-do_poll.isra.5
metacity-1920 [000] 5127.451221: poll_freewait <-do_sys_poll
metacity-1920 [000] 5127.451221: remove_wait_queue <-poll_freewait
metacity-1920 [000] 5127.451222: _raw_spin_lock_irqsave <-remove_wait_queue
metacity-1920 [000] 5127.451222: _raw_spin_unlock_irqrestore <-remove_wait_queue
metacity-1920 [000] 5127.451222: fput <-poll_freewait
metacity-1920 [000] 5127.451223: sys_writev <-syscall_call
metacity-1920 [000] 5127.451223: fget_light <-sys_writev
metacity-1920 [000] 5127.451223: vfs_writev <-sys_writev
metacity-1920 [000] 5127.451224: do_readv_writev <-vfs_writev
metacity-1920 [000] 5127.451224: rw_copy_check_uvector <-do_readv_writev
In the above example, sample call sequence.
unix_poll --> __pollwait --> add_wait_queue --> _raw_spin_lock_irqsave
_raw_spin_unlock_irqrestore
Do you have trouble in finding calling sequence??? set function_graph as current tracer.
# echo function_graph > current_tracer
Function graph with timing information.
# cat trace | head -50
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) 0.104 us | _raw_spin_lock_irqsave();
2) 0.104 us | _raw_spin_unlock_irqrestore();
2) | ep_poll_readyevents_proc() {
2) | ep_scan_ready_list.isra.8() {
2) | mutex_lock() {
2) 0.065 us | _cond_resched();
2) 0.602 us | }
2) 0.070 us | _raw_spin_lock_irqsave();
2) 0.107 us | _raw_spin_unlock_irqrestore();
2) 0.068 us | ep_read_events_proc();
2) 0.070 us | _raw_spin_lock_irqsave();
2) 0.106 us | _raw_spin_unlock_irqrestore();
2) 0.068 us | mutex_unlock();
2) 4.642 us | }
2) 5.244 us | }
Now I'm very much interested in listing kernel functions used by ls command. There are four commands executed in single line.
# echo > trace && echo 1 > tracing_on && ls && echo 0 > tracing_on
------------ ------------------- -- -------------------
| | |
| | |
| V V
| Enable Tracing Disable tracing after "ls"
V
Clear trace Log
Lets see the log.
# vim trace
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
1) 0.268 us | __fsnotify_parent();
1) | fsnotify() {
1) 0.088 us | __srcu_read_lock();
1) 0.064 us | __srcu_read_unlock();
1) 1.354 us | }
.....
.....
1) | mem_cgroup_charge_common() {
1) 0.066 us | lookup_page_cgroup();
------------------------------------------
2) <idle>-0 => ls-2556 <-- Here starts ls functions
------------------------------------------
2) | schedule_tail() {
1) 0.101 us | __mem_cgroup_try_charge();
2) 0.252 us | finish_task_switch();
1) | __mem_cgroup_commit_charge() {
2) 0.081 us | _cond_resched();
.....
.....
1) 0.065 us | native_pte_clear();
2) 0.109 us | native_load_sp0();
1) 0.447 us | native_flush_tlb_single();
2) 0.089 us | native_load_tls();
------------------------------------------
2) ls-2556 => <idle>-0 <-- Context switch to idle task
------------------------------------------
2) 0.187 us | finish_task_switch();
1) 0.359 us | fget_light();
....
....
------------------------------------------
2) <idle>-0 => ls-2556 <-- Context switch to "ls"
------------------------------------------
2) 0.188 us | finish_task_switch();
1) | security_file_ioctl() {
1) 0.066 us | cap_file_ioctl();
....
....
0) | __copy_from_user_ll() {
0) 0.059 us | __copy_from_user_ll.part.1();
0) 0.399 us | }
0) 1.050 us | }
It is easier to understand the implementation using it's code flow. The timing information is very useful for optimizing our code.