Friday, May 23, 2014

ftrace - The kernel function tracer - 2


In this post, we discuss the function and function_graph tracers of ftrace.
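All the commands below are run from inside the kernel's tracing directory. On most systems it lives under debugfs (the exact mount point may vary between distributions), so mount it first if needed:
# mount -t debugfs nodev /sys/kernel/debug
# cd /sys/kernel/debug/tracing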
To find out which tracers are available, simply cat the available_tracers file in the tracing directory:
# cat available_tracers 
blk function_graph mmiotrace wakeup_rt wakeup function nop
By default, no tracer is set:
# cat current_tracer 
nop
Let's set the current tracer to function:
# echo function > current_tracer 
# cat current_tracer 
function
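The function tracer logs every traceable kernel function, which fills the buffer very quickly. As an optional aside, on kernels built with dynamic ftrace the set_ftrace_filter file restricts tracing to matching functions; wildcards are accepted:
# echo 'vfs_*' > set_ftrace_filter
Clearing the file traces everything again:
# echo > set_ftrace_filter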
Enable tracing:
# echo 1 > tracing_on
View the trace log:
# cat trace | head -50
# tracer: function
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
        metacity-1920  [000]  5127.451220: __pollwait <-unix_poll
        metacity-1920  [000]  5127.451220: add_wait_queue <-__pollwait
        metacity-1920  [000]  5127.451220: _raw_spin_lock_irqsave <-add_wait_queue
        metacity-1920  [000]  5127.451221: _raw_spin_unlock_irqrestore <-add_wait_queue
        metacity-1920  [000]  5127.451221: fput <-do_poll.isra.5
        metacity-1920  [000]  5127.451221: poll_freewait <-do_sys_poll
        metacity-1920  [000]  5127.451221: remove_wait_queue <-poll_freewait
        metacity-1920  [000]  5127.451222: _raw_spin_lock_irqsave <-remove_wait_queue
        metacity-1920  [000]  5127.451222: _raw_spin_unlock_irqrestore <-remove_wait_queue
        metacity-1920  [000]  5127.451222: fput <-poll_freewait
        metacity-1920  [000]  5127.451223: sys_writev <-syscall_call
        metacity-1920  [000]  5127.451223: fget_light <-sys_writev
        metacity-1920  [000]  5127.451223: vfs_writev <-sys_writev
        metacity-1920  [000]  5127.451224: do_readv_writev <-vfs_writev
        metacity-1920  [000]  5127.451224: rw_copy_check_uvector <-do_readv_writev
In the above example, a sample call sequence is:
unix_poll --> __pollwait --> add_wait_queue --> _raw_spin_lock_irqsave 
                                                _raw_spin_unlock_irqrestore 
Having trouble following the calling sequence? Set function_graph as the current tracer.
# echo function_graph > current_tracer
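As with the function tracer, the graph output can optionally be narrowed down. Assuming your kernel exposes the set_graph_function file, only the call trees rooted at the listed functions are graphed:
# echo do_sys_poll > set_graph_function
Clear the file to go back to graphing everything:
# echo > set_graph_function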
The function graph output includes timing information:
# cat trace | head -50
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 2)   0.104 us    |            _raw_spin_lock_irqsave();
 2)   0.104 us    |            _raw_spin_unlock_irqrestore();
 2)               |            ep_poll_readyevents_proc() {
 2)               |              ep_scan_ready_list.isra.8() {
 2)               |                mutex_lock() {
 2)   0.065 us    |                  _cond_resched();
 2)   0.602 us    |                }
 2)   0.070 us    |                _raw_spin_lock_irqsave();
 2)   0.107 us    |                _raw_spin_unlock_irqrestore();
 2)   0.068 us    |                ep_read_events_proc();
 2)   0.070 us    |                _raw_spin_lock_irqsave();
 2)   0.106 us    |                _raw_spin_unlock_irqrestore();
 2)   0.068 us    |                mutex_unlock();
 2)   4.642 us    |              }
 2)   5.244 us    |            }
Now I'm interested in listing the kernel functions used by the ls command. Four commands are executed in a single line:
# echo > trace && echo 1 > tracing_on && ls && echo 0 > tracing_on
  ------------    -------------------    --    -------------------
    |                   |                              | 
    |                   |                              |
    |                   V                              V
    |              Enable Tracing               Disable tracing after "ls"  
    V                 
 Clear trace Log 
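As an aside, the same sequence can be wrapped into a tiny helper script so that any command can be traced the same way. This is only a minimal sketch (the name trace-it.sh is made up here, and the debugfs path is assumed):

#!/bin/sh
# trace-it.sh - run one command with function_graph tracing around it
cd /sys/kernel/debug/tracing || exit 1
echo function_graph > current_tracer
echo > trace                 # clear the old log
echo 1 > tracing_on          # start tracing
"$@"                         # run the command passed as arguments
echo 0 > tracing_on          # stop tracing
cat trace                    # dump what was captured

It can then be used as:
# ./trace-it.sh ls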
Let's look at the log from the one-liner:
# vim trace
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 1)   0.268 us    |  __fsnotify_parent();
 1)               |  fsnotify() {
 1)   0.088 us    |    __srcu_read_lock();
 1)   0.064 us    |    __srcu_read_unlock();
 1)   1.354 us    |  }
 .....
 .....
 1)               |            mem_cgroup_charge_common() {
 1)   0.066 us    |              lookup_page_cgroup();
 ------------------------------------------
 2)    <idle>-0    =>    ls-2556                          <-- Here the ls functions start
 ------------------------------------------

 2)               |  schedule_tail() {
 1)   0.101 us    |              __mem_cgroup_try_charge();
 2)   0.252 us    |    finish_task_switch();
 1)               |              __mem_cgroup_commit_charge() {
 2)   0.081 us    |    _cond_resched();
 .....
 .....
 1)   0.065 us    |            native_pte_clear();
 2)   0.109 us    |                native_load_sp0();
 1)   0.447 us    |            native_flush_tlb_single();
 2)   0.089 us    |                native_load_tls();
 ------------------------------------------
 2)    ls-2556     =>    <idle>-0                 <-- Context switch to idle task
 ------------------------------------------    
 2)   0.187 us    |      finish_task_switch();
 1)   0.359 us    |    fget_light();
 ....
 .... 
 ------------------------------------------
 2)    <idle>-0    =>    ls-2556               <-- Context switch to "ls"
 ------------------------------------------

 2)   0.188 us    |                finish_task_switch();
 1)               |    security_file_ioctl() {
 1)   0.066 us    |      cap_file_ioctl();
 ....
 ....
 0)               |        __copy_from_user_ll() {
 0)   0.059 us    |          __copy_from_user_ll.part.1();
 0)   0.399 us    |        }
 0)   1.050 us    |      }
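Since the log interleaves events from all CPUs (note the alternating 0), 1) and 2) prefixes), it is sometimes easier to read one CPU at a time. Assuming the kernel exposes the per_cpu directory, each CPU has its own view of the trace buffer:
# cat per_cpu/cpu2/trace | head -20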
It is much easier to understand an implementation by following its code flow, and the timing information is very useful for optimizing our code.
