mirror of
https://git.yoctoproject.org/poky
synced 2026-01-29 21:08:42 +01:00
profile-manual: usage.rst: further style improvements
According to errors reported by "make stylecheck" (From yocto-docs rev: fc808b2dca4321209bfa6206d9e43aa1a5a0c5b8) Signed-off-by: Michael Opdenacker <michael.opdenacker@bootlin.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This commit is contained in:
committed by
Richard Purdie
parent
f6584a38ae
commit
d255faa35e
@@ -10,7 +10,7 @@ Basic Usage (with examples) for each of the Yocto Tracing Tools
|
|||||||
This chapter presents basic usage examples for each of the tracing
|
This chapter presents basic usage examples for each of the tracing
|
||||||
tools.
|
tools.
|
||||||
|
|
||||||
Perf
|
perf
|
||||||
====
|
====
|
||||||
|
|
||||||
The perf tool is the profiling and tracing tool that comes bundled
|
The perf tool is the profiling and tracing tool that comes bundled
|
||||||
@@ -26,12 +26,12 @@ of what's going on.
|
|||||||
|
|
||||||
In many ways, perf aims to be a superset of all the tracing and
|
In many ways, perf aims to be a superset of all the tracing and
|
||||||
profiling tools available in Linux today, including all the other tools
|
profiling tools available in Linux today, including all the other tools
|
||||||
covered in this HOWTO. The past couple of years have seen perf subsume a
|
covered in this How-to. The past couple of years have seen perf subsume a
|
||||||
lot of the functionality of those other tools and, at the same time,
|
lot of the functionality of those other tools and, at the same time,
|
||||||
those other tools have removed large portions of their previous
|
those other tools have removed large portions of their previous
|
||||||
functionality and replaced it with calls to the equivalent functionality
|
functionality and replaced it with calls to the equivalent functionality
|
||||||
now implemented by the perf subsystem. Extrapolation suggests that at
|
now implemented by the perf subsystem. Extrapolation suggests that at
|
||||||
some point those other tools will simply become completely redundant and
|
some point those other tools will become completely redundant and
|
||||||
go away; until then, we'll cover those other tools in these pages and in
|
go away; until then, we'll cover those other tools in these pages and in
|
||||||
many cases show how the same things can be accomplished in perf and the
|
many cases show how the same things can be accomplished in perf and the
|
||||||
other tools when it seems useful to do so.
|
other tools when it seems useful to do so.
|
||||||
@@ -41,7 +41,7 @@ want to apply the tool; full documentation can be found either within
|
|||||||
the tool itself or in the manual pages at
|
the tool itself or in the manual pages at
|
||||||
`perf(1) <https://linux.die.net/man/1/perf>`__.
|
`perf(1) <https://linux.die.net/man/1/perf>`__.
|
||||||
|
|
||||||
Perf Setup
|
perf Setup
|
||||||
----------
|
----------
|
||||||
|
|
||||||
For this section, we'll assume you've already performed the basic setup
|
For this section, we'll assume you've already performed the basic setup
|
||||||
@@ -54,14 +54,14 @@ image built with the following in your ``local.conf`` file::
|
|||||||
|
|
||||||
perf runs on the target system for the most part. You can archive
|
perf runs on the target system for the most part. You can archive
|
||||||
profile data and copy it to the host for analysis, but for the rest of
|
profile data and copy it to the host for analysis, but for the rest of
|
||||||
this document we assume you've ssh'ed to the host and will be running
|
this document we assume you're connected to the host through SSH and will be
|
||||||
the perf commands on the target.
|
running the perf commands on the target.
|
||||||
|
|
||||||
Basic Perf Usage
|
Basic perf Usage
|
||||||
----------------
|
----------------
|
||||||
|
|
||||||
The perf tool is pretty much self-documenting. To remind yourself of the
|
The perf tool is pretty much self-documenting. To remind yourself of the
|
||||||
available commands, simply type ``perf``, which will show you basic usage
|
available commands, just type ``perf``, which will show you basic usage
|
||||||
along with the available perf subcommands::
|
along with the available perf subcommands::
|
||||||
|
|
||||||
root@crownbay:~# perf
|
root@crownbay:~# perf
|
||||||
@@ -101,7 +101,7 @@ As a simple test case, we'll profile the ``wget`` of a fairly large file,
|
|||||||
which is a minimally interesting case because it has both file and
|
which is a minimally interesting case because it has both file and
|
||||||
network I/O aspects, and at least in the case of standard Yocto images,
|
network I/O aspects, and at least in the case of standard Yocto images,
|
||||||
it's implemented as part of BusyBox, so the methods we use to analyze it
|
it's implemented as part of BusyBox, so the methods we use to analyze it
|
||||||
can be used in a very similar way to the whole host of supported BusyBox
|
can be used in a similar way to the whole host of supported BusyBox
|
||||||
applets in Yocto::
|
applets in Yocto::
|
||||||
|
|
||||||
root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \
|
root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \
|
||||||
@@ -164,17 +164,17 @@ hits and misses::
|
|||||||
|
|
||||||
44.831023415 seconds time elapsed
|
44.831023415 seconds time elapsed
|
||||||
|
|
||||||
So ``perf stat`` gives us a nice easy
|
As you can see, ``perf stat`` gives us a nice easy
|
||||||
way to get a quick overview of what might be happening for a set of
|
way to get a quick overview of what might be happening for a set of
|
||||||
events, but normally we'd need a little more detail in order to
|
events, but normally we'd need a little more detail in order to
|
||||||
understand what's going on in a way that we can act on in a useful way.
|
understand what's going on in a way that we can act on in a useful way.
|
||||||
|
|
||||||
To dive down into a next level of detail, we can use ``perf record`` /
|
To dive down into a next level of detail, we can use ``perf record`` /
|
||||||
``perf report`` which will collect profiling data and present it to use using an
|
``perf report`` which will collect profiling data and present it to use using an
|
||||||
interactive text-based UI (or simply as text if we specify ``--stdio`` to
|
interactive text-based UI (or just as text if we specify ``--stdio`` to
|
||||||
``perf report``).
|
``perf report``).
|
||||||
|
|
||||||
As our first attempt at profiling this workload, we'll simply run ``perf
|
As our first attempt at profiling this workload, we'll just run ``perf
|
||||||
record``, handing it the workload we want to profile (everything after
|
record``, handing it the workload we want to profile (everything after
|
||||||
``perf record`` and any perf options we hand it --- here none, will be
|
``perf record`` and any perf options we hand it --- here none, will be
|
||||||
executed in a new shell). perf collects samples until the process exits
|
executed in a new shell). perf collects samples until the process exits
|
||||||
@@ -189,7 +189,7 @@ directory::
|
|||||||
[ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ]
|
[ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ]
|
||||||
|
|
||||||
To see the results in a
|
To see the results in a
|
||||||
"text-based UI" (tui), simply run ``perf report``, which will read the
|
"text-based UI" (tui), just run ``perf report``, which will read the
|
||||||
perf.data file in the current working directory and display the results
|
perf.data file in the current working directory and display the results
|
||||||
in an interactive UI::
|
in an interactive UI::
|
||||||
|
|
||||||
@@ -204,10 +204,10 @@ The above screenshot displays a "flat" profile, one entry for each
|
|||||||
profiling run, ordered from the most popular to the least (perf has
|
profiling run, ordered from the most popular to the least (perf has
|
||||||
options to sort in various orders and keys as well as display entries
|
options to sort in various orders and keys as well as display entries
|
||||||
only above a certain threshold and so on --- see the perf documentation
|
only above a certain threshold and so on --- see the perf documentation
|
||||||
for details). Note that this includes both userspace functions (entries
|
for details). Note that this includes both user space functions (entries
|
||||||
containing a ``[.]``) and kernel functions accounted to the process (entries
|
containing a ``[.]``) and kernel functions accounted to the process (entries
|
||||||
containing a ``[k]``). perf has command-line modifiers that can be used to
|
containing a ``[k]``). perf has command-line modifiers that can be used to
|
||||||
restrict the profiling to kernel or userspace, among others.
|
restrict the profiling to kernel or user space, among others.
|
||||||
|
|
||||||
Notice also that the above report shows an entry for ``busybox``, which is
|
Notice also that the above report shows an entry for ``busybox``, which is
|
||||||
the executable that implements ``wget`` in Yocto, but that instead of a
|
the executable that implements ``wget`` in Yocto, but that instead of a
|
||||||
@@ -218,7 +218,7 @@ Before we do that, however, let's try running a different profile, one
|
|||||||
which shows something a little more interesting. The only difference
|
which shows something a little more interesting. The only difference
|
||||||
between the new profile and the previous one is that we'll add the ``-g``
|
between the new profile and the previous one is that we'll add the ``-g``
|
||||||
option, which will record not just the address of a sampled function,
|
option, which will record not just the address of a sampled function,
|
||||||
but the entire callchain to the sampled function as well::
|
but the entire call chain to the sampled function as well::
|
||||||
|
|
||||||
root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
|
root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
|
||||||
Connecting to downloads.yoctoproject.org (140.211.169.59:80)
|
Connecting to downloads.yoctoproject.org (140.211.169.59:80)
|
||||||
@@ -233,26 +233,26 @@ but the entire callchain to the sampled function as well::
|
|||||||
:align: center
|
:align: center
|
||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
Using the callgraph view, we can actually see not only which functions
|
Using the call graph view, we can actually see not only which functions
|
||||||
took the most time, but we can also see a summary of how those functions
|
took the most time, but we can also see a summary of how those functions
|
||||||
were called and learn something about how the program interacts with the
|
were called and learn something about how the program interacts with the
|
||||||
kernel in the process.
|
kernel in the process.
|
||||||
|
|
||||||
Notice that each entry in the above screenshot now contains a ``+`` on the
|
Notice that each entry in the above screenshot now contains a ``+`` on the
|
||||||
left-hand side. This means that we can expand the entry and drill down
|
left side. This means that we can expand the entry and drill down
|
||||||
into the callchains that feed into that entry. Pressing ``Enter`` on any
|
into the call chains that feed into that entry. Pressing ``Enter`` on any
|
||||||
one of them will expand the callchain (you can also press ``E`` to expand
|
one of them will expand the call chain (you can also press ``E`` to expand
|
||||||
them all at the same time or ``C`` to collapse them all).
|
them all at the same time or ``C`` to collapse them all).
|
||||||
|
|
||||||
In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry
|
In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry
|
||||||
and several subnodes all the way down. This lets us see which callchains
|
and several subnodes all the way down. This lets us see which call chains
|
||||||
contributed to the profiled ``__copy_to_user_ll()`` function which
|
contributed to the profiled ``__copy_to_user_ll()`` function which
|
||||||
contributed 1.77% to the total profile.
|
contributed 1.77% to the total profile.
|
||||||
|
|
||||||
As a bit of background explanation for these callchains, think about
|
As a bit of background explanation for these call chains, think about
|
||||||
what happens at a high level when you run wget to get a file out on the
|
what happens at a high level when you run ``wget`` to get a file out on the
|
||||||
network. Basically what happens is that the data comes into the kernel
|
network. Basically what happens is that the data comes into the kernel
|
||||||
via the network connection (socket) and is passed to the userspace
|
via the network connection (socket) and is passed to the user space
|
||||||
program ``wget`` (which is actually a part of BusyBox, but that's not
|
program ``wget`` (which is actually a part of BusyBox, but that's not
|
||||||
important for now), which takes the buffers the kernel passes to it and
|
important for now), which takes the buffers the kernel passes to it and
|
||||||
writes it to a disk file to save it.
|
writes it to a disk file to save it.
|
||||||
@@ -262,16 +262,16 @@ is the part where the kernel passes the data it has read from the socket
|
|||||||
down to wget i.e. a ``copy-to-user``.
|
down to wget i.e. a ``copy-to-user``.
|
||||||
|
|
||||||
Notice also that here there's also a case where the hex value is
|
Notice also that here there's also a case where the hex value is
|
||||||
displayed in the callstack, here in the expanded ``sys_clock_gettime()``
|
displayed in the call stack, here in the expanded ``sys_clock_gettime()``
|
||||||
function. Later we'll see it resolve to a userspace function call in
|
function. Later we'll see it resolve to a user space function call in
|
||||||
busybox.
|
BusyBox.
|
||||||
|
|
||||||
.. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png
|
.. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png
|
||||||
:align: center
|
:align: center
|
||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
The above screenshot shows the other half of the journey for the data ---
|
The above screenshot shows the other half of the journey for the data ---
|
||||||
from the ``wget`` program's userspace buffers to disk. To get the buffers to
|
from the ``wget`` program's user space buffers to disk. To get the buffers to
|
||||||
disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to
|
disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to
|
||||||
the kernel, which then takes care via some circuitous path (probably
|
the kernel, which then takes care via some circuitous path (probably
|
||||||
also present somewhere in the profile data), to get it safely to disk.
|
also present somewhere in the profile data), to get it safely to disk.
|
||||||
@@ -281,8 +281,8 @@ of how to extract useful information out of it, let's get back to the
|
|||||||
task at hand and see if we can get some basic idea about where the time
|
task at hand and see if we can get some basic idea about where the time
|
||||||
is spent in the program we're profiling, wget. Remember that wget is
|
is spent in the program we're profiling, wget. Remember that wget is
|
||||||
actually implemented as an applet in BusyBox, so while the process name
|
actually implemented as an applet in BusyBox, so while the process name
|
||||||
is ``wget``, the executable we're actually interested in is BusyBox. So
|
is ``wget``, the executable we're actually interested in is ``busybox``.
|
||||||
let's expand the first entry containing BusyBox:
|
Therefore, let's expand the first entry containing BusyBox:
|
||||||
|
|
||||||
.. image:: figures/perf-wget-busybox-expanded-stripped.png
|
.. image:: figures/perf-wget-busybox-expanded-stripped.png
|
||||||
:align: center
|
:align: center
|
||||||
@@ -293,7 +293,7 @@ hex value instead of a symbol as with most of the kernel entries.
|
|||||||
Expanding the BusyBox entry doesn't make it any better.
|
Expanding the BusyBox entry doesn't make it any better.
|
||||||
|
|
||||||
The problem is that perf can't find the symbol information for the
|
The problem is that perf can't find the symbol information for the
|
||||||
busybox binary, which is actually stripped out by the Yocto build
|
``busybox`` binary, which is actually stripped out by the Yocto build
|
||||||
system.
|
system.
|
||||||
|
|
||||||
One way around that is to put the following in your ``local.conf`` file
|
One way around that is to put the following in your ``local.conf`` file
|
||||||
@@ -303,20 +303,20 @@ when you build the image::
|
|||||||
|
|
||||||
However, we already have an image with the binaries stripped, so
|
However, we already have an image with the binaries stripped, so
|
||||||
what can we do to get perf to resolve the symbols? Basically we need to
|
what can we do to get perf to resolve the symbols? Basically we need to
|
||||||
install the debuginfo for the BusyBox package.
|
install the debugging information for the BusyBox package.
|
||||||
|
|
||||||
To generate the debug info for the packages in the image, we can add
|
To generate the debug info for the packages in the image, we can add
|
||||||
``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example::
|
``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example::
|
||||||
|
|
||||||
EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs"
|
EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs"
|
||||||
|
|
||||||
Additionally, in order to generate the type of debuginfo that perf
|
Additionally, in order to generate the type of debugging information that perf
|
||||||
understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE`
|
understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE`
|
||||||
in the ``local.conf`` file::
|
in the ``local.conf`` file::
|
||||||
|
|
||||||
PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory'
|
PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory'
|
||||||
|
|
||||||
Once we've done that, we can install the debuginfo for BusyBox. The
|
Once we've done that, we can install the debugging information for BusyBox. The
|
||||||
debug packages once built can be found in ``build/tmp/deploy/rpm/*``
|
debug packages once built can be found in ``build/tmp/deploy/rpm/*``
|
||||||
on the host system. Find the ``busybox-dbg-...rpm`` file and copy it
|
on the host system. Find the ``busybox-dbg-...rpm`` file and copy it
|
||||||
to the target. For example::
|
to the target. For example::
|
||||||
@@ -324,11 +324,11 @@ to the target. For example::
|
|||||||
[trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31:
|
[trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31:
|
||||||
busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01
|
busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01
|
||||||
|
|
||||||
Now install the debug rpm on the target::
|
Now install the debug RPM on the target::
|
||||||
|
|
||||||
root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm
|
root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm
|
||||||
|
|
||||||
Now that the debuginfo is installed, we see that the BusyBox entries now display
|
Now that the debugging information is installed, we see that the BusyBox entries now display
|
||||||
their functions symbolically:
|
their functions symbolically:
|
||||||
|
|
||||||
.. image:: figures/perf-wget-busybox-debuginfo.png
|
.. image:: figures/perf-wget-busybox-debuginfo.png
|
||||||
@@ -351,7 +351,7 @@ expanded all the nodes using the ``E`` key):
|
|||||||
:align: center
|
:align: center
|
||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
Finally, we can see that now that the BusyBox debuginfo is installed,
|
Finally, we can see that now that the BusyBox debugging information is installed,
|
||||||
the previously unresolved symbol in the ``sys_clock_gettime()`` entry
|
the previously unresolved symbol in the ``sys_clock_gettime()`` entry
|
||||||
mentioned previously is now resolved, and shows that the
|
mentioned previously is now resolved, and shows that the
|
||||||
``sys_clock_gettime`` system call that was the source of 6.75% of the
|
``sys_clock_gettime`` system call that was the source of 6.75% of the
|
||||||
@@ -386,8 +386,8 @@ counter, something other than the default ``cycles``.
|
|||||||
The tracing and profiling infrastructure in Linux has become unified in
|
The tracing and profiling infrastructure in Linux has become unified in
|
||||||
a way that allows us to use the same tool with a completely different
|
a way that allows us to use the same tool with a completely different
|
||||||
set of counters, not just the standard hardware counters that
|
set of counters, not just the standard hardware counters that
|
||||||
traditional tools have had to restrict themselves to (of course the
|
traditional tools have had to restrict themselves to (the
|
||||||
traditional tools can also make use of the expanded possibilities now
|
traditional tools can now actually make use of the expanded possibilities now
|
||||||
available to them, and in some cases have, as mentioned previously).
|
available to them, and in some cases have, as mentioned previously).
|
||||||
|
|
||||||
We can get a list of the available events that can be used to profile a
|
We can get a list of the available events that can be used to profile a
|
||||||
@@ -527,14 +527,14 @@ workload via ``perf list``::
|
|||||||
.. admonition:: Tying it Together
|
.. admonition:: Tying it Together
|
||||||
|
|
||||||
These are exactly the same set of events defined by the trace event
|
These are exactly the same set of events defined by the trace event
|
||||||
subsystem and exposed by ftrace / tracecmd / kernelshark as files in
|
subsystem and exposed by ftrace / trace-cmd / KernelShark as files in
|
||||||
``/sys/kernel/debug/tracing/events``, by SystemTap as
|
``/sys/kernel/debug/tracing/events``, by SystemTap as
|
||||||
kernel.trace("tracepoint_name") and (partially) accessed by LTTng.
|
kernel.trace("tracepoint_name") and (partially) accessed by LTTng.
|
||||||
|
|
||||||
Only a subset of these would be of interest to us when looking at this
|
Only a subset of these would be of interest to us when looking at this
|
||||||
workload, so let's choose the most likely subsystems (identified by the
|
workload, so let's choose the most likely subsystems (identified by the
|
||||||
string before the colon in the Tracepoint events) and do a ``perf stat``
|
string before the colon in the ``Tracepoint`` events) and do a ``perf stat``
|
||||||
run using only those wildcarded subsystems::
|
run using only those subsystem wildcards::
|
||||||
|
|
||||||
root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
|
root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
|
||||||
Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
|
Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
|
||||||
@@ -625,8 +625,8 @@ accounts for the function name actually displayed in the profile:
|
|||||||
}
|
}
|
||||||
|
|
||||||
A couple of the more interesting
|
A couple of the more interesting
|
||||||
callchains are expanded and displayed above, basically some network
|
call chains are expanded and displayed above, basically some network
|
||||||
receive paths that presumably end up waking up wget (busybox) when
|
receive paths that presumably end up waking up wget (BusyBox) when
|
||||||
network data is ready.
|
network data is ready.
|
||||||
|
|
||||||
Note that because tracepoints are normally used for tracing, the default
|
Note that because tracepoints are normally used for tracing, the default
|
||||||
@@ -646,7 +646,7 @@ high-level view of what's going on with a workload or across the system.
|
|||||||
It is however by definition an approximation, as suggested by the most
|
It is however by definition an approximation, as suggested by the most
|
||||||
prominent word associated with it, ``sampling``. On the one hand, it
|
prominent word associated with it, ``sampling``. On the one hand, it
|
||||||
allows a representative picture of what's going on in the system to be
|
allows a representative picture of what's going on in the system to be
|
||||||
cheaply taken, but on the other hand, that cheapness limits its utility
|
cheaply taken, but alternatively, that cheapness limits its utility
|
||||||
when that data suggests a need to "dive down" more deeply to discover
|
when that data suggests a need to "dive down" more deeply to discover
|
||||||
what's really going on. In such cases, the only way to see what's really
|
what's really going on. In such cases, the only way to see what's really
|
||||||
going on is to be able to look at (or summarize more intelligently) the
|
going on is to be able to look at (or summarize more intelligently) the
|
||||||
@@ -711,7 +711,7 @@ an infinite variety of ways.
|
|||||||
Another way to look at it is that there are only so many ways that the
|
Another way to look at it is that there are only so many ways that the
|
||||||
'primitive' counters can be used on their own to generate interesting
|
'primitive' counters can be used on their own to generate interesting
|
||||||
output; to get anything more complicated than simple counts requires
|
output; to get anything more complicated than simple counts requires
|
||||||
some amount of additional logic, which is typically very specific to the
|
some amount of additional logic, which is typically specific to the
|
||||||
problem at hand. For example, if we wanted to make use of a 'counter'
|
problem at hand. For example, if we wanted to make use of a 'counter'
|
||||||
that maps to the value of the time difference between when a process was
|
that maps to the value of the time difference between when a process was
|
||||||
scheduled to run on a processor and the time it actually ran, we
|
scheduled to run on a processor and the time it actually ran, we
|
||||||
@@ -722,12 +722,12 @@ standard profiling tools how much data every process on the system reads
|
|||||||
and writes, along with how many of those reads and writes fail
|
and writes, along with how many of those reads and writes fail
|
||||||
completely. If we have sufficient trace data, however, we could with the
|
completely. If we have sufficient trace data, however, we could with the
|
||||||
right tools easily extract and present that information, but we'd need
|
right tools easily extract and present that information, but we'd need
|
||||||
something other than pre-canned profiling tools to do that.
|
something other than ready-made profiling tools to do that.
|
||||||
|
|
||||||
Luckily, there is a general-purpose way to handle such needs, called
|
Luckily, there is a general-purpose way to handle such needs, called
|
||||||
"programming languages". Making programming languages easily available
|
"programming languages". Making programming languages easily available
|
||||||
to apply to such problems given the specific format of data is called a
|
to apply to such problems given the specific format of data is called a
|
||||||
'programming language binding' for that data and language. Perf supports
|
'programming language binding' for that data and language. perf supports
|
||||||
two programming language bindings, one for Python and one for Perl.
|
two programming language bindings, one for Python and one for Perl.
|
||||||
|
|
||||||
.. admonition:: Tying it Together
|
.. admonition:: Tying it Together
|
||||||
@@ -737,7 +737,7 @@ two programming language bindings, one for Python and one for Perl.
|
|||||||
DProbes dpcc compiler, an ANSI C compiler which targeted a low-level
|
DProbes dpcc compiler, an ANSI C compiler which targeted a low-level
|
||||||
assembly language running on an in-kernel interpreter on the target
|
assembly language running on an in-kernel interpreter on the target
|
||||||
system. This is exactly analogous to what Sun's DTrace did, except
|
system. This is exactly analogous to what Sun's DTrace did, except
|
||||||
that DTrace invented its own language for the purpose. Systemtap,
|
that DTrace invented its own language for the purpose. SystemTap,
|
||||||
heavily inspired by DTrace, also created its own one-off language,
|
heavily inspired by DTrace, also created its own one-off language,
|
||||||
but rather than running the product on an in-kernel interpreter,
|
but rather than running the product on an in-kernel interpreter,
|
||||||
created an elaborate compiler-based machinery to translate its
|
created an elaborate compiler-based machinery to translate its
|
||||||
@@ -750,8 +750,8 @@ entry / exit events we recorded::
|
|||||||
root@crownbay:~# perf script -g python
|
root@crownbay:~# perf script -g python
|
||||||
generated Python script: perf-script.py
|
generated Python script: perf-script.py
|
||||||
|
|
||||||
The skeleton script simply creates a Python function for each event type in the
|
The skeleton script just creates a Python function for each event type in the
|
||||||
``perf.data`` file. The body of each function simply prints the event name along
|
``perf.data`` file. The body of each function just prints the event name along
|
||||||
with its parameters. For example:
|
with its parameters. For example:
|
||||||
|
|
||||||
.. code-block:: python
|
.. code-block:: python
|
||||||
@@ -794,7 +794,7 @@ We can run that script directly to print all of the events contained in the
|
|||||||
syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024
|
syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024
|
||||||
|
|
||||||
That in itself isn't very useful; after all, we can accomplish pretty much the
|
That in itself isn't very useful; after all, we can accomplish pretty much the
|
||||||
same thing by simply running ``perf script`` without arguments in the same
|
same thing by just running ``perf script`` without arguments in the same
|
||||||
directory as the ``perf.data`` file.
|
directory as the ``perf.data`` file.
|
||||||
|
|
||||||
We can however replace the print statements in the generated function
|
We can however replace the print statements in the generated function
|
||||||
@@ -816,7 +816,7 @@ event. For example:
|
|||||||
|
|
||||||
Each event handler function in the generated code
|
Each event handler function in the generated code
|
||||||
is modified to do this. For convenience, we define a common function
|
is modified to do this. For convenience, we define a common function
|
||||||
called ``inc_counts()`` that each handler calls; ``inc_counts()`` simply tallies
|
called ``inc_counts()`` that each handler calls; ``inc_counts()`` just tallies
|
||||||
a count for each event using the ``counts`` hash, which is a specialized
|
a count for each event using the ``counts`` hash, which is a specialized
|
||||||
hash function that does Perl-like autovivification, a capability that's
|
hash function that does Perl-like autovivification, a capability that's
|
||||||
extremely useful for kinds of multi-level aggregation commonly used in
|
extremely useful for kinds of multi-level aggregation commonly used in
|
||||||
@@ -876,7 +876,7 @@ System-Wide Tracing and Profiling
|
|||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The examples so far have focused on tracing a particular program or
|
The examples so far have focused on tracing a particular program or
|
||||||
workload --- in other words, every profiling run has specified the program
|
workload --- that is, every profiling run has specified the program
|
||||||
to profile in the command-line e.g. ``perf record wget ...``.
|
to profile in the command-line e.g. ``perf record wget ...``.
|
||||||
|
|
||||||
It's also possible, and more interesting in many cases, to run a
|
It's also possible, and more interesting in many cases, to run a
|
||||||
@@ -906,13 +906,13 @@ other processes running on the system as well:
|
|||||||
:align: center
|
:align: center
|
||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
In the snapshot above, we can see callchains that originate in libc, and
|
In the snapshot above, we can see call chains that originate in ``libc``, and
|
||||||
a callchain from Xorg that demonstrates that we're using a proprietary X
|
a call chain from ``Xorg`` that demonstrates that we're using a proprietary X
|
||||||
driver in userspace (notice the presence of ``PVR`` and some other
|
driver in user space (notice the presence of ``PVR`` and some other
|
||||||
unresolvable symbols in the expanded Xorg callchain).
|
unresolvable symbols in the expanded ``Xorg`` call chain).
|
||||||
|
|
||||||
Note also that we have both kernel and userspace entries in the above
|
Note also that we have both kernel and user space entries in the above
|
||||||
snapshot. We can also tell perf to focus on userspace but providing a
|
snapshot. We can also tell perf to focus on user space but providing a
|
||||||
modifier, in this case ``u``, to the ``cycles`` hardware counter when we
|
modifier, in this case ``u``, to the ``cycles`` hardware counter when we
|
||||||
record a profile::
|
record a profile::
|
||||||
|
|
||||||
@@ -924,7 +924,7 @@ record a profile::
|
|||||||
:align: center
|
:align: center
|
||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
Notice in the screenshot above, we see only userspace entries (``[.]``)
|
Notice in the screenshot above, we see only user space entries (``[.]``)
|
||||||
|
|
||||||
Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into
|
Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into
|
||||||
DSO`` menu item to show only entries associated with a specific DSO. In
|
DSO`` menu item to show only entries associated with a specific DSO. In
|
||||||
@@ -960,7 +960,7 @@ We can look at the raw output using ``perf script`` with no arguments::
|
|||||||
Filtering
|
Filtering
|
||||||
^^^^^^^^^
|
^^^^^^^^^
|
||||||
|
|
||||||
Notice that there are a lot of events that don't really have anything to
|
Notice that there are many events that don't really have anything to
|
||||||
do with what we're interested in, namely events that schedule ``perf``
|
do with what we're interested in, namely events that schedule ``perf``
|
||||||
itself in and out or that wake perf up. We can get rid of those by using
|
itself in and out or that wake perf up. We can get rid of those by using
|
||||||
the ``--filter`` option --- for each event we specify using ``-e``, we can add a
|
the ``--filter`` option --- for each event we specify using ``-e``, we can add a
|
||||||
@@ -999,7 +999,7 @@ purpose of demonstrating how to use filters, it's close enough.
|
|||||||
.. admonition:: Tying it Together
|
.. admonition:: Tying it Together
|
||||||
|
|
||||||
These are exactly the same set of event filters defined by the trace
|
These are exactly the same set of event filters defined by the trace
|
||||||
event subsystem. See the ftrace / tracecmd / kernelshark section for more
|
event subsystem. See the ftrace / trace-cmd / KernelShark section for more
|
||||||
discussion about these event filters.
|
discussion about these event filters.
|
||||||
|
|
||||||
.. admonition:: Tying it Together
|
.. admonition:: Tying it Together
|
||||||
@@ -1009,14 +1009,14 @@ purpose of demonstrating how to use filters, it's close enough.
|
|||||||
indispensable part of the perf design as it relates to tracing.
|
indispensable part of the perf design as it relates to tracing.
|
||||||
kernel-based event filters provide a mechanism to precisely throttle
|
kernel-based event filters provide a mechanism to precisely throttle
|
||||||
the event stream that appears in user space, where it makes sense to
|
the event stream that appears in user space, where it makes sense to
|
||||||
provide bindings to real programming languages for postprocessing the
|
provide bindings to real programming languages for post-processing the
|
||||||
event stream. This architecture allows for the intelligent and
|
event stream. This architecture allows for the intelligent and
|
||||||
flexible partitioning of processing between the kernel and user
|
flexible partitioning of processing between the kernel and user
|
||||||
space. Contrast this with other tools such as SystemTap, which does
|
space. Contrast this with other tools such as SystemTap, which does
|
||||||
all of its processing in the kernel and as such requires a special
|
all of its processing in the kernel and as such requires a special
|
||||||
project-defined language in order to accommodate that design, or
|
project-defined language in order to accommodate that design, or
|
||||||
LTTng, where everything is sent to userspace and as such requires a
|
LTTng, where everything is sent to user space and as such requires a
|
||||||
super-efficient kernel-to-userspace transport mechanism in order to
|
super-efficient kernel-to-user space transport mechanism in order to
|
||||||
function properly. While perf certainly can benefit from for instance
|
function properly. While perf certainly can benefit from for instance
|
||||||
advances in the design of the transport, it doesn't fundamentally
|
advances in the design of the transport, it doesn't fundamentally
|
||||||
depend on them. Basically, if you find that your perf tracing
|
depend on them. Basically, if you find that your perf tracing
|
||||||
@@ -1028,7 +1028,7 @@ Using Dynamic Tracepoints
|
|||||||
|
|
||||||
perf isn't restricted to the fixed set of static tracepoints listed by
|
perf isn't restricted to the fixed set of static tracepoints listed by
|
||||||
``perf list``. Users can also add their own "dynamic" tracepoints anywhere
|
``perf list``. Users can also add their own "dynamic" tracepoints anywhere
|
||||||
in the kernel. For instance, suppose we want to define our own
|
in the kernel. For example, suppose we want to define our own
|
||||||
tracepoint on ``do_fork()``. We can do that using the ``perf probe`` perf
|
tracepoint on ``do_fork()``. We can do that using the ``perf probe`` perf
|
||||||
subcommand::
|
subcommand::
|
||||||
|
|
||||||
@@ -1083,7 +1083,7 @@ up after 30 seconds)::
|
|||||||
[ perf record: Woken up 1 times to write data ]
|
[ perf record: Woken up 1 times to write data ]
|
||||||
[ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ]
|
[ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ]
|
||||||
|
|
||||||
Using ``perf script`` we can see each do_fork event that fired::
|
Using ``perf script`` we can see each ``do_fork`` event that fired::
|
||||||
|
|
||||||
root@crownbay:~# perf script
|
root@crownbay:~# perf script
|
||||||
|
|
||||||
@@ -1125,7 +1125,7 @@ Using ``perf script`` we can see each do_fork event that fired::
|
|||||||
gaku 1312 [000] 34237.202388: do_fork: (c1028460)
|
gaku 1312 [000] 34237.202388: do_fork: (c1028460)
|
||||||
|
|
||||||
And using ``perf report`` on the same file, we can see the
|
And using ``perf report`` on the same file, we can see the
|
||||||
callgraphs from starting a few programs during those 30 seconds:
|
call graphs from starting a few programs during those 30 seconds:
|
||||||
|
|
||||||
.. image:: figures/perf-probe-do_fork-profile.png
|
.. image:: figures/perf-probe-do_fork-profile.png
|
||||||
:align: center
|
:align: center
|
||||||
@@ -1140,11 +1140,11 @@ callgraphs from starting a few programs during those 30 seconds:
|
|||||||
|
|
||||||
.. admonition:: Tying it Together
|
.. admonition:: Tying it Together
|
||||||
|
|
||||||
Dynamic tracepoints are implemented under the covers by kprobes and
|
Dynamic tracepoints are implemented under the covers by Kprobes and
|
||||||
uprobes. kprobes and uprobes are also used by and in fact are the
|
Uprobes. Kprobes and Uprobes are also used by and in fact are the
|
||||||
main focus of SystemTap.
|
main focus of SystemTap.
|
||||||
|
|
||||||
Perf Documentation
|
perf Documentation
|
||||||
------------------
|
------------------
|
||||||
|
|
||||||
Online versions of the manual pages for the commands discussed in this
|
Online versions of the manual pages for the commands discussed in this
|
||||||
@@ -1168,7 +1168,7 @@ section can be found here:
|
|||||||
|
|
||||||
- The top-level `perf(1) manual page <https://linux.die.net/man/1/perf>`__.
|
- The top-level `perf(1) manual page <https://linux.die.net/man/1/perf>`__.
|
||||||
|
|
||||||
Normally, you should be able to invoke the manual pages via perf itself
|
Normally, you should be able to open the manual pages via perf itself
|
||||||
e.g. ``perf help`` or ``perf help record``.
|
e.g. ``perf help`` or ``perf help record``.
|
||||||
|
|
||||||
To have the perf manual pages installed on your target, modify your
|
To have the perf manual pages installed on your target, modify your
|
||||||
@@ -1183,14 +1183,14 @@ of examples, can also be found in the ``perf`` directory of the kernel tree::
|
|||||||
tools/perf/Documentation
|
tools/perf/Documentation
|
||||||
|
|
||||||
There's also a nice perf tutorial on the perf
|
There's also a nice perf tutorial on the perf
|
||||||
wiki that goes into more detail than we do here in certain areas: `Perf
|
wiki that goes into more detail than we do here in certain areas: `perf
|
||||||
Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__
|
Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__
|
||||||
|
|
||||||
ftrace
|
ftrace
|
||||||
======
|
======
|
||||||
|
|
||||||
"ftrace" literally refers to the "ftrace function tracer" but in reality
|
"ftrace" literally refers to the "ftrace function tracer" but in reality
|
||||||
this encompasses a number of related tracers along with the
|
this encompasses several related tracers along with the
|
||||||
infrastructure that they all make use of.
|
infrastructure that they all make use of.
|
||||||
|
|
||||||
ftrace Setup
|
ftrace Setup
|
||||||
@@ -1199,11 +1199,11 @@ ftrace Setup
|
|||||||
For this section, we'll assume you've already performed the basic setup
|
For this section, we'll assume you've already performed the basic setup
|
||||||
outlined in the ":ref:`profile-manual/intro:General Setup`" section.
|
outlined in the ":ref:`profile-manual/intro:General Setup`" section.
|
||||||
|
|
||||||
ftrace, trace-cmd, and kernelshark run on the target system, and are
|
ftrace, trace-cmd, and KernelShark run on the target system, and are
|
||||||
ready to go out-of-the-box --- no additional setup is necessary. For the
|
ready to go out-of-the-box --- no additional setup is necessary. For the
|
||||||
rest of this section we assume you've ssh'ed to the host and will be
|
rest of this section we assume you're connected to the host through SSH and
|
||||||
running ftrace on the target. kernelshark is a GUI application and if
|
will be running ftrace on the target. KernelShark is a GUI application and if
|
||||||
you use the ``-X`` option to ssh you can have the kernelshark GUI run on
|
you use the ``-X`` option to ``ssh`` you can have the KernelShark GUI run on
|
||||||
the target but display remotely on the host if you want.
|
the target but display remotely on the host if you want.
|
||||||
|
|
||||||
Basic ftrace usage
|
Basic ftrace usage
|
||||||
@@ -1211,8 +1211,8 @@ Basic ftrace usage
|
|||||||
|
|
||||||
"ftrace" essentially refers to everything included in the ``/tracing``
|
"ftrace" essentially refers to everything included in the ``/tracing``
|
||||||
directory of the mounted debugfs filesystem (Yocto follows the standard
|
directory of the mounted debugfs filesystem (Yocto follows the standard
|
||||||
convention and mounts it at ``/sys/kernel/debug``). Here's a listing of all
|
convention and mounts it at ``/sys/kernel/debug``). All the files found in
|
||||||
the files found in ``/sys/kernel/debug/tracing`` on a Yocto system::
|
``/sys/kernel/debug/tracing`` on a Yocto system are::
|
||||||
|
|
||||||
root@sugarbay:/sys/kernel/debug/tracing# ls
|
root@sugarbay:/sys/kernel/debug/tracing# ls
|
||||||
README kprobe_events trace
|
README kprobe_events trace
|
||||||
@@ -1237,7 +1237,7 @@ the ftrace documentation.
|
|||||||
|
|
||||||
We'll start by looking at some of the available built-in tracers.
|
We'll start by looking at some of the available built-in tracers.
|
||||||
|
|
||||||
cat'ing the ``available_tracers`` file lists the set of available tracers::
|
The ``available_tracers`` file lists the set of available tracers::
|
||||||
|
|
||||||
root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers
|
root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers
|
||||||
blk function_graph function nop
|
blk function_graph function nop
|
||||||
@@ -1247,11 +1247,11 @@ The ``current_tracer`` file contains the tracer currently in effect::
|
|||||||
root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer
|
root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer
|
||||||
nop
|
nop
|
||||||
|
|
||||||
The above listing of current_tracer shows that the
|
The above listing of ``current_tracer`` shows that the
|
||||||
``nop`` tracer is in effect, which is just another way of saying that
|
``nop`` tracer is in effect, which is just another way of saying that
|
||||||
there's actually no tracer currently in effect.
|
there's actually no tracer currently in effect.
|
||||||
|
|
||||||
echo'ing one of the available_tracers into ``current_tracer`` makes the
|
Writing one of the available tracers into ``current_tracer`` makes the
|
||||||
specified tracer the current tracer::
|
specified tracer the current tracer::
|
||||||
|
|
||||||
root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer
|
root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer
|
||||||
@@ -1307,7 +1307,7 @@ tracer::
|
|||||||
.
|
.
|
||||||
|
|
||||||
Each line in the trace above shows what was happening in the kernel on a given
|
Each line in the trace above shows what was happening in the kernel on a given
|
||||||
cpu, to the level of detail of function calls. Each entry shows the function
|
CPU, to the level of detail of function calls. Each entry shows the function
|
||||||
called, followed by its caller (after the arrow).
|
called, followed by its caller (after the arrow).
|
||||||
|
|
||||||
The function tracer gives you an extremely detailed idea of what the
|
The function tracer gives you an extremely detailed idea of what the
|
||||||
@@ -1321,7 +1321,7 @@ great way to learn about how the kernel code works in a dynamic sense.
|
|||||||
|
|
||||||
It is a little more difficult to follow the call chains than it needs to
|
It is a little more difficult to follow the call chains than it needs to
|
||||||
be --- luckily there's a variant of the function tracer that displays the
|
be --- luckily there's a variant of the function tracer that displays the
|
||||||
callchains explicitly, called the ``function_graph`` tracer::
|
call chains explicitly, called the ``function_graph`` tracer::
|
||||||
|
|
||||||
root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer
|
root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer
|
||||||
root@sugarbay:/sys/kernel/debug/tracing# cat trace | less
|
root@sugarbay:/sys/kernel/debug/tracing# cat trace | less
|
||||||
@@ -1440,7 +1440,7 @@ As you can see, the ``function_graph`` display is much easier
|
|||||||
to follow. Also note that in addition to the function calls and
|
to follow. Also note that in addition to the function calls and
|
||||||
associated braces, other events such as scheduler events are displayed
|
associated braces, other events such as scheduler events are displayed
|
||||||
in context. In fact, you can freely include any tracepoint available in
|
in context. In fact, you can freely include any tracepoint available in
|
||||||
the trace events subsystem described in the next section by simply
|
the trace events subsystem described in the next section by just
|
||||||
enabling those events, and they'll appear in context in the function
|
enabling those events, and they'll appear in context in the function
|
||||||
graph display. Quite a powerful tool for understanding kernel dynamics.
|
graph display. Quite a powerful tool for understanding kernel dynamics.
|
||||||
|
|
||||||
@@ -1543,7 +1543,7 @@ The ``format`` file for the
|
|||||||
tracepoint describes the event in memory, which is used by the various
|
tracepoint describes the event in memory, which is used by the various
|
||||||
tracing tools that now make use of these tracepoint to parse the event
|
tracing tools that now make use of these tracepoint to parse the event
|
||||||
and make sense of it, along with a ``print fmt`` field that allows tools
|
and make sense of it, along with a ``print fmt`` field that allows tools
|
||||||
like ftrace to display the event as text. Here's what the format of the
|
like ftrace to display the event as text. The format of the
|
||||||
``kmalloc`` event looks like::
|
``kmalloc`` event looks like::
|
||||||
|
|
||||||
root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format
|
root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format
|
||||||
@@ -1596,7 +1596,7 @@ events in the output buffer::
|
|||||||
root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on
|
root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on
|
||||||
|
|
||||||
Now, if we look at the ``trace`` file, we see nothing
|
Now, if we look at the ``trace`` file, we see nothing
|
||||||
but the kmalloc events we just turned on::
|
but the ``kmalloc`` events we just turned on::
|
||||||
|
|
||||||
root@sugarbay:/sys/kernel/debug/tracing# cat trace | less
|
root@sugarbay:/sys/kernel/debug/tracing# cat trace | less
|
||||||
# tracer: nop
|
# tracer: nop
|
||||||
@@ -1651,8 +1651,8 @@ using the ``enable`` file in the subsystem directory) and get an
|
|||||||
arbitrarily fine-grained idea of what's going on in the system by
|
arbitrarily fine-grained idea of what's going on in the system by
|
||||||
enabling as many of the appropriate tracepoints as applicable.
|
enabling as many of the appropriate tracepoints as applicable.
|
||||||
|
|
||||||
A number of the tools described in this HOWTO do just that, including
|
Several tools described in this How-to do just that, including
|
||||||
``trace-cmd`` and kernelshark in the next section.
|
``trace-cmd`` and KernelShark in the next section.
|
||||||
|
|
||||||
.. admonition:: Tying it Together
|
.. admonition:: Tying it Together
|
||||||
|
|
||||||
@@ -1668,7 +1668,7 @@ A number of the tools described in this HOWTO do just that, including
|
|||||||
``/sys/kernel/debug/tracing`` will be removed and replaced with
|
``/sys/kernel/debug/tracing`` will be removed and replaced with
|
||||||
equivalent tracers based on the "trace events" subsystem.
|
equivalent tracers based on the "trace events" subsystem.
|
||||||
|
|
||||||
trace-cmd / kernelshark
|
trace-cmd / KernelShark
|
||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
trace-cmd is essentially an extensive command-line "wrapper" interface
|
trace-cmd is essentially an extensive command-line "wrapper" interface
|
||||||
@@ -1677,24 +1677,23 @@ that hides the details of all the individual files in
|
|||||||
events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to
|
events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to
|
||||||
collect traces and avoid having to deal with those details directly.
|
collect traces and avoid having to deal with those details directly.
|
||||||
|
|
||||||
As yet another layer on top of that, kernelshark provides a GUI that
|
As yet another layer on top of that, KernelShark provides a GUI that
|
||||||
allows users to start and stop traces and specify sets of events using
|
allows users to start and stop traces and specify sets of events using
|
||||||
an intuitive interface, and view the output as both trace events and as
|
an intuitive interface, and view the output as both trace events and as
|
||||||
a per-CPU graphical display. It directly uses trace-cmd as the
|
a per-CPU graphical display. It directly uses trace-cmd as the
|
||||||
plumbing that accomplishes all that underneath the covers (and actually
|
plumbing that accomplishes all that underneath the covers (and actually
|
||||||
displays the trace-cmd command it uses, as we'll see).
|
displays the trace-cmd command it uses, as we'll see).
|
||||||
|
|
||||||
To start a trace using kernelshark, first start kernelshark::
|
To start a trace using KernelShark, first start this tool::
|
||||||
|
|
||||||
root@sugarbay:~# kernelshark
|
root@sugarbay:~# kernelshark
|
||||||
|
|
||||||
Then bring up the ``Capture`` dialog by
|
Then open up the ``Capture`` dialog by choosing from the KernelShark menu::
|
||||||
choosing from the kernelshark menu::
|
|
||||||
|
|
||||||
Capture | Record
|
Capture | Record
|
||||||
|
|
||||||
That will display the following dialog, which allows you to choose one or more
|
That will display the following dialog, which allows you to choose one or more
|
||||||
events (or even one or more complete subsystems) to trace:
|
events (or even entire subsystems) to trace:
|
||||||
|
|
||||||
.. image:: figures/kernelshark-choose-events.png
|
.. image:: figures/kernelshark-choose-events.png
|
||||||
:align: center
|
:align: center
|
||||||
@@ -1702,7 +1701,7 @@ events (or even one or more complete subsystems) to trace:
|
|||||||
|
|
||||||
Note that these are exactly the same sets of events described in the
|
Note that these are exactly the same sets of events described in the
|
||||||
previous trace events subsystem section, and in fact is where trace-cmd
|
previous trace events subsystem section, and in fact is where trace-cmd
|
||||||
gets them for kernelshark.
|
gets them for KernelShark.
|
||||||
|
|
||||||
In the above screenshot, we've decided to explore the graphics subsystem
|
In the above screenshot, we've decided to explore the graphics subsystem
|
||||||
a bit and so have chosen to trace all the tracepoints contained within
|
a bit and so have chosen to trace all the tracepoints contained within
|
||||||
@@ -1716,12 +1715,12 @@ will turn into the 'Stop' button after the trace has started):
|
|||||||
:align: center
|
:align: center
|
||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
Notice that the right-hand pane shows the exact trace-cmd command-line
|
Notice that the right pane shows the exact trace-cmd command-line
|
||||||
that's used to run the trace, along with the results of the trace-cmd
|
that's used to run the trace, along with the results of the trace-cmd
|
||||||
run.
|
run.
|
||||||
|
|
||||||
Once the ``Stop`` button is pressed, the graphical view magically fills up
|
Once the ``Stop`` button is pressed, the graphical view magically fills up
|
||||||
with a colorful per-cpu display of the trace data, along with the
|
with a colorful per-CPU display of the trace data, along with the
|
||||||
detailed event listing below that:
|
detailed event listing below that:
|
||||||
|
|
||||||
.. image:: figures/kernelshark-i915-display.png
|
.. image:: figures/kernelshark-i915-display.png
|
||||||
@@ -1736,7 +1735,7 @@ events``:
|
|||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
The tool is pretty self-explanatory, but for more detailed information
|
The tool is pretty self-explanatory, but for more detailed information
|
||||||
on navigating through the data, see the `kernelshark
|
on navigating through the data, see the `KernelShark
|
||||||
website <https://kernelshark.org/Documentation.html>`__.
|
website <https://kernelshark.org/Documentation.html>`__.
|
||||||
|
|
||||||
ftrace Documentation
|
ftrace Documentation
|
||||||
@@ -1752,41 +1751,41 @@ Documentation directory::
|
|||||||
|
|
||||||
Documentation/trace/events.txt
|
Documentation/trace/events.txt
|
||||||
|
|
||||||
There is a nice series of articles on using ftrace and trace-cmd at LWN:
|
A nice series of articles on using ftrace and trace-cmd are available at LWN:
|
||||||
|
|
||||||
- `Debugging the kernel using Ftrace - part
|
- `Debugging the kernel using ftrace - part
|
||||||
1 <https://lwn.net/Articles/365835/>`__
|
1 <https://lwn.net/Articles/365835/>`__
|
||||||
|
|
||||||
- `Debugging the kernel using Ftrace - part
|
- `Debugging the kernel using ftrace - part
|
||||||
2 <https://lwn.net/Articles/366796/>`__
|
2 <https://lwn.net/Articles/366796/>`__
|
||||||
|
|
||||||
- `Secrets of the Ftrace function
|
- `Secrets of the ftrace function
|
||||||
tracer <https://lwn.net/Articles/370423/>`__
|
tracer <https://lwn.net/Articles/370423/>`__
|
||||||
|
|
||||||
- `trace-cmd: A front-end for
|
- `trace-cmd: A front-end for
|
||||||
Ftrace <https://lwn.net/Articles/410200/>`__
|
ftrace <https://lwn.net/Articles/410200/>`__
|
||||||
|
|
||||||
See also `KernelShark's documentation <https://kernelshark.org/Documentation.html>`__
|
See also `KernelShark's documentation <https://kernelshark.org/Documentation.html>`__
|
||||||
for further usage details.
|
for further usage details.
|
||||||
|
|
||||||
An amusing yet useful README (a tracing mini-HOWTO) can be found in
|
An amusing yet useful README (a tracing mini-How-to) can be found in
|
||||||
``/sys/kernel/debug/tracing/README``.
|
``/sys/kernel/debug/tracing/README``.
|
||||||
|
|
||||||
systemtap
|
SystemTap
|
||||||
=========
|
=========
|
||||||
|
|
||||||
SystemTap is a system-wide script-based tracing and profiling tool.
|
SystemTap is a system-wide script-based tracing and profiling tool.
|
||||||
|
|
||||||
SystemTap scripts are C-like programs that are executed in the kernel to
|
SystemTap scripts are C-like programs that are executed in the kernel to
|
||||||
gather / print / aggregate data extracted from the context they end up being
|
gather / print / aggregate data extracted from the context they end up being
|
||||||
invoked under.
|
called under.
|
||||||
|
|
||||||
For example, this probe from the `SystemTap
|
For example, this probe from the `SystemTap
|
||||||
tutorial <https://sourceware.org/systemtap/tutorial/>`__ simply prints a
|
tutorial <https://sourceware.org/systemtap/tutorial/>`__ just prints a
|
||||||
line every time any process on the system runs ``open()`` on a file. For each line,
|
line every time any process on the system runs ``open()`` on a file. For each line,
|
||||||
it prints the executable name of the program that opened the file, along
|
it prints the executable name of the program that opened the file, along
|
||||||
with its PID, and the name of the file it opened (or tried to open),
|
with its PID, and the name of the file it opened (or tried to open), which it
|
||||||
which it extracts from the open syscall's argstr.
|
extracts from the argument string (``argstr``) of the ``open`` system call.
|
||||||
|
|
||||||
.. code-block:: none
|
.. code-block:: none
|
||||||
|
|
||||||
@@ -1801,13 +1800,13 @@ which it extracts from the open syscall's argstr.
|
|||||||
}
|
}
|
||||||
|
|
||||||
Normally, to execute this
|
Normally, to execute this
|
||||||
probe, you'd simply install systemtap on the system you want to probe,
|
probe, you'd just install SystemTap on the system you want to probe,
|
||||||
and directly run the probe on that system e.g. assuming the name of the
|
and directly run the probe on that system e.g. assuming the name of the
|
||||||
file containing the above text is trace_open.stp::
|
file containing the above text is ``trace_open.stp``::
|
||||||
|
|
||||||
# stap trace_open.stp
|
# stap trace_open.stp
|
||||||
|
|
||||||
What systemtap does under the covers to run this probe is 1) parse and
|
What SystemTap does under the covers to run this probe is 1) parse and
|
||||||
convert the probe to an equivalent "C" form, 2) compile the "C" form
|
convert the probe to an equivalent "C" form, 2) compile the "C" form
|
||||||
into a kernel module, 3) insert the module into the kernel, which arms
|
into a kernel module, 3) insert the module into the kernel, which arms
|
||||||
it, and 4) collect the data generated by the probe and display it to the
|
it, and 4) collect the data generated by the probe and display it to the
|
||||||
@@ -1820,25 +1819,25 @@ kernel build system unfortunately isn't typically part of the image
|
|||||||
running on the target. It is normally available on the "host" system
|
running on the target. It is normally available on the "host" system
|
||||||
that produced the target image however; in such cases, steps 1 and 2 are
|
that produced the target image however; in such cases, steps 1 and 2 are
|
||||||
executed on the host system, and steps 3 and 4 are executed on the
|
executed on the host system, and steps 3 and 4 are executed on the
|
||||||
target system, using only the systemtap "runtime".
|
target system, using only the SystemTap "runtime".
|
||||||
|
|
||||||
The systemtap support in Yocto assumes that only steps 3 and 4 are run
|
The SystemTap support in Yocto assumes that only steps 3 and 4 are run
|
||||||
on the target; it is possible to do everything on the target, but this
|
on the target; it is possible to do everything on the target, but this
|
||||||
section assumes only the typical embedded use-case.
|
section assumes only the typical embedded use-case.
|
||||||
|
|
||||||
So basically what you need to do in order to run a systemtap script on
|
Therefore, what you need to do in order to run a SystemTap script on
|
||||||
the target is to 1) on the host system, compile the probe into a kernel
|
the target is to 1) on the host system, compile the probe into a kernel
|
||||||
module that makes sense to the target, 2) copy the module onto the
|
module that makes sense to the target, 2) copy the module onto the
|
||||||
target system and 3) insert the module into the target kernel, which
|
target system and 3) insert the module into the target kernel, which
|
||||||
arms it, and 4) collect the data generated by the probe and display it
|
arms it, and 4) collect the data generated by the probe and display it
|
||||||
to the user.
|
to the user.
|
||||||
|
|
||||||
systemtap Setup
|
SystemTap Setup
|
||||||
---------------
|
---------------
|
||||||
|
|
||||||
Those are a lot of steps and a lot of details, but fortunately Yocto
|
Those are many steps and details, but fortunately Yocto
|
||||||
includes a script called ``crosstap`` that will take care of those
|
includes a script called ``crosstap`` that will take care of those
|
||||||
details, allowing you to simply execute a systemtap script on the remote
|
details, allowing you to just execute a SystemTap script on the remote
|
||||||
target, with arguments if necessary.
|
target, with arguments if necessary.
|
||||||
|
|
||||||
In order to do this from a remote host, however, you need to have access
|
In order to do this from a remote host, however, you need to have access
|
||||||
@@ -1851,7 +1850,7 @@ having done a build::
|
|||||||
Error: No target kernel build found.
|
Error: No target kernel build found.
|
||||||
Did you forget to create a local build of your image?
|
Did you forget to create a local build of your image?
|
||||||
|
|
||||||
'crosstap' requires a local sdk build of the target system
|
'crosstap' requires a local SDK build of the target system
|
||||||
(or a build that includes 'tools-profile') in order to build
|
(or a build that includes 'tools-profile') in order to build
|
||||||
kernel modules that can probe the target system.
|
kernel modules that can probe the target system.
|
||||||
|
|
||||||
@@ -1867,11 +1866,11 @@ Practically speaking, that means you need to do the following:
|
|||||||
the BSP README and/or the widely available basic documentation
|
the BSP README and/or the widely available basic documentation
|
||||||
that discusses how to build images).
|
that discusses how to build images).
|
||||||
|
|
||||||
- Build an -sdk version of the image e.g.::
|
- Build an ``-sdk`` version of the image e.g.::
|
||||||
|
|
||||||
$ bitbake core-image-sato-sdk
|
$ bitbake core-image-sato-sdk
|
||||||
|
|
||||||
- Or build a non-sdk image but include the profiling tools
|
- Or build a non-SDK image but include the profiling tools
|
||||||
(edit ``local.conf`` and add ``tools-profile`` to the end of
|
(edit ``local.conf`` and add ``tools-profile`` to the end of
|
||||||
:term:``EXTRA_IMAGE_FEATURES`` variable)::
|
:term:``EXTRA_IMAGE_FEATURES`` variable)::
|
||||||
|
|
||||||
@@ -1887,15 +1886,14 @@ Practically speaking, that means you need to do the following:
|
|||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
SystemTap, which uses ``crosstap``, assumes you can establish an ssh
|
SystemTap, which uses ``crosstap``, assumes you can establish an SSH
|
||||||
connection to the remote target. Please refer to the crosstap wiki
|
connection to the remote target. Please refer to the crosstap wiki
|
||||||
page for details on verifying ssh connections. Also, the ability to ssh
|
page for details on verifying SSH connections. Also, the ability to SSH
|
||||||
into the target system is not enabled by default in ``*-minimal`` images.
|
into the target system is not enabled by default in ``*-minimal`` images.
|
||||||
|
|
||||||
So essentially what you need to
|
Therefore, what you need to do is build an SDK image or image with
|
||||||
do is build an SDK image or image with ``tools-profile`` as detailed in
|
``tools-profile`` as detailed in the ":ref:`profile-manual/intro:General Setup`"
|
||||||
the ":ref:`profile-manual/intro:General Setup`" section of this
|
section of this manual, and boot the resulting target image.
|
||||||
manual, and boot the resulting target image.
|
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
@@ -1903,12 +1901,12 @@ manual, and boot the resulting target image.
|
|||||||
to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and
|
to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and
|
||||||
the kernel in that machine's :term:`Build Directory` must match the kernel on
|
the kernel in that machine's :term:`Build Directory` must match the kernel on
|
||||||
the booted system exactly, or you'll get the above ``crosstap`` message
|
the booted system exactly, or you'll get the above ``crosstap`` message
|
||||||
when you try to invoke a script.
|
when you try to call a script.
|
||||||
|
|
||||||
Running a Script on a Target
|
Running a Script on a Target
|
||||||
----------------------------
|
----------------------------
|
||||||
|
|
||||||
Once you've done that, you should be able to run a systemtap script on
|
Once you've done that, you should be able to run a SystemTap script on
|
||||||
the target::
|
the target::
|
||||||
|
|
||||||
$ cd /path/to/yocto
|
$ cd /path/to/yocto
|
||||||
@@ -1937,7 +1935,7 @@ If you get an error connecting to the target e.g.::
|
|||||||
$ crosstap root@192.168.7.2 trace_open.stp
|
$ crosstap root@192.168.7.2 trace_open.stp
|
||||||
error establishing ssh connection on remote 'root@192.168.7.2'
|
error establishing ssh connection on remote 'root@192.168.7.2'
|
||||||
|
|
||||||
Try ssh'ing to the target and see what happens::
|
Try connecting to the target through SSH and see what happens::
|
||||||
|
|
||||||
$ ssh root@192.168.7.2
|
$ ssh root@192.168.7.2
|
||||||
|
|
||||||
@@ -1955,7 +1953,7 @@ no password):
|
|||||||
matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
|
matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
|
||||||
matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
|
matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
|
||||||
|
|
||||||
systemtap Documentation
|
SystemTap Documentation
|
||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
The SystemTap language reference can be found here: `SystemTap Language
|
The SystemTap language reference can be found here: `SystemTap Language
|
||||||
@@ -1968,7 +1966,7 @@ page <https://sourceware.org/systemtap/documentation.html>`__
|
|||||||
Sysprof
|
Sysprof
|
||||||
=======
|
=======
|
||||||
|
|
||||||
Sysprof is a very easy to use system-wide profiler that consists of a
|
Sysprof is an easy to use system-wide profiler that consists of a
|
||||||
single window with three panes and a few buttons which allow you to
|
single window with three panes and a few buttons which allow you to
|
||||||
start, stop, and view the profile from one place.
|
start, stop, and view the profile from one place.
|
||||||
|
|
||||||
@@ -1978,16 +1976,16 @@ Sysprof Setup
|
|||||||
For this section, we'll assume you've already performed the basic setup
|
For this section, we'll assume you've already performed the basic setup
|
||||||
outlined in the ":ref:`profile-manual/intro:General Setup`" section.
|
outlined in the ":ref:`profile-manual/intro:General Setup`" section.
|
||||||
|
|
||||||
Sysprof is a GUI-based application that runs on the target system. For
|
Sysprof is a GUI-based application that runs on the target system. For the rest
|
||||||
the rest of this document we assume you've ssh'ed to the host and will
|
of this document we assume you're connected to the host through SSH and will be
|
||||||
be running Sysprof on the target (you can use the ``-X`` option to ssh and
|
running Sysprof on the target (you can use the ``-X`` option to ``ssh`` and
|
||||||
have the Sysprof GUI run on the target but display remotely on the host
|
have the Sysprof GUI run on the target but display remotely on the host
|
||||||
if you want).
|
if you want).
|
||||||
|
|
||||||
Basic Sysprof Usage
|
Basic Sysprof Usage
|
||||||
-------------------
|
-------------------
|
||||||
|
|
||||||
To start profiling the system, you simply press the ``Start`` button. To
|
To start profiling the system, you just press the ``Start`` button. To
|
||||||
stop profiling and to start viewing the profile data in one easy step,
|
stop profiling and to start viewing the profile data in one easy step,
|
||||||
press the ``Profile`` button.
|
press the ``Profile`` button.
|
||||||
|
|
||||||
@@ -2001,11 +1999,11 @@ with profiling data:
|
|||||||
The left pane shows a list of functions and processes. Selecting one of
|
The left pane shows a list of functions and processes. Selecting one of
|
||||||
those expands that function in the right pane, showing all its callees.
|
those expands that function in the right pane, showing all its callees.
|
||||||
Note that this caller-oriented display is essentially the inverse of
|
Note that this caller-oriented display is essentially the inverse of
|
||||||
perf's default callee-oriented callchain display.
|
perf's default callee-oriented call chain display.
|
||||||
|
|
||||||
In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and
|
In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and
|
||||||
looking up the callchain we can see that one of the callers of
|
looking up the call chain we can see that one of the callers of
|
||||||
``__copy_to_user_ll`` is ``sys_read()`` and the complete callpath between them.
|
``__copy_to_user_ll`` is ``sys_read()`` and the complete call path between them.
|
||||||
Notice that this is essentially a portion of the same information we saw
|
Notice that this is essentially a portion of the same information we saw
|
||||||
in the perf display shown in the perf section of this page.
|
in the perf display shown in the perf section of this page.
|
||||||
|
|
||||||
@@ -2014,7 +2012,7 @@ in the perf display shown in the perf section of this page.
|
|||||||
:width: 70%
|
:width: 70%
|
||||||
|
|
||||||
Similarly, the above is a snapshot of the Sysprof display of a
|
Similarly, the above is a snapshot of the Sysprof display of a
|
||||||
``copy-from-user`` callchain.
|
``copy-from-user`` call chain.
|
||||||
|
|
||||||
Finally, looking at the third Sysprof pane in the lower left, we can see
|
Finally, looking at the third Sysprof pane in the lower left, we can see
|
||||||
a list of all the callers of a particular function selected in the top
|
a list of all the callers of a particular function selected in the top
|
||||||
@@ -2030,18 +2028,17 @@ to the selected function, and so on.
|
|||||||
|
|
||||||
.. admonition:: Tying it Together
|
.. admonition:: Tying it Together
|
||||||
|
|
||||||
If you like sysprof's ``caller-oriented`` display, you may be able to
|
If you like Sysprof's ``caller-oriented`` display, you may be able to
|
||||||
approximate it in other tools as well. For example, ``perf report`` has
|
approximate it in other tools as well. For example, ``perf report`` has
|
||||||
the ``-g`` (``--call-graph``) option that you can experiment with; one of the
|
the ``-g`` (``--call-graph``) option that you can experiment with; one of the
|
||||||
options is ``caller`` for an inverted caller-based callgraph display.
|
options is ``caller`` for an inverted caller-based call graph display.
|
||||||
|
|
||||||
Sysprof Documentation
|
Sysprof Documentation
|
||||||
---------------------
|
---------------------
|
||||||
|
|
||||||
There doesn't seem to be any documentation for Sysprof, but maybe that's
|
There doesn't seem to be any documentation for Sysprof, but maybe that's
|
||||||
because it's pretty self-explanatory. The Sysprof website, however, is
|
because it's pretty self-explanatory. The Sysprof website, however, is here:
|
||||||
here: `Sysprof, System-wide Performance Profiler for
|
`Sysprof, System-wide Performance Profiler for Linux <http://sysprof.com/>`__
|
||||||
Linux <http://sysprof.com/>`__
|
|
||||||
|
|
||||||
LTTng (Linux Trace Toolkit, next generation)
|
LTTng (Linux Trace Toolkit, next generation)
|
||||||
============================================
|
============================================
|
||||||
@@ -2051,7 +2048,7 @@ LTTng Setup
|
|||||||
|
|
||||||
For this section, we'll assume you've already performed the basic setup
|
For this section, we'll assume you've already performed the basic setup
|
||||||
outlined in the ":ref:`profile-manual/intro:General Setup`" section.
|
outlined in the ":ref:`profile-manual/intro:General Setup`" section.
|
||||||
LTTng is run on the target system by ssh'ing to it.
|
LTTng is run on the target system by connecting to it through SSH.
|
||||||
|
|
||||||
Collecting and Viewing Traces
|
Collecting and Viewing Traces
|
||||||
-----------------------------
|
-----------------------------
|
||||||
@@ -2064,7 +2061,7 @@ tracing.
|
|||||||
Collecting and viewing a trace on the target (inside a shell)
|
Collecting and viewing a trace on the target (inside a shell)
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
First, from the host, ssh to the target::
|
First, from the host, connect to the target through SSH::
|
||||||
|
|
||||||
$ ssh -l root 192.168.1.47
|
$ ssh -l root 192.168.1.47
|
||||||
The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established.
|
The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established.
|
||||||
@@ -2156,11 +2153,11 @@ supplying your own name to ``lttng create``)::
|
|||||||
drwxr-xr-x 5 root root 1024 Oct 15 23:57 ..
|
drwxr-xr-x 5 root root 1024 Oct 15 23:57 ..
|
||||||
drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120
|
drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120
|
||||||
|
|
||||||
Collecting and viewing a userspace trace on the target (inside a shell)
|
Collecting and viewing a user space trace on the target (inside a shell)
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
For LTTng userspace tracing, you need to have a properly instrumented
|
For LTTng user space tracing, you need to have a properly instrumented
|
||||||
userspace program. For this example, we'll use the ``hello`` test program
|
user space program. For this example, we'll use the ``hello`` test program
|
||||||
generated by the ``lttng-ust`` build.
|
generated by the ``lttng-ust`` build.
|
||||||
|
|
||||||
The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust``
|
The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust``
|
||||||
@@ -2176,7 +2173,7 @@ Copy that over to the target machine::
|
|||||||
You now have the instrumented LTTng "hello world" test program on the
|
You now have the instrumented LTTng "hello world" test program on the
|
||||||
target, ready to test.
|
target, ready to test.
|
||||||
|
|
||||||
First, from the host, ssh to the target::
|
First, from the host, connect to the target through SSH::
|
||||||
|
|
||||||
$ ssh -l root 192.168.1.47
|
$ ssh -l root 192.168.1.47
|
||||||
The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established.
|
The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established.
|
||||||
@@ -2191,7 +2188,7 @@ Once on the target, use these steps to create a trace::
|
|||||||
Session auto-20190303-021943 created.
|
Session auto-20190303-021943 created.
|
||||||
Traces will be written in /home/root/lttng-traces/auto-20190303-021943
|
Traces will be written in /home/root/lttng-traces/auto-20190303-021943
|
||||||
|
|
||||||
Enable the events you want to trace (in this case all userspace events)::
|
Enable the events you want to trace (in this case all user space events)::
|
||||||
|
|
||||||
root@crownbay:~# lttng enable-event --userspace --all
|
root@crownbay:~# lttng enable-event --userspace --all
|
||||||
All UST events are enabled in channel channel0
|
All UST events are enabled in channel channel0
|
||||||
@@ -2263,13 +2260,13 @@ the entire blktrace and blkparse pipeline on the target, or you can run
|
|||||||
blktrace in 'listen' mode on the target and have blktrace and blkparse
|
blktrace in 'listen' mode on the target and have blktrace and blkparse
|
||||||
collect and analyze the data on the host (see the
|
collect and analyze the data on the host (see the
|
||||||
":ref:`profile-manual/usage:Using blktrace Remotely`" section
|
":ref:`profile-manual/usage:Using blktrace Remotely`" section
|
||||||
below). For the rest of this section we assume you've ssh'ed to the host and
|
below). For the rest of this section we assume you've to the host through SSH
|
||||||
will be running blkrace on the target.
|
and will be running blktrace on the target.
|
||||||
|
|
||||||
Basic blktrace Usage
|
Basic blktrace Usage
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
To record a trace, simply run the ``blktrace`` command, giving it the name
|
To record a trace, just run the ``blktrace`` command, giving it the name
|
||||||
of the block device you want to trace activity on::
|
of the block device you want to trace activity on::
|
||||||
|
|
||||||
root@crownbay:~# blktrace /dev/sdc
|
root@crownbay:~# blktrace /dev/sdc
|
||||||
@@ -2280,10 +2277,10 @@ In another shell, execute a workload you want to trace::
|
|||||||
Connecting to downloads.yoctoproject.org (140.211.169.59:80)
|
Connecting to downloads.yoctoproject.org (140.211.169.59:80)
|
||||||
linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
|
linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
|
||||||
|
|
||||||
Press Ctrl-C in the blktrace shell to stop the trace. It
|
Press ``Ctrl-C`` in the blktrace shell to stop the trace. It
|
||||||
will display how many events were logged, along with the per-cpu file
|
will display how many events were logged, along with the per-cpu file
|
||||||
sizes (blktrace records traces in per-cpu kernel buffers and simply
|
sizes (blktrace records traces in per-cpu kernel buffers and just
|
||||||
dumps them to userspace for blkparse to merge and sort later)::
|
dumps them to user space for blkparse to merge and sort later)::
|
||||||
|
|
||||||
^C=== sdc ===
|
^C=== sdc ===
|
||||||
CPU 0: 7082 events, 332 KiB data
|
CPU 0: 7082 events, 332 KiB data
|
||||||
@@ -2299,7 +2296,7 @@ with the device name as the first part of the filename::
|
|||||||
-rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0
|
-rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0
|
||||||
-rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1
|
-rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1
|
||||||
|
|
||||||
To view the trace events, simply invoke ``blkparse`` in the directory
|
To view the trace events, just call ``blkparse`` in the directory
|
||||||
containing the trace files, giving it the device name that forms the
|
containing the trace files, giving it the device name that forms the
|
||||||
first part of the filenames::
|
first part of the filenames::
|
||||||
|
|
||||||
@@ -2398,8 +2395,8 @@ Live Mode
|
|||||||
~~~~~~~~~
|
~~~~~~~~~
|
||||||
|
|
||||||
blktrace and blkparse are designed from the ground up to be able to
|
blktrace and blkparse are designed from the ground up to be able to
|
||||||
operate together in a "pipe mode" where the stdout of blktrace can be
|
operate together in a "pipe mode" where the standard output of blktrace can be
|
||||||
fed directly into the stdin of blkparse::
|
fed directly into the standard input of blkparse::
|
||||||
|
|
||||||
root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i -
|
root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i -
|
||||||
|
|
||||||
@@ -2468,7 +2465,7 @@ just ended::
|
|||||||
Total: 11800 events (dropped 0), 554 KiB data
|
Total: 11800 events (dropped 0), 554 KiB data
|
||||||
|
|
||||||
The blktrace instance on the host will
|
The blktrace instance on the host will
|
||||||
save the target output inside a hostname-timestamp directory::
|
save the target output inside a ``<hostname>-<timestamp>`` directory::
|
||||||
|
|
||||||
$ ls -al
|
$ ls -al
|
||||||
drwxr-xr-x 10 root root 1024 Oct 28 02:40 .
|
drwxr-xr-x 10 root root 1024 Oct 28 02:40 .
|
||||||
@@ -2540,7 +2537,7 @@ Tracing Block I/O via 'ftrace'
|
|||||||
It's also possible to trace block I/O using only
|
It's also possible to trace block I/O using only
|
||||||
:ref:`profile-manual/usage:The 'trace events' Subsystem`, which
|
:ref:`profile-manual/usage:The 'trace events' Subsystem`, which
|
||||||
can be useful for casual tracing if you don't want to bother dealing with the
|
can be useful for casual tracing if you don't want to bother dealing with the
|
||||||
userspace tools.
|
user space tools.
|
||||||
|
|
||||||
To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``,
|
To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``,
|
||||||
where ``xxx`` is the device name. This for example enables tracing for
|
where ``xxx`` is the device name. This for example enables tracing for
|
||||||
@@ -2598,7 +2595,7 @@ section can be found here:
|
|||||||
- https://linux.die.net/man/8/btrace
|
- https://linux.die.net/man/8/btrace
|
||||||
|
|
||||||
The above manual pages, along with manuals for the other blktrace utilities
|
The above manual pages, along with manuals for the other blktrace utilities
|
||||||
(btt, blkiomon, etc) can be found in the ``/doc`` directory of the blktrace
|
(``btt``, ``blkiomon``, etc) can be found in the ``/doc`` directory of the blktrace
|
||||||
tools git repo::
|
tools git repository::
|
||||||
|
|
||||||
$ git clone git://git.kernel.dk/blktrace.git
|
$ git clone git://git.kernel.dk/blktrace.git
|
||||||
|
|||||||
@@ -1,4 +1,20 @@
|
|||||||
|
autovivification
|
||||||
|
blkparse
|
||||||
|
blktrace
|
||||||
|
callee
|
||||||
|
debugfs
|
||||||
ftrace
|
ftrace
|
||||||
toolchain
|
KernelShark
|
||||||
systemd
|
Kprobe
|
||||||
LTTng
|
LTTng
|
||||||
|
perf
|
||||||
|
profiler
|
||||||
|
subcommand
|
||||||
|
subnode
|
||||||
|
superset
|
||||||
|
Sysprof
|
||||||
|
systemd
|
||||||
|
toolchain
|
||||||
|
tracepoint
|
||||||
|
Uprobe
|
||||||
|
wget
|
||||||
|
|||||||
@@ -1,4 +1,5 @@
|
|||||||
Yocto
|
|
||||||
BSP
|
|
||||||
BitBake
|
BitBake
|
||||||
|
BSP
|
||||||
|
crosstap
|
||||||
OpenEmbedded
|
OpenEmbedded
|
||||||
|
Yocto
|
||||||
|
|||||||
Reference in New Issue
Block a user