This will use default values when no distribution is set.
[YOCTO #15086]
(From OE-Core rev: 01eb8d4ad71c587d56608d83ec4187375b2f4c44)
Signed-off-by: Thomas Roos <throos@amazon.de>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
(cherry picked from commit 888fe63b46efceeff08dbe8c4f66fec33d06cb7a)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Read from serial console with a small delay to bundle data to e.g.
full lines. Reading one character at a time is not needed and causes
busy looping.
(From OE-Core rev: e91a09702680b713293bcfcc851b27a73e884a8b)
Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
(cherry picked from commit 0049f6757f6f956fb4cc77b3df6a672c20b53cf4)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
When a qemu machine hangs, the QMP calls can hang for ever
too, and when this happens any failing test commands from ssh
runner may be followed by dump_monitor() calls which
then also hang. Hangs followed by hangs.
Use runqemutime at setup and run_monitor() specific timeout
for later calls.
(From OE-Core rev: 3b99d0ce6445084038f89dfa98605a7aec49107b)
Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
(cherry picked from commit 3a07bdf77dc6ecbf4c620b051dd032abaaf1e4ff)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Use a short sleep to bundle serial console reads so that
we are not reading one character at a time which reduces busy
looping.
(From OE-Core rev: 3699e5bf2f9259266c49aaf69127183988b9d052)
Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
(cherry picked from commit cafe65d8cf7544edbd387f7f5f6d77c64c6b18fa)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
If test target qemu machine hangs completely, dump_target() calls
over serial console are taking a long time to time out, possibly
for every failing ssh command execution and a lot of test cases,
and same with dump_monitor().
Instead of trying for ever, count errors and after 5 stop trying
to dump_target() and dump_monitor() completely.
These help to end testing earlier when a test target is completely
deadlocked and all ssh, serial and QMP communication with it are
failing.
(From OE-Core rev: 91bc1e03bc990c527d8aadbdcd7bf97217db124e)
Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
(cherry picked from commit d9ad0a055abba983c6cee1dca4d2f0a8a3c48782)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This does not actually guarantee that the child runqemu process has completely exited:
poll() may return prematurely while the SIGTERM handler in runqemu is still running.
This thwarts the rest of the processing, and may terminate the handler before
it completes.
Use Popen.communicate() instead: this is what python documentation recommends as well:
https://docs.python.org/3/library/subprocess.html#subprocess.Popen.communicate
(From OE-Core rev: 88e7bac0f06d4f12cd6f078d37e4e9975871860f)
Signed-off-by: Alexander Kanavin <alex@linutronix.de>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit cd3e55606c427287f37585c5d7cde936471e52f4)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Updates the log message printed when login banner is seen in QEMU to
report the UNIX Epoch time in addition to the human readable time. This
makes it much easier and accurate to correlate logs with the guest, in
particular with the guest journalctl which prints log timestamps in
human readable format and the oeqa SSH debug logging which prints the
UNIX Epoch.
(From OE-Core rev: 2a860de611bebae2e1100380b975b7648b8560d9)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit 275b6f3c8d0eeafa3902c48a49655491a89c47bc)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Add a comment explaining the non-obvious return codes.
(From OE-Core rev: cdf3a1a20f02f43451f86a321c001e6b049a4ffc)
Signed-off-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit 6572baffa02ba6b8a686490d55af17cacb528920)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
| Failed to dump QMP CMD: query-status with
| Exception: [Errno 2] No such file or directory: '.../tmp/log/runtime-hostdump/qmp_00_query-status'
| Failed to dump QMP CMD: query-block with
| Exception: [Errno 2] No such file or directory: '.../tmp/log/runtime-hostdump/qmp_00_query-block'
| Failed to dump QMP CMD: dump-guest-memory with
| Exception: [Errno 2] No such file or directory: '.../tmp/log/runtime-hostdump/qmp_00_dump-guest-memory'
The qmp dump commands could fail, because of missing root directory.
So create it before any log writing.
(From OE-Core rev: c4dc5d674afe65fedb5195f187b68f23720646ba)
Signed-off-by: Andrej Valek <andrej.valek@siemens.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
The ltp compliancy parser is rewritten to actually
match the logs: they seem to be unstructured, test case names
are not printed and the only indication of failure is appearance of
FAIL[ED] somewhere.
(From OE-Core rev: 52766561dbfee625c89393905a85e10d85f69c6c)
Signed-off-by: Alexander Kanavin <alex@linutronix.de>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
sys.exit will cause finally statements and other code to run at exit. Since
we're using os.fork() here, os._exit() is apprioriate in this codepath.
(From OE-Core rev: ec08498ff29de9ccd23be88b9d7af3dab6bbb81e)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This is done when starting up qemu has failed, but is not done
when qemu started ok, but fails later in QMP communication.
Output from runqemu does contain valuable information to find out
why, so rather than fix all the QMP fails to include it, let's just
print it in stop().
(From OE-Core rev: 6e2bf68e4401db747484c2c8ba0f77500b1d2d49)
Signed-off-by: Alexander Kanavin <alex.kanavin@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Files in /proc/xxx/map_files/ may no longer exist, just ignore this rather than
raising an exception.
(From OE-Core rev: fb1027896a263cd91e2378a4e97dbdf0807b306b)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Need to ensure that the dump_dir is created correctly and available
When command arguemnts are passed construct a filename if needed and
convert the arguements to a json object to pass to QMP.
(From OE-Core rev: 9a2f4e1e95f4a3f7ebbf08f46445c8ea670adce3)
Signed-off-by: Saul Wold <saul.wold@windriver.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Add a couple of logging info to track time between activities, first
is from after the Popen(launch_cmd) to after qmp.connect(), second is
from qmp.connect() to the release of the qemu via the qmp("cont") command
this includes the mmap() activity.
Example output:
QMP connected to QEMU at 06/24/21 11:11:56 and took 0.9556229114532471 seconds from launch
QMP released QEMU at 06/24/21 11:11:56 and took 0.26789021492004395 seconds from connect
(From OE-Core rev: 547f49230ba4ebeefe5b696e0460ebaffa8e91e6)
Signed-off-by: Saul Wold <saul.wold@windriver.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This will allow for direct ssh connection without breaking
the first one that is used for monitoring. The "nowait" option
will cause qmp server connection to NOT block waiting.
(From OE-Core rev: 40f09e184afd42decf2f924896fef03beacddc4b)
Signed-off-by: Saul Wold <saul.wold@windriver.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
We now spend time copying the VM image into a tmpfs and with IO load on the
system, the time + the boot time of the VM can take longer than 120s. Increase
the timeout to match the added overhead of copying the image file.
(From OE-Core rev: a40087c966af5ffb9309e1ddfdb3d06973e0bddd)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
We've seeing issues where IO load appears to cause strange failures due to timeouts
within qemu. One theory for these is that it is is hitting hard page faults
at in-opportune moments which cause timing problems within the VM.
This patch is a bit of a hack which tries to ensure the data is paged in
at a point when we know we can take the time delays (waiting for the QMP
start signal). Whilst this isn't ideal, it does seem to improve things on
the autobuilder and shouldn't harm anything.
The code figures out which files to read my looking at the mmap'd files
the process has open from /proc. On Centos7 systems these files are not
user readable, if that is the case we just skip them.
(From OE-Core rev: e77844314d09ceff9c22338d366519928f4f7284)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
We had debugging for qemu faiing to start which was no longer reachable
after the QMP changes. Reorder the code to enable this debugging to work
again which may allow insight into autobuilder failures in this area.
(From OE-Core rev: 8fac8c61565977c775d8ede5bddc856b7767a3e4)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
When running a shutdown command, the serial port can close without the
command returning. This is seen as the socket being readable but having
no data. Change the way this case is handled in the code to avoid
tracebacks.
(From OE-Core rev: 396a3ba884820d040c91f7592daf20ac28c49b5d)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
The recent logging changes for qemurunner showed up as errors on the
autobuilder where decode couldn't be called on the returned string.
Since the code returns binary data, return b'' instead of '' to match
to avoid tracebacks.
One of these cases was newly added, copied from the other which has
been there for a long time, always broken.
(From OE-Core rev: b8995b27db265b0a0b2d2ca595915f70f9f96e07)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This information is useful, but should not be a warning level.
[YOCTO #14382]
(From OE-Core rev: cd17d8bb00be1ecb7c92ab13eb8b162807aefed9)
Signed-off-by: Saul Wold <saul.wold@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
After the addition of the qmp socket, runqemu started failing:
ERROR - Failed to run qemu: qemu-system-aarch64: -qmp unix:/home/yocto/actions-runner-meta-openembedded/_work/meta-openembedded/meta-openembedded/yoe/build/tmp/.3eg5fiid,server,wait:
UNIX socket path '/home/yocto/actions-runner-meta-openembedded/_work/meta-openembedded/meta-openembedded/yoe/build/tmp/.3eg5fiid' is too long
Path must be less than 108 bytes
To avoid this, run qemu within tmpdir and use a relative path to the socket.
This avoids having to patch the socket code within qemu.
Update the client code to chdir and only use a relative path to the socket
to match.
(From OE-Core rev: 5c56e72fca18dc942f5c1fd377e98d46ae0126f1)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Rather than totally disabling the logging, inform it we're about to exit
so we can log messages over the exit cleanly too. This aids debugging. It
also avoids a race where the logging handler could still error whilst
shutting down.
Also remove a race window by notificing the handler of the shutdown
first, before triggering it. This removes a race window I watched in
local testing.
(From OE-Core rev: 0e19f31a1005f94105e1cef252abfffcef2aafad)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
LD_LIBRARY_PATH leaks into host executables too, and breaks them
as they are not uninative-enabled. E.g. on ubuntu 18.04 trying
to run host bash with a sysroot that was built on Fedora 33:
akanavin@ubuntu1804-ty-3:/home/pokybuild/yocto-worker/oe-selftest-ubuntu/build/build-st-24341/tmp/work/x86_64-linux/gnupg-native/2.3.1-r0/recipe-sysroot-native$ LD_LIBRARY_PATH=./usr/lib /bin/bash
/bin/bash: ./usr/lib/libtinfo.so.5: no version information available (required by /bin/bash)
/bin/bash: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by ./usr/lib/libtinfo.so.5)
This was seen e.g. here:
https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/2090/steps/14/logs/stdio
(From OE-Core rev: 0e9850486b74a3de934527ca1077df001d3a8d22)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This adds support for the Qemu Machine Protocol [0] extending
the current dump process for Host and Target. The commands are
added in the testimage.bbclass.
Currently, we setup qemu to stall until qmp gets connected and
sends the initialization and continue commands, this works
correctly. If the UNIX Socket does not exist, we wait an timeout
to ensure to socket file is created.
With this version, the monitor_dumper is created in OEQemuTarget
but then set in OESSHTarget as that's where we get the SSH failure
happens. Python's @property is used to create a setter/getter type
of setup in OESSHTarget to get overridden by OEQemuTarget.
By default the data is currently dumped to files for each command in
TMPDIR/log/runtime-hostdump/<date>_qmp/unknown_<seq>_qemu_monitor as
this is the naming convenstion in the dump.py code.
We use the qmp.py from qemu, which needs to get installed in the
recipe-sysroot-native of the target image.
[0] https://github.com/qemu/qemu/blob/master/docs/interop/qmp-spec.txt
(From OE-Core rev: 42af4cd2df72fc8ed9deb3fde4312909842fcf91)
Signed-off-by: Saul Wold <saul.wold@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
We have a working theory that IO queues on the autobuilder are impacting
runtime testing under qemu, particularly async writes which inice does not
influence. We already pass the snapshot option to qemu which copies the
image and runs out of the copy. Add in the ability to copy the image to
a specificed location which can be a tmpfs. This means that writes to the
image would no longer be blocked by other writes to disk in the system.
Preliminary tests show that this does improve the qemu errors at the expense
of sometimes showing qemu startup timeouts as on a loaded system with a large
test image, it can take longer than 120s to copy the image to tmpfs. Having
a most consistent failure mode for loaded tests is probably desireable though.
(From OE-Core rev: fd1c26ab426c3699ffd8082b83d65a84c8eb8bff)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Some platforms do not use ttyS* for their serial consoles (e.g., qemuarm
and qemuarm64). The hardcoding of this can cause issues. Modify
runqemu to use the serial consoles defined in SERIAL_CONSOLES instead of
hardcoding.
(From OE-Core rev: 9dea4cd2f9f46ab3a75562639a22d8f56b4d26af)
Signed-off-by: Jon Mason <jon.mason@arm.com>
Change-Id: I746d56de5669c955c5e29d3ded70c0a4d3171f17
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Bitbake changed the debug() logging call to make it compatible with
standard python logging by no longer including a debug level as the
first argument. Fix up the few places this was being used.
Tweaked version of a patch from Joshua Watt.
(From OE-Core rev: 5aecb6df67b876aa12eec54998f209d084579599)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Avoid command not found errors shown in selftest logs due to changes to PATH
settings which also risks intermittent problems due to IO load.
(From OE-Core rev: 40bcae01b0be2f293dea9ab42c6b7f8f47827cf5)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Python 3.9 dropped isAlive() so use the preferred is_alive().
(From OE-Core rev: 9bb06428cbb2ac0f3d98a1696f050d3393385503)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
We previously put a sync call into devtool to try and combat the bitbake
timeout issues on the autobuilder. It isn't enough as the timeouts occur
mid test. They are also occurring on non-devtool tests.
Add in sync calls around command execution instead.
(From OE-Core rev: ceca5ed121e2b54415a7ab3a217882e4ea86923a)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Increase the serial login timeout from 60 to 120s. This seems like a
long time, however for a qemumips image with systemd+PAM and openssh,
(e.g. core-image-sato-sdk + DISTRO=poky-altcfg), the getty connects
to systemd's pam module which waits on logind and 45s for all this
to happen at the same time as things like ssh key generation happens
is not unknown.
Increase the timeout to match the longer times we know these things
can take in the worst case scenarios since we're tired of intermittent
issues related to the serial login affecting the autobuilder.
(From OE-Core rev: d8b4292db741de660f756dfb766210814d587b7a)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This is a part of a refactor that will split the package manager
code so that it's possible to use other package managers in other
layers.
RP: Fixes to parse/build
(From OE-Core rev: 510d5c48c0496f23a3d7aede76ea8735da2d371d)
Signed-off-by: Fredrik Gustafsson <fredrigu@axis.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This is part of a refactor that will split the package manager
code so that it's possible to use other package managers in other
layers.
RP: Fixes to parse/build
(From OE-Core rev: 3ef5a3c885e1010cddfe7eba1cd3728f15270d78)
Signed-off-by: Fredrik Gustafsson <fredrigu@axis.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
This is part of a refactor that will split the package manager
code so that it's possible to use other package managers in other
layers.
RP: Fixes to parse/build
(From OE-Core rev: 8b776ed9ed291dd8e112621561762449c7eb5ee2)
Signed-off-by: Fredrik Gustafsson <fredrigu@axis.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
We're seeing failures due to system load. In theory we've set process
nice levels which should compensate for this. Add debugging so we can
find out if they're being correctly applied.
(From OE-Core rev: 1e4e345bba8216b9b5623682206a7dae7cad261c)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
One element of the error message guarded against None as a value
but I missed the other, fix this.
(From OE-Core rev: dbce6baec68d7658453b8c44159e1d1fef746151)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
When qemu fails to start we're struggling to work out why. Add more debug
info which can at least confirm/rule out various things.
This code is only on the error handling path and more info shoudl help
us debug issues.
(From OE-Core rev: 3001d0d8f3429e5ff0c37ea7192e85e7001cdb32)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
The pid location could vary due to changes in cwd as only a filename
is specified, not a full path. This in theory could be resulting in
some of our autobuilder failures. Whilst its difficult to know if this
is causing a problem, Using a full path removes any question of such an
issue.
(From OE-Core rev: 55c186ff410c99570242478b99ac24ebc40aa6bd)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Occasionally we've been seeing leftover threads from runCmd. The stdin test
assumes we clean up all threads but the code assumes that the daemonic thread
can be left behind.
The issue can be reproduced by adding a time.sleep(10) to the end of
writeThread() which will mean it stays resident past the end of the command.
We may as well add it to the threads list and clean it up properly,
hopefully removing the race in the tests from the autobuilder.
[YOCTO #13055]
(From OE-Core rev: 9b251dcaffe52d32c1faf41ab57ab414fbc29722)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
We're seeing:
WARNING: bitbake/lib/bb/cookerdata.py:136: ResourceWarning: unclosed file <_io.FileIO
name='tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/testimage/qemurunner_log.20200601181912'
mode='ab' closefd=True
which can only be caused by the qemu.stop() method not being called.
Tweak the error handling to fix the blanket exception handler which
is likely meaning this function isn't getting called.
(From OE-Core rev: ee707090848d793e3b2d82dd3861ae22222682c0)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
When falling back from detecting ip from /proc/./cmdline the
output of runqemu is acutally
'Network configuration: ip=192.168.7.2::192.168.7.1::255.255.255.0'
which doesn't match the given regex and leading to run failure, although
IP is detectable.
Fix regex by inserting an optional 'ip=' prefix to first IP
(From OE-Core rev: 75f2471d15fab024775c59cb70c54e3f25f9ae72)
Signed-off-by: Konrad Weihmann <kweihmann@outlook.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>