bitbake: siggen: Fix insufficient entropy in sigtask file names

Signature generation uses mkstemp() to get a file descriptor to a unique
file and then writes the signature into it. However, the unique file name
generation in glibc is based on the system timestamp, which means that
with highly parallel builds a conflict between two different builder
nodes is more likely than one might expect. When operating over NFS (such
as a shared sstate cache), this can cause race conditions and rare
failures (particularly with NFS servers that may not correctly implement
O_EXCL).

The signature generation code is particularly susceptible to races since
a single "sigtask." prefix is used for all signatures from all tasks,
which makes collisions even more likely.

To work around this, add an internal implementation of mkstemp() that
adds additional random characters to the file name, so that a collision
no longer depends on time and becomes extremely unlikely.
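As a rough illustration of how much headroom the extra characters give, the sketch below estimates the size of the name space added by the patch (20 characters drawn from a 62-symbol alphabet) and a birthday-bound collision probability. It assumes the characters are uniformly distributed and drawn independently on every node, which is an idealisation; the function names and the figure of one billion files are hypothetical and chosen only for the example.

```python
import math
import string

# Alphabet size and length mirror the patch: letters plus digits, k=20.
ALPHABET = string.ascii_letters + string.digits
LENGTH = 20


def entropy_bits(alphabet_size, length):
    """Bits of entropy in a uniformly random string of the given length."""
    return length * math.log2(alphabet_size)


def collision_probability(n_files, alphabet_size, length):
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2N))."""
    space = alphabet_size ** length
    return -math.expm1(-n_files * (n_files - 1) / (2 * space))


bits = entropy_bits(len(ALPHABET), LENGTH)
# Hypothetical workload: one billion temporary files sharing one prefix.
p = collision_probability(10**9, len(ALPHABET), LENGTH)
print(f"{bits:.1f} bits of added entropy, collision probability ~{p:.1e}")
```

Under these assumptions the added component alone carries roughly 119 bits, so the collision probability stays negligible even for very large numbers of files.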

(Bitbake rev: 97955f3c1c738aa4b4478a6ec10a08094ffc689d)

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Joshua Watt
2022-08-03 09:04:41 -05:00
committed by Richard Purdie
parent c58fca98e4
commit fdc9d9e4fc
2 changed files with 22 additions and 1 deletions

@@ -425,7 +425,7 @@ class SignatureGeneratorBasic(SignatureGenerator):
                 bb.error("Taskhash mismatch %s versus %s for %s" % (computed_taskhash, self.taskhash[tid], tid))
                 sigfile = sigfile.replace(self.taskhash[tid], computed_taskhash)
 
-        fd, tmpfile = tempfile.mkstemp(dir=os.path.dirname(sigfile), prefix="sigtask.")
+        fd, tmpfile = bb.utils.mkstemp(dir=os.path.dirname(sigfile), prefix="sigtask.")
         try:
             with bb.compress.zstd.open(fd, "wt", encoding="utf-8", num_threads=1) as f:
                 json.dump(data, f, sort_keys=True, separators=(",", ":"), cls=SetEncoder)

@@ -28,6 +28,8 @@ import signal
 import collections
 import copy
 import ctypes
+import random
+import tempfile
 from subprocess import getstatusoutput
 from contextlib import contextmanager
 from ctypes import cdll
@@ -1754,3 +1756,22 @@ def is_local_uid(uid=''):
             if str(uid) == line_split[2]:
                 return True
     return False
+
+def mkstemp(suffix=None, prefix=None, dir=None, text=False):
+    """
+    Generates a unique filename, independent of time.
+
+    mkstemp() in glibc (at least) generates unique file names based on the
+    current system time. When combined with highly parallel builds, and
+    operating over NFS (e.g. shared sstate/downloads) this can result in
+    conflicts and race conditions.
+
+    This function adds additional entropy to the file name so that a collision
+    is independent of time and thus extremely unlikely.
+    """
+    entropy = "".join(random.choices("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890", k=20))
+    if prefix:
+        prefix = prefix + entropy
+    else:
+        prefix = tempfile.gettempprefix() + entropy
+    return tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=dir, text=text)