Distributed chrooted pkgsrc bulk builds
Once you are up and running with pkgsrc, one of the most common requests is a way to automatically build a number of packages, either a specific list plus their dependencies, or everything currently available.
There are a number of ways to accomplish this, but for this tutorial I will
concentrate on pbulk, as it is used by a number of pkgsrc developers and
has support for distributed and chrooted builds.
In this example I am building a set of pkgsrc-2013Q2 packages, and I have
tested it on:
- Linux
- OSX
- SmartOS
Please let me know if it doesn’t work correctly on your platform.
Layout
First, have a think about where you will store pkgsrc, source tarballs,
packages, etc. I put everything under /content, which makes it easy to then
mount just that directory and have everything below it available:
- /content/bulklog is where pbulk saves the per-package build logs
- /content/distfiles is where source tarballs are kept
- /content/mk contains some make fragment files for configuration
- /content/packages is the top-level directory of binary packages
- /content/pkgsrc is where pkgsrc is located
- /content/scripts holds some miscellaneous scripts
Pre-create some required directories:
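A minimal sketch of this step, assuming the layout listed above. The function name make_layout is illustrative, and /content/pkgsrc is left out because the checkout creates it.

```shell
# Create the directory layout described above under a given root.
# The root defaults to /content; the function name is illustrative only.
make_layout() {
    root=${1:-/content}
    for dir in bulklog distfiles mk packages scripts; do
        mkdir -p "$root/$dir" || return 1
    done
}
```

Run it as root with make_layout /content, or pass another root to try it out somewhere harmless first.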
Then write a couple of mk.conf files which will be used by the packaging tools.
/content/mk/mk-generic.conf:
/content/mk/mk-pbulk.conf:
/content/mk/mk-pkg.conf:
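The contents of these files are not reproduced here, so the following is only a sketch of the kind of settings a generic fragment might hold. DISTDIR, PACKAGES and WRKOBJDIR are standard pkgsrc variables; the values, the write_mk_generic helper, and the decision to put them in mk-generic.conf are assumptions.

```shell
# Write a minimal make fragment pointing pkgsrc at the shared directories.
# Values are assumptions based on the /content layout, not the original file.
write_mk_generic() {
    conf=${1:-/content/mk/mk-generic.conf}
    cat > "$conf" <<'EOF'
DISTDIR=	/content/distfiles
PACKAGES=	/content/packages
WRKOBJDIR=	/var/tmp/pkgsrc-build
EOF
}
```

Keeping WRKOBJDIR outside /content means build directories stay on local disk rather than on the shared mount.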
Get pkgsrc
Either use git..
..or CVS
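Both options can be sketched as below. The anoncvs server is the standard NetBSD one; the GitHub mirror URL and its branch naming are assumptions, so substitute whichever mirror you actually use.

```shell
# Fetch a pkgsrc branch into <dest>/pkgsrc using CVS.
fetch_pkgsrc_cvs() {
    branch=${1:-pkgsrc-2013Q2}
    dest=${2:-/content}
    (cd "$dest" && cvs -q -z3 -d anoncvs@anoncvs.NetBSD.org:/cvsroot \
        checkout -r "$branch" -P pkgsrc)
}

# Fetch the same branch with git (mirror URL and branch name are assumed).
fetch_pkgsrc_git() {
    branch=${1:-pkgsrc-2013Q2}
    dest=${2:-/content}
    git clone --branch "$branch" https://github.com/NetBSD/pkgsrc.git \
        "$dest/pkgsrc"
}
```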
At the present time (2013Q2) there are a couple of patches you need to apply,
one for mksandbox to support some additional features, and one for pbulk to
support chroots and a couple of other bits we’ve developed at Joyent.
Build pbulk
pbulk needs to be installed to its own prefix, from where it will manage the main build.
Then build the necessary packages
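A sketch of both steps, to be run as root. The --prefix and --mk-fragment options are standard bootstrap options; passing mk-pbulk.conf as the fragment and the build_pbulk name are assumptions.

```shell
# Bootstrap a private pkgsrc into /usr/pbulk, then install pbulk into it.
build_pbulk() {
    pkgsrc=${1:-/content/pkgsrc}
    (cd "$pkgsrc/bootstrap" &&
        ./bootstrap --prefix=/usr/pbulk \
            --mk-fragment=/content/mk/mk-pbulk.conf &&
        ./cleanup) || return 1
    # Use the freshly bootstrapped bmake to build and install pbulk.
    (cd "$pkgsrc/pkgtools/pbulk" &&
        PATH=/usr/pbulk/sbin:/usr/pbulk/bin:$PATH bmake install)
}
```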
It is recommended that builds are done as an unprivileged user, which is normally
named pbulk, so now would be a good time to create that user, usually with
something like this:
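For example, on systems that ship groupadd and useradd, a sketch might look like this. The shell, comment string and function name are assumptions.

```shell
# Create the unprivileged pbulk user and group if they do not exist yet.
create_pbulk_user() {
    if id pbulk >/dev/null 2>&1; then
        return 0    # already present, nothing to do
    fi
    groupadd pbulk &&
        useradd -g pbulk -c 'pbulk user' -s /bin/sh -m pbulk
}
```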
Check the useradd/groupadd syntax for your system. The user can be set to
no-password, as it will only be used via su from the root user.
Set up chroot
Next, check that the mksandbox script works on your system. It is designed to
be cross-platform, but on certain systems (e.g. OSX) there is no native support
for loopback mounts, and so you will first need to configure NFS in order to
share system directories to the chroot, usually by adding
/ -alldirs -maproot=root to /etc/exports and then running nfsd enable.
Once you are happy the chroot is working as expected, write a couple of wrapper scripts to create and delete chroots. Each takes an optional argument giving the name of the chroot, and both will be used by pbulk. Below are the scripts I use.
/content/scripts/mksandbox:
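The original script is not reproduced here; the following is a sketch of what such a wrapper could look like. MKSANDBOX and MKSANDBOX_ARGS are hypothetical knobs: point the first at the mksandbox script from your pkgsrc tree and put any platform-specific options in the second.

```shell
# Sketch of a chroot-creation wrapper; not the original script.
MKSANDBOX=${MKSANDBOX:-mksandbox}

create_chroot() {
    chrootdir=$1
    # Work around the creation race: wait until a previous run has
    # finished removing a chroot of the same name.
    while [ -d "$chrootdir" ]; do
        echo "Chroot $chrootdir exists, retrying in 10 seconds" >&2
        sleep 10
    done
    # MKSANDBOX_ARGS is left unquoted so it can carry several options.
    "$MKSANDBOX" ${MKSANDBOX_ARGS:-} "$chrootdir" || return 1
    # Give the unprivileged build user a home directory inside the chroot.
    mkdir -p "$chrootdir/home/pbulk" &&
        chown pbulk:pbulk "$chrootdir/home/pbulk"
}
```

Saved as a script, it would end with a call such as create_chroot "$1".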
/content/scripts/rmsandbox:
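Again as a sketch rather than the original script: this version assumes the chroot contains the sandbox helper that mksandbox generates, and that calling it with umount releases the mounts. It only removes the directory once nothing is mounted below it.

```shell
# Sketch of a chroot-removal wrapper; not the original script.
delete_chroot() {
    chrootdir=${1%/}
    [ -d "$chrootdir" ] || return 0
    # Retry a few times in case lingering processes hold mounts open.
    for try in 1 2 3; do
        "$chrootdir/sandbox" umount >/dev/null 2>&1
        if ! mount | grep "$chrootdir/" >/dev/null; then
            rm -rf "$chrootdir"
            return 0
        fi
        sleep 5
    done
    echo "Could not unmount $chrootdir" >&2
    return 1
}
```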
Build pkg bootstrap
Next step is to build the bootstrap for the target packages, i.e. the main
prefix you will be using. Again we use the bootstrap script, but here you
may want to tweak the settings - check the pkgsrc guide or the --help
output for more information.
If the prefix you want to build for (e.g. /usr/pkg) is already in use on the
system, simply do the bootstrap inside a chroot.
I use something like this:
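A sketch of such a bootstrap, which also saves a binary kit for later use. The --prefix, --mk-fragment and --gzip-binary-kit options are standard bootstrap options; the kit file name and the use of mk-pkg.conf are assumptions.

```shell
# Bootstrap the target prefix and save a binary kit of the result.
build_bootstrap_kit() {
    pkgsrc=${1:-/content/pkgsrc}
    kit=${2:-/content/packages/bootstrap/bootstrap-2013Q2.tar.gz}
    mkdir -p "$(dirname "$kit")" || return 1
    (cd "$pkgsrc/bootstrap" &&
        ./bootstrap --prefix=/usr/pkg \
            --mk-fragment=/content/mk/mk-pkg.conf \
            --gzip-binary-kit="$kit" &&
        ./cleanup)
}
```

If /usr/pkg is already in use, run this from inside a chroot created with the wrapper script.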
Configure pbulk
Now we’re finally ready to configure pbulk. There is a single configuration file you need to edit, and I will show the changes I have made to it.
This section adds a ulimit to stop runaway processes from hanging the build.
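Since pbulk.conf is read as shell, a limit of this kind can be a single line. The one-hour value below is an assumption; pick one that suits your slowest package.

```shell
# Cap the CPU time any single process may use, in seconds.
ulimit -t 3600
```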
This section configures the location of the bulk build report. I upload my results to Joyent’s Manta object store as it allows arbitrary storage plus distributed Unix queries on the data at a later time.
Turn on reuse_scan_results, as it makes subsequent runs faster.
In this example I am using a single host which will perform concurrent builds
inside chroots, and so I need to unset scan_clients and build_clients and
set master_ip to localhost.
If you have multiple hosts, simply set master_ip to a public address, and add
the list of slave IP addresses to *_clients. They will need to be accessible
via SSH as root from the master, and will need to have their own installs of
/usr/pbulk as well as sharing the same /content mount as the master, most
likely over NFS.
If you wish to completely disable any concurrency or distributed builds, set
master_mode=no, though note that the build will then run completely
single-threaded and will be much slower.
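The three setups above can be sketched as pbulk.conf fragments. The single-host settings are active and the alternatives are commented out; the 192.0.2.x addresses are placeholders.

```shell
# Single host running concurrent chrooted builds.
master_mode=yes
master_ip=127.0.0.1
scan_clients=""
build_clients=""

# Distributed alternative (placeholder addresses):
# master_ip=192.0.2.1
# scan_clients="192.0.2.11 192.0.2.12"
# build_clients="192.0.2.11 192.0.2.12"

# Fully serial alternative:
# master_mode=no
```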
If you wish to publish to Manta, here are the settings you will need. I have
installed a local copy of the Manta tools to /content/manta, as the upload
script will need them.
Configure where to rsync packages to and where to send the report.
If you are not using Manta, then you will want to set report_rsync_target to
an appropriate location.
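As a sketch, the relevant pbulk.conf settings might look like this. The variable names are the stock pbulk.conf ones; the host, paths and email address are placeholders.

```shell
# Where finished packages and the build report are copied with rsync.
pkg_rsync_target="pkgsrc@pkghost.example.com:/srv/packages/2013Q2"
report_rsync_target="pkgsrc@pkghost.example.com:/srv/reports"
# Who receives the build report email.
report_recipients="pkgsrc-bulk@example.com"
```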
Where to find the /usr/pkg bootstrap tarball:
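For instance, using the stock bootstrapkit variable. The file name is a placeholder for whatever your bootstrap run produced.

```shell
# Binary kit that pbulk unpacks before building packages.
bootstrapkit=/content/packages/bootstrap/bootstrap-2013Q2.tar.gz
```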
Configure build chroots. Here we set the paths to the mksandbox and
rmsandbox scripts we created earlier, and provide a basename of the chroot
directory. By setting chroot_dir=/chroot/pkgsrc-2013Q2, pbulk will actually
create /chroot/pkgsrc-2013Q2-build-{1,2,3,4} and
/chroot/pkgsrc-2013Q2-scan-{1,2,3,4}.
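A sketch of these settings, plus a small helper that prints the directory names derived from the basename. chroot_dir is the variable named above; the chroot_create and chroot_delete names and the chroot_names helper are assumptions about the patched pbulk.conf.

```shell
# Chroot settings (chroot_create/chroot_delete names are assumed).
chroot_create=/content/scripts/mksandbox
chroot_delete=/content/scripts/rmsandbox
chroot_dir=/chroot/pkgsrc-2013Q2

# Print the directory names derived from chroot_dir for N workers.
chroot_names() {
    n=$1
    for kind in build scan; do
        i=1
        while [ "$i" -le "$n" ]; do
            echo "${chroot_dir}-${kind}-${i}"
            i=$((i + 1))
        done
    done
}
```

Calling chroot_names 4 lists the eight directories for a four-worker setup.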
You will want to experiment with the trade-offs between MAKE_JOBS and the
number of chroots. Generally it is better to add more chroots than to
increase MAKE_JOBS, as certain parts of the build will be single-threaded
anyway (e.g. large configure scripts). However, you also need to be
aware of the increased disk I/O caused by too many chroots.
As long as you have everything correctly shared, there is nothing stopping you using distributed hosts and chroots, and it is highly recommended if you can as clearly it provides the best performance. With such a setup, at Joyent we are able to do full bulk builds of all 12,000 packages in pkgsrc in under 12 hours.
Finally, configure paths to the ones we have chosen.
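With the /content layout, a sketch of the path settings could be as follows. The variable names are the stock pbulk.conf ones; the packages subdirectory is an assumption.

```shell
# Point pbulk at the layout chosen earlier.
bulklog=/content/bulklog
packages=/content/packages/2013Q2
prefix=/usr/pkg
pkgsrc=/content/pkgsrc
```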
One option not mentioned above is limited_list. If you only want to build a
subset of packages rather than run a full bulk build, simply set limited_list
to a file containing paths to packages you want. It is worth doing this
initially anyway, just to check that everything is working fine, e.g.:
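A sketch of a small trial list. The list location, the helper name and the chosen packages are examples only.

```shell
# Write a short package list for a trial run, one pkgsrc path per line.
write_pkglist() {
    list=${1:-/content/mk/pkglist}
    cat > "$list" <<'EOF'
shells/bash
net/rsync
editors/vim
EOF
}
# Then, in pbulk.conf:
# limited_list=/content/mk/pkglist
```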
Run the bulk build
Assuming everything was done correctly, it should now just be a matter of
running the bulkbuild. If you have set the chroot_* variables then this will
run chrooted at the appropriate places, so that your host system’s /usr/pkg
is not affected.
One of the benefits of the Joyent patch is that it adds support for different
configuration files, so if you really want to you can run concurrent instances
of pbulk. Just write separate pbulk.conf files and then pass them as
arguments to bulkbuild. Again, we use this to run multiple builds across the
same hosts, all thanks to the chroot support.
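A sketch of running several instances side by side. Passing a configuration file as an argument relies on the patched pbulk described above; the BULKBUILD override and the run_bulkbuilds name are hypothetical conveniences.

```shell
# Launch one bulkbuild per configuration file and wait for all of them.
BULKBUILD=${BULKBUILD:-/usr/pbulk/bin/bulkbuild}

run_bulkbuilds() {
    status=0
    pids=""
    for conf in "$@"; do
        "$BULKBUILD" "$conf" &
        pids="$pids $!"
    done
    # Collect each exit status so one failed build is not masked.
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return $status
}
```

For a single build with the default configuration, running /usr/pbulk/bin/bulkbuild on its own is enough.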
Caveats
There are some known issues, which I will document here as they are found.
OSX chroot DNS resolution
On OSX, name resolution is broken inside a chroot. This is due to
mDNSResponder being used for DNS lookups, which relies on the
/var/run/mDNSResponder UNIX socket. Unfortunately, making that socket
available in the chroot (either by mounting or creating a proxy with socat)
does not fix the issue, so I would welcome input on this.
For now you need to set MASTER_SITE_OVERRIDE and then ensure that the IP
address for that mirror is set in /etc/hosts.
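A sketch of that workaround. MASTER_SITE_OVERRIDE is a standard pkgsrc variable; the pin_mirror helper is hypothetical, and the mirror URL, host name, address and file locations are all supplied by the caller.

```shell
# Pin the distfile mirror to a fixed address so fetches work without DNS.
# Arguments: mk.conf file, hosts file, mirror URL, mirror host, IP address.
pin_mirror() {
    mkconf=$1 hosts=$2 url=$3 host=$4 addr=$5
    printf 'MASTER_SITE_OVERRIDE=\t%s\n' "$url" >> "$mkconf"
    # Only add the hosts entry once.
    grep "[[:space:]]$host\$" "$hosts" >/dev/null 2>&1 ||
        printf '%s\t%s\n' "$addr" "$host" >> "$hosts"
}
```

The hosts file to edit is the one the chroot sees, with the mirror's real address in place of any placeholder.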
Chroot creation race conditions
As you can see in my example mksandbox script, I have to work around a race
condition where a previous scan run may still be cleaning up whilst a new one
is starting. For now I am simply sleeping until the chroot is free, but this
should be fixed properly, probably with process groups and waiting for them to
complete.