The Perpetual Notion: portage


20090223

Bundled Libraries in XBMC and Boxee

Recently, I became interested in building Boxee and XBMC, so I wrote up a quick ebuild, as I usually do with packages that I like to build. Incidentally, this happened on the same day that Mike Frysinger (a.k.a. vapier) submitted the media-tv/xbmc-9999 ebuild to the Portage tree. A couple of bugs quickly arose, namely #198849, in which there was mention of a post by Diego (a.k.a. flameeyes) where he warns about the use of bundled libraries.

Now, first of all, the term 'bundling a library' refers to the act of taking a snapshot of the source code of an open-source library, e.g. a jpeg library, and building it into a separate application. While that application is in development, the upstream branch of the jpeg library can evolve and patch vulnerabilities, etc., while the snapshot used in the application is not updated. Bundling a library is essentially 'forking' the project. Diego mentions that there are several reasons not to do this, including security reasons.

However, my main argument against bundling libraries is that

bundling libraries makes tracking changes very, very, very difficult!

New projects may very well feel that it is necessary to bundle libraries at the beginning of the project to avoid breakage due to API changes, and that has some merit. However, I would like to stress that one of the initial goals of a new open-source project should be to remove those bundled libraries and isolate which versions of external software packages are required. When that is done, changes can be made much more easily and incrementally, making only tiny ripples in the 'dependency pond' as opposed to large waves.

When a project gets to be as large as Boxee, with a dozen or more bundled libraries, at a point in its development cycle that is so close to a major release, the job of reducing and upstreaming the respective changesets of each individual bundled library becomes enormous. I can say honestly, because I am currently packaging both XBMC and Boxee, that it is ugly, and I don't want to be stuck with the job of committing upstream changesets. Furthermore, XBMC / Boxee will not achieve an acceptable, stable state in the Portage tree of Gentoo (and potentially other distros) until they stop bundling libraries.

My advice to the maintainers of these projects is to 'prune' their trees one bundled library at a time. Please help to reduce package maintainer headaches.
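
To get a feel for how large the changeset of a single bundled library is, one can diff the bundled snapshot against a pristine upstream release of the same version. This is only a minimal sketch, assuming both trees are already unpacked locally; the function name is hypothetical and is not taken from the XBMC or Boxee sources:

```shell
# Print a unified diff between a pristine upstream tree and the bundled copy.
# Usage: bundled_changeset <upstream-dir> <bundled-dir>
bundled_changeset() {
    # diff exits 1 when the trees differ, which is the expected case here;
    # only an exit status of 2 or more (trouble) is treated as a failure.
    diff -ruN "$1" "$2" || [ $? -eq 1 ]
}
```

Redirecting the output to a file gives a patch that can be split up and offered upstream, and an empty result means that the bundled copy can simply be replaced by a dependency on the system library.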

20080724

Bin-Ebuilds for Grails, Groovy, Gant, & Eclipse-3.4

Eclipse-3.4 has been out for some time. Unfortunately, the Gentoo-Java maintainer was devaway for a while and couldn't make an ebuild to build Eclipse-3.4 from source. Hence Gentoo bug #229609.

I say "to hell with building Eclipse from source" ... especially not for me, being an EEE PC user :)

For those who would just like the binary package for Eclipse 3.4 (eclipse-sdk-bin), and for those who may also be interested in groovy-bin, grails-bin, or gant-bin, try this out:

echo "dev-util/eclipse-sdk-bin" >> /etc/portage/package.keywords
PORTAGE_BINHOST="http://virtb.visibleassets.com:2080/geeentoo/packages/All" \
FEATURES="getbinpkg" \
emerge -Kav1 =dev-util/eclipse-sdk-bin-3.4

For installing dev-java/grails-bin, dev-java/groovy-bin, or dev-java/gant-bin simply substitute the package name in the commands above.

Note: dev-java/grails-bin already contains precompiled Gant and Groovy jars, in order to run grails, but it doesn't support Groovy or Gant from the command line. If you need a command-line Groovy or Gant, then install groovy-bin or gant-bin as well.

As usual, you can download any of my Portage overlays with the command below (requires app-portage/layman).

layman -o http://virtb.visibleassets.com:2080/layman.conf -L -v --nocolor | less

Then use the '/' search function of the pager application 'less' to look for the string 'virtb' and you should be able to see my repositories. I guess that for security purposes, and because mine are not official Gentoo overlays, they are not displayed by default unless you pass the '-v' option to layman.

You can add the overlay with

layman -o http://virtb.visibleassets.com:2080/layman.conf -a [overlayName]

Currently, overlayName can be one of "vuze-bin, eclipse, grails-groovy-gant, eee". If you'd like to check out any of my other overlays (mainly for ARM development), try using http://vaiprime.visibleassets.com/~cfriedt/layman.conf .

20080611

How-To: The Full Portage Tree on the EEE PC

Gentoo is often considered to be 'bloated' because the Portage tree takes up at least 500 MB on disk. Depending on the filesystem, actual usage can sometimes reach 750 MB!

On a UMPC such as the EEE, with very limited hard-disk space, 750 MB is far over the limit of acceptability.

My first solution was simply to use a binary Gentoo system. The Portage tree was not necessary as long as an internet connection was available and a suitable binary package repository was configured. That has actually been working incredibly well and I have no complaints yet whatsoever. However, I do occasionally like to look into the Portage tree for examples on creating ebuilds when I'm doing custom software packaging, so I thought it would be nice to have it wherever I take my EEE.

Then someone on EEE-User mentioned using SquashFS for the Portage tree. This made absolute sense, because the Portage tree did not need to be updated frequently at all, and could easily be made read only. SquashFS enabled me to have the benefits of a source-based Gentoo distribution on my EEE but compressed the Portage tree from 700 MB to 42 MB !!!
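
As a quick worked example of what those two figures mean, here is a small sketch that computes the percentage of space saved. The function name is made up for illustration; the numbers are the ones quoted above.

```shell
# Percentage of space saved when going from <before> to <after>.
# Both arguments must use the same unit; integer arithmetic only.
pct_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

pct_saved 700 42   # prints 94
```

The same function can be fed the sizes reported by 'du -sm' for the extracted tree and for the finished image.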

The following 5 steps will demonstrate how easy it is to use Gentoo - even with its "bloated" Portage tree - on the EEE PC.

Note: I performed these steps on a modified EEE PC with 2GB of physical RAM, which explains how I could mount 768MB of RAM as tmpfs. If you have less than 2GB of physical RAM, then I would suggest making the SquashFS Portage image on a regular desktop computer running Gentoo Linux.
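
Before mounting a 768 MB tmpfs it is worth checking how much RAM the machine actually reports. The following is only a sketch: the function name is hypothetical, and the 900 MB threshold in the usage note is my own rough assumption, not a measured requirement.

```shell
# Print total physical RAM in MB, read from a /proc/meminfo-style file.
# Usage: mem_total_mb [meminfo-file]   (defaults to /proc/meminfo)
mem_total_mb() {
    # The MemTotal line looks like "MemTotal:  2048000 kB".
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}
```

For example, [ "$(mem_total_mb)" -ge 900 ] || echo "build the image on another machine" would warn before Step 3 is attempted on a machine with too little memory.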

Step 1: Find a portage mirror
You can find all of the Gentoo mirrors on the official Gentoo mirror list. I use

MIRROR="http://gentoo.mirrors.tera-byte.com"

Step 2: Install squashfs-tools

emerge -av1 squashfs-tools

Step 3: Download and Extract the Latest Portage Snapshot

mkdir -p /tmp/tmp2
mount -o size=768m -t tmpfs none /tmp/tmp2
wget -O - "${MIRROR}"/snapshots/portage-latest.tar.bz2 | tar xpvjf - -C /tmp/tmp2

Step 4: Create the SquashFS Image

mksquashfs /tmp/tmp2/portage /tmp/tmp2/portage.sqfs
mv /tmp/tmp2/portage.sqfs /usr

umount /tmp/tmp2
rmdir /tmp/tmp2

Step 5: Create init.d and conf.d entries to simplify or automate mounting

/etc/init.d/portage-squashfs:

#!/sbin/runscript
# Copyright 1999-2007 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2

depend() {
    need localmount
}

# Verify that the conf.d variables are set and that the kernel, the image
# and the mount point are all usable before attempting to mount.
checkopts() {
    local var
    for var in "${MTPNT}" "${PSQFS}"; do
        if [ -z "${var}" ]; then
            eerror "one of the necessary variables was not defined"
            return 1
        fi
    done

    if ! grep -q squashfs /proc/filesystems; then
        eerror "SquashFS is not supported by your kernel"
        return 1
    fi
    if [ ! -e "${PSQFS}" ]; then
        eerror "${PSQFS}: No such file or directory"
        return 1
    fi
    if [ ! -d "${MTPNT}" ]; then
        eerror "${MTPNT}: No such file or directory"
        return 1
    fi
}

start() {
    local mtopts="-t squashfs -o loop,ro"

    ebegin "mounting ${PSQFS} at ${MTPNT}"
    # Use a brace group, not a subshell: 'return' inside ( ... ) would only
    # leave the subshell, and the mount would still be attempted.
    checkopts || { eend 1; return 1; }
    mount ${mtopts} "${PSQFS}" "${MTPNT}"
    eend $?
}

stop() {
    ebegin "unmounting ${MTPNT}"
    umount "${MTPNT}"
    eend $?
}

# vim:ts=4

/etc/conf.d/portage-squashfs:

MTPNT=/usr/portage
PSQFS=/usr/portage.sqfs

Lastly, don't forget to make the init script runnable:

chmod +x /etc/init.d/portage-squashfs
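
If the service refuses to start, the kernel check is the first thing to look at. Here is a small stand-alone sketch of that check which can be run by hand. The function name is hypothetical, and unlike the plain grep in the init script it matches the filesystem name exactly rather than as a substring.

```shell
# Succeed if the named filesystem is listed in a /proc/filesystems-style file.
# Usage: fs_supported <fs-name> [filesystems-file]
fs_supported() {
    # Each line is either "nodev<TAB>name" or "<TAB>name", so the
    # filesystem name is always the last whitespace-separated field.
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}
```

Running fs_supported squashfs || echo "load or build the squashfs module first" gives a quick answer before the init script is involved at all.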

Notes:

Extracting the portage tree and creating the SquashFS image in /tmp will work only if you have >= 1GB of RAM, approximately. I was using the EEE PC 8G, which comes with 1GB of RAM and there were no problems at all. Alternatively, if you have >= 800MB on your root device, then you could perform the operation there, but it would be much slower. If another desktop or laptop PC is available, a better alternative is to build the portage.sqfs file on the other machine and then copy it on to your EEE.

Also, it should go without saying that you will need to have root permissions to do this - use 'sudo -s'

You should also be aware that this will make /usr/portage read-only. Therefore, in /etc/make.conf, set DISTDIR="/tmp/distdir" and PKGDIR="/tmp/binpkgs", or something similar.

20080325

Having Fun with Embedded Gentoo on the TS72xx SBC

Hi everyone!

Having left Kiel last week and after a week of being back in Montréal doing some hard work, I've gotten my toolchains, overlays and binary package repositories for the TS72xx board to a very stable state.

I'm now using dropbear, openvpn, jamvm-1.5.1, classpath-0.97.1, and qmerge, thanks to the ease of building binary packages using Gentoo/Portage, xmerge, etc. Here are a couple of screenshots that can hopefully do justice to all of the work I've done with the 'compatibility' toolchain. The compatibility toolchain and binaries have been tested to be fully compatible with the kernel, libc, and so on distributed by Technologic Systems.

The toolchains, overlays, and repositories are for both arm-unknown-linux-gnu and armv4tl-maverick-linux-gnueabi. I've also updated the documentation on the Gentoo-Wiki:

Gentoo for the TS72xx (Old Toolchain)
Gentoo for the TS72xx Single Board Computer (Full Distro)

I had to fix the __NR_waitpid definition (removed it) in the Linux kernel headers, which solved a lot of glibc-related issues I was having.

Next, I compiled busybox, etc, etc, got qmerge working on the board, and I've also compiled jamvm-1.5.1 and the classpath-0.97.1 . There was a slight problem with cross-compiling the GNU classpath shown below in the classpath bug.

One really strange problem that I had was with portage-utils. The md5sum code comes directly out of the busybox source code. The md5sum command works perfectly on my board, but as is, the md5 calculations that qmerge made were always wrong. I didn't really want to spend too much time fixing the source code, so instead, I just created a patch for qmerge so that it calls 'md5sum' through popen. That fixed the problem, although the overhead is obviously very high. I should probably put in a patch since Ned Ludd, the original portage-utils author, has been so helpful to me on the gentoo-embedded list. Thanks again Ned ;-)
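
The actual patch is C code inside qmerge and lives in my overlay. The shell sketch below is not that patch; it only illustrates the same idea of delegating the digest to the external 'md5sum' binary and comparing the result. The function name is hypothetical.

```shell
# Compare a file's MD5 digest, as computed by the external md5sum binary,
# against an expected hex string. Returns 0 on match, non-zero otherwise.
# Usage: md5_matches <file> <expected-hex-digest>
md5_matches() {
    local actual
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$1" | cut -d' ' -f1)
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

The cost is one extra process per checked file, which is the overhead mentioned above.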

I also noticed that busybox wouldn't reboot the system (using the older toolchain) so I had to make a patch which used the busybox-1.0.0 code for 'reboot'.

I'm hoping that some autotools / libtool genius decides to fix the classpath bug below, because I really don't want to spend any more time on it than I have to. The autotools should really _not_ abandon supporting cross-toolchains. There should really be a regular check made whenever AC_CHECK_LIB or AC_SEARCH_LIBS is invoked to see if the library location pulled from the .la file actually reflects the location of the .so or .a library in question.
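
The kind of check I have in mind can be sketched in a few lines of shell. This is not part of autoconf or libtool; the function name and the optional sysroot argument are hypothetical. It relies only on the fact that .la files are plain text containing shell-style assignments such as libdir='...', library_names='...' and old_library='...'.

```shell
# Succeed if at least one library named in a .la file really exists in the
# libdir recorded there, optionally below a cross-compilation sysroot.
# Usage: la_libdir_ok <file.la> [sysroot]
la_libdir_ok() {
    local la_file="$1" sysroot="${2:-}" libdir names old name
    libdir=$(sed -n "s/^libdir='\(.*\)'\$/\1/p" "$la_file")
    names=$(sed -n "s/^library_names='\(.*\)'\$/\1/p" "$la_file")
    old=$(sed -n "s/^old_library='\(.*\)'\$/\1/p" "$la_file")
    # Without a recorded libdir there is nothing to verify.
    [ -n "$libdir" ] || return 2
    # library_names is a space-separated list, so it is left unquoted.
    for name in $names $old; do
        if [ -e "${sysroot}${libdir}/${name}" ]; then
            return 0
        fi
    done
    return 1
}
```

When cross-compiling, a .la file that records libdir='/usr/lib' while the library actually lives under the target sysroot is exactly the mismatch this catches.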

As always, see my overlays for all patches to the original source code.

Some bugs filed:
Gentoo Bug #213690
Classpath Bug #35684

20080211

Follow Up to EABI / Maverick Userland

Hi Everyone,

I left off with instructions for creating an eabi / maverick toolchain for the ts72xx boards from EmbeddedARM.com, and am still sifting my way through some pretty dense decisions.

First of all, Gentoo's Portage build system makes it really easy to cross-compile packages and maintain dependencies. I think it's fantastic. The only tricky part is in how the package database is maintained on-board. It's not even a database actually, it's right in the filesystem. Depending on the block size, even a 2-byte file could take up 4 kB of space, which is far more than is reasonable for an on-board package database on an embedded device.

I've considered writing some sqlite code into the 'qmerge' app that comes with portage-utils. The benefit is that the sqlite file would save potentially dozens of kB of space. Furthermore, the sqlite file could be compressed in the filesystem after updates have taken place. When any package maintenance needs to be done, the sqlite database could be extracted to a tmpfs-mounted directory, and all of the operations could be performed there.
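
The block-size overhead is easy to work out. A minimal sketch, with a made-up function name, that rounds a file size up to whole filesystem blocks:

```shell
# Bytes actually allocated for a file of <size> bytes on a filesystem
# with <block> byte blocks (size rounded up to a whole number of blocks).
# Usage: allocated_bytes <size> <block>
allocated_bytes() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

allocated_bytes 2 4096   # prints 4096
```

This ignores inode overhead and any filesystem that packs tiny files together, so treat it as an upper-level estimate rather than an exact figure for a particular filesystem.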

An alternative is simply to use the current filesystem-based package db and compress that directory, without the sqlite backend. Then, when updates are necessary, it could be extracted to tmpfs and re-compressed when done.
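
A rough sketch of that second approach, using nothing more than tar and gzip. The function names are hypothetical. Portage keeps its installed-package database under /var/db/pkg, but the paths are passed as arguments so the idea can be tried on a scratch directory first.

```shell
# Pack a package-database directory into a compressed archive.
# Usage: pkgdb_pack <db-dir> <archive.tar.gz>
pkgdb_pack() {
    tar -czf "$2" -C "$1" .
}

# Unpack the archive into a working directory (e.g. a tmpfs mount).
# Usage: pkgdb_unpack <archive.tar.gz> <work-dir>
pkgdb_unpack() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

On the board, the working directory would be a tmpfs mount, and the archive would be re-created with pkgdb_pack once the package operations are finished. Keep the archive outside the directory being packed.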

Time is of the essence on my latest project, so it's likely that I won't be able to fully code that into portage-utils until my current project is matured and nearing completion. The matured stage is probably 1 month from now, and the completion stage is probably closer to 2 months away.

I'll post my instructions, as planned, on gentoo-wiki.com for doing cross-compilation and building up to a certain stage, but more instructions on how to maintain packages on a live system may be delayed for some time.

20071116

P2P Distributed Filesystem for Portage Binary Packages

After my latest post about the exponential size of a potential Gentoo / Portage database of binaries (indexed by use flags, build dependencies, etc) I just came up with a fairly interesting idea.

Who else (who we all know and love) has as much (and likely far more) data to index? Obviously, Google ;-) Google's method of indexing data relies on their distributed filesystem.

So why couldn't binaries based on Portage ebuilds be indexed in such a fashion? Well, since the volume of data, indexed by use flags, build dependencies, etc, would be so massive, it's unlikely that any single, community-driven server could host such data alone.

If the community was involved, though, it wouldn't be too far fetched to make the binary-distribution distributed filesystem available on Peer-to-Peer networks. The same hashing technique could be used for each of the various packages that are being built. Furthermore, as has been pointed out by others already, the environmental impact that Portage has is probably intriguing, considering that every package installed by the average user is compiled from source.
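
To make the hashing idea a little more concrete, here is a sketch of how a lookup key for one binary package could be derived from the package version and its USE flags. Everything here is an assumption for illustration: the function name, the choice of SHA-256, and the key layout are not part of Portage or of any existing P2P network.

```shell
# Derive a stable lookup key from a package atom and its USE flags.
# The flags are sorted first, so their order on the command line
# does not change the key.
# Usage: binpkg_key <category/package-version> [use-flag ...]
binpkg_key() {
    local atom="$1" flags
    shift
    flags=$(printf '%s\n' "$@" | sort | tr '\n' ' ')
    printf '%s %s' "$atom" "$flags" | sha256sum | cut -d' ' -f1
}
```

Two peers that built the same version with the same flags would compute the same key, while changing a single flag (say 'gtk' versus '-gtk') yields a different one. A real scheme would also have to fold in the build dependencies mentioned above.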

And let's be honest with each other - the bottom line in using software is using it, and not building it.

++ for P2P networking AND Gentoo :)

Update [ 2007-11-30 ]: I also mentioned this once on Daniel Robbins' blog - Funtoo

20071103

Upgrading to Gnome 2.20 with Gentoo

Hi Everyone,

I thought I would post my experiences upgrading to Gnome-2.20 on my Gentoo box. In general, the upgrade was painless, but there were a few pitfalls that can be easily avoided.

This post was originally written on November 3rd, 2007, and most of the Gnome-2.20 packages were marked ~x86.

Add these entries to /etc/portage/package.keywords.
[ Updated: 2007-11-14 ]
[ Updated: 2007-11-16 ]
Due to bug #196621, you will need to edit your yelp USE flags in /etc/portage/package.use:

gnome-extra/yelp -debug

Then run the upgrade:

emerge -avD '>=gnome-base/gnome-2.20'

Ta-da!

[ Update: 2007-11-16 ]
Note: After performing the initial upgrade, then performing an emerge --sync at a later date, some of the package versions may have disappeared. Since all packages in the above file use absolute package versions, another attempt to do an emerge -D --with-bdeps y '>=gnome-base/gnome-2.20' could result in slot-version collisions within emerge (see bug #199359).

Lastly, I'm not sure if this will be the case for everyone, but I had to re-emerge gok, notification-daemon, gnome-applets, and gnome-python-desktop because gnome-applets was emerged in improper order, pulling in a previous gnome library. You can verify this by running revdep-rebuild -pv before re-emerging these packages.

emerge -av1 gok notification-daemon gnome-applets gnome-python-desktop

Currently, I'm not using the binary ATI drivers (ati-drivers-8.42.3), but I'll test it out and see if there are any conflicts.

Unfortunately, I would also like to upgrade to >=xorg-x11-7.3 / >=xorg-server-1.4, but ATI's current drivers for x86 will not work with >=xorg-server-1.4. Since I'm only using the open source radeon driver at the moment, I'll give that a shot before I test out ati-drivers-8.42.3.

Happy Gnome-ing :)

20071030

Restore-Points in Gentoo Linux

Today I was very tempted to upgrade my Gnome desktop to version 2.20, which is in the Portage tree, but not yet marked stable. Then I thought to myself - remember what happened the last time you attempted something like this? I spent the entire day simply trying to figure out which packages to downgrade.

Gentoo's Portage package management system is excellent because of the fine-grained control provided to the administrator over which packages and which options (USE flags) are installed. Again, on the plus side, a local repository of binary packages can be created (with FEATURES=buildpkg) so that packages need only be compiled from source at least once. I write 'at least once' because a package will need to be recompiled whenever a version number changes (obviously) or whenever the USE flags are changed. None of us really like that, but it's a fact of life.

Now, taking a slightly deeper perspective into the concept of a local repository, there are a few different ways of organizing this. If you think of all of the variables involved with the instantaneous state of a portage-based system, ignoring overlays for now, the collection of all installed packages becomes a (lengthy) one-dimensional 'tuple' at any particular point in time.

Now, let's say that each binary package stores all of its dependencies and USE flags when it is created. Then, theoretically, we could do a complete re-installation of all of the packages on a system with the binary packages alone.

In this repository, for the sake of sparing disk-space, we would likely just dump unique binary packages in a position that would be written down in a massive table. From the simplest perspective, we could simply duplicate the entire table (actually, a look up table) whenever a package or USE flag had changed.

That would probably end up being a horribly inefficient waste of space. At the completely opposite end of the spectrum, one could account for each package individually and create variable, off-shooting dimensions of each tuple element whenever a new version was introduced or a USE flag was changed.

When a USE flag is changed, so are the dimensions of each of the elements in those tuples. In fact, what we would observe is a hyper-volume of data pointers where each element is of varying dimension and size. If one was to attempt to visualize this, it would look something like a fractal hyper-image, with the residuals of each change dying off after some amount of forking.

The obvious trade-off is complexity versus storage space. Since this repository would only be storing tuples of locations on disk, the storage space might not be so high, although the size of the hyper-volume would increase exponentially with each new package added. On the other hand, the 'fractal' approach would be much harder to navigate (there might be some way to organize it in a hashing system). I'm not sure if it would be faster or slower.

In any event, one would need to call a tuple out of this repository and then re-install the binary files. The recorded tuple would then be a restore point.
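
The simplest possible version of such a tuple is just a sorted list of installed package versions with their USE flags. The sketch below reads that from the layout of Portage's installed-package database (/var/db/pkg/<category>/<package-version>/, with the flags in a file named USE). The function name is hypothetical, and this only records a restore point; it does not re-install anything.

```shell
# Print one line per installed package: "category/package-version<TAB>USE flags".
# Usage: restore_point [vdb-dir]   (defaults to /var/db/pkg)
restore_point() {
    local vdb="${1:-/var/db/pkg}" d category pv use
    for d in "$vdb"/*/*/; do
        [ -d "$d" ] || continue
        d=${d%/}
        pv=${d##*/}
        category=${d%/*}
        category=${category##*/}
        use=""
        if [ -f "$d/USE" ]; then
            # The USE file holds a single line of space-separated flags.
            read -r use < "$d/USE" || true
        fi
        printf '%s/%s\t%s\n' "$category" "$pv" "$use"
    done | sort
}
```

Saving the output to a file before an upgrade, and later piping a fresh run into 'diff <saved-file> -', shows exactly which versions and USE flags would have to be put back from the binary package repository.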
