Sat Sep 26 14:01:52 CEST 2009

Gentoo usage in commercial environments

Just sent this to the gentoo-dev@ mailing list in the hope of getting some excellent propaganda material out of it:

[snip]
Hello everybody!

As Gentoo approaches its 10th birthday I've been wondering how and where it is used. We used to have some great stories from companies in the weekly newsletter, but that went dormant a while ago.

I'd like to collect your success stories, endorsements and case studies so we can show the rest of the world how using Gentoo makes your life easier and is totally awesome. If you don't want that information to be public I'll gladly anonymize it, as long as I can be reasonably certain that you really exist. What is important is that you, if you actively use it in a commercial environment, write me whatever you think is important. Or motivate someone you know to write it. Make your contribution to making things better :)

Everything from "I use it and it's great!" to a story starting on a rainy day in November 1885 is good. Don't be afraid, I'll work with you on making it into something readable.
And if you have specific criticism I'll take that too - maybe we can find an easy way to improve things. That is in your best interest too, so go ahead. Invest a few minutes of your time so we can save you more time!
[snip]

If you think you can contribute something just send me an email and I'll see what I can do.

Posted by Patrick | Permalink

Fri Sep 25 14:11:27 CEST 2009

Codeswarm revisited

Some people have asked me under what license I published my CodeSwarm video.
Usually I don't care, but to make it clear:

This video is now officially licensed under the Creative Commons Attribution-Share Alike license.
To make things more interesting I've published the CVS log and preprocessed XML used to render the animation. As they are just created by automated scripts I see no creative input worth a copyright, so they are released as-is.

The config file etc. are trivial to create, so just play around and see what happens. It's mostly self-explanatory :)

That's all for now.

Posted by Patrick | Permalink

Thu Sep 24 03:09:24 CEST 2009

Gentoo Codeswarm

When I saw people do codeswarms of amarok and banshee and other small projects I thought "Hey, I can do that too!"

Now, after some work, and lots of CPU-cycles used, I have some results to show. And they are quite ... nice? awesome? For you to decide.

It all begins with pulling the data from cvs. Since I was in a hurry I did a naughty thing: check out from anoncvs.gentoo.org, then run cvs log > logfile.
That is naughty because it causes a huge load on the server. Polite people will create a local copy of the repository and run their own cvs server.

After ca. 50 minutes I had a logfile of 456926138 bytes, or roughly 450MB. This gets transformed into a list of events for codeswarm. The transformation itself is quite fast, but takes lots of RAM. In this case just over a gigabyte. So there I was with a 220MB XML dump to feed into the renderer ...

The first attempt died nicely because the default of using 1GB RAM cripples the render speed. Well, actually, my first first attempt died a horrible death because some [censored] in the Java world don't check what dynamic objects they load. And this means I had to run it all in a 32bit chroot. Grate phun. Grrr :D

Processing that amount of data looked quite reasonable for the first hour or so. Then it slowed down quite a bit, then even more. Increasing the memory helped a lot - I just let the JVM have 3.5G (it's a crippled 32bit process after all) and it effectively used ~2.5G. I think this shows the limitations of the current approach quite well.

Still it was too slow and too dense. The number of files overwhelmed the screen, so it looked very ... crowded.

And that's a bit bland. I did notice that you can colour things, so obviously I split by categories. The colours are mostly random, but seem to work well. And in testing things I realized that the drawing time is negligible compared to the physics / preprocessing, so obviously I increased the resolution a bit. Hey, we live in the modern times of Full HD ... why compromise with legacy formats!
real    2876m48.775s
user    2897m44.342s
sys     10m28.831s
That's what "time" had to say about my little experiment. Almost 2 days of CPU-time at 2.6GHz ... and that's with a naughty tweak. I reduced the number of items because I realized that it was going to take Too Long (tm). So instead of representing each file ... why not represent each directory as one "thing"?
                    filename = filename.replace('files/','')
                    filename = filename.replace('/var/cvsroot/gentoo-x86/','')
                    filename = filename.replace('Attic','')
                    filename = "/".join(filename.split('/')[:-1])
That's all the mods I needed - strip the repo location at the front, fold files/ and Attic/ away and cut off the filename. Ends up with "category/package" instead, which for the purposes of Gentoo is a much more useful identifier. And the rendering time went down from one frame every 30 minutes worst case to about one frame every 3 minutes worst case. Y'know, the "unmodified" version is still processing after over 118 hours CPU time ... not that nice :)
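For anyone who wants to reuse the idea, here is a minimal, self-contained sketch of the same folding step. It is not the code from the converter script: the names REPO_PREFIX, FOLDED_DIRS and collapse_to_package are made up for illustration, and it filters whole path components instead of doing string replaces. That way a file in Attic/ lands under exactly the same "category/package" key as a live file, and directory names that merely contain "files/" are left alone.

```python
# Sketch only: collapse an RCS path from "cvs log" into "category/package".
# REPO_PREFIX, FOLDED_DIRS and collapse_to_package are illustrative names.
REPO_PREFIX = "/var/cvsroot/gentoo-x86/"

# Directory levels that are folded away instead of counted as part of the key.
FOLDED_DIRS = {"files", "Attic"}


def collapse_to_package(filename):
    """Return the directory of `filename` without repo prefix, files/ or Attic/."""
    if filename.startswith(REPO_PREFIX):
        filename = filename[len(REPO_PREFIX):]
    parts = filename.split("/")[:-1]  # cut off the file name itself
    parts = [p for p in parts if p and p not in FOLDED_DIRS]
    return "/".join(parts)


if __name__ == "__main__":
    examples = (
        "/var/cvsroot/gentoo-x86/sys-apps/portage/ChangeLog,v",
        "/var/cvsroot/gentoo-x86/sys-apps/portage/files/example.patch,v",
        "/var/cvsroot/gentoo-x86/sys-apps/portage/Attic/old.ebuild,v",
    )
    for path in examples:
        print(path, "->", collapse_to_package(path))
```

All three example paths (the file names are invented) collapse to the same identifier, so the renderer sees one item per package instead of one per file.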

One positive effect is that it reduces the input from over 200MB to just around 100MB.
Now after 48h CPU I have 13367 single PNGs with a total size of pretty much exactly 5GB. Feeding those into mencoder generates 557.8 seconds of video with an average bitrate of 1737.2 kbit/s. The encoding takes 22 minutes walltime or 40 minutes CPU-time - in other words mencoder uses two threads properly. Good to know!
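A quick sanity check of those numbers, as a small sketch. It assumes the bitrate figure is in kbit/s (the unit mencoder normally reports); the function names are only for illustration.

```python
# Back-of-the-envelope check of the encoding numbers quoted above.
# Assumption: the average bitrate is given in kbit/s (1 kbit = 1000 bit).

def frames_per_second(frames, duration_s):
    """Average frame rate of the finished video."""
    return frames / duration_s


def video_size_mib(bitrate_kbit, duration_s):
    """Approximate size of the video stream in MiB."""
    size_bytes = bitrate_kbit * 1000 / 8 * duration_s
    return size_bytes / (1024 * 1024)


if __name__ == "__main__":
    frames = 13367        # PNG frames rendered
    duration_s = 557.8    # length of the encoded video in seconds
    bitrate_kbit = 1737.2 # average video bitrate

    print(f"{frames_per_second(frames, duration_s):.2f} frames per second")
    print(f"{video_size_mib(bitrate_kbit, duration_s):.1f} MiB of video")
```

That works out to roughly 24 frames per second and a bit under 116MB, which matches the size of the finished file.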

So just to warn you, the result of my little experiment is a high-bitrate 1080p video. It is about 116MB in size ...
Get it here
(US mirror)

Enjoy!

Posted by Patrick | Permalink

Sun Sep 20 00:02:48 CEST 2009

How not to write an app

So it seems that Ciaran is upset again. Which seems to be quite normal. But he has some good advice, and I thought I'd share some with you too.

Let's start with an easy one. Descriptive error messages. This is good:
$ bash -x foobar.sh
bash: foobar.sh: No such file or directory
This is not:
  ... In program paludis -ip portage:                                                                                                
  ... When performing install action from command line:                                                                              
  ... When adding install target 'portage':                                                                                          
  ... When parsing user package dep spec 'portage':                                                                                  
  ... When parsing generic package dep spec 'portage':                                                                               
  ... When disambiguating package name 'portage':                                                                                    
  ... When finding all versions in some arbitrary order from packages matching */portage with filter all matches filtered through supports action install:                                                                                                                
  ... When loading entries for virtuals repository:                                                                                  
  ... When finding all versions sorted from packages matching sys-kernel/gentoo-sources with filter all matches filtered through supports action install:                                                                                                                 
  ... When generating metadata for ID 'sys-kernel/gentoo-sources-2.6.28-r5::gentoo':                                                 
  ... When running an ebuild command on 'sys-kernel/gentoo-sources-2.6.28-r5::gentoo':                                               
  ... When running ebuild command to generate metadata for 'sys-kernel/gentoo-sources-2.6.28-r5::gentoo':                            
  ... When running command 'sandbox /usr/libexec/paludis/ebuild.bash '/usr/portage/sys-kernel/gentoo-sources/gentoo-sources-2.6.28-r5.ebuild' metadata':                                                                                                                  
  ... In ebuild pipe command handler for 'LOG0qaglobal scope tr':                                                                    
  ... global scope tr                                                                                                                
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] (same context) global scope tr                                                              
paludis@1253390790: [QA e.child.message] In thread ID '19463':
That's about 1/35th of the whole output ... which is just drowning out any reasonable information. Complete output is just under 750 lines, so good luck finding any error. But I guess there's a reason for that.

Now something really neat:
  ... In program paludis -ip portage:                                                                                                
  ... When performing install action from command line:                                                                              
  ... When adding install target 'portage':                                                                                          
  ... When parsing user package dep spec 'portage':                                                                                  
  ... When parsing generic package dep spec 'portage':                                                                               
  ... When disambiguating package name 'portage':                                                                                    
  ... Package name 'portage' is ambiguous, assuming you meant 'sys-apps/portage' (candidates were 'sys-apps/portage', 'virtual/portage')  
Let me say this as clearly as possible. Never assume. If you try to guess what I might have meant to say, you may do something I explicitly do not want to happen, or even something which breaks things. If your app is unable to parse my input unambiguously, quit.
Failure to do so will result in lots of tears and you falling on sharp objects. Multiple times.
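To make the point concrete, here is a minimal sketch of the "never assume" behaviour in Python. It has nothing to do with how paludis or portage resolve names internally; resolve, AmbiguousNameError and the KNOWN set are made up for this illustration.

```python
# Sketch only: refuse to guess when a short package name is ambiguous.
# resolve(), AmbiguousNameError and KNOWN are illustrative, not a real API.

class AmbiguousNameError(Exception):
    """Raised when a short name matches more than one category/package."""


def resolve(name, known):
    """Return the single category/package matching `name`, or raise."""
    if "/" in name:
        if name in known:
            return name
        raise LookupError(f"no such package: {name}")

    candidates = sorted(p for p in known if p.split("/", 1)[1] == name)
    if not candidates:
        raise LookupError(f"no such package: {name}")
    if len(candidates) > 1:
        # Do not pick one. Tell the user what the options are and stop.
        raise AmbiguousNameError(
            f"'{name}' is ambiguous, candidates: {', '.join(candidates)}"
        )
    return candidates[0]


KNOWN = {"sys-apps/portage", "virtual/portage", "dev-db/sqlite"}

if __name__ == "__main__":
    print(resolve("sqlite", KNOWN))
    try:
        resolve("portage", KNOWN)
    except AmbiguousNameError as err:
        print("error:", err)
```

The unambiguous name resolves, the ambiguous one stops with a one-line error that lists the candidates, and nothing gets installed by accident.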

And of course ...
Fetch error:
  * In program paludis -ip portage:
  * When performing install action from command line:
  * When executing install task:
  * When performing pretend actions:
  * When fetching 'dev-db/sqlite-3.6.18:3::gentoo':
  * Fetch error: Fetch of 'dev-db/sqlite-3.6.18:3::gentoo' failed
I don't think that means what you think it does. A pretend action that doesn't only pretend, but claims to fetch things? That better be a small oopsie in the explanation that doesn't, in any way, describe what actually happens.

Now, Ciaran. You can try to ridicule me through your blog. You can relabel comments as my name if you don't like them (it's your blog after all - even though that's exquisitely rude) and then deny you did it (we call that lying).
But if you have to do so, do it directly. Don't try to be funny (you saw what that did in the Funtoo-uses-Internet-Explorer situation).
Don't label any criticism as FUD (see "Ten Ways PMS Raped your Baby").

And if you fork PMS, then do so. You have your own hosting already. Now all you need is your own mailing list, and please leave those that work on the official version alone. And don't command us around, we're not your slaves.

Think you can do that? That would be awesome. Thanks!

Posted by Patrick | Permalink

Sat Sep 19 23:23:22 CEST 2009

Reality vs. PMS

If you ever get bored enough (which, on an epic scale from fighting with the Spartans at Thermopylae to reading the telephone book, should be just around counting the grains of rice in a 50kg bag) and start reading PMS, you might get a few of those "Wait a minute ..." thoughts that happen when things aren't quite right.

One of those is around page 26, where PMS defines that bash 3.0 or higher is to be used. Which actually means "it has to be bash 3.0 compatible, but may be any higher version". Or to be even more precise: Ebuilds are not allowed to use any features not in bash 3.0, but the execution environment may be any higher version if it is compatible. How could we have a straightforward text that doesn't need interpretation! The lack of flamewars! The lack of circular reasoning!

Now if you look around on your installed machines you may notice that:
  • The oldest available ebuild is app-shells/bash-3.1_p17
  • Portage depends on >=app-shells/bash-3.2
  • Current stable on amd64 and x86 is bash4
Which means that you can't even use bash 3.0 anymore. And if you had an install that old you'd run into many upgrade issues - portage blocking bash, bash blocking portage, portage being unable to read EAPI1 ebuilds. A big mess with no clean upgrade path ... apart from injecting binpkgs and pretending that mess never happened.
From bash Changelog:
*bash-3.0-r12 (05 Jul 2005)
*bash-3.1 (10 Dec 2005)
*bash-3.2 (12 Oct 2006)
*bash-4.0 (21 Feb 2009)
So anyone not having bash 3.2 hasn't updated since May 2007. Seriously. That's when bash-3.2_p17 went stable. (October was just when the ebuild was added, the Changelogs are a bit tricky to parse)

So for all intensive purposes (and of course all intents and purposes) we can assume bash 3.2 installed anyway. Plus there's no older ebuild anyway, apart from the 3.1_p17 that is a leftover from a horrible upgrade path.
Even worse, for at least a year there has been a "contamination" of the tree in the form of bash 3.2 features like the "+=" assignment. Which means that some ebuilds won't work with 3.0 anyway. Extra bonus? Many eclasses use it, which means, in short, that you need 3.2 or higher. And it has been tolerated (maybe even encouraged) for such a long time that there's no way to undo that change now.

The obvious thing to do (fix PMS) has been ignored for quite some time. So I thought I'd send an email to the gentoo-pms mailing list to get it sorted out. Guess what?
The amazing one-character patch was denied.

That's the best way to make PMS irrelevant - refuse to let it document reality.

Posted by Patrick | Permalink

Tue Sep 15 00:54:17 CEST 2009

Javaaaaaaargh.

I presume my liking of Java is known well enough ;)

So I tried a Java app today just to see how well it works. And suddenly ... this:
*** glibc detected *** /usr/lib/jvm/sun-jdk-1.6/bin/java: free(): invalid pointer: 0x00000000420eef40 ***

======= Backtrace: =========
/lib/libc.so.6[0x3002271e46]
/opt/sun-jdk-1.6.0.16/jre/lib/amd64/libdcpr.so[0x7f8c503d2b52]
/opt/sun-jdk-1.6.0.16/jre/lib/amd64/libdcpr.so[0x7f8c503dbe05]
/opt/sun-jdk-1.6.0.16/jre/lib/amd64/libdcpr.so(Java_sun_dc_pr_PathFiller_reset+0x43)[0x7f8c503d51c3]
[0x7f8cf52a4f50]

======= Memory map: ========

40000000-40009000 r-xp 00000000 09:00 85671701 /opt/sun-jdk-1.6.0.16/bin/java
40108000-4010b000 rwxp 00008000 09:00 85671701 /opt/sun-jdk-1.6.0.16/bin/java
41716000-42288000 rwxp 00000000 00:00 0        [heap]
So I think "well, sun-jdk is teh shizzle, I'll try ... err ... no, that one is fetch restricted. So is this one. Err, yeah, icedtea-bin!"
Guess what. Same result. Not good.
Now I'm thinking "A crash _in the Java core_? No wai!".
/opt/sun-jdk-1.6.0.16/jre/lib/amd64/libzip.so(Java_java_util_zip_Deflater_deflateBytes+0x305)[0x7fe671a33e95]
[0x7fe66eb3d4a2]
What, a crash in libzip? That smells bad. Random code execution bad. I find that very unlikely! How on earth does this happen?
Now, let's go bleeding edge and try to build icedtea from source. That only takes ... sigh. About one hour on a quadcore. Grumble grumble yell moan grumble.

And? Can you guess it? Implodes the same way. This is getting really interesting!

Hmm. Looks like all the code this app uses is pure Java. Hmm, this makes no sense. But ... wait ... what's that? "core.jar" - guessing from imports this is the Processing library. That looks like a candidate. Let's have a look at the homepage ... yes, this seems to be C++ code with a Java interface. Ok, this could mess with things. But how?
Just injecting the .jar with the newest version doesn't change a thing. Crummy. And I do notice that their LINUX tarball has files with the .dll ending. File says:
PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit
Uhm. Yah. No good.
But I can try to build it from source, yes?
Yes. These nice people have a svn repository I can pull from. And as a Gentoo user I'm used to playing with such things.

If you have a weak stomach you should stop reading now!
Now, svn checkout. unzip, make, c++, I have all of those (Gentoo sei Dank!). Let's have a look at the build syste... the build... what the ...
Ok, there's a build/ directory with linux/ and windows/ and other directories. And in those directories are, uhm, scripts. And the eclipse editor preferences? What the ... but I digress. Scripts. There's one called "make.sh". Good! Let's have a look.

#!/bin/sh
Ok, so they aren't using any bash features I hope!
  ARCH=`uname -m`
  if [ $ARCH = "i686" ]
  then
    echo Extracting JRE...
    tar --extract --file=jre.tgz --ungzip --directory=work
  else
#    echo This is not my beautiful house.
#    if [ $ARCH = "x86_64" ]
#    then
#      echo You gots the 64.
#    fi
Wat. SRSLY. I lack words to describe this. So the whole 64bit part is commented out. Why do I think that that's not a good idea? And what is "Extracting JRE..." supposed to mean? No one sane would ... but ... omg. MAH BRAINZ. There's a FULL JRE in the svn checkout.
    echo "
The Java bundle that is included with Processing supports only i686 by default.
To build the code, you will need to install the Java 1.5.0_15 JDK (not a JRE,
and not any other version), and create a symlink to the directory where it is
installed. Create the symlink in the \"work\" directory, and named it \"java\":
ln -s /path/to/jdk1.5.0_15 `pwd`/work/java"
    exit
Some things are not meant to be seen. Like this message, which throws up many questions (and the rest of my lunch). At this point my motivation approached zero (or rather, my motivation to do useful things. I had a strong motivation to hurt some people).

And I try to run that crashy code in my i686 chroot. Guess what. Yes!

Now I do wonder. Who is to blame here?
The clueless upstream that doesn't even know how to cut a release without including their editor config?
The Java runtime that loads any crap without doing sanity checks? (Seriously ... how can you load a 32bit .so or even a .dll into your address space without immediately exploding?)
Or me, who still thinks there are Java apps out there that don't make you go insane?

So, after a few hours of tracking it down I'm again reminded why I usually don't touch Java at all. The amount of stupid I've found today is enough to last a small African country for a year ...
Oh, one small final note. I can't really blame people for not knowing platform-specific things. But there's a fine line between being a bit ignorant and being purposefully incompetent. If you don't know how to package for a platform, find someone who does. Ask on IRC or, if you don't like that, on mailing lists. But please, don't release broken stuff like that. It's very rude and can be devastating for the mental state of people that might use your software!

Posted by Patrick | Permalink

Sat Sep 12 20:40:10 CEST 2009

Building Stuff, the Gentoo way

Today's blog post shall be focussed on how to efficiently build packages. So here's the boring part: Hardware specs.

Processor: AMD Phenom(tm) 9950 Quad-Core Processor @ 2.6GHz
RAM: 8GB
/var/tmp/portage: 8GB SSD (2x4GB Compact Flash / RAID0 )
Storage: 4-disk SATA RAID-5 (mdadm)

Now that's quite powerful, but you still easily end up waiting some time for things to be compiled. And of course you usually don't want to test in your live system. So chroots to the rescue!
Currently I have 4 chroots I regularly use: i686 and amd64, stable and testing. That makes testing things quite a bit easier, especially when you run into some bugs that only happen on stable systems. The setup is quite simple, and because I'm lazy I've added some small scripts that mount everything I need.
All chroots share /usr/portage and /usr/portage/distfiles. That's a bindmount - saves lots of space and keeps them all in sync. One "special" thing is /usr/portage/packages, that's a bindmount per chroot. So I have a packages dir with amd64-stable, amd64-unstable etc. subdirectories, which makes searching quite a bit easier if I ever need a binpkg.
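The mount layout above can be sketched in a few lines of Python. This is not one of my actual scripts: the /chroots and /srv/binpkgs locations and the function name are assumptions for illustration, and the sketch only prints the mount commands instead of running them.

```python
# Sketch only: build the bind-mount commands for one chroot.
# /chroots and /srv/binpkgs are hypothetical locations; adjust to taste.
import shlex


def bind_mounts(chroot_root, packages_root, name):
    """Return the mount commands for chroot `name` as argument lists."""
    target = f"{chroot_root}/{name}"
    return [
        # Shared tree: every chroot sees the same /usr/portage.
        ["mount", "--bind", "/usr/portage", f"{target}/usr/portage"],
        # --bind is not recursive, so distfiles gets its own bind mount
        # in case it lives on a separate filesystem on the host.
        ["mount", "--bind", "/usr/portage/distfiles",
         f"{target}/usr/portage/distfiles"],
        # Per-chroot binpkg directory, e.g. .../amd64-stable.
        ["mount", "--bind", f"{packages_root}/{name}",
         f"{target}/usr/portage/packages"],
    ]


if __name__ == "__main__":
    for chroot in ("i686-stable", "i686-unstable",
                   "amd64-stable", "amd64-unstable"):
        for cmd in bind_mounts("/chroots", "/srv/binpkgs", chroot):
            print(shlex.join(cmd))
```

The order matters: the shared tree has to be mounted before the per-chroot packages directory is mounted on top of it.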

The make.conf is as default as possible, but if you build lots o' stuff you end up with a few interesting mods:
FEATURES="buildpkg"
NOCOLOR="true"
PORT_LOGDIR="/var/log/portage"
PORTAGE_ELOG_SYSTEM="save"
PORTAGE_ELOG_CLASSES="info warn error log qa"
USE="X gcj objc"
VIDEO_CARDS="vga"

Why would I do that? Well, first of all buildpkg. That saves a binpkg of everything built, which saves quite a bit of time. The whole logging thing is very convenient if you build things with a script like this:
for i in `pquery --max -a 'x11-drivers/*'`; do emerge $i; done
Why would I do such a thing? Well, maybe someone asks me if all x11-drivers work with the new mesa or xorg-server. And like this I just kick it off and grep the logfiles later. Obvious, eh?
Lastly the useflags are optimized to make the toolchain capable of building exotic stuff like gnustep or java packages out of the box. Otherwise you'd soon be rebuilding gcc, which takes time and effort ... so let's add it as early as possible. And VIDEO_CARDS to make xorg "thinner" - fewer drivers installed, and we don't actually use the drivers anyway!

Now if I want to test a package I usually use:
FEATURES="test" emerge -1avk somepackage
which has the advantage that it uses binpkgs if available (k), shows me what will happen and (a)sks. And -1 / --oneshot so it doesn't end up in the world file, so I can get rid of everything with a simple emerge --depclean.
Since there are so many packages where tests fail, I often short-circuit: I use --onlydeps to get all the dependencies installed without having to run (and fail) their tests, then run only the tests of the package I actually want to test. As you can see, that's quite streamlined to get as many things compiled with as little effort as possible.
To work on things I usually pair one console tab in the shared PORTDIR (which in my case is a cvs checkout) with one console tab in the chroot. That way I can easily toggle between ebuild editing and compiling. Usually I only have two such tab groups open because otherwise I tend to forget things. Whenever possible I avoid doing two things at once in one chroot because that can cause some headaches; it is easier for me to use a different chroot for it. And setting up a new one is quite easy :)

A short while ago I deleted ~12000 binary packages because I was getting a few inconsistencies - the gcc 4.3 to 4.4 migration cost me quite a lot in terms of build time. But now I've mostly caught up, the common packages exist as binaries for fast install and everything else would need to be compiled anyway.
So much for the "how to compile things" part. Next week on BBC: How to make an omelette without eggs!

Posted by Patrick | Permalink

Fri Sep 11 02:33:08 CEST 2009

"/usr/portage is my overlay"

... and I wish more devs would focus on getting things in the tree. The proliferation of overlays is really nice because some have quite exotic stuff, but I find it frustrating to have to use 12 overlays to get all the apps I want/need. So please, if you can, consolidate. Push things to sunrise instead of your private overlay. And if it works (even if it's a bit ugly) push it to our main tree. Mask it if you don't like it. But please try to avoid handing users this nice puzzle game with 12 incompatible overlays that break random other stuff ...

Posted by Patrick | Permalink

Thu Sep 10 22:31:40 CEST 2009

Small status update

I've been trying to fix stuff as well as I can. Still there are tons of trivial bugs, so if something doesn't compile I usually just leave a note on the corresponding bug and continue with something else. It's kinda rude, but that way I get more fixes done per unit of time.

Y'all might have noticed me bumping postgresql and samba. Those were lagging behind so much, and many people use them - so I spent some time getting us up to speed there. Still lots of minor issues, but at least we have something from this decade now. Amusing thing: Just as I started bumping postgresql to 8.4 upstream reported a few security issues and bumped it. So I spent quite some time compiling postgresql 8.0 8.1 8.2 8.3 and 8.4 - 7.4 needs some more attention, but then who uses something that old ;)

I'm getting quite frustrated with test failures. In my chroots I use FEATURES="test buildpkg", so the tests are run during the initial compile and later installs reuse the binpkg to save time. With gcc 4.4.1 I had so many linking inconsistencies that I decided to drop my binpkgs and start "from scratch". That has slowed me down a bit because I compile more, but it also makes me find lots of seriously messed up tests. Don't even think about suggesting FEATURES="test" as the default for future EAPIs. It's a bad idea that does not work.
The amd64 chroot has close to 1k packages already after the gcc 4.4.1 upgrade, so I seem to build a bit of everything now. Lots of bumps, especially in dev-python. The quadcore definitely pays off for that. If anyone wants to motivate me ... I could use some more harddisks. Never enough of those ;)

I just bumped virtualbox to 3.0.6, a day after the announcement. I wouldn't be able to do that without helpers like Alessio who feed me things. They allow me to reduce my role to pure integration testing, so I can cover more ground, they get their favourite packages fixed and everyone profits. If you want to get involved feel free to bother me, that usually has a non-null chance of getting bugs fixed. We still have about 10k open bugs, so there's always something to do. And I can only fix what I'm aware of (same goes for everyone else. File bugs. Write patches. Make it easy for us to fix stuff.)

While I'm spending most of my time at the tip of bleeding edge these days I am aware of the lag with getting things marked stable. That's mostly a manpower issue, so if you have some spare time and want to improve things that might have the largest payoff at the moment. And don't think you're not qualified - they let n00bs like me have access to the tree, so you're qualified for stable testing.

So what's next? Things are getting into better shape, but there's always room for improvement. Upstreams release a constant trickle of new stuff which we have to integrate. That takes time and we need to be aware of it, so if something hasn't been bumped after a week or so feel free to file us a bug. If you have ideas how to make things better don't hold back - maybe it's awesome and helps a lot. Help us help you!

A worthy goal?
One bug a day. I challenge you to either open or fix one a day - it doesn't take much time, but if we get 30 people doing it for a year that'd be about the number of open bugs we have. Or 300 people for one month ... Imagine what that'd do to your Gentoo! And now stop dreaming and go fix stuff.

Posted by Patrick | Permalink