Mar 29 (Tue), 2005, 17:22

mirror-dist-0.5.2 available

Fixed bugs people pointed out, mainly dealing with the code being written against portage cvs head, and not playing nice with portage stable (I hate you stable), version 0.5.2 of mirror-dist is available here. Enjoy.


Posted by Brian Harring | Permalink | Categories: General Gentoo, mirror-dist news

Mar 29 (Tue), 2005, 17:14

delta compression + portage-tree snapshots == good stuff.

Been kicking a script around for a bit for generating patches for tree snapshots, used by emerge-webrsync folk, and figured I'd poke bonsaikitten/patrick for some space to throw the patches up while waiting for an official infra implementation to go forward.

The clientside modification is done, available and named emerge-delta-webrsync. It works fine with stable portage (yes, that's a change for me since my code is usually cvs head based), and requires dev-util/diffball-0.6. So... what does it do? It identifies what the last snapshot used was, pulls patches to update to most current, and reconstructs the target file.
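For the curious, the "which patches do I need" step can be sketched in a few lines of Python. This is only an illustration: the patch naming scheme (snapshot-OLD-NEW.patch) and the function name are made up here, and the real emerge-delta-webrsync may name and locate things differently.

```python
from datetime import date, timedelta


def patch_chain(last_snapshot, newest_snapshot):
    """Return the (hypothetical) patch names needed to walk, one day at a
    time, from the snapshot already on disk to the newest snapshot."""
    names = []
    day = last_snapshot
    while day < newest_snapshot:
        next_day = day + timedelta(days=1)
        # Assumed naming scheme: one patch per pair of consecutive days.
        names.append(f"snapshot-{day:%Y%m%d}-{next_day:%Y%m%d}.patch")
        day = next_day
    return names


chain = patch_chain(date(2005, 3, 25), date(2005, 3, 28))
for name in chain:
    print(name)
```

Each patch in the chain is applied in order to the previous snapshot, ending with a reconstructed copy of the newest one.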

The main benefit of it is the massive bandwidth saving per day. So... stats. Patch size refers to the patch to go from the previous day, to that day. If you haven't synced in 5 days, then just add up the patches for the days since you sync'd, and you'll have the amount fetched.

Date        Patch size (bytes)  Full size (bytes)  Bandwidth savings (%)
2005/03/08  108918              18660481           99.42
2005/03/09  139950              18667378           99.25
2005/03/10  146663              18692051           99.22
2005/03/11  103990              18727659           99.44
2005/03/12  92369               18708983           99.51
2005/03/13  195883              18824572           98.96
2005/03/14  112249              18833708           99.40
2005/03/15  114612              18842782           99.39
2005/03/16  120682              18773466           99.36
2005/03/17  130173              18931948           99.31
2005/03/18  195451              18886313           98.97
2005/03/19  120423              18925036           99.36
2005/03/20  95728               18909468           99.49
2005/03/21  110649              18943076           99.42
2005/03/22  87173               18970880           99.54
2005/03/23  109521              18985354           99.42
2005/03/24  100651              19006405           99.47
2005/03/25  141032              19086878           99.26
2005/03/26  67759               19136238           99.65
2005/03/27  81413               19181244           99.58
2005/03/28  111457              19078935           99.42

Nifty thing there, say you hadn't synced since the 2005/03/07 snapshot; current (as of this posting) is 2005/03/28. Either you grab the full 19.08MB, or you grab the 2.49MB worth of patches in between, and reconstruct 2005/03/28. Even with a tree 3 weeks out of sync, users still use 87% less bandwidth. This makes a nice difference for dialup users.
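If you want to check that arithmetic yourself, here's a quick Python sketch. The numbers are the patch sizes from the table above, in bytes; the variable names are just mine.

```python
# Patch sizes in bytes for 2005/03/08 through 2005/03/28, from the table above.
patch_sizes = [
    108918, 139950, 146663, 103990, 92369, 195883, 112249,
    114612, 120682, 130173, 195451, 120423, 95728, 110649,
    87173, 109521, 100651, 141032, 67759, 81413, 111457,
]
# Size in bytes of the full 2005/03/28 snapshot.
full_size = 19078935

total = sum(patch_sizes)
savings = 100.0 * (1 - total / full_size)
print(f"{total / 1e6:.2f}MB of patches vs {full_size / 1e6:.2f}MB full, "
      f"{savings:.0f}% less bandwidth")
```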

The scripts are available above, ready to go pretty much. Take it for a spin if you're so inclined. Patches are pushed up to gentooexperimental.org w/in 15m of the official snapshots being available. Unless bandwidth usage goes insane, it'll be running till something official is in place.


Posted by Brian Harring | Permalink | Categories: delta compression, General Gentoo

Mar 16 (Wed), 2005, 13:07

Sweet, I'm now the material of blog spam

Seems to have died down a bit, but kind of crazy to see fortune-mod-gentoo-dev (random insane quotes from #gentoo-dev) being used as blog comment spamming, for example here (comments section), here (15th comment in), and here. Swell. Joys of having a name/nick that isn't yet tightly bound together, resulting in lots of weird spam/crazy stuff getting matched by google I guess :)


Posted by Brian Harring | Permalink | Categories: general, General Gentoo

Mar 15 (Tue), 2005, 17:35

mirror-dist

At spyderous's suggestion, tagged in the support he wanted. --distfile-tracking specifies a db (python shelve) that maps distfile 1:1 to ebuild. It is only 1:1; if multiple ebuilds own a file, the db is indeterminate as to which ebuild specifically is labelled as the owner. This isn't an issue however, and is a bit simpler.

The intention is to provide an indication in the death-watch list as to who owned that file- knowing at least one owner should be enough to track back what/where/when a distfile was orphaned. At the moment, this db is just used to note in the death-watch the owner (if any). Extending it beyond that so it's a db of 1:N, distfiles:ebuilds is beyond the scope of the script, although if people want it I can break that code out, and have mirror-dist do a symbiotic thing with it. The issue with this though is that the distfiles db would have to maintain entries for a period greater than or equal to the period of the death watch limit, plus a fudge factor for script runtimes. If it's desired, I'll do it however. It's pretty straightforward (cp_all + cp_list + some mild voodoo portage_dep calls to parse use conditionals).
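To make the 1:1 behaviour concrete, here's a minimal sketch using python's shelve module. It isn't the mirror-dist code; the function names, package names and file names are all made up for illustration.

```python
import os
import shelve
import tempfile


def record_owners(db_path, ebuild_distfiles):
    """Store one owner per distfile. If several ebuilds list the same
    distfile, whichever is written last wins (the 1:1 limitation)."""
    with shelve.open(db_path) as db:
        for ebuild, distfiles in ebuild_distfiles.items():
            for distfile in distfiles:
                db[distfile] = ebuild


def owner_of(db_path, distfile):
    """Return the recorded owner of a distfile, or None if unknown."""
    with shelve.open(db_path) as db:
        return db.get(distfile)


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "distfile-owners")
    # Hypothetical ebuilds; both versions share foo-patches.tar.gz.
    record_owners(db_path, {
        "app-misc/foo-1.0": ["foo-1.0.tar.gz", "foo-patches.tar.gz"],
        "app-misc/foo-1.1": ["foo-1.1.tar.gz", "foo-patches.tar.gz"],
    })
    solo_owner = owner_of(db_path, "foo-1.0.tar.gz")
    shared_owner = owner_of(db_path, "foo-patches.tar.gz")
    unknown_owner = owner_of(db_path, "never-seen.tar.gz")

print(solo_owner, shared_owner, unknown_owner)
```

The shared file ends up with exactly one recorded owner, which is all the death-watch needs to point somebody in the right direction.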

Meanwhile, branded a version on mirror-dist, and uploaded it to my devspace. It's marked as version 0.5 and available here for testing- why 0.5? Simple. The code works, and does the initial features Lance (Ramereth) requested... the code is pretty stable. The remaining 0.5 -> 1.0 is left for beautifying the output, tagging features people request, etc.

Course those user requests will ugly up the code, and probably result in the need for another rewrite by the time the script approaches v1... damn users causing a vicious cycle of improvement and release.
:-)


Posted by Brian Harring | Permalink | Categories: General Gentoo

Mar 14 (Mon), 2005, 18:03

sourcemage nonsense

Ahh... got to love it when another source distro resorts to slander/trolling to somehow improve themselves.

SourceMage's take on Gentoo Security, according to a rather obnoxious troll

Another page, updated by the same swolley, is available- SourceMage's FAQ on diffs between gentoo and themselves. The sad thing is when swolley first came into #gentoo-portage, I honestly thought he was after updating the page so it was an unbiased comparison. Going by the changes he's created, and the security trolling, guess not.

I'd feel bad for Source Mage if it was just some random user/zealot trolling, but they're hosting the materials, indicating they endorse this load. On the plus side, the FaqDiff from above is much saner than it was. On the other hand, reading through their attempted mapping of our gleps to their own features in an attempt to demonstrate they're better/were there first/whatever is... annoying, and pretty stupid in my opinion.

If you're 'better' (whatever the hell that means), people will use sourcemage- these pages are either questionable PR, propaganda, or a hit piece. Presumably the intention is to sway users to sourcemage. Personally, I'm kind of inclined to state, "Whatever". Like I said above, if it's better, they'll switch, if it's not, they won't. Although I spose it's a bit flattering one of their users (devs?) goes to the trouble of maintaining that crud to paint sourcemage as better than gentoo (note I'm not stating one or the other is better. It's a stupid argument for people who have nothing better to do than argue about which distro, one they typically don't even contribute to, is better- for example trolls/zealots).

Dunno. Knowing the questions swolley asked (#gentoo-portage answered them), and what wound up on those pages is kind of disheartening. We're here creating a distro we care to use, not here to argue about if our preferred distro is better than some other schmuck's distro.

Definitely is one of those days where just turning off all the crap from email and irc, and sticking to my code is a good choice.
Definitely more productive at least.


Posted by Brian Harring | Permalink

Mar 14 (Mon), 2005, 16:36

rewrite of distfiles mirroring script

Took a look through mirror.py that is distributed with portage, and did one of my usual "bah, I can do that better"... 36 hours later, a massive headache from python's global interpreter lock + threading issues, and it's finished.

Available here, tentatively titled mirror-dist, it has a bunch of features that make it better than the default portage mirror script.

  • Uses command line args, instead of relying on env vars to control features/settings
  • Killable in under .5s. This is important- existing mirror script has no way to kill it immediately, threads will just run till they finish their current task, and notice they're supposed to be dead. Seppuku of threads is pretty much under .5s, so as soon as the switch has been 'flipped' stating, "shutdown", it bails pretty quickly.
  • Mirror overrides via command line- you can specify a text dict file to override portage's thirdpartymirrors. This is useful for the actual gentoo mirrors, since trying to fetch an update from another master mirror doesn't make much sense.
  • distfiles cleansing support
  • death watch db. Basically, if a distfile isn't associated with an ebuild, you can leave the distfile alone for a configurable period of time. That distfile is recorded in a db. If an ebuild re-claims the distfile prior to the specified interval, the distfile is removed from the db. If no ebuild claims the distfile, and the waiting period has passed, it gets waxed. Think of it as delayed cleaning support
  • Debug mode- acts as if it fetched and deleted what it would have normally. Useful for trying it out without having any actual action undertaken (eg, you don't trust it yet).
  • Command line configurable verification modes of the existing distfiles on disk- either do quick size checks (default), or md5 all distfiles to identify if a distfile on disk has become corrupt, and needs an update- a more likely occurrence would be that the ebuild's digest for that file has changed, so the distfile on disk is no longer valid
Host of other goodies. Code is pretty clean, has --usage, and --help output too.
Enjoy.
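The death watch logic from the feature list boils down to something like the following Python sketch. It's a toy under stated assumptions, not the mirror-dist code: the function name, the plain dict standing in for the db, and the file names are all made up here.

```python
import time


def update_death_watch(watch, on_disk, claimed, grace_seconds, now=None):
    """Update the watch dict in place and return the distfiles to delete.

    watch maps a distfile name to the time it was first seen orphaned.
    """
    now = time.time() if now is None else now
    doomed = []
    for name in on_disk:
        if name in claimed:
            # An ebuild claims (or re-claimed) the file: stop watching it.
            watch.pop(name, None)
        elif name not in watch:
            # Newly orphaned: start the clock, leave the file alone.
            watch[name] = now
        elif now - watch[name] >= grace_seconds:
            # Orphaned for the whole waiting period: it gets waxed.
            doomed.append(name)
            del watch[name]
    return doomed


watch = {}
files = ["kept.tar.gz", "orphan.tar.gz"]
first_run = update_death_watch(watch, files, {"kept.tar.gz"}, 3600, now=0)
second_run = update_death_watch(watch, files, {"kept.tar.gz"}, 3600, now=1800)
third_run = update_death_watch(watch, files, {"kept.tar.gz"}, 3600, now=3600)
print(first_run, second_run, third_run)
```

Passing the clock in as a parameter is just a convenience so the behaviour can be exercised without actually waiting out the interval.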


Posted by Brian Harring | Permalink | Categories: General Gentoo, mirror-dist news

Mar 09 (Wed), 2005, 05:52

therac 25 linear accelerator and qa (history for the younguns)

Read about this a while back, and a question about why hard real time was important in #gentoo-embedded got me thinking of this again. Way back, Atomic Energy of Canada Limited (AECL) and CGR got together to create medical linear accelerators- just as it sounds, used for delivering radiation doses for radiation therapy. Out of that lineage came the therac 20 and the therac 25- both had a software bug, except the latter lacked hardware safeties, allowing the software bug to cause some seriously bad mayhem, starting in '85.

For example, the first case of the bug involved a patient getting in the range of 15,000 to 20,000 rads, when the avg rad dose was around 200. Bit of a difference there.

The report on it is really worth the read, and is available here. Figured I'd direct my readers (do I even have readers? :) to it.

Everybody hits software bugs- it's basically an accepted thing in the industry... the consequences of said bugs can be pretty horrible though.


Posted by Brian Harring | Permalink | Categories: general

Mar 09 (Wed), 2005, 04:43

ebuild shell

A request in #gentoo-portage wound up with me whipping out a quicky interactive sandbox'd bash shell, preloaded with all ebuild funcs- probably useful for anyone who wants to test out commands, then dump the commands into an ebuild. Without further ado, the simple hack ebuild-shell is available in my devspace- two files needed, ebuild-env, and the bashrc, ebuild.bashrc.

Requires portage cvs though- it's not worth the effort trying to make it work for stable, since to do it, the ebuild funcs must be treated as a lib- which is exactly what I did for head already (it wasn't fun). So... upgrade to try it out, or just wait till head is stabled someday :)


Posted by Brian Harring | Permalink | Categories: General Gentoo, Portage news

Mar 08 (Tue), 2005, 10:12

Cache refactoring

For those who don't know portage internals, portage maintains a cache of metadata pulled from ebuilds, required for speed reasons. The current cache class for stable portage is a bit odd- the cache is category specific, and the derivative class is completely controlled by the template init. It's OOP just backwards- instead of the derivative triggering the parent's init when it's good and ready, the parent triggers the child's init when it's ready.

The real problem with it is that the cache instance is category specific. For example, you wish to search the entire repository tree for a specific description. With a category specific cache, the best you could achieve is calling a search method on each category instance. For an rdbms backend, that's 140+ selects when a single one would've sufficed (that's also sidestepping the issue of connection pooling for each of those category objects).

The new cache design is a pretty extensible framework, based at the repository level. Already, a rdbms backend class has been provided (sql_template), along with a fs based class (fs_template). Code is cleaner, and is built from the ground up to support 'frozen' caches- non modifiable cache instances. I'll blog about the benefits of that later, but it's the start of a way to ixnay the metadata transfer after rsync finishes.

Also, a new method was added for relegating to the cache backend, cache wide searches- basically a method to tell the cache instance "I want _all_ pkgs that have xyz in their DESCRIPTION". For N file implementations (for example, the standard flat_list people use), it's just as slow as before. For rdbms implementations, it can use a single select call.

Which, obviously can be a helluva lot faster. :)
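To show the difference in query count, here's a small sketch using sqlite3 from the python standard library. It's not the sql_template code; the table layout, function names, and the sample descriptions are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache (category TEXT, package TEXT, description TEXT)"
)
# Made-up sample rows standing in for cached ebuild metadata.
conn.executemany(
    "INSERT INTO cache VALUES (?, ?, ?)",
    [
        ("dev-util", "diffball", "Delta compression suite"),
        ("app-arch", "bzip2", "A high-quality data compressor"),
        ("sys-apps", "portage", "Package management system"),
    ],
)


def search_per_category(conn, needle):
    """Old style: one select per category. Returns (hits, selects issued)."""
    categories = [
        row[0] for row in conn.execute(
            "SELECT DISTINCT category FROM cache ORDER BY category"
        )
    ]
    hits = []
    for category in categories:
        hits += conn.execute(
            "SELECT category, package FROM cache "
            "WHERE category = ? AND description LIKE ? ORDER BY package",
            (category, f"%{needle}%"),
        ).fetchall()
    return hits, len(categories)


def search_repository(conn, needle):
    """New style: a single select across the whole repository."""
    return conn.execute(
        "SELECT category, package FROM cache "
        "WHERE description LIKE ? ORDER BY category, package",
        (f"%{needle}%",),
    ).fetchall()


per_category_hits, selects_used = search_per_category(conn, "compress")
repo_hits = search_repository(conn, "compress")
print(selects_used, per_category_hits == repo_hits)
```

Same answer both ways; the only thing that changes is how many round trips to the backend it takes to get it.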


Posted by Brian Harring | Permalink

Mar 05 (Sat), 2005, 16:59

ebd, otherwise known as the snappy lil ebuild processing daemon that could.

Way back around june/july of 2004, I had a crazy idea of how to kill off a bunch of bugs, and optimize regen runs- regen runs being basically a massive set of calls to bash to source an ebuild in a carefully defined environment, to get its 'metadata' (DEPEND, RDEPEND, SRC_URI, etc), which is then stored by python portage in a cache backend. This process is a bit slow- regening the full tree on a p4 2.4ghz is well over 30 minutes if the cache is empty.
Why do we need this process/action? Because if you didn't have the auxdb cache (which holds said metadata), you'd have to re-source the ebuild each time, which is *slow*. Jason Stubbs pegged it around 400x slower than cache.

The regen time is directly affected by a bunch of things- for example, if an eclass used by an ebuild is modified (mtime changes), then all ebuilds that use that eclass must now be re-sourced- this is because there is no way to determine what was changed in the eclass, thus the only safe course is to go and re-source all ebuilds.
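A rough Python sketch of that staleness check follows. The cache layout (an mtime for the ebuild plus the mtime of each eclass it inherited, recorded when it was last sourced) is an assumption made for the example, as are all the names in it.

```python
def stale_ebuilds(cache, ebuild_mtimes, eclass_mtimes):
    """Return the ebuilds that have to be re-sourced.

    cache maps ebuild -> {"mtime": ..., "eclasses": {eclass: mtime}},
    as recorded the last time the ebuild was sourced (assumed layout).
    """
    stale = []
    for ebuild, mtime in ebuild_mtimes.items():
        entry = cache.get(ebuild)
        if entry is None or entry["mtime"] != mtime:
            # Never cached, or the ebuild itself changed.
            stale.append(ebuild)
            continue
        for eclass, recorded in entry["eclasses"].items():
            if eclass_mtimes.get(eclass) != recorded:
                # No way to know what changed in the eclass, so re-source.
                stale.append(ebuild)
                break
    return stale


cache = {
    "app-misc/foo-1.0": {"mtime": 100, "eclasses": {"eutils": 50}},
    "app-misc/bar-2.0": {"mtime": 100, "eclasses": {}},
}
ebuild_mtimes = {"app-misc/foo-1.0": 100, "app-misc/bar-2.0": 100}
# Touching eutils invalidates every ebuild that inherits it.
needs_regen = stale_ebuilds(cache, ebuild_mtimes, {"eutils": 60})
print(needs_regen)
```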

Like I was saying, it's a slow process. I figured the startup of bash, and the initialization of the bash environment for getting the metadata keys could likely be sidestepped- why do this? Because in a full regen, you must suffer the cost of bash startup/re-initialization for every ebuild. Currently, 18,913 ebuilds (8,976 distinct packages), which adds up.

Essentially, what is needed is to rewrite ebuild.sh (bash portage) such that it's a library, and callable. More importantly, you need to be able to save and load the environment via function calls- this must work. Consider it akin to how the kernel suspends a process- the process when restarted must be the same as what it was previously. The environment between 'phases' (unpack/compile/setup/install) is the same way. More importantly, you cannot have the environment from one 'depends' phase (the specific phase that gets metadata from an ebuild) bleed into another ebuild.

So long story short, you need to load/dump environments on the fly, and contain each executing ebuild/phase such that it doesn't taint the environment for when another ebuild is processed. This is tricky, but required if you want to avoid the bash startup costs for a regen. Yes, this is a lot of crap to fix/implement for a potentially crazy idea/scheme, but I still tried it. :)

In doing env fix ups, a lot of long standing bugs were fixed also- the restructuring detailed above allows for portage to run completely from a saved env- more so, it requires portage to run from saved envs for everything past the setup phase (exemption to this is binpkgs, which is a matter for another blog entry). This makes it such that installed ebuilds no longer rely on eclasses in the tree- they just use the saved env, which already has the eclass. This fixes bug 46223, and also allows for the clean break of forced backwards compatibility for all eclass apis/existence (detailed in glep 33). Beyond that, env attributes (export/readonly) are tracked, and a host of other naggles were nailed down and fixed. Honestly, getting the env handling right, and doing this shift fixes a *lot* of issues with ebuild processing. Continuing on however..

So env handling is now sane, a nasty collection of long standing bugs are waxed... but I started this thing because I wanted to see what could be done to speed up regens. Effectively, what I call 'ebd', ebuild-daemon, is an ebuild processor. Python portage spawns ebuild-daemon with a set of pipes into the bash side of portage. This allows the daemon and python portage to have nice little chats, including things like dumping the env straight through the pipes for an ebuild to process, notifying it what phases to process, and telling it to start processing, and report the results.
This is not a one sided conversation though- bash portage can command python portage too, within limits. It allows the bash portage to -

  • request a confcache be transferred in (confcache ~== global autoconf cache, speeds configures up)
  • report detected sandbox failures back to the python side. Why is this needed? Because previously, sandbox errors were dumped only when the sandbox binary exited. This isn't viable with a sandbox'd ebd- you can't exit the ebd just to find out if a sandbox failure occurred. So an alternative method is used.
  • hijack (yes, hijack) portageq calls directly into the existing portage process. This is a major speed-up- portage import on its own is over half a second typically. With the hijack, it's well below .1s, since all that must be done is process the request, not load the relevant portage modules, *then* process the request, then go through shutdown code.
  • Not implemented yet, but the same hijack approach can be used to have useradd/groupadd run in a de-priv'ed environment, and have the request 'hijacked' back to the python process (which has higher privs). This basically allows pkg_setup to run de-prived, which is a good thing.
  • Various crazy ways to improve performance, like preloading an eclass into memory if you know it's going to be used heavily (eutils for example).
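The core arrangement (one long-lived bash process, fed requests over pipes, with nothing bleeding from one request into the next) can be shown with a toy Python sketch. This is emphatically not ebd: the real thing loads and dumps whole environments and speaks a richer protocol. The class name, the __END__ marker, and the one-line-per-request protocol are all invented here, and each request just runs in a throwaway subshell. It assumes bash is on the PATH.

```python
import subprocess

# Bash side: read one request per line, run it in a subshell so whatever
# it sets cannot leak into the next request, then report the exit status.
DAEMON = r'''
while IFS= read -r request; do
    ( eval "$request" ) </dev/null
    echo "__END__ $?"
done
'''


class BashDaemon:
    """Toy stand-in for an ebuild processor: one bash process, many requests."""

    def __init__(self):
        self.proc = subprocess.Popen(
            ["bash", "-c", DAEMON],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def run(self, request):
        """Send a single-line request; return (output lines, exit status)."""
        self.proc.stdin.write(request + "\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise RuntimeError("daemon exited unexpectedly")
            line = line.rstrip("\n")
            if line.startswith("__END__ "):
                return lines, int(line.split()[1])
            lines.append(line)

    def close(self):
        # Closing stdin ends the read loop on the bash side.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


daemon = BashDaemon()
first = daemon.run('DESCRIPTION="demo"; echo "$DESCRIPTION"')
second = daemon.run('echo "${DESCRIPTION:-unset}"')
daemon.close()
print(first, second)
```

The second request can't see the variable the first one set, yet bash was only started once, which is the whole point of the exercise.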

"Yeah yeah yeah. Give me stats, before I wallop you with a red herring" you're thinking... ok.
Original data sets, and methods are available here. Basic summary: wiped the cache between every run, ran each target 6 times via time emerge ${TARGET} --nospinner --quiet &> /dev/null, averaged the datasets, and compared the resultant run times. These stats were collected with portage 2.0.51-r2 as the base, and ebd patch 20041027-2.

target           vanilla/ebd  real      user      sys
timed @gnome     van          00:59.13  00:38.28  00:18.04
                 ebd          00:43.46  00:28.12  00:13.41
timed @dev-util  van          00:52.59  00:33.21  00:17.54
                 ebd          00:36.16  00:23.46  00:12.10
timed @sys       van          02:29.43  01:35.21  00:49.10
                 ebd          01:50.10  01:11.53  00:36.42
timed @php       van          00:38.27  00:25.34  00:11.12
                 ebd          00:18.04  00:12.15  00:05.13
timed regen      van          34:06.46  21:25.41  11:48.25
                 ebd          22:52.19  14:23.35  07:39.23
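If you'd rather see those as percentages, here's a quick Python sketch that turns the 'real' column into a reduction in wall-clock time. The times are copied from the stats above; the helper names are mine.

```python
def seconds(stamp):
    """Convert a mm:ss.ff stamp (as printed above) to seconds."""
    minutes, secs = stamp.split(":")
    return int(minutes) * 60 + float(secs)


def reduction(vanilla, ebd):
    """Percentage of wall-clock time saved by ebd relative to vanilla."""
    return 100.0 * (seconds(vanilla) - seconds(ebd)) / seconds(vanilla)


# (vanilla, ebd) 'real' times per target, from the stats above.
real_times = {
    "@gnome": ("00:59.13", "00:43.46"),
    "@dev-util": ("00:52.59", "00:36.16"),
    "@sys": ("02:29.43", "01:50.10"),
    "@php": ("00:38.27", "00:18.04"),
    "regen": ("34:06.46", "22:52.19"),
}

for target, (van, ebd) in real_times.items():
    print(f"{target}: {reduction(van, ebd):.1f}% less wall-clock time")
```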

Beyond that... I hated the emerge --metadata algorithm. It waxes the cache, and transfers everything. So I rewrote that in ebd also (although it's not bound to it) to transfer only what has changed. More stats :)

timed emerge --metadata
vanilla/ebd  real      user      sys
van          01:51.37  00:54.13  00:12.22
ebd          00:57.50  00:20.56  00:05.05

So that's a bit quicker.
Note that this is in cvs head, and cvs head is under active development. So... other cruft may get jammed in slowing this down, but the basic speedups are there currently.
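The "transfer only what has changed" idea for emerge --metadata looks roughly like this in Python. It's a sketch with plain dicts standing in for the two caches, and comparing whole entries is my simplification; the real code may well decide on something cheaper, like mtimes.

```python
def sync_metadata(source, target):
    """Bring target in line with source, touching only what differs.

    Both arguments are dicts of ebuild -> metadata entry (assumed layout).
    Returns (entries copied, entries removed).
    """
    copied = removed = 0
    for key, entry in source.items():
        if target.get(key) != entry:
            # New or changed upstream: copy just this entry.
            target[key] = entry
            copied += 1
    for key in list(target):
        if key not in source:
            # Gone upstream: drop the stale entry.
            del target[key]
            removed += 1
    return copied, removed


source = {"app-misc/foo-1.0": {"DEPEND": "a"}, "app-misc/bar-2.0": {"DEPEND": "b"}}
target = {"app-misc/foo-1.0": {"DEPEND": "a"}, "app-misc/old-0.1": {"DEPEND": "z"}}
result = sync_metadata(source, target)
print(result)
```

Compare that with waxing the target and copying every entry, which is what the old algorithm amounts to.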


Posted by Brian Harring | Permalink | Categories: Portage news

Mar 05 (Sat), 2005, 07:53

clarification of pvdabeel and portage

So... I've decided to do a bit of clarification of another gentoo devs posting.
This will be fun...

Q: What is portage-ng?
A: A fuzzy document that leaves you thinking, "gee, portage-ng will rock", but was stillborn from the get go.

Why? Because it's too nebulous. What does it truly say? It's like writing out a Mission Statement. Yeah, you feel all warm and fuzzy afterwards, but anyone who has written a mission statement knows the usefulness of such a document. It's like Bush talking about manned missions to Mars- sounds great, but that minor matter of how was left out.

Continuing on, if you haven't read the link above, it's suggested readers gander through it. After a quick read through, ask yourself how in the hell something that vague, is going to be mapped on to the god awful mess that is portage code?

Portage-ng is basically PR. Path spec, for instance is a great notion until you sit down and think, "now how in the hell are they going to pull that one off?". No one has put forth a public proposal of how to pull this beast off- pvdabeel has blogged/talked about it, but no code- so it's basically akin to Duke Nukem Forever, vaporware.

So why am I picking at portage-ng? Because in the public's eye, it is not known that it is a dead spec, completely irrelevant to portage development. Pvdabeel decided to blog about it in a semi-recent posting. The relevant portion follows-

Package management reloaded
One of the items on my "Things I need to blog about" list is portage-ng. This Gentoo project was created one and a half year ago by Daniel Robbins, Nicholas Jones and myself to redesign Portage. Its goal was to implement functionality that required changes to the design of Portage. The Portage team was joined by various developers such as Jason Stubbs, Marius Mauch and Brian Harring who all continued to contribute to a future redesign by documenting the limitations and feature requests in our bugzilla database. Not only did some users contribute by submitting a design for discussion to the mailinglist for this project, but also the cutting-edge projects such as Gentoo for Mac OS X, BSD and Gentoo/OpenSolaris contributed a lot to the requirement list (For instance "pathspec").

Few questions here.

  • What documentation of limitations have portage devs contributed to?
  • What 'future redesign' is this referencing?
  • Can I see the document for the future redesign of portage? It would definitely have saved the portage devs a lot of discussion over the last few months about restructuring/refactoring portage...
  • How did gentoo/osx, which was released in july of 2004, contribute to the portage-ng requirements, which were posted in December 2003?
  • How has gentoo/opensolaris, which isn't even released, contributed to the portage-ng requirements?
  • Where on the gentoo website is portage-ng spec even available (yes google picks it up, but where is it directly linked to)? It's not listed on any portage pages...

So, being the nuisance I am, I broached this subject on irc. I asked why pvdabeel, who isn't a portage dev, nor even affiliated with our work, was commenting on all of this stuff and putting it up on his blog- yes, it may not be 'official' (someone please define that damned word), but he's a gentoo dev, talking about a gentoo project ([Author: -ng is not a project]) he is involved in, and starts the entry out basically about portage.
The next portion of his blog entry might be relevant-

In my Mcs Thesis at the TINF research lab of the Free University of Brussels I've documented the issues Gentoo, and other Operating System software distributors have encountered. This state of the practice combined with a state of the art in Software Configuration Management has led to the design and implementation of a Meta-Distribution engine capable of installing and maintaining large scale software configurations such as Operating Systems. I will be blogging about my findings and their relation to for example portage-ng, Mac OS X, OpenSolaris... These articles will be technical, published on a weekly basis and often accompanied by a small paper. I'm hoping to release the entire Meta-Distribution engine under the GPL in Q105.

The clarification I got was that the future redesign/requirements are in reference to his thesis/"Metadistribution Engine", this yet unpublished document/body of software.
Oh, well that clears it up- one thing- a portage developer didn't grok that you're apparently (after heated clarification) talking up your thesis/Meta-distribution engine, and it's not really related to portage in any way.

How is the general public supposed to discern that portage and your thesis are two separate things, if someone who is intimately tied to portage development interpreted that blog entry as a portage comment?

Guess I must just be a moron, and misinterpreted your blog entry.

Further clarifications- only place the portage-ng document is referenced at this point, is on a gentoo-ppc page, which mislabels it as a glep. (Pvdabeel, you might want to fix your page- it's not a glep, and please don't link to it, we removed the -ng link from portage long ago because it wasn't relevant).


Posted by Brian Harring | Permalink

Mar 05 (Sat), 2005, 01:38

glep33 reloaded

Posted an updated version of glep33 to the gentoo-dev mailing list earlier tonight. Aside from a massive clean up of spelling/typos, it also tweaks the proposal slightly. Mainly-

  • elibs are delayed in loading in some cases- global scope calls for an elib result in the elib being loaded just prior to pkg_setup. The intention is to completely disallow elibs to be used for metadata modification
  • After hashing it out a bit with Jason Stubbs, adjusted the old eclass-compat ebuild in system profile requirement- it isn't required. The compat. ebuild is only required if users either A) refuse to upgrade to a recent portage version (their choice, and responsibility), or B) they somehow totally destroy/corrupt their installed packages database.
  • General clarification of things people seemed to not follow, or gloss over.

Regarding a damaged vdb (installed pkg database), that's a non-issue. If the vdb is damaged badly enough, portage doesn't know what packages are installed anymore, so the system is effectively unmaintainable.
It's a rarity, but some users do dumb things like nuking /var/db/pkg, or wind up suffering fs corruption. Either way, can't do much there regardless of this proposal.


Posted by Brian Harring | Permalink

Mar 01 (Tue), 2005, 19:47

metapkg Glep

Another portage glep has hit the mailing list, this time from spb (Stephen Bennett)- the email, and the glep.

So... what the hell is it I spose is the logical question. Metapkgs are basically a non-installable node of indirection- a metapkg's depends/rdepends are always processed by portage. How is this different from a normal ebuild? If the ebuild is installed, and --deep wasn't specified, portage doesn't look into the node's dependencies at all- it just assumes they're correct.

Virtuals, as they are implemented now, are essentially installable- portage first hunts through your installed pkg database to find all providers of a certain virtual, thus knowing what virtuals are installed already. The problem with this is that portage has to walk the entire installed pkgs database to know what virtuals are installed already- aside from being slow, this approach really sucks if the virtual isn't installed. Portage has to walk the portdb (available ebuilds) and go looking for a 'provider'. This is slow, and makes implementation/handling of non-installed virtuals a pain in the ass- if for whatever reason that provider winds up blocked later in the depgraph, the resolver is now past that stage, and must bail. Alternatively, it could go back and try to work its way around the blocker, but that still is a pita, and leads to code duplication.

Assuming you're still reading, you can see our existing virtuals implementation... sucks, badly. Metapkgs address this quite nicely- a metapkg is basically a virtual, with all of the providers listed inline. Centralizes the providers into a common point that portage can look at, and have A) a list of pkgs to check to see if they're installed, B) have the order of what is preferable if no providers are installed.
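As a sketch of what that buys the resolver, here's the lookup in Python. It's only an illustration of the idea, not what the glep or portage specifies: the function name is made up, and the provider list is just an example.

```python
def resolve_metapkg(providers, installed):
    """Pick a provider for a metapkg.

    providers is the metapkg's inline list, most preferred first;
    installed is the set of installed packages.
    """
    # A) check the listed providers to see if one is already installed.
    for pkg in providers:
        if pkg in installed:
            return pkg
    # B) nothing installed: fall back on the listed order of preference.
    return providers[0] if providers else None


# Example provider list for an editor virtual, most preferred first.
editor_providers = ["app-editors/nano", "app-editors/vim"]

already_have = resolve_metapkg(editor_providers, {"app-editors/vim"})
fresh_box = resolve_metapkg(editor_providers, set())
print(already_have, fresh_box)
```

No walk of the installed pkgs database and no hunt through the portdb; the list of candidates is right there in the metapkg.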

Sounds great, except you lose a bit of flexibility- with metapkgs, users can no longer just add an ebuild that provides a virtual to their overlay, and have it satisfy the 'virtual'- they would need both the ebuild that can provide, and an overlay metapkg that includes the new ebuild in its depends/rdepends. For that case, you gain speed/sanity for virtual processing, at the loss of a bit of flexibility.

An additional benefit of the metapkg for virtuals approach is that since metapkgs are ebuilds (stripped down ebuilds I'll grant you, but ebuilds none the less to the resolver) developers can now do versioned virtuals- have subsequent versions of a virtual that map several providers together under a common node. Nifty feature, and has been requested on occasion.


Posted by Brian Harring | Permalink | Categories: General Gentoo, Portage news