I’ve just been finalizing – in more ways than one – the 0.6 beta of Steel. It’s been “nearly done” for about a week or so now, but there have been a couple of problems, now hopefully fixed.
The main one was the interesting habit of Steel zapping your desktop when a Ruby debugging session ended. It didn’t happen very often, but occasionally – ZAP!! – where you once had Word, Excel, Outlook, Visual Studio, etc., all up and running, all that remained was a virgin desktop. It took me some time to track this one down, and it’s all down to one of the most undesirable ‘features’ of Windows I’ve ever come across.
It’s to do with terminating a process ‘tree’. To be tidy, I thought it was a good idea to clean up any potential sub-processes that a Ruby program had created while I was debugging it. No problem – just find the Windows KillTree API. Except that there isn’t one. It seems that you have to do it the hard way and figure out which process is a child of the main Ruby process. OK – a little strange that Microsoft hadn’t provided an API for the job when there are over 80,000 of the things to do everything from formatting your hard disk to cleaning the fluff out of the keyboard. And stranger, there didn’t seem to be any documentation on how to do it; normally, there’s an MSDN article on stuff like that. At this point, alarm bells should have been ringing. But a quick search of Google came up with a technique which seemed to work fine. Most of the time.
It turns out that a process does indeed hold a reference to its parent (the process that created it): the parent process ID (PID). However, the parent can exit without killing off its child, leaving the child with a stale parent PID. Worse, far worse, Windows reuses PIDs! So not only can the parent PID of a process refer to a non-existent process (not too bad) – it can also point to a perfectly good process that isn’t its parent!
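To make the failure concrete, here is a minimal sketch in Python of the "follow the parent PIDs" approach, run against a made-up process snapshot rather than the real Windows process list. The PIDs, the process names and the `naive_tree` helper are all hypothetical, chosen only to reproduce the situation described above; this is not Steel's actual code.

```python
from collections import namedtuple

# Hypothetical snapshot rows: every PID and name here is invented for illustration.
Proc = namedtuple("Proc", "pid ppid name")

def naive_tree(procs, root_pid):
    """Return root_pid plus every PID whose chain of parent PIDs leads back to it."""
    children = {}
    for p in procs:
        children.setdefault(p.ppid, []).append(p.pid)
    seen, stack = set(), [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen:  # stale parent links can form cycles, so guard against them
            continue
        seen.add(pid)
        stack.extend(children.get(pid, []))
    return seen

snapshot = [
    # Explorer's real parent had PID 612 and exited long ago.
    Proc(1200, 612, "explorer.exe"),
    Proc(1840, 1200, "winword.exe"),
    Proc(1904, 1200, "outlook.exe"),
    # Much later, Windows hands the free PID 612 to the Ruby process being debugged.
    Proc(612, 3000, "ruby.exe"),
    Proc(2210, 612, "spawned_by_ruby.exe"),
]

names = {p.pid: p.name for p in snapshot}
doomed = sorted(names[pid] for pid in naive_tree(snapshot, 612))
print(doomed)
```

In this snapshot only `spawned_by_ruby.exe` was really started by Ruby, yet the walk also collects Explorer and everything Explorer launched, because Explorer's stale parent PID happens to match Ruby's recycled one.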
In my case, occasionally, just occasionally, the ‘parent’ of the desktop Explorer was my Ruby process. So killing the sub-tree of processes supposedly created by Ruby zapped the desktop Explorer and all its child processes – Outlook, Word, Visual Studio, etc. Baaaah...!!!
On reflection, I can’t see an absolutely safe way of killing a process tree in Windows because of this PID re-use. There just isn’t a cast-iron guarantee that the ‘parent’ of a given process really is its parent. I haven’t found any reference to this ‘feature’ anywhere on the web. It seems that the issue of PID re-use is reasonably widely known, but the basic fact that you can’t build a reliable KillTree on top of parent PIDs isn’t.
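One partial safeguard, sketched below in Python against an invented snapshot, is to compare creation times: a process that started before its supposed parent cannot really be that parent's child. The sketch assumes each snapshot row carries a creation timestamp (on Windows, `GetProcessTimes` reports one); the `created` field, the sample values and the `checked_tree` helper are hypothetical. It narrows the problem rather than solving it, since a PID can still be recycled between taking the snapshot and terminating the process.

```python
from collections import namedtuple

# Hypothetical snapshot rows; 'created' is an invented, monotonically increasing timestamp.
Proc = namedtuple("Proc", "pid ppid created name")

def checked_tree(procs, root_pid):
    """Walk parent-PID links, but skip any 'child' that is older than its 'parent'."""
    by_pid = {p.pid: p for p in procs}
    children = {}
    for p in procs:
        children.setdefault(p.ppid, []).append(p)
    keep, stack = set(), [root_pid]
    while stack:
        pid = stack.pop()
        if pid in keep or pid not in by_pid:
            continue
        keep.add(pid)
        parent = by_pid[pid]
        for child in children.get(pid, []):
            # A genuine child cannot have been created before its parent.
            if child.created >= parent.created:
                stack.append(child.pid)
    return keep

snapshot = [
    Proc(1200, 612, 100, "explorer.exe"),   # parent PID 612 is stale
    Proc(1840, 1200, 500, "winword.exe"),
    Proc(1904, 1200, 510, "outlook.exe"),
    Proc(612, 3000, 900, "ruby.exe"),       # recycled PID 612
    Proc(2210, 612, 905, "spawned_by_ruby.exe"),
]

names = {p.pid: p.name for p in snapshot}
safe_to_kill = sorted(names[pid] for pid in checked_tree(snapshot, 612))
print(safe_to_kill)
```

Here Explorer is rejected because it is older than the Ruby process that now owns PID 612, so only Ruby and the process it actually spawned remain in the list.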