Just read about Ganglia in a magazine and checked it out - I have to say, it looks pretty awesome. We've got over 300 machines total in production, I believe, and it'd let us monitor their key stats (CPU usage, memory usage, disk usage, network usage, etc.) at the individual-machine, cluster, and enterprise level.
This means, for instance, that we could see at a glance what the system load was on an entire cluster of webservers - and even view historical data, such as a graph of the past week for the entire cluster, either averaged across the cluster or on a per-machine basis.
Looks drool-worthy, and looks like it'd make life so much easier. Mmmm mmmm mmmm... good stuff.