System Management: Another Linux Success

Sun easily separates itself from whitebox manufacturers with its management capabilities. That Sun chose an embedded Linux platform as the nerve center of its server only sweetens the pie.

The MPC855T PowerPC, or Service Processor (SP) as we call it throughout this analysis, is in fact an entire embedded Linux computer of its own. As soon as one of the managed power supplies is plugged in, the SP kicks on and boots up. All management of the system is handled through this minicomputer: the serial console, front console, BIOS, fan speeds and even power draw. Because the SP runs independently of the host, it lets us check and manage the status of the system, remotely or locally, even when the machine is off or has crashed. In a worst case scenario, the SP itself can be rebooted from a hard switch in the rear of the machine.

Fortunately, Sun provided us with another block diagram to explain the inner workings of the Service Processor.



Two 10/100 out-of-band Ethernet ports are routed via a dedicated three-port Ethernet switch solely to the Service Processor. Because of this switch, the management Ethernet can be daisy-chained from one server to the next to reduce the total number of cables in a rack. IPMI, SNMP or Sun Control Station can all run over this out-of-band network, or in-band, for server status and maintenance. The out-of-band network address of the Service Processor can be set via the console on the front of the server, or obtained via DHCP (the default). Of course, the traditional serial console is also available for those who need it.
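As a rough illustration of what talking to a management controller over the out-of-band network can look like, the sketch below builds an RMCP/ASF "presence ping", the small UDP probe that IPMI-over-LAN stacks conventionally answer on port 623. Whether this particular SP responds to it is an assumption on our part; the helper names (`build_presence_ping`, `is_presence_pong`, `sp_answers_ping`) are hypothetical and not part of any Sun tool.

```python
import socket
import struct

RMCP_PORT = 623   # conventional UDP port for RMCP / IPMI-over-LAN
ASF_IANA = 4542   # IANA enterprise number used in ASF messages


def build_presence_ping(tag: int = 0) -> bytes:
    """Return a 12-byte RMCP/ASF presence ping."""
    # RMCP header: version 0x06, reserved, sequence 0xFF (no ack), class 0x06 (ASF)
    rmcp_header = struct.pack("!BBBB", 0x06, 0x00, 0xFF, 0x06)
    # ASF body: enterprise number, type 0x80 (ping), tag, reserved, data length 0
    asf_body = struct.pack("!IBBBB", ASF_IANA, 0x80, tag & 0xFF, 0x00, 0x00)
    return rmcp_header + asf_body


def is_presence_pong(data: bytes, tag: int = 0) -> bool:
    """Check whether a reply looks like the matching ASF presence pong."""
    if len(data) < 12:
        return False
    version, _reserved, _seq, msg_class = struct.unpack("!BBBB", data[:4])
    iana, msg_type, reply_tag = struct.unpack("!IBB", data[4:10])
    return (
        version == 0x06
        and msg_class == 0x06
        and iana == ASF_IANA
        and msg_type == 0x40      # 0x40 marks a presence pong
        and reply_tag == tag
    )


def sp_answers_ping(host: str, timeout: float = 2.0) -> bool:
    """Send one ping to the (assumed) SP address and wait for a pong."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_presence_ping(), (host, RMCP_PORT))
            data, _addr = sock.recvfrom(512)
        except OSError:           # timeout or unreachable host
            return False
    return is_presence_pong(data)
```

Calling `sp_answers_ping("192.0.2.10")` with the SP's out-of-band address (the address here is a documentation placeholder) would report whether anything answered; a `False` result only means no pong arrived, not that the SP is down.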

Given this versatility, and since the SP is always running, we can connect to it remotely via SSH and do things like update the BIOS, or just change some BIOS settings. This “Lights Out Management” approach is not a new concept, but Sun clearly has the most thorough implementation that we have touched so far.

The console on the front of the server acts as our basic portal into the Service Processor. From here, we can view the status of individual components, such as fan speeds and temperatures. All of our commands on the console are routed to the SP, which then decides what to do with them. For example, when we tell the machine to turn on via the front console, the Service Processor (which is already on) hands the instruction off to the managed power supply, which then enables power to the host.
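The routing described above can be modeled in a few lines. This is only a toy sketch of the idea that the front console never touches the hardware directly and that the always-on SP decides what each command means; the class names, command strings and sensor fields are all made up for illustration and do not reflect Sun's actual firmware.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedPowerSupply:
    """Stand-in for the managed power supply; only tracks host power."""
    enabled: bool = False

    def enable(self) -> None:
        self.enabled = True

    def disable(self) -> None:
        self.enabled = False


@dataclass
class ServiceProcessor:
    """Always-on controller that interprets every management command."""
    psu: ManagedPowerSupply
    sensors: dict = field(default_factory=lambda: {"fan_rpm": 0, "temp_c": 0})
    history: list = field(default_factory=list)

    def handle(self, command: str) -> str:
        self.history.append(command)   # the SP sees every request
        if command == "power on":
            self.psu.enable()          # hand off to the power supply
            return "host power enabled"
        if command == "power off":
            self.psu.disable()
            return "host power disabled"
        if command == "status":
            state = "on" if self.psu.enabled else "off"
            return (f"host={state} fan_rpm={self.sensors['fan_rpm']} "
                    f"temp_c={self.sensors['temp_c']}")
        return f"unknown command: {command}"


@dataclass
class FrontConsole:
    """The front panel only forwards requests to the SP."""
    sp: ServiceProcessor

    def send(self, command: str) -> str:
        return self.sp.handle(command)


psu = ManagedPowerSupply()
sp = ServiceProcessor(psu)
console = FrontConsole(sp)
print(console.send("power on"))
print(console.send("status"))
```

Keeping the decision logic in one place mirrors the design the review describes: any other entry point, such as SSH or the serial console, could forward to the same handler.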

Overall, we were incredibly impressed with the thoroughness of Sun’s Service Processor. Short of a forgotten BIOS password or a hardware replacement, there is little that cannot be handled through the SP to keep the system up. Most of the tools used inside the SP environment are free and/or open source, which only adds to its desirability, as clever administrators could very easily expand on the SP’s original functionality.

38 Comments


  • tironside - Thursday, February 24, 2005

    I agree with dwnwrd. the lom part of it is not great for remote console etc. the lom that the hp stuff has is pretty slick, with a java / web interface. The other main problem I have with this is it offers only raid 1 unless you buy a rather expensive add on card to do raid 5, kind of a teaser to put 6 drive bays and only let you do raid 1... It's a good start, but sun needs to make some changes before it can go mission critical. (raid and lom enhancements imho) while I like cli stuff, trying to get junior people to do complicated cli stuff is dangerous...

  • dwnwrd - Thursday, February 24, 2005

    I have some V20s and a V40. The service processor is pretty great except if you try to direct the Linux serial console to it then connect to the "serial over LAN" you'll get a flood of "serial8250: too much work for irq4" and a sleepy system.

    http://supportforum.sun.com/hardware/index.php?t=m...
  • Pontius - Thursday, February 24, 2005

    I am curious what they are using when they benchmark the linux kernel compile times. They use the time command which spits out three times - real, user & sys. Are they using the sum of all these? If not, something is wrong. Because I did the same test, on the same 2.6.4 kernel using -j2 on a dual 2.8GHz Nocona system and I got a "real" time of 147s. That doesn't seem right because the Opterons are way faster at compilation. On the other hand, if I take the sum of the 3 times, I get 420s. Any thoughts?

  • jlee123 - Wednesday, February 23, 2005

    RedHat 9, are you joking!!?? This has got to be a mistake, I can't understand how Sun could be shipping a 64-bit server with a 32-bit OS that's reached End Of Life. It's the equivalent of buying a workstation with Windows ME on it. Also, there was never a official port of RH9 to x86-64, the first x86-64 RedHat was RHEL3, the Fedora team later released FC1 x86-64. If Sun doesn't wish to pay licensing, they'd be better off shipping with FC2, FC3 or CentOS, a free rebuild of RHEL. This hardware isn't even going to begin to be utilized till it's running something more modern like RHEL4 x86-64.
  • JustAnAverageGuy - Wednesday, February 23, 2005

    I could think of one use for these. :)

    http://forums.anandtech.com/categories.aspx?catid=...
  • lauwersw - Wednesday, February 23, 2005

    Standard rule for parallel make is to use 2xnumber of processors available. This gives most optimal results to hide disk latencies and seems to be correct in most cases I've seen.
  • phaxmohdem - Wednesday, February 23, 2005

    Call me what you will, but I would like to see some quad/dual Xeon scores to compare to as well (along with price tags for comparison :) )

    And yes, If I were a rich man who knew what to do with that much computing power, I would have a dozen of these babies in my basement! Who needs women anymore once you have 48 Opteron x50 or x52 cpus humming at your disposal. And drool core? Ahhhhhhhhhhh.
  • JustAnAverageGuy - Tuesday, February 22, 2005

    That thing is a BEAST.

    I have no idea what I'd do with a computer like that.
  • MrEMan - Tuesday, February 22, 2005

    Kristopher,

    Thanks for the clarification about the reduced media tag.

    E
  • KristopherKubicki - Tuesday, February 22, 2005

    RyanVM: The system used 850s.

    Kristopher
