[Micronet] server diagnostic software??


[Micronet] server diagnostic software??

David Johnson-2
Does anyone know of any decent server diagnostic software, either free
or that can be purchased?  Ideally something that'll run under Linux
or will boot from PXE.

We've got a dual Xeon Supermicro server that's acting weirdly and we
suspect the CPUs (or possibly the memory) are faulty, but we don't have
a good way of verifying our suspicions.  Ideally we'd like something
that makes a reasonable effort to exercise the CPUs + memory system to
get the temperature up whilst simultaneously verifying that the CPUs
and memory appear to be working correctly.

Thanks,

David.

--
David Johnson, [hidden email], (510) 666-2983
Senior System Administrator, International Computer Science Institute

 
-------------------------------------------------------------------------
The following was automatically added to this message by the list server:

To learn more about Micronet, including how to subscribe to or unsubscribe from its mailing list and how to find out about upcoming meetings, please visit the Micronet Web site:

http://micronet.berkeley.edu

Messages you send to this mailing list are public and world-viewable, and the list's archives can be browsed and searched on the Internet.  This means these messages can be viewed by (among others) your bosses, prospective employers, and people who have known you in the past.

Re: [Micronet] server diagnostic software??

secabeen
In the overclocking community, Prime95 is the standard tool for burn-in
testing: http://www.mersenne.org/freesoft/  There is a Linux binary.

For memory, either Memtest86 or Memtest86+ will do exhaustive testing
of memory, and will PXE boot.

--Ted

On 6/11/2013 11:53 AM, David Johnson wrote:

> Does anyone know of any decent server diagnostic software, either free
> or that can be purchased?  Ideally something that'll run under Linux
> or will boot from PXE.
>
> We've got a dual Xeon Supermicro server that's acting weirdly and we
> suspect the CPU (or possibly memory) are faulty but don't have a good
> way of verifying our suspicions.  Ideally we'd like something that
> made a reasonable effort to exercise the CPUs + memory system to get
> the temperature up whilst simultaneously verifying that the CPU and
> memory appear to be working correctly.
>
> Thanks,
>
> David.
>

 

Re: [Micronet] server diagnostic software??

Graham Patterson
In reply to this post by David Johnson-2
Try the Ultimate Boot CD. It has a lot of general test utilities and is generally useful.

Graham


Re: [Micronet] server diagnostic software??

Andrew M. Stevko
I agree with Ted.
Prime95 is good for running the CPU at 100% utilization.
Memtest86+ is great for detecting bad memory: http://www.memtest.org/
SpinRite 6 is fantastic for detecting/recovering bad drives and controllers: https://www.grc.com/sr/spinrite.htm

Try running both Prime95 and Memtest86+ in separate VMs on the same host to heat up your case and verify memory.




Re: [Micronet] server diagnostic software??

gartim
In reply to this post by David Johnson-2
I'd run Memtest86+ overnight first, then move on to the CPU tests;
otherwise bad memory could show up as false positives on the CPU, etc.
I said 'could'.


Re: [Micronet] server diagnostic software??

Bruce Satow
There's a bootable CD called "Ultimate Boot CD" that you can download from here:
http://www.ultimatebootcd.com/

and a Windows-based one from here:
http://www.ubcd4win.com/

Both are pretty good.  The first one might be the one for you.  It looks and runs like old-school DOS and ANSI, and it has a bunch of diagnostic and burn-in utilities: RAM and CPU tests, hard drive tests, etc.

The second is similar to the Bart's PE CD that came out years back.  I made a customized version many years ago that had disk imaging utilities and anti-virus, and was networkable.  It is also very useful and customizable, and is good for cleaning up infected disks and damaged system software on the Windows platform.

-Bruce





--
  Bruce Satow
  Systems Administrator
  University of California at Berkeley  
  Space Sciences Laboratory
  7 Gauss Way
  Berkeley, California 94720-7450

  Phone: (510) 643-2348
      Cell: (510) 847-1914

Si hoc legere scis nimium eruditionis habes

 

[Micronet] server diagnostic software??

David Johnson-2
In reply to this post by David Johnson-2
As a way of summarizing the most relevant responses to my request for
server diagnostic software, here's what's now on our "x86 diagnostics"
wiki page:

 * stress - a Linux program for stress testing CPUs, memory and disk
 * memtest86+ - a good memory test that works on modern hardware -
                included on the Ultimate Boot CD
 * check the event log and sensor readings with IPMI
    ipmitool sel list
    ipmitool sensor
 * Prime95 - good way of exercising the CPU and heating things up
 * mcelog???
 * Spinrite 6???
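
One way the tools in this list might be chained into a single burn-in pass is sketched below. This is only an illustration, not what is on the wiki page: the worker counts, memory size and duration are assumed values, and the script defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of a burn-in pass built from the tools listed above.
# Assumptions (not from the thread): 4 CPU workers, 2 memory workers of
# 256M each, and a 600-second run. Tune these for the machine under test.
# DRY_RUN defaults to 1, so the script only prints the commands.

CPU_WORKERS=${CPU_WORKERS:-4}
VM_WORKERS=${VM_WORKERS:-2}
VM_BYTES=${VM_BYTES:-256M}
DURATION=${DURATION:-600}
DRY_RUN=${DRY_RUN:-1}

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

burn_in() {
    # Snapshot the IPMI event log and sensors before loading the box.
    run ipmitool sel list
    run ipmitool sensor
    # Load CPUs and memory together to push the temperature up.
    run stress --cpu "$CPU_WORKERS" --vm "$VM_WORKERS" \
        --vm-bytes "$VM_BYTES" --timeout "$DURATION"
    # Check for events (ECC errors, thermal trips) logged under load.
    run ipmitool sel list
}

burn_in
```

Running it with DRY_RUN=0 (as root, with stress and ipmitool installed) would execute the commands for real; comparing the two "sel list" outputs then shows whether anything was logged while the box was under load.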

The bad news is I still can't get the server to crash, but at least I
feel happy that I'm not missing something simple and obvious.

Thanks to everyone who replied,

David.
--
David Johnson, [hidden email], (510) 666-2983
Senior System Administrator, International Computer Science Institute