Posts by admin

Script to gather storage and hardware logs from Dell CloudEdge and PowerEdge Servers

The Log-bot script uses ipmitool, lshw and MegaCli to grab system event logs, controller logs and various hardware information. During execution the script checks for ipmitool (delloem patched) and installs it if it is not available. The script takes 4-8 minutes to complete. It has been tested successfully on various Dell CloudEdge and PowerEdge servers running different Linux distributions.

List of logs it gathers:

Chassis
-Fan RPMs
-FRU information
-Power consumption and power history
-Power supply
-DRAC info
-System event logs
-Temperature information

OS
-Basic Linux command output (df -h, dmesg, fdisk, free, fstab, lspci etc.)
-Network information
-Bus information
-Logical information
-Driver and firmware information
-Tape drive information
-Disk and volume information
-Controller & vendor information
-Size & disk information
-Disk's UUID
-Disk size & serial number
-BMC information
-GPGPU information

Storage
-Controller logs
-Partition information & other storage-related information

Note: Each bundle contains an execs.logs file that lists all the commands the script executes.

How to gather logs:

For previous versions click here

Download the script by clicking on the link, or run:
root@theprojectbot:~# wget http://theprojectbot.com/Program/log-bot_v2.tar.gz

Extract the file using the command:
root@theprojectbot:~# tar -xvf log-bot_v2.tar.gz

Navigate to the right directory using the command:
root@theprojectbot:~# cd bot

Run the script:
root@theprojectbot:~# python...


Issue with iDRAC 7 display when installing openSUSE 12.3

Issue: When trying to install openSUSE 12.3 via iDRAC7, the virtual console screen becomes distorted, as shown in the picture below.

Resolution: Boot the server from the DVD. On the installation menu, set the video mode to 1024×768, and in the boot options set the kernel parameter usbcore.autosuspend=-1, as shown below.

Explanation: Changing the video mode to any other resolution fixes the display problem, but it also disables the keyboard and mouse. This happens because some of the power management features in the LFC/DRAC put the DRAC keyboard into a temporary sleep when it is not being used. In this specific installation, for some reason the keyboard and mouse never came back up after being sent to the temporary sleep. The usbcore.autosuspend=-1 parameter forces them to stay active all the time and fixes the...


ipmitool Cheatsheet and Configuring DRAC from ipmitool

1) List of helpful ipmitool commands:

Check BMC firmware revision:
#ipmitool -I open bmc info | grep -A3 "Firmware Revision"
Check SEL log:
#ipmitool sel
List SEL log:
#ipmitool sel list
Check which node you are in [for Dell CloudEdge]:
#ipmitool raw 0x34 0x11
Reset BMC/DRAC to default:
#ipmitool mc reset cold

2) Configure DRAC from ipmitool:

Set BMC/DRAC to a static IP:
#ipmitool lan set 1 ipsrc static
Set BMC/DRAC IP address:
#ipmitool lan set 1 ipaddr <ip add of bmc>
Set BMC/DRAC subnet mask:
#ipmitool lan set 1 netmask <netmask addr>
Set BMC/DRAC default gateway:
#ipmitool lan set 1 defgw ipaddr <ip add>
Change the NIC settings to dedicated:
#ipmitool raw 0x30 0x24 2
Change the NIC settings to shared:
#ipmitool raw 0x30 0x24 0
Check the NIC settings:
#ipmitool raw 0x30 0x25
[Output of 00 means shared and 02 means dedicated]
Restart the BMC/DRAC:
#ipmitool mc reset...
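As a rough illustration of how the network commands in section 2 fit together, the Python sketch below only builds the command lines for a static configuration and decodes the NIC-mode byte; it does not run ipmitool. The function names and the example addresses are hypothetical; channel 1 and the 00/02 meanings are taken from the commands listed above.

```python
def build_lan_commands(ip, netmask, gateway, channel=1):
    """Return the static-IP ipmitool commands as argument lists (nothing is executed)."""
    base = ["ipmitool", "lan", "set", str(channel)]
    return [
        base + ["ipsrc", "static"],
        base + ["ipaddr", ip],
        base + ["netmask", netmask],
        base + ["defgw", "ipaddr", gateway],
    ]


def decode_nic_mode(raw_output):
    """Map the byte printed by 'ipmitool raw 0x30 0x25' to a label."""
    modes = {"00": "shared", "02": "dedicated"}
    return modes.get(raw_output.strip(), "unknown")


if __name__ == "__main__":
    # Example addresses are placeholders from the documentation range.
    for cmd in build_lan_commands("192.0.2.10", "255.255.255.0", "192.0.2.1"):
        print(" ".join(cmd))
    print(decode_nic_mode(" 02\n"))
```

Printing the commands first makes it easy to review them before pasting them into a root shell on the server.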


Dell iDRAC CLPSession error

Symptoms:
-The DRAC web interface is not working.
-When trying to ssh to the DRAC, it returns: Create CLPsession instance error (1) ; Curl error 7

Solution: Do a soft reset on the DRAC.

If you have racadm installed, run the following command:
racadm racreset soft

If you have ipmitool installed, run the command:
ipmitool mc reset cold

Looking for ipmitool rpm, click...
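The choice between the two tools can be sketched in Python as follows. This is only an illustration under the assumption that both tools, when installed, are on the PATH; the function name pick_reset_command is hypothetical, and the sketch prints the command rather than running it.

```python
import shutil


def pick_reset_command(which=shutil.which):
    """Return the DRAC reset command for whichever tool is installed, or None."""
    if which("racadm"):
        return ["racadm", "racreset", "soft"]
    if which("ipmitool"):
        return ["ipmitool", "mc", "reset", "cold"]
    return None


if __name__ == "__main__":
    cmd = pick_reset_command()
    print(" ".join(cmd) if cmd else "neither racadm nor ipmitool found")
```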


What is a punctured RAID array?

What is a punctured stripe or a punctured RAID array, and how do you recover from it?

To understand the concept of a punctured stripe, we first need to understand what exactly a RAID array is and how information is stored on the disks in a RAID configuration. In this post I use RAID5 (with three drives) as an example and will try to explain how a puncture happens and how to get rid of it.

What is RAID5: In RAID5 the data is striped across all the member disks together with parity information. If one of the drives goes bad, its data can be rebuilt from the data and parity on the remaining drives. More information on parity can be found at: http://www.dataclinic.co.uk/raid-parity-xor/ But if two drives go bad, there is no way to rebuild the data back to its original state. On most LSI*-based controllers, whenever one disk in a container (Virtual Disk) fails, the controller marks that virtual disk as degraded.

What causes a puncture? Several things can cause a puncture, but it usually starts with a failed drive. For instance, John is a busy system admin whose job is to monitor a Dell PE 1950 with a PERC 5/i controller installed [RAID5 with three disks]. He did not bother to do anything unless an amber light reported an error on the front LCD panel. One ugly Monday he came to work and saw the drive in slot 0 blinking amber. He called support and ordered a new drive. Once he received the new drive, he yanked the bad hard drive out and put the new drive in. As soon as he put the new drive in, it started rebuilding, and in an hour or so all the drives were green again.

What did John do wrong? Most of us will say he didn't do anything wrong. So let's move forward.

After a couple of days John found out that the drive in slot 1 was now blinking amber. Oh! Bummer. He called support again, got another drive and repeated the same procedure.

What did John do wrong this time?
Hmmm, let's say nothing, because it is possible for multiple drives to fail within a week of each other. No big deal.  ...
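The RAID5 rebuild described above can be illustrated with a small Python sketch of XOR parity on a single stripe. It is a simplification under stated assumptions: real controllers rotate parity across the drives and work on much larger blocks, and the block contents and names here are made up for illustration.

```python
def xor_blocks(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


# One stripe on a three-drive RAID5: two data blocks plus one parity block.
data_0 = b"JOHN"
data_1 = b"DELL"
parity = xor_blocks(data_0, data_1)

# The drive holding data_0 fails: rebuild its block from the two survivors.
rebuilt_0 = xor_blocks(data_1, parity)
assert rebuilt_0 == data_0

# With two of the three blocks missing, a single remaining block is not
# enough to recover the others, which is why a second failure is fatal.
```

The same XOR step works no matter which one of the three blocks is lost, because each block is the XOR of the other two.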
