Product detail -- :HP ProLiant DL380p Gen8 Server. Includes features, specifications and warranty information, as well as links to technical support and product data sheets.
Product detail -- 733739-001:HP ProLiant DL360p Gen8 E5-2650v2 2P 32GB-R P420i/2GB FBWC 460W PS Perf Svr Includes features, specifications and warranty information.
Add me to the list, same problem here with one of two DL560 G8's purchased spring 2013. One has been fine, the other has been resetting itself with the message.
HP SIM reports a drive fault on a hard drive managed by an HP Smart Array controller
The fault LED is illuminated on a hard drive managed by an HP Smart Array controller
System completes POST but drive fails
No hard drives are recognized
Hard drive is not recognized by the server
A new drive is not recognized
Data is inaccessible
Server response time is slower than usual

ADU version 8.0 through 8.28 error messages
Array Accelerator: The batteries were hot-removed.
Array Accelerator: The cache is disabled because the batteries are low on the redundant controller.
Array Accelerator: The cache for this controller is temporarily disabled since a snapshot is in progress....
Array Accelerator: The cache is disabled because the batteries are low.
Array Accelerator: This controller has been set up to be a part of a redundant pair of controllers...
Array Accelerator: The cache is disabled because the restore operation from flash memory failed.
Array Accelerator: The cache is disabled because the charge on the flash-memory capacitor is too low.
Array Accelerator: The cache is disabled because a capacitor has failed to charge to an acceptable level.
Array Accelerator: The cache is disabled because the backup operation to flash memory failed.
Array Accelerator: The cache is disabled because there are no capacitors attached to the cache module.
Array Accelerator: The cache is disabled because a flash memory or capacitor hardware failure has been detected.
Array Accelerator: This controller's firmware is not backward compatible with the cache module revision.
Array status: The array has failed.
Array status: The array currently has a drive erase operation queued, running, stopped or completed on a logical or physical drive.
Array Status: The array has a spare drive assigned which is smaller than the smallest data drive in the array…
Controller State: The array controller contains a volume that was created with a different version of controller firmware…
Controller State: The array controller contains more logical drives than are supported in the current configuration…
Controller State: The array controller contains one or more logical drives with a RAID level that is not supported in the current configuration…
Controller State: The array controller contains redundant connections to one or more physical drives that are not supported in the current configuration…
Controller State: The array controller is connected to an expander card or an external enclosure…
Controller State: The array controller is operating without a memory board…
Controller State: The array controller is operating without a memory board and has a bad volume position…
Controller State: The array controller is operating without a memory board and contains more logical drives than are supported in the current configuration…
Controller State: The array controller is operating without a memory board and has an invalid physical drive connection…
Controller State: The array controller has a configuration that requires more physical drives than are currently supported…
Controller State: The array controller has an unknown disabled configuration status message…
Controller State: The array controller has an unsupported configuration…
Controller State: The controller cannot be configured. CACHE STATUS PROBLEM DETECTED:...
Controller State: A logical drive is configured with a newer version of the Array Configuration tools than is currently running.
Controller State: The HBA does not have an access ID...
Drive Offline due to Erase Operation: The logical drive is offline from having an erase in progress.
Drive Offline due to Erase Operation: The physical drive is currently queued for erase.
Drive Offline due to Erase Operation: The physical drive is offline and currently being erased.
Drive Offline due to Erase Operation: The physical drive is offline and the erase process has been failed...
Drive Offline due to Erase Operation: The physical drive is offline and the erase process has completed...
Drive Offline due to Erase Operation: The physical drive is offline from having an erase in progress.
Failed Array Controller: code: : Restart the server and run a diagnostic report…
Logical drive state: A logical drive is configured with a newer version of Storage/Config Mod than is currently running.
Logical drive state: Background parity initialization is currently queued...
Logical drive state: The current array controller is performing capacity expansion,...
Logical drive state: The logical drive is disabled from a SCSI ID conflict.
Logical drive state: The logical drive is not configured.
Logical drive state: The logical drive is not yet available.
Logical drive state: The logical drive is offline from being ejected.
Logical drive state: The logical drive is queued for erase.
Logical drive state: The logical drive is queued for expansion.
Logical drive state: The logical drive is queued for rebuilding.
Logical drive state: This logical drive has a high physical drive count...
NVRAM Error: Board ID could not be read...
NVRAM Error: Bootstrap NVRAM image failed checksum test,...
NVRAM Error: Bootstrap NVRAM image failed checksum test...
Physical Drive State: The data on the physical drive is being rebuilt.
Physical Drive State: Predictive failure. This physical drive is predicted to fail soon.
Physical Drive State: This drive is not supported for configuration...
Physical Drive State: SATA drives are not supported for configuration and should be disconnected from this controller.
Physical Drive State: Single-ported drives are not supported for configuration and should be disconnected from this controller.
Physical Drive State: This drive contains unsupported configuration data...
Physical Drive State: This drive is not supported for configuration by this version of controller firmware.
Physical Drive State: This drive is smaller in size than the drive it is replacing...
Physical Drive State: This drive is unrecognizable...
Physical Drive State: This physical drive is part of a logical drive that is not supported by the current configuration...
Redundancy State: This controller has been setup to be part of a redundant pair of controllers....
Redundancy State: This controller has been setup to be part of a redundant pair of controllers....
Redundant Path Failure: Multi-domain path failure
Redundant Path Failure: The logical drive is degraded due to the loss of a redundant path.
Redundant Path Failure: The physical drive is degraded due to the loss of a redundant path.
Redundant Path Failure: Warning: Redundant I/O modules of this storage box...
Smart SSD State: SSD has less than 2% of usage remaining before wearout.
Smart SSD State: SSD has less than 2% of usage remaining before wearout…
Smart SSD State: SSD has less than 5% of usage remaining before wearout.
Smart SSD State: SSD has less than 5% of usage remaining before wearout…
Smart SSD State: SSD has less than an estimated 56 days before it reaches the maximum usage limit for writes (wearout)...
Smart SSD State: SSD has reached the maximum rated usage limit for writes (wearout) and should be replaced immediately.
Smart SSD State: The SmartSSD Wear Gauge log is full…
Storage Enclosure: The enclosure is reporting a critical temperature status condition...
Storage Enclosure: One or more fans have failed.
Storage Enclosure: Warning: The enclosure is reporting a high temperature status.
Storage Enclosure: Warning: One or more redundant power supplies in this enclosure has failed or is not plugged in correctly.
HP ProLiant Servers Troubleshooting Guide

Abstract
This document describes common procedures and solutions for the many levels of troubleshooting for HP ProLiant G7 and earlier servers. This document is intended for the person who installs, administers, and troubleshoots servers or server blades. HP assumes you are qualified in the servicing of computer equipment and trained in recognizing hazards in products with hazardous energy levels.

Part Number: 375445-404
August 2013
Edition: 13

© Copyright 2004, 2013 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Microsoft®, Windows®, and Windows Server® are U.S. registered trademarks of Microsoft Corporation. Intel® and Pentium® are trademarks of Intel Corporation in the U.S. and other countries. AMD is a trademark of Advanced Micro Devices, Inc.

Contents

Introduction
  What's new
  Revision history
    375445-403 (October 2011)
    375445-402 (April 2011)
    375445-401 (January 2011)
Getting started
  HP ProLiant 100 Series Server troubleshooting information
  How to use this guide
  Pre-diagnostic steps
    Important safety information
    Symptom information
    Prepare the server for diagnosis
Common problem resolution
  Loose connections
  Service notifications
  Firmware updates
  DIMM handling guidelines
  Hard drive guidelines
    SAS and SATA hard drive guidelines
    SCSI hard drive guidelines
  Hard drive LED combinations
    Hot-plug SCSI hard drive LED combinations
    SAS and SATA hard drive LED combinations
  Server updates with an HP Trusted Platform Module and BitLocker™ enabled
Diagnostic flowcharts
  Troubleshooting flowcharts
    Troubleshooting flowchart reference websites
    Start diagnosis flowchart
    General diagnosis flowchart
    Power-on problems flowchart
    POST problems flowchart
    Operating system boot problems flowchart
    Server fault indications flowchart
Hardware problems
  Procedures for all ProLiant servers
  Power problems
    Power source problems
    Power supply problems
    System open circuits and short circuits
    UPS problems
  General hardware problems
    Problems with new hardware
    Unknown problem
    Third-party device problems
  Internal system problems
    Battery pack problems
    CD-ROM and DVD drive problems
    Diskette drive problems
    Drive problems (hard drives and solid state drives)
    SD card problems
    USB drive key problems
    Fan problems
    HP Trusted Platform Module problems
    Memory problems
    PPM problems
    Processor problems
    Tape drive problems
    Graphics and video adapter problems
  External device problems
    Video problems
    Mouse and keyboard problems
    Audio problems
    Printer problems
    Cable problems
    Local I/O cable problems
    Modem problems
    Network controller problems
    Expansion board problems
Software problems
  Operating system problems and resolutions
    Operating system problems
    Operating system updates
    Restoring to a backed-up version
    When to Reconfigure or Reload Software
    Linux operating systems
  Application software problems
    Software locks up
    Errors occur after a software setting is changed
    Errors occur after the system software is changed
    Errors occur after an application is installed
  ROM problems
    Remote ROM flash problems
    Boot problems
Software tools and solutions
  Configuration tools
    SmartStart software
    HP ROM-Based Setup Utility
    Array Configuration Utility
    Option ROM Configuration for Arrays
    Re-entering the serial number and product ID
  Management tools
    Automatic Server Recovery
    ROMPaq utility
    iLO and iLO 2 technology
    iLO 3 technology
    Erase Utility
    HP Systems Insight Manager
    Redundant ROM support
    USB support
  Diagnostic tools
    HP Insight Diagnostics
    HP Insight Diagnostics survey functionality
    Integrated Management Log
    Array diagnostic software
  Remote support and analysis tools
    HP Insight Remote Support software
  Keeping the system current
    Drivers
    Version control
    Operating system version support
    ProLiant Support Packs
    Smart Update Firmware DVD
    HP Service Pack for ProLiant
    System Online ROM flash component utility
    Subscriber's choice
    Care Pack
  Firmware maintenance
    Types of ROM
    Verifying firmware versions
    Updating firmware
    Unsupported processor stepping with Intel® processors
    Unsupported processor stepping with AMD processors
HP resources for troubleshooting
  Online resources
    HP Technical Support website
    HP Guided Troubleshooting website
    Server documentation
    White papers
    Service notifications, advisories, and notices
    Subscription services
    HP Care Pack Services
  Product information resources
    Additional product information
    Registering the server
    Overview of server features and installation instructions
    Key features, option part numbers
    Server and option specifications, symbols, installation warnings, and notices
    Teardown procedures, part numbers, specifications
    Technical topics
  Product installation resources
    Switch settings, LED functions, drive, memory, expansion board and processor installation instructions, and board layouts
    External cabling information
    Power capacity
  Product configuration resources
    Device driver information
    DDR3 memory configuration
    Operating System Version Support
    Operating system installation and configuration information (for factory-installed operating systems)
    Server configuration information
    Installation and configuration information for the server setup software
    Software installation and configuration of the server
    iLO information
    Management of the server
    Installation and configuration information for the server management system
    Fault tolerance, security, care and maintenance, configuration and setup
Error messages
  ADU error messages
    Introduction to ADU error messages
    ADU version 8.0 through 8.28 error messages
  POST error messages and beep codes
    Introduction to POST error messages
    Non-numeric messages or beeps only
    100 Series
    200 Series
    300 Series
    400 Series
    600 Series
    1100 Series
    1600 Series
    1700 Series
    1800 Series
  Event list error messages
    Introduction to event list error messages
    A CPU Power Module (System Board, Socket X)...
    ASR Lockup Detected: Cause
    Automatic operating system shutdown initiated due to fan failure
    Automatic Operating System Shutdown Initiated Due to Overheat Condition...
    Blue Screen Trap: Cause [NT]...
    Corrected Memory Error Threshold Passed (Slot X, Memory Module Y)...
    EISA Expansion Bus Master Timeout (Slot X)...
    PCI Bus Error (Slot X, Bus Y, Device Z, Function X)
    Processor Correctable Error Threshold Passed (Slot X, Socket Y)
    Processor Uncorrectable Internal Error (Slot X, Socket Y)
    Real-Time Clock Battery Failing
    System AC Power Overload (Power Supply X)
    System AC Power Problem (Power Supply X)
    System Fan Failure (Fan X, Location)
    System Fans Not Redundant
    System Overheating (Zone X, Location)
    System Power Supplies Not Redundant
    System Power Supply Failure (Power Supply X)
    Unrecoverable Host Bus Data Parity Error...
    Uncorrectable Memory Error (Slot X, Memory Module Y)
  HP BladeSystem p-Class infrastructure error codes
175 Server blade management module error codes .................................................................................. 175 Power management module error codes ........................................................................................... 178 Port 85 codes and iLO messages ............................................................................................................... 179 Troubleshooting the system using port 85 codes ................................................................................ 179 Processor-related port 85 codes ....................................................................................................... 180 Memory-related port 85 codes......................................................................................................... 181 Contents 6 Expansion board-related port 85 codes ............................................................................................ 182 Miscellaneous port 85 codes ........................................................................................................... 182 Windows® Event Log processor error codes ............................................................................................... 183 Message ID: 4137 ......................................................................................................................... 183 Message ID: 4140 ......................................................................................................................... 184 Message ID: 4141 ......................................................................................................................... 184 Message ID: 4169 ......................................................................................................................... 184 Message ID: 4190 ......................................................................................................................... 
184 Contacting HP .......................................................................................................................... 185 Contacting HP technical support or an authorized reseller ............................................................................ 185 Customer self repair ................................................................................................................................. 185 Server information you need...................................................................................................................... 185 Operating system information you need ..................................................................................................... 186 Microsoft® operating systems .......................................................................................................... 186 Linux operating systems .................................................................................................................. 187 Novell NetWare operating systems .................................................................................................. 188 SCO operating systems................................................................................................................... 188 IBM OS/2 operating systems .......................................................................................................... 189 Oracle Solaris operating systems ..................................................................................................... 190 Acronyms and abbreviations ...................................................................................................... 191 Index ....................................................................................................................................... 
195

Introduction

What's new

The thirteenth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-404, includes the following additions and updates:
• Added information about HP Service Pack for ProLiant (on page 78). SPP replaces older methods of updating firmware and system software on many of the servers supported by this document.
• Added a reference to the product page for HP Smart Update Manager (on page 79).
• The following topics were updated to provide information about the SPP and to explain how it replaces the HP Smart Update Firmware DVD and PSP:
  o Firmware updates (on page 18)
  o Drivers (on page 77)
  o ProLiant Support Packs (on page 78)
  o Smart Update Firmware DVD (on page 78)
  o Updating firmware (on page 82)
  o HP Smart Update Manager deployment (on page 82)
  o Offline deployment (if SPP is not supported) (on page 83)
  o Online deployment (if SPP is not supported) (on page 83)

Revision history

375445-403 (October 2011)

The twelfth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-403, included the following additions and updates:
• Updated Power source problems (on page 37)
• Updated HP SIM reports a drive fault on a hard drive managed by an HP Smart Array controller (on page 44)
• Updated The fault LED is illuminated on a hard drive managed by an HP Smart Array controller (on page 44)
• Updated No hard drives are recognized (on page 45)
• Updated General graphics and video adapter problems are occurring (on page 53)
• Updated Screen is blank for more than 60 seconds after you power up the server (on page 53)
• Updated During installation of Oracle Solaris, the system locks up or a panic error occurs (on page 61)
• Updated When to Reconfigure or Reload Software (on page 62)
• Updated Array Configuration Utility
• Updated Option ROM Configuration for Arrays (on page 71)
• Updated Automatic Server Recovery (on page 72)
• Updated the following section in HP Smart Update Manager deployment (on page 82):
  o Online deployment ("Online deployment (if SPP is not supported)" on page 83)
• Added or updated multiple messages in Error messages (on page 91):
  o ADU version 8.0 through 8.28 error messages (on page 111)
  o POST error messages and beep codes (on page 121):
    - Non-numeric messages or beeps only (on page 121)
    - 200 series (on page 135)
    - 1700 series (on page 146)
    - 1800 series (on page 171)
• Updated Linux operating systems (on page 187)
• Updated Oracle Solaris operating systems

375445-402 (April 2011)

The eleventh edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-402, included the following additions and updates:
• Updated the HP ProLiant 100 Series Server troubleshooting information (on page 11) section to provide troubleshooting information for the HP ProLiant ML110 G7 Server and HP ProLiant DL120 G7 Server.
• Updated the following sections to include the HP Smart Update Firmware DVD:
  o Firmware updates (on page 18)
  o HP Smart Update Manager deployment (on page 82)
  o ROM Update Utility

375445-401 (January 2011)

The tenth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-401, included the following additions and updates:
• Added a new section to Getting started: Performing processor procedures in the troubleshooting process (on page 16)
• Updated Breaking the server down to the minimum hardware configuration (on page 16).
• Updated the introduction and sections in Hardware problems (on page 37):
  o Unknown problem (on page 40)
  o Processor problems (on page 51)
• Updated multiple messages in Error messages (on page 91):
  o POST error messages and beep codes (on page 121)
  o Event List Error Messages (on page 171)
  o Port 85 codes and iLO messages (on page 179)
  o Windows® Event Log processor error codes (on page 183)
  o Insight Diagnostics processor error codes

Getting started

HP ProLiant 100 Series Server troubleshooting information

Use this guide for troubleshooting information on the HP ProLiant ML110 G7 Server and the HP ProLiant DL120 G7 Server. For troubleshooting information on HP ProLiant 100 Series Servers other than the HP ProLiant ML110 G7 Server and HP ProLiant DL120 G7 Server, see the respective server user guides.

How to use this guide

NOTE: For common troubleshooting procedures, the term "server" is used to mean servers and server blades.

This guide provides common procedures and solutions for the many levels of troubleshooting a ProLiant server, from the most basic connector issues to complex software configuration problems.

To understand the sections of this guide and to identify the best starting point for a problem, use the following descriptions:
• Common problem resolution (on page 18)
  Many server problems are caused by loose connections, outdated firmware, and other issues. Use this section to perform basic troubleshooting for common problems.
• Diagnostic flowcharts (on page 22)
  When a server exhibits symptoms that do not immediately pinpoint the problem, use this section to begin troubleshooting. The section contains a series of flowcharts that provide a common troubleshooting process for ProLiant servers. The flowcharts identify a diagnostic tool or a process to help solve the problem.
• Hardware problems (on page 37)
  When the symptoms point to a specific component, use this section to find solutions for problems with power, general components, system boards, system open circuits and short circuits, and external devices.
• Software problems (on page 60)
  When you have a known, specific software problem, use this section to identify a solution to the problem.
• Software tools and solutions (on page 67)
  Use this section as a reference for software tools and utilities.
• HP resources for troubleshooting (on page 86)
  When additional information becomes necessary, use this section to identify websites and supplemental documents that contain troubleshooting information.
• Error messages (on page 91)
  Use this section for a complete list of the following messages:
  o ADU error messages (on page 91)
  o POST error messages and beep codes (on page 121)
  o Event list error messages (on page 171)
  o HP BladeSystem infrastructure error codes ("HP BladeSystem p-Class infrastructure error codes" on page 175)
  o Port 85 codes and iLO messages (on page 179)

Pre-diagnostic steps

WARNING: To avoid potential problems, ALWAYS read the warnings and cautionary information in the server documentation before removing, replacing, reseating, or modifying system components.

IMPORTANT: This guide provides information for multiple servers. Some information may not apply to the server you are troubleshooting. Refer to the server documentation for information on procedures, hardware options, software tools, and operating systems supported by the server.

1. Review the important safety information (on page 12).
2. Gather symptom information (on page 15).
3. Prepare the server for diagnosis (on page 15).
4. Use the Start diagnosis flowchart (on page 24) to begin the diagnostic process.

Important safety information

Familiarize yourself with the safety information in the following sections before troubleshooting the server.
Important safety information

Before servicing this product, read the Important Safety Information document provided with the server.

Symbols on equipment

The following symbols may be placed on equipment to indicate the presence of potentially hazardous conditions.

This symbol indicates the presence of hazardous energy circuits or electric shock hazards. Refer all servicing to qualified personnel.
WARNING: To reduce the risk of injury from electric shock hazards, do not open this enclosure. Refer all maintenance, upgrades, and servicing to qualified personnel.

This symbol indicates the presence of electric shock hazards. The area contains no user or field serviceable parts. Do not open for any reason.
WARNING: To reduce the risk of injury from electric shock hazards, do not open this enclosure.

This symbol on an RJ-45 receptacle indicates a network interface connection.
WARNING: To reduce the risk of electric shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.

This symbol indicates the presence of a hot surface or hot component. If this surface is contacted, the potential for injury exists.
WARNING: To reduce the risk of injury from a hot component, allow the surface to cool before touching.

This symbol, marked with the weight in kg and lb, indicates that the component exceeds the recommended weight for one individual to handle safely.
WARNING: To reduce the risk of personal injury or damage to the equipment, observe local occupational health and safety requirements and guidelines for manual material handling.

These symbols, on power supplies or systems, indicate that the equipment is supplied by multiple sources of power.
WARNING: To reduce the risk of injury from electric shock, remove all power cords to completely disconnect power from the system.

Warnings and cautions

WARNING: Only authorized technicians trained by HP should attempt to repair this equipment. All troubleshooting and repair procedures are detailed to allow only subassembly/module-level repair. Because of the complexity of the individual boards and subassemblies, no one should attempt to make repairs at the component level or to make modifications to any printed wiring board. Improper repairs can create a safety hazard.

WARNING: To reduce the risk of personal injury or damage to the equipment, be sure that:
• The leveling feet are extended to the floor.
• The full weight of the rack rests on the leveling feet.
• The stabilizing feet are attached to the rack if it is a single-rack installation.
• The racks are coupled together in multiple-rack installations.
• Only one component is extended at a time. A rack may become unstable if more than one component is extended for any reason.

WARNING: To reduce the risk of electric shock or damage to the equipment:
• Do not disable the power cord grounding plug. The grounding plug is an important safety feature.
• Plug the power cord into a grounded (earthed) electrical outlet that is easily accessible at all times.
• Unplug the power cord from the power supply to disconnect power to the equipment.
• Do not route the power cord where it can be walked on or pinched by items placed against it. Pay particular attention to the plug, electrical outlet, and the point where the cord extends from the server.

WARNING: To reduce the risk of personal injury or damage to the equipment:
• Observe local occupational health and safety requirements and guidelines for manual handling.
• Obtain adequate assistance to lift and stabilize the chassis during installation or removal.
• The server is unstable when not fastened to the rails.
• When mounting the server in a rack, remove the power supplies and any other removable module to reduce the overall weight of the product.
CAUTION: To properly ventilate the system, you must provide at least 7.6 cm (3.0 in) of clearance at the front and back of the server.

CAUTION: The server is designed to be electrically grounded (earthed). To ensure proper operation, plug the AC power cord into a properly grounded AC outlet only.

Electrostatic discharge

Preventing electrostatic discharge

To prevent damaging the system, be aware of the precautions you need to follow when setting up the system or handling parts. A discharge of static electricity from a finger or other conductor may damage system boards or other static-sensitive devices. This type of damage may reduce the life expectancy of the device.

To prevent electrostatic damage:
• Avoid hand contact by transporting and storing products in static-safe containers.
• Keep electrostatic-sensitive parts in their containers until they arrive at static-free workstations.
• Place parts on a grounded surface before removing them from their containers.
• Avoid touching pins, leads, or circuitry.
• Always be properly grounded when touching a static-sensitive component or assembly.

Grounding methods to prevent electrostatic discharge

Several methods are used for grounding. Use one or more of the following methods when handling or installing electrostatic-sensitive parts:
• Use a wrist strap connected by a ground cord to a grounded workstation or computer chassis. Wrist straps are flexible straps with a minimum of 1 megohm ±10 percent resistance in the ground cords. To provide proper ground, wear the strap snug against the skin.
• Use heel straps, toe straps, or boot straps at standing workstations. Wear the straps on both feet when standing on conductive floors or dissipating floor mats.
• Use conductive field service tools.
• Use a portable field service kit with a folding static-dissipating work mat.

If you do not have any of the suggested equipment for proper grounding, have an authorized reseller install the part.
For more information on static electricity or assistance with product installation, contact an authorized reseller.

Symptom information

Before troubleshooting a server problem, collect the following information:
• What events preceded the failure? After which steps does the problem occur?
• What has been changed since the time the server was working?
• Did you recently add or remove hardware or software? If so, did you remember to change the appropriate settings in the server setup utility, if necessary?
• How long has the server exhibited problem symptoms?
• If the problem occurs randomly, what is the duration or frequency?

To answer these questions, the following information may be useful:
• Run HP Insight Diagnostics (on page 75) and use the survey page to view the current configuration or to compare it to previous configurations.
• Refer to your hardware and software records for information.
• Refer to server LEDs and their statuses.

Prepare the server for diagnosis

1. Be sure the server is in the proper operating environment with adequate power, air conditioning, and humidity control. For required environmental conditions, see the server documentation (on page 86).
2. Record any error messages displayed by the system.
3. Remove all diskettes, CD-ROMs, DVD-ROMs, and USB drive keys.
4. Power down the server and peripheral devices if you will be diagnosing the server offline. If possible, always perform an orderly shutdown:
   a. Exit any applications.
   b. Exit the operating system.
   c. Power down the server.
5. Disconnect any peripheral devices not required for testing (any devices not necessary to power up the server). Do not disconnect the printer if you want to use it to print error messages.
6. Collect all tools and utilities, such as a Torx screwdriver, loopback adapters, ESD wrist strap, and software utilities, necessary to troubleshoot the problem.
   o You must have the appropriate Health Drivers and Management Agents installed on the server. To verify the server configuration, connect to the System Management Homepage (http://h18013.www1.hp.com/products/servers/management/agents/index.html) and select Version Control Agent. The VCA gives you a list of names and versions of all installed HP drivers, Management Agents, and utilities, and whether they are up-to-date.
   o HP recommends you have access to the server documentation (on page 86) for server-specific information.
   o HP recommends you have access to the SmartStart CD for value-added software and drivers required during the troubleshooting process. Download the current version of SmartStart from the HP website (http://www.hp.com/servers/smartstart).

Performing processor procedures in the troubleshooting process

Because this document supports multiple generations of HP ProLiant server models, it also covers processes that include troubleshooting of various models and types of processors. Before performing any troubleshooting steps that involve processors, review the following guidelines:
• Be sure that only authorized personnel perform the troubleshooting steps that involve installation, removal, or replacement of a processor.
• Always locate the documentation for your processor model before performing any steps that require installing, removing, or replacing a processor. If you cannot locate the hard copy of the instructions, locate your server user guide or maintenance and service guide on the HP website (http://www.hp.com/support/manuals).
• Some processor models require the use of a processor installation tool, and specific steps are documented to ensure that you do not damage the processor or processor socket on the system board.
• For server models that have pins inside the processor socket, remember that THE PINS ON THE SYSTEM BOARD ARE VERY FRAGILE AND EASILY DAMAGED. If you damage the socket, you must replace the system board.
• Depending on the server model, the contacts may be on the processor or they may be inside the processor socket. Never touch the contacts. THE PINS ON THE SYSTEM BOARD ARE VERY FRAGILE AND EASILY DAMAGED. If the contacts inside the processor socket are damaged, you must replace the system board.
• Always complete all other troubleshooting procedures before removing or replacing a processor.

Breaking the server down to the minimum hardware configuration

During the troubleshooting process, you may be asked to break the server down to the minimum hardware configuration. A minimum configuration consists of only the components needed to boot the server and successfully pass POST. When requested to break the server down to the minimum configuration, uninstall the following components, if installed:
• All additional DIMMs
  Leave only the minimum required to boot the server: either one DIMM or a pair of DIMMs. For more information, see the memory guidelines in the server user guide.
• All additional cooling fans, if applicable
  For the minimum fan configuration, see the server user guide.
• All additional power supplies, if applicable (leave one installed)
• All hard drives
• All optical drives (DVD-ROM, CD-ROM, and so forth)
• All optional mezzanine cards
• All expansion boards

Before removing the components, be sure to determine the minimum configuration for each component and follow all guidelines in the server user guide.

Always reduce the server to the recommended minimum configuration above before removing any processors. If you are unable to isolate the issue with that configuration, remove all but one of the processors.

CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.
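
The removal list in this section can be sketched as a small planning helper. This is a minimal illustration in Python, not an HP utility: the Inventory class, its field names, and the components_to_remove function are hypothetical, and the minimum DIMM and fan counts are inputs that must be taken from the server user guide rather than assumed. Processors are deliberately left out, because the guide removes them only after the minimum configuration fails to isolate the issue.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Inventory:
    """Installed component counts (hypothetical model, not an HP data format)."""
    dimms: int
    fans: int
    power_supplies: int
    hard_drives: int
    optical_drives: int
    mezzanine_cards: int
    expansion_boards: int


def components_to_remove(installed: Inventory, min_dimms: int, min_fans: int) -> Dict[str, int]:
    """Return how many of each component to uninstall to reach the minimum configuration.

    min_dimms and min_fans are server-specific values from the server user guide.
    """
    if min_dimms not in (1, 2):
        # The guide describes the minimum as either one DIMM or a pair of DIMMs.
        raise ValueError("min_dimms must be 1 or 2")
    keep = {
        "dimms": min_dimms,
        "fans": min_fans,
        "power_supplies": 1,    # leave one power supply installed
        "hard_drives": 0,       # remove all hard drives
        "optical_drives": 0,    # remove all optical drives
        "mezzanine_cards": 0,   # remove all optional mezzanine cards
        "expansion_boards": 0,  # remove all expansion boards
    }
    # Never report a negative count if fewer parts are installed than the minimum.
    return {name: max(getattr(installed, name) - kept, 0) for name, kept in keep.items()}


if __name__ == "__main__":
    example = Inventory(dimms=8, fans=6, power_supplies=2, hard_drives=4,
                        optical_drives=1, mezzanine_cards=0, expansion_boards=2)
    for part, count in components_to_remove(example, min_dimms=2, min_fans=4).items():
        print(f"{part}: remove {count}")
```

The counts in the example inventory are made up for illustration only.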
Common problem resolution

Loose connections

Action:
• Be sure all power cords are securely connected.
• Be sure all cables are properly aligned and securely connected for all external and internal components.
• Remove and check all data and power cables for damage. Be sure no cables have bent pins or damaged connectors.
• If a fixed cable tray is available for the server, be sure the cords and cables connected to the server are routed correctly through the tray.
• Be sure each device is properly seated. Avoid bending or flexing circuit boards when reseating components.
• If a device has latches, be sure they are completely closed and locked.
• Check any interlock or interconnect LEDs that may indicate a component is not connected properly.
• If problems continue to occur, remove and reinstall each device, checking the connectors and sockets for bent pins or other damage.
• For HP ProLiant BL c-Class Server Blades, be sure the Onboard Administrator tray is seated properly.

Service notifications

To view the latest service notifications, refer to the HP website (http://www.hp.com/go/bizsupport). Select the appropriate server model, and then click the Troubleshoot a Problem link on the product page.

Firmware updates

Download firmware updates from the following locations:
• The most recent SPP ("HP Service Pack for ProLiant" on page 78) from the HP website (http://www.hp.com/go/spp).
• The HP Smart Components available as part of the SPP ("HP Service Pack for ProLiant" on page 78).
• The most recent version of a particular server or option firmware from the HP website (http://www.hp.com/support).

Some servers discussed in this document might not be supported by SPP. To determine if your product is supported by the SPP, see the latest server support guide on the HP website (http://www.hp.com/go/spp/documentation). If your server is not supported by SPP, see "Keeping the system current (on page 77)" for other options.
HP offers a subscription service that can provide notification of firmware updates. For more information, see "Subscriber's Choice (on page 80)."

For more information on updating firmware, see "Firmware maintenance (on page 80)."

DIMM handling guidelines

CAUTION: Failure to properly handle DIMMs can cause damage to DIMM components and the system board connector.

When handling a DIMM, observe the following guidelines:
• Avoid electrostatic discharge (on page 14).
• Always hold DIMMs by the side edges only.
• Avoid touching the connectors on the bottom of the DIMM.
• Never wrap your fingers around a DIMM.
• Avoid touching the components on the sides of the DIMM.
• Never bend or flex the DIMM.

When installing a DIMM, observe the following guidelines:
• Before seating the DIMM, align the DIMM with the slot.
• To align and seat the DIMM, use two fingers to hold the DIMM along the side edges.
• To seat the DIMM, use two fingers to apply gentle pressure along the top of the DIMM.

For more information, see the HP website (http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c00868283&jumpid=reg_R1002_USEN).

Hard drive guidelines

SAS and SATA hard drive guidelines

When adding hard drives to the server, observe the following general guidelines:
• The system automatically sets all drive numbers.
• If only one hard drive is used, install it in the bay with the lowest drive number.
• Drives must be the same capacity to provide the greatest storage space efficiency when drives are grouped together into the same drive array.
• Drives in the same logical volume must be of the same type:
  o ACU does not support mixing SAS and SATA drives in the same logical volume.
  o ACU does not support mixing traditional drives and solid state drives (SSD) in the same logical volume.

SCSI hard drive guidelines

• Each SCSI drive must have a unique ID.
• The system automatically sets all SCSI IDs.
• If only one SCSI hard drive is used, install it in the bay with the lowest number.
• Drives must be the same capacity to provide the greatest storage space efficiency when drives are grouped together into the same drive array.

Hard drive LED combinations

Hot-plug SCSI hard drive LED combinations

• Activity LED (1): On, off, or flashing. Online LED (2): On or off. Fault LED (3): Flashing.
   Interpretation: A predictive failure alert has been received for this drive. Replace the drive as soon as possible.
• Activity LED (1): On, off, or flashing. Online LED (2): On. Fault LED (3): Off.
   Interpretation: The drive is online and is configured as part of an array. If the array is configured for fault tolerance and all other drives in the array are online, and a predictive failure alert is received or a drive capacity upgrade is in progress, you may replace the drive online.
• Activity LED (1): On or flashing. Online LED (2): Flashing. Fault LED (3): Off.
   Interpretation: Do not remove the drive. Removing a drive may terminate the current operation and cause data loss. The drive is rebuilding or undergoing capacity expansion.
• Activity LED (1): On. Online LED (2): Off. Fault LED (3): Off.
   Interpretation: Do not remove the drive. The drive is being accessed, but (1) it is not configured as part of an array; (2) it is a replacement drive and rebuild has not yet started; or (3) it is spinning up during the POST sequence.
• Activity LED (1): Flashing. Online LED (2): Flashing. Fault LED (3): Flashing.
   Interpretation: Do not remove the drive. Removing a drive may cause data loss in non-fault-tolerant configurations. One or more of the following conditions may exist:
   o The drive is part of an array being selected by an array configuration utility
   o Drive Identification has been selected in HP SIM
   o The drive firmware is being updated
• Activity LED (1): Off. Online LED (2): Off. Fault LED (3): On.
   Interpretation: The drive has been placed offline due to hard disk drive failure or subsystem communication failure. You may need to replace the drive.
• Activity LED (1): Off. Online LED (2): Off. Fault LED (3): Off.
   Interpretation: One or more of the following conditions may exist:
   o The drive is not configured as part of an array
   o The drive is configured as part of an array, but it is a replacement drive that is not being accessed or being rebuilt yet
   o The drive is configured as an online spare
   If the drive is connected to an array controller, you may replace the drive online.

SAS and SATA hard drive LED combinations

NOTE: Predictive failure alerts can occur only when the server is connected to a Smart Array controller.

• Online/activity LED (green): On, off, or flashing. Fault/UID LED (amber/blue): Alternating amber and blue.
   Interpretation: The drive has failed, or a predictive failure alert has been received for this drive; it also has been selected by a management application.
• Online/activity LED (green): On, off, or flashing. Fault/UID LED (amber/blue): Steadily blue.
   Interpretation: The drive is operating normally, and it has been selected by a management application.
• Online/activity LED (green): On. Fault/UID LED (amber/blue): Amber, flashing regularly (1 Hz).
   Interpretation: A predictive failure alert has been received for this drive. Replace the drive as soon as possible.
• Online/activity LED (green): On. Fault/UID LED (amber/blue): Off.
   Interpretation: The drive is online, but it is not active currently.
• Online/activity LED (green): Flashing regularly (1 Hz). Fault/UID LED (amber/blue): Amber, flashing regularly (1 Hz).
   Interpretation: Do not remove the drive. Removing a drive may terminate the current operation and cause data loss. The drive is part of an array that is undergoing capacity expansion or stripe migration, but a predictive failure alert has been received for this drive. To minimize the risk of data loss, do not replace the drive until the expansion or migration is complete.
• Online/activity LED (green): Flashing regularly (1 Hz). Fault/UID LED (amber/blue): Off.
   Interpretation: Do not remove the drive. Removing a drive may terminate the current operation and cause data loss. The drive is rebuilding, erasing, or it is part of an array that is undergoing capacity expansion or stripe migration.
• Online/activity LED (green): Flashing irregularly. Fault/UID LED (amber/blue): Amber, flashing regularly (1 Hz).
   Interpretation: The drive is active, but a predictive failure alert has been received for this drive. Replace the drive as soon as possible.
• Online/activity LED (green): Flashing irregularly. Fault/UID LED (amber/blue): Off.
   Interpretation: The drive is active, and it is operating normally.
• Online/activity LED (green): Off. Fault/UID LED (amber/blue): Steadily amber.
   Interpretation: A critical fault condition has been identified for this drive, and the controller has placed it offline. Replace the drive as soon as possible.
• Online/activity LED (green): Off. Fault/UID LED (amber/blue): Amber, flashing regularly (1 Hz).
   Interpretation: A predictive failure alert has been received for this drive. Replace the drive as soon as possible.
• Online/activity LED (green): Off. Fault/UID LED (amber/blue): Off.
   Interpretation: The drive is offline, a spare, or not configured as part of an array.

Server updates with an HP Trusted Platform Module and BitLocker™ enabled

When a TPM is installed and enabled in RBSU, and when the Microsoft® Windows® BitLocker™ Drive Encryption feature is enabled, always disable BitLocker™ before performing any of the following procedures:
• Restarting the computer for maintenance without a PIN or startup key
• Updating firmware (on page 82)
• Upgrading critical early boot components
• Upgrading the system board to replace or remove the TPM
• Disabling or clearing the TPM
• Moving a BitLocker™-protected drive to another server
• Adding an optional PCI device, such as a storage controller or network adapter

Diagnostic flowcharts

Troubleshooting flowcharts

To effectively troubleshoot a problem, HP recommends that you start with the first flowchart in this section, "Start diagnosis flowchart (on page 24)," and follow the appropriate diagnostic path. If the other flowcharts do not provide a troubleshooting solution, follow the diagnostic steps in "General diagnosis flowchart (on page 24)." The General diagnosis flowchart is a generic troubleshooting process to be used when the problem is not server-specific or is not easily categorized into the other flowcharts.
The available flowcharts include:
• Start diagnosis flowchart (on page 24)
• General diagnosis flowchart (on page 24)
• Power-on problems
   o Server power-on problems flowchart (on page 26)
   o p-Class server blade power-on problems flowchart (on page 28)
   o c-Class server blade power-on problems flowchart (on page 28)
• POST problems flowchart (on page 30)
   o Server and p-Class server blade POST problems flowchart (on page 31)
   o c-Class server blade POST problems flowchart (on page 32)
• Operating system boot problems flowchart (on page 32)
• Server fault indications flowchart
   o Server and p-Class server blade fault indications flowchart (on page 34)
   o c-Class server blade fault indications flowchart (on page 36)

Troubleshooting flowchart reference websites

Each flowchart contains references to external websites. The following websites correspond to the numbered websites in each flowchart:
1. HP Technical Support (http://www.hp.com/support)
   Select your country and then follow the instructions to locate software, firmware, and drivers.
2. HP ProLiant maintenance and service guides:
   o Business Support Center (http://www.hp.com/go/bizsupport)
     Select Manuals. Under Servers, select ProLiant and tc series servers. Select the product, and then locate the link for the maintenance and service guide.
   o HP BladeSystem p-Class Support and Documents (http://www.hp.com/products/servers/proliant-bl/p-class/info)
     Under Product support, select the product. Select Manuals (guides, supplements, addendums, etc). Under Service and maintenance information, locate the link for the maintenance and service guide.
   o HP BladeSystem c-Class Technical Documentation (http://www.hp.com/go/bladesystem/documentation)
     Select Support, Drivers and Manuals, and then select the product. Select Manuals, and then locate the link for the maintenance and service guide.
3. HP BladeSystem p-Class Support and Documents (http://www.hp.com/products/servers/proliant-bl/p-class/info)
   To locate the HP BladeSystem p-Class System Maintenance and Service Guide, select the product. Select Manuals (guides, supplements, addendums, etc). Under Service and maintenance information, locate the link for the document.
4. HP BladeSystem Power Sizer (http://www.hp.com/go/bladesystem/powercalculator)
   Use the Power Sizer to plan your power infrastructure and meet the needs of an HP BladeSystem solution.
5. Remote management (http://www.hp.com/servers/lights-out)
   To locate the Integrated Lights-Out User Guide, select the product, and then select Support & Documents. Select Manuals and locate the link to the document.
6. SmartStart Support and Documents (http://www.hp.com/support/smartstart/documentation)
   In the User guides section, locate the link for the HP ROM-Based Setup Utility User Guide.
7. System Management Homepage (https://localhost:2381)
   Access consolidated system management information.

Start diagnosis flowchart

Use the following flowchart to start the diagnostic process.

General diagnosis flowchart

The General diagnosis flowchart provides a generic approach to troubleshooting. If you are unsure of the problem, or if the other flowcharts do not fix the problem, use the following flowchart.

Power-on problems flowchart

Server power-on problems flowchart

Some servers have an internal health LED and an external health LED, while other servers have a single system health LED. The system health LED provides the same functionality as the two separate internal and external health LEDs. Depending on the model, the internal health LED and external health LED may either appear solid or they may flash. Both conditions represent the same symptom.
For the location of server LEDs and information on their statuses, see the server documentation on the HP website (http://www.hp.com/support).

Symptoms:
• The server does not power on.
• The system power LED is off or amber.
• The external health LED is red, flashing red, amber, or flashing amber.
• The internal health LED is red, flashing red, amber, or flashing amber.
• The system health LED is red, flashing red, amber, or flashing amber.

Possible causes:
• Improperly seated or faulty power supply
• Loose or faulty power cord
• Power source problem
• Improperly seated component or interlock problem

p-Class server blade power-on problems flowchart

c-Class server blade power-on problems flowchart

For the location of server LEDs and information on their statuses, see the server documentation on the HP website (http://www.hp.com/support).

Symptoms:
• The server does not power on.
• The system power LED is off or amber.
• The health LED is red or amber.

Possible causes:
• Improperly seated or faulty power supply
• Loose or faulty power cord
• Power source problem
• Improperly seated component or interlock problem

POST problems flowchart

Symptoms:
• Server does not complete POST
   NOTE: The server has completed POST when the system attempts to access the boot device.
• Server completes POST with errors

Possible problems:
• Improperly seated or faulty internal component
• Faulty KVM device
• Faulty video device

Server and p-Class server blade POST problems flowchart

c-Class server blade POST problems flowchart

Operating system boot problems flowchart

Symptoms:
• Server does not boot a previously installed OS
• Server does not boot SmartStart

Possible causes:
• Corrupted OS
• Hard drive subsystem problem
• Incorrect boot order setting in RBSU

There are two ways to use SmartStart when diagnosing OS boot problems on a server blade:
• Use iLO to remotely attach virtual devices to mount the SmartStart CD onto the server blade.
• Use a local I/O cable and drive to connect to the server blade, and then restart the server blade.

Server fault indications flowchart

Symptoms:
• Server boots, but a fault event is reported by Insight Management Agents
• Server boots, but the internal health LED, external health LED, or component health LED is red or amber

NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation.

Possible causes:
• Improperly seated or faulty internal or external component
• Unsupported component installed
• Redundancy failure
• System overtemperature condition

Server and p-Class server blade fault indications flowchart

Some servers have an internal health LED and an external health LED, while other servers have a single system health LED. The system health LED provides the same functionality as the two separate internal and external health LEDs. Depending on the model, the internal health LED and external health LED may either appear solid or they may flash. Both conditions represent the same symptom.

For the location of server LEDs and information on their statuses, see the server documentation on the HP website (http://www.hp.com/support).
c-Class server blade fault indications flowchart

Hardware problems

Procedures for all ProLiant servers

The procedures in this section are comprehensive and include steps about or references to hardware features that may not be supported by the server you are troubleshooting.

CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.

Power problems

Power source problems

Action:
1. Press the Power On/Standby button to be sure it is on. If the server has a Power On/Standby button that returns to its original position after being pressed, be sure you press the switch firmly.
2. Plug another device into the grounded power outlet to be sure the outlet works. Also, be sure the power source meets applicable standards.
3. Replace the power cord with a known functional power cord to be sure it is not faulty.
4. Replace the power strip with a known functional power strip to be sure it is not faulty.
5. Have a qualified electrician check the line voltage to be sure it meets the required specifications.
6. Be sure the proper circuit breaker is in the On position.
7. If Enclosure Dynamic Power Capping or Enclosure Power Limit is enabled on supported servers, be sure there is sufficient power allocation to support the server. For more information, see the following documents:
   o The HP Power Capping and HP Dynamic Power Capping for ProLiant servers technology brief on the HP website (http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01549455/c01549455.pdf)
   o The HP BladeSystem Onboard Administrator User Guide on the HP website (http://www.hp.com/go/bladesystem/documentation)
8. Be sure no loose connections exist ("Loose connections" on page 18).

Power supply problems

Action:
1. Be sure no loose connections exist ("Loose connections" on page 18).
2. If the power supplies have LEDs, be sure they indicate that each power supply is working properly. If the LEDs indicate a problem with a power supply, replace the power supply. For more information, see the server documentation on the HP website (http://www.hp.com/support).
3. Be sure the system has enough power, particularly if you recently added hardware, such as hard drives. Additional power supplies may be required. Check the system information from the IML. For product-specific information, see the server documentation on the HP website (http://www.hp.com/support). For more information, see the HP Power Advisor on the HP website (http://www.hp.com/go/hppoweradvisor).
4. If running a redundant configuration, be sure that all of the power supplies in the system are the same. For a list of supported power supplies, see the server documentation on the HP website (http://www.hp.com/support).

System open circuits and short circuits

Action:

CAUTION: Do not operate the server for long periods with the access panel open or removed. Operating the server in this manner results in improper airflow and improper cooling that can lead to thermal damage.

1. Check the server LEDs to see if any statuses indicate the source of the problem. For LED information, refer to the server documentation.
2. Remove all power sources to the server.
3. Be sure no loose connections (on page 18) exist in the area.
4. Be sure each component in the area is working. Refer to the section for each component in this guide.

If you cannot determine the problem by checking the specific area, perform each of the following actions. Restart the server after each action to see if the problem has been corrected.
• Reseat all I/O expansion boards.
• Be sure no loose connections (on page 18) exist in the rest of the server, particularly with the cables that connect to the system board.
• Be sure no foreign material exists, such as screws, bits, or slot bracket blanks, that may be short circuiting components.

UPS problems

UPS is not working properly

Action:
1. Be sure the UPS batteries are charged to the proper level for operation. See the UPS documentation for details.
2. Be sure the UPS power switch is in the On position. See the UPS documentation for the location of the switch.
3. Be sure the UPS software is updated to the latest version. Use the Power Management software located on the Power Management CD.
4. Be sure the power cord is the correct type for the UPS and the country in which the server is located. See the UPS reference guide for specifications.
5. Be sure the line cord is connected.
6. Be sure each circuit breaker is in the On position, or replace the fuse if needed. If this occurs repeatedly, contact an authorized service provider.
7. Check the UPS LEDs to be sure a battery or site wiring problem has not occurred. See the UPS documentation.
8. If the UPS sleep mode is initiated, disable sleep mode for proper operation. The UPS sleep mode can be turned off through the configuration mode on the front panel.
9. Change the battery to be sure damage was not caused by excessive heat, particularly if a recent air conditioning outage has occurred.
   NOTE: The optimal operating temperature for UPS batteries is 25°C (77°F). For approximately every 8°C to 10°C (16°F to 18°F) average increase in ambient temperature above the optimal temperature, battery life is reduced by 50 percent.

Low battery warning is displayed

Action:
1. Plug the UPS into an AC grounded outlet for at least 24 hours to charge the batteries, and then test the batteries. Replace the batteries if necessary.
2. Be sure the alarm is set appropriately by changing the amount of time given before a low battery warning. Refer to the UPS documentation for instructions.
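The NOTE on UPS battery temperature in "UPS is not working properly" can be turned into a rough estimate. The following Python sketch is illustrative only and is not an HP tool: the function name and parameters are hypothetical, and it assumes the 50 percent reduction compounds smoothly for each additional step above 25°C, which the NOTE does not state explicitly. The step size defaults to 10°C; pass 8 for the pessimistic end of the stated range.

```python
def ups_battery_life_fraction(ambient_c, optimal_c=25.0, halving_step_c=10.0):
    """Estimate the fraction of rated battery life remaining at a given
    average ambient temperature.

    Assumption (not from the manual): life halves for every
    `halving_step_c` degrees Celsius above `optimal_c`, applied as a
    smooth exponential. Temperatures at or below optimal return 1.0.
    """
    if halving_step_c <= 0:
        raise ValueError("halving_step_c must be positive")
    # Only the excess above the optimal temperature shortens battery life.
    excess_c = max(0.0, ambient_c - optimal_c)
    return 0.5 ** (excess_c / halving_step_c)


if __name__ == "__main__":
    # Compare the optimistic (10 C) and pessimistic (8 C) step sizes.
    for temp_c in (25, 30, 35, 45):
        best = ups_battery_life_fraction(temp_c, halving_step_c=10.0)
        worst = ups_battery_life_fraction(temp_c, halving_step_c=8.0)
        print(f"{temp_c} C: {worst:.0%} to {best:.0%} of rated life")
```

Treat the result as a planning aid for deciding when to inspect or change the battery after a cooling outage, not as a substitute for the UPS documentation.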
One or more LEDs on the UPS is red

Action: Refer to the UPS documentation for instructions regarding the specific LED to determine the cause of the error.

General hardware problems

Problems with new hardware

Action:
1. Be sure the hardware being installed is a supported option on the server. For information on supported hardware, see the server documentation. If necessary, remove unsupported hardware.
2. To be sure the problem is not caused by a change to the hardware release, see the release notes included with the hardware. If no documentation is available, see the HP support website (http://www.hp.com/support).
3. Be sure the new hardware is installed properly. To be sure all requirements are met, see the device, server, and OS documentation. Common problems include:
   o Incomplete population of a memory bank
   o Installation of a processor without a corresponding PPM
   o Installation of a SCSI device without termination or without proper ID settings
   o Setting of an IDE device to Primary/Secondary when the other device is set to CS
   o Connection of the data cable, but not the power cable, of a new device
4. Be sure no memory, I/O, or interrupt conflicts exist.
5. Be sure no loose connections (on page 18) exist.
6. Be sure all cables are connected to the correct locations and are the correct lengths. For more information, see the server documentation.
7. Be sure other components were not accidentally unseated during the installation of the new hardware component.
8. Be sure all necessary software updates, such as device drivers, ROM updates, and patches, are installed and current, and the correct version for the hardware is installed. For example, if you are using a Smart Array controller, you need the latest Smart Array Controller device driver. Uninstall any incorrect drivers before installing the correct drivers. If the "Unsupported processor detected" message is displayed, update the system ROM to support the installed processor.
   For more information, see "Unsupported processor stepping with Intel® processors (on page 85)."
9. After installing or replacing boards or other options, run RBSU to be sure all system components recognize the changes. If you do not run the utility, you may receive a POST error message indicating a configuration error.
   a. Check the settings in RBSU.
   b. Save and exit the utility.
   c. Restart the server.
   For more information on RBSU, see the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.hp.com/support/smartstart/documentation).
10. Be sure all switch settings are set correctly. For additional information about required switch settings, see the labels located on the inside of the server access panel or the server documentation.
11. Be sure all boards are properly installed in the server.
12. To see if the utility recognizes and tests the device, run HP Insight Diagnostics (on page 75).
13. Uninstall the new hardware.

Unknown problem

Action:
1. Power down and disconnect power to the server.
2. Following the guidelines and cautionary information in the server documentation, reduce the server to the minimum hardware configuration by removing all cards or devices that are not necessary to start the server. Keep the monitor connected to view the server startup process. Before completing this step, see "Breaking the server down to the minimum hardware configuration (on page 16)."
3. Reconnect power, and then power on the system.
   o If the video does not work, see "Video problems (on page 53)."

CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185) before proceeding.
CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.

   o If the system fails in this minimum configuration, one of the primary components has failed. If you have already verified that the processor, PPM, power supply, and memory are working before getting to this point, replace the system board. If not, be sure each of those components is working.
   o If the system boots and video is working, add each component back to the server one at a time, restarting the server after each component is added to determine if that component is the cause of the problem. When adding each component back to the server, be sure to disconnect power to the server and follow the guidelines and cautionary information in the server documentation.

Third-party device problems

Action:
1. Refer to the server and operating system documentation to be sure the server and operating system support the device.
2. Be sure the latest device drivers ("Drivers" on page 77) are installed.
3. Refer to the device documentation to be sure the device is properly installed. For example, a third-party PCI or PCI-X board may be required to be installed on the primary PCI or PCI-X bus, respectively.

Testing the device

Action:
1. Uninstall the device. If the server works with the device removed and uninstalled, a problem exists with the device, the server does not support the device, or a conflict exists with another device.
2. If the device is the only device on a bus, be sure the bus works by installing a different device on the bus.
3. Restarting the server each time to determine if the device is working, move the device:
   a. To a different slot on the same bus (not applicable for PCI Express)
   b. To a PCI, PCI-X, or PCI Express slot on a different bus
   c. To the same slot in another working server of the same or similar design
   If the board works in any of these slots, either the original slot is bad or the board was not properly seated. Reinsert the board into the original slot to verify.
4. If you are testing a board (or a device that connects to a board):
   a. Test the board with all other boards removed.
   b. Test the server with only that board removed.

   CAUTION: Clearing NVRAM deletes the configuration information. Refer to the server documentation for complete instructions before performing this operation or data loss could occur.

5. Clearing NVRAM can resolve various problems. Clear the NVRAM, but do not use the backup .SCI file if prompted. Have available any .CFG, .OVL, or .PCF files that are required.

Internal system problems

Battery pack problems

The battery stops charging or begins to swell

The batteries in BBWC battery packs are a consumable material. After 3 years of service, batteries may not provide predictable data retention times. If a battery fails completely, the HP Smart Array controller detects this condition and automatically restricts write cache functions to protect user data. To help ensure uninterrupted performance levels, HP recommends replacing battery packs at 3-year intervals.

In NiMH batteries, the charging and discharging processes create and recombine inert gases, which can cause the button cell to swell in size by as much as 20%. Battery packs are designed to stop charging before excess swelling occurs. However, if excess swelling does occur, a pressure mechanism within the button cell releases these non-toxic, non-corrosive gases before the cell incurs physical damage. If this pressure release occurs, the battery no longer charges properly, and the storage solution reports a failed battery.

CD-ROM and DVD drive problems

System does not boot from the drive

Action:
1. Be sure the drive boot order in RBSU is set so that the server boots from the CD-ROM drive first.
2. If the CD-ROM drive jumpers are set to CS (the factory default), be sure the CD-ROM drive is installed as device 0 on the cable so that it is in position for the server to boot from the drive.
3. Be sure no loose connections (on page 18) exist.
4. Be sure the media from which you are attempting to boot is not damaged and is a bootable CD.
5. If attempting to boot from a USB CD-ROM drive:
   o Refer to the operating system and server documentation to be sure both support booting from a USB CD-ROM drive.
   o Be sure legacy support for a USB CD-ROM drive is enabled in RBSU.

Data read from the drive is inconsistent, or drive cannot read data

Action:
1. Clean the drive and media.
2. If a paper or plastic label has been applied to the surface of the CD or DVD in use, remove the label and any adhesive residue.
3. Be sure the inserted CD or DVD format is valid for the drive. For example, be sure you are not inserting a DVD into a drive that only supports CDs.

Drive is not detected

Action:
1. Be sure no loose connections (on page 18) exist.
2. Refer to the drive documentation to be sure cables are connected as required.
3. Be sure the cables are working properly. Replace with known functional cables to test whether the original cables were faulty.
4. Be sure the correct, current driver is installed.

Diskette drive problems

Diskette drive light stays on

Action:
1. Be sure no loose connections (on page 18) exist.
2. Be sure the diskette is not damaged. Run the diskette utility on the diskette (CHKDSK on some systems).
3. Be sure the diskette is properly inserted. Remove the diskette and reinsert correctly into the drive.
4. Be sure the diskette drive is cabled properly. Refer to the server documentation.

A problem has occurred with a diskette transaction

Action: Be sure the directory structure on the diskette is not bad.
Run the diskette utility to check for fragmentation (CHKDSK on some systems).

Diskette drive cannot read a diskette

Action:
1. If the diskette is not formatted, format the diskette.
2. Check the type of drive you are using and be sure you are using the correct diskette type.

Drive is not found

Action: Be sure no loose connections (on page 18) exist with the drive.

Non-system disk message is displayed

Action:
1. Remove the non-system diskette from the drive.
2. Check for and disconnect any non-bootable USB devices.

Diskette drive cannot write to a diskette

Action:
1. If the diskette is not formatted, format the diskette.
2. Be sure the diskette is not write protected. If it is, use another diskette or remove the write protection.
3. Be sure you are attempting to write to the proper drive by checking the drive letter in the path statement.
4. Be sure enough space is available on the diskette.

Drive problems (hard drives and solid state drives)

HP SIM reports a drive fault on a hard drive managed by an HP Smart Array controller

Action:
• If the drive fault LED is illuminated, see the drive LED combination tables in "Hard drive LED combinations (on page 20)."
• If the drive fault LED is flashing, replace the hard drive. See the server maintenance and service guide.
• If the drive fault LED is not flashing and the operating system supports HP Insight Diagnostics, version 7.40 or later ("HP Insight Diagnostics" on page 75), perform the following:
   a. Run the Smart Array SCSI Diagnosis feature (on page 75).
   b. Perform the actions recommended by the diagnosis report.
   c. If the issue is still present, run the appropriate array diagnostic software (on page 76). For array diagnostic software report analysis, contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185).
• If the drive fault LED is not flashing and the operating system does not support HP Insight Diagnostics, version 7.40 or later ("HP Insight Diagnostics" on page 75), insert the SmartStart CD and run the appropriate array diagnostic software (on page 76). For array diagnostic software report analysis, contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185).

The fault LED is illuminated on a hard drive managed by an HP Smart Array controller

Action:
• If the drive fault LED is illuminated, see the drive LED combination tables in "Hard drive LED combinations (on page 20)."
• If the drive fault LED is flashing, replace the hard drive. See the server maintenance and service guide.
• If the drive fault LED is not flashing and the operating system supports HP Insight Diagnostics, version 7.40 or later ("HP Insight Diagnostics" on page 75), perform the following:
   a. Run the Smart Array SCSI Diagnosis feature (on page 75).
   b. Perform the actions recommended by the diagnosis report.
   c. If the issue is still present, run the appropriate array diagnostic software (on page 76). For array diagnostic software report analysis, contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185).
• If the drive fault LED is not flashing and the operating system does not support HP Insight Diagnostics, version 7.40 or later ("HP Insight Diagnostics" on page 75), insert the SmartStart CD and run the appropriate array diagnostic software (on page 76). For array diagnostic software report analysis, contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185).

System completes POST but drive fails

Action:
1. Be sure no loose connections (on page 18) exist.
2. Be sure no device conflict exists.
3. Be sure the hard drive is cabled properly and terminated if necessary.
4. Be sure the hard drive data cable is working by replacing it with a known functional cable.
5. Be sure the access panel is installed properly when the server is operating. Drives may overheat and cause sluggish response or drive failure.
6. Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated.
7. Run RBSU ("HP ROM-Based Setup Utility" on page 67) to verify that the boot controller order is set properly.

No hard drives are recognized

Action:
1. Be sure no power problems (on page 37) exist.
2. Check for loose connections (on page 18).
3. Be sure that the controller supports the hard drives being installed.
4. Be sure the controller has the most recent firmware.
5. If the controller supports license keys and the configuration is dual domain, be sure the license key is installed.

Hard drive is not recognized by the server

Action:
1. Check the hard drive LEDs to be sure they indicate normal function. For information on hard drive LEDs, see hard drive LED combinations (on page 20). For server-specific hard drive LED information, see the server documentation or the HP website (http://www.hp.com).
2. Be sure no loose connections (on page 18) exist.
3. Be sure the correct drive controller drivers are installed.
4. Be sure the hard drive is configured properly:
   a. To determine the proper configuration, see the hard drive documentation.
   b. Remove the hard drive and be sure the configuration jumpers are set properly.
   c. For a non-hot-plug hard drive, be sure a conflict does not exist with another hard drive. Check for SCSI ID conflicts.
   d. When using an array controller, be sure the hard drive is configured in an array. Run ACU.

A new drive is not recognized

Action:
1. Be sure the drive is supported. To determine drive support, see the server documentation or the HP website (http://www.hp.com/go/bizsupport).
2. Be sure the drive bay is not defective by installing the hard drive in another bay.
Run HP Insight Diagnostics (on page 75). Then, replace failed components as indicated. When the drive is a replacement drive on an array controller, be sure that the drive is the same type and of the same or larger capacity than the original drive. Data is inaccessible Action: 1. 2. 3. Be sure the files are not corrupt. Run the repair utility for the operating system. Be sure no viruses exist on the server. Run a current version of a virus scan utility. When a TPM is installed and is being used with BitLocker™, be sure the TPM is enabled in RBSU ("HP ROM-Based Setup Utility" on page 67). See the TPM replacement recovery procedure in the operating system documentation. When migrating encrypted data to a new server, be sure to follow the recovery procedures in the operating system documentation. 4. Server response time is slower than usual Action: 1. 2. Be sure the hard drive is not full. If needed, increase the amount of free space on the hard drive. HP recommends that hard drives have a minimum of 15 percent free space. Review information about the operating system encryption technology, which can cause a decrease in server performance. For more information, see the operating system documentation. SD card problems System does not boot from the drive Action: 1. 2. Be sure the drive boot order in RBSU is set so that the server boots from the SD card. Reseat the SD card. USB drive key problems System does not boot from the drive Action: 1. 2. 3. Be sure that USB is enabled in RBSU. Be sure the drive boot order in RBSU is set so that the server boots from the USB drive key. Reseat the USB drive key. Hardware problems 46 Fan problems General fan problems are occurring Action: 1. Be sure the fans are properly seated and working. a. Follow the procedures and warnings in the server documentation for removing the access panels and accessing and replacing fans. b. Unseat, and then reseat, each fan according to the proper procedures. c. 2. 3. 
Replace the access panels, and then attempt to restart the server. Be sure the fan configuration meets the functional requirements of the server. Refer to the server documentation. Be sure no ventilation problems exist. If you have been operating the server for an extended period of time with the access panel removed, airflow may have been impeded, causing thermal damage to components. Refer to the server documentation for further requirements. Be sure no POST error messages ("POST error messages and beep codes" on page 121) are displayed while booting the server that indicate temperature violation or fan failure information. Refer to the server documentation for the temperature requirements for the server. Access the IML to see if any event list error messages (on page 171) relating to fans are listed. Replace any required non-functioning fans and restart the server. Refer to the server documentation for specifications on fan requirements. Be sure all fan slots have fans or blanks installed. Refer to the server documentation for requirements. Verify the fan airflow path is not blocked by cables or other material. 4. 5. 6. 7. 8. Hot-plug fan problems are occurring Action: 1. Check the LEDs to be sure the hot-plug fans are working. Refer to the server documentation for LED information. NOTE: For servers with redundant fans, backup fans may spin up periodically to test functionality. This is part of normal redundant fan operation.
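The 15 percent free-space recommendation under "Server response time is slower than usual" can be checked from the operating system with a short script. The following is a minimal sketch in Python, not an HP tool: the names free_percent, meets_free_space_guideline, and MIN_FREE_PERCENT are hypothetical and chosen only for illustration. It reads capacity figures with the standard library call shutil.disk_usage and compares the free fraction against the threshold.

```python
import shutil

# Threshold taken from the guide's recommendation (minimum 15 percent free).
# The constant name is an assumption made for this sketch.
MIN_FREE_PERCENT = 15.0


def free_percent(total_bytes: int, free_bytes: int) -> float:
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def meets_free_space_guideline(path: str, minimum: float = MIN_FREE_PERCENT) -> bool:
    """Return True if the volume containing `path` has at least `minimum` percent free."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return free_percent(usage.total, usage.free) >= minimum
```

Calling meets_free_space_guideline with a mount point or drive path, such as "/" or "C:\\", reports whether that volume satisfies the recommendation. The figures describe the volume as the operating system sees it, so on an array controller they reflect the logical drive, not the individual physical drives.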
HP ProLiant DL360 G7 Server Maintenance and Service Guide Abstract This guide is for an experienced service technician. HP assumes you are qualified in the servicing.
Power efficiency and power management in HP ProLiant servers Technology brief Introduction.
HP ProLiant Servers Troubleshooting Guide