Best practices for HA pairs

To ensure that your HA pair is robust and operational, you need to be familiar with configuration best practices.

  • Make sure that each power supply unit in the storage system is on a different power grid so that a single power outage does not affect all power supply units.
  • Use interface groups (virtual interfaces) to provide redundancy and improve availability of network communication.
  • Follow the documented procedures in the Data ONTAP Upgrade and Revert/Downgrade Guide for 7-Mode when upgrading your HA pair.
  • Maintain consistent configuration between the two nodes.

    An inconsistent configuration is often the cause of failover problems.

  • Test the failover capability routinely (for example, during planned maintenance) to verify that the HA pair is configured correctly.
  • Make sure that each node has sufficient resources to adequately support the workload of both nodes in takeover mode.
  • Use the Config Advisor tool to check for configuration errors that could prevent successful failovers.
  • If your system supports remote management (through an RLM or Service Processor), make sure that you configure it properly, as described in the Data ONTAP System Administration Guide for 7-Mode.
  • Follow recommended limits for FlexVol volumes, dense volumes, Snapshot copies, and LUNs to reduce the takeover or giveback time.

    When adding traditional or FlexVol volumes to an HA pair, consider testing the takeover and giveback times to ensure that they fall within your requirements.

  • For systems using disks, check for failed disks regularly and remove them as soon as possible, as described in the Data ONTAP Storage Management Guide for 7-Mode. Failed disks can extend the duration of takeover operations or prevent giveback operations.
  • Multipath HA is required on all HA pairs except for some FAS22xx system configurations, which use single-path HA and lack the redundant standby connections.
  • To ensure that you receive prompt notification if takeover capability becomes disabled, configure your system for automatic email notification for the following takeover-impossible EMS messages:
    • ha.takeoverImpVersion
    • ha.takeoverImpLowMem
    • ha.takeoverImpDegraded
    • ha.takeoverImpUnsync
    • ha.takeoverImpIC
    • ha.takeoverImpHotShelf
    • ha.takeoverImpNotDef
  • Set the cf.giveback.auto.cancel.on_network_failure option to true if network-related failovers are enabled.
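
As a complement to email notification, the takeover-impossible EMS messages listed above can also be searched for in log text that has already been collected. The following is a minimal sketch only, not a NetApp tool: the function name find_takeover_impossible and the sample log lines are hypothetical, and the log line layout is an assumption. Only the seven event names come from the list above.

```python
import re

# Event names copied from the takeover-impossible list above.
TAKEOVER_IMPOSSIBLE_EVENTS = (
    "ha.takeoverImpVersion",
    "ha.takeoverImpLowMem",
    "ha.takeoverImpDegraded",
    "ha.takeoverImpUnsync",
    "ha.takeoverImpIC",
    "ha.takeoverImpHotShelf",
    "ha.takeoverImpNotDef",
)

# Match an event name only when it is not part of a longer identifier.
_EVENT_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(name) for name in TAKEOVER_IMPOSSIBLE_EVENTS)
    + r")(?!\w)"
)


def find_takeover_impossible(lines):
    """Return (line_number, event_name) pairs for each event name found.

    Line numbers start at 1. The log format is not parsed; the function
    only looks for the event names anywhere in each line.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in _EVENT_PATTERN.finditer(line):
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines; real log output may be laid out differently.
    sample = [
        "node1 [ha.takeoverImpIC:warning]: example text",
        "node1 [some.other.event:info]: example text",
        "node1 [ha.takeoverImpLowMem:error]: example text",
    ]
    for number, name in find_takeover_impossible(sample):
        print(f"line {number}: {name}")
```

Matching on the event name alone keeps the sketch independent of timestamp and severity formatting, which can differ between log sources.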