Thursday 20 September 2007

IPMP with IPv6 test addresses

Quite a while back, Dave Miner discussed how IP Multipathing (IPMP) can be done on the cheap in Solaris 10 by not using IPv4 test addresses at all, relying instead on either the physical link state or IPv6 link-local addresses as test addresses. My first attempts at configuring the latter failed, and when I got no response to my comment on his blog, I set it aside for more pressing work.

Recently I got back to this and managed to get it working.

First, /etc/inet/hosts and/or /etc/inet/ipnodes must be set up correctly with your host's IP address and hostname. In this example the host is called myhost and has two interfaces, bge0 and bge1.

/etc/inet/hosts:
192.168.1.1 myhost

Then configure your interfaces.

/etc/hostname.bge0:
myhost group production failover up

/etc/hostname.bge1:
up

/etc/hostname6.bge0:
group production -failover up

/etc/hostname6.bge1:
group production -failover up

You need to reboot for these configuration changes to take effect, or you can pass the contents of each /etc/hostname* file as arguments to ifconfig for the corresponding interface.
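If you would rather not reboot, the files above translate into ifconfig invocations roughly as follows. This is only a sketch under a few assumptions: the interfaces are not yet plumbed, the helper names emit and apply_if and the RUN switch are my own inventions for illustration, and by default the script only prints the commands so you can review them first.

```shell
#!/bin/sh
# Sketch: turn the contents of the /etc/hostname*.<if> files into
# ifconfig invocations. Dry run by default: commands are printed,
# not executed. RUN is a hypothetical switch; set RUN=1 to execute
# them for real (needs root on the Solaris host).

emit() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# apply_if <interface> <inet|inet6> <contents of the hostname file>
apply_if() {
    ifname=$1
    family=$2
    shift 2
    # Plumb the interface for this address family, then apply the
    # file contents as ifconfig arguments.
    emit ifconfig "$ifname" "$family" plumb
    emit ifconfig "$ifname" "$family" "$@"
}

apply_if bge0 inet  myhost group production failover up   # /etc/hostname.bge0
apply_if bge1 inet  up                                    # /etc/hostname.bge1
apply_if bge0 inet6 group production -failover up         # /etc/hostname6.bge0
apply_if bge1 inet6 group production -failover up         # /etc/hostname6.bge1
```

Run it once as-is to check the printed commands, then again with RUN=1 as root. If an interface is already plumbed, the plumb step will report an error and can be skipped.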

Update: You need to ensure that both the host and the switch are set to autonegotiate for this to work.

Update 2007.09.20: According to this Sun document you do not need to mark an IPv6 test address as deprecated to prevent applications from using the test address. I've updated the configurations above to reflect this.

3 comments:

Unknown said...

Hello,
I wanted to know if it's possible to connect a host with two interfaces to two different switches, i.e. one interface connected to one switch and the other to another switch, and achieve this failover using IPMP. Or do I need to do some link aggregation as well?
Thanks
Naveen

Matthew Flanagan said...

Yes you can do this. There is no need to do aggregation.

N said...

hi, I have an issue - maybe you will be able to help?

It worked for me only this way:

1. Setup


Code:

# cat /etc/hostname*
10.23.10.113/24 broadcast + group data failover up <- hostname.e1000g0

0.0.0.0/24 broadcast + group data -failover deprecated up standby <- hostname.nxge3

group data -failover up <- hostname6.e1000g0
group data -failover up <- hostname6.nxge3


2. Testing


Code:

# if_mpadm -d e1000g0
Offline failed as there is no other functional interface available in the multipathing group for failing over the network access.


3. Fixing (I guess you can just pick any free IP)


Code:

# ifconfig nxge3 10.23.10.114/24 broadcast + group data -failover deprecated standby up


4. Again testing


Code:

# if_mpadm -d e1000g0


It works!

5. Even more - plumbing 0.0.0.0 back on this NIC


Code:

# ifconfig nxge3 0.0.0.0/24 broadcast + group data -failover deprecated standby up
# if_mpadm -d e1000g0


And magically it works again!

How can that be? Why didn't it work when I originally plumbed it with the 0.0.0.0 address, yet it worked after I plumbed a real IP, and kept working when I reverted it back to 0.0.0.0?

This is just plain magic...