CentOS 6.4 host odd tap / ebtables behaviour

Hi all

We have 3 x CentOS 6.4/6.5 hosts and around 15 guests across them - Cloudmin Pro (25 VMs) controls them all. Recently I've had some issues with guest connectivity which I think relate to ebtables and tapN configuration.

Taking one of the hosts as an example, there are 5 guests on there each with their own eth0 NIC. Am I correct in assuming that each guest should have its own tapN and not be sharing with another guest?

If I am correct, then something has gone wrong on my system as I can see 2 pairs of guests that have the same tapN number - this could be a result of moving them from another host, or it could be related to changes I've made to ebtables just recently via Cloudmin's interface.

Is there a simple way to specify new tapN references for these machines to remove the shared config?

This is a production environment so I need to be 100% sure of success prior to applying any changes.

Thanks in advance.

Status: 
Active

Comments

Yes, each guest should get its own tapN which is assigned by KVM on startup.

How did you determine that multiple guests were sharing the same tap?

In Cloudmin, ebtables is used to prevent VMs from using IPs that are not assigned to them, by setting up ethernet-level firewall rules on tap interfaces.

Thanks Jamie

I've run tail /kvm/*tap and the file outputs display all my tapNs for kvm instances. Here I can see 2 instances using tap0 and it's those 2 instances that are failing to get connectivity. I can also see 2 instances sharing tap3 but those aren't failing at the moment (most likely because the ebtables aren't blocking spoofing, or because they've not been rebooted in a while).
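For anyone wanting to run the same check without reading the tail output by eye, here is a minimal Python sketch. It assumes each file matching /kvm/*tap holds nothing but the tap device name recorded for one instance (for example "tap0"); that layout is inferred from the tail output described above, not from Cloudmin documentation. The function name find_shared_taps is made up for illustration.

```python
import glob
from collections import defaultdict


def find_shared_taps(pattern="/kvm/*tap"):
    """Return {tap_name: [files]} for tap names recorded by more than one file.

    Assumption: every file matched by `pattern` contains only a tap device
    name such as "tap0". Adjust the parsing if the files hold more than that.
    """
    owners = defaultdict(list)
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            tap = handle.read().strip()
        if tap:
            owners[tap].append(path)
    # Keep only the tap names that appear in two or more files.
    return {tap: paths for tap, paths in owners.items() if len(paths) > 1}
```

Calling find_shared_taps() with no arguments scans /kvm/*tap and returns an empty dict when every instance has its own tap. It only reads files, so it should be safe to run on a production host.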

Also, when I run ebtables -L I can see multiple chains for tap0 and tap3.
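The chain count per tap can also be pulled out of a saved ebtables -L listing. The sketch below relies on the "Bridge chain: NAME, entries: N, policy: P" header lines that ebtables prints, and assumes the per-guest chain names contain the tap device name. That naming is a guess based on the output described here, and chains_per_tap is a hypothetical helper, not part of Cloudmin.

```python
import re
from collections import defaultdict

# Header line printed by `ebtables -L` for every chain.
CHAIN_HEADER = re.compile(r"^Bridge chain: (\S+?),")
# Assumption: per-guest chain names embed the tap device name.
TAP_NAME = re.compile(r"tap\d+")


def chains_per_tap(listing):
    """Map each tap name to the chain names in `listing` that mention it."""
    chains = defaultdict(list)
    for line in listing.splitlines():
        header = CHAIN_HEADER.match(line.strip())
        if not header:
            continue  # rule lines and blank lines are ignored
        name = header.group(1)
        tap = TAP_NAME.search(name)
        if tap:
            chains[tap.group(0)].append(name)
    return dict(chains)
```

Feed it the text of ebtables -L (saved to a file or captured from the command). A tap that maps to more chains than a healthy tap on the same host is a candidate for the clash.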

I've moved instances from other hosts (using Cloudmin Pro) and also cloned systems on here too. So perhaps there's been an issue with one of those operations somewhere?

So to rectify, what's the most sensible way to modify the tapN reference manually? I have ebtables configured and blocking spoofed IPs, so any changes will need to be made to the Cloudmin KVM config too.

Cheers again.

Yes, this could be an issue with cloning or moving. If you shut down one of the clashing instances and then bring it back up again, does it get assigned a new tapN interface? KVM should be doing this automatically.

When I shut down one of the troubled instances, the other becomes available. However, the startup does not then assign a new tapN; instead it keeps the existing clashing one.

I'm currently relocating one of the less critical instances to another host to see if it behaves there. If it's the host that is misbehaving I can simply relocate all instances to other hosts and debug further.

Once I know more, I'll post back here.

Thanks.

That is very odd - Cloudmin itself doesn't control the tapN device selection, it just records what KVM selects when a VM is started.

Unless maybe this VM is somehow configured via a non-standard -net parameter to use a fixed tap device? I'd be interested to see the full command line of the running kvm process for both these VMs.
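One way to collect those command lines and spot a pinned device is sketched below in Python 3 (which may need installing on CentOS 6). It reads /proc/<pid>/cmdline for processes whose executable name contains "kvm" or "qemu" and looks for an ifname= setting inside a -net tap or -netdev tap argument, which is how QEMU lets a tap device be named explicitly. The helper names are made up, and whether Cloudmin passes ifname= at all is not known from this thread.

```python
import glob
import re

# ifname=<device> as one comma-separated field of a tap network option.
IFNAME = re.compile(r"(?:^|,)ifname=([^,]+)")


def fixed_tap_names(argv):
    """Return tap names pinned via ifname= in -net/-netdev tap arguments."""
    names = []
    for flag, value in zip(argv, argv[1:]):
        if flag in ("-net", "-netdev") and value.startswith("tap"):
            match = IFNAME.search(value)
            if match:
                names.append(match.group(1))
    return names


def running_kvm_commands():
    """Yield (pid, argv) for processes whose executable mentions kvm or qemu."""
    for path in glob.glob("/proc/[0-9]*/cmdline"):
        try:
            with open(path, "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # the process exited while we were scanning
        argv = [part.decode(errors="replace") for part in raw.split(b"\0") if part]
        if argv and ("kvm" in argv[0] or "qemu" in argv[0]):
            yield path.split("/")[2], argv
```

Looping over running_kvm_commands() and printing fixed_tap_names(argv) for each pid shows which guests, if any, are started with a fixed tap. Two guests reporting the same name would explain why a restart never picks a fresh tapN.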