Virtual Server Replication issue with NFS mounted file systems

Hi - this is probably one of those corner cases, so I'm going to explain the situation. We have a client set up with two web servers behind a Varnish server that handles and load-balances all requests to the two redundant web heads. They're all Drupal sites. Each server has its own webroot, which we intend to copy across on a regular schedule. The only shared component is the Drupal files directory, which lives on an NFS server and is mounted on each web server, with a symlink out of the webroot up into the shared file system directory.

The problem here is that when a domain is replicated over to the secondary server using the Replicated Virtualmin domains function, the replica apparently isn't created with the same UID and GID as on the primary. The replica therefore cannot access the shared NFS file system: the secondary has different UID/GID combinations, so access to the NFS share fails.

This is really messy - so is there some way to ensure that the replica is created with the proper GID/UID combination? Patching all this up by hand after the fact sort of defeats the whole purpose.
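For anyone hitting the same mismatch, a quick way to confirm it is to compare the numeric UID:GID of the domain's Unix user on both servers, since traditional NFS setups match ownership by numeric ID rather than by name. This is only a minimal sketch, assuming plain local accounts (no LDAP); the function names, the user name and the host in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash

# Print "uid:gid" for a local account, looked up in the passwd database.
# Returns 1 (and prints nothing) if the account does not exist.
local_ids() {
    local user="$1" entry
    entry="$(getent passwd "$user")" || return 1
    printf '%s\n' "$entry" | cut -d: -f3,4
}

# Compare two "uid:gid" strings and report whether they line up.
# Returns 0 on a match and 1 on a mismatch.
compare_ids() {
    local primary="$1" secondary="$2"
    if [ "$primary" = "$secondary" ]; then
        echo "match"
    else
        echo "mismatch: primary=$primary secondary=$secondary"
        return 1
    fi
}

# Usage (hypothetical user and host, run on the primary):
#   compare_ids "$(local_ids someuser)" \
#       "$(ssh root@secondary.example.com getent passwd someuser | cut -d: -f3,4)"
```

If this reports a mismatch for a replicated domain, files that user writes to the NFS share from one web head will show up under a different owner on the other.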

Status: 
Closed (fixed)

Comments

Note that I realize your docs only talk about this in the sense of sharing the whole home directory between servers, but the same issue applies there as well.

Virtualmin replication should handle this case properly - the same UID should be used on the new system, for exactly this reason.

How do you have replication set up? Using Cloudmin, or some custom process using backup/restore commands?

It's set up in Cloudmin. It looks like I could probably get what I want by doing it on the command line with Virtualmin by doing the transfer and using the option to not change UID --- but it'd be nice to do this through Cloudmin for various reasons. When Cloudmin does the original replication it sets up a different UID and GID on the secondary server, and then it wants to continue to use that UID/GID combination.

Which Cloudmin and Virtualmin versions are you running there? We did make some changes in this area recently, and I want to check if the behavior has regressed - because it shouldn't change the UID/GID when replicating.

Cloudmin 8.3 and Virtualmin 4.18GPL on the Cloudmin server

Virtualmin 4.18Pro on both of the replication VPS servers.

One more question - are you sharing users between the systems in any way, such as via an LDAP database?

Nope. No user sharing at all currently. No LDAP.

I will perform some tests to see if I can replicate this bug, and update this ticket.

So -- I am still trying to work around this, and it's really becoming an issue with my client. I finally decided to give up on doing it this way, and instead do a backup from the primary server to the secondary and restore it over there (with this run as a script when a domain is created). This almost works, but I'm having a terrible time getting the backup and restore commands in the command line API to behave. The documentation for the feature names is pretty unclear. I'm TRYING to do a backup on the primary server that does not include the MySQL databases (since they already exist on the database server) and then to restore it.

The backup looks like this: virtualmin backup-domain --dest ssh://root:supersecretpassword@dpiweb2i.cloudmin.cruiskeenconsulting.com... --domain ${VIRTUALSERVER_DOM} --feature virtualmin --feature web --feature logrotate --feature ssl --feature webmin --feature dir --except-feature mysql

This works, but the backup still seems to include the mysql database. And when I try to restore it on the secondary server with virtualmin restore-domain --no-reuid --feature virtualmin --feature web --feature logrotate --feature dir --except-feature mysql --source /tmp/virtualmin.tgz --all-domains --only-missing

It tries to restore the database - which fails because the database already exists. So the restore fails.

This doesn't seem like such an odd thing to want to do, but no matter how I try to do it I get a failure.

For this kind of restore, make sure you specify the --replication flag - it tells Virtualmin that the new system is sharing some resources (like the MySQL DB server or home directory), and thus should prevent the DB from being re-created.

AH!

I'll try that.

Note that this is not a documented flag --

Still doesn't work ---

root@dpiweb2i.cloudmin.cruiskeenconsulting.com /root/bin/restoreserver
+ virtualmin restore-domain --no-reuid --source /tmp/virtualmin.tgz --all-domains --all-features --except-feature mysql --replication
Checking for missing features ..
.. all features in backup are supported
Checking for errors in backup ..
.. no errors found
Starting restore..
Extracting backup archive file ..
.. done
Re-creating virtual server cat.cruiskeenconsulting.com ..
.. the following warnings were detected : A MySQL database named cat already exists
Restore failed!

I've tried a number of different ways of doing this, including just doing the features I think I want -- web dir logrotate -- but it still fails saying that the database already exists

AH -- much closer now --- this gets a lot further if I also add --only-features on the restore

virtualmin restore-domain --replication --no-reuid --only-features --source /tmp/virtualmin.tgz --all-domains --feature web --feature dir --feature logrotate

The problem NOW is that it's not re-creating the unix user on the secondary server so the restore falls all over itself complaining that the user does not exist.

So - what flag combination on backup and restore would fix this? I'm afraid that the docs for this are really hard to chase down since the features are documented in various places and are not intuitively obvious.

Backup is virtualmin backup-domain --dest ssh://root:somemysterypassword@dpiweb2i.cloudmin.cruiskeenconsulting.com... --domain ${VIRTUALSERVER_DOM} --all-features

Restore is virtualmin restore-domain --replication --no-reuid --only-features --source /tmp/virtualmin.tgz --all-domains --feature web --feature dir --feature logrotate

Also -- using --all-features --except-feature mysql fails the same way

Oh --- I've got it -- it seems to be a little dependent on the order of the flags. I had --all-features at the end of the backup command, after the definition of where the output goes. Once I put it as the FIRST flag on the backup command things started working.

So -- I THINK I finally have a working solution here.

Thanks for the help -- but I really think the docs should be clearer.
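For reference, the combination reported as working above can be sketched as a small dry-run script that prints the two commands rather than running them. This is only a sketch built from the flags quoted in this thread, not a confirmed recipe: the function names are hypothetical, and the destination passed to backup_cmd is a placeholder for whatever ssh:// target is actually in use.

```shell
#!/usr/bin/env bash

# Print the backup command for one domain.
# --all-features is placed first, which is the ordering reported to work here.
backup_cmd() {
    local domain="$1" dest="$2"
    printf 'virtualmin backup-domain --all-features --dest %s --domain %s\n' \
        "$dest" "$domain"
}

# Print the restore command for the secondary server.
# --replication: the secondary shares resources such as the MySQL server.
# --no-reuid:    do not change the UID/GID on restore.
# --only-features plus explicit --feature flags: as used in the thread.
restore_cmd() {
    local archive="$1"
    printf 'virtualmin restore-domain --replication --no-reuid --only-features --source %s --all-domains --feature web --feature dir --feature logrotate\n' \
        "$archive"
}

# Usage (BACKUP_DEST is a hypothetical variable holding the ssh:// destination):
#   backup_cmd "${VIRTUALSERVER_DOM}" "${BACKUP_DEST}"    # on the primary
#   restore_cmd /tmp/virtualmin.tgz                       # on the secondary
```

Printing the commands first makes it easy to eyeball the flag order before anything touches the secondary server.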

The --replication flag was intended for internal use only so isn't documented, but we'll fix that in the next release.

The --except-feature flag has to follow --all-features - this will be made clearer in the next release.
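Until the docs spell this out, a wrapper script could guard against the ordering mistake before calling virtualmin. The following is a minimal sketch of such a check; check_feature_order is a hypothetical helper, not part of Virtualmin, and it only looks at the order of the two flags named above.

```shell
#!/usr/bin/env bash

# Return 0 if every --except-feature comes after --all-features (or if
# --except-feature is not used at all); return 1 if --except-feature is
# seen before --all-features.
check_feature_order() {
    local seen_all=0 arg
    for arg in "$@"; do
        case "$arg" in
            --all-features)
                seen_all=1
                ;;
            --except-feature)
                if [ "$seen_all" -eq 0 ]; then
                    echo "--except-feature must follow --all-features" >&2
                    return 1
                fi
                ;;
        esac
    done
    return 0
}

# Usage: pass the same arguments intended for virtualmin, for example
#   check_feature_order --all-features --except-feature mysql --source /tmp/virtualmin.tgz
```

The check is deliberately narrow: it does not validate feature names or any other flag, only the ordering rule described in this comment.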

What I've got set up now seems to be working plenty well enough, so I'm going to close this.

Ok - unfortunately, I have been unable to reproduce this problem on test systems.