Submitted by marcusone on Tue, 03/18/2014 - 22:25 Pro Licensee
Fresh CentOS 6.5 minimal install, fully updated, then Virtualmin installed via your install.sh script.
I have LDAP Users & Groups and LDAP Client working perfectly inside Webmin. Even remote SSH with LDAP users works!
Then I try to create a server in Virtualmin (with the "Store users and groups" setting set to the LDAP database) and get the dreaded error:
Creating administration group .. .. administration group was created but does not exist!
Yet if I create groups in Webmin module, it works just fine.
I don't have nscd or any caching running that I know of.
Where can I look for more error logs? Thoughts? Everything else works great!
Status:
Closed (fixed)
Comments
Submitted by JamieCameron on Tue, 03/18/2014 - 23:43 Comment #1
That error means that the group was added to the LDAP DB, but didn't show up as a Unix group on the system.
Does CentOS 6.5 perhaps introduce a delay between when an LDAP entry is added and when it appears as a Unix user or group?
Submitted by marcusone on Tue, 03/18/2014 - 23:49 Pro Licensee Comment #2
It doesn't even make it into the LDAP DB! (Sorry, forgot to add that detail.)
So no delay, as it all works great in the Webmin modules.
Where does Virtualmin check for the group? Or where can I look for more error logs?
Submitted by JamieCameron on Tue, 03/18/2014 - 23:52 Comment #3
It might appear to be missing from LDAP because Virtualmin adds the LDAP entry, checks whether the group is visible, and if not, removes the entry so that the LDAP DB isn't full of partially created groups.
If you add a group in the Webmin module, is it immediately visible on the Virtualmin system?
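The add, check, and clean-up sequence described in this comment can be sketched roughly as follows. This is a minimal illustration in Python, not Virtualmin's actual code (which is written in Perl): the function name and the three callables are hypothetical stand-ins for the LDAP add, the Unix-level visibility check, and the LDAP delete.

```python
from typing import Callable


def create_group_with_rollback(
    name: str,
    add_entry: Callable[[str], None],
    is_visible: Callable[[str], bool],
    delete_entry: Callable[[str], None],
) -> bool:
    """Add a directory entry and keep it only if the system can see it.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    add_entry(name)
    if is_visible(name):
        return True
    # The entry was written but is not visible as a Unix group, so remove it
    # again rather than leave a partially created group in the directory.
    delete_entry(name)
    return False
```

With a visibility check that always fails, the entry is added and then immediately removed, which would look like an ADD followed by a DEL in the LDAP server's log.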
Submitted by marcusone on Wed, 03/19/2014 - 00:22 Pro Licensee Comment #4
Thanks, but I'm certain that it isn't in LDAP. The group is not visible anywhere, so it never got created. I can create the same group name in Webmin no problem... which then causes Virtualmin to fail, saying the group already exists.
Where in the code does it make the call to ldapadd, or does it go via the Webmin module?
Is there no debugging log for Virtualmin/Webmin?
[edit] If I force the group name to 'test', an already existing, freshly created group in LDAP, I get "Failed to create virtual server : A unix group named test already exists - try selecting a different administration username"
Submitted by JamieCameron on Wed, 03/19/2014 - 19:04 Comment #5
Virtualmin and Webmin add LDAP users by connecting directly to the DB and creating objects.
You might want to check the LDAP server log to see what operations Virtualmin is performing.
Submitted by marcusone on Wed, 03/19/2014 - 20:46 Pro Licensee Comment #6
Does Virtualmin use Webmin (the same module)? All the guides suggest you need Webmin's module to work and then Virtualmin will "just work".
Webmin works; Virtualmin doesn't.
THERE IS OBVIOUSLY A BUG in the Virtualmin create server script: the script deletes the group right after making it! See the 'DEL' command near the bottom of this snippet.
Mar 19 21:21:36 ems1 slapd[12388]: conn=1003 op=2 ADD dn="cn=webox,dc=Groups,dc=xx,dc=xx"
Mar 19 21:21:36 ems1 slapd[12388]: conn=1003 op=2 RESULT tag=105 err=0 text=
Mar 19 21:21:36 ems1 slapd[12388]: conn=1003 op=3 UNBIND
Mar 19 21:21:36 ems1 slapd[12388]: conn=1003 fd=22 closed
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 fd=22 ACCEPT from IP=xxx:45074 (IP=0.0.0.0:389)
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=0 EXT oid=1.3.6.1.4.1.1466.20037
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=0 STARTTLS
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=0 RESULT oid= err=0 text=
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 fd=22 TLS established tls_ssf=256 ssf=256
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=1 BIND dn="cn=Manager,dc=xx,dc=xx" method=128
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=1 BIND dn="cn=Manager,dc=xx,dc=xx" mech=SIMPLE ssf=0
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=1 RESULT tag=97 err=0 text=
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=2 DEL dn="cn=webox,dc=Groups,dc=xx,dc=cxxa"
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=2 RESULT tag=107 err=0 text=
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 op=3 UNBIND
Mar 19 21:21:37 ems1 slapd[12388]: conn=1004 fd=22 closed
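To pick the relevant write operations out of a longer slapd log like the one above, a small filter can help. The following is a sketch in Python that assumes the log line format shown in this excerpt; the function name write_operations is made up for illustration. In these lines, conn identifies the client connection and op is the sequence number of the operation within that connection.

```python
import re

# Matches lines such as:
#   ... slapd[12388]: conn=1003 op=2 ADD dn="cn=webox,dc=Groups,dc=xx,dc=xx"
OP_LINE = re.compile(r'conn=(\d+) op=(\d+) (ADD|DEL|MOD|MODRDN) dn="([^"]*)"')


def write_operations(lines, rdn_prefix):
    """Return (conn, op, verb, dn) for write operations on matching DNs.

    rdn_prefix is compared case-insensitively against the start of the DN,
    for example "cn=webox,".
    """
    found = []
    prefix = rdn_prefix.lower()
    for line in lines:
        match = OP_LINE.search(line)
        if match and match.group(4).lower().startswith(prefix):
            conn, op, verb, dn = match.groups()
            found.append((int(conn), int(op), verb, dn))
    return found
```

Run over the excerpt with the prefix "cn=webox,", this would report the ADD on conn=1003 followed by the DEL on conn=1004.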
Submitted by JamieCameron on Thu, 03/20/2014 - 21:28 Comment #7
That deletion is expected - see the reason mentioned in comment #3.
If you create a group in the "LDAP Users and Groups" module, does the LDAP server log the same ADD line?
Submitted by marcusone on Fri, 03/21/2014 - 12:55 Pro Licensee Comment #8
What command does it use to check if the group is visible? Perhaps you can point out where in the code this add and check are done so that I can debug any 'timing' issues, or perhaps add a delay in the script to see if it's an issue with some kind of cache or race condition.
The only difference I see is the 'op=5' in the above, and 'op=2' in the Virtualmin one. No idea what 'op' is...
Searches are done before and after the 'ADD', as expected.
Submitted by JamieCameron on Fri, 03/21/2014 - 20:33 Comment #9
I can send you an update to Virtualmin that waits longer for the group to appear, to hopefully fix the timing issue. Are you running the GPL or Pro version of Virtualmin?
Submitted by marcusone on Fri, 03/21/2014 - 20:40 Pro Licensee Comment #10
GPL at the moment thanks!
Submitted by marcusone on Fri, 03/21/2014 - 20:42 Pro Licensee Comment #11
Can you let me know what file is modified/etc so that I can debug if that doesn't work?
Submitted by JamieCameron on Sun, 03/23/2014 - 00:52 Comment #12
OK, I've emailed you an updated Virtualmin GPL RPM package which adds a longer wait time for new LDAP users and groups to show up.
Submitted by marcusone on Sun, 03/23/2014 - 10:37 Pro Licensee Comment #13
Thanks... but I'm getting the exact same error!
WHERE are the log files for Virtualmin?
What command are you using to check for the existence of groups?
Because during your delay, if I do a 'getent group webox' on the machine console, I get the correct response: "webox:*:500:". Then I get an error about the user not being found, so whatever check you use now works if I issue the 'getent' command during the pause.
But if I don't do the 'getent group ' during the pause in your script, I get the same error about the group not being found.
Also, the 'webox' group was deleted, yet Virtualmin and 'getent group' still think it exists. So some kind of caching is happening in the sssd daemon.
Submitted by marcusone on Sun, 03/23/2014 - 11:13 Pro Licensee Comment #14
By spamming 'getent group grpname; id username' during your pauses, I got the script to continue.
So I need you to change the commands you use to check users, or perhaps we can skip that check, just make sure they are added to LDAP correctly, and let the admin worry about them authenticating on the system?
Now when it continues, I get authentication errors when the email aliases for virtual users are added in Postfix:
aliases failed : LDAP add of mailLocalAddress=webox@webox.xx,dc=Virtual,dc=refamco,dc=ca failed : modifications require authentication at ../web-lib-funcs.pl line 1381.
Mail for domain failed! : LDAP add of mailLocalAddress=webox.xx,dc=Virtual,dc=xx,dc=xx failed : modifications require authentication at ../web-lib-funcs.pl line 1381
Yet adding a user works just fine once the server is created. [EDIT]: No, adding a user works, but no dc=Virtual entry is created :( [EDIT2]: Fixed my authentication issues, but it still doesn't add a Virtual entry for newly created users.
Submitted by JamieCameron on Sun, 03/23/2014 - 13:52 Comment #15
Ok, this looks like progress .. maybe. Virtualmin doesn't run any external command to check if a user exists; instead it uses the getpwent() and getgrent() Perl function calls for users and groups respectively. I bet the problem is that they cache the user and group lists within the Virtualmin miniserv.pl process, and so don't pick up the new LDAP entries right away. However, other processes like getent will re-query the LDAP server.
Are you sure there is no nscd process running on your system?
Submitted by marcusone on Sun, 03/23/2014 - 14:10 Pro Licensee Comment #16
Yes, I'm sure there is no nscd (it's not even installed), and I played with the cache settings in sssd (which actually made the issue worse, as then when you delete the group it isn't reflected in getent or id for 5 minutes, so I can't retry until the cache has expired).
Is there any way we can set Virtualmin to make external calls so that the users/groups get into the same cache that the Perl getpwent and getgrent are using? As I said, Virtualmin seems to pick up the change if a system call is made first, right after the group/user is created. Not sure why the Perl module isn't doing the same thing... likely a minor bug in the way the Perl module works.
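The two kinds of lookup being contrasted here, one made from inside a long-running process and one made by a freshly started external command, can be compared side by side. The sketch below uses Python's grp module as a stand-in for the Perl calls (both go through the C library's name service lookups); it is only an illustration, and all function names in it are hypothetical.

```python
import grp
import shutil
import subprocess


def visible_in_process(group_name):
    """Look the group up through the C library from inside this process."""
    try:
        grp.getgrnam(group_name)
        return True
    except KeyError:
        return False


def visible_via_getent(group_name):
    """Look the group up by running the external getent command."""
    getent = shutil.which("getent")
    if getent is None:
        # getent is not installed; treat the group as not visible.
        return False
    result = subprocess.run(
        [getent, "group", group_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def compare_lookups(group_name):
    """Return both answers so a disagreement between them is easy to spot."""
    return {
        "in_process": visible_in_process(group_name),
        "getent": visible_via_getent(group_name),
    }
```

Calling compare_lookups("webox") right after the group is added would show whether the two paths disagree on a given system.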
Submitted by marcusone on Sun, 03/23/2014 - 17:55 Pro Licensee Comment #17
Found the issue!!! YAY.
It's the SSSD daemon caching results from LDAP AND the Perl calls.
The first problem is that the Perl calls to find users and groups don't seem to trigger a lookup in the LDAP directory, so the user/group must already be in the cache.
However, the cache (no matter what I set it to) lasts for about 10 seconds or so (in the testing I did, with every cache setting I could find in sssd.conf set to 1). So the newly created entries are best pulled into the cache by running this during the delay in your script:
# rm -fr /var/lib/sss/db/*; /etc/init.d/sssd restart;getent group <new group>; id <new user>
I think that if LDAP is used, the install script should just make and check LDAP calls (and not check for the user/group at the system level). That would solve #1 and make #2 irrelevant.
Submitted by JamieCameron on Sun, 03/23/2014 - 22:28 Comment #18
I think what I'll do is have Virtualmin also try shelling out to getent to check if a new user or group exists. That should get around whatever cache is in the Perl process.
Submitted by marcusone on Tue, 03/25/2014 - 19:42 Pro Licensee Comment #19
Any progress on this? Or is it easy to disable the checks?
Submitted by JamieCameron on Wed, 03/26/2014 - 19:37 Comment #20
I've implemented this - and I can send you another beta version if you like?
Submitted by marcusone on Wed, 03/26/2014 - 19:52 Pro Licensee Comment #21
sure!
Submitted by JamieCameron on Wed, 03/26/2014 - 23:47 Comment #22
Ok, I have emailed you an updated RPM.
Submitted by marcusone on Sun, 03/30/2014 - 07:18 Pro Licensee Comment #23
Unfortunately that has not solved the problem :( very strange.
I'd much rather just have the option to remove the external check at this point...
Submitted by utweb-systems on Tue, 04/01/2014 - 09:37 Comment #24
We are also interested in this post. So we are subscribing to follow the progress.
Thanks,
-Alex
Submitted by utweb-systems on Fri, 05/02/2014 - 12:14 Comment #25
Jamie,
We are also starting to encounter this issue with our instance (shown in the screenshot below).
Thanks,
-Alex
Submitted by JamieCameron on Fri, 05/02/2014 - 14:09 Comment #26
Alex - can you check whether a user created in Webmin's LDAP Users and Groups module shows up as a Unix user on the system? i.e. can it be switched to with the "su" command?
Submitted by utweb-systems on Fri, 05/02/2014 - 14:22 Comment #27
Jamie,
Yes that worked just fine. It worked for the reseller account as well.
Thanks,
-Alex
Submitted by utweb-systems on Fri, 05/02/2014 - 14:23 Comment #28
Jamie,
However the delete did not...
Submitted by JamieCameron on Fri, 05/02/2014 - 16:27 Comment #29
That error with deletion is a separate, unrelated problem.
As for the first issue, this may be due to a propagation delay between when a user is added to LDAP and when it is visible to the Unix system. I can send you an update that should fix this ... which Linux distro is Virtualmin running on there?
Submitted by marcusone on Fri, 05/02/2014 - 21:53 Pro Licensee Comment #30
There is no delay in adding to LDAP and visibility on the linux system. The issue is the delay in the visibility to the method employed by Virtualmin to check the addition.
This issue has yet to be fixed!
Submitted by JamieCameron on Sat, 05/03/2014 - 13:29 Comment #31
Actually, Virtualmin 4.07 (which has been released to all users) includes both a longer delay in checking for new users to show up, and tries the getent command to find new users and groups (rather than just relying on the getpwent function call in Perl).
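The combination described here, waiting longer and also trying getent, can be sketched as a polling loop. This is an illustrative Python sketch, not the code shipped in Virtualmin 4.07: the function names are hypothetical, and the 10-second default timeout is an assumption based on the delay mentioned later in this thread.

```python
import grp
import shutil
import subprocess
import time


def group_exists(name):
    """Try an in-process lookup first, then fall back to the getent command."""
    try:
        grp.getgrnam(name)
        return True
    except KeyError:
        pass
    getent = shutil.which("getent")
    if getent is None:
        return False
    result = subprocess.run(
        [getent, "group", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_group(name, timeout=10.0, interval=1.0, check=group_exists):
    """Poll until check(name) succeeds or the timeout (in seconds) elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check(name):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The check parameter lets a different test be swapped in, for example one that queries the LDAP server directly instead of the system.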
Submitted by utweb-systems on Mon, 05/05/2014 - 11:38 Comment #32
Jamie,
We are running RHEL 6.5 x64.
Thanks,
-Alex
Submitted by JamieCameron on Mon, 05/05/2014 - 13:47 Comment #33
Alex - when you create a domain, is there a delay of around 10 seconds between the messages "Creating administration group" and ".. administration group was created but does not exist" ?
Submitted by utweb-systems on Mon, 05/05/2014 - 13:54 Comment #34
Jamie,
No, it is nearly immediate. We are running the latest version of Virtualmin Pro (as of Thursday last week).
Thanks,
-Alex
Submitted by JamieCameron on Mon, 05/05/2014 - 19:11 Comment #35
Alex - is there any chance I could remotely access this system to see what Virtualmin is doing internally when it fails like this?
Submitted by utweb-systems on Tue, 05/06/2014 - 09:10 Comment #36
Hello,
I'm working with Alex on this project and have done some additional testing this morning.
I am seeing the 10 second delay between "Creating administration group" and ".. administration group was created but does not exist!".
Parsing over the slapd logs I'm not seeing an ADD line for the group anywhere. I'll see it iterate over every group in existence, and then attempt to delete the administrative group it should have created as part of a cleanup process, but no evidence that it ever tried to create the group in LDAP in the first place.
If I do a watch -n1 "getent group | grep demo" on the panel host (the username I'm creating has demo in it), I don't see it ever pop up during the creation process.
The last ADD notification I see is back on May 2nd, before we updated to 4.07.
-Scotty (via Alex's thread)
Submitted by JamieCameron on Tue, 05/06/2014 - 20:05 Comment #37
Scotty - if you go to Webmin -> System -> LDAP Users and Groups and create a group, does that creation event show up in the LDAP server's log?
Submitted by utweb-systems on Wed, 05/07/2014 - 08:45 Comment #38
Oddly, I don't see an ADD event for that. However, a group does indeed get created in LDAP.
Submitted by JamieCameron on Wed, 05/07/2014 - 16:09 Comment #39
Perhaps there is some additional logging that needs to be turned on? I'm very interested to know if the group is being created wrongly, or is just not visible from the Virtualmin system.
Submitted by utweb-systems on Thu, 05/08/2014 - 11:49 Comment #40
All,
We were able to demonstrate success on this. It turns out there was a misconfiguration on the group "key" in the sssd configuration that was causing our issue. Now we are able to successfully create virtual servers in Virtualmin with the account information stored in our LDAP service.
Thanks for all the help!
Thanks,
-Alex
Submitted by JamieCameron on Thu, 05/08/2014 - 12:58 Comment #41
Great!
Submitted by Issues on Thu, 05/22/2014 - 13:01 Comment #42
Automatically closed -- issue fixed for 2 weeks with no activity.