By: Mulk Anand named 22 Aug 2016 at 3:49 a.m. CDT

37 Responses
Hi,

After completing clustering of GluuCE, I tried to retest dynamic registration for OIDC, but it is failing. Static registration is working fine. When testing dynamic registration, I get a page which says:

```
Select your OpenID Connect Identity Provider
sdggluu.iaglab.sdgc.com
Or enter your account name (eg. "mike@seed.gluu.org", or an IDP identifier (eg. "mitreid.org"):
```

After entering a valid user test1@sdggluu.iaglab.sdgc.com and submitting the request it takes me to the login form where username field is already populated with test1@sdggluu.iaglab.sdgc.com and is not editable. And by entering password and submitting the request results in invalid credentials message which is expected because the username should be only "test1" and not "test1@sdggluu.iaglab.sdgc.com". What could be the issue here?

By Mohib Zico Account Admin 22 Aug 2016 at 4:02 a.m. CDT

Please check the log and share any stack trace.

By Mulk Anand named 22 Aug 2016 at 4:05 a.m. CDT

Which log do I need to look into?

By Mohib Zico Account Admin 22 Aug 2016 at 4:12 a.m. CDT

oxauth.log. Tail it and provide only the error entries.

By Mulk Anand named 22 Aug 2016 at 4:19 a.m. CDT

OK, here it is:

```
[root@sdggluu logs]# more oxauth.log | grep "Provider"
2016-08-22 04:23:23,521 INFO [org.jboss.seam.init.Initialization] two components with same name, higher precedence wins: org.jboss.seam.cache.cacheProvider
2016-08-22 04:23:23,523 INFO [org.jboss.seam.init.Initialization] two components with same name, higher precedence wins: org.jboss.seam.persistence.persistenceProvide
2016-08-22 04:23:23,938 INFO [org.jboss.seam.Component] Component: org.jboss.seam.cache.cacheProvider, scope: APPLICATION, type: JAVA_BEAN, class: org.jboss.seam.cache.EhCacheProvider
2016-08-22 04:23:24,018 INFO [org.jboss.seam.Component] Component: org.jboss.seam.persistence.persistenceProvider, scope: STATELESS, type: JAVA_BEAN, class: org.jboss.seam.persistence.PersistenceProvider
2016-08-22 04:23:24,410 INFO [org.xdi.oxauth.model.util.JwtUtil] Adding Bouncy Castle Provider
2016-08-22 04:23:24,507 INFO [org.gluu.site.ldap.LDAPConnectionProvider] Attempting to create connection pool: 1
2016-08-22 04:23:24,685 INFO [org.gluu.site.ldap.LDAPConnectionProvider] Attempting to create connection pool: 1
2016-08-22 04:23:25,180 INFO [org.gluu.site.ldap.LDAPConnectionProvider] Attempting to create connection pool: 1
2016-08-22 04:23:25,213 INFO [org.gluu.site.ldap.LDAPConnectionProvider] Attempting to create connection pool: 1
```

By Sahil Arora user 22 Aug 2016 at 4:20 a.m. CDT

Try entering the IDP identifier "sdggluu.iaglab.sdgc.com" instead of the user account "test1@sdggluu.iaglab.sdgc.com", and let us know if that works.

By Mulk Anand named 22 Aug 2016 at 4:23 a.m. CDT

It didn't work. After submitting as per your suggestion, this is what I got:

```
Error: Invalid Request
Description: Could not find valid provider metadata for the selected OpenID Connect provider; contact the administrator
```

And the URL in the browser is:

https://dynamic022.iaglab.sdgc.com:44443/dynamic/fake_redirect_uri?target_link_uri=https%3A%2F%2Fdynamic022.iaglab.sdgc.com%3A44443%2Fdynamic%2F&x_csrf=Dh4QMnR3Isk&iss=sdggluu.iaglab.sdgc.com
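The query string of that URL can be decoded to see exactly which values were submitted. The following is a minimal sketch using only the Python standard library; the function name `decode_query` is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def decode_query(url: str) -> dict:
    """Return the query parameters of a URL as a flat dict (first value of each key)."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


# URL copied from the post above.
url = (
    "https://dynamic022.iaglab.sdgc.com:44443/dynamic/fake_redirect_uri"
    "?target_link_uri=https%3A%2F%2Fdynamic022.iaglab.sdgc.com%3A44443%2Fdynamic%2F"
    "&x_csrf=Dh4QMnR3Isk&iss=sdggluu.iaglab.sdgc.com"
)
params = decode_query(url)
print(params["iss"])              # the identifier that was typed into the form
print(params["target_link_uri"])  # the page to return to after login
```

Here `iss` carries the bare host name that was entered, without an `https://` prefix.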

By Sahil Arora user 22 Aug 2016 at 5:46 a.m. CDT

Please try loading the auth_openidc module again and restarting Apache:

`sudo a2enmod auth_openidc`
`sudo service apache2 restart`

If that does not work, please share the Apache error logs for OIDC and let us know whether you see any metadata files at /var/cache/apache2/mod_auth_openidc/metadata

Thanks,
Support@Gluu

By Mulk Anand named 22 Aug 2016 at 7 a.m. CDT

a2enmod is not available for CentOS. I disabled mod_auth_openidc and restarted Apache, then enabled it again and restarted Apache. The issue still remains. There is no such directory as /var/cache/apache2/mod_auth_openidc/metadata. Here is the Apache error log:

```
[Sun Aug 21 03:26:01.853412 2016] [auth_digest:notice] [pid 2936] AH01757: generating secret for digest authentication ...
[Sun Aug 21 03:26:01.854353 2016] [lbmethod_heartbeat:notice] [pid 2936] AH02282: No slotmem from mod_heartmonitor
[Sun Aug 21 03:26:01.859610 2016] [mpm_prefork:notice] [pid 2936] AH00163: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips configured -- resuming normal operations
[Sun Aug 21 03:26:01.859619 2016] [core:notice] [pid 2936] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Mon Aug 22 07:28:21.321055 2016] [core:notice] [pid 5547] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Mon Aug 22 07:28:21.322295 2016] [suexec:notice] [pid 5547] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Mon Aug 22 07:28:21.323334 2016] [ssl:warn] [pid 5547] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:28:21.323358 2016] [ssl:warn] [pid 5547] AH01909: RSA certificate configured for static022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:28:21.323841 2016] [ssl:warn] [pid 5547] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:28:21.323852 2016] [ssl:warn] [pid 5547] AH01909: RSA certificate configured for dynamic022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:28:21.323950 2016] [ssl:warn] [pid 5547] AH02292: Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Mon Aug 22 07:28:21.361680 2016] [auth_digest:notice] [pid 5547] AH01757: generating secret for digest authentication ...
[Mon Aug 22 07:28:21.362723 2016] [lbmethod_heartbeat:notice] [pid 5547] AH02282: No slotmem from mod_heartmonitor
[Mon Aug 22 07:28:21.363738 2016] [ssl:warn] [pid 5547] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:28:21.363759 2016] [ssl:warn] [pid 5547] AH01909: RSA certificate configured for static022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:28:21.364251 2016] [ssl:warn] [pid 5547] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:28:21.364268 2016] [ssl:warn] [pid 5547] AH01909: RSA certificate configured for dynamic022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:28:21.364376 2016] [ssl:warn] [pid 5547] AH02292: Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Mon Aug 22 07:28:21.371203 2016] [mpm_prefork:notice] [pid 5547] AH00163: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips configured -- resuming normal operations
[Mon Aug 22 07:28:21.371228 2016] [core:notice] [pid 5547] AH00094: Command line: '/usr/sbin/httpd2 -f /etc/httpd2/conf/httpd.conf -D FOREGROUND'
[Mon Aug 22 07:42:42.887660 2016] [mpm_prefork:notice] [pid 5547] AH00170: caught SIGWINCH, shutting down gracefully
[Mon Aug 22 07:42:43.972100 2016] [core:notice] [pid 5758] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Mon Aug 22 07:42:43.973236 2016] [suexec:notice] [pid 5758] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Mon Aug 22 07:42:43.974199 2016] [ssl:warn] [pid 5758] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:42:43.974221 2016] [ssl:warn] [pid 5758] AH01909: RSA certificate configured for static022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:42:43.974721 2016] [ssl:warn] [pid 5758] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:42:43.974732 2016] [ssl:warn] [pid 5758] AH01909: RSA certificate configured for dynamic022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:42:43.974820 2016] [ssl:warn] [pid 5758] AH02292: Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Mon Aug 22 07:42:44.010487 2016] [so:warn] [pid 5758] AH01574: module auth_openidc_module is already loaded, skipping
[Mon Aug 22 07:42:44.014368 2016] [auth_digest:notice] [pid 5758] AH01757: generating secret for digest authentication ...
[Mon Aug 22 07:42:44.015364 2016] [lbmethod_heartbeat:notice] [pid 5758] AH02282: No slotmem from mod_heartmonitor
[Mon Aug 22 07:42:44.016354 2016] [ssl:warn] [pid 5758] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:42:44.016376 2016] [ssl:warn] [pid 5758] AH01909: RSA certificate configured for static022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:42:44.016844 2016] [ssl:warn] [pid 5758] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:42:44.016856 2016] [ssl:warn] [pid 5758] AH01909: RSA certificate configured for dynamic022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:42:44.016940 2016] [ssl:warn] [pid 5758] AH02292: Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Mon Aug 22 07:42:44.025754 2016] [mpm_prefork:notice] [pid 5758] AH00163: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips configured -- resuming normal operations
[Mon Aug 22 07:42:44.025782 2016] [core:notice] [pid 5758] AH00094: Command line: '/usr/sbin/httpd2 -f /etc/httpd2/conf/httpd.conf -D FOREGROUND'
[Mon Aug 22 07:51:10.761055 2016] [mpm_prefork:notice] [pid 5758] AH00170: caught SIGWINCH, shutting down gracefully
[Mon Aug 22 07:51:11.839380 2016] [core:notice] [pid 5889] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Mon Aug 22 07:51:11.840592 2016] [suexec:notice] [pid 5889] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Mon Aug 22 07:51:11.841597 2016] [ssl:warn] [pid 5889] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:51:11.841618 2016] [ssl:warn] [pid 5889] AH01909: RSA certificate configured for static022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:51:11.842123 2016] [ssl:warn] [pid 5889] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:51:11.842135 2016] [ssl:warn] [pid 5889] AH01909: RSA certificate configured for dynamic022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:51:11.842258 2016] [ssl:warn] [pid 5889] AH02292: Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Mon Aug 22 07:51:11.875645 2016] [so:warn] [pid 5889] AH01574: module auth_openidc_module is already loaded, skipping
[Mon Aug 22 07:51:11.879090 2016] [auth_digest:notice] [pid 5889] AH01757: generating secret for digest authentication ...
[Mon Aug 22 07:51:11.880127 2016] [lbmethod_heartbeat:notice] [pid 5889] AH02282: No slotmem from mod_heartmonitor
[Mon Aug 22 07:51:11.881115 2016] [ssl:warn] [pid 5889] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:51:11.881136 2016] [ssl:warn] [pid 5889] AH01909: RSA certificate configured for static022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:51:11.881732 2016] [ssl:warn] [pid 5889] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Mon Aug 22 07:51:11.881749 2016] [ssl:warn] [pid 5889] AH01909: RSA certificate configured for dynamic022.iaglab.sdgc.com:443 does NOT include an ID which matches the server name
[Mon Aug 22 07:51:11.881844 2016] [ssl:warn] [pid 5889] AH02292: Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
[Mon Aug 22 07:51:11.889586 2016] [mpm_prefork:notice] [pid 5889] AH00163: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips configured -- resuming normal operations
[Mon Aug 22 07:51:11.889622 2016] [core:notice] [pid 5889] AH00094: Command line: '/usr/sbin/httpd2 -f /etc/httpd2/conf/httpd.conf -D FOREGROUND'
[Mon Aug 22 07:52:06.058854 2016] [auth_openidc:warn] [pid 5893] [client 10.20.12.5:50585] oidc_util_file_read: no file found at: "/var/www2/html/metadata/test1.provider", referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/
[Mon Aug 22 07:52:09.571729 2016] [auth_openidc:error] [pid 5893] [client 10.20.12.5:50585] oidc_util_http_call: curl_easy_perform() failed on: https://test1/.well-known/openid-configuration (Could not resolve host: test1; Name or service not known), referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/
[Mon Aug 22 07:52:18.633696 2016] [auth_openidc:error] [pid 5901] [client 10.20.12.5:50588] oidc_clean_expired_state_cookies: state has expired, referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/
[Mon Aug 22 07:57:57.474567 2016] [auth_openidc:warn] [pid 5890] [client 10.20.12.5:50772] oidc_util_file_read: no file found at: "/var/www2/html/metadata/sdggluu.iaglab.sdgc.com.provider", referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/
[Mon Aug 22 07:57:57.566573 2016] [auth_openidc:warn] [pid 5890] [client 10.20.12.5:50772] oidc_util_file_read: no file found at: "/var/www2/html/metadata/sdggluu.iaglab.sdgc.com.client", referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/
```

Although the logs above indicate that the client and provider files were not found, I can see that they were generated in the directory:

```
[root@SDGVMLAB022 logs]# ls -ltr /var/www2/html/metadata/
total 36
-rw-r--r--. 1 apache apache 7023 Jul 25 05:59 sdgvmlab019.iaglab.sdgc.com.provider.bkp
-rw-r--r--. 1 apache apache 1152 Aug 17 10:27 sdgvmlab019.iaglab.sdgc.com.client.bkp
-rw-r--r--. 1 apache apache 6236 Aug 19 07:24 sdggluu.iaglab.sdgc.com.provider.bkp
-rw-r--r--. 1 apache apache 1148 Aug 19 07:24 sdggluu.iaglab.sdgc.com.client.bkp
-rw-r--r--. 1 apache apache 6236 Aug 22 07:57 sdggluu.iaglab.sdgc.com.provider
-rw-r--r--. 1 apache apache 1148 Aug 22 07:57 sdggluu.iaglab.sdgc.com.client
```

By Sahil Arora user 22 Aug 2016 at 7:48 a.m. CDT

Thanks for sharing the logs. From the errors below, it looks like the metadata file is missing at the following location:

oidc_util_file_read: no file found at: "/var/www2/html/metadata/sdggluu.iaglab.sdgc.com.provider"

Please make sure the path exists and has the required ownership (apache:apache).

[Mon Aug 22 07:52:06.058854 2016] [auth_openidc:warn] [pid 5893] [client 10.20.12.5:50585] oidc_util_file_read: no file found at: "/var/www2/html/metadata/test1.provider", referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/

[Mon Aug 22 07:52:09.571729 2016] [auth_openidc:error] [pid 5893] [client 10.20.12.5:50585] oidc_util_http_call: curl_easy_perform() failed on: https://test1/.well-known/openid-configuration (Could not resolve host: test1; Name or service not known), referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/

[Mon Aug 22 07:52:18.633696 2016] [auth_openidc:error] [pid 5901] [client 10.20.12.5:50588] oidc_clean_expired_state_cookies: state has expired, referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/

[Mon Aug 22 07:57:57.474567 2016] [auth_openidc:warn] [pid 5890] [client 10.20.12.5:50772] oidc_util_file_read: no file found at: "/var/www2/html/metadata/sdggluu.iaglab.sdgc.com.provider", referer: https://dynamic022.iaglab.sdgc.com:44443/dynamic/

Please paste the contents of /etc/httpd/conf.d/dynamic.conf

By Mulk Anand named 22 Aug 2016 at 8:39 a.m. CDT

Well, I had mentioned in my earlier post that these files do exist there. Also, the dynamic.conf file already has the path defined. Here are the contents of dynamic.conf:

```
<VirtualHost *:44443>
    ServerName dynamic022.iaglab.sdgc.com
    DocumentRoot /var/www2/html

    OIDCMetadataDir /var/www2/html/metadata
    OIDCClientSecret secret
    OIDCRedirectURI https://dynamic022.iaglab.sdgc.com:44443/dynamic/fake_redirect_uri
    OIDCCryptoPassphrase secret
    OIDCSSLValidateServer Off

    <Location /dynamic/>
        AuthType openid-connect
        Require valid-user
    </Location>

    SSLEngine On
    SSLCertificateFile /etc/httpd2/certs/apache.crt
    SSLCertificateKeyFile /etc/httpd2/certs/apache.key
</VirtualHost>
```

By Sahil Arora user 23 Aug 2016 at 11:52 a.m. CDT

I'm able to register an OpenID dynamic host successfully on my local system. I kept the cert-key pair location the same as mentioned in the document:

`SSLCertificateFile /etc/pki/tls/certs/localhost.crt`
`SSLCertificateKeyFile /etc/pki/tls/private/localhost.key`

Here, both the certificate and key files already exist on the server.

`ln -s /etc/httpd/sites-available/dynamic.conf`
`service httpd restart`

Please make these changes and restart Apache.

In your case, OIDC cannot read the metadata file for some reason. Please check the permissions of /var/www/html:

`ll /var/www/html`

- drwxr-xr-x. 2 apache apache 4096 Aug 23 14:53 dynamic
- drwxr-xr-x. 2 apache apache 4096 Aug 23 15:12 metadata
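The permission check can also be scripted. This is a minimal sketch, assuming it is run as the same user Apache runs as, so that the readability flag reflects what the module would see; the function name `describe_directory` is hypothetical and the directory path is the one from the configuration posted earlier.

```python
import os
import stat


def describe_directory(directory: str) -> list:
    """Return (name, mode string, readable-by-current-user) for each entry."""
    rows = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        mode = stat.filemode(os.stat(path).st_mode)  # e.g. "-rw-r--r--"
        rows.append((name, mode, os.access(path, os.R_OK)))
    return rows


METADATA_DIR = "/var/www2/html/metadata"  # path taken from the thread
if os.path.isdir(METADATA_DIR):
    for name, mode, readable in describe_directory(METADATA_DIR):
        print(mode, "readable" if readable else "NOT readable", name)
```

The mode string matches the first column of `ls -l`, without the trailing dot that marks an SELinux context.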

By Mulk Anand named 29 Aug 2016 at 5:32 a.m. CDT

I have done all of those things, but I am still unable to make it work. I have also removed the dynamic and metadata directories and recreated them with the apache user as their owner, but the issue remains.

By Sahil Arora user 29 Aug 2016 at 5:32 a.m. CDT

We are closing this ticket. Let us know should you have further questions.

By Mulk Anand named 29 Aug 2016 at 5:34 a.m. CDT

One more thing..it was working earlier when I had standalone Gluu server. Recently i have made it part of cluster and after that I am facing this issue. Static one is working fine.

By Sahil Arora user 29 Aug 2016 at 7:44 a.m. CDT

It seems your client is not able to communicate with the Gluu server (the identity provider). Please check the connectivity and also make sure the servers are in time sync. You can also try entering "https://sdggluu.iaglab.sdgc.com" on the OIDC Identity Provider page.

If nothing works, you can create the metadata manually. The metadata is available at https://your.gluu.host.name/.well-known/openid-configuration
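Creating the provider metadata by hand could look roughly like the sketch below. It assumes an issuer without a path component and follows the `<host>.provider` file naming seen in the Apache error log earlier in this thread. The function names are made up for illustration, and `urllib` applies its default certificate verification, so a self-signed server certificate would have to be trusted first.

```python
import json
import os
import urllib.request


def issuer_host(issuer: str) -> str:
    """Strip an optional https:// prefix and trailing slash from the issuer."""
    if issuer.startswith("https://"):
        issuer = issuer[len("https://"):]
    return issuer.rstrip("/")


def discovery_url(issuer: str) -> str:
    """Build the OpenID Connect discovery URL for an issuer host or URL."""
    return "https://" + issuer_host(issuer) + "/.well-known/openid-configuration"


def provider_file(metadata_dir: str, issuer: str) -> str:
    """Path of the provider metadata file, named as in the log: <host>.provider."""
    return os.path.join(metadata_dir, issuer_host(issuer) + ".provider")


def save_provider_metadata(metadata_dir: str, issuer: str) -> str:
    """Download the discovery document and store it in the metadata directory."""
    with urllib.request.urlopen(discovery_url(issuer), timeout=10) as response:
        document = json.load(response)
    path = provider_file(metadata_dir, issuer)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(document, handle, indent=2)
    return path
```

For example, `save_provider_metadata("/var/www2/html/metadata", "sdggluu.iaglab.sdgc.com")` would write the file the module was looking for; the file still needs to be readable by the Apache user afterwards.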

By Mulk Anand named 29 Aug 2016 at 8:33 a.m. CDT

Time sync is not an issue. I have verified the time on all 3 VM nodes and they are in sync. If time sync were the issue, my other use cases would have failed too, but static registration and outbound SAML are working fine.

For testing, I created a user, "test11@sdggluu.iaglab.sdgc.com", and when I use this, it works fine. But with the user "test1" (no xxx@sdggluu.iaglab.sdgc.com format), it gives me the issue.

By Mulk Anand named 29 Aug 2016 at 8:47 a.m. CDT

Also, I have compared the contents of sdggluu.iaglab.sdgc.com.provider with the details I get by accessing https://sdggluu.iaglab.sdgc.com/.well-known/openid-configuration, and both are exactly the same. Please note that I have deleted the .provider and .client files multiple times and never had a problem getting them regenerated when doing fresh testing.
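A comparison like this is easier to trust when it ignores key order and whitespace. The following is a small sketch, assuming both documents are JSON objects; the function name `metadata_differences` is hypothetical.

```python
import json


def metadata_differences(local_text: str, remote_text: str) -> dict:
    """Map each top-level key whose value differs to a (local, remote) pair.

    A key missing on one side is reported with None for that side.
    """
    local = json.loads(local_text)
    remote = json.loads(remote_text)
    return {
        key: (local.get(key), remote.get(key))
        for key in sorted(set(local) | set(remote))
        if local.get(key) != remote.get(key)
    }


# Same content in a different key order compares as equal.
a = '{"issuer": "https://sdggluu.iaglab.sdgc.com", "scopes_supported": ["openid"]}'
b = '{"scopes_supported": ["openid"], "issuer": "https://sdggluu.iaglab.sdgc.com"}'
print(metadata_differences(a, b))
```

An empty result means the stored `.provider` file and the live discovery document agree on every top-level field.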

By Aliaksandr Samuseu staff 29 Aug 2016 at 9:24 a.m. CDT

Hi, Mulk.

Could you briefly explain again what your current issue is? From the opening post:

> After entering a valid user test1@sdggluu.iaglab.sdgc.com and submitting the request it takes me to the login form where username field is already populated with test1@sdggluu.iaglab.sdgc.com and is not editable. And by entering password and submitting the request results in invalid credentials message which is expected because the username should be only "test1" and not "test1@sdggluu.iaglab.sdgc.com"

...it seems that, in general, your setup is functional. Sahil and I have both confirmed that the login cannot be changed in this one particular case, and we'll check with the dev team whether that is the expected behaviour.

But providing `https://sdggluu.iaglab.sdgc.com` instead of `test1@sdggluu.iaglab.sdgc.com` on this mod_auth_openidc page *should* work for you, resulting in the Gluu login page with an empty login form where you should be able to enter any credentials. It currently works like that for both me and Sahil.
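The difference between the two kinds of input can be pictured with a small sketch. It is a simplified model of the behaviour reported in this thread, not the module's actual code: it assumes an account name selects the provider by its domain part and is forwarded as a login hint, while real account-based discovery may involve an additional lookup. All names in it are hypothetical.

```python
from typing import NamedTuple, Optional


class DiscoveryInput(NamedTuple):
    issuer_host: str           # host used to locate the provider
    login_hint: Optional[str]  # value pre-filled into the login form, if any


def interpret_identifier(value: str) -> DiscoveryInput:
    """Classify what was typed into the discovery page (simplified model)."""
    value = value.strip()
    if "@" in value and not value.startswith("https://"):
        # Account name: the domain part picks the provider, and the whole
        # value travels along as a hint, hence the locked username field.
        return DiscoveryInput(value.rsplit("@", 1)[1], value)
    if value.startswith("https://"):
        value = value[len("https://"):]
    # Plain issuer: no hint, so the login form starts out empty.
    return DiscoveryInput(value.rstrip("/"), None)


print(interpret_identifier("test1@sdggluu.iaglab.sdgc.com"))
print(interpret_identifier("https://sdggluu.iaglab.sdgc.com"))
print(interpret_identifier("test1"))
```

The last case mirrors the Apache log entry where entering just `test1` led to a lookup of `https://test1/.well-known/openid-configuration`.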

By Aliaksandr Samuseu staff 29 Aug 2016 at 9:29 a.m. CDT

Please check the attached screenshot showing what the mod_auth_openidc "discovery page" looks like for me.

By Aliaksandr Samuseu staff 29 Aug 2016 at 9:46 a.m. CDT

> One more thing..it was working earlier when I had standalone Gluu server. Recently i have made it part of cluster and after that I am facing this issue. Static one is working fine.

Ok, got it. Could you open your current sdggluu.iaglab.sdgc.com.client metadata for a registered client, take the "client_id" from there, then check whether a client with that id exists in your Gluu cluster, **on both nodes**?

One way to do that is to set `sdggluu.iaglab.sdgc.com` to resolve to the IP of the 1st node in the `/etc/hosts` file of the machine where your browser is running, wait for a minute, then access the web UI. Then set it to resolve to the IP of the 2nd node, wait for a minute, and access the web UI again. Once there, open the page where OpenID clients are listed, hit the "Search" button, and check whether one of them has an inum that matches that "client_id".
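Pulling the id out of the file can be done with a few lines of Python. This sketch assumes the `.client` file is a JSON document with a top-level `client_id` key, as the instructions above imply; the function name is made up and the path is the one shown in the earlier directory listing.

```python
import json
import os


def read_client_id(client_file: str) -> str:
    """Return the client_id stored in a dynamically registered .client file."""
    with open(client_file, encoding="utf-8") as handle:
        return json.load(handle)["client_id"]


CLIENT_FILE = "/var/www2/html/metadata/sdggluu.iaglab.sdgc.com.client"
if os.path.exists(CLIENT_FILE):
    # Compare this value with the inum column of the client list on each node.
    print(read_client_id(CLIENT_FILE))
```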

By Mulk Anand named 29 Aug 2016 at 9:59 a.m. CDT

Sorry...need to go home now. Will go through your suggestions tomorrow and update you. Thanks for your help.

By Mulk Anand named 30 Aug 2016 at 3:59 a.m. CDT

Hi Alex,

As per your suggestion to enter https://sdggluu.iaglab.sdgc.com instead of the user ID, I got the login form with blank username/password fields and was able to test it fine. Thank you.

Per your second suggestion to verify client id on the second node of the cluster, first off I noticed that it was taking me to SuperGluu authentication which I had enabled few weeks back and then had reverted back on node 1 but on node 2 it was still active. I manually removed SuperGluu auth and made it internal. After logging into the admin console on node 2, I noticed there were no new client_ids there.

This is a new issue that I am seeing. I was under impression that after I was able to make csync work, both nodes are now in sync but it does not seem to be the case. Can you please assist?

By Mohib Zico Account Admin 30 Aug 2016 at 4:03 a.m. CDT

> Per your second suggestion to verify client id on the second node of the cluster, first off I noticed that it was taking me to SuperGluu authentication which I had enabled few weeks back and then had reverted back on node 1 but on node 2 it was still active.

LDAP replication is broken. Make sure both LDAP servers are replicating properly.

By Mulk Anand named 30 Aug 2016 at 4:14 a.m. CDT

Can you please share the command to run the replication?

By Sahil Arora user 30 Aug 2016 at 4:59 a.m. CDT

Please refer to "LDAP Replication" [here](https://www.gluu.org/docs/cluster/)

By Aliaksandr Samuseu staff 30 Aug 2016 at 5:45 a.m. CDT

I agree with Zico: you need to test LDAP replication between the nodes (testing csync2 while you are at it won't hurt either).

> I was under impression that after I was able to make csync work, both nodes are now in sync but it does not seem to be the case

Most of Gluu's configuration is now stored in its internal LDAP directory. csync still plays an important role, but it mostly syncs Tomcat's and Shibboleth's configuration (and the latter will be moved into LDAP in the future too).

Please run this command in the container on each node and share the output with us:

`# /opt/opendj/bin/dsreplication status --hostname 127.0.0.1 --port 4444 -I 'admin' -w 'GLOBAL_ADMIN_PASS' --trustAll --no-prompt`

You can check whether LDAP replication is functional by modifying an attribute of any entry in the `userRoot` context ("o=gluu"), such as the `displayname` attribute of some user entry, and checking whether the change appears on the 2nd node. You can use console tools like **ldapsearch/ldapmodify**, or a GUI like **Jxplorer**.

In any case, it seems that to fix this you'll need to choose the node with the most up-to-date/valid contents, then disable LDAP replication, re-enable it, and re-initialize the LDAP directory on the other node from it. Please check the cluster guide for details (it's been updated recently).
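For the attribute test, the change can be prepared as a standard LDIF modify record and fed to ldapmodify on one node, then looked up with ldapsearch on the other. Below is a minimal sketch that only builds the LDIF text; it assumes plain ASCII values that need no base64 encoding, and the DN in the example is a made-up placeholder, not a real entry.

```python
def displayname_change_ldif(entry_dn: str, new_value: str) -> str:
    """Build an LDIF record that replaces displayName on one entry."""
    return "\n".join([
        f"dn: {entry_dn}",
        "changetype: modify",
        "replace: displayName",
        f"displayName: {new_value}",
        "",  # LDIF records end with a blank line
    ])


# Hypothetical DN; substitute the DN of a real test user under o=gluu.
print(displayname_change_ldif("inum=EXAMPLE,ou=people,o=EXAMPLE,o=gluu",
                              "replication test 1"))
```

Using a distinctive value such as a timestamp makes it obvious whether the second node received this particular change.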

By Mulk Anand named 30 Aug 2016 at 6:27 a.m. CDT

Hi Alex,

Here is the output from node 1:

```
Suffix DN : Server                           : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
----------:----------------------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
o=gluu    : 10.10.10.120:4444                : 1879    : true                : 11809 : 2456  : 8989        : 0        :              : true
o=gluu    : 10.10.10.149:4444                : 12152   : true                : 673   : 1286  : 8989        : 0        :              : true
o=gluu    : SDGVMLAB020.iaglab.sdgc.com:4444 : 1879    : true                : 11809 : 2456  : 8989        : 0        :              : true
o=gluu    : sdggluu.iaglab.sdgc.com:4444     : 12152   : true                : 673   : 1286  : 8989        : 0        :              : true
o=site    : 10.10.10.149:4444                : 2       : true                : 18904 : 1286  : 8989        : 0        :              : true
o=site    : sdggluu.iaglab.sdgc.com:4444     : 2       : true                : 18904 : 1286  : 8989        : 0        :              : true
o=site    : 10.10.10.120:4444                : 2       :                     :       :       :             :          :              :
o=site    : SDGVMLAB020.iaglab.sdgc.com:4444 : 2       :                     :       :       :             :          :              :

[1] The port used to communicate between the servers whose contents are being replicated.
[2] The number of changes that are still missing on this server (and that have been applied to at least one of the other servers).
[3] Age of oldest missing change: the date on which the oldest change that has not arrived on this server was generated.
[4] Whether the replication communication through the replication port is encrypted or not.
```

And this is what I have on node 2:

```
[root@sdggluu logs]# /opt/opendj/bin/dsreplication status --hostname 127.0.0.1 --port 4444 -I 'admin' -w 'xxxxxx' --trustAll --no-prompt

No replication information found.
```
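The entry counts in the node 1 table already differ between servers for `o=gluu` (1879 versus 12152). A short sketch that pulls those counts out of the status text and flags such suffixes follows; it assumes the colon-separated layout shown above, and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# suffix : server : entries :  (the server cell itself contains host:port)
ROW = re.compile(r"^(.+?)\s+:\s+(\S+)\s+:\s+(\d+)\s+:")


def entry_counts(status_output: str) -> dict:
    """Return {suffix: {server: entries}} parsed from dsreplication status text."""
    counts = defaultdict(dict)
    for line in status_output.splitlines():
        match = ROW.match(line)
        if match:
            suffix, server, entries = match.groups()
            counts[suffix][server] = int(entries)
    return dict(counts)


def diverging_suffixes(status_output: str) -> dict:
    """Keep only suffixes whose servers report different entry counts."""
    return {
        suffix: servers
        for suffix, servers in entry_counts(status_output).items()
        if len(set(servers.values())) > 1
    }


# Abbreviated rows taken from the table above.
SAMPLE = """\
Suffix DN : Server : Entries : Replication enabled
o=gluu : 10.10.10.120:4444 : 1879 : true
o=gluu : 10.10.10.149:4444 : 12152 : true
o=site : 10.10.10.149:4444 : 2 : true
o=site : 10.10.10.120:4444 : 2 : :
"""
print(diverging_suffixes(SAMPLE))
```

Equal counts do not prove the data is identical, but unequal counts with zero missing changes reported are a strong hint that the servers are not really replicating.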

By Mulk Anand named 30 Aug 2016 at 6:29 a.m. CDT

I restarted Gluu on node 2, but I still get the same message I pasted above. Also, I created a new user from the node 1 GluuCE console, but it has not been replicated to node 2.

By Sahil Arora user 31 Aug 2016 at 5:05 a.m. CDT

Hi Mulk,

As suggested by Alex, please disable LDAP replication, re-enable it, and re-initialize the LDAP directory on the other node. Please check the [cluster guide](https://www.gluu.org/docs/cluster/) for details (it's been updated recently).

By Mulk Anand named 31 Aug 2016 at 5:31 a.m. CDT

Hi Sahil,

Can you please tell me how to disable LDAP replication? The guide talks about enabling it, but not about how to disable it.

By Aliaksandr Samuseu staff 31 Aug 2016 at 5:59 a.m. CDT

Hi, Mulk.

Here are the commands you need. Run this one in the container of the node where the `status` command from before showed that replication is still enabled:

`# /opt/opendj/bin/dsreplication disable --disableAll -I 'admin' -w 'REPLICATION_ADMIN_PASS' --trustAll --no-prompt`

Then choose the node that has the most recent changes, move into its container and re-enable replication:

`# /opt/opendj/bin/dsreplication enable -I 'admin' -w 'REPLICATION_ADMIN_PASS' -b 'o=gluu' -h 127.0.0.1 -p 4444 -D 'cn=directory manager' --bindPassword1 'LDAP_PASS_OF_INSTANCE_ON_THIS_NODE' -r 8989 -O hostname.or.ip.of.the.other.node --port2 4444 --bindDN2 'cn=directory manager' --bindPassword2 'LDAP_PASS_OF_INSTANCE_ON_THE_OTHER_NODE' -R 8989 --secureReplication1 --secureReplication2 -X -n`

...then re-initialize the other node from there:

`# /opt/opendj/bin/dsreplication initialize-all --hostname 127.0.0.1 --port 4444 -I 'admin' -w 'REPLICATION_ADMIN_PASS' -b 'o=gluu' --trustAll --no-prompt`

Please refer to the [corresponding chapters](https://backstage.forgerock.com/#!/docs/opendj/3/admin-guide/chap-replication) of OpenDJ's admin guide when in doubt at any step.

By Mulk Anand named 31 Aug 2016 at 8:56 a.m. CDT

I tried to follow the steps you gave, but it doesn't seem to be working. This is what I got when I started to re-initialize:

```
[root@sdggluu logs]# /opt/opendj/bin/dsreplication initialize-all --hostname 10.10.10.149 --port 4444 -I 'admin' -w 'xxxxxxx' -b 'o=gluu' --trustAll --no-prompt

Initializing base DN o=gluu with the contents from 10.10.10.149:4444:
0 entries processed (0 % complete).
Error during the initialization with contents from server 10.10.10.149:4444. Last log details: [31/Aug/2016:09:28:09 -0400] severity="NOTICE" msgCount=0 msgID=org.opends.messages.backend-413 message="Initialize Backend task dsreplication-initialize-4 started execution". Task state: STOPPED_BY_ERROR. Check the error logs of 10.10.10.149:4444 for more information.
See /tmp/opendj-replication-5069699207384341436.log for a detailed log of this operation.
Details: com.forgerock.opendj.cli.ClientException: Error during the initialization with contents from server 10.10.10.149:4444. Last log details: [31/Aug/2016:09:28:09 -0400] severity="NOTICE" msgCount=0 msgID=org.opends.messages.backend-413 message="Initialize Backend task dsreplication-initialize-4 started execution". Task state: STOPPED_BY_ERROR. Check the error logs of 10.10.10.149:4444 for more information.
```

I ran this command on node 1 (10.10.10.149). I got the same error when I used the 127.0.0.1 IP in the initialize command that you gave. Any idea what could have gone wrong, and on which node? Node 1 is the one which has the latest data.

By Mulk Anand named 31 Aug 2016 at 8:58 a.m. CDT

Also, I had restarted Gluu on node 1, and in the replication log I am seeing these:

```
[31/Aug/2016:09:24:07 -0400] category=SYNC severity=ERROR msgID=null.-1 msg=Domain o=gluu: the server with serverId=-2 is unreachable In Replication Server=Replication Server 8989 2003 unroutable message =ErrorMsg Details:routing table is empty
[31/Aug/2016:09:24:07 -0400] category=SYNC severity=ERROR msgID=org.opends.messages.replication.79 msg=The following error has been received : Domain o=gluu: the server with serverId=-2 is unreachable In Replication Server=Replication Server 8989 2003 unroutable message =ErrorMsg Details:routing table is empty
[31/Aug/2016:09:26:21 -0400] category=SYNC severity=NOTICE msgID=org.opends.messages.replication.204 msg=Replication server RS(2003) started listening for new connections on address 0.0.0.0 port 8989
[31/Aug/2016:09:26:34 -0400] category=SYNC severity=ERROR msgID=org.opends.messages.replication.212 msg=Directory server DS(15178) timed out while connecting to replication server 127.0.0.1:8989 for domain "o=gluu"
[31/Aug/2016:09:26:34 -0400] category=SYNC severity=WARNING msgID=org.opends.messages.replication.23 msg=Directory server DS(15178) was unable to connect to any of the following replication servers for domain "o=gluu": 2003
[31/Aug/2016:09:26:35 -0400] category=SYNC severity=ERROR msgID=org.opends.messages.replication.178 msg=Directory server 15178 was attempting to connect to replication server 2003 but has disconnected in handshake phase
[31/Aug/2016:09:26:52 -0400] category=SYNC severity=INFORMATION msgID=org.opends.messages.replication.207 msg=Replication server RS(2003) has accepted a connection from directory server DS(21830) for domain "cn=schema" at 127.0.0.1/127.0.0.1:42098
[31/Aug/2016:09:26:52 -0400] category=SYNC severity=NOTICE msgID=org.opends.messages.replication.62 msg=Directory server DS(21830) has connected to replication server RS(2003) for domain "cn=schema" at 127.0.0.1/127.0.0.1:8989 with generation ID 8408
[31/Aug/2016:09:27:10 -0400] category=SYNC severity=NOTICE msgID=org.opends.messages.replication.62 msg=Directory server DS(8954) has connected to replication server RS(2003) for domain "cn=admin data" at 127.0.0.1/127.0.0.1:8989 with generation ID 162541
[31/Aug/2016:09:27:10 -0400] category=SYNC severity=INFORMATION msgID=org.opends.messages.replication.207 msg=Replication server RS(2003) has accepted a connection from directory server DS(8954) for domain "cn=admin data" at 127.0.0.1/127.0.0.1:42138
[31/Aug/2016:09:27:17 -0400] category=SYNC severity=ERROR msgID=org.opends.messages.replication.178 msg=Directory server 15178 was attempting to connect to replication server 2003 but has disconnected in handshake phase
```

By Aliaksandr Samuseu staff 31 Aug 2016 at 10:03 a.m. CDT

Hi, Mulk.

Have you checked the availability of ports 8989 and 4444 between these 2 VMs? Each port must be reachable on both of them.

Btw, did you follow our cluster guide step by step before? In particular, did you run the `ldapGeneralConfigInstall.py` script on **both** nodes?
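A quick way to test both ports from each node is a plain TCP connect, which is what a telnet check does. This is a minimal sketch using only the standard library; the function names are hypothetical and the default ports are the two mentioned above.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


def check_ports(host: str, ports=(4444, 8989)) -> dict:
    """Map each port to whether it accepted a connection."""
    return {port: port_open(host, port) for port in ports}
```

Running `check_ports("<ip of the other node>")` on each node in turn covers both directions; a firewall that silently drops packets shows up as `False` after the timeout rather than immediately.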

By Mulk Anand named 01 Sep 2016 at 1:05 a.m. CDT

Hi Alex,

I did follow the cluster guide step by step earlier, but I did not run the ldapGeneralConfigInstall.py script on both nodes; it was run only on node 1. The guide does not say to run it on both. Do I need to run it on both nodes?

On port availability and communication, I will update you in a bit.

By Mulk Anand named 01 Sep 2016 at 1:15 a.m. CDT

I was unable to telnet to port 8989 from either node, but I was able to on port 4444. I just made firewall changes to allow port 8989, and now I am able to telnet from each node to the other. These are the ports now allowed through the firewall:

```
ports: 1389/tcp 443/tcp 80/tcp 30865/tcp 44440/tcp 1686/tcp 8989/tcp 4440/tcp 4444/tcp 1636/tcp
```

By Mulk Anand named 01 Sep 2016 at 1:28 a.m. CDT

That was the issue, Alex. Thanks a ton for your help, you really nailed it. :)

After the firewall pinhole changes, this is what I did:

1. Disabled replication (ran the command on node 1).
2. Re-enabled replication (ran the command on node 1).
3. Re-initialized (ran the command on node 1).
4. No commands were run on node 2.

And this was the output:

```
Initializing base DN o=gluu with the contents from 10.10.10.149:4444:
388 entries processed (2 % complete).
14932 entries processed (100 % complete).
Base DN initialized successfully.
See /tmp/opendj-replication-39973408270441434.log for a detailed log of this operation.
```