Tuesday 30 May 2017

NIM Master, NIMSH and SSL on AIX 7.1 TL4 SP3


Whilst working with one of my AIX customers recently, I discovered a problem with NIMSH and SSL. The customer had updated their NIM master from AIX 7.1 TL4 SP1 to AIX 7.1 TL4 SP3. After the SP update, any attempt to connect to a NIM client (over NIMSH+SSL) from the NIM master would simply hang. For example, we tried to list the filesets on the NIM client with the following command, which never returned any output:

[root@750lpar4]/ # nim -o lslpp 750lpar9

The /var/adm/ras/nimsh.log file, on the NIM client, showed that the NIMSH session stopped here:

Thu Jan 12 14:31:49 2017        Loading certificates..
Thu Jan 12 14:31:49 2017        Loading private key file..
Thu Jan 12 14:31:49 2017        create BIO

NIM master: 750lpar4
7100-04-03-1543

NIM client: 750lpar9
7100-04-02-1614

[root@750lpar4]/ # lsnim -l 750lpar9
750lpar9:
   class          = machines
   type           = standalone
   connect        = nimsh (secure)
   platform       = chrp
   netboot_kernel = 64
   if1            = 10_1_50 750lpar9 0
   cable_type1    = N/A
   Cstate         = ready for a NIM operation
   prev_state     = not running
   Mstate         = currently running
   cpuid          = 00F603CD4C00
   Cstate_result  = success

The root cause of the problem became apparent when we ran truss against the nim -o command.

[root@750lpar4]/ # truss -adef -o truss.lsnim.out -w all nim -o lslpp 750lpar9

[root@750lpar4]/ # cat truss.lsnim.out
13959372: C o u l d   n o t   l o a d   m o d u l e   / u s r / l i b / l
13959372: i b s s l . s o .\n S y s t e m   e r r o r :   N o   s u c h
13959372: f i l e   o r   d i r e c t o r y
19267612: C o u l d   n o t   l o a d   m o d u l e   / u s r / l i b / l
19267612: i b c r y p t o . s o .\n S y s t e m   e r r o r :   N o   s u
19267612: c h   f i l e   o r   d i r e c t o r y

The required shared library object files were missing on the NIM master.

[root@750lpar4]/usr/lib # ls -ltr libssl.so libcrypto.so
libssl.so not found
libcrypto.so not found
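
If you want to run the same check on other NIM masters after applying a service pack, it can be wrapped in a small function. This is only a sketch: check_nimsh_libs is a name I made up (it is not an AIX command), and the optional directory argument is there purely so the check can be pointed somewhere other than /usr/lib.

```shell
#!/bin/sh
# Sketch: report which of the shared objects that NIMSH over SSL tried to
# load (per the truss output above) are missing.
# check_nimsh_libs is a hypothetical helper name, not an AIX command.
check_nimsh_libs() {
    libdir="${1:-/usr/lib}"     # directory to inspect, /usr/lib by default
    missing=0
    for lib in libssl.so libcrypto.so; do
        if [ -f "$libdir/$lib" ]; then
            echo "OK      $libdir/$lib"
        else
            echo "MISSING $libdir/$lib"
            missing=1
        fi
    done
    # Return non-zero if at least one file was not found
    return "$missing"
}
```

Running check_nimsh_libs with no arguments inspects /usr/lib, and the non-zero return code makes it easy to use in a post-update checklist script.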

We fixed this issue by extracting the missing files from the (existing) /usr/lib/libssl.a and /usr/lib/libcrypto.a archives.

[root@750lpar4]/usr/lib # slibclean

[root@750lpar4]/usr/lib # /bin/ar -v -x /usr/lib/libssl.a /usr/lib/libssl.so
x - /usr/lib/libssl.so

[root@750lpar4]/usr/lib # /bin/ar -v -x /usr/lib/libcrypto.a /usr/lib/libcrypto.so
x - /usr/lib/libcrypto.so

[root@750lpar4]/usr/lib # ls -ltr libssl.so libcrypto.so
-rwxr-xr-x    1 root     system       724913 Jan 18 09:08 libssl.so
-rwxr-xr-x    1 root     system      3031337 Jan 18 09:08 libcrypto.so
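
The extraction can also be made repeatable, so that it does nothing when the file is already in place. The following is a rough sketch under a few assumptions of mine: extract_if_missing is a hypothetical helper name, the AR variable is only there so a different ar binary can be substituted, and the member is passed as a bare file name rather than the full path used in the commands above. It does not run slibclean, so do that first as shown earlier.

```shell
#!/bin/sh
# Sketch: extract a shared object from its archive only if it is missing.
# extract_if_missing is a hypothetical helper name, not an AIX command.
# AR may be set to point at a different ar binary (defaults to /bin/ar).
extract_if_missing() {
    archive="$1"                    # e.g. /usr/lib/libssl.a
    member="$2"                     # e.g. libssl.so
    libdir=$(dirname "$archive")

    if [ -f "$libdir/$member" ]; then
        echo "$libdir/$member already present, nothing to do"
        return 0
    fi
    if [ ! -f "$archive" ]; then
        echo "$archive not found" >&2
        return 1
    fi
    # ar -x writes the member into the current directory, so run it in a
    # subshell after changing to the directory that holds the archive
    (cd "$libdir" && "${AR:-/bin/ar}" -v -x "$archive" "$member")
}
```

On the NIM master this would be called as extract_if_missing /usr/lib/libssl.a libssl.so and again for /usr/lib/libcrypto.a and libcrypto.so. Try it on a test system first.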

After that, the nim -o commands started working again.

[root@750lpar4]/usr/lib # nim -o showlog 750lpar9
HELLO

So, why did this happen? In the past, NIM extracted the libssl.so.0.9.8 shared object, but more recent updates to the OpenSSL version have forced IBM to move to libssl.so. Usually, the extracted shared library object is added (if not already present) when nimconfig -c is run. But given that this was an existing NIM master, we did not want to run nimconfig -c again (as we would lose all of the current SSL key access), so extracting the objects by hand was the preferred fix. The underlying problem is that the libssl.so and libcrypto.so files are not populated when the AIX 7100-04-03 update is applied. This is a bug and will be officially addressed, soon, under APAR IV93152, "NIM push operation to client hang on nimsh over SSL".

I believe this issue may also occur when you migrate your NIM master from AIX 7.1 to 7.2 (with nimadm, for example), but I need to do more testing to reproduce and confirm it.

Here’s one good reason to set up NIMSH over SSL.

NIMSH, SSL and LPM

The following link is a great reference guide for configuring NIMSH over SSL.

NIMSH over SSL
