Thursday, September 5, 2019

Adding a node to an OpenShift cluster running on AWS

This post walks through scaling out an OpenShift 3.11 cluster that was deployed on AWS using the Quick Start deployment guide.

The OpenShift cluster was deployed on eight m5.xlarge instances, as follows:
Master = 3 x m5.xlarge
Etcd = 3 x m5.xlarge
Worker = 1 x m5.xlarge
Bastion/Ansible = 1 x m5.xlarge


The new node was created in the EC2 console by right-clicking the existing worker instance and choosing "Launch More Like This" from the menu.



After the new node is up and running, make sure you can SSH to it from the Bastion/Ansible node.
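A quick way to confirm this is a non-interactive SSH probe run from the Bastion/Ansible node. The sketch below is only an illustration: the function name can_ssh and the SSH_BIN override are hypothetical, and it assumes the current user's key is already authorized on the new node. BatchMode makes the probe fail instead of prompting for a password.

```shell
#!/bin/sh
# can_ssh HOST: succeed only if a non-interactive SSH login to HOST works.
# SSH_BIN is a hypothetical override used for testing; it defaults to ssh.
can_ssh() {
    "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example (run on the Bastion/Ansible node):
#   can_ssh ip-10-0-66-172.us-east-2.compute.internal && echo "SSH OK"
```

If the probe fails, fix key or security-group access before touching the inventory, since the scaleup playbook needs the same connectivity.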



Next, log in to the Bastion/Ansible node and edit the /etc/ansible/hosts file to add the new node under the new_nodes group.

    new_nodes:
      hosts:
        ip-10-0-66-172.us-east-2.compute.internal:
          instance_id: i-092768fa7ac7d9caa
          openshift_node_group_name: node-config-compute-infra
    nodes:
      hosts:
        ip-10-0-23-92.us-east-2.compute.internal: *id001
        ip-10-0-60-39.us-east-2.compute.internal: *id002
        ip-10-0-73-252.us-east-2.compute.internal: 
          instance_id: i-0cc9f39f54c8c87e6
          openshift_node_group_name: node-config-compute-infra
        ip-10-0-95-180.us-east-2.compute.internal: *id003


Next, run the Ansible playbook as follows:

[root@ip-10-0-130-225 ~]# cd /usr/share/ansible/openshift-ansible
[root@ip-10-0-130-225 openshift-ansible]# ansible-playbook -i /etc/ansible/hosts playbooks/openshift-node/scaleup.yml

PLAY [Populate config host groups] *************************************************************************************************************************************************************************************

TASK [Load group name mapping variables] *******************************************************************************************************************************************************************************
Thursday 05 September 2019  18:47:11 +0000 (0:00:00.062)       0:00:00.062 **** 
ok: [localhost]

TASK [Evaluate groups - g_nfs_hosts is single host] ********************************************************************************************************************************************************************
Thursday 05 September 2019  18:47:11 +0000 (0:00:00.026)       0:00:00.089 **** 
skipping: [localhost]

TASK [Evaluate oo_all_hosts] *******************************************************************************************************************************************************************************************
Thursday 05 September 2019  18:47:11 +0000 (0:00:00.020)       0:00:00.109 **** 
ok: [localhost] => (item=ip-10-0-23-92.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-95-180.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-60-39.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-73-252.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-69-149.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-2-243.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-57-57.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-66-172.us-east-2.compute.internal)

TASK [Evaluate oo_masters] *********************************************************************************************************************************************************************************************
Thursday 05 September 2019  18:47:11 +0000 (0:00:00.087)       0:00:00.197 **** 
ok: [localhost] => (item=ip-10-0-23-92.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-95-180.us-east-2.compute.internal)
ok: [localhost] => (item=ip-10-0-60-39.us-east-2.compute.internal)


It will take a while to complete. In this run the playbook finished after about 41 minutes:


TASK [openshift_storage_glusterfs : Generate topology file] ************************************************************************************************************************************************************
Thursday 05 September 2019  19:28:49 +0000 (0:00:00.080)       0:41:37.833 **** 
skipping: [ip-10-0-23-92.us-east-2.compute.internal]

TASK [openshift_storage_glusterfs : Place heketi topology on heketi Pod] ***********************************************************************************************************************************************
Thursday 05 September 2019  19:28:49 +0000 (0:00:00.072)       0:41:37.906 **** 
skipping: [ip-10-0-23-92.us-east-2.compute.internal]

TASK [openshift_storage_glusterfs : Load heketi topology] **************************************************************************************************************************************************************
Thursday 05 September 2019  19:28:49 +0000 (0:00:00.075)       0:41:37.981 **** 
skipping: [ip-10-0-23-92.us-east-2.compute.internal]

TASK [openshift_storage_glusterfs : Delete temp directory] *************************************************************************************************************************************************************
Thursday 05 September 2019  19:28:49 +0000 (0:00:00.218)       0:41:38.200 **** 
ok: [ip-10-0-23-92.us-east-2.compute.internal]

PLAY RECAP *************************************************************************************************************************************************************************************************************
ip-10-0-2-243.us-east-2.compute.internal : ok=14   changed=1    unreachable=0    failed=0   
ip-10-0-23-92.us-east-2.compute.internal : ok=67   changed=2    unreachable=0    failed=0   
ip-10-0-57-57.us-east-2.compute.internal : ok=14   changed=1    unreachable=0    failed=0   
ip-10-0-60-39.us-east-2.compute.internal : ok=32   changed=1    unreachable=0    failed=0   
ip-10-0-66-172.us-east-2.compute.internal : ok=152  changed=82   unreachable=0    failed=0   
ip-10-0-69-149.us-east-2.compute.internal : ok=14   changed=1    unreachable=0    failed=0   
ip-10-0-95-180.us-east-2.compute.internal : ok=32   changed=1    unreachable=0    failed=0   
localhost                  : ok=22   changed=0    unreachable=0    failed=0   


INSTALLER STATUS *******************************************************************************************************************************************************************************************************
Initialization              : Complete (0:02:42)
Node Bootstrap Preparation  : Complete (0:30:10)
Node Join                   : Complete (0:00:20)
Thursday 05 September 2019  19:28:49 +0000 (0:00:00.150)       0:41:38.351 **** 
=============================================================================== 
openshift_node : install needed rpm(s) ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1018.88s
container_runtime : Install Docker ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 331.95s
openshift_node : Install iSCSI storage plugin dependencies ---------------------------------------------------------------------------------------------------------------------------------------------------- 244.13s
openshift_node : Install node, clients, and conntrack packages ------------------------------------------------------------------------------------------------------------------------------------------------ 125.15s
Ensure openshift-ansible installer package deps are installed ------------------------------------------------------------------------------------------------------------------------------------------------- 119.25s
openshift_node : Install NFS storage plugin dependencies ------------------------------------------------------------------------------------------------------------------------------------------------------ 116.75s
openshift_node : Install dnsmasq ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 99.21s
openshift_node : Install Ceph storage plugin dependencies ------------------------------------------------------------------------------------------------------------------------------------------------------ 97.82s
os_firewall : Install iptables packages ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 93.99s
openshift_node : Install GlusterFS storage plugin dependencies ------------------------------------------------------------------------------------------------------------------------------------------------- 60.62s
openshift_excluder : Install docker excluder - yum ------------------------------------------------------------------------------------------------------------------------------------------------------------- 42.84s
openshift_repos : Ensure libselinux-python is installed -------------------------------------------------------------------------------------------------------------------------------------------------------- 15.66s
nickhammond.logrotate : nickhammond.logrotate | Install logrotate ---------------------------------------------------------------------------------------------------------------------------------------------- 15.06s
openshift_manage_node : Wait for sync DS to set annotations on all nodes --------------------------------------------------------------------------------------------------------------------------------------- 10.75s
os_firewall : need to pause here, otherwise the iptables service starting can sometimes cause ssh to fail ------------------------------------------------------------------------------------------------------ 10.08s
openshift_repos : refresh cache --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5.30s
container_runtime : Get current installed Docker version -------------------------------------------------------------------------------------------------------------------------------------------------------- 5.08s
Approve node certificates when bootstrapping -------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.57s
container_runtime : restart container runtime ------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.54s
tuned : Ensure files are populated from templates --------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.47s
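The key thing to read in the output above is the PLAY RECAP: every host should report unreachable=0 and failed=0. As a rough sketch, that check can be scripted if the playbook output has been saved to a file. The function name recap_ok and the log file name scaleup.log are hypothetical and not part of openshift-ansible.

```shell
#!/bin/sh
# recap_ok: read ansible-playbook output on stdin and succeed only if at
# least one PLAY RECAP line is present and every such line reports
# unreachable=0 and failed=0. Offending lines are printed.
recap_ok() {
    awk '
        /unreachable=[0-9]+/ && /failed=[0-9]+/ {
            seen = 1
            line = $0 " "   # pad so the counter is always followed by a character
            if (line !~ /unreachable=0[^0-9]/ || line !~ /failed=0[^0-9]/) {
                print "problem: " $0
                bad = 1
            }
        }
        END { if (bad || !seen) exit 1 }
    '
}

# Example (scaleup.log is a hypothetical file name):
#   ansible-playbook -i /etc/ansible/hosts playbooks/openshift-node/scaleup.yml | tee scaleup.log
#   recap_ok < scaleup.log && echo "recap clean"
```

Treating a missing recap as a failure is a deliberate choice here, so that an empty or truncated log is not mistaken for a clean run.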


Log in and check that the new node has been added to the cluster.


[root@ip-10-0-130-225 openshift-ansible]# oc login -u system:admin
Logged into "https://Red-H-OpenS-1OZOCWK21Q1XP-9b6c91fba5af361e.elb.us-east-2.amazonaws.com:443" as "system:admin" using existing credentials.

You have access to the following projects and can switch between them with 'oc project <projectname>':

  * default
    kube-public
    kube-service-catalog
    kube-system
    management-infra
    openshift
    openshift-ansible-service-broker
    openshift-console
    openshift-infra
    openshift-logging
    openshift-monitoring
    openshift-node
    openshift-sdn
    openshift-template-service-broker
    openshift-web-console

Using project "default".
[root@ip-10-0-130-225 openshift-ansible]# oc get nodes 
NAME                                        STATUS    ROLES           AGE       VERSION
ip-10-0-23-92.us-east-2.compute.internal    Ready     master          13h       v1.11.0+d4cacc0
ip-10-0-60-39.us-east-2.compute.internal    Ready     master          13h       v1.11.0+d4cacc0
ip-10-0-66-172.us-east-2.compute.internal   Ready     compute,infra   3m        v1.11.0+d4cacc0
ip-10-0-73-252.us-east-2.compute.internal   Ready     compute,infra   13h       v1.11.0+d4cacc0
ip-10-0-95-180.us-east-2.compute.internal   Ready     master          13h       v1.11.0+d4cacc0
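In the listing above, the new node ip-10-0-66-172 shows up as Ready with the compute,infra roles. If you would rather script the check than read the table, a minimal sketch is shown below. The function name node_is_ready is hypothetical, and it simply parses the default `oc get nodes` table, so it assumes the NAME and STATUS columns stay first and second.

```shell
#!/bin/sh
# node_is_ready NAME: read `oc get nodes` output on stdin and succeed only
# if a row for NAME exists with STATUS exactly "Ready". A status such as
# "Ready,SchedulingDisabled" or "NotReady" is treated as not ready.
node_is_ready() {
    awk -v node="$1" '
        $1 == node && $2 == "Ready" { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Example:
#   oc get nodes | node_is_ready ip-10-0-66-172.us-east-2.compute.internal \
#       && echo "new node is Ready"
```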


Finally, move any hosts that you defined in the new_nodes section to the appropriate section, i.e. nodes. Once they are moved, subsequent playbook runs that use this inventory file treat the nodes correctly. You can keep the empty new_nodes section.


    new_masters: {}
    new_nodes: {}
    nodes:
      hosts:
        ip-10-0-23-92.us-east-2.compute.internal: *id001
        ip-10-0-60-39.us-east-2.compute.internal: *id002
        ip-10-0-73-252.us-east-2.compute.internal:
          instance_id: i-0cc9f39f54c8c87e6
          openshift_node_group_name: node-config-compute-infra
        ip-10-0-66-172.us-east-2.compute.internal:
          instance_id: i-092768fa7ac7d9caa
          openshift_node_group_name: node-config-compute-infra
        ip-10-0-95-180.us-east-2.compute.internal: *id003
    provision_in_progress: {}