Introduction
When we start a MOVE operation on an Oracle RAC, FPP may not begin with the node where the FPP Client is currently running. For example, in a 2-node RAC with the FPP Client active on node 1, FPP may start the move by restarting node 2, then relocate the FPP Client to node 2 (which has already been restarted), and finish the operation by restarting node 1, where the FPP Client was originally active.
For use cases that require a specific, controlled order of node restarts, FPP can organize the Oracle RAC nodes into batches, which gives us more control over the order and over how many nodes are updated in parallel.
For example, in a 4-node RAC we can define 4 batches with 1 node each, just to control the order and the moment at which each node is restarted. This is useful when we need to make sure the next node has no critical process running. Another use case is placing 2 or 3 nodes in each batch to shorten the maintenance window, since nodes in the same batch are restarted in parallel.
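As a rough sketch of how the batch list could be generated instead of typed by hand, the shell function below groups node names into batches of a fixed size and prints them in the parenthesized, comma-separated format used by the -batches parameter. The function name make_batches and its arguments are illustrative only and are not part of FPP.

```shell
#!/usr/bin/env bash
# make_batches SIZE NODE...
# Hypothetical helper: groups the given nodes SIZE at a time and prints
# them as "(a,b),(c,d)", the format used by the -batches parameter.
make_batches() {
  local size=$1; shift
  case "$size" in
    ''|*[!0-9]*|0) echo "SIZE must be a positive integer" >&2; return 1 ;;
  esac
  local out="" group="" count=0 node
  for node in "$@"; do
    # Append the node to the current group, comma-separated.
    if [ -z "$group" ]; then group=$node; else group="$group,$node"; fi
    count=$((count + 1))
    if [ "$count" -eq "$size" ]; then
      out="${out:+$out,}($group)"
      group=""; count=0
    fi
  done
  # Flush a final, smaller group when the node count is not a multiple of SIZE.
  if [ -n "$group" ]; then out="${out:+$out,}($group)"; fi
  printf '%s\n' "$out"
}
```

For instance, make_batches 2 node1 node2 node3 node4 prints (node1,node2),(node3,node4), and make_batches 1 rac02 rac01 prints (rac02),(rac01). The order of the arguments is the order in which the batches will be listed.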
This post demonstrates how to use this feature with command-line examples as well as with the REST API.
Control Options
-batches : Specifies the groups as a comma-separated list of nodes inside each batch. Each batch is delimited by parentheses ().
rhpctl move gihome -batches "(node1,node2),(node3,node4)"
-chainbatches : Tells FPP to start each batch right after the previous one, without waiting for user interaction. By default, the MOVE operation pauses after each batch completes and waits for user interaction before starting the next batch.
-pausebetweenbatches : Leaves the job in PAUSED status after each batch completes, so the user can resume the same job to run the next batch. By default, FPP marks the job as SUCCEEDED after each batch, and the user needs to create a new job to run the next batch.
-continue / -revert / -abort : When a move operation is paused, FPP lets the user proceed to the next batch (-continue), roll back the part of the operation that has already completed (-revert), or cancel the move and leave the environment as it is (-abort).
This post only shows examples of -continue. To use the other options, just replace the parameter name, since the command syntax is the same.
rhpctl move gihome -destwc WC_GI1922_cluster01 -continue
rhpctl move gihome -destwc WC_GI1922_cluster01 -revert
rhpctl move gihome -destwc WC_GI1922_cluster01 -abort
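Because the three commands differ only in the final flag, a small helper can keep them consistent. This is a minimal sketch: the function name gihome_action_cmd is my own, and it only prints the command so that it can be reviewed before being run.

```shell
#!/usr/bin/env bash
# gihome_action_cmd WORKING_COPY ACTION
# Hypothetical helper: prints the rhpctl command for a paused move.
# ACTION must be one of the three options described above.
gihome_action_cmd() {
  local wc=$1 action=$2
  case "$action" in
    continue|revert|abort) ;;
    *) echo "unknown action: $action (expected continue, revert or abort)" >&2
       return 1 ;;
  esac
  printf 'rhpctl move gihome -destwc %s -%s\n' "$wc" "$action"
}
```

Printing instead of executing is a deliberate choice here, since -revert and -abort change the state of the cluster and are worth a second look before running.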
Setting the Move Order (-batches Option)
Starting a Grid move with the -batches option (deliberately beginning with node 2):
[grid@fppserver ~]$ rhpctl move gihome -destwc WC_GI1922_cluster01 -batches "(rac02),(rac01)" -schedule NOW -sourcehome /u01/app/product/19.0.0.0/GI1921 -ignorewcpatches
Operation "rhpctl move gihome" scheduled with the job ID "171".
The job finishes successfully when the first batch completes. Note the last line shown in the log:
[grid@fppserver ~]$ rhpctl query job -jobid 171 fppserver.dibiei.com: Audit ID: 1591 Job ID: 171 User: grid Client: cluster01 Scheduled job command: "rhpctl move gihome -destwc WC_GI1922_cluster01 -batches (rac02),(rac01) -schedule NOW -sourcehome /u01/app/product/19.0.0.0/GI1921 -ignorewcpatches" Scheduled job execution start time: 2024-04-09T19:05:16-03. Equivalent local time: 2024-04-09 19:05:16 Current status: SUCCEEDED Result file path: "/fpp_images/chkbase/scheduled/job-171-2024-04-09-19:05:35.log" Job execution start time: 2024-04-09 19:05:35 Job execution end time: 2024-04-09 19:13:36 Job execution elapsed time: 8 minutes 0 seconds Result file "/fpp_images/chkbase/scheduled/job-171-2024-04-09-19:05:35.log" contents: fppserver.dibiei.com: Audit ID: 1588 fppserver.dibiei.com: verifying versions of Oracle homes ... fppserver.dibiei.com: verifying owners of Oracle homes ... fppserver.dibiei.com: verifying groups of Oracle homes ... fppserver.dibiei.com: Connecting to RHPC... fppserver.dibiei.com: initiating sharedness check for the 'move gihome' operation on client cluster fppserver.dibiei.com: completed the sharedness check for the source and destination Oracle home on client cluster fppserver.dibiei.com: Connecting to RHPC... rac01.dibiei.com: retrieving status of databases ... rac01.dibiei.com: retrieving status of services of databases ... rac01.dibiei.com: relocating services of databases ... rac01.dibiei.com: stopping services of databases ... rac01.dibiei.com: stopping instances of databases ... rac01.dibiei.com: Executing "prepatch" and "postpatch" scripts on nodes: "rac02". rac01.dibiei.com: Executing "prepatch" script on nodes [rac02]. rac01.dibiei.com: Successfully executed "prepatch" script on nodes [rac02]. rac01.dibiei.com: Executing "postpatch" script on nodes [rac02]. 
Using configuration parameter file: /u01/app/product/19.0.0.0/GI1922/crs/install/crsconfig_params The log of current session can be found at: /u01/app/grid/crsdata/rac02/crsconfig/crs_postpatch_apply_oop_rac02_2024-04-09_07-08-14PM.log Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [3270192392]. 2024/04/09 19:09:06 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service' Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [3270192392]. 2024/04/09 19:10:57 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector. 2024/04/09 19:10:57 CLSRSC-4012: Shutting down Oracle Trace File Analyzer (TFA) Collector. 2024/04/09 19:11:05 CLSRSC-4013: Successfully shut down Oracle Trace File Analyzer (TFA) Collector. 2024/04/09 19:11:06 CLSRSC-4005: Failed to patch Oracle Trace File Analyzer (TFA) Collector. Grid Infrastructure operations will continue. 2024/04/09 19:11:06 CLSRSC-672: Post-patch steps for patching GI home successfully completed. rac01.dibiei.com: Successfully executed "postpatch" script on nodes [rac02]. rac01.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac01/rhp/conf/rhp.pref rac01.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac01/rhp/conf/rhp.pref rac01.dibiei.com: starting instances of databases "orcl" ... rac01.dibiei.com: Updating inventory on nodes: rac02. ======================================== rac01.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2752 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-09_07-12-01PM.log 'UpdateNodeList' was successful. rac01.dibiei.com: Updated inventory on nodes: rac02. 
rac01.dibiei.com: Updating inventory on nodes: rac02. ======================================== rac01.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2755 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-09_07-12-13PM.log 'UpdateNodeList' was successful. rac01.dibiei.com: Updated inventory on nodes: rac02. rac01.dibiei.com: Relocating RHPS or RHPC node to rac02 Continue by running 'rhpctl move gihome -destwc WC_GI1922_cluster01 -continue'
At this stage, we can run the suggested command at any time to continue the move on the next group of servers.
Creating a new job by running the command indicated at the end of the first batch:
[grid@fppserver ~]$ rhpctl move gihome -destwc WC_GI1922_cluster01 -continue -schedule NOW
Operation "rhpctl move gihome" scheduled with the job ID "172".
A few minutes later, this job also completed successfully:
[grid@fppserver ~]$ rhpctl query job -jobid 172 fppserver.dibiei.com: Audit ID: 1593 Job ID: 172 User: grid Client: cluster01 Scheduled job command: "rhpctl move gihome -destwc WC_GI1922_cluster01 -continue -schedule NOW" Scheduled job execution start time: 2024-04-09T19:19:24-03. Equivalent local time: 2024-04-09 19:19:24 Current status: SUCCEEDED Result file path: "/fpp_images/chkbase/scheduled/job-172-2024-04-09-19:19:35.log" Job execution start time: 2024-04-09 19:19:35 Job execution end time: 2024-04-09 19:26:20 Job execution elapsed time: 6 minutes 44 seconds Result file "/fpp_images/chkbase/scheduled/job-172-2024-04-09-19:19:35.log" contents: fppserver.dibiei.com: Audit ID: 1592 fppserver.dibiei.com: Connecting to RHPC... rac02.dibiei.com: retrieving status of databases ... rac02.dibiei.com: relocating services of databases ... rac02.dibiei.com: stopping services of databases ... rac02.dibiei.com: stopping instances of databases ... rac02.dibiei.com: Executing "prepatch" and "postpatch" scripts on nodes: "rac01". rac02.dibiei.com: Executing "prepatch" script on nodes [rac01]. rac02.dibiei.com: Successfully executed "prepatch" script on nodes [rac01]. rac02.dibiei.com: Executing "postpatch" script on nodes [rac01]. Using configuration parameter file: /u01/app/product/19.0.0.0/GI1922/crs/install/crsconfig_params The log of current session can be found at: /u01/app/grid/crsdata/rac01/crsconfig/crs_postpatch_apply_oop_rac01_2024-04-09_07-21-34PM.log Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [3270192392]. 2024/04/09 19:22:30 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service' Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [658638121]. 2024/04/09 19:24:53 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector. 
2024/04/09 19:24:53 CLSRSC-4012: Shutting down Oracle Trace File Analyzer (TFA) Collector. 2024/04/09 19:25:00 CLSRSC-4013: Successfully shut down Oracle Trace File Analyzer (TFA) Collector. 2024/04/09 19:25:00 CLSRSC-4005: Failed to patch Oracle Trace File Analyzer (TFA) Collector. Grid Infrastructure operations will continue. 2024/04/09 19:25:03 CLSRSC-672: Post-patch steps for patching GI home successfully completed. rac02.dibiei.com: Successfully executed "postpatch" script on nodes [rac01]. rac02.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac02/rhp/conf/rhp.pref rac02.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac02/rhp/conf/rhp.pref rac02.dibiei.com: starting instances of databases "orcl" ... rac02.dibiei.com: Updating inventory on nodes: rac01. ======================================== rac02.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2873 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-09_07-25-39PM.log 'UpdateNodeList' was successful. rac02.dibiei.com: Updated inventory on nodes: rac01. rac02.dibiei.com: Updating inventory on nodes: rac01. ======================================== rac02.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2878 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-09_07-25-50PM.log 'UpdateNodeList' was successful. rac02.dibiei.com: Updated inventory on nodes: rac01. rac02.dibiei.com: completed the move of Oracle Grid Infrastructure home on client cluster "cluster01" rac02.dibiei.com: Completed the 'move gihome' operation on cluster cluster01.
Since this was the last batch, the move operation is considered complete. With more batches, we would need to run "rhpctl move gihome … -continue" repeatedly until reaching the last batch in the list given in the initial move command.
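When scripting this, one option is to check the job log before sending the next -continue. The sketch below relies only on two messages that appear in the logs shown above ("Continue by running" and "Completed the 'move gihome' operation"). The function name move_state and the three state names are illustrative, and the exact message text may differ in other FPP versions.

```shell
#!/usr/bin/env bash
# move_state LOGFILE
# Hypothetical helper: prints "completed", "waiting" or "running"
# based on the messages found in a job result file.
move_state() {
  local log=$1 last
  # The completion message is only written after the last batch.
  if grep -q "Completed the 'move gihome' operation" "$log"; then
    echo completed
    return
  fi
  # Otherwise look at the last non-empty line: a pending batch ends
  # with the "Continue by running ..." hint.
  last=$(grep -v '^[[:space:]]*$' "$log" | tail -n 1)
  case "$last" in
    *"Continue by running"*) echo waiting ;;
    *) echo running ;;
  esac
}
```

Only the last line is checked for the hint because, when the same log file is reused across batches, the hint from an earlier batch stays in the file after the move has already continued.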
TIP: If you only want to control the node order, without a pause and manual interaction between batches, add the -chainbatches parameter.
Adding a Pause to the Current Job (-pausebetweenbatches)
Functionally, the -pausebetweenbatches parameter does not change much compared with the previous example. The difference is that FPP pauses the current job and lets us use the "rhpctl resume job" command to proceed to the next nodes, instead of requiring the move command to be run again with the -continue option, which in practice creates a new job.
For command-line use, I find this option more attractive: it simplifies the command used to resume the operation and keeps everything in a single job, which in turn produces a single consolidated log file for the whole move.
Example:
[grid@fppserver ~]$ rhpctl move gihome -sourcehome /u01/app/product/19.0.0.0/GI1921 -destwc WC_GI1922_cluster01 -ignorewcpatches -batches "(rac02),(rac01)" -schedule NOW -pausebetweenbatches
Operation "rhpctl move gihome" scheduled with the job ID "166".
Querying job 166: at this point its status is EXECUTING and it is already running the "prepatch" script on node rac02:
[grid@fppserver ~]$ rhpctl query job -jobid 166 fppserver.dibiei.com: Audit ID: 1567 Job ID: 166 User: grid Client: cluster01 Scheduled job command: "rhpctl move gihome -sourcehome /u01/app/product/19.0.0.0/GI1921 -destwc WC_GI1922_cluster01 -ignorewcpatches -batches (rac02),(rac01) -schedule NOW -pausebetweenbatches" Scheduled job execution start time: 2024-04-08T23:26:43-03. Equivalent local time: 2024-04-08 23:26:43 Current status: EXECUTING Result file path: "/fpp_images/chkbase/scheduled/job-166-2024-04-08-23:27:07.log" Job execution start time: 2024-04-08 23:27:07 Result file "/fpp_images/chkbase/scheduled/job-166-2024-04-08-23:27:07.log" contents: fppserver.dibiei.com: Audit ID: 1566 fppserver.dibiei.com: verifying versions of Oracle homes ... fppserver.dibiei.com: verifying owners of Oracle homes ... fppserver.dibiei.com: verifying groups of Oracle homes ... fppserver.dibiei.com: Connecting to RHPC... fppserver.dibiei.com: initiating sharedness check for the 'move gihome' operation on client cluster fppserver.dibiei.com: completed the sharedness check for the source and destination Oracle home on client cluster fppserver.dibiei.com: Connecting to RHPC... rac01.dibiei.com: retrieving status of databases ... rac01.dibiei.com: retrieving status of services of databases ... rac01.dibiei.com: relocating services of databases ... rac01.dibiei.com: stopping services of databases ... rac01.dibiei.com: stopping instances of databases ... rac01.dibiei.com: Executing "prepatch" and "postpatch" scripts on nodes: "rac02". rac01.dibiei.com: Executing "prepatch" script on nodes [rac02].
After a few minutes, the move on node rac02 completed. The job is now in PAUSED status, and the last entry in the log is the suggested command to make FPP resume the paused move operation:
[grid@fppserver ~]$ rhpctl query job -jobid 166 fppserver.dibiei.com: Audit ID: 1568 Job ID: 166 User: grid Client: cluster01 Scheduled job command: "rhpctl move gihome -sourcehome /u01/app/product/19.0.0.0/GI1921 -destwc WC_GI1922_cluster01 -ignorewcpatches -batches (rac02),(rac01) -schedule NOW -pausebetweenbatches" Scheduled job execution start time: 2024-04-08T23:26:43-03. Equivalent local time: 2024-04-08 23:26:43 Current status: PAUSED Result file path: "/fpp_images/chkbase/scheduled/job-166-2024-04-08-23:27:07.log" Job execution start time: 2024-04-08 23:27:07 Job execution end time: 2024-04-08 23:35:19 Job execution elapsed time: 8 minutes 12 seconds Result file "/fpp_images/chkbase/scheduled/job-166-2024-04-08-23:27:07.log" contents: fppserver.dibiei.com: Audit ID: 1566 fppserver.dibiei.com: verifying versions of Oracle homes ... fppserver.dibiei.com: verifying owners of Oracle homes ... fppserver.dibiei.com: verifying groups of Oracle homes ... fppserver.dibiei.com: Connecting to RHPC... fppserver.dibiei.com: initiating sharedness check for the 'move gihome' operation on client cluster fppserver.dibiei.com: completed the sharedness check for the source and destination Oracle home on client cluster fppserver.dibiei.com: Connecting to RHPC... rac01.dibiei.com: retrieving status of databases ... rac01.dibiei.com: retrieving status of services of databases ... rac01.dibiei.com: relocating services of databases ... rac01.dibiei.com: stopping services of databases ... rac01.dibiei.com: stopping instances of databases ... rac01.dibiei.com: Executing "prepatch" and "postpatch" scripts on nodes: "rac02". rac01.dibiei.com: Executing "prepatch" script on nodes [rac02]. rac01.dibiei.com: Successfully executed "prepatch" script on nodes [rac02]. rac01.dibiei.com: Executing "postpatch" script on nodes [rac02]. 
Using configuration parameter file: /u01/app/product/19.0.0.0/GI1922/crs/install/crsconfig_params The log of current session can be found at: /u01/app/grid/crsdata/rac02/crsconfig/crs_postpatch_apply_oop_rac02_2024-04-08_11-30-06PM.log Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [3270192392]. 2024/04/08 23:31:00 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service' Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [3270192392]. 2024/04/08 23:32:58 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:32:58 CLSRSC-4012: Shutting down Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:33:06 CLSRSC-4013: Successfully shut down Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:33:06 CLSRSC-4005: Failed to patch Oracle Trace File Analyzer (TFA) Collector. Grid Infrastructure operations will continue. 2024/04/08 23:33:07 CLSRSC-672: Post-patch steps for patching GI home successfully completed. rac01.dibiei.com: Successfully executed "postpatch" script on nodes [rac02]. rac01.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac01/rhp/conf/rhp.pref rac01.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac01/rhp/conf/rhp.pref rac01.dibiei.com: starting instances of databases "orcl" ... rac01.dibiei.com: Updating inventory on nodes: rac02. ======================================== rac01.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2701 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-08_11-33-47PM.log 'UpdateNodeList' was successful. rac01.dibiei.com: Updated inventory on nodes: rac02. 
rac01.dibiei.com: Updating inventory on nodes: rac02. ======================================== rac01.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2709 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-08_11-34-00PM.log 'UpdateNodeList' was successful. rac01.dibiei.com: Updated inventory on nodes: rac02. rac01.dibiei.com: Relocating RHPS or RHPC node to rac02 Continue by running 'rhpctl move gihome -destwc WC_GI1922_cluster01 -continue'
To resume the process and move on to the next node, I prefer to resume the current job with "rhpctl resume job" rather than run "rhpctl move gihome -destwc <wc_name> -continue" manually.
Example resuming job 166 in FPP:
[grid@fppserver ~]$ rhpctl resume job -jobid 166
fppserver.dibiei.com: Audit ID: 1569
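To avoid resuming a job that is not actually paused, the status can be checked first. This is a minimal sketch that assumes the "Current status:" field appears in the query output as shown in this post; job_status and resume_if_paused are hypothetical helper names, and -brief is the same option used with "rhpctl query job" later in this post.

```shell
#!/usr/bin/env bash
# job_status JOBID
# Hypothetical helper: prints the value of the "Current status:" field.
job_status() {
  rhpctl query job -jobid "$1" -brief |
    sed -n 's/.*Current status: *\([A-Z_]*\).*/\1/p' | head -n 1
}

# resume_if_paused JOBID
# Resumes the job only when its current status is PAUSED.
resume_if_paused() {
  local status
  status=$(job_status "$1")
  if [ "$status" = "PAUSED" ]; then
    rhpctl resume job -jobid "$1"
  else
    echo "job $1 is '$status', not PAUSED; nothing to resume" >&2
    return 1
  fi
}
```

Calling resume_if_paused 166 would then run the resume command only while job 166 is waiting between batches.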
Querying the same job again should now show it back in EXECUTING status. To follow the progress in real time, we can also run "tail" on the log file shown in the first lines of the job status output.
[grid@fppserver ~]$ tail -100f "/fpp_images/chkbase/scheduled/job-166-2024-04-08-23:27:07.log" fppserver.dibiei.com: Audit ID: 1566 fppserver.dibiei.com: verifying versions of Oracle homes ... fppserver.dibiei.com: verifying owners of Oracle homes ... fppserver.dibiei.com: verifying groups of Oracle homes ... fppserver.dibiei.com: Connecting to RHPC... fppserver.dibiei.com: initiating sharedness check for the 'move gihome' operation on client cluster fppserver.dibiei.com: completed the sharedness check for the source and destination Oracle home on client cluster fppserver.dibiei.com: Connecting to RHPC... rac01.dibiei.com: retrieving status of databases ... rac01.dibiei.com: retrieving status of services of databases ... rac01.dibiei.com: relocating services of databases ... rac01.dibiei.com: stopping services of databases ... rac01.dibiei.com: stopping instances of databases ... rac01.dibiei.com: Executing "prepatch" and "postpatch" scripts on nodes: "rac02". rac01.dibiei.com: Executing "prepatch" script on nodes [rac02]. rac01.dibiei.com: Successfully executed "prepatch" script on nodes [rac02]. rac01.dibiei.com: Executing "postpatch" script on nodes [rac02]. Using configuration parameter file: /u01/app/product/19.0.0.0/GI1922/crs/install/crsconfig_params The log of current session can be found at: /u01/app/grid/crsdata/rac02/crsconfig/crs_postpatch_apply_oop_rac02_2024-04-08_11-30-06PM.log Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [3270192392]. 2024/04/08 23:31:00 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service' Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [3270192392]. 2024/04/08 23:32:58 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector. 
2024/04/08 23:32:58 CLSRSC-4012: Shutting down Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:33:06 CLSRSC-4013: Successfully shut down Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:33:06 CLSRSC-4005: Failed to patch Oracle Trace File Analyzer (TFA) Collector. Grid Infrastructure operations will continue. 2024/04/08 23:33:07 CLSRSC-672: Post-patch steps for patching GI home successfully completed. rac01.dibiei.com: Successfully executed "postpatch" script on nodes [rac02]. rac01.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac01/rhp/conf/rhp.pref rac01.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac01/rhp/conf/rhp.pref rac01.dibiei.com: starting instances of databases "orcl" ... rac01.dibiei.com: Updating inventory on nodes: rac02. ======================================== rac01.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2701 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-08_11-33-47PM.log 'UpdateNodeList' was successful. rac01.dibiei.com: Updated inventory on nodes: rac02. rac01.dibiei.com: Updating inventory on nodes: rac02. ======================================== rac01.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2709 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-08_11-34-00PM.log 'UpdateNodeList' was successful. rac01.dibiei.com: Updated inventory on nodes: rac02. rac01.dibiei.com: Relocating RHPS or RHPC node to rac02 Continue by running 'rhpctl move gihome -destwc WC_GI1922_cluster01 -continue' fppserver.dibiei.com: Connecting to RHPC... rac02.dibiei.com: retrieving status of databases ... rac02.dibiei.com: relocating services of databases ... 
rac02.dibiei.com: stopping services of databases ... rac02.dibiei.com: stopping instances of databases ... rac02.dibiei.com: Executing "prepatch" and "postpatch" scripts on nodes: "rac01". rac02.dibiei.com: Executing "prepatch" script on nodes [rac01]. rac02.dibiei.com: Successfully executed "prepatch" script on nodes [rac01]. rac02.dibiei.com: Executing "postpatch" script on nodes [rac01]. Using configuration parameter file: /u01/app/product/19.0.0.0/GI1922/crs/install/crsconfig_params The log of current session can be found at: /u01/app/grid/crsdata/rac01/crsconfig/crs_postpatch_apply_oop_rac01_2024-04-08_11-41-44PM.log Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [3270192392]. 2024/04/08 23:42:40 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service' Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [658638121]. 2024/04/08 23:45:27 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:45:27 CLSRSC-4012: Shutting down Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:45:35 CLSRSC-4013: Successfully shut down Oracle Trace File Analyzer (TFA) Collector. 2024/04/08 23:45:35 CLSRSC-4005: Failed to patch Oracle Trace File Analyzer (TFA) Collector. Grid Infrastructure operations will continue. 2024/04/08 23:45:37 CLSRSC-672: Post-patch steps for patching GI home successfully completed. rac02.dibiei.com: Successfully executed "postpatch" script on nodes [rac01]. rac02.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac02/rhp/conf/rhp.pref rac02.dibiei.com: Processing arguments file /u01/app/grid/crsdata/rac02/rhp/conf/rhp.pref rac02.dibiei.com: starting instances of databases "orcl" ... rac02.dibiei.com: Updating inventory on nodes: rac01. 
======================================== rac02.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2895 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-08_11-46-17PM.log 'UpdateNodeList' was successful. rac02.dibiei.com: Updated inventory on nodes: rac01. rac02.dibiei.com: Updating inventory on nodes: rac01. ======================================== rac02.dibiei.com: Starting Oracle Universal Installer... Checking swap space: must be greater than 500 MB. Actual 2899 MB Passed The inventory pointer is located at /etc/oraInst.loc You can find the log of this install session at: /u01/app/oraInventory/logs/UpdateNodeList2024-04-08_11-46-29PM.log 'UpdateNodeList' was successful. rac02.dibiei.com: Updated inventory on nodes: rac01. rac02.dibiei.com: completed the move of Oracle Grid Infrastructure home on client cluster "cluster01" rac02.dibiei.com: Completed the 'move gihome' operation on cluster cluster01.
Note that after the line "Continue by running 'rhpctl move gihome -destwc WC_GI1922_cluster01 -continue'", FPP carried on with the move on the next node.
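The log path does not have to be copied by hand. As a sketch, the helper below extracts it from the "Result file path:" field of the query output and then tails the file. The helper names are hypothetical, and the parsing assumes the field format shown in this post.

```shell
#!/usr/bin/env bash
# job_log_path JOBID
# Hypothetical helper: prints the path found in the "Result file path:" field.
job_log_path() {
  rhpctl query job -jobid "$1" -brief |
    sed -n 's/.*Result file path: *"\([^"]*\)".*/\1/p' | head -n 1
}

# follow_job_log JOBID
# Tails the job log until interrupted with Ctrl-C.
follow_job_log() {
  local log
  log=$(job_log_path "$1")
  if [ -z "$log" ]; then
    echo "no log path found for job $1" >&2
    return 1
  fi
  tail -n 100 -f "$log"
}
```

With this in place, follow_job_log 166 is equivalent to the tail command shown above, without having to look up the file name first.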
REST API
To learn more about the FPP REST API, see the post below:
Using Oracle Fleet Patching and Provisioning (FPP) with REST API
The REST API does not yet support every parameter available on the rhpctl command line. For this use case, I could not find in the documentation the attributes corresponding to the -pausebetweenbatches and -chainbatches parameters. However, I tried them anyway and found that these options already exist in the API and work correctly.
To get the behavior of -chainbatches, add the "chainBatches" attribute with the value "true" to the JSON. To get the behavior of -pausebetweenbatches, add the "pauseBetweenBatches" attribute with the value "true" to the JSON.
The API does not yet offer a way to resume a job; it is only possible to query, modify, or delete it. As a result, we cannot use the API to reproduce the earlier result of resuming the MOVE operation by reusing the original job, even though the same job can still be continued from the command line.
To continue the MOVE through the API, we need to send a new PATCH request with a simplified JSON body containing only the "continue" attribute set to true; the "schedule" attribute is optional. This creates a new job, as in the first example. The downside is that the first job stays in PAUSED status indefinitely.
Example of the JSON used to move the Grid home, where FPP starts the next batch automatically without user intervention:
{
  "sourceHome": "/u01/app/product/19.0.0.0/GI1922",
  "ignorewcpatches": "true",
  "batches": "(rac01),(rac02)",
  "chainBatches": "true",
  "eval": "false",
  "schedule": "NOW"
}
Example of the JSON that forces a pause and hands control over the start of the next batch to the user:
{
  "sourceHome": "/u01/app/product/19.0.0.0/GI1922",
  "ignorewcpatches": "true",
  "batches": "(rac01),(rac02)",
  "pauseBetweenBatches": "true",
  "eval": "false",
  "schedule": "NOW"
}
URL used for the PATCH operation:
https://fppserver:8894/rhp-restapi/rhp/gihome/<DEST WC NAME>
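For readers who prefer the command line, the same PATCH request could also be sent with curl. This is only a sketch under a few assumptions: the REST user is passed with basic authentication through a FPP_REST_USER variable (curl then prompts for the password), the server certificate is trusted by the client, and the helper names move_body and patch_gihome are my own. The attribute names and the URL are the ones shown above.

```shell
#!/usr/bin/env bash
# move_body SOURCE_HOME BATCHES MODE
# Hypothetical helper: prints the JSON body for a batched move.
# MODE is either chainBatches or pauseBetweenBatches.
move_body() {
  local src=$1 batches=$2 mode=$3
  case "$mode" in
    chainBatches|pauseBetweenBatches) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  printf '{\n  "sourceHome": "%s",\n  "ignorewcpatches": "true",\n  "batches": "%s",\n  "%s": "true",\n  "eval": "false",\n  "schedule": "NOW"\n}\n' \
    "$src" "$batches" "$mode"
}

# patch_gihome WORKING_COPY JSON_BODY
# Sends the PATCH request to the gihome endpoint of the destination working copy.
# Assumption: basic authentication with the user named in FPP_REST_USER.
patch_gihome() {
  curl --silent --show-error --user "$FPP_REST_USER" \
       --request PATCH \
       --header 'Content-Type: application/json' \
       --data "$2" \
       "https://fppserver:8894/rhp-restapi/rhp/gihome/$1"
}
```

A call would look like patch_gihome "<DEST WC NAME>" "$(move_body /u01/app/product/19.0.0.0/GI1922 "(rac01),(rac02)" pauseBetweenBatches)". The same patch_gihome function can send any other JSON body accepted by this endpoint.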
The request was sent with Postman.

The operation generated job 169.
Querying the status of job 169 on the FPP Server, note that it shows exactly the same behavior demonstrated in the first example:
[grid@fppserver ~]$ rhpctl query job -jobid 169 fppserver.dibiei.com: Audit ID: 1582 Job ID: 169 User: grid Client: cluster01 Scheduled job command: "rhpctl move gihome -ignorewcpatches -batches (rac01),(rac02) -destwc WC_GI1921_CLUSTER01_REST -json -sourcehome /u01/app/product/19.0.0.0/GI1922 -schedule NOW" Scheduled job execution start time: 2024-04-09T18:38:32-03. Equivalent local time: 2024-04-09 18:38:32 Current status: SUCCEEDED Result file path: "/fpp_images/chkbase/scheduled/job-169-2024-04-09-18:38:34.log" Job execution start time: 2024-04-09 18:38:34 Result file "/fpp_images/chkbase/scheduled/job-169-2024-04-09-18:38:34.log" contents: .... .... .... rac02.dibiei.com: Updated inventory on nodes: rac01. rac02.dibiei.com: Relocating RHPS or RHPC node to rac01 Continue by running 'rhpctl move gihome -destwc WC_GI1921_CLUSTER01_REST -continue'
From this point, we can issue the command to continue the move through the REST API by sending a new PATCH request to the same URL. The request body only needs the "continue" attribute; I add "schedule" just for consistency:
{
  "schedule": "NOW",
  "continue": true
}
With this, FPP creates a new job in sequence; in this case it was job 170:
[grid@fppserver ~]$ rhpctl query job -jobid 170 fppserver.dibiei.com: Audit ID: 1585 Job ID: 170 User: grid Client: cluster01 Scheduled job command: "rhpctl move gihome -destwc WC_GI1921_CLUSTER01_REST -json -schedule NOW -continue" Scheduled job execution start time: 2024-04-09T18:53:23-03. Equivalent local time: 2024-04-09 18:53:23 Current status: EXECUTING Result file path: "/fpp_images/chkbase/scheduled/job-170-2024-04-09-18:53:34.log" Job execution start time: 2024-04-09 18:53:34 Result file "/fpp_images/chkbase/scheduled/job-170-2024-04-09-18:53:34.log" contents: fppserver.dibiei.com: Audit ID: 1584 fppserver.dibiei.com: Connecting to RHPC... rac01.dibiei.com: retrieving status of databases ... rac01.dibiei.com: relocating services of databases ... rac01.dibiei.com: stopping services of databases ... rac01.dibiei.com: stopping instances of databases ... rac01.dibiei.com: Executing "prepatch" and "postpatch" scripts on nodes: "rac02". rac01.dibiei.com: Executing "prepatch" script on nodes [rac02]. rac01.dibiei.com: Successfully executed "prepatch" script on nodes [rac02]. rac01.dibiei.com: Executing "postpatch" script on nodes [rac02]. Using configuration parameter file: /u01/app/product/19.0.0.0/GI1921/crs/install/crsconfig_params The log of current session can be found at: /u01/app/grid/crsdata/rac02/crsconfig/crs_postpatch_apply_oop_rac02_2024-04-09_06-55-29PM.log Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [658638121]. ... ...
In the end we have 2 jobs completed successfully:
[grid@fppserver ~]$ rhpctl query job -jobid 169 -brief
fppserver.dibiei.com: Audit ID: 1589
Job ID: 169
User: grid
Client: cluster01
Scheduled job command: "rhpctl move gihome -ignorewcpatches -batches (rac01),(rac02) -destwc WC_GI1921_CLUSTER01_REST -json -sourcehome /u01/app/product/19.0.0.0/GI1922 -schedule NOW"
Scheduled job execution start time: 2024-04-09T18:38:32-03. Equivalent local time: 2024-04-09 18:38:32
Current status: SUCCEEDED
Result file path: "/fpp_images/chkbase/scheduled/job-169-2024-04-09-18:38:34.log"
Job execution start time: 2024-04-09 18:38:34
Job execution end time: 2024-04-09 18:47:36
Job execution elapsed time: 9 minutes 1 seconds
[grid@fppserver ~]$ rhpctl query job -jobid 170 -brief
fppserver.dibiei.com: Audit ID: 1590
Job ID: 170
User: grid
Client: cluster01
Scheduled job command: "rhpctl move gihome -destwc WC_GI1921_CLUSTER01_REST -json -schedule NOW -continue"
Scheduled job execution start time: 2024-04-09T18:53:23-03. Equivalent local time: 2024-04-09 18:53:23
Current status: SUCCEEDED
Result file path: "/fpp_images/chkbase/scheduled/job-170-2024-04-09-18:53:34.log"
Job execution start time: 2024-04-09 18:53:34
Job execution end time: 2024-04-09 19:01:23
Job execution elapsed time: 7 minutes 49 seconds
Conclusion
This post presented the basic batch functionality for FPP MOVE operations, using command-line and REST API examples. Although it focused only on the Grid move, the same functionality can be used in database move operations, or even in a combined Grid + database move in a single operation.