Check and Modify Security Protocols in VMware Appliances (KB#00093)

Overview

This post brings together, in one place, how to check and change the security protocols (TLS/SSL) on the various VMware appliances. VMware has done a good job of documenting each process; I am just putting it all in one view. I will add more products to the list when I feel they belong here.

For vCenter Server -

To check:

1. Connect to the vCenter Server Appliance over SSH using its management IP address
2. Run the commands below

#cd /usr/lib/vmware-vSphereTlsReconfigurator/VcTlsReconfigurator/
#./reconfigureVC scan

In the sample output the TLS version is TLSv1.2. It means that TLSv1.2 is enabled and every other version is disabled.


To update in vCenter version 6.5 and 6.7:

Managing TLS protocol configuration for vSphere 6.5/6.7 (2147469) (vmware.com)

To update in vCenter version 7.x

Enable or Disable TLS Versions on vCenter Server Systems (vmware.com)


For vCD or VMware Cloud Director

To check:

1. Log in to the vCD (Cloud Director) appliance
2. Run the commands below

#cd /opt/vmware/vcloud-director/bin
#./cell-management-tool ssl-protocols -l

Below is the sample output :


To update:

Note: This needs downtime and has to be done on each cell individually, so shut down the vCD services before doing this. For the shutdown steps, follow this article: vCD | Upgrade from version 9.5 to 10.1.2 ~ My vCloud Notes (vcnotes.in)

#./cell-management-tool ssl-protocols -d SSLv3,SSLv2Hello

Follow this VMware article to update 


For vRealize Automation 

To check and update, just follow this article

For vRealize Log Insight

Good article by vendor.

For NSX for vSphere (NSX-V)

Please see this documentation.

For ESXi Host

It is worth checking this page.

For vROPS

Please click here to check this.

Kubernetes | Command Cheat Sheet (KB#00092)

Overview

Yes, you guessed right: I am learning Kubernetes, so I wanted to share some useful insights, and I will keep sharing more on this. Below are some commands for daily operations while working with Kubernetes. I will keep adding to this list.

 

Command to                       Command
Check Minikube version           $minikube version
Start Minikube cluster           $minikube start
Check if kubectl is installed    $kubectl version
Check kubectl cluster info       $kubectl cluster-info
Check kubectl node info          $kubectl get node

PS | How to get HA restarted VM's Org and OrgvDC info with VM Name

Overview

Many blogs show how to fetch the names of VMs restarted by HA after an ESXi host failure, using the Get-VIEvent PowerCLI command. But the VM name extracted that way is not in a format you can use as it is: you have to take it to Excel, use text-to-columns and then pull out the VM name. On top of that I have vCD, so at the time of ESXi host failures and HA events I need not only the VM name but also the Org and OrgvDC info to share with my customer. That makes it even lengthier for me, and I need it quickly. So this is an extended solution for that kind of scenario. Hope you will find it useful.

Let's see how I could do it using PowerShell.

Script

#Start here

Write-Host "This script lists the VMs restarted by HA after ESXi host failures" -ForegroundColor Yellow

# Collect the names of the VMs restarted by HA during the last $DaysBack days
Function Get-HAVM {
    $Date = Get-Date
    $DaysBack = 1
    # No Select-Object here: the Vm property of each event carries the VM name used below
    $raw = Get-VIEvent -MaxSamples 10000000 -Start $Date.AddDays(-$DaysBack) -Types Warning |
        Where-Object { $_.FullFormattedMessage -match "restarted" } |
        Sort-Object CreatedTime -Descending
    $names = $raw.Vm.Name | Select-Object -Unique
    $names
    # Out-File overwrites the list left by a previous run
    $names | Out-File C:\Temp\vmlist.csv
}
Get-HAVM
$allvms = Get-Content -Path C:\Temp\vmlist.csv
$vms = Get-VM -Name $allvms
$myView = @()
foreach ($vm in $vms) {
    # Two folder levels above the VM's folder is the Org, one level above is the OrgvDC
    $Report = [PSCustomObject]@{
        VM_Name     = $vm.Name
        Org_Name    = $vm.Folder.Parent.Parent.Name
        OrgvDC_Name = $vm.Folder.Parent.Name
    }
    $myView += $Report
}
$myView | Out-GridView

#End here

Any doubt? Comment box is yours :)

Let's give it more power

If you have SMTP configured in your environment, you can simply mail the report from the same script using the Send-MailMessage command, but for that you have to tweak the above script a little.

The hint is that you have to save the final report. Change the last line of the above script to

$myView | Export-Csv -Path C:\Temp\vmsrestartedbyHA.csv -NoTypeInformation

then use the command below

Send-MailMessage -From 'gautam.johar@vcnotes.in' -To 'my.reader@home.com', 'myreader2@home.com' -Subject 'HA Event is triggered and VM list is attached' -Body "Please find the attachment" -Attachments C:\Temp\vmsrestartedbyHA.csv -Priority High -DeliveryNotificationOption OnSuccess, OnFailure -SmtpServer 'smtp.vcnotes.in'

Change the values wherever applicable.

If you are good at PowerShell, there are many ways to enhance the idea. For me this is a basic script which is working fine.

Side Note

I wrote this script to run in PowerShell ISE, so please run it there. If you get errors running it in a plain PowerShell CLI terminal, you might need to fix the errors it reports.

Good Luck!









vRA | How to manually assign the unassigned shards

Overview

During a vRA upgrade from 7.4 to 7.6, I hit this issue after the upgrade. Everything went well except for the error below on the VAMI page of both vRA appliances (I had two nodes). If you have more nodes and run into this, you will see the error on all of them.

================

Elasticsearch validation failed:

status: red
number_of_nodes: 2
unassigned_shards: 4
number_of_pending_tasks: 0
number_of_in_flight_fetch: 0
timed_out: False
active_primary_shards: 113
cluster_name: horizon
relocating_shards: 0
active_shards: 226
initializing_shards: 0
number_of_data_nodes: 2
delayed_unassigned_shards: 0

=================

If you read the error above, you will see that there are 4 unassigned shards which were not automatically assigned to any of the available vRA nodes.

Cause

It happens when the DB sync between the primary and replica vRA nodes is not healthy: the primary node does not have the latest data while the replica nodes are running with some additional data, which is a total break in replication between the master and replica DB. In my case too, there were many issues with the DB before the upgrade.

Even after you recover the cluster state, these shards might not be assigned automatically and the alert above remains. You then have to assign the unassigned shards manually. Let's see the process.

Resolution

1. Check the state from the master node CLI with the command below

#curl http://localhost:9200/_cluster/health?pretty=true

The output will show the same red status

{
  "cluster_name" : "horizon",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 113,
  "active_shards" : 226,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 4,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

2. Check the cluster information with the command below

#curl -s -XGET http://localhost:9200/_cat/nodes

You will get output similar to this

master.mylab.local 172.25.3.199 8   d * Dreadknight
replica.mylab.local 172.25.3.200 8   d m Masque

3. Search for unassigned shards

#curl -XGET 'http://localhost:9200/_cat/shards' | grep UNAS

You will see output similar to this

 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 15870  100 15870    0     0   484k      0 --:--:-- --:--:-- --:--:--  484k
v3_2020-10-02  4 p UNASSIGNED
v3_2020-10-02  4 r UNASSIGNED
v3_2020-10-02  2 p UNASSIGNED
v3_2020-10-02  2 r UNASSIGNED

4. Reassign these shards using the following commands, run on the master node. Here index = v3_2020-10-02, the shards to be reassigned are '2' and '4', and the node is 'Dreadknight'. Change the command according to your environment: the values after index, shard and node will differ, and everything else stays the same.


curl -XPOST 'localhost:9200/_cluster/reroute' -d '{"commands":[{"allocate":{"index":"v3_2020-10-02","shard":2,"node":"Dreadknight","allow_primary":"true"}}]}'

and

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{"commands":[{"allocate":{"index":"v3_2020-10-02","shard":4,"node":"Dreadknight","allow_primary":"true"}}]}'

That's it. The shards have now been assigned manually.

Log out of the VAMI on all nodes and log back in. You will no longer see the error.
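
When more than a couple of shards are unassigned, the reroute commands from step 4 can be generated from the _cat/shards output instead of being typed by hand. The sketch below is only an illustration: the function name build_reroute_cmds is my own, the sample input is the output shown in step 3, and it only prints the curl commands so that you can review them before running anything. It handles primary shards (p) only, on the assumption that the replicas get allocated once the primary is assigned.

```shell
#!/bin/sh
# Print one reroute command per unassigned primary shard.
# stdin: output of  curl -s 'http://localhost:9200/_cat/shards'
# $1:    name of the node that should receive the shards
build_reroute_cmds() {
  awk -v node="$1" -v q="'" '
    $3 == "p" && $4 == "UNASSIGNED" {
      printf "curl -XPOST %slocalhost:9200/_cluster/reroute%s -d %s{\"commands\":[{\"allocate\":{\"index\":\"%s\",\"shard\":%s,\"node\":\"%s\",\"allow_primary\":\"true\"}}]}%s\n", q, q, q, $1, $2, node, q
    }'
}

# Sample input copied from step 3 above
sample='v3_2020-10-02  4 p UNASSIGNED
v3_2020-10-02  4 r UNASSIGNED
v3_2020-10-02  2 p UNASSIGNED
v3_2020-10-02  2 r UNASSIGNED'

printf '%s\n' "$sample" | build_reroute_cmds Dreadknight
```

On a live node you would pipe the real curl output into the function and, after checking the printed lines, run them one by one.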


Miscellaneous Notes

This is a dynamic post and I will keep adding points here. I generally add small but useful things here which are not worth a long post of their own.

 


How to / Explanation

Transfer the tech-support bundle to FTP on an Arista router
copy flash:/EOS-4.18.2F.swi ftp:/user:password@192.168.10.15/EOS-4.18.2F.swi
user = username of the FTP server account
password = password of the FTP server account
192.168.10.15 = IP address of the FTP server
EOS-4.18.2F.swi = tech-support bundle file name

Encrypt a PowerShell script
https://drive.google.com/open?id=19Bvik1FcSTC57eJ0CZPE4D-8hnQfyCi-

Reboot Windows with a PowerShell command
powershell.exe -encodedCommand cwBoAHUAdABkAG8AdwBuACAALQByACAALwB0ACAAMAAxACAA

Create a digital clock
Download and run these PS scripts to create the clock on your PC.
EST Clock | CST Clock | IST Clock

Do a few things in Linux
1. Check the kernel version - rpm -qa | grep -i kernel
2. Change the IP on an interface - ifconfig eth1 192.168.2.2 netmask 255.255.255.0
3. Set or change the default gateway of a VM - route add default gw 192.168.2.1
4. File to edit to change the IP - vi /etc/sysconfig/network-scripts/ifcfg-eth0
5. Search for specific text on a Linux server -
grep -rnw '/path/to/somewhere/' -e 'pattern'

Ping with a given MTU value
ping www.yahoo.com -f -l 1492

Add Network Components in vRNI
Check this article

Encode and decode a Base64 script
Check this here

Ping an entire subnet in Windows
I have documented it here

Some useful ESXi commands
Check speed and other info of the HBA card - esxcli storage san fc list

vCloud API Guide for NSX
Here is the vendor page for the pdf

Create static routes in multiple ESXi hosts
$esx = Get-VMHost -Name esxihost_Name
$esxcli = Get-EsxCli -VMHost $esx -V2
$parms = @{
network = '192.168.102.0/24'
gateway = '192.168.3.1'
}
$esxcli.network.ip.route.ipv4.add.Invoke($parms)
$esxcli.network.ip.route.ipv4.list.Invoke()

Send mail to any mail account using PS
Download the PowerShell script from Google Drive. Click here

Delete any ISO file older than 15 days in all datastores
foreach($ds in Get-Datastore){
New-PSDrive -Name GJ -PSProvider VimDatastore -Root '/' -Datastore $ds > $null
Get-ChildItem -Path GJ:\ -Recurse -Include *.iso | Where-Object {$_.LastWriteTime -lt (Get-Date).AddDays(-15)} | Remove-Item -Confirm:$true
#This searches every folder in the datastore and asks before deleting each file it finds.
Remove-PSDrive -Name GJ -Confirm:$false}
With Remove-Item -Confirm:$true you check and delete each file one by one. Change it to Remove-Item -Confirm:$false if you want to delete without being asked.

Edit the login banner in the VMware Cloud Director appliance
1. Create or edit the file /etc/login.warn and put your message in it.
2. Edit the /etc/ssh/sshd_config file and change the line #Banner none to Banner /etc/login.warn

PS | To extract DRS rules with VM names

Hi Guys,

This is not a big thing, but I still wanted to document it for my own reference. I got a request asking which VMs are in which DRS rules, so I put together the script below.

#Start here

$VC = Read-host "Enter the FQDN\IP of vCenter Server"

Connect-VIServer $VC
$DRSRules = Get-Cluster | Get-DrsRule
$Results = ForEach ($DRSRule in $DRSRules)
     {
    "" | Select-Object -Property @{N="Cluster";E={(Get-View -Id $DRSRule.Cluster.Id).Name}},
    @{N="Name";E={$DRSRule.Name}},
    @{N="Enabled";E={$DRSRule.Enabled}},
    @{N="DRS Type";E={$DRSRule.KeepTogether}},
    @{N="VMs";E={$VMIds=$DRSRule.VMIds -split ","
     $VMs = ForEach ($VMId in $VMIds)
        {
        (Get-View -Id $VMId).Name
        }
      $VMs -join ","}}
     }
$Results | out-gridview

#End here

The result opens in another window, and you can copy the entire output into Excel if you want.

Cheers!



vCD | How to disable auto-discovery for particular OrgvDC

Overview
How to connect
How to check existing setting
How to update existing setting

Overview

I am writing this post because there is no clear-cut article on this on the web, or maybe I just couldn't find a straightforward process. In the vCD GUI there is an option to enable or disable auto-discovery for the entire vCD system. You cannot enable or disable auto-discovery at the Org level, but you can override the setting at the OrgvDC level with the help of the Admin APIs. You probably know about APIs, but what are Admin APIs? That gets answered along the way, so read this post carefully and I hope it will be clear. To know more about auto-discovery, you can check out this post by Tom Fojta.

How to connect

You cannot even check the auto-discovery status of an OrgvDC from the GUI; you need to use the API. I have already covered how to connect to vCD from an API tool in my previous posts. Have a look here

How to check existing setting

Once you are connected, use the API query below to extract your Org details

1. GET https://vcloud_ip_or_fqdn/api/org

Now copy the entire output and paste it into Notepad++ or any other text editor you like. In that file, search for the name of the Org where your OrgvDC was created; you will find its href link there. Copy that link, paste it into the API tool and send a GET. Example is shown below

2. GET https://vcloud_ip_or_fqdn/api/org/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc

This returns the list of OrgvDCs in the Org. Copy the entire output and paste it into Notepad++ again. Search for the target OrgvDC name and copy the href of that OrgvDC. Below is the example-

https://vcloud_ip_or_fqdn/api/vdc/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc

To check the value, you need to modify the above href a little. Check below

https://vcloud_ip_or_fqdn/api/admin/vdc/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc

Notice the difference between the two lines: "admin" has been added after /api/. Now create and send the GET as below

3. GET https://vcloud_ip_or_fqdn/api/admin/vdc/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc
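
If you do this for several OrgvDCs, the href change can be scripted. This is just a small sketch: the function name to_admin_href is made up for the example, and it assumes the href has the /api/vdc/ form shown above.

```shell
#!/bin/sh
# Turn a regular vDC href into its admin equivalent by inserting "admin"
# between /api/ and /vdc/. Any other URL is printed unchanged.
to_admin_href() {
  printf '%s\n' "$1" | sed 's#/api/vdc/#/api/admin/vdc/#'
}

to_admin_href 'https://vcloud_ip_or_fqdn/api/vdc/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc'
```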

Note that the auto-discovery option appears in the output only if you run the GET with "admin" added. Below is the example command and output with the "admin" keyword-




Note that if the output for an OrgvDC does not have this line, it means the OrgvDC is following the global (system-level) setting. To override that, you add this line yourself. I will explain how.

False means VM auto-discovery is disabled and true means it is enabled. That was the process to get the VM auto-discovery status of an OrgvDC. Now let's see how to change this value.


How to update existing setting

To change this value from false to true or from true to false, or to add the whole line, follow the steps below

1. In step 3 above you sent a GET to the OrgvDC admin href to read the VM auto-discovery state. Keep the same href and change the method from GET to PUT


2. In the OrgvDC output you copied into Notepad++, if vmDiscoveryEnabled is false and you want it to be true, change the value from false to true, and vice-versa.

3. After changing the value, copy the entire output again and paste it in the BODY; select RAW and select XML as shown in my previous post.

4. Don't click the Send button yet. You need to add one more header alongside the headers already in place. The header info is given here and its practical use is below. This header is the only reason I had to write an entire post: it is not clearly mentioned in any article on the web, so now you have one.


If you want to use JSON instead, you can, but then make sure JSON is selected in the body where you pasted the data from Notepad++.

Once you have set the content-type, make sure you have entered the right vDC href and selected PUT, not GET, as the operation.

Now hit the send button.

You will get the message "202 Accepted" if all went well.
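
The edit from step 2 can also be made with sed instead of by hand in Notepad++. Treat this as a sketch under assumptions: the function name set_vm_discovery is made up, the element is assumed to sit on one line and to be spelled VmDiscoveryEnabled or vmDiscoveryEnabled, and the one-line sample stands in for the full GET output. It only changes an existing line; it does not add the line when it is missing.

```shell
#!/bin/sh
# Set the VM discovery flag in a saved OrgvDC XML document.
# $1: desired value (true or false). XML is read from stdin, result goes to stdout.
set_vm_discovery() {
  sed "s#\(<[Vv]mDiscoveryEnabled>\)[a-z]*\(</[Vv]mDiscoveryEnabled>\)#\1$1\2#"
}

# Hypothetical one-line fragment standing in for the full GET output
printf '%s\n' '<VmDiscoveryEnabled>false</VmDiscoveryEnabled>' | set_vm_discovery true
```

The result is what you paste into the BODY in step 3.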







vCD | How to select ESP as Protocol in firewall rule of ESG

Overview
How to connect
How to extract edge firewall rules config
How to update edge firewall rules config

Overview

This post shares the process to set a protocol other than the ones offered in the GUI on an NSX-V Edge firewall rule (not DFW). On vCD's Edge Service Gateway page the available protocols are TCP, UDP, ICMP and Any. See below image.
My customer wanted to set another protocol here, namely ESP. I checked the GUI and it was clear that this is not possible from there, so I changed it through API queries instead.

How to connect

Before updating this firewall rule field, we need to know how to connect to vCloud Director from an API tool. You can use Postman, Insomnia or ARC (Advanced REST Client) to connect to vCD. You might need to disable the SSL check before executing any API call. The snippet below is from the Postman API tool.

Once the SSL check is disabled
1. Set Authorization to Basic Auth. See below image

2. Set the header as mentioned below
Accept application/*;version=32.0

Use the version that matches your vCD version.

3. Now create an API query for https://vcloud_ip_or_fqdn/api/sessions and select POST as the query type. It will look like
POST https://vcloud_ip_or_fqdn/api/sessions
This query gets the authorization and access token. Once you have entered the URL and selected POST as the query type, hit the "Send" button to run it.
After the run you will get "200 OK" along with the authorization and access token headers. See below images


Use the above two headers as shown in the images below

Now you are ready to do any operation in vCD using this API tool

How to extract edge firewall rules config


Use the API query below to extract your Org details

1. GET https://vcloud_ip_or_fqdn/api/org

Copy the output and paste it in Notepad++. Search for the target OrgvDC name where your edge resides. Then create another query and run it

2. GET https://vcloud_ip_or_fqdn/api/vdc/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc

You will see a similar href in your Notepad++ data. Copy the vdc href from your Notepad++ file, not from here, paste it in Postman and hit send

This gives you another output, for the OrgvDC. Search it for the edge. You will find a line like the one below. Copy the corresponding line from your output and run another query
https://vcloud_ip_or_fqdn/network/vdc/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc/edges
Now create an API call like

3. GET https://vcloud_ip_or_fqdn/network/vdc/a038859f-bf22-4d64-b6dc-e1cb8fdf2fbc/edges

It will give you output like below. Only the single line is shown.

https://vcloud_ip_or_fqdn/network/edges/1343b683-bdca-4b80-9e19-8d668f98d8bc

Now create one more query to fetch the edge firewall configuration. It will look like

4. GET https://vcloud_ip_or_fqdn/network/edges/1343b683-bdca-4b80-9e19-8d668f98d8bc/firewall/config

It will give you the complete firewall configuration of this edge.

How to update edge firewall rules config

It is a simple process. Copy the output of point 4 into a text editor like Notepad++ and search for entries like the ones below-
<application>
  <service>
    <protocol>tcp</protocol>
    <port>any</port>
    <sourcePort>any</sourcePort>

Here the protocol needs to change from tcp to esp. Change the field to esp right in Notepad++. It will look like below
<application>
  <service>
    <protocol>esp</protocol>
    <port>any</port>
    <sourcePort>any</sourcePort>
Now copy the entire output from Notepad++ (the full output, not just these 5 lines) and paste it in Postman. Where? See below-


Once done, create the query below

PUT https://vcloud_ip_or_fqdn/network/edges/1343b683-bdca-4b80-9e19-8d668f98d8bc/firewall/config

and hit the send button. That's it. To cross-check, either look in the GUI or follow "How to extract edge firewall rules config" again
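
If the configuration has many rules, the protocol edit described above can be limited to one rule with a little awk rather than a blind search and replace. This is a sketch under assumptions: the function name set_rule_protocol and the rule names in the sample are made up, and it assumes the saved output is pretty-printed with one element per line, with each rule wrapped in a firewallRule element that has a name element before its application block.

```shell
#!/bin/sh
# Change <protocol> only inside the firewall rule with the given name.
# $1: rule name, $2: new protocol. XML is read from stdin, result goes to stdout.
set_rule_protocol() {
  awk -v rule="$1" -v proto="$2" '
    index($0, "<name>" rule "</name>") { in_rule = 1 }
    in_rule && /<protocol>[^<]*<\/protocol>/ {
      sub(/<protocol>[^<]*<\/protocol>/, "<protocol>" proto "</protocol>")
    }
    /<\/firewallRule>/ { in_rule = 0 }
    { print }'
}

# Hypothetical two-rule sample standing in for the saved output of point 4
sample='<firewallRule>
  <name>ipsec-esp</name>
  <application>
    <service>
      <protocol>tcp</protocol>
      <port>any</port>
      <sourcePort>any</sourcePort>
    </service>
  </application>
</firewallRule>
<firewallRule>
  <name>web</name>
  <application>
    <service>
      <protocol>tcp</protocol>
    </service>
  </application>
</firewallRule>'

printf '%s\n' "$sample" | set_rule_protocol ipsec-esp esp
```

Only the rule named ipsec-esp gets esp; the web rule keeps tcp. The result is what goes into the body of the PUT.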

vCD | Upgrade from version 9.5 to 10.1.2

 

Overview

Going from version 9.5 to version 10.1.2 is a three-step process. I would suggest completing the pre-requisites properly, and then it will be a flawless process. Check interoperability first, so that your other components can work with the vCD versions you will be upgrading to. In my experience the order during migration is: first deploy the primary node → Transfer the DB → Replace custom certificates with self-signed certificates → Make sure your primary node is up → Now add more nodes if you want a multi-cell architecture → Change certificate.ks on the standby nodes.

When we talk about primary and standby nodes, this applies only to the Postgres DB, which is active only on the primary node and in standby on the standby nodes. The vmware-vcd service is always active-active on all three nodes (if you deploy the minimum of three nodes in a multi-cell architecture). See below image.



The upgrade path will be-
Current 9.5 in-linux → In-place upgrade to 9.7 in-linux → Migrate to 9.7 appliance → Upgrade to version 10.1.2 appliance. You can check the vendor doc for this upgrade path. Now that you know the workflow, let's proceed to the planning phase.

Planning

First of all, check the interoperability of your product versions. Click here for the VMware Interoperability guide. Plan your upgrade as per this guide. I must say this is the most important phase: if you plan it well, there is very little chance of failure. Let's see what you need to plan-
The in-place upgrade is quite simple and has no real complexity. All the planning is needed for the migration from in-linux to appliance.
1. IP Addresses
There are two choices. Decide whether the new vCD cells will reuse the IPs of the existing vCD cells or get new IPs. Why? Because you are going to deploy new cells for the migration to the appliance, as I describe in the next steps. I reused the IPs of the existing cells. If you reuse the same IP addresses, then
1. at the time of the 9.7 appliance deployment you need to change the old cell's IP address to a temporary but reachable IP address. This IP address must be reachable from your new cell as well as from your existing external DB server. Why? Because
1.1. The old cell's IP address will be assigned to the new vCD cell's eth0 NIC
1.2. We still need one old vCD cell (any one) for the DB migration, which is why it must be reachable from the new cell and from your external DB server
2. You need to free the IPs of all three old cells so that the same three IPs can be assigned to the eth0 NICs of the three new 9.7 appliance nodes.
3. You need to change the DNS entries of your old cells to the new temporary IPs, and then create DNS entries (Host and PTR records) for the new cells with the old IP addresses. Any confusion? Comment please.
4. You need to create a separate VLAN for the eth1 IPs of all three new vCD cells, if you don't already have one.

2. Network Route
This is quite a crucial part of the migration. The old vCD cells used to have three NICs carrying different traffic such as HTTPS, VMRC and NFS, but in the vCD 9.7 appliance each cell has only two NICs carrying these services. Make sure the two NICs are on different VLANs\subnets. Also make sure that eth1 of your new vCD cells can reach your NFS server; if required, configure static routes for that as per your network flow.
3. Single cell or Multi cell Architecture
Decide this up front and plan your upgrade accordingly. Additional points to take care of are

- Check the load-balancer configuration. It might need to be modified after the vCD 9.7 deployment.
- Do you have more than one LB balancing different traffic, for example Internet and intranet?
- Deploy additional nodes at the very last stage, that is after the certificate replacement and after the first node is fully up

4. NFS
NFS is another critical part of this migration. Some of you might have NFS on a Linux machine, some on Windows and some directly on a storage box.

Make sure that the NFS mount point is empty while deploying the first primary node of the 9.7 appliance, otherwise the 9.7 deployment will fail. Also make sure that your NFS server is reachable from the eth1 NIC of all vCD nodes in the 9.7 appliance deployment.

I guess I have given enough clues to help you plan this migration. Now let's go through all the steps one by one.

In-place upgrade from 9.5 to 9.7

Pre-requisites
- Make sure that the upgrade .bin file is uploaded to a directory on the vCD cell(s) and that the user has sufficient permissions on it.
- Do an md5sum check. The command is # md5sum installation-file.bin
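
The md5sum check can be turned into a pass/fail test instead of comparing the checksum by eye. A minimal sketch, assuming GNU md5sum is available on the cell; the function name verify_md5 is made up, and the demo uses a throwaway file rather than the real installer.

```shell
#!/bin/sh
# Compare a file's MD5 checksum with the value published on the download page.
# $1: path to the file, $2: expected checksum (32 hex characters)
# Returns 0 when they match, non-zero otherwise.
verify_md5() {
  printf '%s  %s\n' "$2" "$1" | md5sum -c - >/dev/null 2>&1
}

# Self-contained demo with a temporary file
demo=$(mktemp)
printf 'demo' > "$demo"
expected=$(md5sum "$demo" | awk '{print $1}')
if verify_md5 "$demo" "$expected"; then
  echo "checksum OK"
else
  echo "checksum MISMATCH"
fi
rm -f "$demo"
```

For the real file you would call verify_md5 installation-file.bin followed by the checksum from the download page.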
Main activity
Run all the commands below on every vCD cell, one by one
Step 1 : Stop vCD services on all cells using ./cell-management-tool
To see the current status
./cell-management-tool -u administrator -p 'password' cell --status
To stop new tasks from coming to the cell
./cell-management-tool -u administrator -p 'password' cell --quiesce true
To put it in maintenance mode
./cell-management-tool -u administrator -p 'password' cell --maintenance true
Finally, to shut down the vCD services
./cell-management-tool -u administrator -p 'password' cell -s

Step 2 : Take a snapshot of all vCD cells
Step 3 : Take a full backup of the external MS SQL database
Step 4 : Make the downloaded .bin file executable. The command is
#chmod u+x installation-file.bin
Step 5 : Install the upgrade file now
#./installation-file.bin
If you placed the .bin file in the /tmp folder, change to that directory in the CLI first and then run the command
It is a simple one. There is no real complexity.
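
Since Step 1 has to be repeated on every cell, the four commands can be wrapped in a small script. This is only a sketch: the names run, stop_cell and DRY_RUN are made up for the example, and the tool path is the bin directory used elsewhere on this blog. By default it only prints the commands so you can check them first.

```shell
#!/bin/sh
# Print (dry run) or execute the shutdown sequence from Step 1 on one cell.
# Set DRY_RUN=0 to really run the commands; the default only prints them.
CMT=/opt/vmware/vcloud-director/bin/cell-management-tool
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1: admin user, $2: password. Stops at the first command that fails.
stop_cell() {
  run "$CMT" -u "$1" -p "$2" cell --status           || return 1
  run "$CMT" -u "$1" -p "$2" cell --quiesce true     || return 1
  run "$CMT" -u "$1" -p "$2" cell --maintenance true || return 1
  run "$CMT" -u "$1" -p "$2" cell -s
}

stop_cell administrator 'password'
```

Keep in mind that a password given on the command line ends up in the shell history.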

Best Practices for next migration

1. Traffic should be segregated as below. This is how I did it, so I am sharing it as my personal recommendation. You can choose to do it the other way round as well.
 vCD 9.7 appliance    eth0              eth1
 Primary              HTTPS+VMRC+API    PostgreDB+NFS
 Standby              HTTPS+VMRC+API    PostgreDB+NFS
 Standby              HTTPS+VMRC+API    PostgreDB+NFS
2. If the primary is deployed as "Primary-Large", all standby cells must be deployed as "Standby-Large". If the primary is deployed as "Primary-small", all standby cells must be deployed as "Standby-small".
3. In a multi-cell architecture, standby cells should be deployed at the very last stage, that is after the first cell is up, running and fully functional and after the certificates have been replaced. It will make things simple.
4. Both NICs of the vCD 9.7 appliance cells must be on different VLANs
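
Point 4 is easy to verify before you start the deployment. The sketch below checks whether two IP/netmask pairs fall into overlapping subnets; the function names are made up, and it assumes ordinary contiguous netmasks. With the sample addresses used in the deployment form later in this post (172.17.2.21 with 255.255.255.224 and 172.17.2.22 with 255.255.255.240) it reports an overlap, so treat those values as placeholders only.

```shell
#!/bin/sh
# Convert a dotted IPv4 address or netmask to a single integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds (exit 0) when the two subnets overlap.
# Args: ip1 mask1 ip2 mask2
subnets_overlap() {
  i1=$(ip_to_int "$1"); m1=$(ip_to_int "$2")
  i2=$(ip_to_int "$3"); m2=$(ip_to_int "$4")
  # For contiguous masks, m1 & m2 is the shorter of the two masks.
  # The subnets overlap when both addresses agree under that mask.
  common=$(( m1 & m2 ))
  [ $(( i1 & common )) -eq $(( i2 & common )) ]
}

if subnets_overlap 172.17.2.21 255.255.255.224 172.17.2.22 255.255.255.240; then
  echo "eth0 and eth1 overlap"
else
  echo "eth0 and eth1 are on separate subnets"
fi
```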

Migrate from ver 9.7 In-linux to 9.7 appliance

Pre-requisites
1. A clean and accessible NFS mount share. It must be accessible from the eth1 NIC if you are planning to move NFS traffic to eth1. I did the same.
2. Accurate DNS entries for the new vCD cells with the old IPs (if you don't plan to change the existing IPs)
3. If you have custom certificates, have the passwords of the keystore, HTTPS and console proxy at hand.
4. Make sure that the network flows are open between the new vCD cell's eth1 NIC and the old cell. Follow the vCD 9.7 Install guide attached at the end of this post
5. Configure AMQP with version 3.7
6. Here is the vendor documentation for all pre-requisites
7. Your production downtime starts when you change the production IP of the old vCD cell. We will assign this IP to the new primary cell.

Start the vCloud Director Appliance Deployment : Primary Node

1. Start deploying the OVA as usual. A wizard will open → Give the vCD cell a valid name → Give it a folder location → Compatible datastore → Underlying ESXi host → Select the eth1 and eth0 portgroups → Click next to reach the Customize template page →

Under Customized template, fill the following-
NTP Server to use: 8.8.8.8 
Initial Root Password: VMware1!
Expire Root Password Upon First Login: uncheck
Enable SSH root login: check
NFS mount for transfer file location: IPaddress:/sharename 
'vcloud' DB password for the 'vcloud' user: VMware1!
Admin User Name: administrator
Admin Full Name: vCD Admin
Admin user password: VMware1!
Admin email: vcd97@vcnotes.in
System name: vcd4
Installation ID: 12
eth0 Network Routes: blank
eth1 Network Routes: blank
Default Gateway: 172.17.2.1
Domain Name: vcnotes.in
Domain Name Servers: 8.8.8.8,8.8.4.4
eth0 Network IP Address: 172.17.2.21
eth0 Network Netmask: 255.255.255.224
eth1 Network IP Address: 172.17.2.22
eth1 Network Netmask: 255.255.255.240

You need to modify the above details as per your environment. A few doubts you might have-

System Name - For the first primary node, you can put vcd1
Installation ID - In a brownfield setup, note the installation ID of the running setup and use the same here
eth0 and eth1 Network Routes - These are for static routes according to your network design

Once all the info is given, review it and click on Finish.

Glad to see vendor's documentation here. Refer to page number 54. 

Note that:

1. You do not need to change the installation ID on each and every standby cell. The installation ID is what vCD uses to generate unique MAC addresses for vCD VMs. I have seen a few blogs asking you to change it for standby nodes. This is totally incorrect.
2. Domain Name and Domain search path will be the same, vcnotes.in. It should not be like vcdcell01.vcnotes.in. The FQDN is generated automatically from the VM name you give at the start of the deployment.

Post-Checks-

1. Once the OVA deployment is finished, connect over SSH. If SSH is not responding, open the console and start the sshd service (service sshd start). Check the logs below to make sure everything is good.
#cat /opt/vmware/var/log/firstboot
#cat /opt/vmware/var/log/vcd/setupvcd.log
#cat /opt/vmware/var/log/vami/vami-ovf.log

If everything went well during deployment, the firstboot log will show the success mark; otherwise it will point you to setupvcd.log and then vami-ovf.log

2. Browse to the VAMI interface at https://IP_FQDN_of_primary_cell:5480. It should look like below


3. Browse to https://IP_FQDN_of_primary_cell/cloud and https://IP_FQDN_of_primary_cell/provider. All portals should be accessible without any error

Start the vCloud Director Appliance Deployment : Standby Node
Not Now :)

Once your primary cell is deployed, don't deploy the standby nodes right away. Now it's time to transfer the DB.

Take backup of internal embedded postgres database

#/opt/vmware/appliance/bin/create-db-backup

Configure External Access to the vCloud Director Database 

1. Stop vCD services on all cells, that is the new primary cell and the three old cells.
2. SSH to the new vCD cell and create a file named external.txt in /opt/vmware/appliance/etc/pg_hba.d with the command below

#vi /opt/vmware/appliance/etc/pg_hba.d/external.txt

Now add the content below to the external.txt file

#TYPE  DATABASE       USER     ADDRESS            METHOD
host       vcloud                 vcloud   172.25.2.194/32       md5
host       vcloud                 vcloud   172.25.2.209/32       md5

Note: 172.25.2.194/32 is the IP address of your old external DB server in CIDR notation, and 172.25.2.209 is the IP address of the eth1 NIC of the old vCD cell. Refer to pages 82 and 83 of the vCD 9.7 Install guide attached at the end of this post.
You can confirm that the file was picked up by checking pg_hba.conf. Just run #cat pg_hba.conf and it should show the entries you made in the steps above

3. SSH to the old vCD cell and run the command below. Refer to page 122 of the vCD 9.7 Install guide.

/opt/vmware/vcloud-director/bin/cell-management-tool dbmigrate -dbhost eth1_IP_new_primary \
-dbport 5432 -dbuser vcloud -dbname vcloud -dbpassword database_password_new_primary

What you need to modify in the above command is-

eth1_IP_new_primary - The eth1 IP address of the new primary vCD appliance cell
database_password_new_primary - The database password given while deploying the primary vCD node
-dbname - It should be vcloud unless you have changed it intentionally.
\ - Many get confused by this backslash. It only marks a line break in the VMware documentation. If you type the command on a single line, leave it out.

The rest is self-explanatory and can be used as it is. If all goes well, it will transfer your external SQL DB to the embedded PostgreSQL.
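
The external.txt entries from step 2 can be generated instead of typed in vi, which helps when several old cells need access. A small sketch: the function name pg_hba_entries is made up, and every address is written as a /32 entry for the vcloud database and user, exactly like the two lines shown above.

```shell
#!/bin/sh
# Print one pg_hba entry per IP address given as an argument.
# Each address is allowed to reach the vcloud database as the vcloud user.
pg_hba_entries() {
  echo '#TYPE  DATABASE  USER    ADDRESS          METHOD'
  for ip in "$@"; do
    printf 'host    vcloud    vcloud  %s/32  md5\n' "$ip"
  done
}

pg_hba_entries 172.25.2.194 172.25.2.209
# To write the file on the appliance, redirect the output:
#   pg_hba_entries 172.25.2.194 172.25.2.209 > /opt/vmware/appliance/etc/pg_hba.d/external.txt
```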

Transfer Certificates from the Old Cell and Integrate Them into the New Primary Cell

1. Copy all of the following files from the old vCD cell (the migration source) to the new vCD cell. Do not edit any entries in these files during this process. Use WinSCP to move the files between the two machines. Rename certificates.ks to certificates.ks.migrated before pasting it into the new vCD cell, to avoid any confusion.

/opt/vmware/vcloud-director/certificates.ks
/opt/vmware/vcloud-director/etc/certificates
/opt/vmware/vcloud-director/etc/global.properties
/opt/vmware/vcloud-director/etc/proxycertificates
/opt/vmware/vcloud-director/etc/responses.properties
/opt/vmware/vcloud-director/etc/truststore

2. Create a new directory on the new vCD cell and paste the above files there
mkdir /root/tempCerts

3. Change the ownership of the above files to the vcloud user
chown vcloud:vcloud /root/tempCerts/*

4. On the new appliance, rename all existing files to keep them for reference
cd /opt/vmware/vcloud-director/etc/
mv certificates certificates.original
mv global.properties global.properties.original
mv proxycertificates proxycertificates.original
mv responses.properties responses.properties.original
mv truststore truststore.original

I didn't rename /opt/vmware/vcloud-director/certificates.ks here because the migrated copy was already renamed in step 1.

5. On the new appliance copy the files from /root/tempCerts to their respective directories. 

mv /root/tempCerts/certificates.ks.migrated /opt/vmware/vcloud-director/
mv /root/tempCerts/certificates /opt/vmware/vcloud-director/etc/
mv /root/tempCerts/global.properties /opt/vmware/vcloud-director/etc/
mv /root/tempCerts/proxycertificates /opt/vmware/vcloud-director/etc/
mv /root/tempCerts/responses.properties /opt/vmware/vcloud-director/etc/
mv /root/tempCerts/truststore /opt/vmware/vcloud-director/etc/
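
Steps 4 and 5 repeat the same two moves for every file, so they can be wrapped in a loop. The sketch below is an illustration only: swap_in_files is a made-up helper name and the .original suffix is my assumption for the reference copies. It first checks that every migrated file is present, so a missing file cannot leave the swap half done.

```shell
#!/bin/sh
# Hypothetical helper: swap_in_files SRC DEST NAME...
# For each NAME, keep the existing DEST/NAME as DEST/NAME.original (if any),
# then move SRC/NAME into DEST.
swap_in_files() {
    src=$1
    dest=$2
    shift 2
    # Check everything first so a missing file cannot leave a half-done swap.
    for name in "$@"; do
        if [ ! -e "$src/$name" ]; then
            echo "missing in $src: $name" >&2
            return 1
        fi
    done
    for name in "$@"; do
        if [ -e "$dest/$name" ]; then
            mv "$dest/$name" "$dest/$name.original" || return 1
        fi
        mv "$src/$name" "$dest/$name" || return 1
    done
}

# Usage on the new appliance (paths from this post, not executed here):
# swap_in_files /root/tempCerts /opt/vmware/vcloud-director/etc \
#     certificates global.properties proxycertificates responses.properties truststore
```

certificates.ks.migrated is left out on purpose because it goes to a different directory.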

You have now transferred all the certificates from the old cell to the new one. It is time to run the configure command so that the new vCD primary appliance can use them.

6. Below is the command.

Before running it, note that /opt/vmware/vcloud-director/certificates.ks (the custom certificate copied from the old cell) is not in use yet, because we renamed it to certificates.ks.migrated. We will do all the initial configuration with self-signed certificates and switch to the custom certificate in the last step.

/opt/vmware/vcloud-director/bin/configure --unattended-installation --database-type postgres --database-user vcloud --database-password db_password_new_primary --database-host eth1_ip_new_primary --database-port 5432 --database-name vcloud --database-ssl true --uuid --keystore /opt/vmware/vcloud-director/etc/certificates.ks --keystore-password root_password_new_primary --primary-ip appliance_eth0_ip --console-proxy-ip appliance_eth0_ip --console-proxy-port-https 8443

I hope the above command is self-explanatory. If not, leave a comment with your question.

Once the above command completes successfully, move on to the next step.

7. Start vCD services on the new primary cell

SSH to the primary cell and start the vCD services
#service vmware-vcd start or #systemctl start vmware-vcd.service

You can monitor the progress of the cell startup in /opt/vmware/vcloud-director/logs/cell.log.

Update the certificates.ks file

1. Rename the original certificates.ks file. This is the self-signed certificate.

mv /opt/vmware/vcloud-director/certificates.ks /opt/vmware/vcloud-director/certificates.ks.original

2. Rename the migrated certificates.ks file. This is the custom certificate, which will now be used in production.
mv /opt/vmware/vcloud-director/certificates.ks.migrated /opt/vmware/vcloud-director/certificates.ks

3. Shut down the vCloud Director cell.

/opt/vmware/vcloud-director/bin/cell-management-tool -u administrator -p 'Password' cell --quiesce true
/opt/vmware/vcloud-director/bin/cell-management-tool -u administrator -p 'Password' cell --maintenance true
/opt/vmware/vcloud-director/bin/cell-management-tool -u administrator -p 'Password' cell -s

4. Change the ownership of the certificates.ks file
chown vcloud:vcloud /opt/vmware/vcloud-director/certificates.ks

5. Run the configuration tool to import the new certificate.
/opt/vmware/vcloud-director/bin/configure

If asked “Please enter the path to the Java keystore containing your SSL certificates and private keys:”, enter the location you uploaded the file to. In our case: /opt/vmware/vcloud-director/certificates.ks. It will then ask for the https, console proxy and keystore passwords. Supply them all.

Press Y wherever it prompts, and you are done!

You do not need to start the vCD service manually now. It will be started automatically.

Check the /cloud, /provider and :5480 portals and make sure they are accessible from both the intranet and the Internet.

Some Useful Commands for HA Cluster Operations

These are show commands that you can run to get a deeper insight into the vCD HA cluster status.

sudo -i -u postgres /opt/vmware/vpostgres/current/bin/repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf node status
sudo -i -u postgres /opt/vmware/vpostgres/current/bin/repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf cluster show
sudo -i -u postgres /opt/vmware/vpostgres/current/bin/repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf cluster matrix 
sudo -i -u postgres /opt/vmware/vpostgres/current/bin/repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf cluster crosscheck 
systemctl status appliance-sync.timer  #It is to check the time sync between all the nodes and needs to be run on each node separately
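
The four repmgr commands differ only in the last two words, so a small wrapper saves retyping the long paths. A minimal sketch: the paths are the ones used above, while the function name repmgr_cmd is made up for illustration. It only prints the command lines, so nothing is executed until you choose to run them.

```shell
#!/bin/sh
REPMGR=/opt/vmware/vpostgres/current/bin/repmgr
REPMGR_CONF=/opt/vmware/vpostgres/current/etc/repmgr.conf

# Hypothetical helper: print the full command line for one repmgr
# subcommand, for example "cluster show".
repmgr_cmd() {
    printf 'sudo -i -u postgres %s -f %s %s\n' "$REPMGR" "$REPMGR_CONF" "$1"
}

# Print the four status commands from this post.
for sub in "node status" "cluster show" "cluster matrix" "cluster crosscheck"; do
    repmgr_cmd "$sub"
done
```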




Start the vCloud Director Appliance Deployment : Standby Node1

1. The process to deploy a standby node is the same as for the primary node, except:

- You only seed the information that applies to a standby node at deployment time
- You only need to transfer the certificates.ks file to its default location; no other certificate replacement is required on a standby node

Start the vCloud Director Appliance Deployment : Standby Node2

Same as above, no change.

If everything goes well, you will see all three nodes with a green tick icon in the /cloud or /provider interface.

All nodes are now deployed in vCD. A multi-cell deployment needs only one or two additional steps.

Load-Balancer Configuration : Check your existing load balancer configuration and modify it if required. If your load balancer was already configured with the in-use IP addresses, you only need to change the in-use port from 443 to 8443. In my case, an LB in NSX handled internal traffic and an F5 handled Internet traffic.

I would like to share some issues I ran into during the migration.

Known Errors during above deployment and migration

Here is where I got stuck.
Issue 1: After deploying the first node, I got the error below on the VAMI interface
The deployment of the primary vCloud Director appliance fails because of insufficient access permissions to the NFS share. The appliance management user interface displays the message: No nodes found in cluster, this likely means PostgreSQL is not running on this node. The /opt/vmware/var/log/vcd/appliance-sync.log file contains an error message: creating appliance-nodes directory in the transfer share /usr/bin/mkdir: cannot create directory ‘/opt/vmware/vcloud-director/data/transfer/appliance-nodes’: Permission denied.
Solution : It means that the NFS share was not clean, so the PostgreSQL service could not start. The firstboot and setupvcd.log files will give you an idea of what went wrong. Delete all the content of the NFS share, delete the existing node and retry the deployment. There is no other fix.
Issue 2: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: signature check failed.
These were the log entries in cell.log, and the portal was not up.
Solution: Edit the global.properties file on the new primary cell, comment out (#) the three lines associated with the SSL connection, and run the configure command
/opt/vmware/vcloud-director/bin/configure --unattended-installation --database-type postgres --database-user vcloud --database-password db_password_new_primary --database-host eth1_ip_new_primary --database-port 5432 --database-name vcloud --database-ssl true --uuid --keystore /opt/vmware/vcloud-director/etc/certificates.ks --keystore-password root_password_new_primary --primary-ip appliance_eth0_ip --console-proxy-ip appliance_eth0_ip --console-proxy-port-https 8443
If it doesn't work, run the command below, which is the same command with --database-ssl false
/opt/vmware/vcloud-director/bin/configure --unattended-installation --database-type postgres --database-user vcloud --database-password db_password_new_primary --database-host eth1_ip_new_primary --database-port 5432 --database-name vcloud --database-ssl false --uuid --keystore /opt/vmware/vcloud-director/etc/certificates.ks --keystore-password root_password_new_primary --primary-ip appliance_eth0_ip --console-proxy-ip appliance_eth0_ip --console-proxy-port-https 8443
This worked for me both times I hit the issue.
Now run the command again with --database-ssl true
/opt/vmware/vcloud-director/bin/configure --unattended-installation --database-type postgres --database-user vcloud --database-password db_password_new_primary --database-host eth1_ip_new_primary --database-port 5432 --database-name vcloud --database-ssl true --uuid --keystore /opt/vmware/vcloud-director/etc/certificates.ks --keystore-password root_password_new_primary --primary-ip appliance_eth0_ip --console-proxy-ip appliance_eth0_ip --console-proxy-port-https 8443
Issue 3: The DB transfer was failing. I couldn't capture the error, but it complained about the old cell's IP address.
Solution : When you prepare the /opt/vmware/appliance/etc/pg_hba.d/external.txt file, add the IP address of your old cell in addition to the IP address of the external DB, as mentioned in the steps above. In my case, I had to add the IP address of the eth1 NIC of the old cell.
Issue 4: The vCD portal was up from both the Internet and the intranet, but the VM console was not accessible from the Internet.
Solution: In a multi-cell deployment with more than one LB, make sure you update the new cell's IP address in every LB configuration. In my case, the change on the Internet-facing LB was missed, and the issue was resolved once we corrected it.

Upgrade from version 9.7 appliance to Cloud Director 10.1.2 appliance

Prerequisites

Take a snapshot of the primary vCloud Director appliance.

Log in to the vCenter Server instance on which resides the primary vCloud Director appliance of your database high availability cluster.
Navigate to the primary vCloud Director appliance, right-click it, and click Power > Shut Down Guest OS.
Right-click the appliance and click Snapshots > Take Snapshot. Enter a name and, optionally, a description for the snapshot, and click OK.
Right-click the vCloud Director appliance and click Power > Power On.

Verify that all nodes in your database high availability configuration are in a good state. See Check the Status of a Database High Availability Cluster.

Procedure

  1. In a Web browser, log in to the appliance management user interface of a vCloud Director appliance instance to identify the primary appliance, https://appliance_ip_address:5480.
  2. Make a note of the primary appliance name. You must upgrade the primary appliance before the standby and application cells, and you must use the primary appliance when backing up the database.

    The vCloud Director update package is distributed as a file with a name of the form VMware_vCloud_Director_v.v.v.v-nnnnnnnn_update.tar.gz, where v.v.v.v represents the product version and nnnnnnnn the build number. For example, VMware_vCloud_Director_10.0.0.4424-14420378_update.tar.gz.
  3. Create the local-update-package directory in which to extract the update package.
    #mkdir /tmp/local-update-package
  4. Extract the update package in the newly created directory.
    #cd /tmp
    #tar -vzxf VMware_vCloud_Director_v.v.v.v-nnnnnnnn_update.tar.gz -C /tmp/local-update-package
  5. Set the local-update-package directory as the update repository.
    #vamicli update --repo file:///tmp/local-update-package
  6. Check for updates to verify that you set up the repository correctly.
    #vamicli update --check
    If all went well, you will see output similar to the following


  7. Shut down vCloud Director by running the following command
    #/opt/vmware/vcloud-director/bin/cell-management-tool -u <admin username> cell --shutdown
    OR
    #service vmware-vcd stop
    You can use either way to stop the vCD services
  8. Apply the available upgrade
    #vamicli update --install latest

    Note: Follow all the steps above on every cell, one by one, and restart each cell after upgrading the application. Then log in to the primary cell only and upgrade the database schema
  9. From the primary appliance, back up the vCloud Director appliance embedded database.
    #/opt/vmware/appliance/bin/create-db-backup
  10. From any appliance, run the vCloud Director database "upgrade" utility.
    #/opt/vmware/vcloud-director/bin/upgrade
  11. Reboot each vCloud Director appliance
    #shutdown -r now
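
Before extracting an update package, it can help to confirm that the file name really matches the version and build you intend to install. A minimal sketch, assuming the naming pattern described in the procedure above; the function name parse_update_package is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: split an update package file name into
# product version and build number.
parse_update_package() {
    base=${1##*/}                          # drop any directory part
    case $base in
        VMware_vCloud_Director_*-*_update.tar.gz) ;;
        *) echo "unexpected package name: $base" >&2; return 1 ;;
    esac
    rest=${base#VMware_vCloud_Director_}   # strip the fixed prefix
    rest=${rest%_update.tar.gz}            # strip the fixed suffix
    echo "version=${rest%-*} build=${rest##*-}"
}

# Example file name from this post.
parse_update_package /tmp/VMware_vCloud_Director_10.0.0.4424-14420378_update.tar.gz
```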

I will now share what is not covered in the vendor article:

1. After the application upgrade, when you log in to the HTML interface of vCD 10.1.2, you might notice that vCenter shows as disconnected and does not connect even after using the reconnect and refresh options. In that case, follow the steps below:

  1. Log in to the primary cell as root
  2. Run the command below to accept the certificate (the issue is with the certificate exchange in the new vCD version 10.1.2, and the certificates need to be accepted)
#/opt/vmware/vcloud-director/bin/cell-management-tool trust-infra-certs --vsphere --unattended

For more info, refer to https://kb.vmware.com/s/article/78885

2. After upgrading the vCD application, the postgres service might stop, and while upgrading the database schema you may see the error "unable to establish initial connection with database". To resolve this, either start the service manually or reboot the cell once.

That's all :) Hope it was helpful.

VMware References

1. vCD 9.7 Install and Upgrade guide 
2. The above guide has all the details, but in case you need something specific, here is the certificate replacement guide from VMware
3. Here are the database migration steps from VMware; the same steps are covered in guide number 1.
4. An awesome article written by Richard Harris on the same process. Must check. I learned from his experience too

vCD | Error "Invalid User" during any operation with VM

Hi Guys,

I am covering this error because I couldn't find any article on it and had to open a case with VMware to resolve it. Luckily, my case landed with a good engineer and we resolved it after a call of around 5-6 hours. So I thought I would write it up, as it may help someone.

In this article I will not share the exact solution. Instead, I will explain why it happens, I mean the root cause, and then you will have to raise a case with VMware. At least you will know the root cause. Why not the solution? Because it requires changes in the vCloud database, and touching the Cloud Director DB on your own is very risky unless you are an expert.

The error is shown in the image below, and the impacted vCD version is 9.7

Reason: In my case, it happened because my customer deleted a user directly from the LDAP server without transferring the user's objects in vCloud Director. As you may know, when you delete a user in vCD, it asks you to transfer that user's objects. That did not happen here, so all the objects owned by that user were locked against any modification.

VMware identified all the objects that were still owned by that user ID and replaced it with the system account's user ID. They did this directly in the vCD database, so you need to raise a case with them.

Enjoy the troubleshooting :)




vROPS | CPU Addition Automation

Hello Friends,

In my previous post, I explained CPU addition automation with PowerShell. Since then, I got a request to explain the steps for vROPS as well. As I said there, if you have vROPS, it is a better option than PowerShell. vROPS is undoubtedly an enterprise-level solution, and I would say the PowerShell approach is more of a trick.

Here you go...

Pre-requisite

  • CPU hot add must be enabled for the VM(s) you want to automate. How to do it?
  • The user account you are using must have sufficient access rights

 Procedure

1. Log in to vROPS with an account that has admin privileges.
2. Click on Alerts in the top menu. Which one? See the image below-

3. Expand Alert Settings, click on "Alert Definition" and then click on the green + icon as shown below.

 4. Fill in the form ;)

Name : Give any suitable name. I gave "_increase cpu count".
Base Object : Select Virtual Machine under the vCenter adapter. Type virt and it will be highlighted automatically.
Alert Impact :
Impact: Efficiency (because a continuous CPU spike will decrease efficiency)
Criticality : Critical (you can select any level as per your requirement)
Alert type and sub-type : Application:Performance
Wait Cycle : 1
Cancel Cycle : 1

5. Add Symptom definitions

1. Click on the + icon to create a new one if you don't have one already.
2. Now select the CPU|usage% or CPU|Workload% metric as per your requirement. I chose CPU|Workload% as it makes more sense in vROPS version 10.1.2.
3. Drag it to the right or double-click on the metric.

It looks like this:

  • 3.1. Give it a name
  • 3.2. Click on the drop-down and select "Immediate" or "Critical" as you wish
  • 3.3. For "When metric is greater than", I entered 99. It means that when CPU|Workload% goes above 99%, it will trigger the action.
  • 3.4. Once done, click on Save. You will be back on the alert definition page.
  • 3.5. Add a recommendation if you want.
    • 3.5.1. Click on the + icon
    • Select Adapter type "vCenter Adapter"
    • Select Action "Set CPU count for VM"

Once done, click on Save. Your new Alert Definition has been created.

Let's automate it now-

1. Check your current active policy. How? Check below

2. Now go to "Policy Library" and find active policy name in the list.
3. Select it and click on edit
4. Directly go to "Alert/Symptoms:Definition" and search for your "Alert Definition" which we created in above steps. It will look like below-


5. Click on Automate column and select "Local". You will see green tick icon as shown in above image.

That's it!

What you have done now is enable an automated action that increases the CPU count (it adds +1) whenever a VM's CPU workload% goes above 99%.
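
To make the rule concrete, here is a tiny model of the decision in shell. It is only my illustration of the "greater than 99, add 1" behaviour described above, not how vROPS implements it; the function name next_cpu_count is made up, and workload values are treated as whole numbers.

```shell
#!/bin/sh
# Hypothetical model of the alert rule: if the workload is strictly above
# the threshold (default 99), ask for one more vCPU; otherwise keep the count.
next_cpu_count() {
    current=$1
    workload=$2
    threshold=${3:-99}
    if [ "$workload" -gt "$threshold" ]; then
        echo $((current + 1))
    else
        echo "$current"
    fi
}

next_cpu_count 2 100   # above the threshold: one CPU is added
next_cpu_count 2 99    # exactly 99 is not "greater than", so nothing changes
```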

To see automated task's action, you can go to:

Administration → History → Recent Tasks. Below is the example of successful automated task

Above is standard way to automate it and it should work in most of the scenarios.

Caution : If you are targeting only a few VMs, make sure you are not applying the automation to the "Current Default Policy", otherwise CPU addition will trigger on all the VMs in your vCenter Server.

Solution : Create a new policy → Create a new Group and add the target VMs to it → Apply the new Policy to the new Group. This way, only the VMs you add to the group will be automated.

Known issues in this Automation:

Problem 1 :  Input Parameter 'CpuCount' not in range, positive number range;passed value 0

It means that while increasing the CPU count, vROPS passed 0 as the new CPU value. For example, the current value is 2 and we, or vROPS, asked for it to be set to 0. That is why it fails.
Solution:

1. Check VMware Tools. It should be running and up to date
2. Check the maximum core capacity of the host CPU, that is, how many cores it can assign to a single VM
3. Check the ESXi host's CPU family and other compatibility. Check this guide


Problem 2 : Automation is not triggering
Solution : There is only one reason for this: automation is not enabled in the policy.

If you run into any issues other than the ones above, feel free to comment. I will try my best to help you out.