With the release of VCF 9, many customers have existing vSphere environments that they would like to manage under VCF Operations in VCF 9. In this blog, we'll go through importing a simple vSphere 8.0 Update 3 environment under VCF Ops.
The existing environment to be imported has the following components:
vCenter 8.0 U3 (running on the same cluster)
2 ESX hosts 8.0 u3
NFS storage (TrueNAS)
Lifecycle image is applied to both hosts
VCF 9.0.1
Single Management Workload Domain
Below are the steps to import this vSphere 8 environment under VCF 9 using VCF Ops.
Following the previous blog, where we uploaded all the bundles needed to upgrade the VCF Management Workload Domain offline, I started running the prechecks for the upgrade and the below was the result:
Since this is a nested test environment, I silenced the precheck alarm.
In this blog, we'll discuss how to use the VCF Offline Bundle Transfer Utility to upload install bundles to SDDC Manager in order to prepare for an upgrade from VCF 5.0 to VCF 5.2. The utility is useful when SDDC Manager doesn't have internet access, which is the case in air-gapped environments.
The following prerequisites are required to run the utility:
A Windows or Linux computer with internet connectivity (either directly or through a proxy) for downloading the bundles
The computer must have Java 8 or later
A Windows or Linux computer with access to the SDDC Manager appliance for uploading the bundles
To upload the manifest file from a Windows computer, you must have OpenSSL installed and configured (in my case, I downloaded the required files with the utility and then copied them to the SDDC Manager appliance using WinSCP, as explained later).
As a first step, I downloaded the Bundle Transfer Utility from the Broadcom Customer Support Portal.
The utility will be used in two locations: on the PC where the manifest and compatibility files are downloaded, and on the SDDC Manager appliance.
So the first step is to extract the OBTU on my PC, which looks like the below:
The second step is to upload it to the SDDC Manager appliance via WinSCP.
I copied it to the location suggested in the official documentation and followed the steps below:
mkdir /opt/vmware/vcf/lcm/lcm-tools
tar -xvf lcm-tools-prod.tar.gz
cd /opt/vmware/vcf/lcm/
chown vcf_lcm:vcf -R lcm-tools
chmod 750 -R lcm-tools
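To confirm the utility landed where expected, a quick check can be run (the path assumes the default layout of the extracted archive):
ls -l /opt/vmware/vcf/lcm/lcm-tools/bin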
One issue I kept hitting when using the utility from my PC was this:
I fixed it by changing the Java 8 installation location to a folder other than the default and updating the JAVA_HOME environment variable on my Windows PC.
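For reference, setting the variable from a Windows command prompt looks like the below; the install path is just an example of a folder without spaces, substitute whatever location you used:
setx JAVA_HOME "C:\Java\jdk1.8.0"
rem open a new command prompt, then verify:
echo %JAVA_HOME%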
We need to download three files from the depot:
Manifest file
Compatibility data
vSAN HCL file
To download the manifest file:
To download the compatibility data file:
To download the vSAN HCL file:
Finally, download the install bundles. I'm upgrading my lab from VCF 5.0 to VCF 5.2, so below are the steps I followed to download the install bundles.
Using this command: “lcm-bundle-transfer-util --download --outputDirectory C:\Hisham\VMware\VCF\5.2 --depotUser xyz@gmail.com --sv 5.0.0.0 --p 5.2.0.0”
I selected All. Notice here that the Avi Controller is now available as part of the VCF 5.2 BOM, since lifecycle management of Avi Controllers from SDDC Manager is a new feature in VCF 5.2; previously it was a manual installation.
Next is to update the manifest file and the compatibility matrix file in the SDDC Manager appliance. I uploaded the three files downloaded in the previous steps to the SDDC Manager appliance.
To update the sourceManifest file, I ran the below command in the SDDC Manager appliance. Before that, I was getting a couple of unexplained failures, and the reason was that the path of the file needs to look something like the below:
I'm not sure of the reason behind this, but it was the same path the file had when I downloaded it on my PC, so I kept that path.
Next, I did the same for the compatibility matrix file:
And when I tried to update the vSAN HCL file I got the below error:
ERROR: Cannot upload vSAN HCL to SDDC Manager versions below 5.1.0.0. This makes sense, as my SDDC Manager is still on VCF 5.0, so the vSAN HCL upload isn't supported until after the upgrade.
Last but not least is to upload the offline bundles downloaded earlier to SDDC Manager.
However, I noticed that when trying to upload all bundles to SDDC Manager at once, only the NSX bundle got uploaded, so instead I uploaded one bundle at a time; I'm not sure of the reason behind this.
As per the below, the offline bundles are uploaded successfully to the SDDC Manager:
On clicking Plan for Upgrade, the below message was shown:
The VMware Cloud Foundation deployment process is referred to as bring-up. You specify deployment information specific to your environment such as networks, hosts, license keys, and other information in the deployment parameter workbook and upload the file to the VMware Cloud Builder appliance to initiate bring-up of the management domain.
During bring-up, the management domain is created on the ESXi hosts specified in the deployment parameter workbook. The VMware Cloud Foundation software components are automatically deployed, configured, and licensed using the information provided. The deployment parameter workbook can be reused to deploy multiple VMware Cloud Foundation instances of the same version.
The following procedure describes how to perform bring-up of the management domain using the deployment parameter workbook. You can also perform bring-up using a custom JSON specification.
In this blog, I’ll prepare VCF 5.0 bringup in a nested ESXi environment.
Software and resources used:
Cloud Builder 5.0.0
ESXi 8.0 update 1a
Memory: 128 GB RAM
Deployment parameters workbook VCF 5
Infrastructure pre-requisites:
I'm using OPNsense as the gateway for the below networks:
Management
vMotion
vSAN
Host Overlay
Forward and reverse DNS records for the following (a quick lookup check is sketched after this list):
Cloud Builder
4 ESXi hosts
1 NSX Manager (I'm deploying only one NSX Manager, so I modified the bring-up JSON file to include a single NSX Manager; more on this below)
NSX manager VIP
SDDC manager
vCenter
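A quick way to confirm the records resolve before starting bring-up (the hostname and IP below are illustrative, substitute your own):
nslookup cloudbuilder.vcf.lab
nslookup 192.168.1.20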
ESXi Preparation
4 ESXi hosts were installed with the below resources:
Storage:
40 GB hard disk for boot
26 GB for Cache
256 GB for capacity (disks 2 and 3 are required for vSAN)
CPU: 20 cores, 10 cores per socket, which ends up with 2 sockets
Memory: 128 GB
One more step here: I checked the 26 GB disk and it didn't show as SSD:
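Since this is a nested lab, the disk can simply be tagged as flash. A rough sketch from the ESXi shell is below; the device identifier is a placeholder, and this assumes the disk is claimed by HPP as is typical for local disks on recent ESXi builds (alternatively, use "Mark as Flash" on the device in the vSphere Client):
# list local devices to find the identifier of the 26 GB disk
esxcli storage core device list
# tag the device as SSD (HPP-claimed local disk)
esxcli storage hpp device set --device=mpx.vmhba0:C0:T1:L0 --mark-device-ssd=true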
Since this is a lab environment, there is no need for three NSX Managers; I'll use only one. To do so, I modified the JSON file generated from the deployment parameter workbook: first convert the Excel file to JSON, then modify the JSON:
/opt/vmware/bringup/scripts/json-generator.sh my-vcf-deploy.xlsx my-vcf-deploy.json vcf-ems (The source file is the deployment parameter excel and the destination is the JSON file)
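As a sanity check after editing, the number of NSX Manager entries in the resulting JSON can be counted. The sketch below assumes jq is available and that the managers sit under an nsxtSpec.nsxtManagers array; the key names are an assumption and may differ between VCF versions, so adjust to match your file:
jq '.nsxtSpec.nsxtManagers | length' my-vcf-deploy.json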
Now, the final step is deploying the Cloud Builder VM and importing the modified JSON file (with one NSX Manager). All management IPs are in the same subnet (192.168.1.0/24) with VLAN 0.
I faced this issue when reviewing the prerequisites page:
Failed to get SSH Key for Host
This was due to the password configured on the ESXi host not matching what was configured in the deployment parameter workbook. Once corrected, the verification passed except for one issue:
I ignored it as this is a lab environment.
I then clicked Deploy SDDC and left the deployment running.
After some time, I faced the below errors:
The log file where I checked the above error was:
/var/log/vmware/vcf/bringup/vcf-bringup-debug.log
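To follow the bring-up progress live, the same log can simply be tailed, and errors can be filtered out of it:
tail -f /var/log/vmware/vcf/bringup/vcf-bringup-debug.log
grep -i error /var/log/vmware/vcf/bringup/vcf-bringup-debug.log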
I checked around and found that from vCenter I couldn't ping three of the hosts; only the host where vCenter is deployed was reachable from vCenter.
Since I had originally cloned three of the ESXi VMs from the first one, I suspected a duplicate UUID issue, so I reset the UUIDs.
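For reference, the commonly used reset on cloned nested ESXi hosts looks roughly like the below, run on each clone (behaviour may vary across ESXi builds):
# remove the duplicated system UUID so a new one is generated on the next boot
sed -i '/\/system\/uuid/d' /etc/vmware/esx.conf
reboot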
After doing this the issue persisted, so I reinstalled the ESXi VMs from scratch, and the error was then bypassed.
Hit this error:
Apparently deploying NSX Manager takes longer than the Cloud Builder wait timeout, so I suspended the Cloud Builder VM and waited until NSX Manager finished loading.
After NSX Manager loaded, I resumed the Cloud Builder VM.
Starting with the NSX 3.2.2 release, the Sub-Transport Node Profile (Sub-TNP) within a Transport Node Profile was introduced to support stretched L2/L3 clusters and stretched vSAN clusters.
A Sub-TNP is essentially a template applied to sub-clusters so they can be prepared for NSX with a configuration different from the one defined in the parent TNP. One use case is ESXi hosts in the same cluster located in different sites, where the host TEP network is not available/stretched across the two sites; in that case a Sub-TNP can be applied to the sub-cluster of the second site with a different host TEP configuration, meaning a different IP pool and a different VLAN, without affecting overlay segment communication across the stretched vSphere cluster.
A Sub-TNP can only override the following fields of a host switch: VDS host switch ID, uplink profile, and IP assignment.
In this blog we'll demonstrate that. I have a vSphere cluster with two ESXi hosts, esxi03 and esxi04. Assuming esxi04 is in a different site and the host TEP subnet used for esxi03 is not available at that site, we'll create two different uplink profiles and two different IP pools.
vSphere Cluster:
Uplink Profiles:
The first uplink profile will be referred to in the Transport Node Profile used for esxi03:
The second uplink profile will be referred to in the Sub-TNP used to prepare esxi04:
IP Pools
The first IP Pool will be referred to in the Transport Node Profile used for esxi03:
The second IP Pool will be referred to in the Sub-TNP used for esxi04:
We’ll create a Sub-Cluster and add to it esxi04:
You’ll notice here that the sub-cluster was added for esxi04
Transport Node Profile:
Notice here the uplink profile and host TEP IP pool used are for esxi03
However in the Sub-TNP the below configuration was used:
Next step, is to prepare the cluster using the TNP and the Sub-TNP:
Notice here that we are applying the Sub-TNP to the sub-cluster.
Let’s check the host TEPs assigned to esxi03
Let’s check the host TEPs assigned to esxi04
Testing TEP to TEP reachability
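For reference, TEP-to-TEP reachability is usually verified from the ESXi shell with vmkping over the vxlan netstack; the vmkernel interface name and remote TEP IP below are placeholders:
vmkping ++netstack=vxlan -I vmk10 -d -s 1572 172.16.20.12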
To conclude, it's a nice feature to have in case of underlay networking restrictions.
The NSX Advanced Load Balancer supports improved and more flexible methods for upgrading the system. The Controller supports all the upgrade workflows through the UI.
The first step is to download the target version. Currently I have NSX ALB version 21.1.5 and I'm upgrading to 22.1.3.
Figure 1 Target Version
Then, log in to the existing NSX ALB Controller and upload the .pkg file: go to Administration –> Controller –> Software, then upload the .pkg file from your computer.
Figure 2 Uploading .pkg file
The upload will take some time to complete.
Figure 3 Upload In Progress
Figure 4 Upload Completes
Now go to Administration–>System Update–>Upgrade
Figure 5 Upgrade Process
The next window shows the pre-upgrade checks as well as the option to upgrade the Service Engines or keep them at their current version; in my lab I'm upgrading everything.
Figure 6 Upgrade Prechecks
The Controller will take some time to do the final checks
Figure 7 Upgrade final checks
The Controller will not be accessible during the upgrade process
Figure 8 Controller temporarily unavailable
Finally, after 5-10 minutes the Controller was back up and the Service Engine groups were upgraded successfully.
NSX supports static routing as well as dynamic routing protocols to provide connectivity between workloads hosted in the vSphere environment and the outside world. With dynamic routing, neighbor failure or next-hop reachability is detected through keepalives; for example, OSPF uses hello and dead intervals, while BGP uses keepalive messages. BFD can be used alongside dynamic routing protocols to support faster failover times.
Static routing can be configured on a T0 gateway, toward external subnets with a next hop of the physical upstream device. To protect the static routes, BFD is highly recommended to detect the failure of the upstream device.
The timers depend on the edge node type: edge VMs support a minimum TX/RX interval of 500 ms, and bare-metal edges support a minimum of 50 ms.
In this blog, we'll configure a T0 with static routing and BFD and run through some failover scenarios while testing N/S reachability. The below diagram is the architecture we'll reference in this blog post.
Figure 1 Edge node Design:
Figure 2 T0 logical design:
This is a snippet from the configuration of the upstream device:
The interface GigabitEthernet3 configuration:
This BFD configuration matches the BFD configuration used in defining a BFD peer under T0 shown later.
The static routes configured with BFD towards the edge nodes 1 and 2:
And finally the routing table which shows ECMP configuration of the static route pointing to the overlay segment in NSX:
From NSX, the below configuration was made:
The default BFD Profile was used, which has the same timers as the ones configured in Cisco:
And finally, a default route was configured on the T0 gateway pointing towards the physical upstream device:
From the edge CLI, let's validate the routing table of each edge node:
Edge node 1:
Edge Node 2:
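For reference, the routing table on an edge node can be checked from the admin CLI along these lines (the VRF ID is environment-specific):
get logical-routers
vrf 1
get route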
Let’s validate that BFD is up before testing failover scenarios:
From Physical upstream device:
From each edge node:
Edge Node 1:
Edge Node 2:
Testing N/S reachability before failover
I've created a loopback on the physical upstream device with IP 1.1.1.1/32; to test connectivity, we pinged it from a test VM on the overlay segment in NSX:
To test failover, I’ll change the IP of the uplink interface of Edge node 1 to 10.0.0.3 instead of 10.0.0.2
From the upstream device, the routing table has changed:
Let’s examine the routing table from each edge node:
Edge Node 1:
Edge Node 2:
Notice here that edge node 2 learned the default route via inter-SR routing; the reason is explained in the reference design guide, quoted below:
When Inter-SR routing is enabled by the user, an overlay segment is auto plumbed between SRs (similar to the transit segment auto plumbed between DR and SR) and each end gets an IP address assigned in 169.254.0.128/25 subnet by default. An IBGP session is automatically created between Tier-0 SRs and northbound routes (EBGP and static routes) are exchanged on this IBGP session
After correcting the uplink IP of Edge Node 2, you can see the default route in the routing table pointing to the physical device
In this final blog on NSX Federation, we'll discuss the third configuration option for T0, which is "Active/Active Location All Primary". This option is for T0s without services; VMs egress from their own location, sending traffic to their local edge nodes, which supports local egress. Below is the route exchange between a T0 in A/A All Primary mode and the upstream device:
In this blog, however, we will also go with a T1 with custom span and monitor the traffic flows.
Looking back at the previous topology, the mode of the T0 is changed and three T1s are added instead: two T1s are location-specific with only one primary location each, and a third T1 has a custom span, with HQ as primary and DR as secondary.
First Change the mode of the T0 to “Mark All locations as Primary”:
Next, let's create a default route in each location pointing to the corresponding CSR:
Let’s verify from edge nodes:
HQ:
DR:
Let's now create the three T1s, according to the diagram above.
T1-HQ with HQ only as the primary location:
T1-DR with DR only as the primary location:
Finally Create the last T1 with HQ as Primary and DR as Secondary:
After Creating the segments and connecting them to their respective T1s accordingly:
Let’s verify the span of each T1 created from Edge nodes in HQ and DR:
Edge Node in HQ:
As you can see, T1-HQ and T1-HQ-DR exist here because their spans include HQ (T1-HQ spans HQ only, while T1-HQ-DR spans both HQ and DR).
Edge Node in DR:
As you can see, T1-DR and T1-HQ-DR exist here because their spans include DR (T1-DR spans DR only, while T1-HQ-DR spans both HQ and DR).
Let’s verify from the edge nodes the forwarding table of each T1 SR component:
T1 HQ:
You can see that this points to the T0 DR. So if a VM in HQ connected to a segment behind T1-HQ communicates south-north, the traffic flow will be: T1-HQ DR –> T1-HQ SR –> T0 Stretched DR –> T0 Stretched SR –> Upstream Device
For the T1-HQ-DR SR, we will verify the forwarding table from the edge node in HQ and the edge node in DR. HQ edge node:
So if a VM in HQ connected to a segment behind T1-HQ-DR communicates south-north, the traffic flow will be:
T1-HQ-DR DR –>T1-HQ-DR SR–>T0 Stretched DR–>T0 Stretched SR–>Upstream Device
DR edge node:
Notice here that the default route on the T1-HQ-DR SR component is learned from 169.254.32.2, which comes from the Intersite Transit Subnet field. This subnet is used for cross-location communication between gateway components.
So if a VM in DR connected to a segment behind T1-HQ-DR communicates south-north, the traffic flow will be:
Let's check the forwarding table of T1-DR:
So if a VM in DR connected to a segment behind T1-DR communicates south-north, the traffic flow will be:
T1-DR DR –>T1-DR SR–> T0 Stretched DR–>T0 Stretched SR–>Upstream Device
Finally, let’s check the routing table of T0 stretched SR component in each site:
HQ:
The first two routes are advertised from the two T1s whose spans include HQ (T1-HQ and T1-HQ-DR); the last highlighted route is learned via iBGP (RTEP) between the two sites.
DR:
The first two routes are learned via iBGP (RTEP), and the last one is learned from T1-DR, which spans DR only.
Quoting from the multi-location design guide, this is how the traffic flows from each site:
Let's customize this in our topology instead.
Let’s create the below static routes:
HQ-CSR:
DR-CSR:
After redistributing into BGP, let's do a traceroute from test VMs in each of the three segments to a loopback IP on the branch router.
Traceroute from a VM in HQ connected to a segment connected to T1-HQ-DR:
Traceroute from a VM in HQ connected to a segment connected to T1-HQ:
Traceroute from a VM in DR connected to a segment connected to T1-DR:
Traceroute from a VM in DR connected to a segment connected to T1-HQ-DR:
Notice here that the VM in the DR site crossed the intersite link to egress from HQ. This means any VM connected to a segment behind a T1 with custom span will always egress from the site listed as primary on that T1.
In this blog, we'll discuss how changing the T0 mode to Active/Active instead of Active/Standby impacts the physical network.
Let's examine the routing table of the Site B CSR:
Notice that the edge in the secondary site is now advertising the T1 DR segments, unlike when the T0 mode was Primary/Secondary (Active/Standby).
Now let's check from the edge side:
Now the edge in the secondary site is advertising T1 DR segments
Cross-checking with the multi-location design guide:
Since the secondaries are now advertising the connected routes, the branch router might prefer the secondary site routes. Bear in mind that in Primary/Secondary mode, egress is always from the primary site, so this might result in asymmetric traffic, which will get dropped.
So first let’s validate/test two things:
1-The egress of a VM in secondary site
2-The routing table of the branch router and how it reaches the VM in secondary site
First, to examine how a VM in the secondary site egresses, let's check the routing table (and the internal VRF) of the secondary edge and do a traceroute:
From the above, it's clear that the secondary edge prefers the default route learned from the primary edge node (via iBGP) due to the higher local preference.
Let’s check the branch router routing table:
So here it clearly prefers the Site B CSR.
In this case, traffic from the VM in the secondary site egresses through the primary site, while ingress is through the secondary site.
To address this, let's make the edges in the primary location preferred by using AS-path prepend (making the routes advertised from the secondary site less preferred) and check the routing tables again.
Create a route map with AS-path prepend:
Apply the route-map to the secondary site neighbor:
Let’s verify the advertised routes from the secondary edge node:
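For reference, this can be checked from the Tier-0 SR VRF on the secondary edge node along the lines of the below; the neighbor IP is a placeholder for the branch/CSR peer:
get bgp neighbor 192.168.20.1 advertised-routes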
Cross check it with the branch router routing table: