Azure Planning
This guide helps you understand Azure-specific requirements and make key planning decisions for your HyperShift deployment.
Target Audience
This guide is designed for OpenShift administrators and SREs who are new to Azure. We assume you have:
- Strong knowledge of OpenShift/Kubernetes operations
- Limited or no experience with Azure infrastructure
- Understanding of networking, storage, and identity management concepts in Kubernetes
Prerequisites
Azure Requirements
If you're new to Azure but experienced with OpenShift/Kubernetes, this section explains Azure-specific concepts you'll need.
Azure Subscription & Your Permissions
What is an Azure Subscription?
An Azure subscription is the billing and access boundary for Azure resources. Think of it as the top-level container where all your Azure resources live and are billed.
Your Azure account needs these permissions:
| Permission | Why You Need It | OpenShift/K8s Equivalent |
|---|---|---|
| Contributor | Create and manage Azure resources (VMs, networks, storage) | cluster-admin for infrastructure |
| User Access Administrator | Assign roles to managed identities | Permission to create RoleBindings |
| Application.ReadWrite.OwnedBy (Microsoft Graph API) | Create service principals (External DNS, cluster operations) and query Graph API to assign RBAC roles to managed identities | ServiceAccount creation + role binding permissions |
How to verify your permissions:
# Check your subscription roles
az role assignment list --assignee $(az ad signed-in-user show --query id -o tsv) \
--scope /subscriptions/$(az account show --query id -o tsv)
# Look for "Contributor" and "User Access Administrator" in the output
Why both Contributor and User Access Administrator?
- Contributor lets you create resources but NOT assign permissions to them
- User Access Administrator lets you assign permissions (roles) to managed identities
- HyperShift needs both: create resources AND grant them permissions to work together
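The manual check above can also be scripted. The following is a minimal sketch, assuming role names arrive one per line on standard input; the function name missing_required_roles is hypothetical and is not part of the Azure CLI or HyperShift.

```shell
# Hypothetical helper: reads role names (one per line) on stdin and prints
# whichever of the two required subscription roles are absent.
missing_required_roles() {
  local assigned role
  assigned=$(cat)
  for role in "Contributor" "User Access Administrator"; do
    # -x matches the whole line, -F treats the role name as a literal string
    if ! printf '%s\n' "$assigned" | grep -qxF "$role"; then
      echo "$role"
    fi
  done
}

# Example wiring (requires a logged-in Azure CLI session):
#   az role assignment list \
#     --assignee "$(az ad signed-in-user show --query id -o tsv)" \
#     --scope "/subscriptions/$(az account show --query id -o tsv)" \
#     --query "[].roleDefinitionName" -o tsv | missing_required_roles
```

Empty output means both roles were found. The sketch matches role names literally, so an account that holds Owner, or that receives its roles through a group or a parent scope, may be reported as missing roles even though it has equivalent rights. In those cases, the --include-groups and --include-inherited flags of az role assignment list are worth trying.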
Azure Resource Groups
What is a Resource Group?
Similar to a Kubernetes namespace, but for Azure resources. Every Azure resource must belong to exactly one resource group. A resource group is a logical container for managing lifecycle and access control.
You'll need TWO types:
- Persistent Resource Group (e.g., hypershift-shared)
    - Lifecycle: Long-lived, survives across multiple cluster deployments
    - Contains: OIDC issuer (storage account), managed identities, federated credentials, DNS zones (optional)
    - Important: Never delete this - it contains shared infrastructure
- Per-Cluster Resource Groups (created automatically by tasks)
    - Lifecycle: Created and deleted with each hosted cluster
    - Contains: VNet, NSG, subnets, worker node VMs
    - Safe to delete when removing a cluster (using task cluster:destroy)
Resource Group Analogy
Think of the persistent resource group like a shared namespace for operators and the per-cluster groups like workload-specific namespaces that come and go.
Optional: Azure DNS Zone
What is it?
Azure DNS is Microsoft's DNS hosting service. A DNS zone hosts the DNS records for a specific domain (like example.com).
When do you need it?
- Only if using External DNS for automatic DNS management (recommended for production)
- Not needed for development/testing - you can use Azure LoadBalancer DNS instead (e.g., abc123.eastus.cloudapp.azure.com)
What is "DNS zone delegation"?
When you create a hosted cluster with domain my-cluster.azure.example.com:
- The parent zone (azure.example.com) must already exist in Azure DNS
- HyperShift creates a child zone (my-cluster.azure.example.com) automatically
- "Delegation" means the parent zone's NS records point to the child zone's nameservers
- This allows the child zone to manage its own DNS records independently
DNS Delegation Example
If your parent zone is azure.example.com in Azure DNS:
- HyperShift creates child zone: my-cluster.azure.example.com
- Parent zone gets NS record: my-cluster.azure.example.com → [child-ns1, child-ns2, ...]
- Cluster API becomes: api.my-cluster.azure.example.com
- Apps become: *.apps.my-cluster.azure.example.com
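The naming scheme in this example can be sketched as a small shell function. This is only an illustration of how the names compose; the function name hosted_cluster_dns_names is hypothetical and does not come from HyperShift.

```shell
# Hypothetical helper: prints the DNS names a hosted cluster would use,
# given a cluster name and an existing parent zone.
hosted_cluster_dns_names() {
  local cluster=$1 parent_zone=$2 child_zone
  child_zone="${cluster}.${parent_zone}"
  echo "child_zone=${child_zone}"
  echo "api=api.${child_zone}"
  echo "apps=*.apps.${child_zone}"
}

hosted_cluster_dns_names my-cluster azure.example.com
```

Once a cluster exists, a query such as dig +short NS my-cluster.azure.example.com should return the child zone's nameservers if the delegation is in place.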
Required Tools
Install these tools on your workstation:
| Tool | Purpose | Installation |
|---|---|---|
| Azure CLI (az) | Interact with Azure APIs | Install guide |
| jq | Parse JSON output | brew install jq or apt install jq |
| ccoctl | Cloud Credential Operator utility for Azure identity setup | Bundled with OpenShift installer or download |
| task | Task runner for automation scripts | Install guide |
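A quick pre-flight check can confirm that these tools are on your PATH before you start. The sketch below is an assumption about how you might do that; missing_tools is a hypothetical helper name, and it only checks presence, not versions.

```shell
# Hypothetical pre-flight helper: prints each named tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# The four workstation tools from the table above
missing=$(missing_tools az jq ccoctl task)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line
  echo "Not installed:" $missing
else
  echo "All required tools found"
fi
```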
Management Cluster Requirements
Cluster Prerequisites
- An existing OpenShift cluster running in Azure
- Cluster must have:
    - Sufficient capacity for hosting control plane pods
    - Network connectivity to Azure APIs
    - Ability to create LoadBalancer services (if not using External DNS)
- Administrative access (cluster-admin permissions) to the management cluster
Tools
- OpenShift CLI (oc)
- HyperShift CLI binary (download)
Required Configuration Files
The deployment uses two kinds of files: those you must prepare yourself and those the automation generates for you.
Files You Must Create
These files must be prepared before starting the deployment:
| File | Purpose | When Needed | How to Get |
|---|---|---|---|
| azure-credentials.json | Service principal for cluster operations (create/destroy VMs, networks) | Before creating first cluster | See detailed instructions below |
| pull-secret.json | Red Hat pull secret for downloading OpenShift images | During operator installation | Download from cloud.redhat.com |
Files Generated During Deployment
These files are created automatically by the Taskfile automation and should not be manually edited:
| File | Purpose | Created By | Used By |
|---|---|---|---|
| workload-identities.json | Managed identity client IDs for cluster components (storage, networking, etc.) | task azure:identities | task cluster:create |
| serviceaccount-signer.private, serviceaccount-signer.public | RSA key pair for signing service account tokens | task prereq:keys | task cluster:create |
| .azure-net-ids | Network resource IDs (VNet, NSG, subnet) exported as environment variables | task azure:infra | task cluster:create |
File Lifecycle
- You create: azure-credentials.json, pull-secret.json
- Automation creates: All other files during setup
- Don't edit: Generated files - they're overwritten on each run
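Before running any tasks, it can help to confirm that the two hand-made files are actually in place. The sketch below assumes both files live in the same directory (the current one by default); check_required_files is a hypothetical helper, not part of the Taskfile automation.

```shell
# Hypothetical pre-flight helper: confirms the two hand-made files exist
# and are non-empty in the given directory (default: current directory).
check_required_files() {
  local dir=${1:-.} file rc=0
  for file in azure-credentials.json pull-secret.json; do
    if [ ! -s "$dir/$file" ]; then
      echo "missing or empty: $dir/$file" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage: check_required_files /path/to/working/dir && echo "ready"
```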
Azure Credentials File Format
The azure-credentials.json file should contain service principal credentials:
{
"subscriptionId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"tenantId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"clientId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"clientSecret": "your-service-principal-secret"
}
How to create:
# Create service principal with required permissions
SP_DETAILS=$(az ad sp create-for-rbac \
--name "hypershift-cluster-ops" \
--role Contributor \
--scopes "/subscriptions/$(az account show --query id -o tsv)")
# Extract values and create file
cat <<EOF > azure-credentials.json
{
"subscriptionId": "$(az account show --query id -o tsv)",
"tenantId": "$(az account show --query tenantId -o tsv)",
"clientId": "$(echo "$SP_DETAILS" | jq -r '.appId')",
"clientSecret": "$(echo "$SP_DETAILS" | jq -r '.password')"
}
EOF
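If the service principal creation fails, the heredoc above still writes a file, but with empty or "null" values. A rough sanity check, sketched below, can catch that early. It is a text match rather than a real JSON parser, and check_credentials_file is a hypothetical helper name.

```shell
# Hypothetical helper: rough text check (not a JSON parser) that each
# expected key has a non-empty string value other than "null".
check_credentials_file() {
  local file=$1 key value rc=0
  for key in subscriptionId tenantId clientId clientSecret; do
    # Extract the quoted value that follows "key": on the same line
    value=$(sed -n "s/.*\"$key\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file")
    if [ -z "$value" ] || [ "$value" = "null" ]; then
      echo "missing or empty: $key" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage: check_credentials_file azure-credentials.json && echo "looks complete"
```

Because the file holds a client secret, restricting its permissions (for example with chmod 600 azure-credentials.json) is a sensible extra step.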
Key Planning Decisions
DNS Management Strategy
Choose how to handle DNS for your hosted clusters:
Option 1: With External DNS (Recommended for Production)
Best For: Production environments, multiple clusters, custom domains
Characteristics:
- Automatic DNS record management via External DNS operator
- Custom domain names (e.g., api.my-cluster.example.com)
- Requires: DNS zones, service principal with DNS permissions, External DNS operator
- Higher initial setup complexity, but simpler ongoing operations
Example DNS:
- API Server: api.my-cluster.azure.example.com
- Apps: *.apps.my-cluster.azure.example.com
Decision Criteria:
- Choose this if you need custom, branded domain names
- Choose this if you plan to manage multiple clusters
- Choose this if you want fully automated DNS provisioning
Option 2: Without External DNS (Simpler for Dev/Test)
Best For: Development, testing, proof-of-concept environments
Characteristics:
- Manual DNS management or Azure-provided LoadBalancer DNS
- API server uses Azure LoadBalancer DNS (e.g., abc123.eastus.cloudapp.azure.com)
- No DNS zones or DNS service principal needed
- Lower initial setup complexity, but requires manual DNS work for production use
Example DNS:
- API Server: my-cluster-api.eastus.cloudapp.azure.com
- Apps: my-cluster-apps.eastus.cloudapp.azure.com
Decision Criteria:
- Choose this for quick testing or POC environments
- Choose this if you don't control your DNS infrastructure
- Choose this if you only need a single cluster temporarily
Recommendation
Start with Option 2 (Without External DNS) for your first cluster to learn the basics. Once comfortable, you can deploy production clusters with Option 1 (External DNS).
Resource Group Strategy
Plan your resource group structure to separate long-lived shared resources from cluster-specific resources:
Persistent Resource Group
- Name example: hypershift-shared, openshift-common
- Lifecycle: Long-lived, not deleted when clusters are removed
- Contains:
    - OIDC issuer (storage account)
    - Managed identities (per cluster, but persistent)
    - Federated identity credentials (per cluster, but persistent)
    - (Optional) DNS zones
Per-Cluster Resource Groups
- Name pattern: <prefix>-vnet-rg, <prefix>-nsg-rg, <prefix>-managed-rg
- Lifecycle: Created and deleted with each cluster
- Contains:
    - Virtual Network (VNet)
    - Network Security Group (NSG)
    - Subnets
    - Virtual Machines (worker nodes)
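The name pattern above expands mechanically from the cluster prefix. The sketch below shows that expansion. per_cluster_resource_groups is a hypothetical helper, and the actual names are determined by the Taskfile automation.

```shell
# Hypothetical helper: expands a cluster prefix into the three per-cluster
# resource group names following the pattern above.
per_cluster_resource_groups() {
  local prefix=$1 suffix
  for suffix in vnet-rg nsg-rg managed-rg; do
    echo "${prefix}-${suffix}"
  done
}

per_cluster_resource_groups my-cluster
```

After removing a cluster, passing each of these names to az group exists --name is one way to confirm that the groups are gone.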
Understanding Resource Persistence
Managed identities and federated credentials are per-cluster but persistent. They:
- Are created once per cluster
- Remain after cluster deletion
- Allow cluster recreation without identity recreation
- Preserve Azure IAM role assignments
- Support disaster recovery scenarios
Network infrastructure is temporary and deleted with the cluster.
Network Planning
VNet and Subnet Design
- Default: /16 VNet with /24 subnet
- Plan for sufficient IP addresses for worker nodes
- Consider network security group (NSG) rules for your environment
- Ensure network connectivity to Azure APIs and management cluster
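When sizing a subnet for worker nodes, keep in mind that Azure reserves five addresses in every subnet. The sketch below works out the usable count for a given prefix length; azure_usable_ips is a hypothetical helper name.

```shell
# Usable addresses in an Azure subnet: 2^(32 - prefix) minus the 5 addresses
# Azure reserves in every subnet.
azure_usable_ips() {
  local prefix=$1
  echo $(( (1 << (32 - prefix)) - 5 ))
}

azure_usable_ips 24   # default subnet size: prints 251
```

The default /24 subnet therefore leaves 251 addresses for worker nodes and other resources placed in the subnet, and a /16 VNet has room for 256 subnets of that size.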
Network Security
- NSG rules are created automatically
- Review and customize based on your security requirements
- Consider private endpoint requirements for production
Next Steps
Now that you've planned your Azure infrastructure:
- Workflow & Repository Setup - Understand how this repository works and how to configure it
- Azure Foundation Setup - Set up Azure infrastructure using Taskfile automation