Before you begin deploying
IBM® Spectrum LSF allows users to deploy HPC clusters with LSF as the scheduling software, leveraging Terraform and IBM Cloud Schematics for automation.
The IBM Spectrum LSF solution does not support bare metal deployments. All deployments are based on virtual server instances (VSIs), so make sure to provide valid VSI instance profiles.
Confirm your IBM Cloud® settings
Complete the following steps before you deploy the IBM® Spectrum LSF deployable architecture.
- Confirm that you have an IBM Cloud Pay-As-You-Go or Subscription account. If you have a Trial or Lite account, upgrade your account.
- Log in to your IBM Cloud account with your IBM ID.
Setting IAM permissions - CLI
Before you deploy an IBM Spectrum LSF cluster, specific IAM permissions must be assigned to either a user or an access group. An automation script handles this process.
You can run the script to grant the IAM permissions that the LSF deployment requires. If the user already has some of these permissions, the script skips them and adds only the missing ones.
For example, the App configuration service requires the Administrator and Manager roles. If the user already has the Administrator role, the script skips it and assigns only the Manager role.
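The skip-and-add behavior described above amounts to a set difference between the required roles and the roles that are already assigned. The following is a minimal sketch of that idea only; it is not the logic of permissions.sh, and the function name missing_roles and its arguments are hypothetical.

```shell
# Hypothetical helper (not part of permissions.sh): print each required role
# that the user does not already hold.
# $1: space-separated required roles
# $2: space-separated roles that the user already has
# Assumption: role names contain no spaces, so multi-word roles such as
# "Service Configuration Reader" would need a different separator.
missing_roles() {
  local required="$1" existing="$2" role
  for role in $required; do
    case " $existing " in
      *" $role "*) ;;                 # already granted, skip it
      *) printf '%s\n' "$role" ;;     # not yet granted, must be added
    esac
  done
}

# App configuration example from the text: Administrator exists, Manager does not.
missing_roles "Administrator Manager" "Administrator"   # prints: Manager
```

Only the roles that this function prints would need a new policy, which mirrors how the script avoids duplicate assignments.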
As an admin, you need to have the following permissions to perform the deployment:
- Administrator for All Identity and Access enabled services
- Administrator for IAM Identity Service
- Administrator for All Account Management services
Benefits of the scripts:
- Interactive input collection - The script prompts for the IBMid (admin email), Account ID, and target (User or Access Group).
- Permission check - The script verifies that the admin has account-level Administrator rights, which are required to assign policies.
- Assigns required permissions for LSF deployment - This script grants the appropriate permissions across IBM Cloud services that LSF depends upon (for example, VPC, COS, DNS services, KMS, Secrets Manager, and Sysdig Monitoring).
- Avoids duplicates - The script skips the assignment if a matching policy already exists.
You can get the scripts by cloning the main branch of the repository:
git clone -b main https://github.com/terraform-ibm-modules/terraform-ibm-hpc.git
- From the cloned repository, navigate to the tools/access-management directory, which contains the permissions.sh file:
  cd tools/access-management
- Log in to IBM Cloud with your API key, then make the script executable and run it:
  a. ibmcloud login --apikey <YOUR_API_KEY> -g <RESOURCE_GROUP>
  b. chmod +x permissions.sh
  c. ./permissions.sh
- Enter the admin email or IBMid.
- Enter the Account ID.
  To find the Account ID, log in to your IBM Cloud account and go to Manage > Account > Account settings.
- When you are asked where to assign the roles, select the required option:
  - Access Group - Select this option if you want to assign the access to an entire access group.
  - User - Select this option if you want to assign the access to an individual user.
- If you selected the User option (option 2), enter the target user email.
- The script confirms that the user policy is successfully created.
If you do not enter the ACCOUNT_ID, the script displays the following error message:
:x: ACCOUNT_ID is required.
This script ensures the user or access group has all the required IAM permissions to successfully deploy an LSF environment.
Setting IAM permissions - UI
IBM Cloud® Identity and Access Management (IAM) access policies are required to install this deployable architecture and provision clusters.
To view access policies, complete the following steps:
- In the IBM Cloud console, select Manage > Access (IAM).
- In the IAM navigation menu, select Users and then select the account user.
- Select Access to view the associated access policies and access groups. See the following table for the permissions that you need for this deployable architecture:
Table: Verify access policies

| Service | Resources | Platform roles | Service roles |
|---|---|---|---|
| App configuration | All | Administrator | Manager |
| All Identity and Access enabled services | All | Administrator | Manager |
| Cloud Object Storage | All | Service Configuration Reader | Writer |
| DNS Services | All | Editor | Manager |
| IAM Identity Service | All | Administrator | -- |
| Cloud Monitoring | All | Administrator | Manager |
| Key Protect | All | Service Configuration Reader | Manager |
| Secrets Manager | All | Administrator | Manager |
| Security and Compliance Center Workload Protection | All | Administrator | -- |
| VPC Infrastructure Services | All | Editor | -- |

These permissions are mandatory. Without them, the deployment fails. Contact the account administrator to obtain the permissions.
Gather LSF entitlement information
The offering uses Bring Your Own Licenses (BYOL) for Spectrum LSF when you deploy an LSF cluster on IBM Cloud. For production clusters, work with your business owners or license management team to make sure that your organization has procured enough licenses to deploy the HPC cluster by using IBM Spectrum LSF. Failure to comply with licenses for production use of software is a violation of the IBM International Program License Agreement.
The current solution no longer requires ibm_customer_number (ICN) for an entitlement check before deploying the solution for non-production use; it is available without ICN validation. You can provision up to a maximum of 10 static worker nodes for evaluation or non-production use cases. If the number of worker nodes exceeds 10, it is your responsibility to obtain the necessary entitlement and licensing for the additional nodes in the production environment. For production use, or for evaluating more than 10 worker nodes, you must purchase the necessary LSF licenses. To purchase licenses, go to Purchasing licenses.
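As a quick pre-flight check, the 10-node evaluation limit can be expressed as a simple comparison. This is a sketch under stated assumptions: the function name exceeds_eval_limit and the variable EVAL_MAX_STATIC_WORKERS are hypothetical and are not part of the deployable architecture.

```shell
# Hypothetical pre-flight check: does a planned static worker count exceed the
# 10-node limit that applies to evaluation or non-production use?
# Note: production use requires purchased licenses regardless of the node count.
EVAL_MAX_STATIC_WORKERS=10

exceeds_eval_limit() {
  # $1: planned number of static worker nodes
  if [ "$1" -gt "$EVAL_MAX_STATIC_WORKERS" ]; then
    echo "yes"   # more than 10 nodes: LSF licenses must be purchased
  else
    echo "no"    # 10 or fewer nodes: within the evaluation limit
  fi
}

exceeds_eval_limit 4    # prints: no
exceeds_eval_limit 12   # prints: yes
```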
Before you begin
Before you can deploy your Spectrum LSF cluster, you need to create or gather some information. To get started, complete the following steps.
Create an IBM Cloud API key
Verify that you have an IBM Cloud API key. Provide it as the value of the ibmcloud_api_key variable. For more information, see Creating an API key.
Create an SSH key
Make sure that you have an SSH key that you can use for authentication and that it is uploaded to IBM Cloud VPC. The IBM® Spectrum LSF deployable architecture supports either RSA or Ed25519 key types. This key is used to log in to all VSIs that you create. Make sure that you use the same key type across an LSF cluster (for example, deploy management and compute nodes with the same key). Provide the key as the value of the ssh_keys variable. For more information about creating SSH keys, see SSH keys.
Generate the remote IP to access Spectrum LSF cluster
This mandatory value is configured through the Catalog tile and must be a valid IP address or CIDR range that is allowed to access the LSF cluster. Provide it as the value of the remote_allowed_ips variable.
If this field is left empty (for example, [""]) or is not provided, the cluster deployment fails during the initial setup phase. Supply a valid entry to deploy successfully.
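Because an empty or malformed entry causes the deployment to fail, it can help to check each entry locally before you submit it. The following is a minimal sketch that assumes the entries are IPv4 addresses or IPv4 CIDR blocks; the function name is_valid_ipv4_or_cidr is hypothetical, and the validation that the solution itself performs may differ.

```shell
# Hypothetical local check: accept an IPv4 address or an IPv4 CIDR block and
# reject anything else, including an empty string.
is_valid_ipv4_or_cidr() {
  local entry="$1" addr prefix octet
  # Shape check: four dot-separated numbers, optionally followed by /prefix.
  printf '%s\n' "$entry" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}(/[0-9]{1,2})?$' || return 1
  addr="${entry%%/*}"
  # Range check: each octet must be between 0 and 255.
  for octet in $(printf '%s' "$addr" | tr '.' ' '); do
    [ "$octet" -le 255 ] || return 1
  done
  # Range check: the prefix length, when present, must be between 0 and 32.
  case "$entry" in
    */*) prefix="${entry#*/}"; [ "$prefix" -le 32 ] || return 1 ;;
  esac
  return 0
}

# Placeholder entries from documentation address ranges, plus two invalid ones.
for entry in "203.0.113.7" "198.51.100.0/24" "" "10.0.0.300"; do
  if is_valid_ipv4_or_cidr "$entry"; then
    echo "ok: '$entry'"
  else
    echo "rejected: '$entry'"
  fi
done
```

The first two entries are accepted, and the empty string and the out-of-range address are rejected.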
For more information on mandatory and optional deployment values, see the Deployment values topic.
Support for lsf_version
IBM Spectrum LSF currently supports both Fix Pack 14 (FP14) and Fix Pack 15 (FP15). Select the fix pack with the lsf_version variable. By default, the IBM Spectrum LSF solution now ships with Fix Pack 15 (FP15) to provide users with the most up-to-date features and support. For more information, see Fix Pack 15.
Application center password
For both FP14 and FP15, Application Center is enabled by default to support job submission, workflow management, and monitoring. To access the GUI, a valid password must be provided. If an appropriate password is not specified, the deployment fails. Provide the password as the value of the app_center_gui_password variable.
Enabling optional values
IBM Spectrum LSF also provides some optional or advanced features such as Observability, Monitoring, Cloud Logs, SCC integration, Hyperthreading, Existing Bastion Support, KMS and more.
If you want to enable and configure any of these features for your cluster, make sure to update the corresponding values accordingly. Note that certain features might be enabled by default.
Also, make sure that the necessary IAM permissions are in place when you enable these features. The required IAM permissions are listed in the Verify access policies table in the earlier section.
Next steps
After you create or gather your information and review any additional prerequisites for your interface of choice, you are ready to begin Deploying IBM Spectrum LSF.
Before you start the actual deployment, analyze the capacity that is required in terms of vCPU and memory, so that the deployment does not fail due to capacity concerns.
Select the method for accessing the cluster (Post deployment)
The remote_allowed_ips value must list the IP addresses of the systems that are allowed to access the bastion node. All cluster nodes except dynamic nodes can be accessed directly through the bastion node.
See the following example SSH command syntax for accessing different types of nodes:
- Deployer node:
  ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -J ubuntu@<replace this with your bastion_node IP address> vpcuser@<replace this with your deployer_node IP address>
- Login node:
  ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -J ubuntu@<replace this with your bastion_node IP address> lsfadmin@<replace this with your login_node IP address>
- Management node:
  ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -J ubuntu@<replace this with your bastion_node IP address> lsfadmin@<replace this with your management_node IP address>
- Static compute node:
  ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -J ubuntu@<replace this with your bastion_node IP address> lsfadmin@<replace this with your static_compute_node IP address>
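The four commands above differ only in the target user and the target IP address, so they can be generated from one template. The following sketch prints the command instead of running it; the function name lsf_ssh_cmd is hypothetical, and the IP addresses are placeholders.

```shell
# Hypothetical helper: print the SSH command for a cluster node that is reached
# through the bastion node as a jump host (-J).
# $1: bastion node IP address
# $2: login user on the target node (vpcuser for the deployer, lsfadmin otherwise)
# $3: target node IP address
lsf_ssh_cmd() {
  printf 'ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -J ubuntu@%s %s@%s\n' "$1" "$2" "$3"
}

# Placeholder addresses only; replace them with your own bastion and node IPs.
lsf_ssh_cmd 203.0.113.10 lsfadmin 10.241.0.5
```

Printing the command first makes it easy to review the jump host and user before you connect.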
The static worker node configuration supports a combination of multiple instance profiles, each with its own instance count. For example, you might create 100 instances from bx2-4x16 and 10 instances from mx3d-8x80, for a total of 110 static worker nodes with different instance profiles, based on your requirements.
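The total in this example is simply the sum of the per-profile counts. The sketch below adds them up from profile:count pairs; the pair format and the function name total_static_workers are assumptions made for illustration and are not the input format of the deployable architecture.

```shell
# Hypothetical helper: sum the instance counts from profile:count pairs.
total_static_workers() {
  local total=0 pair
  for pair in "$@"; do
    # Keep only the text after the last colon, which is the instance count.
    total=$((total + ${pair##*:}))
  done
  echo "$total"
}

# Example from the text: 100 x bx2-4x16 plus 10 x mx3d-8x80.
total_static_workers "bx2-4x16:100" "mx3d-8x80:10"   # prints: 110
```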