Best practices and requirements for software as a service
These best practices summarize some of the most important technical principles for SaaS providers based on the control requirements and implementation guidance of the IBM Cloud Framework for Financial Services. These best practices are not a replacement for the information provided in the Control Implementation Overview templates, which are still the definitive guides to the controls and required evidence for application providers.
If you're a software provider, see Best practices for software.
1. Use an approved reference architecture
Requirement: Deploy and manage IBM Cloud resources as defined by an approved reference architecture.
Purpose & value: The quickest path toward meeting the controls of the IBM Cloud Framework for Financial Services is to use one of the pre-defined, single-tenant reference architectures. These architectures are designed to facilitate implementing the other requirements and best practices such as network isolation, security function isolation, high availability, backup and recovery, encryption, audit logging, compliance monitoring, etc.
A key aspect of the framework is to separate user workloads from system management functionality and isolate security functions from non-security functions. One of the most important features of the reference architectures is the definition of physical and logical separation between the edge, management, and workload planes. It is your responsibility to ensure that separation stays in place.
Implementation guidance: Explore and choose one of the supported reference architectures for the IBM Cloud for Financial Services:
- IBM Cloud® Virtual Private Cloud reference architecture, with options for using one or both of:
- IBM Cloud Satellite® reference architecture
- IBM Cloud® for VMware® Regulated Workloads reference architecture
Most Relevant Controls:
Family | Control |
---|---|
System and Communications Protection (SC) | SC-2 Application Partitioning SC-3 Security Function Isolation |
2. Use only services that are IBM Cloud for Financial Services Validated
Requirement: Only use IBM Cloud and third-party managed services which are IBM Cloud for Financial Services Validated.
Purpose & value: IBM Cloud for Financial Services Validated designates that an IBM Cloud service or ecosystem partner service has evidenced compliance with the controls of the IBM Cloud Framework for Financial Services. Using services which are Financial Services Validated makes it much easier to meet all of the control requirements and provide evidence of compliance.
An external service is an IBM Cloud or third-party service deployed outside of the authorization boundaries of your offering and typically not under direct control of your offering. External services used by an offering should be considered extensions of the offering itself. For an offering to be validated against the requirements of the IBM Cloud Framework for Financial Services, all of its external services must meet the same requirements.
For cases where you need additional functionality not covered by any Financial Services Validated service, you are advised to install your own software within your deployment. However, you are then responsible for ensuring the software meets all of the IBM Cloud controls.
If you represent a technology vendor seeking the Financial Services Validated designation, then you must request special approval for a temporary deviation if you wish to use services that are not themselves Financial Services Validated. If you represent a financial institution or if the Financial Services Validated designation is not required by your consumers, then you have freedom to choose based on the level of risk you are willing to accept. Regardless of designation, all IBM Cloud services are designed with security in mind, and many are certified with other compliance programs, such as ISO, SOC 2, etc.
Implementation guidance:
For technology vendors:
- For services which are not Financial Services Validated, consult with your IBM customer success representative to determine whether they are appropriate for use.
- If you add services to your solution which are not Financial Services Validated, be sure to update the appropriate parts of the System Environment and Inventory section of your Control Implementation Overview template.
Most Relevant Controls:
Family | Control |
---|---|
Access Control (AC) | AC-20 Use of External Information Systems |
System and Services Acquisition (SA) | SA-4 Acquisitions Process SA-9 External Information System Services |
Enterprise System and Services Acquisition (ESA) | ESA-5 Subcontractor Risk Management |
Security Assessment and Authorization (CA) | CA-3 System Interconnections |
3. Ensure all deployed software meets all controls
Requirement: Ensure all software used by your service is developed, installed, configured, and managed in accordance with the IBM Cloud Framework for Financial Services control requirements. It is recommended that you use software that is Financial Services Validated, when available.
Purpose & value: Your service will need software to function. This may be software you implement, or it may be software from third-parties, such as databases, logging stacks, bastion hosts, web application firewalls, etc. These are all perfectly acceptable if you adhere to all IBM Cloud Framework for Financial Services requirements related to development, deployment, configuration, and management of the software.
You will be expected to provide appropriate evidence for all software as part of the validation process. Using software that is already Financial Services Validated will make it easier for you to provide this evidence.
4. Implement a system of account, identity, and access management to enable a zero-trust environment
Requirement: Implement a system of zero-trust where account, identity, and access management are based on the principles of separation of duties and least privilege. This system must include IBM Cloud accounts and resources under the control of IBM Cloud Identity and Access Management (IAM) as well as supporting accounts and resources outside of IBM Cloud.
Purpose & value: The design of the IBM Cloud for Financial Services is intended to enable a zero-trust model for deployments of the reference architectures. The scope of any one individual operator or administrator is strictly limited to those actions necessary and appropriate to perform their assigned roles. Overlap of duties is minimized or eliminated to prevent the need for privilege escalation that may result in undesired flows of information into or out of your deployment. It is guided by two major principles:
- Separation of duties - No user should be given enough privileges to misuse the system on their own. For example, no one user should have the authority/ability to develop, compile, and/or move object code from non-production environments into production environments.
- Least privilege - Restrict the access privileges of authorized personnel (e.g., program execution privileges, file modification privileges) to the minimum necessary to perform their jobs. Access must be granted based only on a need to know. Access that has not been explicitly permitted is denied by default. Role-based access controls (RBAC) are used to allocate logical access to a specific job function or area of responsibility, rather than to an individual. Administrative access is only granted to individuals whose job roles have been approved for administration responsibilities.
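The two principles can be illustrated with a minimal Python sketch. The role names, permission strings, and conflict pairs are hypothetical and do not come from IBM Cloud IAM. The sketch only shows the shape of a deny-by-default check and a separation-of-duties check.

```python
from itertools import combinations

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "developer": {"code:write", "build:run"},
    "release-manager": {"deploy:production"},
    "auditor": {"logs:read"},
}

# Hypothetical pairs of roles that one person must not hold together.
CONFLICTING_ROLES = {frozenset({"developer", "release-manager"})}


def is_allowed(user_roles, permission):
    """Deny by default: allow only if an assigned role explicitly grants it."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set()) for role in user_roles
    )


def duty_conflicts(user_roles):
    """Return the conflicting role pairs held by a single user."""
    pairs = (frozenset(pair) for pair in combinations(sorted(user_roles), 2))
    return [set(pair) for pair in pairs if pair in CONFLICTING_ROLES]
```

In this sketch an unknown role grants nothing, and a user holding both `developer` and `release-manager` is reported as a conflict rather than silently accepted.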
Not only do these principles apply to IBM Cloud accounts and resources, but they also apply to accounts and resources that are not managed by IBM Cloud. For example, your operators will need credentials and proper authorizations to resources such as (but not limited to):
- Virtual servers in your VPCs
- Self-installed software (such as databases, firewalls, etc.)
- Source code control systems
- External identity providers and user directories
- etc.
In addition, you must provide RBAC for consumer access to the deployed application in the workload plane.
Individuals should be able to request access to resources, and they should be granted access only if it is required for them to do their jobs. As individuals leave the organization, change job roles, etc., there needs to be a system to revalidate that the previously granted access continues to be needed for the individuals to perform their duties.
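As a rough sketch of the revalidation step, the following Python flags access grants that belong to someone no longer in the organization or that have not been revalidated recently. The `AccessGrant` record, the 90-day default, and the idea of an `active_users` set are all assumptions made for illustration. The framework's controls define the actual review requirements.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical record of one user's access to one resource."""
    user: str
    resource: str
    last_validated: date


def grants_needing_review(grants, active_users, today, max_age_days=90):
    """Return grants held by departed users or not revalidated within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        grant
        for grant in grants
        if grant.user not in active_users or grant.last_validated < cutoff
    ]
```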
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Access Control (AC) | AC-2 Account Management AC-3 Access Enforcement AC-5 Separation of Duties AC-6 Least Privilege AC-14 Permitted Actions without Identification or Authentication |
Identification and Authentication (IA) | IA-5 Authenticator Management IA-5 (1) Authenticator Management | Password-Based Authentication |
5. Use and maintain non-production environments for development and testing
Requirement: Do not put anything into production that has not been tested in an equivalent non-production environment. Put non-production environments in a separate IBM Cloud account. Treat all non-production environments as if they were production environments by following all of the best practices and IBM Cloud Framework for Financial Services controls for each environment you create.
Purpose & value: Non-production environments (all environments that your service uses for development, testing, or demonstration) are as critical to the security of your service as production environments. Security weaknesses must be detected and mitigated in development environments by enforcing all the same controls and security testing as production. Because security weaknesses may be more likely in development environments than in production, separation of these environments from production is required.
It is permissible to run a "pre-production" environment in your production account to be used only for final verification and sniff testing before promotion to production, but only if the set of operators is the same for both environments. However, all changes still need to be developed and tested first in a non-production environment in a separate account.
Implementation guidance:
- Ensure all controls in the IBM Cloud Framework for Financial Services are followed for each new environment you create, whether for production or not.
- Create and manage production environments using an IBM Cloud account separate from the one you are using for non-production environments.
Most Relevant Controls:
Family | Control |
---|---|
Configuration Management (CM) | CM-3 (2) Configuration Change Control | Testing, Validation, and Documentation Of Changes CM-4 (1) Impact Analyses | Separate Test Environments |
System and Services Acquisition (SA) | SA-10 Developer Configuration Management SA-15 (9) Development Process, Standards, and Tools | Use of Live Data |
System and Communications Protection (SC) | SC-2 Application Partitioning SC-3 Security Function Isolation |
6. Enforce information flow policies and protect the boundaries of your application
Requirement: Leverage the components of the reference architecture to enforce policies for information flows and to protect the boundaries of your application.
Purpose & value: Information flow control determines where information is allowed to travel within the components of your service offering as well as to/from other interconnected systems. Information flow control is intended to reduce the risk of sensitive information flowing from one system to another that has less stringent security policies. Flow control is based on the characteristics of the information (e.g., is the content of an information payload valid?) and/or the information path (e.g., is the information coming from a trusted IP address?).
Boundary protection increases security by monitoring and restricting communications at the external network boundary and key internal boundaries of your service. Boundary protection devices (e.g., gateways, encrypted tunnels, hardware and virtual firewalls, etc.) are commonly employed.
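The path-based check mentioned above ("is the information coming from a trusted IP address?") can be sketched with Python's standard `ipaddress` module. The trusted ranges below are documentation-only address blocks chosen for illustration. In a real deployment this kind of rule would normally live in the boundary protection devices themselves rather than in application code.

```python
import ipaddress

# Hypothetical trusted source ranges (reserved documentation address blocks).
TRUSTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_trusted_source(address):
    """Return True only if the address parses and falls inside a trusted network."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Unparseable input is treated as untrusted (deny by default).
        return False
    return any(ip in network for network in TRUSTED_NETWORKS)
```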
Implementation guidance:
Most Relevant Controls:
7. Ensure all operator actions are executed through a bastion host
Requirement: Ensure all interactive operator actions can only be run through a bastion host in your dedicated edge or management plane which is properly isolated from your workload plane. Enable recording of bastion sessions for auditing.
Purpose & value: A bastion host is a server in your edge or management plane that allows a secure connection to other resources in your edge, management, and workload planes. Using a bastion host for executing operator actions minimizes the chances of penetration from an external source. Session logging allows tracking of all activities that are performed through the bastion host in the event a potential security issue must be investigated.
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Access Control (AC) | AC-6 (9) Least Privilege | Auditing Use of Privileged Functions AC-17 Remote Access |
Audit and Accountability (AU) | AU-14 Session Audit |
Identification and Authentication (IA) | IA-2 (1) Identification and Authentication (Organizational Users) | Network Access to Privileged Accounts |
8. Capture audit events and forward to a SIEM
Requirement: Capture audit events for actions performed on the components of your service, including IBM Cloud services as well as any software components you install within your deployment. Audit logs should be securely stored, and they should also be forwarded to a security information and event management (SIEM) system.
Purpose & value: It is important to have audit logs to be able to investigate abnormal activity as well as to comply with regulatory audit requirements. With a proper audit logging system, you can be automatically alerted to potential problems. This allows you to be proactive and check into potential security issues shortly after they've occurred.
Your software components (whether written by you or a third-party) should be enabled to emit critical management and data events. Many examples are listed in the AU-2 control from the IBM Cloud Framework for Financial Services, including:
- Access, downloading, revisions to confidential information
- Access to or update to any financial transaction data
- Any changes to consumer accounts
- Login/logoff event successes and failures
- etc.
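As an illustration of what emitting such an event from your own software might look like, the following Python sketch serializes one audit event as a JSON line. The field names are assumptions for this example, not the schema required by the AU-3 control or by any particular SIEM. Check the required record content against your Control Implementation Overview template.

```python
import json
from datetime import datetime, timezone


def build_audit_event(actor, action, target, outcome, timestamp=None):
    """Serialize one audit event as JSON (hypothetical field names)."""
    if outcome not in ("success", "failure"):
        raise ValueError("outcome must be 'success' or 'failure'")
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    event = {
        "timestamp": timestamp.isoformat(),
        "actor": actor,      # who performed the action
        "action": action,    # what was done, e.g. "login"
        "target": target,    # what it was done to
        "outcome": outcome,  # both successes and failures are recorded
    }
    return json.dumps(event, sort_keys=True)
```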
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Audit and Accountability (AU) | AU-1 Audit and Accountability Policy and Procedures AU-2 Audit Events AU-3 Content of Audit Records AU-11 Audit Record Retention |
System and Information Integrity (SI) | SI-4 Information System Monitoring |
9. Ensure operational logging and monitoring is implemented
Requirement: Ensure operational logging and monitoring is implemented.
Purpose & value: Operational logging and operational monitoring are very important for ensuring your application runs smoothly. Proper operational logging and monitoring can help you determine if you need to fail over to an alternate storage or processing site. In addition, operational logging and monitoring can help you determine if operations have returned to normal after a system disruption.
Operational logging is a complement to audit logs. Audit logs contain auditable events, while operational logs contain everything else that might be logged. For example, an operational log might contain entries that reflect what's happening as a program executes, such as functions getting called, stack traces getting thrown, etc. While this kind of log data is key to keeping a system running smoothly, it's typically not sent to a SIEM.
Operational logs can be split into two categories:
- Application - Log data generated by software components that you deploy and manage within your deployment. This may be from your own code, or other software components like databases, message queues, etc.
- Platform - Log data from IBM Cloud service instances that is not contained in the audit data sent to Activity Tracker Event Routing.
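For application logs, one way to keep audit events and operational entries on separate paths is to route them by logger name. The sketch below uses Python's standard `logging` module. The convention that audit loggers are named `audit` or `audit.*`, and the choice of plain streams as destinations, are assumptions for illustration only.

```python
import logging


def configure_logging(audit_stream, operational_stream):
    """Route 'audit*' loggers to one stream and every other logger to another."""
    audit_handler = logging.StreamHandler(audit_stream)
    audit_handler.addFilter(lambda record: record.name.startswith("audit"))

    operational_handler = logging.StreamHandler(operational_stream)
    operational_handler.addFilter(
        lambda record: not record.name.startswith("audit")
    )

    root = logging.getLogger()
    root.setLevel(logging.INFO)
    root.addHandler(audit_handler)
    root.addHandler(operational_handler)
```

With this in place, `logging.getLogger("audit.login")` writes only to the audit stream, which could then be forwarded to a SIEM, while `logging.getLogger("app.orders")` writes only to the operational stream.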
Operational monitoring for gauging system health is a very important complement to monitoring for security and compliance. Operational metrics include measurements for CPU usage, memory usage, API response times, etc.
Most Relevant Controls:
Family | Control |
---|---|
Contingency Planning (CP) | CP-2 (3) Contingency Plan | Resume Essential Missions / Business Functions CP-6 Alternate Storage Site CP-7 Alternate Processing Site CP-10 Information System Recovery and Reconstitution |
System and Information Integrity (SI) | SI-11 Error Handling |
10. Follow secure development processes and ensure software integrity
Requirement: Ensure that security engineering principles are applied in the design, development, implementation, and modification of the system. Maintain software integrity by using signed images, applying security patches, doing vulnerability scans, etc.
Purpose & value: It is critical to follow security engineering principles and manage changes to your service using a system development life cycle (SDLC) that incorporates information security considerations. Doing so minimizes the risk of introducing code or configuration changes that undermine your system security.
Lack of software integrity leaves you vulnerable to security problems. So, it is critical for you to be able to assert the origin of all software components in the system and to keep software up to date. In addition, running regular scans on software to detect vulnerabilities allows you to more quickly take corrective action.
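One simple building block for asserting the origin of a component is comparing its SHA-256 digest against a value obtained from a trusted source. The Python sketch below shows only that digest check. It is not signature verification and does not replace signed images, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_expected_digest(data: bytes, expected_hex: str) -> bool:
    """Compare against a trusted digest using a constant-time comparison."""
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())
```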
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
System and Information Integrity (SI) | SI-7 Software & Information Integrity |
System and Services Acquisition (SA) | SA-3 System Development Life Cycle SA-8 Security Engineering Principles SA-10 Developer Configuration Management SA-10 (1) Developer Configuration Management | Software and Firmware Integrity Verification SA-11 Developer Security Testing and Evaluation SA-15 Development Process, Standards, and Tools SA-15 (9) Development Process, Standards, and Tools | Use of Live Data |
Risk Assessment (RA) | RA-5 Vulnerability Scanning |
11. Encrypt consumer data at rest and in transit
Requirement: Encrypt all consumer data whether at rest or in transit.
Purpose & value: Consumers do not want their data to be accessible to bad actors. Protecting consumer data against unauthorized disclosure, modification, or destruction throughout the data lifecycle is of paramount importance in the IBM Cloud for Financial Services. Cryptographic controls must be in place in all regions and availability zones to protect the confidentiality and integrity of data.
Data at rest must always be encrypted, and you must use Hyper Protect Crypto Services to manage encryption keys. Hyper Protect Crypto Services allows you to take ownership of the cloud HSM to fully manage your encryption keys and to perform cryptographic operations using Keep Your Own Key (KYOK). This means you have full control and authority over encryption keys. No one except you (not even IBM) has access to your encryption keys. You should ensure that all Financial Services Validated services which support integration with Hyper Protect Crypto Services are properly configured.
Data in transit should always be encrypted using TLS 1.2 or higher. This includes all traffic in your deployment such as between virtual server instances and between pods in Red Hat OpenShift on IBM Cloud.
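For client code you write yourself, the TLS 1.2 floor can be enforced explicitly. The following is a minimal sketch using Python's standard `ssl` module. It covers only a Python client and says nothing about how load balancers, service meshes, or managed services in your deployment are configured.

```python
import ssl


def make_client_context():
    """Create a client TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients, for example as the `context` argument of `urllib.request.urlopen`.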
Use Hyper Protect Crypto Services for TLS offload for all data that is requested from outside of IBM Cloud (inbound traffic to IBM Cloud) and protected by a certificate which is signed by a public Certificate Authority. Configure any web servers to use TLS offload to set up the session, so that the private key never leaves Hyper Protect Crypto Services.
More detailed guidance can be found in the Cryptographic Requirements appendix of the Control Implementation Overview Template for whichever reference architecture you are using.
Implementation guidance:
Most Relevant Controls:
12. Implement business continuity and disaster recovery
Requirement: Implement a system for business continuity and disaster recovery (BCDR). Define Recovery Point Objective (RPO), Recovery Time Objective (RTO), and Maximum Tolerable Downtime (MTD) metrics for essential business functions. Implement your application so that the metrics are achieved.
Purpose & value: Unfortunately, disasters occur that can make some or all of the data centers in a region unavailable. To avoid data loss, you must ensure all important data is backed up to a separate multizone region that can be used to restore service. However, your consumers must be able to opt out of having their data stored in the alternate region based on their data residency requirements.
In addition, you must follow best practices for BCDR as defined by the specific managed services that you use. For example, Direct Link, Hyper Protect Crypto Services, etc.
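Once the metrics are defined, it helps to check them against how the system actually behaves. The Python sketch below assumes the commonly used relationships (the backup interval should not exceed the RPO, measured restore time should not exceed the RTO, and the RTO should not exceed the MTD). The class and function names are hypothetical, and the values you pass in are your own.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # target time to restore service
    mtd: timedelta  # maximum tolerable downtime


def find_violations(objectives, backup_interval, measured_restore_time):
    """Return human-readable descriptions of any objectives that are not met."""
    violations = []
    if objectives.rto > objectives.mtd:
        violations.append("RTO exceeds MTD")
    if backup_interval > objectives.rpo:
        violations.append("backup interval exceeds RPO")
    if measured_restore_time > objectives.rto:
        violations.append("measured restore time exceeds RTO")
    return violations
```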
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Contingency Planning (CP) | CP-2 Contingency Plan CP-6 Alternate Storage Site CP-7 Alternate Processing Site CP-9 Information System Backup CP-10 Information System Recovery and Reconstitution CP-10 (2) System Recovery and Reconstitution | Transaction Recovery |
13. Design your application for high availability (recommended)
Requirement: Design your application for high availability (HA) to minimize downtime for your consumers.
This item is unique among the best practices in that it represents a recommendation rather than an IBM Cloud Framework for Financial Services requirement. The IBM Cloud Framework for Financial Services only requires you to meet the RPO, RTO, and MTD goals that you define (see previous best practice). However, to be best in class, you should strive to be highly available.
Purpose & value: Consumers will be relying on your service for critical operations and have the potential to be deeply impacted by any downtime. To avoid downtime, you should implement redundancy to eliminate single points of failure. It is strongly recommended that you deploy your application to:
- Multiple availability zones within any multizone region you're using. If one data center in a region goes down, your application can continue to function.
- Multiple multizone regions with failover between them. If something catastrophic happens that impacts an entire region, you can fail over to another region so consumers can still use your service.
In addition, it is recommended that you:
- Implement a solution for autoscaling your deployment before capacity is exceeded.
- Follow best practices for HA as defined by the specific managed services that you use. For example, Direct Link, Hyper Protect Crypto Services, etc.
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Contingency Planning (CP) | CP-2 Contingency Plan CP-7 Alternate Processing Site |
System and Communications Protection (SC) | SC-6 Resource Availability |
14. Use endpoint detection and remediation (EDR) tooling to detect malicious code
Requirement: Use endpoint detection and remediation (EDR) tooling that is effective at detecting malicious code in a cloud environment.
Purpose & value: Malicious code includes viruses, worms, Trojan horses, and spyware. The presence of malicious code puts the operation of your service and the customer data you process at risk. It is important to run regular, automated scans of the components of your service to detect these vulnerabilities so that they can be mitigated.
There are many EDR solutions for virtual server instances such as CrowdStrike and Carbon Black. For Kubernetes, relevant tooling includes StackRox, NeuVector, Sysdig Secure, and Twistlock.
Most Relevant Controls:
Family | Control |
---|---|
System and Information Integrity (SI) | SI-3 Malicious Code Protection |
15. Regularly scan for open ports / protocols
Requirement: Regularly scan for open ports / protocols, and ensure only those ports / protocols needed by your service are open.
Purpose & value: As part of the principle of least privilege, you should only open the minimum number of ports / protocols that are needed by your service. Open ports / protocols not needed by your service enable unnecessary threat vectors. So, it is important to regularly scan to make sure the list of open ports / protocols is correct. Tools such as Nmap or Tenable can be used for this purpose.
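To make the idea concrete, the following Python sketch performs a basic TCP connect check with the standard `socket` module and compares the result against an allowlist. It is a simplified illustration, not a substitute for a dedicated scanner. It covers TCP only, the function names are hypothetical, and it should be run only against hosts you are authorized to scan.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the set of TCP ports on host that accept a connection."""
    found = set()
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.add(port)
    return found


def unexpected_ports(found, allowed):
    """Return open ports that are not on the allowlist, in ascending order."""
    return sorted(set(found) - set(allowed))
```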
Most Relevant Controls:
Family | Control |
---|---|
Configuration Management (CM) | CM-7 Least Functionality CM-7 (1) Least Functionality | Periodic Review |
16. Secure and manage secrets and certificates
Requirement: Secrets must be securely protected through their entire lifecycle.
Purpose & value: A secret is any piece of data that is sensitive within the context of an application or service. Secrets include all of the following but are not limited to:
- Passwords of any type (database logins, OS accounts, functional IDs, etc.)
- API keys
- Long-lived authentication tokens (OAuth2, GitHub, IAM, etc.)
- SSH keys
- Encryption keys
- Other private keys (PKI/TLS certificates, HMAC keys, signing keys, etc.)
You should ensure:
- Secrets are generated and stored in the environment (e.g., dev, test, production) where your service is deployed.
- Secrets never leave their environments (e.g., dev, test, production) and should be secured using access control measures. Service design should minimize the number of machines and people with access to secrets using both authorization and network restrictions based on the principle of least privilege.
- Secrets are rotated in accordance with the requirements of the IBM Cloud Framework for Financial Services with minimal or no down time.
- Secrets are never stored in source code, configuration files, or documentation.
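One common way to keep secrets out of source code and configuration files is to have the application read them at run time from its environment. The Python sketch below assumes that a secrets manager or deployment tooling injects the value into the process environment within the same environment where the service runs. The helper name and the `DB_PASSWORD` variable in the usage note are hypothetical.

```python
import os


def require_secret(name, environ=None):
    """Read a secret from the environment and fail fast if it is absent.

    The error message names the variable but never includes its value.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise RuntimeError(f"required secret {name!r} is not set")
    return value
```

For example, `require_secret("DB_PASSWORD")` returns the injected value or raises at startup, which avoids silently falling back to a default credential.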
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Access Control (AC) | AC-2 Account Management |
Identification and Authentication (IA) | IA-2 User Identification and Authentication IA-3 Device Identification and Authentication IA-5 Authenticator Management |
17. Tag all IBM Cloud resources with security attributes
Requirement: Tag all IBM Cloud resources with security attributes.
Purpose & value: All IBM Cloud resources in your account should be tagged with security attributes you define. You can apply user tags to organize your resources and easily find them later, to help you with identifying specific team usage or cost allocation, and to control access to your resources without requiring updates to your IAM policies.
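A tagging policy is easier to enforce when it can be checked automatically. The Python sketch below assumes tags are written as `key:value` strings and that the required keys are ones you define. Both the key names and the helper are hypothetical, and fetching the tags from your account is left out.

```python
# Hypothetical security attribute keys that every resource must carry.
REQUIRED_TAG_KEYS = {"env", "data-classification"}


def missing_tag_keys(tags, required=REQUIRED_TAG_KEYS):
    """Return required keys absent from a resource's 'key:value' tags."""
    present = {tag.split(":", 1)[0] for tag in tags if ":" in tag}
    return sorted(set(required) - present)
```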
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Access Control (AC) | AC-16 Security Attributes |
System and Communications Protection (SC) | SC-16 Transmission of Security Attributes |
18. Monitor for security and compliance against a baseline configuration
Requirement: Deploy and use tooling for monitoring and reporting of security and compliance. Maintain a baseline configuration for your service that consists of automated mechanisms to facilitate information system baseline management.
With Security and Compliance Center, you can embed security checks into your everyday workflows to help monitor for security and compliance. By monitoring for risks, you can identify security vulnerabilities and quickly work to mitigate the impact and fix the issue. By using Security and Compliance Center along with external integrations (such as OpenShift Compliance Operator (OSCO), Tanium, NeuVector, and so on), you can build a robust approach for monitoring for security and compliance issues.
Purpose & value: Continuous monitoring and reporting helps detect malicious activity and security vulnerabilities as early as possible. Automated baseline management makes it much more likely that any compliance-breaking changes to your system configurations are caught early.
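The core of automated baseline management is a comparison between the approved configuration and what is actually deployed. The following Python sketch shows that comparison for flat key-value settings. The setting names in the usage note are hypothetical, and real tooling such as Security and Compliance Center does far more than this.

```python
MISSING = "<missing>"  # marker for a setting present on only one side


def config_drift(baseline, actual):
    """Return {setting: (expected, found)} for every setting that differs."""
    drift = {}
    for key in baseline.keys() | actual.keys():
        expected = baseline.get(key, MISSING)
        found = actual.get(key, MISSING)
        if expected != found:
            drift[key] = (expected, found)
    return drift
```

For example, comparing a baseline of `{"tls_min": "1.2"}` against an actual configuration of `{"tls_min": "1.0", "debug": True}` reports both the weakened TLS setting and the unapproved `debug` setting.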
Implementation guidance:
Most Relevant Controls:
Family | Control |
---|---|
Configuration Management (CM) | CM-2 (2) Baseline Configuration | Automation Support for Accuracy and Currency CM-6 Configuration Settings CM-6 (1) Configuration Settings | Automated Management, Application, and Verification |
Next steps
Learn about the three reference architectures for the IBM Cloud for Financial Services:
- VPC reference architecture
- Satellite reference architecture
- VMware Regulated Workloads reference architecture
Or, if you're intending to be a software provider, see also: