IBM Cloud Docs
AI summarization using highly resilient serverless architecture

The AI summarization using highly resilient serverless architecture pattern describes an internet-facing web application that is deployed on the IBM Cloud® Code Engine serverless platform in two IBM Cloud regions. Because the application is provisioned in two regions, user requests are served in an active-active manner, and if an outage occurs in one region, the second region continues to serve user requests.

Architecture diagram

Network Diagram
Figure 1. AI Summarization using highly resilient serverless architecture network diagram

  • Code Engine abstracts the operational burden of building, deploying, and managing workloads in Kubernetes so that developers can focus on what matters most to them: the source code.

  • By default, Code Engine applications are deployed within a single zone of a region. If a failure of the hosting zone occurs, the workload is automatically re-created in one of the remaining zones of that region, providing high availability at a regional level.

  • This pattern provides additional protection by provisioning an application in two regions of IBM Cloud.

  • IBM Cloud® Container Registry provides a multi-tenant private image registry that is used to store container images. The Code Engine application configuration references its image in this Container Registry.

  • The IBM Cloud® Monitoring service is set up to monitor your Code Engine workloads. Code Engine forwards selected information about your workloads to the monitoring platform so that you can monitor specific metrics such as requests, revisions, and durations. The monitoring is configurable.

  • An IBM Cloud® Log Analysis service instance is also set up. Code Engine application logs are forwarded to it to help with troubleshooting.

  • This pattern uses IBM Cloud® Internet Services, a global content delivery network (CDN), and its global load balancer capability to provide a global endpoint for the web application.

  • An origin pool is created for the application endpoint in each region. When the pools are attached to a global load balancer, traffic is intelligently routed to them.

  • A health check is created to gain insight into the availability of the pools so that traffic is always routed to healthy ones, which makes the app highly available.

  • By setting Traffic Steering to ‘Geo’, the load balancer directs incoming traffic to origin pools based on the client’s region or point of presence.

  • Code Engine provides immediate Distributed Denial of Service (DDoS) protection for your application. This protection is provided by Cloud Internet Services (CIS) at no additional cost and covers Open Systems Interconnection (OSI) Layer 3 and Layer 4 (TCP/IP) protocol attacks. Layer 7 protection is implemented by configuring Web Application Firewall (WAF) ruleset sensitivity and response behavior, adding rate limiting, and adding firewall rules. WAF is also included as part of CIS at no additional charge.
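The routing behavior described in the list above (origin pools per region, a health check, and geo-based traffic steering) can be illustrated with a small model. This is only a sketch of the decision logic, not the Cloud Internet Services implementation or API. The pool names, endpoint URLs, geography labels, and the `select_pool` function are all hypothetical and were chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class OriginPool:
    name: str
    origin_url: str       # hypothetical regional application endpoint
    healthy: bool = True  # in a real setup, updated by the health check


def select_pool(
    client_geo: str,
    geo_routes: Dict[str, List[str]],
    default_pools: List[str],
    pools: Dict[str, OriginPool],
) -> Optional[OriginPool]:
    """Return the first healthy pool for the client's geography.

    Pools mapped to the client's geography are tried first, followed by
    the default pools. Returns None when no pool is healthy.
    """
    candidates = geo_routes.get(client_geo, []) + default_pools
    for name in candidates:
        pool = pools.get(name)
        if pool is not None and pool.healthy:
            return pool
    return None


# Hypothetical two-region setup, one origin pool per region.
pools = {
    "pool-a": OriginPool("pool-a", "https://app.region-a.example.com"),
    "pool-b": OriginPool("pool-b", "https://app.region-b.example.com"),
}
geo_routes = {"europe": ["pool-b"], "north-america": ["pool-a"]}
default_pools = ["pool-a", "pool-b"]

print(select_pool("europe", geo_routes, default_pools, pools).name)  # pool-b
pools["pool-b"].healthy = False  # simulate an outage in one region
print(select_pool("europe", geo_routes, default_pools, pools).name)  # pool-a
```

In this model, both regions serve traffic while they are healthy (active-active), and a failed health check on one pool shifts its traffic to the remaining pool without any change on the client side.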

Design scope

Following the Architecture Framework, the AI summarization using highly resilient serverless architecture pattern covers design considerations and architecture decisions for the following aspects and domains:

  • Data: Artificial Intelligence

  • Compute: Serverless

  • Networking: Isolation, Cloud Native Connectivity, Load Balancing, DNS

  • Security: Data Security, Identity and Access, Application Security, Infrastructure Security

  • DevOps: Code Repository

  • Resiliency: High Availability, Disaster Recovery

  • Service management: Monitoring, Logging, Auditing

Design Scope
Figure 2. AI summarization using highly resilient serverless architecture design scope

The Architecture Framework provides a consistent approach to designing cloud solutions by addressing requirements across a set of aspects and domains, which are technology-agnostic architectural areas that need to be considered for any enterprise solution. For more information, see Introduction to the architecture framework.

Requirements

The following table lists a baseline set of requirements that are applicable to most clients and critical to a successful deployment of the AI summarization using highly resilient serverless architecture pattern.

Table 1. AI Summarization using highly resilient serverless architecture requirements
| Aspect | Requirement |
|---|---|
| Data | Provide a way to perform summarization of input. |
| Compute | Provide different levels of CPU and memory options to match the type of workload. |
| Network | Provide connectivity to web applications from the public internet or privately in the IBM Cloud private network by using Virtual Private Endpoint (VPE) for VPC. |
| | Provide isolation with the ability to deploy an application with different levels of visibility, such as project level, public, or private. |
| Security | Grant access to other users for Code Engine by using IBM Cloud® Identity and Access Management. |
| | Provide DDoS and Layer 7 attack protection for the web application. |
| | Provide a method to store and access sensitive configuration information, such as secrets, passwords, and keys. |
| DevOps | Provide a private image repository with vulnerability scanning capability. |
| Resiliency | Deploy the application across multiple regions to make it resilient to regional failures. |
| Service management | Provide health and system monitoring with the ability to monitor and correlate performance metrics and events, and provide alerting across applications and infrastructure. |
| | Provide the ability to diagnose issues and exceptions, identify the error source, and get insight into the performance of your workloads. |
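
The security requirement to store and access sensitive configuration information can be illustrated from the application's side. The following is a minimal sketch that assumes the secret values are exposed to the running application as environment variables. The variable names `WATSONX_API_KEY` and `WATSONX_PROJECT_ID`, the `load_settings` function, and the `MissingSecretError` class are hypothetical and are not taken from this pattern.

```python
import os
from typing import Dict, Mapping, Optional, Sequence


class MissingSecretError(RuntimeError):
    """Raised when a required setting is absent or empty."""


def load_settings(
    env: Optional[Mapping[str, str]] = None,
    required: Sequence[str] = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID"),
) -> Dict[str, str]:
    """Read required settings from the environment and fail fast if any is missing.

    The variable names are hypothetical placeholders. Nothing sensitive is
    hardcoded, so the same container image can be deployed in both regions.
    """
    env = os.environ if env is None else env
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Report only the names, never the values.
        raise MissingSecretError("missing required settings: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Failing at startup when a value is missing tends to surface a misconfigured region before it begins to receive traffic.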

Components

Table 2. AI Summarization using highly resilient serverless architecture components
| Aspect | Component | How the component is used |
|---|---|---|
| Data | IBM watsonx.ai | Summarization is performed by the IBM-developed Granite model, which is hosted on IBM watsonx.ai. watsonx.ai brings together new generative AI capabilities powered by foundation models and traditional machine learning (ML) in a studio that spans the AI lifecycle. |
| Compute | Code Engine | Abstracts the operational burden of building, deploying, and managing workloads in Kubernetes. |
| Networking | Global Load Balancer | Public load balancing of web server traffic across regions. |
| | IBM Cloud® DNS Services | The Domain Name System (DNS) service that associates human-friendly domain names with IP addresses. |
| Security | Secrets | A secret provides a method to include sensitive configuration information, such as passwords or SSH keys, in your deployment. By referencing values from your secret, you can decouple sensitive information from your deployment to keep your app, function, or job portable. |
| | Cloud Identity and Access Management | Grants other users access to Code Engine. |
| | Cloud Internet Services (CIS) | Code Engine provides immediate DDoS protection for your application. DDoS protection covers Open Systems Interconnection (OSI) Layer 3 and Layer 4 (TCP/IP) protocol attacks, but not Layer 7 attacks. CIS adds protection against Layer 7 attacks through the configuration of WAF rulesets and firewall rules. |
| Resiliency | Cloud Internet Services (CIS) | For a highly resilient application, the CIS Global Load Balancer provides your application with a global endpoint. The origin pools serve as backend targets for the load balancer. An origin pool is set up for each region where the application is deployed. |
| Service management | IBM Cloud Monitoring | Monitors Code Engine workloads. Code Engine forwards selected information about your workloads to the monitoring service so that you can monitor specific metrics such as requests, revisions, and duration. |
| | Log Analysis | When logging is enabled for Code Engine apps, jobs, functions, or builds, logs are forwarded to a Log Analysis service instance, where they are indexed. Indexing enables full-text search through all generated messages and convenient querying based on specific fields. |
| | IBM Cloud® Activity Tracker | View, manage, and audit user-initiated activities made in your Code Engine service instance by using the Activity Tracker service. |
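
To make the Data component more concrete, the following sketch shows how an application might wrap a summarization request. It does not call watsonx.ai. Instead, the model is represented by an injected `generate` callable, so the real client can be substituted later. The prompt wording, the `MAX_INPUT_CHARS` limit, and the function names are assumptions made for illustration and are not part of this pattern.

```python
from typing import Callable

MAX_INPUT_CHARS = 4000  # hypothetical cap on input size, not a documented limit


def build_prompt(text: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Normalize whitespace, truncate the input, and wrap it in a summarization prompt."""
    cleaned = " ".join(text.split())
    if not cleaned:
        raise ValueError("input text is empty")
    return (
        "Summarize the following text in a few sentences.\n\n"
        "Text: " + cleaned[:max_chars] + "\n\nSummary:"
    )


def summarize(text: str, generate: Callable[[str], str]) -> str:
    """Return a summary produced by `generate`, a stand-in for the hosted model call."""
    return generate(build_prompt(text)).strip()


def fake_generate(prompt: str) -> str:
    """Stub used only for local testing. It returns a fixed string."""
    return "  A short summary.  "


print(summarize("Long   document\ntext to condense.", fake_generate))  # A short summary.
```

Keeping the model call behind a plain callable means the same handler code can run unchanged in both regions and can be unit tested without network access.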