Monitoring a Linux bare metal server
You can monitor a bare metal server with IBM Cloud Monitoring by configuring a monitoring agent on your server. The monitoring agent uses an access key (token) to authenticate with the IBM Cloud Monitoring instance. The agent acts as a data collector: it automatically collects metrics, which you view in the web-based user interface. You can monitor bare metal servers in IBM Cloud, on-premises, and in other clouds.
By default, this agent collects core infrastructure and network time series that you can use to monitor the host. For a list of collected metrics, see Metrics Available for non-orchestrated environments.
The Monitoring agent automatically collects the following types of system metrics per host:
-
System hosts metrics provide information about CPU, memory, and storage usage that you can use to analyze the performance and resource utilization of all your processes.
-
File and File System metrics provide information about files and file systems that you can use to analyze file interactions that occur in your system. For example, you can find information about your open files, bytes going in and out, or the percentage of usage of a given file system.
-
Process metrics provide information about the processes that run in your servers. For example, you can use these metrics to explore the number of processes, or get client or server information.
-
Network metrics provide information about the network. They offer insight into the connections that are established between your applications, containers, and servers. For example, you can find information about the bytes that are being sent or received, or the number of HTTP requests, connections, and latency. In addition, for SQL or MongoDB, the agent collects additional information when it is configured in troubleshooting mode.
Through the Monitoring UI, you can analyze data in the Advisor tab, the Explore tab, and in the Dashboard tab. You monitor the data through metric views and dashboards.
Consider the following information when monitoring your data:
-
In the Explore tab, you can monitor individual metrics.
-
In the Advisor tab, you can monitor Red Hat OpenShift or host level metrics.
This tab is only available for users that belong to a team that has access to monitor Red Hat OpenShift or host level metrics.
-
In the Dashboard tab, you can monitor through predefined or custom dashboards and get specialized insight into network data, application data, topology, services, hosts, and containers. A dashboard is made up of panels. Each panel displays a metric or group of metrics.
For each metric view and dashboard, you can define the scope of the data, how to aggregate data, and what time and group filters to apply to the data. For more information, see Managing panels.
You can configure a dashboard as the default entry point for a team, unifying a team's experience, and allowing users to focus their immediate attention on the most relevant information for them.
For more information, see Viewing metrics.
Before you begin
-
Install the IBM Cloud CLI. For more information, see Installing the IBM Cloud CLI.
-
Provision an IBM Cloud Monitoring instance from the catalog.
-
Provision a bare metal server.
To complete the steps in this topic, ensure you have internet access from the bare metal. This is needed for configuring the monitoring agent.
-
Configure a VPN connection between your terminal and the bare metal server
Virtual Private Networking (VPN) access enables users to manage all servers remotely and securely over the IBM Cloud® private network. A VPN connection from your location to the private network allows out-of-band management and server rescue through an encrypted VPN tunnel. VPN tunnels can be initiated to any IBM Cloud data center or PoP allowing you geographic redundancy.
Complete the following steps to configure a VPN connection between your terminal and the bare metal server:
-
Depending on your operating system, download the latest MotionPro 32-bit or 64-bit files from the Array Networks Clients and Tools download site.
-
Configure a standalone SSL VPN client and open a connection:
For example, if you use the MotionPro Plus client for macOS, to add a profile, click Add.
In the Basic section, enter a Title. Enter a Gateway. For example, for a bare metal server in Dallas 10, enter vpn.dal10.softlayer.com. Enter your VPN user name. Check that the Port is set to 443. Then, click OK.
To open a secure connection, click Login.
-
Connect to a bare metal server by using SSH
You might require a VPN to access your system depending on your security setup and ssh configuration on the bare metal host.
You must ssh to the host by using your credentials, or the root credentials that are available from the IBM Cloud Console.
You will require root permissions in order to install the monitoring agent.
For example, you can complete the following steps to get the bare metal information that you need to ssh into the server:
-
Click the Menu icon
> Classic Infrastructure > Device List.
-
Identify the bare metal server that you want to monitor. Copy the Public IP.
-
Click the bare metal server device name.
-
Select Passwords. Copy the password for the root user.
Then, from a terminal, run the following command:
ssh <USER_ID>@<IP_ADDRESS>
Where:
<USER_ID> is the user ID that you use to log in to the bare metal server. For example, use root.
<IP_ADDRESS> is the public IP address of the bare metal server.
For example:
ssh root@45.123.122.12
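Because the agent installation needs root permissions, it can help to confirm the privileges of the session right after you log in. The following is a minimal sketch, not part of the product tooling. The function name is_root is hypothetical.

```shell
#!/bin/sh
# Sketch: confirm that the current session has root privileges before you
# install the monitoring agent. The function name is hypothetical.

# is_root [UID]: succeeds when UID is 0. Defaults to the current user's ID.
is_root() {
    uid="${1:-$(id -u)}"
    [ "$uid" -eq 0 ]
}

# Usage on the bare metal server:
#   is_root || echo "Switch to root or use sudo before you install the agent." >&2
```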
Configure a monitoring agent to collect metrics from the bare metal server
You must install a monitoring agent to collect and forward metrics from a bare metal server to an IBM Cloud Monitoring instance.
Complete the following steps from the command line to install a monitoring agent:
-
Obtain the access key. For more information, see Getting the access key through the IBM Cloud UI.
-
Obtain the ingestion URL. For more information, see collector endpoints.
-
Deploy the monitoring agent. Run the following command:
curl -sL https://ibm.biz/install-sysdig-agent | sudo bash -s -- --access_key ACCESS_KEY --collector COLLECTOR_ENDPOINT --collector_port 6443 --secure true --tags TAG_DATA --additional_conf 'sysdig_capture_enabled: false'
Where
-
ACCESS_KEY is the ingestion key for the instance.
-
COLLECTOR_ENDPOINT is the ingestion URL for the region where the monitoring instance is available.
-
TAG_DATA are comma-separated tags that are formatted as TAG_NAME:TAG_VALUE. You can associate one or more tags to your monitoring agent. For example, role:serviceX,location:us-south. Later on, you can use these tags to identify metrics from the environment where the agent is running.
-
The SECURE flag must be set to true to use a secure SSL/TLS connection to send metrics to the collector.
-
Set sysdig_capture_enabled to false to disable the capture feature. By default, it is set to true. For more information, see Working with captures.
If cURL is not available, you must install it. For example, for an Ubuntu bare metal server, run the following command: sudo apt-get update. Then, run the install command: sudo apt-get install curl.
For example, see the following sample command to install a monitoring agent that forwards metrics to a monitoring instance in US South (Dallas):
curl -sL https://ibm.biz/install-sysdig-agent | sudo bash -s -- -a xxxxxxxxxxxxx -c ingest.us-south.monitoring.cloud.ibm.com --collector_port 6443 --secure true -ac "sysdig_capture_enabled: false" --tags sourceType:baremetal,location:dallas
-
Configure the agent for non-orchestrated environments.
Open the dragent.yaml file that is located in /opt/draios/etc/.
Add the following configuration parameter:
feature:
  mode: monitor_light
Restart the agent. Run the following command:
service dragent restart
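The install command in the previous steps has several placeholders, and a malformed tag list is easy to miss. As an illustration only, the following sketch checks the TAG_NAME:TAG_VALUE format and prints the assembled command without running it. The function names valid_tags and build_install_cmd are hypothetical, and the check assumes that tag names and values contain no colons or commas.

```shell
#!/bin/sh
# valid_tags TAGS: succeeds when TAGS is a comma-separated list of
# TAG_NAME:TAG_VALUE pairs, for example role:serviceX,location:us-south.
# Assumption: names and values contain no ':' or ',' characters.
valid_tags() {
    printf '%s\n' "$1" | grep -Eq '^[^:,]+:[^:,]+(,[^:,]+:[^:,]+)*$'
}

# build_install_cmd ACCESS_KEY COLLECTOR_ENDPOINT TAGS: prints the install
# command from this topic without running it.
build_install_cmd() {
    valid_tags "$3" || { echo "invalid tags: $3" >&2; return 1; }
    printf '%s' "curl -sL https://ibm.biz/install-sysdig-agent | sudo bash -s -- "
    printf '%s' "--access_key $1 --collector $2 --collector_port 6443 --secure true "
    printf '%s\n' "--tags $3 --additional_conf 'sysdig_capture_enabled: false'"
}

# Usage: review the printed command, then paste it into the ssh session.
#   build_install_cmd "$ACCESS_KEY" ingest.us-south.monitoring.cloud.ibm.com role:serviceX,location:us-south
```

Printing the command instead of running it keeps the access key out of any automated step until you have reviewed the final line.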
Launch the monitoring UI to verify that you are getting data to monitor the bare metal server
Complete the following steps to launch the web UI:
-
Log in to your IBM Cloud account.
After you log in with your user ID and password, the IBM Cloud console opens.
-
Click the Menu icon
> Observability.
-
Select Monitoring.
The list of instances that are available on IBM Cloud is displayed.
-
Select your instance. Then, click Open dashboard.
It might take some time before you see the bare metal entry while the information is initially collected and processed by the monitoring agent.
You can monitor only one instance per browser. You can have multiple tabs open for the same instance.
Monitor your bare metal
In the Advisor tab, you can monitor and troubleshoot the health, risk, and capacity of hosts and Kubernetes clusters.
- Data is refreshed every 10 minutes.
- Metrics are prioritized by event count and severity.
- For more information, see Advisor.
In the Advisor section, choose to monitor by host. Check out the predefined dashboards that you can use to monitor the health of your resources.
When you choose to monitor by host, you can choose any of the following dashboards:
- Host Resource Usage
- File System Usage & Performance
- Memory Usage
- Network
- Sysdig Agent Health & Status
[Optional] Configure the Prometheus IPMI Exporter to monitor sensor metrics
In addition to the set of metrics that are automatically collected by the monitoring agent, you might want to collect other metrics such as sensor metrics. You can use the Prometheus IPMI Exporter to collect Intelligent Platform Management Interface (IPMI) device sensor metrics from the bare metal server.
- The Prometheus IPMI Exporter supports local IPMI devices and remote devices that can be accessed by using Remote Management Control Protocol (RMCP).
- When you use RMCP to access remote devices, you can use an IPMI exporter to monitor multiple IPMI devices. You identify each device by passing the target hostname as a parameter.
- The IPMI exporter relies on tools from the FreeIPMI suite.
You can collect the following metrics when you configure the IPMI exporter in a bare metal server:
-
IPMI admin metrics
The metric ipmi_up{collector="<NAME>"} reports 1 when the data for that collector is retrieved successfully from the host. It reports 0 otherwise.
The metric ipmi_scrape_duration_seconds reports the amount of time that it takes the collector to retrieve the data.
-
IPMI System event log (SEL) metrics
The metric ipmi_sel_entries_count reports the number of entries in the system event log.
The metric ipmi_sel_free_space_bytes reports the number of free bytes for new system event log entries.
-
IPMI sensor data
The IPMI exporter collects 2 metrics per sensor type: state and value. For the state metric, a value of 0 reports a normal state, a value of 1 reports a warning state, a value of 2 reports a critical state, and a value of NaN reports that the information is not available. For example, see the metrics for different sensors:
Temperature sensor metrics: ipmi_temperature_celsius, ipmi_temperature_state
Fan speed sensor metrics: ipmi_fan_speed_rpm, ipmi_fan_speed_state
Voltage sensor metrics: ipmi_voltage_state, ipmi_voltage_volts
-
IPMI chassis power state of the machine
The metric ipmi_chassis_power_state reports the current power state of the chassis of the machine. It has a value of 1 when the power is on. It has a value of 0 when the power is off.
-
DCMI data
The metric ipmi_dcmi_power_consumption_current_watts reports the live power consumption of the machine in watts.
-
BMC details
The metric ipmi_bmc_info includes information about the firmware revision and manufacturer in labels and has a value of 1.
For more information, see Prometheus IPMI Exporter.
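To make the sensor state values concrete, the following sketch maps the state values that are described above to labels and lists the sensors that are not in a normal state. It is an illustration only. The function names are hypothetical, and the parsing assumes Prometheus text-format lines without timestamps, where the sample value is the last field. The chassis power state is deliberately excluded because its values have a different meaning (1 means power on).

```shell
#!/bin/sh
# state_label VALUE: maps a sensor state sample value to a readable label.
state_label() {
    case "$1" in
        0) echo normal ;;
        1) echo warning ;;
        2) echo critical ;;
        NaN) echo unavailable ;;
        *) echo unknown ;;
    esac
}

# abnormal_states: reads Prometheus text-format lines on stdin and prints
# "<series> <label>" for each temperature, fan speed, or voltage state
# sample whose value is not 0.
abnormal_states() {
    grep -E '^ipmi_(temperature|fan_speed|voltage)_state[{ ]' | while read -r line; do
        value="${line##* }"   # the sample value is the last field
        series="${line% *}"   # everything before the last space
        [ "$value" = "0" ] || echo "$series $(state_label "$value")"
    done
}

# Usage, assuming the exporter's metrics were saved to a file:
#   abnormal_states < metrics.txt
```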
Complete the following steps to configure the Prometheus IPMI Exporter:
Install the Prometheus IPMI exporter
Complete the following steps:
-
From a local terminal, download the Prometheus IPMI exporter.
-
In the bare metal server, from the ssh session, create the directory /usr/monitor. Run the following commands:
cd /usr
mkdir monitor
-
Copy the file to the bare metal server. From the directory where the file is available, run the following command:
scp ipmi_exporter-v1.2.0.linux-amd64.tar.gz root@<IP_ADDRESS>:/usr/monitor/
Where <IP_ADDRESS> is the public IP address of the bare metal server.
If the command fails, check that your VPN connection is still open.
-
In the bare metal server, from the ssh session, uncompress the file. Run the following commands:
cd /usr/monitor/
tar -xvf ipmi_exporter-v1.2.0.linux-amd64.tar.gz
-
In the bare metal server, from the ssh session, install the FreeIPMI suite. Run the following commands:
sudo apt-get update
sudo apt-get install freeipmi
-
In the bare metal server, from the ssh session, check the ipmi_local.yml file. Optionally, you can update the file to exclude sensors that you do not want to monitor.
Change to the directory where you have extracted the IPMI exporter:
cd ipmi_exporter-v1.2.0.linux-amd64/
Check the configuration file. Run the command:
more ipmi_local.yml
You should see a file with similar content.
# Configuration file for ipmi_exporter

# This is an example config for scraping the local host.
# In most cases, this should work without using a config file at all.
modules:
  default:
    # Available collectors are bmc, ipmi, chassis, dcmi, and sel
    collectors:
    - bmc
    - ipmi
    - dcmi
    - chassis
    - sel
    # Got any sensors you don't care about? Add them here.
    exclude_sensor_ids:
    # - 2
-
In the bare metal server, from the ssh session, run the IPMI exporter.
./ipmi_exporter --config.file=ipmi_local.yml &
-
Check that the IPMI exporter is running. Run the command:
ps -aux | grep ipmi
You should see the IPMI exporter running.
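A running process does not guarantee that metrics are being served. As a rough additional check, the following sketch fetches the exporter's metrics page and counts the IPMI series. The function names are hypothetical, and the default URL assumes that the exporter listens on port 9290, the port that the scrape configuration in this topic targets.

```shell
#!/bin/sh
# count_ipmi_series: reads Prometheus text-format output on stdin and prints
# the number of sample lines whose metric name starts with "ipmi_".
# Comment lines (# HELP, # TYPE) are not counted.
count_ipmi_series() {
    grep -c '^ipmi_' || true
}

# check_exporter [URL]: fetches the metrics page and prints how many IPMI
# series it exposes. Assumption: the exporter listens on port 9290.
check_exporter() {
    url="${1:-http://localhost:9290/metrics}"
    curl -s "$url" | count_ipmi_series
}

# Usage on the bare metal server:
#   check_exporter
```

A count of 0 suggests that the exporter is not reachable or that no collector returned data.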
Install the Prometheus exporter
The monitoring agent automatically collects metrics from Prometheus exporters. Therefore, to collect metrics from your IPMI exporter, you must also configure the Prometheus exporter.
Complete the following steps to run the Prometheus exporter:
-
From a local terminal, download the Prometheus exporter.
-
In the bare metal server, from the ssh session, change to the directory /usr/monitor/. Run the following command:
cd /usr/monitor/
-
Copy the file to the bare metal server. From the directory where the file is available, run the following command:
scp prometheus-2.18.1.linux-amd64.tar.gz root@<IP_ADDRESS>:/usr/monitor/
Where <IP_ADDRESS> is the public IP address of the bare metal server.
If the command fails, check that your VPN connection is still open.
-
In the bare metal server, from the ssh session, uncompress the file. Run the following commands:
cd /usr/monitor/
tar -xvf prometheus-2.18.1.linux-amd64.tar.gz
-
Modify the prometheus.yml file to include the scrape configuration for the IPMI exporter.
Change to the Prometheus directory:
cd prometheus-2.18.1.linux-amd64/
Edit the prometheus.yml file and add the section scrape_configs:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: ipmi
    metrics_path: '/metrics'
    scheme: http
    static_configs:
    - targets: ['localhost:9290']
      labels:
        instance: baremetal01
        region: us-south
-
Run the Prometheus exporter:
./prometheus &
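Indentation mistakes in prometheus.yml are easy to make when you edit the file by hand. The following sketch shows two checks that you might run before you start Prometheus. It assumes that the promtool binary from the extracted Prometheus archive is in the current directory. The function names are hypothetical, and has_scrape_target is only a plain substring check on the targets lines, not a YAML parser.

```shell
#!/bin/sh
# validate_config [FILE]: runs the syntax check that promtool provides.
# Assumption: promtool is in the current directory.
validate_config() {
    ./promtool check config "${1:-prometheus.yml}"
}

# has_scrape_target FILE TARGET: succeeds when a "targets:" line in FILE
# contains TARGET as a substring, for example localhost:9290.
has_scrape_target() {
    grep -E '^[[:space:]]*-?[[:space:]]*targets:' "$1" | grep -Fq "$2"
}

# Usage from the Prometheus directory:
#   validate_config prometheus.yml && has_scrape_target prometheus.yml localhost:9290
```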
Configure network settings
If you want to collect metrics from remote servers, complete the following steps:
-
Configure the firewall to allow access to the ipmi_exporter.
-
[Optional] Update the VPC rules
If you use private endpoints, add an inbound rule to the security group for port 9290 with source type = Security Group and choose the security group for the bare metal server.
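How you open the port depends on the firewall that runs on the server. As one possible illustration, the following sketch prints a ufw rule for port 9290. It assumes an Ubuntu server that uses ufw, which this topic does not state. The function name ufw_allow_rule is hypothetical, and the function only prints the rule so that you can review it before you run it.

```shell
#!/bin/sh
# ufw_allow_rule PORT [SOURCE_IP]: prints (does not run) a ufw rule that
# opens PORT for TCP traffic, optionally only from SOURCE_IP.
# Assumption: the server uses ufw as its firewall front end.
ufw_allow_rule() {
    case "$1" in
        ''|*[!0-9]*) echo "port must be a number: $1" >&2; return 1 ;;
    esac
    if [ -n "${2:-}" ]; then
        echo "sudo ufw allow from $2 to any port $1 proto tcp"
    else
        echo "sudo ufw allow $1/tcp"
    fi
}

# Usage:
#   ufw_allow_rule 9290 <SCRAPER_IP>
```

Restricting the rule to the address of the host that scrapes the metrics keeps the exporter from being reachable by every host on the network.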
Update the monitoring agent that is running in the bare metal server
Complete the following steps:
-
In the bare metal server, from the ssh session, change to the directory /opt/draios/etc/. Run the following command:
cd /opt/draios/etc/
-
Update the /opt/draios/etc/dragent.yaml file.
Append the following section to the dragent.yaml file:
prometheus:
  enabled: true
  interval: 30
  log_errors: true
  max_metrics: 3000
  max_metrics_per_process: 3000
  max_tags_per_metric: 20
  process_filter:
    - include:
        port: 9090
        conf:
          port: 9090
          path: "/metrics"
    - include:
        port: 9290
        conf:
          port: 9290
          path: "/metrics"
-
Restart the monitoring agent. Run the following command:
service dragent restart
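If you repeat these steps, appending the section a second time would leave a duplicate prometheus key in dragent.yaml. The following sketch appends the section from this topic only when it is not already present and keeps a backup copy first. The function name append_prometheus_section and the .bak suffix are hypothetical choices.

```shell
#!/bin/sh
# append_prometheus_section FILE: appends the prometheus section from this
# topic to FILE unless a top-level "prometheus:" key already exists.
# A copy of the original file is kept as FILE.bak.
append_prometheus_section() {
    file="$1"
    if grep -q '^prometheus:' "$file"; then
        echo "prometheus section already present in $file"
        return 0
    fi
    cp "$file" "$file.bak"
    # Make sure the existing content ends with a newline before appending.
    if [ -n "$(tail -c 1 "$file")" ]; then
        echo >> "$file"
    fi
    cat >> "$file" <<'EOF'
prometheus:
  enabled: true
  interval: 30
  log_errors: true
  max_metrics: 3000
  max_metrics_per_process: 3000
  max_tags_per_metric: 20
  process_filter:
    - include:
        port: 9090
        conf:
          port: 9090
          path: "/metrics"
    - include:
        port: 9290
        conf:
          port: 9290
          path: "/metrics"
EOF
}

# Usage on the bare metal server (as root):
#   append_prometheus_section /opt/draios/etc/dragent.yaml && service dragent restart
```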
Verify that you can see the Prometheus IPMI metrics
Complete the following steps:
-
Click the Menu icon
> Observability.
-
Select Monitoring.
-
Identify the monitoring instance that you created. Then, click Open dashboard.
-
In the Explore view, select Hosts and Containers. Then, select the bare metal server that you want to monitor.
Figure: Hosts and Containers view
-
Open the option to select more Dashboards and Metrics. Then, enter ipmi in the search bar. The list of IPMI metrics is displayed.
Figure: IPMI metrics
Configure a dashboard to analyze the IPMI status of your bare metal server
To create a dashboard to monitor the IPMI metrics, complete the following steps:
-
Select the ipmi_up metric.
Figure: ipmi_up metrics
-
Select the 3 dots icon. Then, select Copy to dashboard.
Figure: Copy dashboard
-
Enter the name [Bare Metal] IPMI monitoring. Then, click Copy and Open.
Figure: Copy and open a dashboard
The dashboard opens.
Figure: IPMI custom dashboard
-
Add more IPMI metrics to the [Bare Metal] IPMI monitoring custom dashboard. Repeat the steps for each of the IPMI metrics that you want to monitor.
-
Drag and drop, and resize panels to get the dashboard layout that you want. Save the layout.
Next steps
-
Create a custom dashboard. For more information, see Working with dashboards.
-
Learn about alerts. For more information, see Working with alerts.
-
Learn how to manage logs. See Getting started with IBM Cloud Logs.
-
Learn about the IBM Cloud Monitoring Workload Protection functionality to find and prioritize software vulnerabilities, detect and respond to threats, and manage configurations, permissions and compliance from source to run. See Getting started with IBM Cloud® Security and Compliance Center Workload Protection.