AWS
StackState Self-hosted v5.1.x
Amazon Web Services (AWS) is a major cloud provider. This StackPack enables in-depth monitoring of AWS services.
StackState Agent V3 collects all service responses from the target AWS account.
Topology is updated in real time:
Once an hour, all services are queried to gain a full point-in-time snapshot of resources.
Once a minute, CloudTrail and EventBridge events are read to find changes to resources.
Logs are retrieved once a minute from CloudWatch and a central S3 bucket. These are mapped to associated components in StackState.
Metrics are retrieved on-demand by the StackState CloudWatch plugin. These are mapped to associated components in StackState.
To set up the StackState AWS integration, you need to have:
AWS CLI version 2.0.4 or later installed in the environment where StackState is running.
The following AWS accounts:
At least one target AWS account that will be monitored.
An AWS account for StackState and the StackState Agent to use when retrieving data from the target AWS account(s). It's recommended to use a separate shared account for this and not use any of the accounts that will be monitored by StackState, but this isn't required.
A user or role with an attached policy that allows assuming the role StackStateAwsIntegrationRole in the account that will be monitored. For details, see the StackState docs on the required AWS policy.
It's recommended to have two different AWS accounts: One that's being monitored and another used for data collection.
Data Collection Account - used by StackState and StackState Agent to retrieve data from the Monitor Account. Requires an AWS policy granting permissions to assume the role StackStateAwsIntegrationRole in the Monitor Account.
To collect data from the AWS Monitor Account, StackState and StackState Agent require a policy that grants permission to assume the role StackStateAwsIntegrationRole. This role is created in each target AWS Monitor Account when the CloudFormation Stack is deployed. The policy should be created and attached to the AWS user(s) or IAM role(s) that will be used by StackState and the StackState Agent.
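As a rough illustration of such a policy, the sketch below builds a standard IAM policy document that allows sts:AssumeRole on the role named above. The helper name build_assume_role_policy and the wildcard account ID are assumptions made for this example; the policy shipped with StackState may differ in detail.

```python
import json

# Role name taken from the text above.
ROLE_NAME = "StackStateAwsIntegrationRole"


def build_assume_role_policy(account_id: str = "*") -> dict:
    """Return an IAM policy document allowing sts:AssumeRole on the integration role.

    The default "*" account ID is an assumption: it lets a single policy cover
    every Monitor Account. Pass a 12-digit account ID to restrict it.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": f"arn:aws:iam::{account_id}:role/{ROLE_NAME}",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON document, ready to paste into the IAM console.
    print(json.dumps(build_assume_role_policy(), indent=2))
```

Restricting the Resource to specific account IDs is the tighter choice when the set of Monitor Accounts is known in advance.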
The policy can be made available to StackState and the StackState Agent in one of the following ways:
If StackState or StackState Agent run on EC2 or EKS AND the Data Collection Account and Monitor Account are in the same AWS organization:
In all other situations:
StackState Agent collects topology, logs and (if configured) VPC flow logs, and StackState pulls CloudWatch metrics from AWS. If StackState Agent or StackState run in an AWS environment and the Data Collection Account and Monitor Account are in the same AWS organization, an IAM role can be attached to the EC2 instance or EKS pod that they run on and used for authentication. This removes the need to specify an AWS Access Key ID and Secret when a StackPack instance is installed or in the Agent AWS check configuration.
To attach an IAM role and use it for authentication:
Attach the created policy to the relevant IAM role:
To use the IAM role for StackState (CloudWatch metrics): When you install an AWS StackPack instance, set the following parameter values:
AWS Access Key ID: use-role
AWS Secret Access Key: use-role
To use the IAM role for StackState Agent:
Agent on EC2: When you configure the AWS check, leave empty quotes for the parameters aws_access_key_id and aws_secret_access_key.
Agent on EKS: When you configure the AWS check as a cluster check (required for an Agent running on Kubernetes), leave empty quotes for the parameters aws_access_key_id and aws_secret_access_key in the values.yaml file used to deploy the Cluster Agent.
The StackState AWS CloudFormation Stack should be deployed in each AWS account that you will monitor. It provides the minimum level of access required for StackState and the StackState Agent to collect topology, telemetry and logs.
The necessary resources can be deployed for one account in a single region using an automated CloudFormation template.
Ireland
Frankfurt
N. Virginia
Ohio
N. California
Hong Kong
Singapore
Sydney
The default StackState CloudFormation template can be used to deploy all necessary resources. It can be deployed to multiple AWS accounts and regions at once by deploying it in a CloudFormation StackSet. It's recommended to use this template as it provides an easy upgrade path for future versions and reduces the maintenance burden compared to creating a custom template.
The template requires the following parameters:
MainRegion - The primary AWS region. This can be any region, as long as this region is the same for every template deployed within the AWS account. Global resources, such as the IAM role and S3 bucket, will be deployed in this region. Example: us-east-1.
StsAccountId - The 12-digit AWS account ID to be monitored. This will be the AWS account that the IAM role can be assumed from, to perform actions on the target AWS account. Example: 123456789012.
ExternalId - A shared secret that the StackState Agent will present when assuming a role. Use the same value across all AWS accounts that the Agent is monitoring. Example: uniquesecret!1.
Post fix - Optional. Value to append to all resource names when deploying the stack multiple times in the same account.
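When the template is deployed from a script rather than the console, the required parameters are typically passed as a list of ParameterKey/ParameterValue pairs, which is the format CloudFormation accepts. The sketch below assembles that list for the three required parameters; the function name build_stack_parameters is hypothetical, and the optional post fix parameter is left out because its exact key isn't given here.

```python
import json
import re


def build_stack_parameters(main_region: str, sts_account_id: str, external_id: str) -> list:
    """Build the CloudFormation parameter list for the required template parameters."""
    # The text above states that the account ID is 12 digits long.
    if not re.fullmatch(r"[0-9]{12}", sts_account_id):
        raise ValueError("StsAccountId must be a 12-digit AWS account ID")
    values = {
        "MainRegion": main_region,
        "StsAccountId": sts_account_id,
        "ExternalId": external_id,
    }
    return [{"ParameterKey": key, "ParameterValue": value} for key, value in values.items()]


if __name__ == "__main__":
    # Example values taken from the parameter descriptions above.
    params = build_stack_parameters("us-east-1", "123456789012", "uniquesecret!1")
    print(json.dumps(params, indent=2))
```

Validating the account ID up front catches a mistyped value before CloudFormation creates any resources.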
Install the AWS StackPack from the StackState UI StackPacks > Integrations screen. You will need to enter the following details, which are used to configure the StackPack instance within StackState and to let StackState query live telemetry from the AWS account. To create topology in StackState, you will also need to configure the AWS check on StackState Agent V3.
Role ARN - the ARN of the IAM role created by the CloudFormation stack. For example, arn:aws:iam::<account id>:role/StackStateAwsIntegrationRole, where <account id> is the 12-digit AWS account ID being monitored.
External ID - a shared secret that StackState will present when assuming a role. Use the same value across all AWS accounts. For example, uniquesecret!1
To enable the AWS check and begin collecting topology and log data from AWS, add the configuration below to StackState Agent V3.
If you don't already have it, you will need to add the StackState helm repository to the local helm client:
Update the values.yaml file used to deploy the stackstate-agent with details of your AWS instance:
external_id - The same external ID used to create the CloudFormation stack in every account and region.
role_arn - In the example arn:aws:iam::123456789012:role/StackStateAwsIntegrationRole, substitute 123456789012 with the target AWS account ID to read.
regions - The Agent will only attempt to find resources in the specified regions. global is a special region for global resources, such as Route53.
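The parameters described above can be collected in one place before they are written into the Agent configuration. The sketch below is only an illustration: the function build_check_instance and the flat dictionary layout are assumptions, and the actual nesting of these keys in values.yaml or the check configuration file isn't shown here. The empty credential strings follow the IAM role approach described earlier.

```python
import re

# Role name taken from the role_arn example above.
ROLE_NAME = "StackStateAwsIntegrationRole"


def build_check_instance(account_id: str, external_id: str, regions: list) -> dict:
    """Collect the AWS check parameters for one monitored account (hypothetical layout)."""
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    if not regions:
        raise ValueError("at least one region is required")
    return {
        # Empty quotes: authenticate with an attached IAM role instead of keys.
        "aws_access_key_id": "",
        "aws_secret_access_key": "",
        "external_id": external_id,
        "role_arn": f"arn:aws:iam::{account_id}:role/{ROLE_NAME}",
        # "global" is the special region for global resources such as Route53.
        "regions": list(regions),
    }


if __name__ == "__main__":
    print(build_check_instance("123456789012", "uniquesecret!1", ["global", "eu-west-1"]))
```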
Deploy the checks_agent using the updated values.yaml:
Kubernetes:
OpenShift:
VPC FlowLogs can be analysed to retrieve relations between EC2 instances and RDS database instances. For each VPC that you want to analyse, a FlowLog needs to be configured. The process of adding FlowLogs for new VPCs could be automated using a Lambda triggered by a CloudTrail event that creates the FlowLog. Relations will be retrieved for EC2 instances and RDS database instances that have a static public or private IP address and emit the proper URNs. For public IP addresses, the URN has the form urn:host:/{ip-address}; for private IP addresses, it has the form urn:vpcip:{vpc-id}/{ip-address}.
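The two URN forms can be illustrated with a small helper. This is a sketch only: the function name relation_urn is hypothetical, and using Python's ipaddress module to decide whether an address is private is an assumption about how the distinction could be made, not a description of the Agent's implementation.

```python
import ipaddress


def relation_urn(ip_address: str, vpc_id: str) -> str:
    """Return the URN form described above for a public or private IP address."""
    ip = ipaddress.ip_address(ip_address)
    if ip.is_private:
        # Private addresses are only unique within a VPC, so the VPC ID is included.
        return f"urn:vpcip:{vpc_id}/{ip}"
    return f"urn:host:/{ip}"


if __name__ == "__main__":
    # "vpc-0abc" is a made-up VPC ID used for illustration.
    print(relation_urn("10.0.1.15", "vpc-0abc"))
    print(relation_urn("8.8.8.8", "vpc-0abc"))
```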
To configure a VPC FlowLog from the AWS console:
From the VPC Dashboard, choose Your VPCs under VIRTUAL PRIVATE CLOUD.
Select the VPC that you want to configure.
Select Flow logs on the lower tab bar.
Click Create flow log.
Add the settings as shown in the screenshot.
StackState and the StackState Agent require access to the internet to call the AWS APIs. If direct internet access can't be given, an HTTP proxy can be used to proxy the API calls.
In the StackState AWS integration, CloudWatch metrics are pulled directly by StackState, while events and topology data are collected by the StackState Agent. This means that proxy details must be configured in two places to handle all requests - StackState for CloudWatch metrics and StackState Agent for topology and events data.
To configure a proxy for CloudWatch metrics collected by StackState, follow the steps below:
In the StackState UI, go to Settings > Telemetry Sources > CloudWatch sources.
Find the CloudWatch source for which you want to configure a proxy. Open the ... menu to the right and select Edit.
Enter the proxy details in Proxy URI.
Click TEST CONNECTION to check that the proxy can be used to successfully connect to CloudWatch.
To check the status of the AWS integration, run the status subcommand and look for aws_topology under Running Checks:
The AWS integration retrieves the following data:
The AWS StackPack supports the following event:
EC2 Instance Run State: when the instance is started, stopped, or terminated. This will appear as the Run State in the EC2 instance component.
AWS events are primarily used to provide real-time updates to topology. These events aren't displayed as StackState events.
Metrics data is pulled at a configured interval directly from AWS by the StackState CloudWatch plugin. Retrieved metrics are mapped onto the associated topology component.
Service | Resource | Relations
API Gateway | Method | SQS Queue, Lambda Function
API Gateway | Method - HTTP Integration | -
API Gateway | Resource | API Gateway Method
API Gateway | Rest API | API Gateway Stage
API Gateway | Stage | API Gateway Resource
Auto Scaling | Group | EC2 Instance, Classic Load Balancer, Auto Scaling Target Group
CloudFormation | Stack | All Supported Resources*, Nested CloudFormation Stack
DynamoDB | Stream | -
DynamoDB | Table | DynamoDB Stream
EC2 | Instance | EC2 Security Group
EC2 | Security Group | EC2 Instance
EC2 | Subnet | EC2 Instance, EC2 VPC
EC2 | VPC | EC2 Security Group, EC2 Subnet
EC2 | VPN Gateway | EC2 VPC
ECS | Cluster | EC2 Instance, ECS Service, ECS Task, Route53 Hosted Zone
ECS | Service | Load Balancing Target Group, ECS Task
ECS | Task | -
Kinesis | Data Stream | Kinesis Firehose Delivery Stream
Kinesis | Firehose Delivery Stream | S3 Bucket
Lambda | Alias | -
Lambda | Function | All Supported Resources* (Input), EC2 VPC, Lambda Alias, RDS Instance**
Load Balancing | Application Load Balancer | EC2 VPC, Load Balancing Target Group, Load Balancing Target Group Instance
Load Balancing | Classic Load Balancer | EC2 Instance, EC2 VPC
Load Balancing | Network Load Balancer | EC2 VPC, Load Balancing Target Group, Load Balancing Target Group Instance
Load Balancing | Target Group | EC2 VPC
Load Balancing | Target Group Instance | EC2 Instance
RDS | Cluster | RDS Instance
RDS | Instance | EC2 VPC, EC2 Security Group
Redshift | Cluster | EC2 VPC
Route53 | Domain | -
Route53 | Hosted Zone | -
S3 | Bucket | Lambda Function
SNS | Topic | All Supported Resources*
SQS | Queue | -
Step Functions | Activity | -
Step Functions | State | Step Functions (All), Lambda Function, DynamoDB Table, SQS Queue, SNS Topic, ECS Cluster, API Gateway Rest API
Step Functions | State Machine | Step Functions (All)
* "All Supported Resources" - relations will be made to any other resource on this list, should the resource type support it.
** This relation is made by finding valid URIs in the environment variables of the resource. For example, the DNS hostname of an RDS instance will create a relation.
OpenTelemetry creates traces from the AWS services that your Lambdas interact with. Retrieved traces are available in the Traces Perspective and are also used to enhance the retrieved topology.
Hourly and event-based updates collect data:
Hourly full topology updates - collected by the StackState Agent using an IAM role with access to the AWS services.
Event-based updates for single components and relations - captured using AWS services and placed into an S3 bucket for the StackState Agent to read.
If the StackState Agent doesn't have permission to access a certain component, it will skip it.
The bare minimum necessary to run the StackState Agent is an IAM role with necessary permissions. The Agent will always attempt to fetch as much data as possible for the supported resources. If a permission is omitted, the Agent will attempt to create a component with the data it has.
For example, if the permission s3:GetBucketTagging is omitted, the Agent will fetch all S3 buckets and their associated configuration, but the tags section will be empty.
Once the Agent has finished reading a file in this bucket, the file will be deleted. Don't use an existing bucket for this; the Agent should have its own bucket to read from. To protect data, the S3 bucket won't be read from if it doesn't have bucket versioning enabled.
The S3 bucket is used to store all incoming events from EventBridge and other event-based sources. The Agent then reads objects from this bucket. These events are used for features such as real-time topology updates, and creating relations between components based on event data such as VPC FlowLogs. If the S3 bucket isn't available to the Agent it will fall back to reading CloudTrail directly, which introduces a 15-minute delay in real-time updates. EventBridge events and VPC FlowLogs are only available via the S3 bucket.
A catch-all rule that listens to all events for services supported by the AWS StackPack. All matched events are sent to a Kinesis Firehose delivery stream.
Kinesis Firehose is used to receive and batch events from EventBridge. This delivery stream batches events per 60 seconds and pushes an object to S3. 60 seconds is the recommended value - setting this value any higher will negligibly decrease storage costs while increasing the delay in topology updates.
The Prefix must be set to AWSLogs/${AccountId}/EventBridge/${Region}/, where ${AccountId} is the account ID and ${Region} is the region (for example, eu-west-1). Files must be compressed using the GZIP option.
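To make the required prefix concrete, the sketch below fills in the two placeholders for a given account and region. The helper name firehose_prefix is hypothetical, and the account ID in the usage line is a made-up example.

```python
import re


def firehose_prefix(account_id: str, region: str) -> str:
    """Return the S3 prefix the delivery stream must use for this account and region."""
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    # The trailing slash is part of the required prefix.
    return f"AWSLogs/{account_id}/EventBridge/{region}/"


if __name__ == "__main__":
    print(firehose_prefix("123456789012", "eu-west-1"))
```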
A KMS Customer Managed Key (CMK) can be used to secure data at rest in S3. The KMS key is used in the Firehose Delivery Stream. The S3 bucket also uses the KMS key as its default key.
Use of a KMS key isn't necessary for the operation of the StackPack. However, as encryption at rest is a requirement in most environments, the CloudFormation template includes this by default.
VPC FlowLogs support is currently experimental.
A VPC configured to send flow logs to the stackstate-logs-${AccountId} S3 bucket. The Agent requires the AWS default format for VPC FlowLogs, and expects data to be aggregated every 1 minute. The FlowLogs contain meta information about the network traffic inside VPCs. Only private network traffic is considered; traffic from NAT gateways and application load balancers will be ignored.
S3 objects that have been processed will be deleted from the bucket to make sure they won't be processed again. On the default S3 bucket, object versioning is enabled, which means that objects won't actually be deleted immediately. A lifecycle configuration will expire (delete) both current and non-current object versions after one day. When using a non-default bucket, you can set these expiry periods differently.
If configuring FlowLogs using CloudFormation, the stackstate-resources template exports the ARN of the S3 bucket it creates, so this can be imported into your template.
The AWS StackPack CloudFormation template deploys all resources necessary to run the AWS check on the StackState Agent. The installed resources are kept as minimal as possible. All costs incurred are minimal but variable, with costs scaling depending on how many events are emitted in a given account. In practice, the costs created by the AWS integration will be negligible.
AWS - [instance_name] - All - includes all resources retrieved from AWS by the StackPack instance.
AWS - [instance_name] - Infrastructure - includes only Networking, Storage and Machines resources retrieved from AWS by the StackPack instance.
AWS - [instance_name] - Serverless - includes only S3 buckets, lambdas and application load balancers retrieved from AWS by the StackPack instance.
For example, in the StackState Topology Perspective:
Components of type aws-subnet have the action Go to Subnet console, which links directly to this component in the AWS Subnet console.
Components of type ec2-instance have the action Go to EC2 console, which links directly to this component in the EC2 console.
The following labels will be added to imported AWS topology in StackState:
stackpack:aws-v2
All tags that exist for the imported element in AWS.
The special tags listed below can be added in AWS to influence how the imported topology is built in StackState:
stackstate-identifier - The specified value will be added as an identifier to the StackState component.
stackstate-environment - The StackState component will be placed in the specified environment.
Check the StackState support site for:
To uninstall the StackState AWS StackPack, click the Uninstall button from the StackState UI StackPacks > Integrations > AWS screen. This will remove all AWS specific configuration in StackState.
To delete the StackState AWS CloudFormation stack from an AWS account using the AWS web console:
Disable the EventBridge rule:
Go to EventBridge.
Find and open the rule with a name that starts with stackstate-resources-StsEventBridgeRule.
Click the Disable button.
Delete all FlowLogs that send to this bucket:
Go to the VPC service.
Select each VPC in the VPCs list.
Look in the FlowLogs tab in the Details section.
Delete any FlowLogs that are sent to the S3 bucket starting with stackstate-logs.
Delete objects in the S3 bucket:
Go to the S3 service.
Select (don't open) the bucket named stackstate-logs-${AccountId}, where ${AccountId} is the 12-digit identifier of your AWS account.
Select Empty and follow the steps to delete all objects in the bucket.
Delete the CloudFormation template:
Go to the CloudFormation service.
Select the StackState CloudFormation template. This will be named stackstate-resources if created via the quick deploy method, otherwise the name was user-defined.
In the top right of the console, select Delete.
To delete the StackState AWS CloudFormation stack from an AWS account using the AWS CLI:
Set the region to remove StackState resources from:
Set the S3 bucket that will be deleted:
Disable the EventBridge rule:
Delete all FlowLogs that send to this bucket:
Delete the CloudFormation template:
AWS StackPack v1.2.1 (2022-06-10)
Improvement: Documentation updated.
AWS StackPack v1.2.0 (2022-03-03)
Improvement: Added OpenTelemetry information STAC-15902
AWS StackPack v1.1.4 (2021-11-16)
Improvement: Updated AWS CLI prerequisite text
VPC FlowLogs are retrieved once a minute from the configured S3 bucket. Private network traffic inside VPCs is analysed to create relations between EC2 and RDS database components in StackState.
Communication between NodeJS Lambda functions and the AWS services that they communicate with is monitored using OpenTelemetry.
AWS is a .
StackState Agent V3 installed on a machine that can connect to both AWS and StackState.
Monitor Account - used to deploy the StackState AWS CloudFormation stack. The CloudFormation stack will create an IAM role that has the permissions required to retrieve data from this Monitor Account (StackStateAwsIntegrationRole).
StackState: Attach the policy to the user .
StackState Agent: Attach the policy to the user .
Note: The AWS Data Collection Account and Monitor Account must be a part of the same AWS organization to be able to authenticate using an IAM role in this way. For details, see the AWS documentation on .
If you did not already do so, in AWS, .
- Deploy all resources to a region in an account using a link.
- Download the StackState CloudFormation template to integrate into your own deployment workflow.
For special environments where the CloudFormation template may not function correctly, advanced AWS users can refer to the for a reference on all resources that must be manually created..
It's recommended to use the StackState CloudFormation template wherever possible as this provides an easy upgrade path for future versions and reduces the maintenance burden.
The table below includes links to deploy the template in popular AWS regions. For any regions not listed, follow the steps described for the .
To use OpenTelemetry tracing, set the IncludeOpenTelemetryTracing value to true.
IncludeOpenTelemetryTracing - Default: disabled
. Set to enabled
to include the OpenTelemetry layer in your deployment. Required to .
For more information on how to use StackSets, check the AWS documentation on .
AWS Access Key ID - The Access Key ID of the IAM user that will be used by StackState to collect CloudWatch metrics. This is the same as the IAM user used by the Agent to collect topology data and logs from AWS. If StackState is running within AWS, it may also be possible to authenticate using an attached IAM role.
AWS Secret Access Key - The Secret Access Key of the IAM user that will be used by StackState to collect CloudWatch metrics. This is the same as the IAM user used by the Agent to collect topology data and logs from AWS. If StackState is running within AWS, it may also be possible to authenticate using an attached IAM role.
If StackState Agent is running on Kubernetes, the AWS check should be configured as a cluster check.
aws_access_key_id - The AWS Access Key ID. Leave empty quotes to use an attached IAM role for authentication.
aws_secret_access_key - The AWS Secret Access Key. Leave empty quotes to use an attached IAM role for authentication.
to apply the configuration changes.
For further details, see .
To configure a proxy for events and topology data collected by StackState Agent, see how to .
Click UPDATE to save the proxy settings. Be aware that you may need to it before this succeeds.
The AWS service data shown below is available in StackState as components with the associated relations. The retrieved topology can be further enhanced by enabling .
A high-level overview of all resources necessary to run the StackState Agent with full capabilities is provided in the graph below. Users with intermediate to high level AWS skills can use these details to set up the StackState Agent resources manually. For the majority of installations, this isn't the recommended approach. Use the provided CloudFormation template unless there are environment-specific issues that must be worked around.
- Give permission for EventBridge to send data to Kinesis Firehose
- Gives permission for Firehose to send data to an S3 bucket.
A VPC FlowLog for each VPC that you want to analyse.
Kinesis Firehose: priced by the amount of data processed. Events use very small amounts of data.
S3: priced by amount of data stored, and amount of data transferred. Running the Agent inside of AWS will reduce data transfer costs.
KMS: a flat fee of $1 per month per key, with additional costs per request.
CloudWatch metrics: priced per metric retrieved. Metrics are only retrieved when viewed or when a check is configured on a CloudWatch metric.
When the AWS integration is enabled, three views will be created in StackState for each instance of the StackPack.
Components retrieved from AWS will have an additional action available in the component context menu and in the right panel details tab - Component details - when the component is selected. This provides a deep link through to the relevant AWS console at the correct point.
All tags
specified for the associated instance in the . You can add a custom label to all topology imported by an instance of the AWS StackPack by adding it to the Agent AWS check configuration.
Note that topology with the label stackpack:aws
was imported by the .
To clean up the remaining resources inside your AWS account, remove any configured VPC flow logs and delete the StackState AWS CloudFormation stack from the AWS account being monitored. This can be done using the AWS web console or the AWS CLI.
The steps below assume that you already have the AWS CLI installed and configured with access to the target account. If not, follow the AWS documentation to .
Delete objects in the S3 bucket. This is a versioned S3 bucket, so each object version will be deleted individually. Note that this command will fail if there are more than 1000 items in the bucket. In that case, it's likely more convenient to perform this step in the AWS web console.
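The 1000-item limit comes from the S3 DeleteObjects API, which accepts at most 1000 keys per request. One way to work around it in a script is to split the object versions into batches. The sketch below shows only the batching step and is an assumption about how this could be scripted, not the command from this guide; the function name delete_batches is hypothetical. Each yielded payload has the shape accepted as the Delete argument of the boto3 S3 client's delete_objects call, and the (key, version ID) pairs would come from listing the object versions and delete markers in the bucket.

```python
from itertools import islice

# S3 DeleteObjects accepts at most 1000 keys per request.
MAX_KEYS_PER_REQUEST = 1000


def delete_batches(versions):
    """Yield DeleteObjects payloads for an iterable of (key, version_id) pairs."""
    iterator = iter(versions)
    while True:
        chunk = list(islice(iterator, MAX_KEYS_PER_REQUEST))
        if not chunk:
            return
        yield {
            "Objects": [{"Key": key, "VersionId": version_id} for key, version_id in chunk],
            "Quiet": True,  # only report errors in the response
        }


if __name__ == "__main__":
    # Made-up keys and version IDs, used only to show the batch sizes.
    fake_versions = ((f"AWSLogs/object-{i}", "v1") for i in range(2500))
    print([len(batch["Objects"]) for batch in delete_batches(fake_versions)])
```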
Find out how to .