How to Send Amazon CloudWatch Logs to EraSearch for Cost-Effective Log Management

This blog will walk through the AWS and Vector configuration for centralizing CloudWatch logs in EraSearch.


Centralizing management of your logs from cloud environments in EraSearch provides a holistic view of operations at any point in time. While CloudWatch lets modern IT teams collect and monitor logs across their AWS application architecture and infrastructure, it limits those teams to a single use case: visibility into AWS products. In today's distributed environments, you might be using multiple cloud vendors, third-party services, and on-premises resources. CloudWatch is a great source of data, but if it isn't brought together with other data, it becomes another data silo.

EraSearch is an observability and analytics platform that offers scalable log ingestion, low-cost storage, and fast querying optimized for today's high-volume workloads. Built on an innovative object storage-based architecture to maximize cost savings, EraSearch lets teams pay for only the infrastructure resources they use. Best of all, with operational toil reduced, teams can quickly search for relevant data in the EraSearch UI and use alerting to notify them of application and infrastructure issues.

By centralizing all log data in EraSearch, teams can cost-effectively combine CloudWatch data with other data sources for a better understanding of end-to-end operations, effective troubleshooting, and trend analysis.

Stream CloudWatch logs to EraSearch in 5 steps

Step 1: Configure AWS products to log to CloudWatch

Most AWS products can be configured to send their logs to CloudWatch Logs; consult each service's documentation for the specific setting. Upon completion of this step, logs for your AWS products are flowing into CloudWatch and are ready to be sent to Amazon Kinesis.
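
If you want to confirm that logs are flowing before moving on, one quick check is to list your log groups with boto3. This is a minimal sketch; the region is a placeholder assumption, and it presumes your AWS credentials are already configured.

import boto3

# Assumes AWS credentials are already configured (e.g., via the AWS CLI).
logs = boto3.client("logs", region_name="us-east-1")

# List a few log groups to confirm your AWS products are writing to CloudWatch.
for group in logs.describe_log_groups(limit=10)["logGroups"]:
    print(group["logGroupName"])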

Step 2: Configure Kinesis Firehose

Set up a Kinesis subscription using the links below; a scripted sketch of the subscription filter follows the links. As a best practice, configure the Firehose requests to include an access key for authenticating requests. We'll need that same access key for our Vector configuration in the next step. After the subscription is set up, stream your CloudWatch logs to it.

https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CreateDestination.html

https://aws.amazon.com/premiumsupport/knowledge-center/streaming-cloudwatch-logs/

https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html#FirehoseExample
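
For reference, here is a minimal boto3 sketch of the subscription-filter call. The log group name, account ID, IAM role, and delivery stream name are placeholder assumptions; the delivery stream itself is created in Step 4, and the role must allow CloudWatch Logs to put records into Firehose.

import boto3

logs = boto3.client("logs", region_name="us-east-1")

# Subscribe a log group to the Firehose delivery stream (created in Step 4).
logs.put_subscription_filter(
    logGroupName="/aws/lambda/my-function",  # placeholder log group
    filterName="cloudwatch-to-erasearch",
    filterPattern="",  # an empty pattern matches all log events
    destinationArn="arn:aws:firehose:us-east-1:123456789012:deliverystream/cw-to-vector",
    roleArn="arn:aws:iam::123456789012:role/CWLtoFirehoseRole",
)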

Step 3: Install and configure Vector

If Vector is not already installed, install it using your deployment option of choice. Keep in mind that Vector must be reachable from AWS's Firehose service, so deploy it somewhere with a public IP/hostname and port; we'll need both when configuring the Firehose destination in Step 4. Once installed, create a Vector configuration that accepts data from the Kinesis Firehose as shown in the snippet below, where:

  • ${AKF_PWD} is an environment variable containing the access key from Step 2.

  • ${VECPORT} is an environment variable containing the port number on which Vector should listen for data.

# Global data directory where Vector stores state
data_dir = "/var/lib/vector"

# Accept records from the Kinesis Firehose delivery stream
[sources.akf]
type = "aws_kinesis_firehose"
address = "0.0.0.0:${VECPORT}"
access_key = "${AKF_PWD}"
record_compression = "auto"
# Firehose requires an HTTPS endpoint, so terminate TLS in Vector
tls.enabled = true
tls.crt_file = "/var/lib/vector/cert/example.com.crt"
tls.key_file = "/var/lib/vector/cert/example.com.key"

The configuration above allows Vector to receive the data provided by AWS. Once the data is received, we'll want to parse each record into JSON before sending it to EraSearch. To tell Vector to parse the record into JSON and add a timestamp, add the section below.

# Parse the CloudWatch log records as JSON and add a millisecond timestamp
# See the Vector Remap Language reference for more info: https://vrl.dev
[transforms.parse_logs_dev]
type = "remap"
inputs = ["akf"]
source = '''
. = parse_json!(.message)
._ts = to_unix_timestamp(now(), unit: "milliseconds")
'''

Lastly, send the data to EraSearch using an Elasticsearch sink configuration as shown below, where:

  • ${ERA_URL} is an environment variable containing the URL of EraSearch, for example https://erasearch.example.com:9200.

  • ${INDEX_NAME} is an environment variable containing the index name to use for storing this data. It can be any name you like; this example uses logs-akf.

  • ${ERAUSR} is an environment variable containing the basic auth username to use when authenticating with EraSearch.

  • ${ERAPWD} is an environment variable containing the basic auth password to use when authenticating with EraSearch.

[sinks.EraSearch]
type = "elasticsearch"
inputs = ["parse_logs_dev"]
endpoint = "${ERA_URL}"
bulk.index = "${INDEX_NAME}"
# for self-hosted users, use basic auth
auth.strategy = "basic"
auth.user = "${ERAUSR}"
auth.password = "${ERAPWD}"
# for EraCloud users, use API key auth
# request.headers.Authorization = "Bearer ${ERACLOUD_API_KEY}"
healthcheck = false

For more information about how to connect Vector to an EraSearch database, see the docs here.

Step 4: Send data to a Kinesis Firehose destination

Within Kinesis, choose a destination URL for sending data: the scheme/protocol, host name, and port (the ${VECPORT} variable above) where Vector was deployed in Step 3.
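
If you prefer to script this step, here is a hedged boto3 sketch of creating a delivery stream with an HTTP endpoint destination. The stream name, endpoint URL, access key value, IAM role, and S3 backup bucket are placeholder assumptions; note that Firehose requires the endpoint to be HTTPS, which is why TLS was enabled in the Vector source.

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

# Create a delivery stream whose HTTP endpoint is the Vector instance from Step 3.
# Firehose requires an S3 bucket for backing up records that fail delivery.
firehose.create_delivery_stream(
    DeliveryStreamName="cw-to-vector",
    DeliveryStreamType="DirectPut",
    HttpEndpointDestinationConfiguration={
        "EndpointConfiguration": {
            "Name": "vector",
            "Url": "https://vector.example.com:8443",  # scheme, host, and ${VECPORT}
            "AccessKey": "<value of AKF_PWD>",         # must match the Vector source
        },
        "S3BackupMode": "FailedDataOnly",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/FirehoseS3Role",
            "BucketARN": "arn:aws:s3:::my-firehose-backup",
        },
    },
)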

Step 5: View data in EraSearch

Now that your logs are being sent to EraSearch, it's easy to view them by creating a Grafana data source.

This walkthrough uses Grafana 8.1.1 and assumes you are logged in to Grafana and on the Data sources page, found under Configuration.

From the Data sources page, create a new Elasticsearch data source type. Enter values in the following fields:

  • Name is the name you want to give this Elasticsearch data source.

  • URL contains the EraSearch URL, the value of the ${ERA_URL} environment variable described in Step 3.

  • User contains the basic auth username to use when authenticating with EraSearch, the value of the ${ERAUSR} environment variable described in Step 3.

  • Password contains the basic auth password to use when authenticating with EraSearch, the value of the ${ERAPWD} environment variable described in Step 3.

Also toggle the Basic auth and With Credentials sliders so they show blue. 

For users in EraCloud, use your EraCloud API key as a custom header instead of the basic auth option.

After the values are inserted, scroll to the bottom of the page and enter the Index name used in the Vector configuration from Step 3. In the Time field name field, remove @timestamp and replace it with _ts. Select version 7.10+ and click Save & test.
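
If you would rather script the data source, the same settings can be posted to Grafana's HTTP API. This is a minimal sketch using Python's requests library; the Grafana URL and admin credentials are placeholders, and the exact jsonData keys can vary across Grafana versions.

import os
import requests

# Mirror the data source settings above, reusing the Step 3 environment variables.
payload = {
    "name": "EraSearch",
    "type": "elasticsearch",
    "access": "proxy",
    "url": os.environ["ERA_URL"],
    "basicAuth": True,
    "basicAuthUser": os.environ["ERAUSR"],
    "secureJsonData": {"basicAuthPassword": os.environ["ERAPWD"]},
    "withCredentials": True,
    "database": os.environ["INDEX_NAME"],  # index name from Step 3
    "jsonData": {"timeField": "_ts", "esVersion": "7.10.0"},
}

resp = requests.post(
    "https://grafana.example.com/api/datasources",  # placeholder Grafana URL
    json=payload,
    auth=("admin", "admin"),  # placeholder Grafana credentials
)
resp.raise_for_status()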

At this point you should have a valid data source and can explore your data, build a dashboard, and create alerts. For more information on how to connect Grafana to EraSearch, see the docs here.

To experience EraSearch’s fast data ingestion and search capabilities, sign up for a free trial today.
