Era Software

Writing real-time data with Vector

This page shows how to use Vector to write real-time data to EraSearch. In this guide, you'll:

  • Generate sample real-time logs and store them in files
  • Configure Vector to collect, transform, and write the logs to EraSearch
  • View the logs in the EraSearch UI

While the steps below use log data stored in files, you can customize the setup to use any Vector source, including Fluent, AWS Kinesis Firehose, and Kubernetes logs.

Before you begin

This content is intended for engineers and developers using EraSearch on EraCloud. To create an EraCloud account, visit the Getting started series. You'll need your EraSearch Service URI and API key to complete the steps below.

This page also assumes you have Vector and log-generator installed on your machine.

Instructions

Step 1: Configure log-generator

This section uses log-generator's sample configuration to send Apache 2.4 access logs to a file called apache.log.

To configure log-generator, navigate to a new directory and create a file called apache.config. Next, paste in this content, replacing YOUR_FILE_PATH with the path to your new directory:

name: Apache General Access
file: YOUR_FILE_PATH/apache.log
format: "{log_ip} - - [{log_time} +0000] \"{log_method} {log_path} HTTP/1.1\" {log_status} {log_bytes}"
frequency:
  seconds: 5
offset:
  seconds: 0
jitter:
  seconds: 5
amount: 50
fields:
  log_ip:
    type: ip
  log_time:
    type: timestamp
    format: "%d/%b/%Y:%H:%M:%S"
  log_method:
    type: enum
    values: [POST, GET, PUT, PATCH, DELETE]
  log_path:
    type: enum
    values:
      - /auth
      - /alerts
      - /events
      - /playbooks
      - /lists
      - /fieldsets
      - /customers
      - /collectors
      - /parsers
      - /users
  log_status:
    type: enum
    values: [200, 201, 204, 300, 301, 400, 401, 403, 404, 500, 503]
  log_bytes:
    type: integer
    min: 2000
    max: 5000
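
To get a feel for the lines this configuration should produce, the sketch below fills the same format string with random values in Python. It does not call log-generator; the helper name sample_line is hypothetical, the path list is shortened, and the value generation only approximates what the tool does with each field type.

```python
import random
from datetime import datetime, timezone

# Format string copied from apache.config (YAML escaping removed).
LOG_FORMAT = (
    '{log_ip} - - [{log_time} +0000] '
    '"{log_method} {log_path} HTTP/1.1" {log_status} {log_bytes}'
)


def sample_line(rng: random.Random, now: datetime) -> str:
    """Fill the template with one value per field (hypothetical helper)."""
    fields = {
        # Assumption: any dotted quad is an acceptable stand-in for "type: ip".
        "log_ip": ".".join(str(rng.randint(1, 254)) for _ in range(4)),
        # Same strftime pattern as the log_time field in apache.config.
        "log_time": now.strftime("%d/%b/%Y:%H:%M:%S"),
        "log_method": rng.choice(["POST", "GET", "PUT", "PATCH", "DELETE"]),
        # Shortened list; the config defines ten paths.
        "log_path": rng.choice(["/auth", "/alerts", "/events"]),
        "log_status": rng.choice(
            [200, 201, 204, 300, 301, 400, 401, 403, 404, 500, 503]
        ),
        "log_bytes": rng.randint(2000, 5000),
    }
    return LOG_FORMAT.format(**fields)


# Print one illustrative line; the values differ on every run.
print(sample_line(random.Random(), datetime.now(timezone.utc)))
```

Each line follows the Apache access log layout: client IP, two placeholder dashes, a bracketed timestamp, the quoted request, the status code, and the response size.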

Step 2: Start log-generator

In the same directory, enter this command to start log-generator:

$ log-generator apache.config

When successful, your terminal displays "Starting normal execution" and "Loaded: apache.config", followed by INFO logs every five seconds.

Note: If you've installed log-generator and the command above returns "command not found: log-generator", you may need to add the directory where Python installs command-line scripts to your PATH.
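
If you want to confirm that log-generator is actually appending to apache.log, one option is to compare line counts a few seconds apart. The following is a minimal sketch; the helper names count_lines and is_growing are hypothetical, and the 15-second default wait is an assumption based on the 5-second frequency and jitter in apache.config.

```python
import time
from pathlib import Path


def count_lines(path: str) -> int:
    """Return the number of lines in the file, or 0 if it does not exist yet."""
    p = Path(path)
    if not p.exists():
        return 0
    with p.open("r", encoding="utf-8", errors="replace") as f:
        return sum(1 for _ in f)


def is_growing(path: str, wait_seconds: float = 15.0) -> bool:
    """Compare two line counts taken wait_seconds apart."""
    before = count_lines(path)
    time.sleep(wait_seconds)
    return count_lines(path) > before


# Example usage (run from the directory that contains apache.log):
#   print(is_growing("apache.log"))
```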

Step 3: Configure Vector

Next, configure Vector to collect logs from apache.log, parse them into structured fields, and send them to EraSearch. In the same directory as your apache.config file, create vector.toml and paste in the content below, replacing:

  • YOUR_FILE_PATH with your directory's file path
  • YOUR_SERVICE_URI with your EraSearch Service URI
  • YOUR_API_KEY with your EraSearch API key
  • YOUR_INDEX_NAME with the name of the target EraSearch index (EraSearch creates the index for you)

Note: This guide uses the Elasticsearch sink to let Vector work with EraSearch. That workflow is possible because the EraSearch REST API supports much of the Elasticsearch API.

# Collect logs from apache.log
[sources.myfile]
type = "file"
include = ["YOUR_FILE_PATH/apache.log"]
read_from = "beginning"

# Parse logs
[transforms.parse_logs]
type = "remap"
inputs = ["myfile"]
source = '''
. = parse_apache_log!(string!(.message), format: "common")
'''

# Send logs to EraSearch
[sinks.erasearch]
type = "elasticsearch"
inputs = ["parse_logs"]
endpoint = "YOUR_SERVICE_URI"
tls.verify_hostname = false
request.headers.Authorization = "Bearer YOUR_API_KEY"
healthcheck.enabled = false
request.concurrency = "adaptive"
index = "YOUR_INDEX_NAME"
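
To illustrate what the parse_logs transform does conceptually, the sketch below parses a line in the shape produced in Step 1 with a Python regular expression. This is not Vector's implementation: the regex is a simplified assumption, and the field names (host, identity, user, timestamp, method, path, protocol, status, size) are chosen to resemble what an Apache log parser typically emits, so treat them as illustrative rather than the exact keys stored in EraSearch.

```python
import re
from typing import Optional

# Simplified pattern for lines like:
#   10.0.0.1 - - [02/Jan/2024:03:04:05 +0000] "GET /auth HTTP/1.1" 200 2345
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<identity>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line: str) -> Optional[dict]:
    """Turn one raw access-log line into a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    doc = match.groupdict()
    # Convert numeric fields so they can be filtered and aggregated as numbers.
    doc["status"] = int(doc["status"])
    doc["size"] = None if doc["size"] == "-" else int(doc["size"])
    return doc


example = '10.0.0.1 - - [02/Jan/2024:03:04:05 +0000] "GET /auth HTTP/1.1" 200 2345'
print(parse_line(example))
```

One difference worth noting: this sketch returns None for a line it cannot parse, whereas the "!" suffix on the VRL function call makes the transform raise an error for that event instead.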

Step 4: Start Vector

In the same directory, enter this command to start Vector:

$ vector --config ./vector.toml

When successful, your terminal outputs several INFO logs about Vector.
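
For context on what Vector sends once it is running: an Elasticsearch-style sink batches events into bulk requests, where each document is preceded by an action line and the body is newline-delimited JSON. The sketch below builds such a body in Python. It is an illustration of the general Elasticsearch bulk format, not a capture of Vector's actual requests; the helper name build_bulk_body and the sample document are hypothetical, and how closely EraSearch follows this format is an assumption.

```python
import json


def build_bulk_body(index: str, docs: list) -> str:
    """Build a newline-delimited bulk body: one action line, then one document line."""
    if not docs:
        return ""
    lines = []
    for doc in docs:
        # Action line: index the following document into the named index.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # Bulk bodies end with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical parsed event, shaped like the output of the parsing step.
sample_docs = [{"host": "10.0.0.1", "method": "GET", "path": "/auth", "status": 200}]
print(build_bulk_body("YOUR_INDEX_NAME", sample_docs), end="")
```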

Step 5: View your data in the EraSearch UI

To view your log data, sign in to your EraCloud account and click EraSearch UI. Next, navigate to the FILTERS tab and select the index you configured in vector.toml. You may need to refresh the UI if the index you specified is new.

The EraSearch UI displays your log data organized by time. Each document has a unique numerical identifier (_lid) and stores its data as field key-value pairs.

Next steps

That's it. Your EraSearch instance is now receiving real-time log data. To learn more about writing data to EraSearch, visit the Era Software blog and Writing bulk data from files. For more information about Vector, including what logs you can collect and how to configure the Elasticsearch sink, visit the Vector documentation.