Era Software

Collecting Cloudflare Logs

Cloudflare logs are rich in data and can provide wonderful value—here's how to collect them in EraSearch.

Why collect Cloudflare logs?

Too often, when we use a CDN, we set it and forget it. Wouldn't it be great to watch the traffic coming to your website through the CDN? Maybe you want to understand what the CDN is not allowing through, watch for errors at your origin, or look for speed issues along the path from your customers all the way to the origin. Whatever your reason, we have a solution for you.

Why use EraSearch for Cloudflare Logs?

We at Era Software believe that consolidating logs across the enterprise (on-prem cloud, public cloud, or both) lets you gain more knowledge from those logs. We believe logging should be easy and should reduce your management costs. We offer EraSearch both as a SaaS offering and as software you can run in your own Kubernetes cluster.

We developed EraDB so that any number of APIs can be built on top of it. Our first API is an "Elasticsearch-like interface that is built from the ground up to be optimized for ingesting, indexing, and storing logs while also leveraging the best properties of a cloud-native architecture. We’ve built EraSearch to realize that dream."

Using the Elasticsearch API, our customers can use the tools they already know and love and keep more of the data at a lower cost.

Logical view of the design


To collect the Cloudflare logs, we use Vector to receive the logs that Cloudflare pushes in Splunk format and insert them into EraSearch. Vector is a log collector written in Rust that makes a great front end to EraSearch. In this demo setup, Vector mimics the Splunk HTTP Event Collector (HEC). The logs are transformed a bit and then sent to EraSearch through the Elasticsearch-compatible API support built into Vector.

Cloudflare setup

First, you need to set up a token in the Cloudflare API or use your root API key. We already had automation set up, so we created a key for demo purposes. This key has zone.Logs permissions.


After you have created this key, you will need to send a curl request to Cloudflare with a few key pieces of information.

  • ZoneID: The ID of the domain you want to send logs for. You will need to run the curl command once for every zone. The ZoneID can be found at the bottom right of your domain's homepage.
  • Email: The email address on the Cloudflare account
  • API Auth key: This is the key that was created above
  • Splunk Endpoint: This is the IP address where Vector will be set up to receive the "Splunk" logs
  • ChannelID: This is the unique channel that this data will come in on. You need a unique ID for each Cloudflare hostname

After you have the above data, you can use it in the curl command below:

curl -s -H "Authorization: Bearer ${INSERTAuthKey}" \
"<ZONE_ID>/logpush/jobs" -X POST \
-d '{"name":"<HOSTNAME>","destination_conf": "splunk://<SplunkEndpoint>:8088/services/collector/raw?channel=<ChannelID>&insecure-skip-verify=true&sourcetype=cloudflare:json&header_Authorization=<Auth_Key>","logpull_options": "fields=RayID,EdgeStartTimestamp,CacheCacheStatus,CacheTieredFill,ClientASN,ClientCountry,ClientDeviceType,ClientIPClass,ClientMTLSAuthCertFingerprint,ClientMTLSAuthStatus,ClientSSLCipher,ClientSSLProtocol,ClientSrcPort,ClientTCPRTTMs,ClientXRequestedWith,ClientRequestBytes,ClientRequestHost,ClientRequestMethod,ClientRequestPath,ClientRequestProtocol,ClientRequestReferer,ClientRequestScheme,ClientRequestSource,ClientRequestURI,ClientRequestUserAgent,EdgeCFConnectingO2O,EdgeColoCode,EdgeColoID,EdgeEndTimestamp,EdgePathingOp,EdgePathingSrc,EdgePathingStatus,EdgeRateLimitAction,EdgeRateLimitID,EdgeRequestHost,EdgeResponseBodyBytes,EdgeResponseBytes,EdgeResponseCompressionRatio,EdgeResponseContentType,EdgeResponseStatus,EdgeServerIP,EdgeTimeToFirstByteMs,FirewallMatchesActions,FirewallMatchesRuleIDs,FirewallMatchesSources,OriginDNSResponseTimeMs,OriginIP,OriginRequestHeaderSendDurationMs,OriginSSLProtocol,OriginTCPHandshakeDurationMs,OriginTLSHandshakeDurationMs,OriginResponseBytes,OriginResponseDurationMs,OriginResponseHTTPExpires,OriginResponseHTTPLastModified,OriginResponseTime,OriginResponseStatus,OriginResponseHeaderReceiveDurationMs,WAFAction,WAFFlags,WAFMatchedVar,WAFProfile,WAFRuleID,WAFRuleMessage,WorkerSubrequestCount,WorkerSubrequest,WorkerStatus,WorkerCPUTime\u0026timestamps=rfc3339","dataset": "http_requests"}'
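
Because each Cloudflare hostname needs its own channel, it can help to generate the destination_conf string instead of editing it by hand. The following is a minimal sketch, not part of any Cloudflare or Vector tooling: the function names build_destination_conf and print_destinations are hypothetical, and the sample IP address, hostnames, and channel IDs in the comments are placeholders. It assumes none of the values contain spaces or other characters that would need URL encoding.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, chosen for illustration.

# Assemble the destination_conf value used in the Logpush job request.
#   $1 = host or IP where Vector listens for the "Splunk" logs
#   $2 = channel ID, unique per Cloudflare hostname
#   $3 = auth key passed through as the Authorization header
# Assumption: the values need no URL encoding (no spaces, '&', etc.).
build_destination_conf() {
  printf 'splunk://%s:8088/services/collector/raw?channel=%s&insecure-skip-verify=true&sourcetype=cloudflare:json&header_Authorization=%s' \
    "$1" "$2" "$3"
}

# Print one "<hostname><TAB><destination_conf>" line per input pair.
#   $1 = Vector endpoint, $2 = auth key
#   stdin = lines of "<hostname> <channel-id>"
print_destinations() {
  local endpoint="$1" auth_key="$2" hostname channel_id
  while read -r hostname channel_id; do
    [ -n "$hostname" ] || continue   # skip blank lines
    printf '%s\t%s\n' "$hostname" \
      "$(build_destination_conf "$endpoint" "$channel_id" "$auth_key")"
  done
}

# Example usage (placeholder values):
#   print_destinations 203.0.113.10 my-auth-key <<'EOF'
#   www.example.com channel-a
#   api.example.com channel-b
#   EOF
```

Each printed destination_conf can then be pasted into the "destination_conf" field of the request above, one request per zone.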

From there, check out our article on configuring Vector for Cloudflare if you are running your own Vector setup.