June 7th, 2021
Why I joined: A new Era of logs and metrics
Lots of logs everywhere
Throughout my life, logs have been central to helping me understand why things work the way they do. From a simple home automation setup to a complex multi-active data center website for United Airlines, logs have been and always will be at the center of the communication chain from developer to operator. Logs tell us what’s going on. As the number of log sources grows, being able to quickly parse and make sense of these logs can mean the difference between an operational outage and delighting your customer.
My journey has taken me through both the customer and vendor worlds. While I’ve always used logs, I didn't realize how important they were until 9/11, when a group of talented people were able to keep the United Airlines website available to provide as much information as we could to our customers.
I remember it like it was yesterday. We were running out of two co-los, one on the east coast and one on the west coast. We had just finished implementing ORCA, which would dig through our web server log files and create graphs. On 9/11 we were glued to those graphs, shifting traffic from the west coast to the east and back as the servers became taxed with traffic.
Shortly after 9/11 I remember the call to gather the logs from our web and app servers to send to the FBI. Back then, our Apache logs were a bit hard to understand, and after we sent them to the FBI they sent back a request to gather logs for a few PNRs (Passenger Name Records) of people who had booked on united.com. The FBI was looking for the IP addresses from which the bookings were made.
Another true story: when I was working on the digital team at L-Brands, we received a request from the FBI to help locate the phone of a missing girl. That missing girl was Jayme Closs. The FBI contacted us to see if she had logged into her account or used the victoriassecret.com website. She had, and we were able to find those IP addresses and send them to the FBI.
All this to say, you never know when a log will help save a life or help track down suspicious activity. Take a look at the news: cyber attacks are increasing every day and now ransomware is almost a weekly occurrence.
What the two examples above have in common is that they were both manual. We had to dig through terabytes of logs by hand, spending countless hours looking for the needle in the haystack. Why? Because the volume of logs we were producing couldn’t be handled by the systems of the time. The velocity was too high. We kept all of the logs, but searching through them was a “sed/awk/grep” nightmare. Writing map/reduce-like shell functions to look through that data and then aggregate the results made me wish we had something better and easier.
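To give a feel for that grep-then-aggregate pattern, here is a minimal Python sketch. It is not the original shell tooling, only an illustration: the function name, the sample log lines, and the `ref=ABC123` token are all hypothetical, and it assumes Apache Common Log Format, where the client IP is the first whitespace-separated field.

```python
from collections import Counter
from typing import Iterable


def client_ips_for_token(lines: Iterable[str], token: str) -> Counter:
    """Count client IPs on log lines that mention `token` (hypothetical helper)."""
    counts: Counter = Counter()
    for line in lines:
        if token not in line:       # "map": the grep step, keep matching lines only
            continue
        fields = line.split()
        if fields:                  # the awk '{print $1}' step: first field is the IP
            counts[fields[0]] += 1  # "reduce": the sort | uniq -c step
    return counts


# Made-up sample lines in Common Log Format (documentation-range IP addresses).
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2000:13:55:36 -0700] "GET /booking?ref=ABC123 HTTP/1.0" 200 2326',
    '198.51.100.4 - - [10/Oct/2000:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 1043',
    '203.0.113.7 - - [10/Oct/2000:13:57:12 -0700] "POST /booking?ref=ABC123 HTTP/1.0" 302 0',
]

if __name__ == "__main__":
    for ip, hits in client_ips_for_token(SAMPLE_LOG, "ref=ABC123").most_common():
        print(ip, hits)
```

Even this toy version shows the problem: every new question means another full pass over every line, which is manageable for three lines and painful for terabytes.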
Why Era Software
All of those struggles help me appreciate what we are doing at Era Software. We want to make logging easy and affordable. As it should be. One VP of Engineering put it well: "the cost of the log exceeds the value of the log, so I store only the highest value logs. There is so much potential value in the rest of the logs I don't store, but all the log vendors charge too much."
It doesn't have to be this way!
We are bringing in a new Era of straightforward, store-and-explore logging through stateless microservices that decouple storage from compute to allow nearly linear scaling, redefining the art of the possible. But that’s only the product side.
The team that has been gathered here is top-notch. Everyone is excited about redefining logging and databases at large, from single monolithic systems to systems of systems.
For these reasons and many more, I’ve dusted off my vendor hat and excitedly accepted the position of head of the solutions engineering team.