- Logs do matter: a well-organized logging system is a must, https://github.com/dwarvesf/playbook/blob/master/engineering/log.md
- Our infrastructure is using Loki for log management.
- For any DevOps-related commands, Quang recommends using K9s to add a bit of visualization.
Minh, our engineer, bumped into some issues handling the database in a new project, so we gathered to discuss points of view as well as possible solutions.
Quang: If the infrastructure uses K8s or Docker, reading logs with Loki is the best. It’s an open-source project from Grafana Labs, launched in Nov 2019. Loki is built on the idea of indexing only metadata about your logs: labels (just like Prometheus labels). The log data itself is then compressed and stored in chunks in object stores like S3, GCS, or a local filesystem. A small index and highly compressed chunks simplify operation and significantly lower the cost of running Loki.
Minh: Okay, back to a lower-level concern: I only have read access to the logs. Do I need higher permissions to install/hook Loki into the system?
Quang: You'd better ask the PIC (person in charge) to install this stack. It brings pretty cool benefits: we can query logs over a time range, so we can narrow them down and even pinpoint bugs. In other words, we get a precise observation of what's wrong immediately.
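Loki's query language, LogQL, is what makes this time-based narrowing concrete. A minimal sketch of two queries; the `app` and `namespace` labels here are hypothetical and depend on how your scrape config attaches labels:

```logql
# All production logs from a hypothetical app label, keeping only lines containing "error"
{app="payment-service", namespace="production"} |= "error"

# Count error lines per minute over the query range, to narrow down when a bug appeared
count_over_time({app="payment-service"} |= "error" [1m])
```

The second form is useful for the "narrow down" step: a spike in the per-minute count points you at the window worth reading line by line.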
Minh: How about using K9s? Does it have the same benefits?
Quang: Actually, no. The newest version of K9s only lets us read logs within preset time windows: a minute ago, 5 minutes ago, 15 minutes ago, etc., or read all logs. You see, it’s obviously time-consuming and limited. That’s why I suggested Loki at the beginning.
Minh: K8s usually scales nodes by itself, deciding how many nodes a service runs on. Does each node have its own log?
Quang: Yes! Each node runs its own containers, and each container creates its own log.
Minh: When I test an application, I don’t know which node a log belongs to. Could K9s handle this problem?
Quang: Of course it can. Take an app deployed as a Deployment: if we look at the deployment layer, it breaks down into pods, and we can see which pod contains the log. Or we can go straight to the pod view.
When there are lots of pods, you can limit the time window to, say, the last minute, so after querying you get an idea of which pod contains the log. Besides, the K8s dashboard UI is another way to tackle this problem.
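Outside K9s, plain kubectl can do the same narrowing. A sketch, assuming a Deployment named `myapp` whose pods carry the label `app=myapp` (both names are hypothetical); it needs a live cluster, so treat it as illustrative:

```shell
# Logs from all pods behind the (hypothetical) app=myapp label,
# limited to the last minute, with each line prefixed by its pod name
kubectl logs -l app=myapp --since=1m --prefix --all-containers

# Or follow one pod of the deployment directly
kubectl logs deployment/myapp --follow
```

The `--prefix` flag answers exactly the "which pod contains the log" question, since every line is tagged with its pod/container of origin.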
Huy: How about setting up tools to read logs in a large microservices system?
Tay: For a large system, services feed into a single dashboard, like Grafana, so we can filter and search log queries by app or by container easily. Each programming language also has its own logging framework/library.
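Filtering by app or container works because each log line carries structured fields that a dashboard can match on. A minimal sketch with plain shell tools; the file path and field names are made up for illustration:

```shell
# Write a small sample of JSON-structured log lines, as a logging library might emit
cat > /tmp/sample-app.log <<'EOF'
{"ts":"2021-01-01T00:00:00Z","level":"info","container":"api","msg":"started"}
{"ts":"2021-01-01T00:00:01Z","level":"error","container":"worker","msg":"db timeout"}
{"ts":"2021-01-01T00:00:02Z","level":"error","container":"api","msg":"bad request"}
EOF

# Filter by container and level, the same narrowing a dashboard query performs
grep '"container":"api"' /tmp/sample-app.log | grep '"level":"error"'
```

This prints only the `api` container's error line; a dashboard query like Grafana's does the same thing against indexed fields instead of raw grep.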
Hieu: We have OpenTracing do the job. The span (its fundamental type) is identified by its unique ID, and may optionally include a parent identifier. If the parent identifier is omitted, we call that span the root span. The span also carries a human-readable operation name plus start and end timestamps. All spans of a request are grouped under the same trace identifier. Therefore, we can trace and define what’s going on in each request and service layer. It’s like visualizing the request: which service, where, and at what time. This is an advantage of OpenTracing compared to Loki.
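Hieu's description maps to a simple data shape. A hypothetical two-span trace (field names follow common OpenTracing/Jaeger conventions but are illustrative, not any tool's exact export format):

```json
[
  {
    "traceId": "abc123",
    "spanId": "span-1",
    "parentSpanId": null,
    "operationName": "HTTP GET /checkout",
    "startTime": "2021-01-01T00:00:00.000Z",
    "endTime": "2021-01-01T00:00:00.250Z"
  },
  {
    "traceId": "abc123",
    "spanId": "span-2",
    "parentSpanId": "span-1",
    "operationName": "SQL SELECT orders",
    "startTime": "2021-01-01T00:00:00.050Z",
    "endTime": "2021-01-01T00:00:00.200Z"
  }
]
```

The first span has no parent, so it is the root span; both share the same `traceId`, which is what groups them into one request when visualized.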
Quang: Great! So do we have any free Cloud Monitoring, like Datadog, to apply OpenTracing?
Hieu: That’s Elasticsearch, the best free service as far as I know. It comes with a plugin to visualize APM (Application Performance Monitoring). The remaining problem is writing logs with the right conventions.
- What is a log? Logs help you understand what your application is doing. Almost every application generates its own logs on the server that hosts it.
- K9s: a terminal-based UI to interact with your Kubernetes clusters.
- OpenTracing: vendor-agnostic API to help developers easily instrument tracing into their codebase. In October 2016, OpenTracing became a project under the guidance of the Cloud Native Computing Foundation.
Subscribe to “The Next Bytes”, where Han & the crew draft up our observations on the industry.