Logging and Monitoring

Table of Contents

  1. Introduction
  2. Project Structure
  3. Core Components
  4. Architecture Overview
  5. Component Details
  6. Dependency Analysis
  7. Performance Considerations
  8. Troubleshooting Guide
  9. Conclusion
  10. Appendix

Introduction

This document describes Sparrow's logging and monitoring system. It covers the following topics:

  • Design architecture and responsibility boundaries
  • Log level management and runtime mode switching
  • Log rotation mechanism and archiving strategy
  • Structured logging and unified format
  • Log sampling and performance optimization
  • Configuration best practices and deployment recommendations
  • Integration ideas with third-party monitoring systems
  • Security and privacy protection, compliance requirements
  • Practical guidance for operations and development

Project Structure

The logging module lives in pkg/logger and builds high-performance, rotating log output on top of zap and lumberjack. Configuration is provided by pkg/config, and bootstrap assembles the global Logger during application startup.

Core Components

  • Logger Wrapper: A thin wrapper around zap.Logger that exposes Info/Error/Fatal/Panic/Debug/Warn and their formatted variants with a unified field style.
  • Development Mode Logger: Console output, colored levels and a short caller path, convenient for local debugging.
  • Production Mode Logger: lumberjack-based file rotation, JSON encoding and multi-target output (file + console), with caller and stack trace support.
  • Configuration System: LogConfig defines mode/level/format/output/filename; the global Config supplies default values and environment variable overrides; the loader supports configuration files in multiple formats and .env loading.
  • Application Assembly: App loads the configuration and creates the Logger at startup. The Logger lives for the entire application lifecycle, and subsystems report status, errors and warnings through the injected Logger.

Architecture Overview

The logging system adopts a "configuration-driven + mode switching + multi-encoder" architecture, which gives development and production environments a consistent logging experience and reliable output.

Component Details

Logger Wrapper and Level Management

  • Provides Info/Error/Fatal/Panic/Debug/Warn and their formatted methods, with unified field key names (such as timestamp, level, caller, message, stacktrace).
  • Level mapping: Both modes map LogConfig.Level to Debug/Info/Warn/Error/Fatal using the same rules.
  • Runtime mode: When mode is prod or production, the production mode logger is used; any other value selects the development mode logger.
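
The mapping and mode rules above can be sketched without any dependencies. This is a minimal illustration, not the code in pkg/logger: the Level type is a local stand-in for zapcore.Level, and the function names, the case-insensitive matching and the fallback to Info for unknown values are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Level is a local stand-in for zapcore.Level so the sketch stays dependency-free.
type Level int

const (
	LevelDebug Level = iota
	LevelInfo
	LevelWarn
	LevelError
	LevelFatal
)

// ParseLevel maps a LogConfig.Level string to a Level.
// Assumption: unknown or empty values fall back to LevelInfo.
func ParseLevel(s string) Level {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "debug":
		return LevelDebug
	case "warn":
		return LevelWarn
	case "error":
		return LevelError
	case "fatal":
		return LevelFatal
	default:
		return LevelInfo
	}
}

// IsProduction reports whether mode selects the production logger.
// Only "prod" and "production" do; every other value means development.
func IsProduction(mode string) bool {
	switch strings.ToLower(strings.TrimSpace(mode)) {
	case "prod", "production":
		return true
	}
	return false
}

func main() {
	fmt.Println(ParseLevel("warn") == LevelWarn)                  // true
	fmt.Println(IsProduction("production"), IsProduction("dev")) // true false
}
```

Because any unrecognized mode selects the development logger, a typo such as "prd" would silently produce console output in production, which is worth checking at startup.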

Production Mode Logger and Rotation

  • lumberjack Configuration: File size threshold, backup count, maximum retention days, compression switch, and use of local time in backup file names.
  • Encoder: JSON format with standardized field key names; time, level, caller and duration are each encoded by a single, consistent encoder.
  • Output Targets: Logs are written to both the file and the console, which supports production operation and auditing.
  • Caller and Stack Trace: Caller information is enabled and stack traces are emitted at error level, to help locate problems.
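
A production logger with these properties could be wired roughly as follows. This is a sketch under stated assumptions rather than Sparrow's actual constructor: newProductionLogger, the file path, the rotation numbers and the specific time/level/caller/duration encoders are illustrative choices; only the field key names come from this document.

```go
package main

import (
	"os"

	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
	"gopkg.in/natefinch/lumberjack.v2"
)

// newProductionLogger is a hypothetical constructor; the rotation values are placeholders.
func newProductionLogger(filename string, level zapcore.Level) *zap.Logger {
	rotator := &lumberjack.Logger{
		Filename:   filename,
		MaxSize:    100,  // megabytes per file before rotation
		MaxBackups: 7,    // rotated files to keep
		MaxAge:     30,   // days to keep rotated files
		Compress:   true, // gzip rotated files
		LocalTime:  true, // use local time in backup file names
	}

	// Field key names follow the standard keys listed in this document.
	encCfg := zapcore.EncoderConfig{
		TimeKey:        "timestamp",
		LevelKey:       "level",
		CallerKey:      "caller",
		MessageKey:     "message",
		StacktraceKey:  "stacktrace",
		LineEnding:     zapcore.DefaultLineEnding,
		EncodeTime:     zapcore.ISO8601TimeEncoder,
		EncodeLevel:    zapcore.LowercaseLevelEncoder,
		EncodeCaller:   zapcore.ShortCallerEncoder,
		EncodeDuration: zapcore.SecondsDurationEncoder,
	}

	// Dual-write: every entry goes to the rotating file and to the console.
	sink := zapcore.NewMultiWriteSyncer(
		zapcore.AddSync(rotator),
		zapcore.AddSync(os.Stdout),
	)
	core := zapcore.NewCore(zapcore.NewJSONEncoder(encCfg), sink, level)

	// Caller on every entry; stack traces only at error level and above.
	return zap.New(core, zap.AddCaller(), zap.AddStacktrace(zapcore.ErrorLevel))
}

func main() {
	logger := newProductionLogger("logs/app.log", zapcore.InfoLevel)
	defer logger.Sync()
	logger.Info("service started", zap.String("component", "bootstrap"))
}
```

Dropping the os.Stdout syncer from the sink is the smallest change that turns dual-write into file-only output.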

Development Mode Logger

  • Console Encoder: Colored levels, short caller path, ISO time format.
  • Log Level: Dynamically mapped according to LogConfig.Level.
  • Applicable Scenarios: Local development, quick debugging, CI environment.

Configuration Loading

  • LogConfig Fields: mode, level, format, output, filename.
  • Global Config Default Values: log.level, log.format, log.output, log.filename, log.mode, etc.
  • loader: Supports configuration files in multiple formats, .env loading, environment variable overrides, and snake-case key-name replacement.
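
The interaction between defaults and environment overrides can be illustrated with the standard library alone. In this sketch the default values, the function names and the "log.level" to LOG_LEVEL mapping are assumptions made for illustration; the real loader relies on viper and gotenv and may use different names.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// LogConfig mirrors the five fields named above.
type LogConfig struct {
	Mode, Level, Format, Output, Filename string
}

// defaultLogConfig returns illustrative defaults; the real ones live in pkg/config.
func defaultLogConfig() LogConfig {
	return LogConfig{Mode: "dev", Level: "info", Format: "console", Output: "stdout"}
}

// envKey converts a dotted config key such as "log.level" to "LOG_LEVEL".
func envKey(key string) string {
	return strings.ToUpper(strings.NewReplacer(".", "_", "-", "_").Replace(key))
}

// applyEnv overrides each field whose environment variable is set and non-empty.
func applyEnv(cfg LogConfig) LogConfig {
	fields := map[string]*string{
		"log.mode":     &cfg.Mode,
		"log.level":    &cfg.Level,
		"log.format":   &cfg.Format,
		"log.output":   &cfg.Output,
		"log.filename": &cfg.Filename,
	}
	for key, dst := range fields {
		if v, ok := os.LookupEnv(envKey(key)); ok && v != "" {
			*dst = v
		}
	}
	return cfg
}

func main() {
	os.Setenv("LOG_LEVEL", "warn")
	cfg := applyEnv(defaultLogConfig())
	fmt.Println(cfg.Level) // warn
}
```

The precedence shown here is default value first, then environment variable; a configuration file would sit between the two.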

Application Assembly

  • App.NewApp: Loads configuration, creates Logger and injects into global App.
  • Subsystems (database, message bus, task scheduling, etc.) use Logger to record status and errors during initialization, forming a unified observability baseline.

Dependency Analysis

  • zap and lumberjack: The production mode logger relies on lumberjack for file rotation and on zap for high-performance encoding and multi-target output.
  • Configuration Parsing: viper and gotenv provide configuration loading and environment variable overrides.
  • Application Assembly: App creates the Logger once during startup, so individual modules do not initialize their own loggers.

Performance Considerations

Encoding and Output

  • Production mode uses JSON encoding, which suits downstream log collection and search. Consider limiting console output to development mode to avoid redundant output in production.
  • Tune the lumberjack rotation parameters to the available disk capacity and I/O capability: single file size, backup count, maximum retention days, compression switch.
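
One way to relate rotation parameters to disk capacity is a rough worst-case estimate: the active file plus every retained backup, each at full size. The RotationPolicy type and its methods below are hypothetical helpers, not part of lumberjack; the estimate ignores compression and age-based cleanup, both of which only lower real usage.

```go
package main

import "fmt"

// RotationPolicy holds two of the lumberjack-style knobs discussed above.
type RotationPolicy struct {
	MaxSizeMB  int // size at which the active file is rotated
	MaxBackups int // rotated files kept; 0 means no limit by count
}

// WorstCaseMB returns a rough upper bound on disk usage in megabytes.
// bounded is false when the policy sets no positive size or backup limit,
// in which case usage is not capped by file count.
func (p RotationPolicy) WorstCaseMB() (mb int, bounded bool) {
	if p.MaxSizeMB <= 0 || p.MaxBackups <= 0 {
		return 0, false
	}
	return p.MaxSizeMB * (p.MaxBackups + 1), true
}

// FitsBudget reports whether the bound fits within budgetMB.
func (p RotationPolicy) FitsBudget(budgetMB int) bool {
	mb, bounded := p.WorstCaseMB()
	return bounded && mb <= budgetMB
}

func main() {
	p := RotationPolicy{MaxSizeMB: 100, MaxBackups: 7}
	mb, bounded := p.WorstCaseMB()
	fmt.Println(mb, bounded)        // 800 true
	fmt.Println(p.FitsBudget(1024)) // true
}
```

With 100 MB files and 7 backups the bound is 800 MB, so a 1 GB log volume leaves some headroom while a 512 MB volume does not.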

Log Level

  • Keep Debug/Info/Warn/Error/Fatal strictly distinct, and avoid emitting large volumes of Debug output in production.
  • On high-frequency paths, use Info/Warn only for key state; use Error for exceptions and Fatal for unrecoverable errors.

Caller and Stack Trace

  • caller and stacktrace help locate problems but add overhead; enable stack traces only at error level to limit the performance impact on regular logs.

Multi-target Output

  • Dual-writing to file and console can increase I/O pressure in production; enable console output only when necessary, or rely on an external log agent for centralized collection.

Troubleshooting Guide

Startup Failure

  • If configuration loading or validation fails, App.NewApp returns an error; check the configuration file format, key names and default values.
  • If logger creation fails, confirm that LogConfig.mode, level and filename are set to sensible values.
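
A startup check along these lines can catch bad values before the logger is built. This is a hypothetical sketch: the Validate method, the accepted value lists (including "development") and the rule that filename is required for file output are assumptions, and the check is deliberately stricter than the runtime behavior, which treats any unknown mode as development.

```go
package main

import (
	"fmt"
	"strings"
)

// LogConfig mirrors the fields named in this document.
type LogConfig struct {
	Mode, Level, Format, Output, Filename string
}

// oneOf reports whether v equals one of the allowed values.
func oneOf(v string, allowed ...string) bool {
	for _, a := range allowed {
		if v == a {
			return true
		}
	}
	return false
}

// Validate collects every problem it finds so one run reports them all.
func (c LogConfig) Validate() error {
	var problems []string
	if !oneOf(c.Mode, "dev", "development", "prod", "production") {
		problems = append(problems, fmt.Sprintf("log.mode: unsupported value %q", c.Mode))
	}
	if !oneOf(c.Level, "debug", "info", "warn", "error", "fatal") {
		problems = append(problems, fmt.Sprintf("log.level: unsupported value %q", c.Level))
	}
	if c.Output == "file" && c.Filename == "" {
		problems = append(problems, `log.filename: required when log.output is "file"`)
	}
	if len(problems) == 0 {
		return nil
	}
	return fmt.Errorf("invalid log config: %s", strings.Join(problems, "; "))
}

func main() {
	cfg := LogConfig{Mode: "prd", Level: "verbose", Output: "file"}
	if err := cfg.Validate(); err != nil {
		fmt.Println(err)
	}
}
```

Reporting all problems in one error message avoids a fix-and-restart cycle for each misconfigured key.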

Runtime Errors

  • When a component fails to initialize (for example a database or NATS connection failure), the error is written through the Logger as an error field; use the error field together with the stack trace to locate the problem.

Server Shutdown and Cleanup

  • App will record shutdown status and cleanup results during graceful shutdown; if cleanup fails, check error fields and confirm resource release order.

Conclusion

Sparrow's logging system is configuration-driven: development and production modes adapt it to each environment, and zap and lumberjack provide high-performance, maintainable log output. Unified structured fields and multi-target output serve both local development and production auditing and analysis. In production, enable file rotation and JSON encoding, choose log levels and a sampling strategy deliberately, and pair the logs with an external monitoring platform for alerting and visualization.

Appendix

Log Level and Runtime Mode Comparison

  • Development Mode: Map Debug/Info/Warn/Error/Fatal according to LogConfig.Level.
  • Production Mode: Set final level by same rules, enable file rotation and JSON encoding.

Structured Field Standards

  • Standard Field Key Names: timestamp, level, caller, message, stacktrace.
  • Encoder Unification: Production mode uses JSON encoder, development mode uses console encoder.

Deployment Recommendations

Development Environment

  • mode=dev, level=debug/info, format=console, output=stdout, filename optional.

Production Environment

  • mode=prod|production, level=info/warn/error, format=json, output=file, filename points to persistent directory.
  • Reasonably set rotation parameters: single file size, backup count, maximum retention days, compression switch.

Environment Variables

  • Use .env files and viper's AutomaticEnv so that sensitive values and runtime parameters can be supplied and controlled through the environment.

Monitoring Integration

Log Collection

  • Write production mode logs to files, then use a log agent (such as Fluent Bit or Filebeat) to collect them and forward them to a platform such as ELK or to an OTLP-compatible backend.

Alerting

  • Define alerting rules in the log platform based on level, message, caller and other fields, and correlate them with business metrics and distributed tracing.
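
As a minimal illustration of a level-based rule, the sketch below parses one JSON log line and flags entries at error level or above. The Entry type and ShouldAlert function are hypothetical, and the lowercase level names are an assumption about how the encoder writes levels; a real log platform would express the same rule in its own query language.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Entry holds the standard field keys an alert rule typically inspects.
// Unknown fields in the log line are ignored by encoding/json.
type Entry struct {
	Level   string `json:"level"`
	Caller  string `json:"caller"`
	Message string `json:"message"`
}

// ShouldAlert parses one JSON log line and reports whether its level
// is error or more severe.
func ShouldAlert(line []byte) (bool, error) {
	var e Entry
	if err := json.Unmarshal(line, &e); err != nil {
		return false, fmt.Errorf("parse log line: %w", err)
	}
	switch strings.ToLower(e.Level) {
	case "error", "dpanic", "panic", "fatal":
		return true, nil
	}
	return false, nil
}

func main() {
	line := []byte(`{"level":"error","caller":"db/conn.go:42","message":"database connection failed"}`)
	alert, err := ShouldAlert(line)
	fmt.Println(alert, err) // true <nil>
}
```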

Visualization

  • Use log platform dashboards to visualize error trends, response time distribution, caller hotspots.

Security and Privacy, Compliance

Sensitive Information Masking

  • Avoid logging plaintext passwords, tokens, national ID numbers and other sensitive data in message or other log fields; mask or hash such values when they must be recorded.
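
Masking and hashing can be as simple as the two helpers below, applied to a value before it is passed to the Logger. MaskSecret and Fingerprint are hypothetical names, and the 12-character fingerprint length is an arbitrary choice. Note that an unsalted hash of a low-entropy value, such as an ID number, can be reversed by brute force, so a keyed hash may be needed in practice.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// MaskSecret replaces all but the last keep characters with '*'.
// Values no longer than keep are masked entirely.
func MaskSecret(s string, keep int) string {
	r := []rune(s)
	if keep < 0 {
		keep = 0
	}
	if len(r) <= keep {
		return strings.Repeat("*", len(r))
	}
	return strings.Repeat("*", len(r)-keep) + string(r[len(r)-keep:])
}

// Fingerprint returns a short SHA-256 prefix so equal values can be
// correlated across log lines without recording the value itself.
func Fingerprint(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])[:12]
}

func main() {
	token := "sk-1234567890"
	fmt.Println(MaskSecret(token, 4)) // *********7890
	fmt.Println(len(Fingerprint(token)))
}
```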

Log Access Control

  • Restrict read permissions on log files so that only operations and audit personnel can access them; encrypt production logs both at rest and in transit.

Compliance Requirements

  • Meet GDPR, China's cybersecurity Multi-Level Protection Scheme (MLPS) and other applicable requirements; define the log retention period and destruction process, and conduct compliance audits regularly.