Data Persistence
Table of Contents
- Introduction
- Project Structure
- Core Components
- Architecture Overview
- Detailed Component Analysis
- Dependency Analysis
- Performance Considerations
- Troubleshooting Guide
- Conclusion
- Appendix
Introduction
This document is aimed at developers working on data persistence. It systematically describes the Sparrow data persistence system: the repository pattern implementation, multi-datasource support, the transaction management mechanism, and the characteristics and applicable scenarios of each storage backend (memory, Redis, PostgreSQL, MongoDB, BadgerDB). It also covers repository interface design principles, data access patterns, query optimization strategies, approaches to migration and backup/recovery, and performance tuning recommendations.
Project Structure
Sparrow's persistence layer adopts a "generic repository + multi-backend adapter" architecture:
- The usecase layer defines the generic repository interface and query model
- The entity layer defines the entity contract
- persistence/repo contains the concrete implementations for each storage backend
- bootstrap and config provide container and configuration assembly
Core Components
Repository Interface and Query Model
- The Repository interface uniformly defines CRUD, batch operations, pagination, conditional queries, random sampling, and other capabilities
- The query model carries conditions, sorting, and pagination parameters
Base Repository
- Provides a default implementation skeleton and error-handling template that the concrete backend implementations reuse
Entity Contract
- A unified entity interface that requires getter and setter access to the ID and the create/update timestamp fields
Architecture Overview
Sparrow's persistence architecture follows an "interface segregation + backend adapter" design:
- A generic repository interface shields callers from differences between backends
- Each backend is implemented independently, ensuring extensibility and replaceability
- In SQL/PostgreSQL scenarios, transaction management is explicitly controlled by the implementation layer
- Configuration and the container are responsible for instance assembly and lifecycle management
Detailed Component Analysis
Repository Interface Design Principles
- Clear responsibility: a single interface carries the complete data access capability, avoiding excessive subdivision
- Unified conventions: all implementations follow the same context, error, and return-value specifications
- Extensibility: new capabilities are added through generics and the query model without breaking existing contracts
- Consistency: pagination, conditions, and sorting behave the same across all backends
Core Method Classification
- Single Entity: Save/FindByID/Update/Delete/Exists
- Batch: SaveBatch/DeleteBatch/FindByIDs
- List: FindAll/FindWithPagination/Count
- Conditional Query: FindByField/FindByFieldWithPagination/CountByField/FindWithConditions/CountWithConditions
- Random Sampling: Random
Query Optimization Strategy
Index and Scan
- Redis/MongoDB/Badger: the repository implementations do not use secondary indexes, so conditional queries fall back to full scans; combine them with pagination and narrow filter ranges
- PostgreSQL: supports single-field and composite indexes, so conditional queries and sorting can use indexes to improve performance
Pagination and Sorting
- Uniformly use Limit/Offset or a cursor-based approach; avoid loading the full dataset at once
- PostgreSQL defaults to sorting by creation time in descending order; indexes can be built on business fields
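A minimal sketch of the shared Limit/Offset semantics over an already-sorted result set (the helper name is an assumption; real backends push this down into the query):

```go
package main

import "fmt"

// paginate applies Limit/Offset semantics to an already-sorted slice.
// Every backend is expected to honor the same conventions: an offset
// past the end or a non-positive limit yields an empty page.
func paginate[T any](items []T, offset, limit int) []T {
	if offset >= len(items) || offset < 0 || limit <= 0 {
		return nil
	}
	end := offset + limit
	if end > len(items) {
		end = len(items)
	}
	return items[offset:end]
}

func main() {
	ids := []string{"a", "b", "c", "d", "e"}
	fmt.Println(paginate(ids, 2, 2)) // second page of size 2
}
```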
Batch Operations
- Redis uses Pipeline for batching
- PostgreSQL/SQL wrap batch inserts/updates in a transaction
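As an illustration of the transactional batch path, the following sketch composes one multi-row INSERT with PostgreSQL-style positional placeholders; the function and table names are assumptions for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// batchInsertSQL composes a single multi-row INSERT with PostgreSQL
// positional placeholders ($1, $2, ...), so N rows become one
// statement executed in one round trip instead of N.
func batchInsertSQL(table string, cols []string, rows int) string {
	groups := make([]string, 0, rows)
	arg := 1
	for r := 0; r < rows; r++ {
		ph := make([]string, len(cols))
		for c := range cols {
			ph[c] = fmt.Sprintf("$%d", arg)
			arg++
		}
		groups = append(groups, "("+strings.Join(ph, ", ")+")")
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES %s",
		table, strings.Join(cols, ", "), strings.Join(groups, ", "))
}

func main() {
	fmt.Println(batchInsertSQL("users", []string{"id", "name"}, 2))
}
```

In a real implementation the generated statement would run inside a transaction (`db.BeginTx` ... `tx.Commit`, with `tx.Rollback` on error), matching the rollback-on-failure behavior described above.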
Random Sampling
- PostgreSQL uses RANDOM()
- MongoDB uses $sample
- Redis/Badger sample via random key selection, or via a full scan followed by random extraction
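The "full scan then random extraction" fallback can be sketched as a shuffle over the scanned keys (names are illustrative):

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampleKeys illustrates the fallback used by backends without native
// sampling support: copy the scanned keys, shuffle, and take the
// first k. The copy keeps the caller's slice untouched.
func sampleKeys(keys []string, k int) []string {
	if k > len(keys) {
		k = len(keys)
	}
	out := make([]string, len(keys))
	copy(out, keys)
	rand.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	return out[:k]
}

func main() {
	fmt.Println(len(sampleKeys([]string{"a", "b", "c", "d"}, 2)))
}
```

Note the cost: unlike PostgreSQL's RANDOM() or MongoDB's $sample, this approach must first enumerate the candidate keys, so it scales with the size of the key space.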
Transaction Management
PostgreSQL
- SaveBatch and batch inserts/updates execute within a transaction, with automatic rollback on failure
- Soft delete is implemented by marking the deleted_at field
SQL Generic Repository
- Save, SaveBatch, and the other batch operations all execute within a transaction
- Supports soft delete (when a DeletedAt field exists) or hard delete
Redis/Badger/MongoDB
- No built-in transaction abstraction; consistency must be coordinated in the upper-level business use case
- Redis supports Pipeline to improve batch throughput (pipelining batches round trips but is not atomic by itself; MULTI/EXEC is needed for atomicity)
Storage Backend Characteristics and Applicable Scenarios
Memory Storage (Memory Cache/Temporary Data)
- Characteristics
- Complete CRUD and batch operations
- Concurrency-safe (guarded by a read-write lock)
- Supports random sampling, pagination, and conditional queries (via full table scan)
- Applicable Scenarios
- Test environments, temporary data, non-persistent cache layer
Redis Storage
- Characteristics
- Key-space prefixes isolate entity types
- TTL support and Pipeline batching
- Conditional queries via full table scan
- Applicable Scenarios
- Caching, sessions, message-queue event bus, event storage (ESDB)
PostgreSQL Storage
- Characteristics
- Complete ACID transactions, soft delete, indexes, and complex queries
- Supports batch operations and conditional queries
- Applicable Scenarios
- Master data, audit logs, and data requiring strong consistency
MongoDB Storage
- Characteristics
- Collection naming rules and BSON document storage
- Supports aggregation and random sampling
- Applicable Scenarios
- Logs, event sourcing, semi-structured data
BadgerDB Storage
- Characteristics
- LSM-tree storage with key-prefix scans
- No soft delete; conditional queries via full table scan
- Applicable Scenarios
- Local event storage, edge computing, embedded systems
Generic SQL Database Storage
- Characteristics
- Automatically detects a soft delete field and supports soft or hard delete
- Transactional batch operations
- Applicable Scenarios
- Compatibility requirements and legacy databases
Key Implementation Flow
- PostgreSQL Save Flow: open a transaction, insert or update the row (setting the create/update timestamps), then commit, rolling back on failure
- Redis Batch Save Flow: serialize the entities, queue prefixed SET commands (with TTL where configured) on a Pipeline, then execute the pipeline in one round trip
- Memory Conditional Query Flow: acquire the read lock, scan all entries, filter by the query conditions, then apply sorting and pagination
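The memory backend's conditional query flow can be sketched as follows; the `memoryStore` type and its string-valued fields are simplifications for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// memoryStore sketches the memory backend's conditional query flow:
// take a read lock, scan every entry (a full table scan), and keep
// the rows matching all field conditions.
type memoryStore struct {
	mu   sync.RWMutex
	rows map[string]map[string]string // id -> field -> value
}

func (s *memoryStore) FindWithConditions(conds map[string]string) []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var ids []string
	for id, row := range s.rows {
		match := true
		for field, want := range conds {
			if row[field] != want {
				match = false
				break
			}
		}
		if match {
			ids = append(ids, id)
		}
	}
	return ids
}

func main() {
	s := &memoryStore{rows: map[string]map[string]string{
		"1": {"status": "active"},
		"2": {"status": "disabled"},
	}}
	fmt.Println(s.FindWithConditions(map[string]string{"status": "active"}))
}
```

The RWMutex allows concurrent readers while still serializing writes, which matches the "concurrency-safe (read-write lock)" characteristic listed for the memory backend.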
Dependency Analysis
Interface and Implementation
- The usecase Repo interface is implemented by all concrete repositories
- BaseRepository provides the default skeleton for each implementation
Entity Contract
- The generic type parameters of all repository implementations must satisfy the Entity interface
Configuration and Container
- bootstrap/Database uniformly wraps the Redis, SQL, and Badger clients
- bootstrap/Container provides dependency injection and a singleton cache
Performance Considerations
Redis
- Use Pipeline for batch writes, and set TTLs sensibly
- Keep key prefixes short; overly long prefixes hurt KEYS/MGET performance
PostgreSQL
- Build indexes on frequently queried fields
- Use transactional batch operations to reduce round-trip overhead
- Avoid deep pagination (very large Offset values); consider cursor-based pagination instead
MongoDB
- Use the aggregation pipeline and indexes to avoid full collection scans
- Use $sample judiciously for random sampling
BadgerDB
- Pagination is implemented via key-prefix scans; be aware of the cost of full traversals
- Tune compaction and threshold parameters to balance write amplification against read amplification
General
- Prefer transactions or Pipeline for batch operations
- Narrow the scope of conditional queries as much as possible, and combine them with pagination and sorting
Troubleshooting Guide
Common Error Types
- RepositoryError: empty entity ID, entity not found, operation failed, etc.
- SQL/Redis/MongoDB client errors: connection failure, timeout, key not found
Troubleshooting Steps
- Verify the entity ID and entity type
- Check the backend connection configuration and network connectivity
- Inspect transaction status and rollback logs (PostgreSQL)
- Analyze the Pipeline/transaction boundaries of batch operations
Recommendations
- Add retries and circuit breaking on critical paths
- Record contextual information (trace ID, entity ID) for critical operations
Conclusion
Sparrow's persistence system delivers a unified data access experience and flexible deployment choices through its generic repository interface and multi-backend adapters. In practice, choose the backend that matches the business requirements for consistency, availability, and performance, and apply indexing, pagination, and batching strategies to improve overall performance.
Appendix
Data Migration and Backup Recovery
PostgreSQL
- Use logical backup tools for periodic snapshots
- During migration, pay attention to the soft delete field and to rebuilding indexes
Redis
- Use RDB/AOF persistence strategies and AOF rewrite
- During migration, keep key-space prefixes and TTLs consistent
MongoDB
- Use replica sets and sharding, with periodic backups
- During migration, pay attention to collection naming and indexes
BadgerDB
- Back up and restore at the file-system level
- During migration, pay attention to directory permissions and threshold configuration
Operation and Maintenance Recommendations
Configuration Management
- Manage connection parameters uniformly through config/*
- Environment isolation: use separate configurations for development, test, and production
Monitoring and Alerting
- Key metrics: connection count, QPS, slow queries, error rate
- Backend-specific: Redis memory usage, PostgreSQL WAL, MongoDB oplog, Badger compaction
Container and Assembly
- Use bootstrap/Container to manage dependencies and lifecycles
- Database wraps the various clients uniformly, making replacement and testing easier