MongoDB Repository

Introduction
Project Structure
Core Components
Architecture Overview
Detailed Component Analysis
Dependency Analysis
Performance Considerations
Troubleshooting Guide
Conclusion
Appendix

Introduction

This document describes the MongoDB repository implementation in the Sparrow project. It covers the design philosophy, document model mapping, the BSON encoding/decoding mechanism, collection and index strategy, aggregation pipeline optimization, complex queries (nested documents, arrays, geospatial), pagination, transaction handling, connection configuration with replica set and sharding support, and performance and monitoring recommendations. It is intended as a usage guide for developers working with NoSQL storage.

Project Structure

The MongoDB repository lives in the persistence layer. It uses generics and interface abstraction, together with the unified interface constraints of the entity layer, to keep layers and responsibilities clearly separated:

Repository Layer: MongoDBRepository[T] implements the usecase.Repository[T] interface and encapsulates CRUD, pagination, conditional queries, random sampling, and related capabilities.
Entity Layer: The Entity interface and BaseEntity provide the unified ID, timestamp, and other common fields.
Use Case Layer: The Repository[T] interface and QueryOptions/QueryCondition define the query contracts.
Testing Layer: Integration tests run against a MongoDB container started with Testcontainers and cover typical scenarios and boundary conditions.

Core Components

MongoDBRepository[T]

The generic repository implementation. It is responsible for saving, querying, updating, and deleting entities, and for batch operations, pagination, conditional queries, and random sampling.

Entity/BaseEntity

The unified entity interface and base fields, which let the repository work with any entity type.

Repository[T]/QueryOptions

The repository interface and query contract, which support complex conditions and sorting.

Test Cases

Integration tests based on Testcontainers. They cover CRUD, pagination, conditional queries, the transaction placeholder, and complex entities (nested documents and arrays).

Architecture Overview

The MongoDB repository performs BSON encoding/decoding and queries through the Collection API of the MongoDB Go Driver. Internally it detects the ID type (ObjectID or custom string) and maintains the CreatedAt/UpdatedAt timestamps on save. Queries support pagination, sorting, and combined conditions, and the repository also provides random sampling and an entry point for aggregation pipelines.

Detailed Component Analysis

Document Model Mapping and BSON Encoding/Decoding

ID Mapping: First tries to convert the string ID to an ObjectID; if that fails, stores it as a custom string ID. Queries accept both ID forms.
Timestamps: Sets UpdatedAt automatically on every save; sets CreatedAt on insert if the entity has that field.
Field Mapping: Converts the entity struct to and from a BSON document with bson.Marshal/Unmarshal, then normalizes the _id field and removes the duplicate ID field before storage.
Nested Documents and Arrays: The test cases store and read back complex entities (such as orders) that contain nested structures and array fields.
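The ID mapping rule above can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual code: classifyID and the idKind constants are hypothetical names, and the check (exactly 24 hexadecimal characters) stands in for the ObjectID conversion that a driver-based implementation would perform.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// idKind says how a string ID would be stored in the _id field.
// The type and its constants are illustrative, not Sparrow names.
type idKind string

const (
	kindObjectID idKind = "objectid"
	kindString   idKind = "string"
)

// classifyID reports whether id has the shape of an ObjectID
// (exactly 24 hexadecimal characters, i.e. 12 bytes) or has to be
// kept as a custom string ID. An empty ID is rejected.
func classifyID(id string) (idKind, error) {
	if id == "" {
		return "", fmt.Errorf("id is empty")
	}
	if len(id) == 24 {
		if _, err := hex.DecodeString(id); err == nil {
			return kindObjectID, nil
		}
	}
	return kindString, nil
}

func main() {
	for _, id := range []string{"507f1f77bcf86cd799439011", "order-2024-0001"} {
		kind, err := classifyID(id)
		fmt.Println(id, kind, err)
	}
}
```

Because a custom string that happens to be 24 hex characters is treated as an ObjectID under this rule, callers that mix both ID forms may want to avoid such values.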

Collection and Field Design

Collection Naming: The default collection name is the lowercase form of the entity type's short name; a custom naming strategy can be added by extension.
Field Design: _id is the primary key; the created/updated timestamp fields follow the BaseEntity constraints.
Nested Documents and Arrays: Complex nested objects and array fields are supported; the test cases validate the order entity's items array and its nested address and payment information objects.
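The default naming rule described above can be illustrated with reflection. This is a sketch under assumptions: collectionName is a hypothetical helper, and Task and OrderItem here are placeholder structs rather than the project's real entities.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Placeholder entity types used only for this illustration.
type Task struct{ ID string }
type OrderItem struct{ SKU string }

// collectionName derives a default collection name from the entity's
// short type name, lowercased. Pointers are dereferenced first.
// A nil interface value yields an empty name.
func collectionName(entity any) string {
	t := reflect.TypeOf(entity)
	if t == nil {
		return ""
	}
	for t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return strings.ToLower(t.Name())
}

func main() {
	fmt.Println(collectionName(Task{}))
	fmt.Println(collectionName(&OrderItem{}))
}
```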

Index Strategy

Primary Key Index: The index on _id is created automatically by MongoDB.
Common Query Fields: Build single-field or compound indexes on frequently queried fields (such as email, status, customer_id).
Text Index: Build a text index on fields used for full-text search (such as name, title).
Geospatial Index: Build a 2dsphere index for GeoJSON location fields, or a 2d index for legacy coordinate pairs (such as coordinates).
TTL Index: Set a TTL index on a date field of session or temporary data so that expired documents are removed automatically.
Aggregation Optimization: Index the fields that aggregation pipelines commonly filter and sort on, which reduces the cost of the sorting and filtering stages.

Aggregation Pipeline Optimization (Design Recommendations)

Put $match as early as possible to push filter conditions down and reduce the number of documents that later stages process.
Use $project to trim output fields, which reduces network and memory pressure.
Place $limit directly after $sort so that only the top results are kept while sorting; back the sort with an index where possible.
With $lookup, filter before joining, or otherwise limit the size of the joined set.
Keep $group output small: group after filtering and accumulate only the fields that are needed.
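The stage ordering recommended above can be shown as a plain data structure. This sketch uses only the standard library and prints the pipeline as JSON; the field names (status, customer_id, amount) and the topCustomersPipeline helper are assumptions made for illustration. With the Go Driver, the same stages would typically be expressed with the driver's BSON document types instead of Go maps, which also preserve key order in multi-key stages.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stage is one aggregation stage, e.g. {"$match": {...}}.
type stage map[string]any

// topCustomersPipeline filters first, then groups, sorts, limits,
// and finally reshapes the output, so every later stage sees as few
// documents and fields as possible.
func topCustomersPipeline(status string, limit int) []stage {
	return []stage{
		{"$match": map[string]any{"status": status}},
		{"$group": map[string]any{
			"_id":   "$customer_id",
			"total": map[string]any{"$sum": "$amount"},
		}},
		{"$sort": map[string]any{"total": -1}},
		{"$limit": limit},
		{"$project": map[string]any{"_id": 0, "customer_id": "$_id", "total": 1}},
	}
}

func main() {
	out, err := json.MarshalIndent(topCustomersPipeline("paid", 10), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```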

Complex Query Operations

Nested Document Query: Nested fields are addressed with a dot path, such as shipping_address.city.
Array Operations: Array element matching, array length calculation, and membership queries within arrays are supported.
Geospatial Query: Spatial queries use operators such as $near and $geoWithin; $near requires a geospatial index, and $geoWithin benefits from one.
Condition Combination: QueryOptions/QueryCondition support multi-condition AND/OR, comparison operators, regex matching, IN/NOT_IN, null checks, and similar predicates.
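To make the condition combination concrete, the following sketch translates a list of conditions into a MongoDB filter document joined with $and. The condition struct, the operator names in mongoOps, and toFilter are hypothetical stand-ins; the real QueryCondition fields and operator constants in Sparrow may differ. The MongoDB operators on the right-hand side of the table ($eq, $in, $nin, $regex, and so on) are standard query operators.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition is a hypothetical, simplified query condition.
type condition struct {
	Field string // may be a dot path such as "shipping_address.city"
	Op    string // key into mongoOps
	Value any
}

// mongoOps maps illustrative operator names to MongoDB query operators.
var mongoOps = map[string]string{
	"EQ": "$eq", "NE": "$ne", "GT": "$gt", "GTE": "$gte",
	"LT": "$lt", "LTE": "$lte", "IN": "$in", "NOT_IN": "$nin",
	"REGEX": "$regex",
}

// toFilter combines all conditions with $and. An empty condition
// list produces an empty filter, which matches every document.
func toFilter(conds []condition) (map[string]any, error) {
	clauses := make([]any, 0, len(conds))
	for _, c := range conds {
		op, ok := mongoOps[c.Op]
		if !ok {
			return nil, fmt.Errorf("unsupported operator %q", c.Op)
		}
		clauses = append(clauses, map[string]any{
			c.Field: map[string]any{op: c.Value},
		})
	}
	if len(clauses) == 0 {
		return map[string]any{}, nil
	}
	return map[string]any{"$and": clauses}, nil
}

func main() {
	filter, err := toFilter([]condition{
		{Field: "status", Op: "EQ", Value: "paid"},
		{Field: "shipping_address.city", Op: "IN", Value: []string{"Berlin", "Paris"}},
	})
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(filter)
	fmt.Println(string(out))
}
```

An OR combination would follow the same shape with $or in place of $and.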

Pagination and Sorting

Pagination is provided through FindWithPagination and FindByFieldWithPagination.
Internally, the Limit and Skip settings of FindOptions control the page range.
Results are sorted by created_at in descending order by default; the sort fields can be customized.
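The Skip/Limit values for a page follow directly from the page number and page size. The helper below is a minimal sketch: pageWindow, the 1-based page numbering, and the default page size of 20 are assumptions, not values taken from the repository.

```go
package main

import "fmt"

// defaultPageSize is an assumed fallback used when size is not positive.
const defaultPageSize = 20

// pageWindow converts a 1-based page number and a page size into the
// skip and limit values passed to a find operation.
func pageWindow(page, size int) (skip, limit int64) {
	if page < 1 {
		page = 1
	}
	if size < 1 {
		size = defaultPageSize
	}
	return int64(page-1) * int64(size), int64(size)
}

func main() {
	skip, limit := pageWindow(3, 20)
	fmt.Println(skip, limit) // prints "40 20"
}
```

Skip-based paging makes the server walk past every skipped document, so very deep pages get slower; filtering on an indexed sort field (for example created_at less than the last value seen) is a common alternative for large collections.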

Transaction Processing

The repository provides a WithTransaction placeholder method that currently returns the repository itself; real transactions must be driven from a session in the calling layer.
The test cases show how to start a session and create a SessionContext, which leaves an extension point for later transaction support.

Connection Configuration and Replica Set/Sharding Support

Connection Method: The connection is established with mongo.Connect and options.Client().ApplyURI(uri).
Replica Set: The URI can contain the replica set name and the list of nodes; the driver handles primary failover automatically.
Sharded Cluster: The URI points at the cluster's mongos routers, which route each operation to the correct shard.
Production Recommendations: Tune the connection pool, set timeouts, and enable authentication and TLS; set an appropriate read preference for read-only queries.
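A replica set connection string with the production settings listed above can be assembled as follows. This is a sketch: buildURI, the host names, the database name, and the specific values (pool size 50, 5-second connect timeout, secondaryPreferred) are illustrative assumptions. The option names themselves (replicaSet, readPreference, maxPoolSize, connectTimeoutMS, tls) are standard MongoDB connection string options. Credentials are deliberately left out; the resulting string is what would be passed to ApplyURI.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildURI assembles a mongodb:// connection string for a replica set.
// Query options are emitted in sorted key order by url.Values.Encode.
func buildURI(hosts []string, db, replicaSet string) string {
	q := url.Values{}
	if replicaSet != "" {
		q.Set("replicaSet", replicaSet)
	}
	q.Set("readPreference", "secondaryPreferred") // offload read-only queries
	q.Set("maxPoolSize", "50")                    // assumed pool limit
	q.Set("connectTimeoutMS", "5000")             // assumed connect timeout
	q.Set("tls", "true")
	return "mongodb://" + strings.Join(hosts, ",") + "/" + db + "?" + q.Encode()
}

func main() {
	uri := buildURI([]string{"db1.example.com:27017", "db2.example.com:27017"}, "sparrow", "rs0")
	fmt.Println(uri)
}
```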

Dependency Analysis

The MongoDB repository depends on the entity interface and the use case layer contract, and talks to MongoDB through the Go Driver. The test layer uses Testcontainers to start a MongoDB container for integration verification.

Performance Considerations

Optimization Recommendations

Read-Write Separation: Set an appropriate read preference for read-only queries to reduce load on the primary.
Index Strategy: Build single-field or compound indexes that match the query patterns; use text or geospatial indexes where needed.
Aggregation Optimization: Reduce intermediate result sets and order aggregation stages so that filtering happens first.
Connection Pool: Configure the maximum number of connections, idle connections, and timeouts to avoid connection contention.
Monitoring and Alerting: Set up alerts for slow queries and high latency, and review hot queries regularly.

Query Plan Analysis and Memory Usage Monitoring

Query Plan: Inspect the execution plan with the database's explain function to locate bottlenecks.
Memory Monitoring: Watch memory usage in aggregation stages so that large result sets do not cause excessive memory peaks.
Logging and Tracing: Record the duration and errors of key queries to help locate problems.

Troubleshooting Guide

Common Error Types

RepositoryError: Returned when an operation fails, an entity does not exist, an ID is empty, and in similar cases.

Common Issues

ID Type Problem: Confirm that the incoming ID is a valid ObjectID or a custom string.
Query Exception: Check that the field names and operators in QueryOptions/QueryCondition are correct.
Connection Problem: Confirm that the URI is correct, the network is reachable, and the authentication and TLS configuration are correct.
Test Environment: Use Testcontainers to start MongoDB, and make sure the container is healthy and the database is cleaned up between tests.

Conclusion

Through generics and interface abstraction, Sparrow's MongoDB repository provides a unified and extensible NoSQL storage capability. Its BSON encoding/decoding, ID type handling, pagination, and conditional query features meet the requirements of most business scenarios. Combined with sound index and aggregation optimization, complete connection configuration, and monitoring, it can run stably and efficiently in production.

Appendix

Entity Examples

Task and Session demonstrate the fields and behaviors of different business entities.

Complex Entities

Order and OrderItem demonstrate how nested documents and arrays are stored and read back.

Test Coverage

The tests cover CRUD, pagination, conditional queries, random sampling, the transaction placeholder, and complex entities.