MongoDB Repository
Table of Contents
- Introduction
- Project Structure
- Core Components
- Architecture Overview
- Detailed Component Analysis
- Dependency Analysis
- Performance Considerations
- Troubleshooting Guide
- Conclusion
- Appendix
Introduction
This document describes the MongoDB repository implementation in the Sparrow project. It covers the design philosophy, document model mapping, BSON encoding and decoding, collection and index strategy, aggregation pipeline optimization, complex queries (nested documents, arrays, geospatial), pagination, transaction handling, connection configuration with replica set and sharding support, and performance and monitoring recommendations. It is intended as a usage guide for NoSQL developers.
Project Structure
The MongoDB repository lives in the persistence layer. It uses generics and interface abstraction, together with the entity layer's shared interface constraints, to keep layers and responsibilities clearly separated:
- Repository Layer: MongoDBRepository[T] implements the usecase.Repository[T] interface and encapsulates CRUD, pagination, conditional queries, random sampling, and related capabilities.
- Entity Layer: The Entity interface and BaseEntity provide the shared ID and timestamp fields.
- Use Case Layer: The Repository[T] interface and QueryOptions/QueryCondition define the query contract.
- Testing Layer: Integration tests run against a MongoDB container started with Testcontainers and cover typical scenarios and boundary conditions.
Core Components
MongoDBRepository[T]
- Generic repository implementation responsible for saving, querying, updating, and deleting entities, plus batch operations, pagination, conditional queries, and random sampling.
Entity/BaseEntity
- Shared entity interface and base fields, which let the repository work with any entity type that satisfies the interface.
Repository[T]/QueryOptions
- Repository interface and query contract; supports compound conditions and sorting.
Test Cases
- Integration tests based on Testcontainers covering CRUD, pagination, conditional queries, the transaction placeholder, and complex entities with nested documents and arrays.
Architecture Overview
The MongoDB repository performs BSON encoding/decoding and queries through the Go driver's Collection API. Internally it detects the ID type (ObjectID or custom string) and maintains the CreatedAt/UpdatedAt timestamps on save. Queries support pagination, sorting, and combined conditions, and the repository also exposes random sampling and an entry point for aggregation pipelines.
Detailed Component Analysis
Document Model Mapping and BSON Encoding/Decoding
- ID Mapping: The repository first tries to convert the string ID to an ObjectID; if the conversion fails, the value is stored as a custom string ID. Queries accept both ID forms.
- Timestamps: UpdatedAt is set automatically on every save; CreatedAt is set on insert if the entity has that field.
- Field Mapping: The entity struct is converted to a BSON document with bson.Marshal/Unmarshal; before storage the _id field is corrected and the duplicate ID field is removed.
- Nested Documents and Arrays: The test cases show complex entities (such as orders) with nested structures and array fields being written and read back in full.
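The ID and timestamp rules above can be sketched without the driver. This is a minimal illustration, not the repository's actual code: objectID is a local stand-in for the driver's 12-byte ObjectID type, the helper names parseID and prepareDocument are hypothetical, and the document keys "id", "created_at", and "updated_at" are assumed field names.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"time"
)

// objectID is a local stand-in for the driver's 12-byte ObjectID type.
type objectID [12]byte

// parseID returns an objectID when s is exactly 24 hex characters,
// otherwise it returns s unchanged as a custom string ID.
// The boolean reports whether the ObjectID form was used.
func parseID(s string) (any, bool) {
	if len(s) != 24 {
		return s, false
	}
	raw, err := hex.DecodeString(s)
	if err != nil {
		return s, false
	}
	var oid objectID
	copy(oid[:], raw)
	return oid, true
}

// prepareDocument normalizes an already-marshaled entity before it is written.
// The keys "id", "created_at" and "updated_at" are assumed field names.
func prepareDocument(doc map[string]any, id string, isInsert bool, now time.Time) {
	delete(doc, "id")           // drop the duplicate ID field
	doc["_id"], _ = parseID(id) // ObjectID if possible, custom string otherwise
	doc["updated_at"] = now     // refreshed on every save
	if _, has := doc["created_at"]; has && isInsert {
		doc["created_at"] = now // only set when the entity carries the field
	}
}

func main() {
	now := time.Now().UTC()
	doc := map[string]any{"id": "order-42", "created_at": nil, "total": 99}
	prepareDocument(doc, "order-42", true, now)
	fmt.Printf("%T %v\n", doc["_id"], doc["_id"])

	oid, ok := parseID("507f1f77bcf86cd799439011")
	fmt.Printf("%T %v\n", oid, ok)
}
```

Because lookups have to apply the same conversion, a query by ID would presumably run the incoming string through the same parseID-style step before building its filter.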
Collection and Field Design
- Collection Naming: The default collection name is the lowercase short name of the entity type; a custom naming strategy can be added by extension.
- Field Design: _id is the primary key; the created/updated timestamp fields follow the BaseEntity constraints.
- Nested Documents and Arrays: Complex nested objects and array fields are supported; the test cases validate the order entity's items array and its nested address and payment objects.
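One way to derive the default collection name is through reflection on the generic type parameter. The sketch below is an assumption about how such a rule could look; collectionNameFor and the Order type are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Order is a hypothetical entity type used only for this example.
type Order struct{}

// collectionNameFor returns the lowercased short type name of T.
// Pointer types are unwrapped so that *Order and Order map to the same name.
func collectionNameFor[T any]() string {
	t := reflect.TypeOf((*T)(nil)).Elem()
	for t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	return strings.ToLower(t.Name())
}

func main() {
	fmt.Println(collectionNameFor[Order]())
	fmt.Println(collectionNameFor[*Order]())
}
```

The short type name omits the package path, so two entity types with the same name in different packages would share a collection under this rule; that is one case where a custom naming strategy is useful.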
Index Strategy
- Primary Key Index: MongoDB creates the _id index automatically.
- Common Query Fields: Create single-field or compound indexes on frequently queried fields (such as email, status, customer_id).
- Text Index: Create a text index on fields used for full-text search (such as name, title).
- Geospatial Index: Create a 2dsphere index for GeoJSON location fields (such as coordinates), or a 2d index for legacy coordinate pairs.
- TTL Index: Create a TTL index on a date field of session or other temporary data so that expired documents are removed automatically.
- Aggregation Optimization: Index the fields that pipelines commonly filter and sort on, so that leading $match and $sort stages can use an index.
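The index types listed above can be described as plain data before they are handed to the driver. The sketch below is driver-free: indexKey and indexSpec are hypothetical local types, the field names are examples, and defaultName reproduces MongoDB's default index naming (field and key type joined by underscores). Creating the indexes would still go through the driver's index API.

```go
package main

import (
	"fmt"
	"strings"
)

// indexKey is one field of an index key pattern.
// Kind is 1 (ascending), -1 (descending), "text" or "2dsphere".
type indexKey struct {
	Field string
	Kind  any
}

// indexSpec describes one index; key order matters for compound indexes,
// which is why Keys is a slice rather than a map.
type indexSpec struct {
	Keys       []indexKey
	Unique     bool
	TTLSeconds *int // nil means the index is not a TTL index
}

// defaultName mirrors MongoDB's default naming, e.g. "customer_id_1_created_at_-1".
func (s indexSpec) defaultName() string {
	parts := make([]string, 0, len(s.Keys)*2)
	for _, k := range s.Keys {
		parts = append(parts, k.Field, fmt.Sprint(k.Kind))
	}
	return strings.Join(parts, "_")
}

// exampleIndexes returns one spec per index type discussed in the text.
// All field names are illustrative assumptions.
func exampleIndexes() []indexSpec {
	expireAtFieldTime := 0 // expire as soon as the date in expires_at has passed
	return []indexSpec{
		{Keys: []indexKey{{"email", 1}}, Unique: true},
		{Keys: []indexKey{{"customer_id", 1}, {"created_at", -1}}},
		{Keys: []indexKey{{"title", "text"}}},
		{Keys: []indexKey{{"location", "2dsphere"}}},
		{Keys: []indexKey{{"expires_at", 1}}, TTLSeconds: &expireAtFieldTime},
	}
}

func main() {
	for _, spec := range exampleIndexes() {
		fmt.Printf("%-32s unique=%-5v ttl=%v\n", spec.defaultName(), spec.Unique, spec.TTLSeconds != nil)
	}
}
```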
Aggregation Pipeline Optimization (Design Recommendations)
- Place $match as early as possible so that filtering happens first and later stages process fewer documents.
- Use $project to trim output fields, which reduces network and memory pressure.
- Put $limit directly after $sort so only the top results are kept, and sort on indexed fields where possible.
- Keep $lookup joins small: filter before joining, or restrict the joined side.
- Keep the intermediate result set entering and leaving $group as small as possible.
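A pipeline that follows this stage order might look like the sketch below. To stay runnable without the driver it uses plain maps and prints the pipeline as JSON; with the Go driver the same stages would normally be written with the bson document types. The function name and the field names (status, customer_id, total) are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stage is one pipeline stage; each stage map holds exactly one operator key.
type stage = map[string]any

// topCustomersPipeline filters first, groups the reduced set, then sorts and
// limits before shaping the output. Field names are illustrative assumptions.
func topCustomersPipeline(status string, limit int) []stage {
	return []stage{
		{"$match": stage{"status": status}}, // filter early, can use an index on status
		{"$group": stage{"_id": "$customer_id", "spent": stage{"$sum": "$total"}}},
		{"$sort": stage{"spent": -1}},
		{"$limit": limit}, // directly after $sort so only the top results are kept
		{"$project": stage{"_id": 0, "customer_id": "$_id", "spent": 1}},
	}
}

func main() {
	out, err := json.MarshalIndent(topCustomersPipeline("paid", 5), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Each stage map has a single key, so Go's unordered maps do not affect stage order here; a multi-key $sort would need an ordered document type instead.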
Complex Query Operations
- Nested Document Query: Address nested fields with dot notation, such as shipping_address.city.
- Array Operations: Element matching, array length checks, and membership queries on arrays are supported.
- Geospatial Query: Run spatial queries with operators such as $near and $geoWithin. $near requires a geospatial index; $geoWithin works without one but is faster with it.
- Condition Combination: QueryOptions/QueryCondition support AND/OR combinations of multiple conditions, comparison operators, regex matching, IN/NOT_IN, and null checks.
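The query forms above all reduce to filter documents. The sketch below shows one possible translation using plain maps; the condition struct and its operator names are hypothetical stand-ins, not the actual QueryCondition definition, and the field names are examples.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type doc = map[string]any

// condition is a hypothetical stand-in for one query condition.
type condition struct {
	Field    string
	Operator string // EQ, GT, IN, NOT_IN, LIKE, IS_NULL
	Value    any
}

// toFilter translates one condition into a MongoDB filter document.
func toFilter(c condition) (doc, error) {
	switch c.Operator {
	case "EQ":
		return doc{c.Field: c.Value}, nil
	case "GT":
		return doc{c.Field: doc{"$gt": c.Value}}, nil
	case "IN":
		return doc{c.Field: doc{"$in": c.Value}}, nil
	case "NOT_IN":
		return doc{c.Field: doc{"$nin": c.Value}}, nil
	case "LIKE":
		return doc{c.Field: doc{"$regex": c.Value}}, nil
	case "IS_NULL":
		return doc{c.Field: nil}, nil // matches null and missing fields
	}
	return nil, fmt.Errorf("unsupported operator %q", c.Operator)
}

// allOf combines conditions with $and; an OR variant would use "$or".
func allOf(conds ...condition) (doc, error) {
	parts := make([]doc, 0, len(conds))
	for _, c := range conds {
		f, err := toFilter(c)
		if err != nil {
			return nil, err
		}
		parts = append(parts, f)
	}
	return doc{"$and": parts}, nil
}

// nearPoint builds a $near filter; GeoJSON coordinates are [longitude, latitude].
func nearPoint(field string, lng, lat, maxMeters float64) doc {
	return doc{field: doc{"$near": doc{
		"$geometry":    doc{"type": "Point", "coordinates": []float64{lng, lat}},
		"$maxDistance": maxMeters,
	}}}
}

func main() {
	filter, err := allOf(
		condition{"shipping_address.city", "EQ", "Berlin"}, // nested field via dot notation
		condition{"status", "IN", []string{"paid", "shipped"}},
		condition{"total", "GT", 100},
	)
	if err != nil {
		panic(err)
	}
	out, _ := json.MarshalIndent(filter, "", "  ")
	fmt.Println(string(out))

	// Array element match and array length check.
	elem := doc{"items": doc{"$elemMatch": doc{"sku": "A1", "quantity": doc{"$gte": 2}}}}
	size := doc{"items": doc{"$size": 3}}
	fmt.Println(elem, size, nearPoint("location", 13.405, 52.52, 5000))
}
```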
Pagination and Sorting
- Pagination is provided by FindWithPagination and FindByFieldWithPagination.
- Internally, the Limit and Skip settings of FindOptions select the page range.
- Results are sorted by created_at in descending order by default; the sort fields can be customized.
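The Skip/Limit arithmetic behind page-based pagination is small enough to show in full. This is a sketch under assumptions: 1-based page numbers, a default page size of 20, and the helper names skipLimit and totalPages are all hypothetical.

```go
package main

import "fmt"

// defaultPageSize is an assumed default, not a value taken from the project.
const defaultPageSize = 20

// skipLimit converts a 1-based page number and page size into the
// skip and limit values passed to the find options.
func skipLimit(page, size int) (skip, limit int64) {
	if page < 1 {
		page = 1
	}
	if size < 1 {
		size = defaultPageSize
	}
	return int64(page-1) * int64(size), int64(size)
}

// totalPages rounds up so that a partially filled last page is counted.
func totalPages(total int64, size int) int64 {
	if size < 1 {
		size = defaultPageSize
	}
	return (total + int64(size) - 1) / int64(size)
}

func main() {
	skip, limit := skipLimit(3, 10)
	fmt.Println("skip:", skip, "limit:", limit)
	fmt.Println("pages:", totalPages(45, 10))
}
```

Skip-based paging makes the server walk past the skipped documents, so very deep pages get slower even when the sort field is indexed.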
Transaction Processing
- The repository provides a WithTransaction placeholder method that currently returns the repository itself; real transactions must be driven from a session managed by the calling layer.
- The test cases show how to start a session and create a SessionContext, which leaves an extension point for later transaction support.
Connection Configuration and Replica Set/Sharding Support
- Connection Method: The connection is established with mongo.Connect and options.Client().ApplyURI(uri).
- Replica Set: The URI can include the replica set name and the member list; the driver handles primary failover automatically.
- Sharded Cluster: The URI lists one or more mongos routers, which route each operation to the correct shard.
- Production Recommendations: Tune the connection pool, set timeouts, and enable authentication and TLS; choose an appropriate read preference for read-only queries.
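Most of these settings can be carried in the connection string itself. The sketch below assembles such a URI with the standard library only; buildURI is a hypothetical helper, the host names and database name are placeholders, and the option keys (replicaSet, readPreference, maxPoolSize, connectTimeoutMS, tls) are standard MongoDB connection string options. The resulting string is what would be passed to ApplyURI.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildURI joins a host list, a database name and connection options into a
// mongodb:// connection string. Options are emitted in sorted key order.
func buildURI(hosts []string, database string, opts map[string]string) string {
	q := url.Values{}
	for k, v := range opts {
		q.Set(k, v)
	}
	uri := "mongodb://" + strings.Join(hosts, ",") + "/" + database
	if encoded := q.Encode(); encoded != "" {
		uri += "?" + encoded
	}
	return uri
}

func main() {
	// Replica set: list the members and name the set. Hosts are placeholders.
	replicaSet := buildURI(
		[]string{"db1.example.internal:27017", "db2.example.internal:27017"},
		"sparrow",
		map[string]string{
			"replicaSet":       "rs0",
			"readPreference":   "secondaryPreferred",
			"maxPoolSize":      "50",
			"connectTimeoutMS": "5000",
			"tls":              "true",
		},
	)
	// Sharded cluster: list mongos routers instead of shard members.
	sharded := buildURI(
		[]string{"mongos1.example.internal:27017", "mongos2.example.internal:27017"},
		"sparrow",
		map[string]string{"tls": "true"},
	)
	fmt.Println(replicaSet)
	fmt.Println(sharded)
}
```

Credentials are deliberately left out of the example; in practice they are better supplied through the client options or the environment than embedded in a URI that may end up in logs.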
Dependency Analysis
The MongoDB repository depends on the entity interface and the use case layer contract, and talks to MongoDB through the Go driver. The test layer uses Testcontainers to start a MongoDB container for integration verification.
Performance Considerations
Optimization Recommendations
- Read/Write Separation: Set an appropriate read preference for read-only queries to reduce load on the primary.
- Index Strategy: Create single-field or compound indexes that match the query patterns; use text or geospatial indexes where needed.
- Aggregation Optimization: Keep intermediate result sets small and order pipeline stages so that filtering happens first.
- Connection Pool: Configure maximum connections, idle connections, and timeouts sensibly to avoid connection contention.
- Monitoring and Alerting: Set up alerts for slow queries and high latency, and review hot queries regularly.
Query Plan Analysis and Memory Usage Monitoring
- Query Plan: Inspect the execution plan with MongoDB's explain feature to locate bottlenecks.
- Memory Monitoring: Watch memory usage in aggregation stages so that large result sets do not cause excessive memory peaks.
- Logging and Tracing: Record the duration and errors of key queries to help locate problems.
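Recording query duration and errors can start with a thin wrapper around repository calls. The sketch below is one possible shape, assuming a caller-chosen threshold; timed is a hypothetical helper, and the sleep stands in for a real query.

```go
package main

import (
	"log"
	"time"
)

// timed runs fn, measures how long it took, and logs the call when it
// failed or when it took at least as long as threshold.
func timed(name string, threshold time.Duration, fn func() error) (time.Duration, error) {
	start := time.Now()
	err := fn()
	elapsed := time.Since(start)
	switch {
	case err != nil:
		log.Printf("query=%s duration=%s error=%v", name, elapsed, err)
	case elapsed >= threshold:
		log.Printf("query=%s duration=%s slow (threshold %s)", name, elapsed, threshold)
	}
	return elapsed, err
}

func main() {
	_, _ = timed("orders.find_by_customer", 50*time.Millisecond, func() error {
		time.Sleep(60 * time.Millisecond) // stands in for a repository call
		return nil
	})
}
```

The logged query names give a starting list of candidates to run through explain.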
Troubleshooting Guide
Common Error Types
- RepositoryError: Returned when an operation fails, for example when the entity does not exist or the ID is empty.
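Callers usually need to tell these failure causes apart. The sketch below shows one conventional Go shape for that, using a wrapping error type and sentinel errors; it is an assumption for illustration only, and the type, fields, and helper names here are hypothetical rather than the project's actual RepositoryError definition.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel causes; hypothetical names used only in this sketch.
var (
	errNotFound = errors.New("entity not found")
	errEmptyID  = errors.New("id is empty")
)

// repositoryError wraps a cause together with the operation that failed.
type repositoryError struct {
	Op  string
	Err error
}

func (e *repositoryError) Error() string { return "repository: " + e.Op + ": " + e.Err.Error() }

// Unwrap lets errors.Is and errors.As see the underlying cause.
func (e *repositoryError) Unwrap() error { return e.Err }

// findByID simulates a lookup that always fails, to show both causes.
func findByID(id string) error {
	if id == "" {
		return &repositoryError{Op: "FindByID", Err: errEmptyID}
	}
	return &repositoryError{Op: "FindByID", Err: errNotFound}
}

func isNotFound(err error) bool { return errors.Is(err, errNotFound) }
func isEmptyID(err error) bool  { return errors.Is(err, errEmptyID) }

func main() {
	for _, id := range []string{"", "missing"} {
		err := findByID(id)
		fmt.Println(err, "emptyID:", isEmptyID(err), "notFound:", isNotFound(err))
	}
}
```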
Common Issues
- ID Type Problem: Confirm that the incoming ID is a valid ObjectID or a custom string.
- Query Exception: Check that the field names and operators in QueryOptions/QueryCondition are correct.
- Connection Problem: Confirm that the URI is correct, the network is reachable, and the authentication and TLS settings are correct.
- Test Environment: Start MongoDB with Testcontainers, and make sure the container is healthy and the database is cleaned up.
Conclusion
Through generics and interface abstraction, Sparrow's MongoDB repository provides a unified and extensible NoSQL storage capability. Its BSON encoding/decoding, ID type handling, pagination, and conditional query features cover most business scenarios. Combined with a sound index and aggregation strategy, complete connection configuration, and monitoring, it can run stably and efficiently in production.
Appendix
Entity Examples
- Task and Session demonstrate the fields and behaviors of different business entities.
Complex Entities
- Order and OrderItem demonstrate the complete write and read path for nested documents and arrays.
Test Coverage
- Covers CRUD, pagination, conditional queries, random sampling, the transaction placeholder, and complex entities.