Migration to Index Aggregate Architecture
This document outlines the strategy for migrating from the current multiple-service architecture to the new Index aggregate root design.
Current Architecture Issues
Code Analysis of CodeIndexingApplicationService
The current application service has several problems:
- Service Proliferation: 7+ domain services injected
- Manual Orchestration: Application layer contains complex business logic
- Leaky Abstractions: SQLAlchemy session management at application level
- Scattered State: Snippet state tracked across multiple services
Current Workflow Complexity
# Current approach - complex orchestration
async def run_index(self, index_id: int) -> None:
    # 1. Get index from indexing service
    index = await self.indexing_domain_service.get_index(index_id)
    # 2. Delete old snippets via snippet service
    await self.snippet_domain_service.delete_snippets_for_index(index.id)
    # 3. Extract snippets via snippet service
    snippets = await self.snippet_domain_service.extract_and_create_snippets(...)
    # 4. Manual transaction management
    await self.session.commit()
    # 5. Create BM25 index via separate service
    await self._create_bm25_index(snippets, progress_callback)
    # 6. Create embeddings via separate service
    await self._create_code_embeddings(snippets, progress_callback)
    # 7. Enrich snippets via separate service
    await self._enrich_snippets(snippets, progress_callback)
    # 8. More embeddings via separate service
    await self._create_text_embeddings(snippets, progress_callback)
    # 9. Update timestamp via indexing service
    await self.indexing_domain_service.update_index_timestamp(index.id)
    # 10. Final commit
    await self.session.commit()
New Architecture Benefits
Simplified Application Service
# New approach - aggregate root handles complexity
async def run_complete_indexing_workflow(
    self, uri: AnyUrl, local_path: Path
) -> domain_entities.Index:
    # 1. Create index (aggregate root)
    index = await self._index_domain_service.create_index(uri)
    # 2. Populate working copy (aggregate method)
    index = await self._index_domain_service.clone_and_populate_working_copy(
        index, local_path, SourceType.GIT
    )
    # 3. Extract snippets (aggregate method)
    index = await self._index_domain_service.extract_snippets(index)
    # 4. Simple transaction management
    await self._session.commit()
    return index
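To make "aggregate root handles complexity" more concrete, the following is a minimal sketch of what an Index aggregate might look like. The field names (`id`, `uri`, `snippets`, `updated_at`), the `Snippet` type, and the `replace_snippets` method are illustrative assumptions, not the actual contents of domain/models/entities.py.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Snippet:
    """Hypothetical snippet value; the real entity likely carries more fields."""
    path: str
    content: str


@dataclass
class Index:
    """Hypothetical aggregate root that owns its snippets and its timestamp."""
    id: int
    uri: str
    snippets: List[Snippet] = field(default_factory=list)
    updated_at: Optional[datetime] = None

    def replace_snippets(self, snippets: List[Snippet]) -> None:
        # Dropping old snippets, storing new ones, and bumping the timestamp
        # happen in one place, so callers cannot leave the state half-updated.
        self.snippets = list(snippets)
        self.updated_at = datetime.now(timezone.utc)


index = Index(id=1, uri="https://example.com/repo.git")
index.replace_snippets([Snippet("main.py", "print('hi')")])
```

Under this assumption, steps 2, 3, and 9 of the current workflow collapse into a single method call on the aggregate, which is the kind of consolidation the new design aims for.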
Migration Strategy
Phase 1: Parallel Implementation ✅
- Create new domain entities (domain/models/entities.py)
- Create repository protocol (domain/models/protocols.py)
- Create mapping layer (infrastructure/mappers/index_mapper.py)
- Create repository implementation (infrastructure/sqlalchemy/index_repository.py)
- Create domain service (domain/services/index_service.py)
- Create simplified application service (application/services/simplified_indexing_service.py)
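The split between a repository protocol and a SQLAlchemy implementation can be sketched as follows. The method names `get` and `save`, the `Index` fields, and the `InMemoryIndexRepository` class are assumptions made for illustration; the real protocol in domain/models/protocols.py may look different.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Index:
    """Stand-in for the domain entity; the fields are assumptions."""
    id: int
    uri: str


class IndexRepository(Protocol):
    """Hypothetical shape of the repository protocol."""

    async def get(self, index_id: int) -> Optional[Index]: ...

    async def save(self, index: Index) -> None: ...


class InMemoryIndexRepository:
    """Satisfies the protocol structurally, without any SQLAlchemy import."""

    def __init__(self) -> None:
        self._rows: Dict[int, Index] = {}

    async def get(self, index_id: int) -> Optional[Index]:
        return self._rows.get(index_id)

    async def save(self, index: Index) -> None:
        self._rows[index.id] = index


async def demo(repo: IndexRepository) -> Optional[Index]:
    # The caller depends only on the protocol, not on a concrete backend.
    await repo.save(Index(id=1, uri="https://example.com/repo.git"))
    return await repo.get(1)


loaded = asyncio.run(demo(InMemoryIndexRepository()))
```

Because the domain layer only sees the protocol, the SQLAlchemy-backed repository and an in-memory one are interchangeable from its point of view.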
Phase 2: Feature Parity (Next Steps)
2.1 Complete Index Domain Service
- Implement actual cloning logic in clone_and_populate_working_copy
- Complete snippet enrichment in enrich_snippets_with_summaries
- Add snippet search capabilities to Index aggregate
- Add BM25/embedding integration
2.2 Application Service Integration
- Update application factories to create new services
- Add legacy compatibility methods
- Implement search functionality migration
2.3 CLI Integration
- Update CLI commands to use new application service
- Maintain backward compatibility for existing commands
Phase 3: Gradual Migration
3.1 New Endpoints First
- Create new CLI commands using Index aggregate
- Add new MCP tools using simplified service
- Implement new features with aggregate root
3.2 Legacy Adaptation
- Wrap old API calls to use new domain service
- Provide compatibility layer for existing integrations
- Migrate tests gradually
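One way to wrap old API calls is a thin adapter that keeps the legacy `run_index(index_id)` signature and delegates to the new workflow. This is only a sketch: `LegacyIndexingAdapter`, the `sources` lookup table, and the stubbed `NewIndexingService` are hypothetical, and how a legacy index id maps to a URI and local path in the real system is an assumption.

```python
import asyncio
from pathlib import Path
from typing import Dict, List, Tuple


class NewIndexingService:
    """Stub standing in for the simplified application service."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Path]] = []

    async def run_complete_indexing_workflow(self, uri: str, local_path: Path) -> None:
        # The real service would create and populate the Index aggregate.
        self.calls.append((uri, local_path))


class LegacyIndexingAdapter:
    """Hypothetical wrapper that preserves the old run_index(index_id) call."""

    def __init__(
        self,
        new_service: NewIndexingService,
        sources: Dict[int, Tuple[str, Path]],
    ) -> None:
        self._new_service = new_service
        # Assumed lookup from legacy index ids to (uri, local_path).
        self._sources = sources

    async def run_index(self, index_id: int) -> None:
        uri, local_path = self._sources[index_id]
        await self._new_service.run_complete_indexing_workflow(uri, local_path)


service = NewIndexingService()
adapter = LegacyIndexingAdapter(
    service, {1: ("https://example.com/repo.git", Path("/tmp/repo"))}
)
result = asyncio.run(adapter.run_index(1))
```

Existing callers keep their entry point while the work runs through the new domain service, so the adapter can be deleted in Phase 4 without touching those callers twice.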
3.3 Search Migration
- Move search logic into Index aggregate
- Create search value objects in domain
- Simplify search application service
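A search value object in the domain could be as small as a frozen dataclass that validates itself on creation. The name `SearchRequest` and its fields `keywords` and `top_k` are assumptions for illustration, not names taken from the codebase.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SearchRequest:
    """Hypothetical search value object: immutable and compared by value."""
    keywords: Tuple[str, ...]
    top_k: int = 10

    def __post_init__(self) -> None:
        # Validation lives in the domain, so an invalid request never
        # reaches the Index aggregate or the application service.
        if not self.keywords:
            raise ValueError("at least one keyword is required")
        if self.top_k <= 0:
            raise ValueError("top_k must be positive")


request = SearchRequest(keywords=("async", "commit"))
```

Two requests with the same keywords and `top_k` compare equal, which keeps application-level tests and caching straightforward.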
Phase 4: Complete Migration
4.1 Remove Old Services
- Remove IndexingDomainService
- Remove SnippetDomainService
- Remove SourceService
- Clean up old value objects
4.2 Final Cleanup
- Remove legacy compatibility methods
- Update all tests to use new architecture
- Remove old application service
Code Examples
Before: Current Complexity
class CodeIndexingApplicationService:
    def __init__(
        self,
        indexing_domain_service: IndexingDomainService,
        snippet_domain_service: SnippetDomainService,
        source_service: SourceService,
        bm25_service: BM25DomainService,
        code_search_service: EmbeddingDomainService,
        text_search_service: EmbeddingDomainService,
        enrichment_service: EnrichmentDomainService,
        session: AsyncSession,  # Leaky abstraction!
    ):
        # 7+ services to coordinate
        ...
After: Aggregate Root Simplicity
class SimplifiedIndexingApplicationService:
    def __init__(
        self,
        index_domain_service: IndexDomainService,
        session: AsyncSession,
    ):
        # Single domain service + session
        # All business logic in domain
        ...
Benefits of Migration
1. Reduced Complexity
- Single domain service instead of 7+
- Business logic moves to domain layer
- Application layer focuses on coordination
2. Better Domain Modeling
- Index as true aggregate root
- Rich domain objects with behavior
- Proper encapsulation of business rules
3. Improved Testability
- Domain service can be tested in isolation
- No SQLAlchemy dependencies in domain tests
- Cleaner mocking for application tests
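The testability claim can be illustrated with a test that exercises a domain service against a mocked repository. The `IndexDomainService` below is a simplified stub; its constructor argument, the `save` method on the repository, and the dictionary used as an index are assumptions made for the example.

```python
import asyncio
from unittest.mock import AsyncMock


class IndexDomainService:
    """Stub of the domain service; the repository argument is an assumption."""

    def __init__(self, repository) -> None:
        self._repository = repository

    async def create_index(self, uri: str) -> dict:
        index = {"uri": uri, "snippets": []}
        await self._repository.save(index)
        return index


def test_create_index_saves_once() -> dict:
    repository = AsyncMock()  # no database or SQLAlchemy session involved
    service = IndexDomainService(repository)
    index = asyncio.run(service.create_index("https://example.com/repo.git"))
    # The only collaborator is the repository, so one assertion covers it.
    repository.save.assert_awaited_once_with(index)
    return index


created = test_create_index_saves_once()
```

With a single injected collaborator, the test needs one mock instead of the seven or more required by the current application service.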
4. Enhanced Maintainability
- Clear boundaries between layers
- Easier to add new features
- Reduced coupling between services
5. Better Performance
- Fewer repository round trips
- Optimized aggregate loading
- Reduced object mapping overhead
Risks and Mitigation
Risk: Breaking Changes
Mitigation: Implement compatibility layer during transition
Risk: Feature Regression
Mitigation: Comprehensive test coverage for both old and new
Risk: Performance Impact
Mitigation: Benchmark and optimize aggregate loading
Risk: Complex Migration
Mitigation: Gradual, phase-by-phase approach
Success Metrics
- Reduced lines of code in application service (target: 50% reduction)
- Improved test coverage for domain logic
- Faster indexing workflow execution
- Fewer bugs related to state management
- Easier onboarding for new developers
Next Immediate Steps
- Complete Domain Service: Finish implementing cloning and enrichment
- Factory Integration: Update application factories
- Simple CLI Command: Create one new command using aggregate
- Performance Test: Benchmark against current implementation
- Migration Plan: Detail specific steps for first legacy endpoint
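For the performance test step, a small timing harness is enough to start with. The sketch below assumes both workflows can be invoked as zero-argument coroutines; `legacy_workflow` and `aggregate_workflow` are placeholders that only sleep, so the numbers they produce say nothing about the real implementations.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict


async def time_workflow(
    run: Callable[[], Awaitable[object]], repeats: int = 5
) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        await run()
        best = min(best, time.perf_counter() - start)
    return best


async def legacy_workflow() -> None:
    await asyncio.sleep(0.002)  # placeholder for the current run_index path


async def aggregate_workflow() -> None:
    await asyncio.sleep(0.001)  # placeholder for the new aggregate path


async def main() -> Dict[str, float]:
    return {
        "legacy": await time_workflow(legacy_workflow),
        "aggregate": await time_workflow(aggregate_workflow),
    }


timings = asyncio.run(main())
```

Taking the best of several runs reduces noise from cold caches and scheduler jitter; for a real comparison both workflows should index the same repository against the same database state.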