The Authoring vs. Published Database Problem: How EF Core Made It Worse

October 8, 2023
8 min read
Gonçalo Bastos

"Can authors save their work without publishing it immediately?"

This seemed like a straightforward request from our content team. They wanted to draft articles, save progress, collaborate on edits, and only publish when ready.

Three months later, we were managing two separate databases and dealing with EF Core migration complexities we hadn't anticipated.

This is how we approached the dual-database architecture challenge with Entity Framework Core.

The Deceptively Simple Requirement

Our content management system started with a straightforward approach: everything was published immediately. Authors created content, hit save, and it appeared on the website instantly.

Then our editorial team grew. Suddenly we had:

  • Draft articles that needed multiple review rounds
  • Scheduled publishing for marketing campaigns
  • A/B testing requiring content variants
  • Legal review workflows for sensitive content

The solution seemed obvious: add an IsPublished flag and call it a day. A simple boolean would separate drafts from published content.

Unfortunately, this approach created bigger problems:

  1. Performance: The public website was querying the same database as the authoring tools
  2. Security: Draft content lived alongside published content in the same tables
  3. Complexity: Every public query needed WHERE IsPublished = 1
  4. Risk: One wrong query could leak unpublished content
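
To make the fourth risk concrete, here is a minimal sketch using LINQ over an in-memory list. The Article class, its properties, and the PublicQueries helper are assumptions made for illustration, not our actual model. The two queries differ only by the filter, and the one that forgets it returns the draft.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical single-table model: drafts and published rows share one shape.
public class Article
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public bool IsPublished { get; set; }
}

public static class PublicQueries
{
    // Correct: every public query has to remember this filter.
    public static List<string> LatestTitles(IEnumerable<Article> articles) =>
        articles.Where(a => a.IsPublished)
                .OrderByDescending(a => a.Id)
                .Select(a => a.Title)
                .ToList();

    // Buggy: the same query with the filter forgotten, so drafts leak.
    public static List<string> LatestTitlesUnfiltered(IEnumerable<Article> articles) =>
        articles.OrderByDescending(a => a.Id)
                .Select(a => a.Title)
                .ToList();
}
```

Nothing in the type system distinguishes the two methods, which is why a single missed filter can go unnoticed until it reaches production.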

The Dual-Database Decision

After our third incident where draft content briefly appeared on the live site, we made the call: separate databases.

  • Authoring Database: Where editors work on drafts, revisions, and unpublished content
  • Published Database: Read-only for the public website, containing only approved content

The concept was clean. The implementation with EF Core was a nightmare.

The EF Core Problem

Entity Framework Core makes some assumptions that don't play well with dual-database architectures:

  • Assumption #1: One database per DbContext
  • Assumption #2: Migrations apply to the target database
  • Assumption #3: The same entity maps to the same table structure

Our reality broke all these assumptions.

The Architecture That Almost Worked

Our initial approach seemed elegant: create two DbContexts that shared the same Article entity but pointed to different databases. The authoring context would have the full entity graph with drafts and comments, while the published context would be a simplified read-only view.
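
A minimal sketch of that shared-entity design, assuming the usual EF Core constructor-injected options. The class and property names are illustrative rather than copied from our codebase.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// One entity class shared by both contexts (the design we later abandoned).
public class Article
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
    public DateTime? PublishedAt { get; set; } // null while the article is a draft
}

// Points at the authoring database via its own options/connection string.
public class AuthoringDbContext : DbContext
{
    public AuthoringDbContext(DbContextOptions<AuthoringDbContext> options)
        : base(options) { }

    public DbSet<Article> Articles => Set<Article>();
}

// Points at the published database; intended for read-only use.
public class PublishedDbContext : DbContext
{
    public PublishedDbContext(DbContextOptions<PublishedDbContext> options)
        : base(options)
    {
        // Reads only, so skip change tracking for every query by default.
        ChangeTracker.QueryTrackingBehavior = QueryTrackingBehavior.NoTracking;
    }

    public DbSet<Article> Articles => Set<Article>();
}
```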

This looked clean on paper: same entities, different databases, separate concerns.

However, this approach had several issues we discovered during implementation.

Migration Hell

The first issue surfaced during deployment. With both contexts managing the same entity types, their migrations collided: when we tried to add migrations with the same name to both contexts, we got errors about duplicate migration names. Part of the problem is that EF Core's default migration history is not separated per context: every context uses the same __EFMigrationsHistory table name, and its rows record only a migration ID and product version, not the owning context. Combined with overlapping entities that needed different schemas, this made the two migration sets hard to keep apart.

The Entity Confusion

The second issue was that the same Article entity needed different behavior in each database. In the authoring database, an article could exist without a publication date, which is how we represented drafts. In the published database, every article had to have one. A single shared entity class gave EF Core only one definition to map, so it treated both cases identically even though our business rules did not. The result was confused validation and a real risk to data integrity.

The Solution: Entity Segregation

After working through various approaches, we realized: EF Core works best when contexts have clear, separate responsibilities.

The solution was complete entity segregation. We created separate entity classes for each database context: DraftArticle for the authoring system and PublishedArticle for the public site. The draft entity included rich workflow properties like revision tracking, comment systems, and status management. The published entity was streamlined for performance, with denormalized author information and simplified tag storage. This separation allowed each entity to be optimized for its specific use case without compromising the other.
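
A sketch of what two segregated entity classes might look like. Every property name here is an assumption chosen to mirror the description above (revisions, comments, status, denormalized author, simplified tags); the point is the difference in shape, not the exact fields.

```csharp
using System;
using System.Collections.Generic;

public enum DraftStatus { Draft, InReview, Approved }

public class DraftComment
{
    public int Id { get; set; }
    public int DraftArticleId { get; set; }
    public string Text { get; set; } = "";
}

public class DraftRevision
{
    public int Id { get; set; }
    public int DraftArticleId { get; set; }
    public string Body { get; set; } = "";
    public DateTime CreatedAt { get; set; }
}

// Authoring side: rich workflow state, and no publication date at all.
public class DraftArticle
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
    public int AuthorId { get; set; }
    public DraftStatus Status { get; set; } = DraftStatus.Draft;
    public bool IsDeleted { get; set; } // soft delete flag
    public List<DraftComment> Comments { get; set; } = new();
    public List<DraftRevision> Revisions { get; set; } = new();
}

// Published side: flat and read-optimized, with a mandatory publication date.
public class PublishedArticle
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
    public string AuthorName { get; set; } = ""; // denormalized from the author record
    public string TagsJson { get; set; } = "[]"; // tags serialized as a JSON array
    public DateTime PublishedAt { get; set; }    // non-nullable by design
}
```

Because PublishedAt is a non-nullable DateTime on one class and absent from the other, the "drafts have no date, published articles always do" rule is now expressed in the types instead of in validation code.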

Separate Contexts, Separate Concerns

Each database context was designed for its specific purpose. The authoring context included complex entity relationships to support the editorial workflow: draft articles linked to comments and revisions, with soft delete capabilities for draft management. The published context was optimized for read performance with strategic indexes on publication dates and titles, using JSON columns for tag storage to avoid additional table joins. This separation allowed each context to use the EF Core features most appropriate for its workload.
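
The configuration for two such contexts could be sketched roughly as follows. The entity shapes are trimmed-down assumptions, the 300-character title limit is an arbitrary illustrative value, and tags are assumed to be stored as a serialized JSON string column rather than a provider-specific JSON type.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

public class DraftComment
{
    public int Id { get; set; }
    public int DraftArticleId { get; set; }
    public string Text { get; set; } = "";
}

public class DraftArticle
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public bool IsDeleted { get; set; }
    public List<DraftComment> Comments { get; set; } = new();
}

public class PublishedArticle
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string TagsJson { get; set; } = "[]";
    public DateTime PublishedAt { get; set; }
}

public class AuthoringDbContext : DbContext
{
    public AuthoringDbContext(DbContextOptions<AuthoringDbContext> options)
        : base(options) { }

    public DbSet<DraftArticle> Drafts => Set<DraftArticle>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<DraftArticle>(e =>
        {
            // Workflow relationship: a draft owns its review comments.
            e.HasMany(d => d.Comments)
             .WithOne()
             .HasForeignKey(c => c.DraftArticleId);

            // Soft delete: deleted drafts are hidden from normal queries.
            e.HasQueryFilter(d => !d.IsDeleted);
        });
    }
}

public class PublishedDbContext : DbContext
{
    public PublishedDbContext(DbContextOptions<PublishedDbContext> options)
        : base(options) { }

    public DbSet<PublishedArticle> Articles => Set<PublishedArticle>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<PublishedArticle>(e =>
        {
            // The key is copied from the draft, never generated here.
            e.Property(a => a.Id).ValueGeneratedNever();

            // Bounded length so the title column can be indexed.
            e.Property(a => a.Title).HasMaxLength(300);

            // Indexes for the public site's common lookups.
            e.HasIndex(a => a.PublishedAt);
            e.HasIndex(a => a.Title);
        });
    }
}
```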

The Publishing Pipeline

With separate entities, we needed a robust publishing pipeline to move content between databases. The publishing service coordinates transactions across both databases to ensure consistency. The process involves retrieving approved drafts, transforming them into optimized published entities, handling both new publications and updates to existing content, and managing the transactional integrity across both databases. The service includes comprehensive error handling and logging to track the publishing process and facilitate troubleshooting when issues arise.
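
The transform-and-upsert core of such a pipeline can be sketched without any database at all. In this illustration an in-memory dictionary stands in for the published database, and the entity shapes, the Published status value, and the choice to keep the original publication date on updates are all assumptions rather than a description of our production service.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public enum DraftStatus { Draft, Approved, Published }

public class DraftArticle
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
    public string AuthorName { get; set; } = "";
    public List<string> Tags { get; set; } = new();
    public DraftStatus Status { get; set; }
}

public class PublishedArticle
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
    public string AuthorName { get; set; } = "";
    public string TagsJson { get; set; } = "[]";
    public DateTime PublishedAt { get; set; }
}

public static class PublishingPipeline
{
    // Transform a draft into the flattened, read-optimized shape.
    public static PublishedArticle ToPublished(DraftArticle draft, DateTime publishedAt) =>
        new PublishedArticle
        {
            Id = draft.Id,
            Title = draft.Title,
            Body = draft.Body,
            AuthorName = draft.AuthorName,
            TagsJson = JsonSerializer.Serialize(draft.Tags),
            PublishedAt = publishedAt
        };

    // Publishes every approved draft into the store and returns how many were
    // written. In a real service the store would be the published DbContext,
    // saved first, and the status change would be saved to the authoring
    // database afterwards.
    public static int PublishApproved(
        IEnumerable<DraftArticle> drafts,
        IDictionary<int, PublishedArticle> store,
        DateTime now)
    {
        var count = 0;
        foreach (var draft in drafts.Where(d => d.Status == DraftStatus.Approved).ToList())
        {
            // Update: keep the original publication date. New article: use now.
            var publishedAt = store.TryGetValue(draft.Id, out var existing)
                ? existing.PublishedAt
                : now;

            store[draft.Id] = ToPublished(draft, publishedAt);
            draft.Status = DraftStatus.Published;
            count++;
        }
        return count;
    }
}
```

Writing the published side first means a failure between the two saves leaves an article that is live but still marked Approved in authoring, which a retry can repair. The opposite order could leave an article marked Published that readers cannot see.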

The Migration Strategy That Actually Worked

The key insight was treating the two databases as completely separate systems with their own migration histories. We organized migrations into separate folders for each database context, allowing independent schema evolution. Each context was registered with its own connection string and migration history table to prevent conflicts. This approach required running migrations separately for each context, but eliminated the cross-context migration issues we had encountered earlier.
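
As a rough sketch, registering the two contexts with their own connection strings and history tables might look like this. It assumes the SQL Server provider; the connection string names, history table names, and migration names are hypothetical.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class AuthoringDbContext : DbContext
{
    public AuthoringDbContext(DbContextOptions<AuthoringDbContext> options)
        : base(options) { }
}

public class PublishedDbContext : DbContext
{
    public PublishedDbContext(DbContextOptions<PublishedDbContext> options)
        : base(options) { }
}

public static class DatabaseRegistration
{
    public static IServiceCollection AddContentDatabases(
        this IServiceCollection services, IConfiguration config)
    {
        // Authoring: own connection string, own migration history table.
        services.AddDbContext<AuthoringDbContext>(options =>
            options.UseSqlServer(
                config.GetConnectionString("Authoring"),
                sql => sql.MigrationsHistoryTable("__AuthoringMigrationsHistory")));

        // Published: likewise fully independent of the authoring side.
        services.AddDbContext<PublishedDbContext>(options =>
            options.UseSqlServer(
                config.GetConnectionString("Published"),
                sql => sql.MigrationsHistoryTable("__PublishedMigrationsHistory")));

        return services;
    }
}

// Each context then gets its own migrations folder and is updated separately:
//   dotnet ef migrations add AddRevisions --context AuthoringDbContext --output-dir Migrations/Authoring
//   dotnet ef migrations add AddTagsJson --context PublishedDbContext --output-dir Migrations/Published
//   dotnet ef database update --context AuthoringDbContext
//   dotnet ef database update --context PublishedDbContext
```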

The Performance Payoff

The separate databases delivered significant performance improvements:

Authoring Database (Optimized for writes):

  • Complex entity relationships for workflow management
  • Full audit trails and revision history
  • Soft delete support
  • Rich navigation properties

Published Database (Optimized for reads):

  • Denormalized data for faster queries
  • Optimized indexes for public queries
  • No complex relationships
  • Simplified entity structure

Authoring queries could be complex, involving multiple entity relationships for editorial workflows, but were infrequent and used by a small number of editors. Published queries were simple but frequent, serving the public website with optimized read patterns that avoided expensive joins and complex filtering.

The Challenges We Still Face

This architecture isn't perfect. Here are the ongoing challenges:

1. Data Consistency

With two databases, the system is eventually consistent at best. We implemented a consistency check service that regularly compares the state between databases, identifying articles marked as published in the authoring system but missing from the published database. This service helps detect and resolve synchronization issues that can occur due to network problems, transaction failures, or deployment issues.
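
At its core, a check like this is a set difference over article IDs. The sketch below assumes the IDs have already been loaded from each database; the ConsistencyChecker name is hypothetical, and the orphan check is an extra not described above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConsistencyChecker
{
    // IDs marked as published in authoring but absent from the published database.
    public static List<int> FindMissing(
        IEnumerable<int> markedPublishedIds, IEnumerable<int> publishedIds) =>
        markedPublishedIds.Except(publishedIds).OrderBy(id => id).ToList();

    // Extra check (an assumption, not from the article): rows in the published
    // database that authoring does not consider published.
    public static List<int> FindOrphans(
        IEnumerable<int> markedPublishedIds, IEnumerable<int> publishedIds) =>
        publishedIds.Except(markedPublishedIds).OrderBy(id => id).ToList();
}
```

For example, with authoring IDs 1, 2, 3 and published IDs 1, 3, FindMissing reports article 2 as needing to be republished.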

2. Deployment Complexity

Deploying schema changes requires coordinating two separate databases. Our deployment process backs up both databases before any changes, then updates the authoring database first, since a failure there poses less risk to the live site. The published database is updated second with careful monitoring, as any issues directly affect public users. This sequential approach with proper error handling helps minimize deployment risks.
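
The sequencing itself can be sketched as a small runner that executes named steps in order and stops at the first failure, so a broken authoring migration never reaches the published database. The DeploymentRunner type is hypothetical. In practice the steps would wrap the backups and calls such as Database.MigrateAsync() on each context.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class DeploymentRunner
{
    // Runs the steps in order and stops at the first failure.
    // Returns the names of the steps that completed successfully.
    public static async Task<List<string>> RunAsync(
        IEnumerable<(string Name, Func<Task> Action)> steps)
    {
        var completed = new List<string>();
        foreach (var (name, action) in steps)
        {
            try
            {
                await action();
                completed.Add(name);
            }
            catch (Exception ex)
            {
                // Log and halt: later steps (such as the published database) are skipped.
                Console.Error.WriteLine($"Step '{name}' failed: {ex.Message}. Stopping.");
                break;
            }
        }
        return completed;
    }
}
```

The step order would follow the process above: back up both databases, migrate authoring, then migrate published.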

3. Development Environment Setup

Every developer needs both databases, which complicates local setup. We simplified development environments by configuring automatic publishing in development mode, so developers see changes immediately without manual publishing steps. We also enabled consistency checks in development to catch synchronization issues early in the development cycle.

What We Learned

After running this architecture for some time, here's what we learned:

EF Core Isn't the Problem

Initially, we thought EF Core was making dual databases difficult. The real issue was trying to apply single-database ORM patterns to a multi-database architecture.

Once we adopted separate entities and contexts, EF Core worked much more effectively with our architecture.

Separate Everything Early

Avoid sharing entities between contexts. The minor code duplication is worthwhile for the architectural clarity and framework compatibility it provides.

Consistency Checks Are Critical

With multiple databases, automated consistency checking becomes essential. Implement these checks from the beginning of the project rather than waiting for production issues to surface.

The Performance Gains Are Real

Public website query performance improved significantly after separating read and write workloads into dedicated databases.

Would I Do It Again?

Yes, but with clearer expectations upfront.

The dual-database architecture solved our core problems:

  • Drafts stay private until explicitly published
  • Public website performance is independent of authoring complexity
  • Security is simpler since the published database has no sensitive data
  • We can scale read and write workloads independently

But it came with operational complexity:

  • Two databases to monitor and maintain
  • More complex deployment procedures
  • Eventual consistency considerations
  • Additional development environment setup

The Alternative We Should Have Considered

Looking back, there was another approach we could have tried: single database with read replicas. This would involve using the same database for both authoring and publishing, but directing public website queries to a read replica with global query filters to ensure only published content is visible. The read replica approach might have been simpler to implement and maintain, avoiding many of the consistency and deployment challenges we encountered. However, we had already committed significant effort to the dual-database approach by the time we considered this alternative.
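
For comparison, the global query filter part of that alternative is a few lines of EF Core configuration. This is a sketch of the idea, not something we built: the PublicSiteDbContext name and Article shape are assumptions, and its connection string would point at the read replica.

```csharp
using Microsoft.EntityFrameworkCore;

public class Article
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
    public bool IsPublished { get; set; }
}

// Used only by the public website, connected to the read replica.
public class PublicSiteDbContext : DbContext
{
    public PublicSiteDbContext(DbContextOptions<PublicSiteDbContext> options)
        : base(options) { }

    public DbSet<Article> Articles => Set<Article>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Applied automatically to every LINQ query on Article through this context.
        modelBuilder.Entity<Article>().HasQueryFilter(a => a.IsPublished);
    }
}
```

The filter removes the need to repeat the IsPublished condition in every query, although a query can still opt out explicitly with IgnoreQueryFilters(), so drafts remain one call away in the same tables.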

The Bottom Line

Sometimes simple feature requests lead to significant architecture changes. Adding draft support required learning more about database design, EF Core patterns, and deployment considerations.

The editorial team appreciates being able to save work without immediate publishing, and the public website benefits from the optimized read database.

The dual-database approach worked for our use case, though other solutions might be better for different requirements.
