Cohesive Storage

Cohesive.Storage is the storage abstraction layer for Cohesive systems. It stores and retrieves semantic state in terms of shapes, observations, and entities instead of forcing application code to think first in terms of documents, rows, indexes, blobs, or vendor-specific query APIs.

The goal is not to hide storage backends behind a lowest-common-denominator repository. The goal is to define the storage semantics once, expose the capabilities of each backend explicitly, and compile Cohesive queries, aggregations, transitions, processes, and APIs into the best available backend interpretation.

Cohesive.Storage is designed for systems that need to move across storage families such as object/blob storage, table storage, Cosmos DB, Elasticsearch, Azure AI Search, relational stores, event streams, and in-memory repositories used for tests or local execution.

Semantic Storage
Repository Abstractions
Point Reads, Writes, And Batches
Query And Aggregation Compilation
Change Feeds, Outboxes, And Event Sourcing
Storage Binding And Placement
Cohesive.Api Integration
Cohesive.Processes Integration
Backend Families
The Storage Role In Cohesive

Semantic Storage

Cohesive models persisted state as observations of declared shapes. Entities enrich those observations with identity, transitions, invariants, effects, and process participation. Storage repositories persist those observations and expose structured operations over them.

This gives storage a semantic contract:

the shape defines the fields and meaning of the stored value;
the observation is the persisted state snapshot;
the entity definition defines transition behavior and lifecycle rules;
the repository defines how that state is read, written, queried, streamed, and placed;
adapters project those semantics onto concrete storage engines.

Application code can therefore ask for "the selected tenant's Shipment entity" or "all matching TrainingPolicy observations" instead of rebuilding storage-specific logic in each handler.

Repository Abstractions

The core repository surface is intentionally small but expressive.

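Depending on which interfaces a backend implements, repositories support point reads, writes, optimistic concurrency, field projection, native batch declarations, query execution, change streams, and atomic outbox commits. The base interface covers point reads, writes, and batches.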

public interface IEntityRepository
{
    EntityDefinition EntityDefinition { get; }
    ShapeMappingContext MappingContext { get; }
    EntityBatchCapabilities BatchCapabilities { get; }

    Task<EntitySnapshot?> TryGet(
        OperationContext context,
        string id,
        EntityReadOptions? options = null);

    Task<EntitySnapshot> Upsert(
        OperationContext context,
        EntityWriteRequest write);

    Task<EntityBatchWriteResult> UpsertBatch(
        OperationContext context,
        EntityBatchWriteRequest request);
}

Query-capable repositories extend the base repository with the Cohesive.Relations query model.

public interface IEntityQueryRepository : IEntityRepository
{
    Task<EntityQueryResponse<EntitySnapshot>> Query(
        OperationContext context,
        EntityQuery query);

    IAsyncEnumerable<EntitySnapshot> QueryStream(
        OperationContext context,
        EntityQuery query);
}

Outbox-capable repositories can persist entity state and emitted messages atomically. They also implement IChangeStreamRepository, which is not shown here.

public interface IEntityOutboxRepository : IEntityRepository, IChangeStreamRepository
{
    Task<EntityCommitResult> UpsertWithOutbox(
        OperationContext context,
        EntityOutboxCommit commit);

    IObservationStream GetOutboxStream(
        string processorName,
        string? streamName = null,
        DateTimeOffset? startTime = null);
}

These interfaces let a backend expose only what it can actually guarantee. A simple blob-backed repository might support immutable writes and point reads. Cosmos DB can support partition-aware document writes, queries, change feed, and atomic outbox patterns. Search engines can support rich predicates and aggregations but may not be the write authority.
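As a minimal sketch of how calling code might branch on these guarantees, the example below uses C# pattern matching against stand-in interfaces. The interfaces are simplified placeholders that only mirror the names above, and BlobRepositorySketch, CosmosRepositorySketch, and Describe are hypothetical names invented for illustration, not part of Cohesive.Storage.

```csharp
using System.Collections.Generic;

// Simplified stand-ins for the interfaces above; members are omitted on purpose.
public interface IEntityRepository { }
public interface IEntityQueryRepository : IEntityRepository { }
public interface IChangeStreamRepository { }
public interface IEntityOutboxRepository : IEntityRepository, IChangeStreamRepository { }

// Hypothetical backends used only for illustration.
public sealed class BlobRepositorySketch : IEntityRepository { }
public sealed class CosmosRepositorySketch : IEntityQueryRepository, IEntityOutboxRepository { }

public static class RepositoryCapabilitySketch
{
    // Lists what a repository can do, based only on the interfaces it implements.
    public static IReadOnlyList<string> Describe(IEntityRepository repository)
    {
        var capabilities = new List<string> { "point reads and writes" };

        if (repository is IEntityQueryRepository)
            capabilities.Add("queries");

        if (repository is IChangeStreamRepository)
            capabilities.Add("change streams");

        if (repository is IEntityOutboxRepository)
            capabilities.Add("atomic outbox commits");

        return capabilities;
    }
}
```

In this sketch the blob stand-in reports only point reads and writes, while the Cosmos DB stand-in reports all four capabilities.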

Point Reads, Writes, And Batches

Cohesive.Storage separates the semantic operation from the backend behavior.

Point reads can include field selection, expected version, expected concurrency token, and an optional partition key.

var snapshot = await shipmentRepository.TryGet(
    context,
    id: shipmentId,
    options: EntityReadOptions
        .ForFields(nameof(Shipment.Status), nameof(Shipment.UpdatedAtUtc))
        .WithPartitionKey(tenantPartitionKey)
        );

Writes use observations and optimistic concurrency tokens.

var write = new EntityWriteRequest(
    Entity: shipmentObservation,
    ExpectedConcurrencyToken: snapshot?.ConcurrencyToken
    );

var committed = await shipmentRepository.Upsert(context, write);
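To make the optimistic concurrency contract concrete, here is a small in-memory sketch. It is not the Cohesive.Storage implementation: SnapshotSketch, InMemoryStoreSketch, and ConcurrencyConflictException are hypothetical types, and the rule that a null expected token means "no entity should exist yet" is an assumption made for this example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, simplified snapshot; the real EntitySnapshot is not shown in this document.
public sealed record SnapshotSketch(string Id, string Payload, string ConcurrencyToken);

public sealed class ConcurrencyConflictException : Exception
{
    public ConcurrencyConflictException(string message) : base(message) { }
}

public sealed class InMemoryStoreSketch
{
    private readonly Dictionary<string, SnapshotSketch> _items = new();

    public SnapshotSketch? TryGet(string id) =>
        _items.TryGetValue(id, out var found) ? found : null;

    // A write succeeds only when the caller's expected token matches the stored one.
    // Assumption: a null expected token means "no entity should exist yet".
    public SnapshotSketch Upsert(string id, string payload, string? expectedToken)
    {
        var currentToken = TryGet(id)?.ConcurrencyToken;

        if (currentToken != expectedToken)
            throw new ConcurrencyConflictException(
                $"Entity '{id}' changed since it was read.");

        // Every committed write is issued a fresh token.
        var committed = new SnapshotSketch(id, payload, Guid.NewGuid().ToString("N"));
        _items[id] = committed;
        return committed;
    }
}
```

A writer that reuses a stale token gets a conflict instead of overwriting a newer snapshot, which is the behavior the read-then-write sequence above relies on.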

Batch operations declare the requested atomicity. Backends can advertise whether they support single-write fallback, same-partition atomicity, or all-or-nothing atomicity.

var result = await shipmentRepository.UpsertBatch(
    context,
    new(Writes: [createShipment, createAuditRecord],
        Atomicity: EntityBatchAtomicity.SamePartition)
        );

The repository capability declaration keeps call sites honest: if a backend cannot satisfy the requested atomicity, the failure is explicit instead of silently degrading a transactional operation into independent writes.
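The explicit-failure behavior can be sketched as a guard that runs before any write is attempted. The names below are hypothetical: only SamePartition appears in the text above, and the assumption that atomicity levels form a simple ordering (None, then SamePartition, then AllOrNothing) is made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical levels; only SamePartition is named in the text above.
public enum BatchAtomicitySketch { None = 0, SamePartition = 1, AllOrNothing = 2 }

public static class BatchAtomicityGuardSketch
{
    // Throws instead of silently degrading when the backend cannot honor the request.
    public static void EnsureCanSatisfy(
        BatchAtomicitySketch strongestSupported,
        BatchAtomicitySketch requested,
        IReadOnlyCollection<string> partitionKeys)
    {
        if (requested > strongestSupported)
            throw new NotSupportedException(
                $"Requested {requested} atomicity, but the backend supports at most {strongestSupported}.");

        // Same-partition atomicity only makes sense when every write shares one partition.
        if (requested == BatchAtomicitySketch.SamePartition
            && partitionKeys.Distinct().Count() > 1)
            throw new InvalidOperationException(
                "SamePartition atomicity requires every write to target one partition.");
    }
}
```

Running the guard first means a batch either proceeds with the guarantee the caller asked for or fails before any state is written.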

Query And Aggregation Compilation

Cohesive.Storage uses the query and aggregation DSL from Cohesive.Relations. Predicates, field paths, windows, ordering, joins, and aggregation plans form a semantic intermediate representation (IR). Storage adapters compile that IR into backend-specific query languages.

Examples include:

Cosmos SQL for Cosmos DB containers;
Elasticsearch query DSL and aggregations;
Azure AI Search filters, search text, facets, and scoring projections;
table-store partition/range predicates;
blob metadata queries or manifest-backed scans;
in-memory evaluation for tests and local runtimes.

The capability model is first-class. Query compilers advertise exactly which predicate features they support:

public sealed class CosmosSqlQueryCompiler : IQueryCompiler<CosmosSqlQuery>
{
    public QueryCapabilitySet Capabilities { get; } = new(
        QueryCapability.Equality
        | QueryCapability.Prefix
        | QueryCapability.Suffix
        | QueryCapability.Contains
        | QueryCapability.Exists
        | QueryCapability.NumberRange
        | QueryCapability.DateRange
        | QueryCapability.SetMembership
        | QueryCapability.NestedAny
        | QueryCapability.ScopedFields
        | QueryCapability.Negation
        | QueryCapability.CaseInsensitiveStringComparison);
}

Before a predicate is compiled, Cohesive can inspect the required features and emit diagnostics for unsupported capabilities.

var required = QueryCapabilityInspector.GetRequiredCapabilities(predicate);

compiler.Capabilities.EnsureSupports(
    required.Value,
    operation: "compile shipment search predicate"
    );

This is important because different stores are strong in different ways. Cosmos DB may be the best authority for partitioned operational state. Elasticsearch or Azure AI Search may be better for full-text search and facets. Blob storage may be the right place for large artifacts. Cohesive.Storage lets the semantic query lead, then makes backend capability gaps explicit.
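The capability check described above can be sketched with a flags enum and a bitwise difference. This is an illustration under stated assumptions, not the actual QueryCapabilitySet implementation: QueryCapabilitySketch is a hypothetical subset of the flags listed earlier, and FullText is an invented flag added only to demonstrate an unsupported feature.

```csharp
using System;

// Hypothetical subset of the flags listed earlier, modeled as a [Flags] enum.
[Flags]
public enum QueryCapabilitySketch
{
    None = 0,
    Equality = 1 << 0,
    Prefix = 1 << 1,
    Contains = 1 << 2,
    NumberRange = 1 << 3,
    FullText = 1 << 4 // invented for this sketch to show an unsupported feature
}

public static class CapabilityCheckSketch
{
    // Returns the required flags that the compiler does not advertise.
    public static QueryCapabilitySketch Missing(
        QueryCapabilitySketch supported,
        QueryCapabilitySketch required) => required & ~supported;

    // Fails with a diagnostic that names the operation and the missing flags.
    public static void EnsureSupports(
        QueryCapabilitySketch supported,
        QueryCapabilitySketch required,
        string operation)
    {
        var missing = Missing(supported, required);

        if (missing != QueryCapabilitySketch.None)
            throw new NotSupportedException(
                $"Cannot {operation}: unsupported capabilities [{missing}].");
    }
}
```

Because the missing set is computed before compilation, the diagnostic can name the exact features a backend lacks rather than failing somewhere inside the generated query.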

Aggregations follow the same pattern. Backends declare support for count-if, filtered metrics, top-hit samples, histograms, nested buckets, and metric ordering. Unsupported aggregation plans fail with actionable diagnostics instead of producing partial or surprising results.

Change Feeds, Outboxes, And Event Sourcing

Storage is not only read/write state. It is also how systems observe change.

Cohesive.Storage includes checkpointed observation streams for raw entity changes and logical outbox messages.

var stream = shipmentRepository.GetOutboxStream(
    processorName: "shipment-effects",
    streamName: "process-effects"
    );

await stream.Process(
    async (batch, records, ct) =>
    {
        foreach (var record in records)
            await DispatchEffect(record, ct);
    },
    context.CancellationToken
    );

An ObservationRecord can represent either an entity change or an outbox event. It carries the observation, partition key, document id, stream name, subject type, subject id, subject version, correlation id, occurrence time, and concurrency token.

This supports several patterns:

change-feed processors that react to state updates;
transactional outbox dispatch for effects emitted by transitions;
process continuation messages;
audit trails;
event-sourced state reconstruction where observations or event observations are the durable record;
dual-write avoidance when state changes and downstream messages must commit together.
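The idea of a checkpointed stream can be illustrated with a small in-memory sketch. CheckpointedStreamSketch is a hypothetical type, not the IObservationStream contract: it assumes one integer position per processor name and advances that position only after the handler returns, so a record whose handler throws is delivered again on the next run.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stream; a real stream would persist checkpoints durably.
public sealed class CheckpointedStreamSketch
{
    private readonly List<string> _log = new();
    private readonly Dictionary<string, int> _checkpoints = new();

    public void Append(string record) => _log.Add(record);

    // Delivers records after the processor's checkpoint and advances the
    // checkpoint only once the handler returns without throwing.
    public int Process(string processorName, Action<string> handler)
    {
        // A processor seen for the first time starts at position 0.
        _checkpoints.TryGetValue(processorName, out var position);
        var delivered = 0;

        while (position < _log.Count)
        {
            handler(_log[position]);
            position++;
            _checkpoints[processorName] = position;
            delivered++;
        }

        return delivered;
    }
}
```

Each processor name keeps its own position, so a new processor replays the log from the start while an existing one resumes where it left off.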

Storage Binding And Placement

Cohesive.Storage includes a placement framework for deciding where an observation belongs physically.

The placement contract is EntityPartitionKeyPolicy: a semantic rule that resolves the write partition key from the operation context and observation, and optionally resolves a point-read partition key from the context and id.

public static EntityPartitionKeyPolicy CreateTenantScopePartitionKeyPolicy(
    string tenantFieldName
    )
{
    var normalizedFieldName = Guard.RequireNotNullOrWhiteSpace(tenantFieldName);

    return new(
        description: $"tenant scope resolved from field '{normalizedFieldName}'",
        writePartitionKeyResolver: (context, observation) =>
        {
            var tenantId = observation
                .GetField(normalizedFieldName)
                .GetRequiredString();

            return context
                .ResolveScope(TenantScopePolicy, tenantId)
                .PartitionKey;
        },
        pointReadPartitionKeyResolver: static (context, _) =>
            context.TryResolveSingleScope(TenantScopePolicy, out var tenant)
                ? tenant!.PartitionKey
                : null);
}

This is where Cohesive.Identity becomes part of storage placement. A tenant, workspace, organization, or other identity scope can carry a logical id and a physical partition key. The storage binding rule maps scoped observations to the correct backend partition without duplicating tenant logic in repositories, API routes, process handlers, and background workers.

The same framework can express other placement strategies:

observation id placement;
field-based placement;
identity-scope placement;
date or bucket placement;
backend-specific partition and routing rules;
policy selection by entity definition or storage annotation.
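As one illustration of combining two of these strategies, the sketch below derives a partition key from a tenant field and a monthly date bucket. PlacementSketch and TenantMonthPartitionKey are hypothetical names, and the "tenant|yyyy-MM" key layout is an assumption chosen for the example rather than a Cohesive.Storage convention.

```csharp
using System;
using System.Globalization;

public static class PlacementSketch
{
    // Field-based placement combined with a monthly bucket, for example "tenant-a|2024-03".
    public static string TenantMonthPartitionKey(string tenantId, DateTimeOffset occurredAt)
    {
        if (string.IsNullOrWhiteSpace(tenantId))
            throw new ArgumentException("A tenant id is required.", nameof(tenantId));

        // Bucket on UTC so the same instant always lands in the same partition.
        var bucket = occurredAt.UtcDateTime.ToString("yyyy-MM", CultureInfo.InvariantCulture);
        return $"{tenantId}|{bucket}";
    }
}
```

Normalizing to UTC before bucketing matters near month boundaries, where a local timestamp and its UTC equivalent can fall in different months.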

Cohesive.Api Integration

Cohesive.Api exposes storage-backed entities and queries through declared API operations. Scope policies from Cohesive.Identity can be attached to those operations, and storage placement policies can use the resolved scope.

var api = Cohesive.Api.Api.Define("Shipping")
    .Entity<Shipment>()
    .Query("Get")
        .Route("GET", "/api/shipments/{id}")
        .RouteParameter<Guid>("id")
        .Returns<ShipmentDto>()
        .Scope(ShippingTenantScope.SingleFromHeader())
        .Done()
    .Action("Search")
        .Route("GET", "/api/shipments")
        .Query<SearchShipmentsRequest>()
        .Returns<ShipmentDto[]>()
        .Scope(ShippingTenantScope.SetFromQuery())
        .Done()
    .Build();

The same semantic API definition can drive server endpoint metadata, OpenAPI output, typed clients, generated frontend request metadata, and storage-backed query execution. API adapters do not need to rediscover which operations are scoped or how the scope should be supplied; the declaration carries that information.

Cohesive.Processes Integration

Cohesive.Processes uses storage repositories to load, transition, and commit entity state as part of durable or ephemeral workflows.

The ProcessEntityRepositoryAdapter adapts a storage repository to the process runtime. When transition effects are enabled, it can persist effects into the repository outbox in the same commit as the entity update.

var processRepository = new ProcessEntityRepositoryAdapter(
    repository: shipmentRepository,
    partitionKeyPolicy: CreateTenantScopePartitionKeyPolicy(nameof(Shipment.TenantId)),
    options: new()
    {
        PersistEffectsInOutbox = true,
        EffectOutboxStreamName = "process-effects"
    });

This lets a process execute multi-step, multi-entity workflows while preserving storage semantics:

point reads use identity-aware partition placement;
transitions commit with optimistic concurrency;
emitted effects can be stored atomically through the outbox;
downstream processors can consume effect streams;
process ids and correlation ids flow into outbox records;
durable process backends can coordinate with storage repositories without becoming the storage model.

Backend Families

Cohesive.Storage is built to support multiple backend interpretations.

Blobs

Blob stores are useful for large artifacts, generated packages, datasets, durable exports, binary payloads, and append-only manifests. Cohesive adapters can bind semantic outputs to blob targets while preserving shape and observation metadata.

Tables

Table-style stores are useful for high-volume keyed access, partition/range scans, and simple operational records. Cohesive placement policies map identity scope, entity id, and field-derived partition keys onto table partition and row keys.

Cosmos DB

Cosmos DB is a natural fit for observation documents, partitioned entity repositories, query compilation, change feed processing, and atomic outbox commits.

Elasticsearch

Elasticsearch is a strong projection target for full-text search, filtering, grouping, and analytics over observation-shaped documents. Cohesive query and aggregation IR can be compiled into Elasticsearch query DSL.

Azure AI Search

Azure AI Search is a projection target for search-first experiences, typed API search endpoints, facets, scoring profiles, hybrid retrieval, and indexed views over semantic observations.

The important point is that backend choice is an interpretation of the semantic storage model. A product can use one backend as the source of truth, another as a search projection, and another as artifact storage while still working from the same shapes, observations, identity scopes, queries, and process effects.

The Storage Role In Cohesive

Cohesive.Storage is where semantic definitions meet durable infrastructure.

It provides:

repositories for point reads, writes, field projection, concurrency, and batches;
query repositories compiled from Cohesive.Relations predicates and aggregation plans;
backend capability models and diagnostics for unsupported features;
change-feed and outbox streams;
event-sourcing-friendly observation records;
typed repository wrappers through shape mapping;
storage binding and placement policies;
first-class identity-scope integration for partition keys;
adapters and compiler targets for operational stores, search stores, artifact stores, and local runtimes;
integration with Cohesive.Api and Cohesive.Processes.

The net effect is a storage layer that remains semantic at the surface and concrete at the edge. Shapes, observations, entities, scopes, transitions, and processes define what the system means; storage adapters decide how that meaning is persisted, indexed, streamed, and queried on real infrastructure.