Implementing Cross-entity Unique Keys in Kalix - Part 2

Renato Cavalcanti, Principal Engineer, Lightbend

3 October 2022
7 minute read

Introduction

In part 1, we discussed the challenges of implementing cross-entity unique keys. We explained why a solution based on Views is unsatisfactory and hinted that there are other ways to build strictly consistent models with cross-entity unique keys in a distributed system. In this part 2, we show how to combine Value Entities and Event Sourced Entities to achieve that.

For the rest of this article and in the following part 3, we will only cover the User handle. The same technique can be applied for the User email.

NOTE

The full project code can be found here.

We can start our exploration with a simple observation: an entity in Kalix is always consistent with itself. When we send a command to an entity, there is only one instance of this same entity in our system and it will process each command sequentially. In other words, entities offer very strong consistency guarantees. We will use this characteristic in our favor to build cross-entity unique keys.
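To make the "one instance, one command at a time" idea concrete, here is a minimal plain-Java sketch. It is not Kalix code: SequentialEntity and its mailbox are hypothetical names, and a single-threaded executor stands in for whatever mechanism the runtime actually uses to serialize commands.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical toy entity (not the Kalix SDK): every command is queued on one
// single-threaded executor, so the state is only touched by one command at a time.
final class SequentialEntity {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "entity-mailbox");
        t.setDaemon(true); // do not keep the JVM alive for this demo thread
        return t;
    });

    // A plain field without locks: safe only because the mailbox thread is its sole user.
    private int counter = 0;

    // A "command": it runs on the mailbox thread and replies with the new state.
    CompletableFuture<Integer> increment() {
        return CompletableFuture.supplyAsync(() -> ++counter, mailbox);
    }

    void shutdown() {
        mailbox.shutdown();
    }
}
```

However many callers invoke increment() concurrently, each reply carries a distinct value, because no two commands ever interleave.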

We said that we want the User handle to be unique across all User entities, but we don’t want it to be part of the primary key because we want to allow users to change the handle in the future.

The first thing that we need to do is to promote the concept of handle to an entity. Up to now, a handle was just a field inside the User entity. We will make it an Entity in itself and therefore we will make it unique by definition.

We will first define its state in protobuf, for instance in src/main/proto/com/example/handle/domain/handle_domain.proto:

syntax = "proto3";

package com.example.handle.domain;

option java_outer_classname = "HandleDomain";

message HandleState {
  string handle = 1;
  bool confirmed = 4;
}

Next comes its service definition, this time as a Value Entity, defined in src/main/proto/com/example/handle/handle_api.proto:

syntax = "proto3";

import "google/protobuf/empty.proto";
import "kalix/annotations.proto";

package com.example.handle;

option java_outer_classname = "HandleApi";

message Handle {
  string handle_id = 1 [(kalix.field).entity_key = true];
}

service HandleService {
  option (kalix.codegen) = {
    value_entity: {
      name: "com.example.handle.domain.HandleEntity"
      entity_type: "handle"
      state: "com.example.handle.domain.HandleState"
    }
  };
  option (kalix.service).acl.allow = { principal: ALL };

  rpc Reserve(Handle) returns (google.protobuf.Empty);
  rpc Confirm(Handle) returns (google.protobuf.Empty);
}

Note that the state has a handle field (its primary key) and a confirmed field, and that the service exposes only two methods for now: Reserve and Confirm. The handle has a lifecycle: first we reserve it, then we confirm it. We will cover later why we need the confirmed flag.

The implementation for these two methods for the Handle Value Entity will look like this:

@Override
public Effect<Empty> reserve(HandleDomain.HandleState currentState, HandleApi.Handle handle) {
  // a non-null state means this handle was already created by someone else
  if (currentState != null)
    return effects().error("Handle [" + handle.getHandleId() + "] is already taken");
  else
    return effects()
      .updateState(
        HandleDomain.HandleState.newBuilder().setHandle(handle.getHandleId()).build())
      .thenReply(Empty.getDefaultInstance());
}

@Override
public Effect<Empty> confirm(HandleDomain.HandleState currentState, HandleApi.Handle handle) {
  // guard against confirming a handle that was never reserved (currentState would be null)
  if (currentState == null)
    return effects().error("Handle [" + handle.getHandleId() + "] was not reserved");
  HandleDomain.HandleState updated = currentState.toBuilder().setConfirmed(true).build();
  return effects().updateState(updated).thenReply(Empty.getDefaultInstance());
}

As we can see, we can only reserve it if it was never created before. If currentState is not null, it means that someone already created that handle and we can’t take it anymore.
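The reserve-then-confirm rule can be restated as a tiny state machine. The sketch below is plain Java rather than Kalix code; HandleLifecycle and its Status enum are hypothetical names, with FREE playing the role of the null state. Rejecting a confirm on a handle that was never reserved is an assumption of this sketch.

```java
// Hypothetical plain-Java model of the Handle lifecycle; not the Kalix SDK API.
final class HandleLifecycle {
    // FREE corresponds to "no state yet" (the null currentState in the entity).
    enum Status { FREE, RESERVED, CONFIRMED }

    private Status status = Status.FREE;

    // Mirrors reserve(): succeeds only if the handle was never created before.
    boolean reserve() {
        if (status != Status.FREE) {
            return false; // already taken, whether confirmed or not
        }
        status = Status.RESERVED;
        return true;
    }

    // Mirrors confirm(): flips the flag on an existing handle.
    // Assumption of this sketch: confirming a handle that was never reserved is rejected.
    boolean confirm() {
        if (status == Status.FREE) {
            return false;
        }
        status = Status.CONFIRMED;
        return true;
    }

    Status status() {
        return status;
    }
}
```

A second reserve() returns false even while the handle is still unconfirmed, which is exactly what makes the reservation act as a barrier.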

In part 1 we said that we need a coordination point to make decisions about the uniqueness of the user handle. We will now implement an Action that will play the role of coordinator between the Handle entity and the User entity.

We can define the action in src/main/proto/com/example/api/user_action.proto:

syntax = "proto3";

import "google/protobuf/empty.proto";
import "kalix/annotations.proto";

package com.example.api;

option java_outer_classname = "UserActionApi";

message CreateUser {
  string full_name = 2;
  string handle = 3;
  string email = 4;
}

message UserId {
  string value = 1;
}

service UserActionService {
  option (kalix.codegen) = {
    action: { name : "UserAction" }
  };
  option (kalix.service).acl.allow = { principal: ALL };
  rpc Create(CreateUser) returns (UserId);
}

Its implementation looks like this:

@Override
public Effect<UserActionApi.UserId> create(UserActionApi.CreateUser createUser) {

  // the action generates the user id itself
  UserActionApi.UserId userId =
    UserActionApi.UserId
      .newBuilder().setValue(UUID.randomUUID().toString()).build();

  // reserve the handle first
  HandleApi.Handle reserveHandleCmd =
    HandleApi.Handle
      .newBuilder()
      .setHandleId(createUser.getHandle())
      .build();

  CompletionStage<Empty> handleRes =
    components().handleEntity().reserve(reserveHandleCmd)
      .execute();

  // create user command
  UserApi.CreateUser createUserCmd =
    UserApi.CreateUser.newBuilder()
      .setUserId(userId.getValue())
      .setFullName(createUser.getFullName())
      .setHandle(createUser.getHandle())
      .setEmail(createUser.getEmail())
      .build();

  CompletionStage<Effect<UserActionApi.UserId>> userCreationRes =
    handleRes
      // compose handleRes with userCreation
      .thenCompose(handleCreated ->
        components()
          .userEntity().create(createUserCmd)
          .execute()
          // when user creation completes, return user id wrapped in an effect
          .thenApply(userCreated -> effects().reply(userId))
      )
      // if the reservation or the user creation fails, reply with the failure message
      .exceptionally(exp -> effects().error(exp.getMessage()));

  return effects().asyncEffect(userCreationRes);
}

Let’s see what we have put in place.

The first thing to note is that the action receives its own CreateUser command. It is similar to the User entity command, but it doesn’t have a userId. The reason is that the Action will generate a UUID for it.

After that, it reserves the handle by making a call to the Handle entity. There are only two possible outcomes for this call: either it reserves the handle by creating an unconfirmed Handle entity, or it fails because the Handle has already been created by someone else.

Let’s remember that such an operation on an Entity is atomic and there is no possibility that the same handle (primary key) can be created in two different places.
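The effect of this guarantee can be illustrated with a small concurrency experiment. This is a plain-Java sketch under stated assumptions, not Kalix code: HandleStore and ReservationRace are hypothetical names, and ConcurrentHashMap.putIfAbsent stands in for the per-key atomicity that the entity provides.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for all Handle entities, keyed by handle id.
final class HandleStore {
    // value = confirmed flag; a missing key means the handle was never reserved
    private final ConcurrentHashMap<String, Boolean> confirmedByHandle = new ConcurrentHashMap<>();

    // putIfAbsent is atomic per key: exactly one caller sees null and wins.
    boolean reserve(String handleId) {
        return confirmedByHandle.putIfAbsent(handleId, Boolean.FALSE) == null;
    }
}

final class ReservationRace {
    // Lets `contenders` threads reserve the same handle at once; returns how many succeeded.
    static int race(HandleStore store, String handleId, int contenders) {
        ExecutorService pool = Executors.newFixedThreadPool(contenders);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger winners = new AtomicInteger();
        try {
            for (int i = 0; i < contenders; i++) {
                pool.execute(() -> {
                    try {
                        start.await(); // line everyone up before the race
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    if (store.reserve(handleId)) {
                        winners.incrementAndGet();
                    }
                });
            }
            start.countDown();
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("contenders did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return winners.get();
    }
}
```

Calling ReservationRace.race(new HandleStore(), "alice", 8) always yields a single winner, regardless of how the threads are scheduled.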

For the moment, let’s only consider the sunny day scenario and assume that handleRes is a successfully completed CompletionStage.

In the next step, we compose handleRes with a call that creates the User entity. Again, we assume that it completes successfully. At this point, we have created two entities: the Handle and the User. The Handle is still unconfirmed, but it has already played its role as a consistency barrier that guarantees its uniqueness across the whole application.

Let’s now briefly consider the failure scenarios. The first possible failure and the easiest to solve is when the Handle exists already. In that case, handleRes will be completed with a failure and we will never create the User entity. This is the easiest one as it will immediately return to the caller with a message that the chosen handle is already in use. In that case, the barrier just worked as expected and we are done.

There is however a less obvious failure scenario.

What happens when we reserve the Handle but then fail to call the User entity because of a hardware or network issue? In this scenario, the caller may receive an error message or, more likely, a timeout, and will try again. On retry, they will be told that the handle is already taken. In other words, the Handle is reserved but not actually in use, because we never created the User entity.

Such a failure should be rare, but it can still occur. When it does, the user is blocked from using their preferred handle, but at least the system stays consistent.
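Both failure scenarios can be reproduced with a plain CompletableFuture pipeline that has the same shape as the action (reserve, then create, then map failures to an error). This is a sketch, not Kalix code: CreateUserFlow, its in-memory sets and the userEntityDown switch are hypothetical and exist only to simulate an outage.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CompletionStage;

// Hypothetical in-memory model of the coordinator; not the Kalix SDK.
final class CreateUserFlow {
    final Set<String> reservedHandles = new HashSet<>();
    final Set<String> users = new HashSet<>();
    boolean userEntityDown = false; // flip to simulate a hardware or network issue

    CompletionStage<Void> reserve(String handle) {
        if (reservedHandles.add(handle)) {
            return CompletableFuture.completedFuture(null);
        }
        return CompletableFuture.failedFuture(
            new IllegalStateException("Handle [" + handle + "] is already taken"));
    }

    CompletionStage<Void> createUser(String userId) {
        if (userEntityDown) {
            return CompletableFuture.failedFuture(
                new IllegalStateException("User entity unreachable"));
        }
        users.add(userId);
        return CompletableFuture.completedFuture(null);
    }

    // Same shape as the action: the user is only created if the reservation succeeded.
    String create(String userId, String handle) {
        return reserve(handle)
            .thenCompose(reserved -> createUser(userId))
            .thenApply(created -> "created " + userId)
            .exceptionally(exp -> "error: " + rootMessage(exp))
            .toCompletableFuture()
            .join();
    }

    // Composed stages wrap failures in CompletionException; unwrap to the original message.
    private static String rootMessage(Throwable exp) {
        Throwable t = exp;
        while (t instanceof CompletionException && t.getCause() != null) {
            t = t.getCause();
        }
        return t.getMessage();
    }
}
```

With userEntityDown set to true, create("u1", "alice") returns an error and leaves "alice" reserved with no user behind it. A retry after the outage then fails with "Handle [alice] is already taken", which is the blocked-handle situation described above.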

In the next post, we will cover how to unlock such an unused Handle and talk a little more about recovery strategies and clean-ups. Stay tuned...