Kalix: Move to the Cloud. Extend to the Edge. Go Beyond.

Jonas Bonér: CTO & Co-founder, Lightbend.

17 May 2022,
12 minute read

Moving to the cloud and edge comes with big rewards such as new use-cases and business opportunities.

But working out how to build cloud and edge applications efficiently can be a confusing journey, mainly due to the lack of managed platforms that unify:

Developer Experience (DX) for modeling and building stateful and stateless cloud-native and edge-native applications (including consistency and query models).
ZeroOps Experience, merging the cloud and edge into a single “thing,” a cloud-to-edge continuum that provides a unified way to do system design, deployment, and operations.
Reactive Runtime that delivers ultra-low latency with high resilience throughout the whole range of cloud and edge applications by continuously optimizing data access, placement, locality, and replication.

We set out to create a powerful PaaS that addresses these challenges for the cloud-to-edge continuum. An API PaaS that enables any developer with any language to build high-performance, data-centric applications without a database.

We did it. This is the story of Kalix.

Cloud complexity is slowing down engineering teams

Cloud infrastructure today is a thriving ecosystem of products driving innovation in many areas. But cloud infrastructure is proving to be quite complex, skills are limited, and development teams are drowning in how, where, when, and even why to make use of it. Business initiatives structured on the promise of the cloud—speed to market, scalability, and cost-effectiveness—are instead faced with slower and more costly projects than projected, fraught with overwhelmed engineering teams and a growing accumulation of technical debt.

To tackle some of this complexity, Kubernetes has emerged as the de facto standard “operating system” for the cloud today; it is excellent at managing, orchestrating, and ensuring availability and scalability of empty “boxes”: the containers. But that’s only half of the story. What you put in these “boxes,” and how you stitch them together into a single coherent system, matters just as much. There are too many options and decisions today, leaving developers with the task of building a complete application in a more or less ad hoc fashion, often using tools, techniques, and patterns that they already know but that were not designed for distributed systems (the cloud, microservices, Kubernetes, service meshes), which sets them up for failure.

There needs to be an equal investment in the application layer to address this problem. We need to make it easier for developers to build complete applications that bring together, complement, and take full advantage of all the excellent underlying cloud infrastructure we have at our disposal. Also, more efficient cloud infrastructure utilization results in better-performing and more scalable applications, lower costs, and reduced environmental impact.

That sounds simple enough, but there are many roadblocks preventing organizations from fully embracing the value of what cloud infrastructure can provide, including:

Managing local and distributed application data with consistency and integrity.
Managing local and distributed workflow, communication, and coordination.
Managing client context and communication.
Maintaining business rules and operational semantics of the application as a whole.
Ensuring intelligent and adaptive co-location of data, compute, and end-user.
Integration with other systems.

So, how can we help developers manage this complexity gracefully? With a programming model and a holistic Developer Experience (DX) designed specifically for the cloud and edge, working in concert with the underlying infrastructure in maintaining application properties and requirements on behalf of its users.

Serverless, sure—but no cigar

Alfred North Whitehead famously said, “Civilization advances by extending the number of important operations which we can perform without thinking of them.” This quote translates directly to software development. We are constantly climbing the ladder of abstractions, trying to do more with less, faster, and automate away as much as possible.

The Serverless DX is a considerable step in this direction and shows the way for the future DX for the cloud and edge. As I’ve said for a long time, the Serverless DX is too revolutionary to be left to Function-as-a-Service (FaaS) products. In the last year, we have seen many different products, such as databases, message brokers, and API platforms, providing a serverless DX.

It's all great and a step in the right direction, but application developers are left with many different SDKs and APIs that they must learn to compose, each with its own feature set, semantics, guarantees, and limitations. The result is an integration project with a new bag of challenges: maintaining end-to-end correctness, data integrity, and consistency while ensuring the efficiency, scalability, and availability of the system as a whole is very hard. As we all know, systems most often break at their boundaries, where disparate parts are composed into a cohesive whole.

We believe we can do better than this by taking yet another step up the ladder of abstractions and developing a unifying abstraction layer that pulls together the necessary pieces (including databases, message brokers, caches, service meshes, API gateways, blob storage, CDNs, and CI/CD products) and exposes them through a single unified programming model and DX, tailored for the cloud and edge. A programming model with well-defined and thought-through holistic semantics, ensuring end-to-end guarantees and SLAs. A programming model and Serverless DX that lets us as developers focus on the essence of value creation: building direct end-user and business value, leaving us with a coherent, understandable, predictable, and maintainable system, all managed for us in the cloud.

As Stephen O’Grady of RedMonk writes, calling for vertical integration between the application and database layers: “There are already too many primitives for engineers to deeply understand and manage them all, and more arrive by the day. And even if that were not the case, there is too little upside for the overwhelming majority of organizations to select, implement, integrate, operate and secure every last component or service. Time spent managing enterprise infrastructure minutiae is time not spent building out its own business.”

Meet Kalix!

So what is the ideal programming model and DX for cloud application development and edge computing? What is the minimum set of things developers need to be in charge of that can’t be outsourced, generated, or automated? I think it comes down to three things:

Domain data—how to model the business data; its structure, constraints, guarantees, and query model.
Business logic—how to act and operate on the data; mine intelligence, transform, downsample, relay, and trigger side-effects.
API—how to communicate and coordinate between services and the outside world, using workflow, dataflow, communication, and coordination patterns.

In Kalix, developers only have to care about these three things. They can declaratively define the API for a service (the commands and events it receives and emits), the domain data the service manages (including its constraints, guarantees, and how to query it), and then write the business logic. Once complete, push it up to the cloud, and the rest is on Kalix.
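
To make these three responsibilities concrete, here is a minimal sketch in plain Python. It does not use the Kalix SDK and makes no claim about its API; the shopping-cart domain and every name in it (`AddItem`, `ItemAdded`, `Cart`, `handle`, `apply_event`) are hypothetical, chosen only to illustrate how an API of commands and events, the domain data, and the business logic might be kept apart.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# API: the command the service receives and the event it emits.
# (Hypothetical names; not part of any Kalix SDK.)
@dataclass(frozen=True)
class AddItem:
    product_id: str
    quantity: int


@dataclass(frozen=True)
class ItemAdded:
    product_id: str
    quantity: int


# Domain data: the state the service manages.
@dataclass
class Cart:
    items: Dict[str, int] = field(default_factory=dict)


# Business logic, part 1: validate a command and decide which events to emit.
def handle(cart: Cart, command: AddItem) -> List[ItemAdded]:
    if command.quantity <= 0:
        raise ValueError("quantity must be positive")
    return [ItemAdded(command.product_id, command.quantity)]


# Business logic, part 2: apply an event to produce the next state.
def apply_event(cart: Cart, event: ItemAdded) -> Cart:
    items = dict(cart.items)  # copy so the previous state is left untouched
    items[event.product_id] = items.get(event.product_id, 0) + event.quantity
    return Cart(items)


# Usage: feed two commands through the handler and fold the events into state.
cart = Cart()
for command in (AddItem("apple", 2), AddItem("apple", 3)):
    for event in handle(cart, command):
        cart = apply_event(cart, event)
print(cart.items)
```

In this sketch, storage, messaging, and routing are deliberately absent: those are the concerns the article says a managed platform would take over, leaving only the three pieces shown here in the developer's hands.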

Kalix is a fully managed PaaS for building event-driven, data-centric, real-time, cloud-native applications. It’s a complete developer and zero-ops experience from start to finish. Developers never have to see or care about any underlying infrastructure, including databases, message brokers, caches, service meshes, and API gateways. It provides “vertical integration” and extends serverless by being “databaseless,” reducing complexity and time-to-market immensely. Since Kalix manages all services and underlying infrastructure pieces in the system end-to-end, it can look across them all in a holistic way, learning how they operate collectively and optimizing things at a system level (instead of at the individual service or infrastructure component level).

Kalix builds on the lessons we have learned from more than a decade of building Akka (leveraging the actor model) and our experience helping large (and small) enterprises move to the cloud and use it in the most time-, cost-, and resource-efficient way possible.

Nowadays, businesses get measured by their data: the quality of their data, the insights they can get from their increasing volumes of data, and the speed at which they can deliver it to their users. Speed has become a competitive advantage: getting intelligence and value from your data faster, ideally in real time, as it “flies by.”

Kalix leverages Akka to enable this new category of applications: ultra-low-latency, high-throughput, scalable, always-available, self-healing, and self-adapting distributed applications, delivered through a simple polyglot programming model available for the most popular languages today, including Java, Scala, TypeScript, and JavaScript, with Go, Python, Swift, Kotlin, and more around the corner.

Kalix: bridging the worlds of edge and cloud

We have moved closer to edge computing in the last couple of years. Today, our industry has a well-built-out edge infrastructure. Most cloud providers offer small data centers out at the edge. CDNs allow you to attach compute to the static data they serve. 5G is radically changing what's possible and how customers consume data and services. The edge opens up many exciting new use-cases and possibilities for serving customers more reliably and efficiently. Examples include emergency services, trading systems, health care, factories, autonomous vehicles, e-commerce, farming, and gaming; the list is long and growing.

From an architectural perspective, the edge consists of hierarchical layers between the cloud and the devices; each layer is further away from the cloud but closer to the end-users. This means both new opportunities and challenges:

Further out, toward the devices:	Further in, toward the cloud:
10,000s to 100,000s of PoPs to coordinate (millions of “things” if we count the devices).	10s to 1,000s of nodes to coordinate.
Unreliable networks and hardware.	Reasonably reliable networks and hardware.
Limited resources and compute power.	Vast resources and compute power.
Ability to take local, faster, but less accurate decisions.	Ability to take global, slower, but more knowledgeable decisions.
Low-latency real-time streaming through in-process processing.	Batch-oriented, high latency to backend services (from the edge users' perspective).
Calls for weaker consistency guarantees (eventual or causal consistency).	Allows for stronger consistency guarantees (ACID).
More resilient and available (data, compute, and user are co-located, so everything needed to serve the user is already there).	Less resilient and available (dependent on a fast, reliable connection to the backend cloud to send/fetch data and perform computations).
Requires fine-grained data replication and mobility (that is adaptive and selective).	Coarse-grained batch-oriented data replication is possible, and data can be stationary.

On top of all this, edge computing means orders of magnitude more user and telemetry data to manage. In fact, Gartner predicts that by 2025, 75% of enterprise-generated data will be created and processed at the edge, compared to 10% today. That’s a lot of critical and very demanding challenges for most organizations to tackle in less than three years while trying to keep pace with known competitors and with (potentially even more harmful) competitors they don’t even know of yet that are building native solutions. Organizations can’t afford to add another layer of complexity by adopting a new and distinct class of products, tools, techniques, and worries to evolve and keep up with.

We need to see the edge as a continuum extending the cloud, where we simply deploy applications that can run and move seamlessly between cloud and edge layers, depending on where it is currently most efficient for them to execute. What we need is a geo-distributed cloud-to-edge data plane serving application data that can run anywhere: from the public cloud to 10,000s of Points-of-Presence (PoPs) out at the edge of the network, in close physical proximity to its users, where the co-location of data, processing, and end-user ensures ultra-low latency and high throughput. A data plane consumed as a PaaS, with a single unified programming model and DX, where the question of whether you are running in the cloud or at the edge is a deployment and runtime decision, not a development or architectural decision.

As discussed, low latency and high throughput are some of the most compelling advantages of edge computing. But many miss that the edge is just as much about high reliability and availability. It means that we can build applications that continue to function without disruption in the face of failure: applications composed of multiple autonomous services, all working independently with local data always right there, physically co-located with the end-user, where services can communicate and share data point-to-point directly at the edge without depending on an always-up connection back to the central cloud. This architecture (so-called local-first software) allows for highly resilient systems: systems that must run 24/7 without stopping and that can adaptively detect, react to, and cope with the failure of their parts.
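
One common way to let services keep working on local data and reconcile later, point-to-point, is a conflict-free replicated data type (CRDT). The sketch below is a grow-only counter (G-Counter), a standard textbook CRDT; it is offered only as an illustration of eventual consistency in a local-first design and is not a description of how Kalix replicates data. The class name `GCounter` and the replica names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # Each replica only ever increases its own slot, so it never
        # needs to coordinate with anyone before accepting a write.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of sync order or repeats.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Usage: two edge replicas accept writes while disconnected from each other
# and from the cloud, then sync directly with one another.
edge_a = GCounter("edge-a")
edge_b = GCounter("edge-b")
edge_a.increment(3)
edge_b.increment(2)

edge_a.merge(edge_b)
edge_b.merge(edge_a)
edge_b.merge(edge_a)  # a repeated sync is harmless

print(edge_a.value, edge_b.value)
```

The design choice worth noting is that the merge needs no central coordinator: each replica stays available on its own, which is the resilience property the paragraph above attributes to the edge.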

Kalix is moving toward making this vision a reality. We are currently working on extending the Kalix cloud PaaS to the edge as a cloud-to-edge data plane. Soon, developers will be able to build services without worrying about whether they will, now or eventually, run in the cloud or at the far edge; the programming model and operational experience are the same. Kalix will maintain semantics throughout the cloud-to-edge continuum and ensure that the data is always where it needs to be (and just the data that is currently required and nothing more); always available, correct, and consistent; injected into the services on an as-needed basis, automatically, promptly, efficiently, and intelligently.

With Kalix and its innovative Developer and ZeroOps Experience, powered by its low-latency and resilient Reactive Runtime, building cloud applications and extending them to the edge is now simple, just as it should be.

NOTE
Watch the accompanying webinar Kalix: Tackling the Cloud to Edge Continuum.