Mastering Distributed Systems: How Temporal Can Revolutionize Your Workflow
Introduction
This post is designed to introduce you to Temporal, a powerful workflow engine, especially if you’re new to its capabilities. We’ll explore how Temporal can bring value to your projects, breaking down its complexities into understandable parts, whether you’re just starting or are already experienced in software development.
What is a Workflow?
Let’s start with some definitions:
Workflow: Think of a workflow as a series of activities that are executed in a specific order. Each step flows into the next, like a well-choreographed dance. Workflows are central to how Temporal operates. For a deeper understanding of workflows and how they differ from sagas, you can refer to this Stack Overflow discussion and this YouTube presentation.
Activity: An activity is a single, focused task—like sending an email, processing a file, or calling another service. These tasks can vary in complexity and duration.
Global Algorithm: This is an algorithm that spans across multiple components, often outliving a single process. It’s essential for coordinating activities in distributed systems.
Local Algorithm: In contrast, a local algorithm runs within a single component, using only local data.
Distributed System: This refers to a system where components are spread across different networked computers. Each part has its own data, and they communicate by sending messages over a network. As Dominik Tornow defines it, “A distributed system is a set of concurrent, communicating components that communicate by sending and receiving messages over a network. Each component has exclusive access to its own local state, which is not accessible by any other components.”
Core Abstraction: A fundamental concept or tool within a system, like transactions in a database, that simplifies complex tasks.
The Challenge of Distributed Systems
Engineering teams often face the task of developing solutions that are not just functional but also reliable, scalable, and fault-tolerant. When you build on an event-driven microservice architecture, you need to coordinate various processes across different services—this is where global algorithms come into play.
Imagine trying to complete an order in an online store. Multiple services, like payment processing, inventory management, and shipping, must work together seamlessly. Each of these services runs its own local algorithms, but the overall process is a global algorithm that orchestrates them all.
But here’s the catch: building these global algorithms is tricky. Engineers often have to solve the same problems repeatedly—handling errors, managing timeouts, dealing with failures, and balancing workloads. These are challenges inherent to distributed systems, and they must be addressed at the application level because no platform-level solution fully tackles them.
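To make that repetition concrete, here is a minimal sketch of the kind of retry code teams end up writing by hand for every service call. It is plain Python with no Temporal involved. The names `call_with_retry`, `make_flaky`, `complete_order`, and `TransientError` are hypothetical and exist only for this illustration, and the "services" are in-memory fakes.

```python
import time


class TransientError(Exception):
    """Stands in for a timeout or a temporarily unavailable service."""


def call_with_retry(step, *args, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Run step(*args), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step(*args)
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky(name, failures):
    """Build a fake service call that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def step(order_id):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TransientError(f"{name} unavailable")
        return f"{name}:{order_id}"

    return step


def complete_order(order_id, payment, inventory, shipping):
    """The global algorithm: run each service's step in order, with retries."""
    return [call_with_retry(step, order_id) for step in (payment, inventory, shipping)]
```

Even this small version leaves open questions, such as what happens if the process running `complete_order` itself crashes between steps. Those are the gaps the rest of this post is about.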
The Need for a Better Solution
What if there were a way to simplify these challenges? Think about databases that provide ACID (Atomicity, Consistency, Isolation, Durability) guarantees. They offer core abstractions like transactions and rollbacks, which guarantee that either everything in a transaction happens or nothing does. This kind of system abstracts away a lot of the complexity, making it easier for engineers to focus on the business logic rather than the underlying mechanics.
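As a small illustration of the all-or-nothing guarantee, the sketch below uses Python's built-in `sqlite3` module. The `accounts` table and the `transfer`, `balances`, and `make_db` helpers are made up for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

When `transfer` fails halfway through, the debit that already ran is undone by the rollback, so the application code never has to write its own cleanup logic for the partial update.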
Similarly, distributed systems need a generalized solution that can abstract away most of their problems, making global algorithms easier to design, understand, and maintain. This is where Temporal comes in.
Introducing Temporal
Temporal offers a new core abstraction called a Workflow. A workflow in Temporal is essentially a series of commands that guarantees the desired outcome across a distributed system, even in the face of failures or timeouts. By handling these challenges at the platform level, Temporal largely removes them from the application level, making your job as an engineer much easier.
Temporal is an open-source workflow system that lets you write your workflows in code, using languages like Go, JavaScript, and Java. The beauty of Temporal is that it can take a regular function and elevate it into a workflow that is resilient to failures like server crashes, timeouts, and restarts. Temporal records the workflow’s progress, so after a failure the function resumes from where it left off and runs to completion rather than starting over.
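One way to picture how a plain function can survive a crash is to record the result of each completed step and reuse those results when the function runs again. The sketch below is a toy model of that idea in plain Python. It is not Temporal's SDK or its actual implementation, and `DurableRunner`, `Crash`, and `order_workflow` are hypothetical names.

```python
class Crash(Exception):
    """Simulates the process dying partway through a workflow."""


class DurableRunner:
    """Records each completed step so a re-run can skip work already done."""

    def __init__(self, history=None):
        self.history = list(history or [])  # recorded step results, in order
        self._cursor = 0

    def step(self, fn, *args):
        if self._cursor < len(self.history):
            result = self.history[self._cursor]  # replay: reuse the recorded result
        else:
            result = fn(*args)  # first execution: really run the step
            self.history.append(result)
        self._cursor += 1
        return result


def order_workflow(runner, order_id, calls, crash_after=None):
    """A regular-looking function whose steps go through the runner."""

    def charge(oid):
        calls.append("charge")  # track real side effects for the demo
        return f"charged:{oid}"

    def ship(oid):
        calls.append("ship")
        return f"shipped:{oid}"

    receipt = runner.step(charge, order_id)
    if crash_after == "charge":
        raise Crash()
    tracking = runner.step(ship, order_id)
    return receipt, tracking


if __name__ == "__main__":
    calls = []
    first = DurableRunner()
    try:
        order_workflow(first, "o1", calls, crash_after="charge")
    except Crash:
        pass
    # A fresh runner picks up the saved history and finishes the job.
    second = DurableRunner(history=first.history)
    print(order_workflow(second, "o1", calls), calls)
```

In the usage at the bottom, the second run does not charge the customer again because the recorded result of the first step is replayed. This only works if the function takes the same path on every run, which is why determinism matters so much for workflow code.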
Why Temporal is a Game-Changer
Using Temporal can significantly boost both the durability and reliability of your systems while also increasing developer productivity. For instance, Nuon, an infrastructure startup, reported a 36x increase in developer efficiency after switching to Temporal. That might sound unbelievable, but consider how much easier it is to manage data in a database compared to doing it all in your application code.
How Temporal Works
Temporal revolves around two core concepts: Workflows and Activities.
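Workflows are defined using the Temporal SDK, similar to writing a regular function. However, workflows must be deterministic: given the same input and the same recorded results from their activities, they must make the same decisions in the same order every time they run. In practice, this means keeping things like random numbers, wall-clock time, and direct network calls out of workflow code.
Activities are the individual tasks within a workflow. They can be non-deterministic, like calling an external service, but it’s recommended they be idempotent (running one several times has the same effect as running it once), because a failed activity may be retried.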
These elements come together to provide durability, reliability, and scalability in distributed systems.
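A common way to make a task idempotent is to attach a stable key to each request so the receiving side can recognize a repeat. The sketch below shows the idea with a fake in-memory email service. `EmailService` and `send_confirmation` are hypothetical and are not part of any Temporal SDK.

```python
class EmailService:
    """Fake external service that remembers which idempotency keys it has seen."""

    def __init__(self):
        self.sent = []  # every email that was actually delivered
        self._seen = {}  # idempotency key -> message id returned earlier

    def send(self, idempotency_key, to, body):
        if idempotency_key in self._seen:
            # Repeat request: return the earlier result without sending again.
            return self._seen[idempotency_key]
        message_id = f"msg-{len(self.sent) + 1}"
        self.sent.append((to, body))
        self._seen[idempotency_key] = message_id
        return message_id


def send_confirmation(service, order_id, to):
    """Activity-style task: safe to retry because the key is derived from the order."""
    key = f"confirmation:{order_id}"
    return service.send(key, to, f"Order {order_id} confirmed")


if __name__ == "__main__":
    service = EmailService()
    first = send_confirmation(service, "o1", "user@example.com")
    retry = send_confirmation(service, "o1", "user@example.com")
    print(first == retry, len(service.sent))
```

Because the key depends only on the order, a retry after a timeout returns the same message id and the customer receives a single email.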
Temporal’s Architecture
Temporal consists of client SDKs and a cluster/server setup. The server orchestrates everything and communicates with clients via gRPC. The cluster can be configured in various ways, from a simple setup with all services in a single container to a complex multi-regional cluster managed by Helm charts. The architecture is illustrated in the diagram below:
You can explore more details about Temporal’s architecture in their official documentation.
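When you develop with Temporal, you write your Workflow and Activity code within your microservice. You then run a worker inside that service, which connects to the Temporal cluster, picks up tasks, and executes your workflow and activity code. The cluster itself does not run your code; it tracks state and hands out work. This setup allows for a flexible, scalable approach to managing distributed workflows.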
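The division of labor between the cluster and a worker can be sketched with an in-process queue standing in for the cluster. This is only a conceptual model: real workers talk to the Temporal server over gRPC through the SDK, and the `ACTIVITIES` registry, `activity` decorator, and `run_worker` function here are hypothetical names.

```python
import queue

# Registry of task code that lives inside the service, keyed by name.
ACTIVITIES = {}


def activity(fn):
    """Register a function so the worker can look it up by name."""
    ACTIVITIES[fn.__name__] = fn
    return fn


@activity
def reserve_inventory(order_id):
    return f"reserved:{order_id}"


@activity
def send_receipt(order_id):
    return f"receipt:{order_id}"


def run_worker(task_queue, results):
    """Drain the queue, executing each task with locally registered code."""
    while True:
        try:
            name, arg = task_queue.get_nowait()
        except queue.Empty:
            return  # nothing left to do in this toy example
        results.append(ACTIVITIES[name](arg))


if __name__ == "__main__":
    tasks = queue.Queue()  # stands in for the cluster's list of pending work
    tasks.put(("reserve_inventory", "o1"))
    tasks.put(("send_receipt", "o1"))
    results = []
    run_worker(tasks, results)
    print(results)
```

The queue only ever holds task names and arguments, while the code that does the work stays in the service. That separation is what lets you scale out by starting more workers without changing the cluster.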
Here’s another diagram that shows how workflows interact within the Temporal system:
Temporal’s Industry Impact
Temporal Technologies, founded in 2019, has rapidly gained traction, raising $128 million in funding and reaching a valuation of $1.5 billion. Companies like Netflix, DoorDash, and Stripe rely on Temporal for handling billions of workflow executions daily. This widespread adoption is a testament to its effectiveness in solving complex, large-scale distributed tasks.
Real-World Examples
Nuon
Nuon, an infrastructure startup, used Temporal to rebuild their entire provisioning system, significantly speeding up their development process. They reported a 36x increase in development velocity and expanded from 26 to over 125 endpoints in just a couple of months. The diagram below illustrates their event loop using Temporal:
Here’s a snippet of their code that represents the above workflow:
You can watch Nuon’s full video presentation here.
Instacart
Instacart, a leading gig economy company, has used Temporal for core operations in infrastructure and payments for over 2.5 years. They now run 11 million workflow executions a day with only minimal incidents. You can learn more from their conference presentation.
Conclusion
Temporal represents a major step forward in managing workflows within distributed systems. By abstracting away the complexities of distributed programming, it allows you to focus more on business logic and less on infrastructure challenges. The result is increased productivity, greater reliability, and a more scalable system.
If your team is looking to build robust, scalable solutions, adopting Temporal could be a strategic move that aligns well with your goals. By leveraging Temporal, you can expect to see significant improvements in efficiency and a more maintainable approach to building your microservices architecture.