Uni Distributed Systems Summary

Summary for the distributed systems course at HdM Stuttgart

Felicitas Pojtinger

2023-02-05

1.1 Meta

1.1.1 Contributing

These study materials are heavily based on professor Kriha’s “Verteilte Systeme” lecture at HdM Stuttgart and prior work of fellow students.

Found an error or have a suggestion? Please open an issue on GitHub (github.com/pojntfx/uni-distributedsystems-notes):

QR code to source repository

If you like the study materials, a GitHub star is always appreciated :)

1.1.2 License

AGPL-3.0 license badge

Uni Distributed Systems Notes (c) 2023 Felicitas Pojtinger and contributors

SPDX-License-Identifier: AGPL-3.0

1.1.3 Course Timeline

This course on distributed systems covers a range of topics related to the design, implementation, and management of distributed systems. The course is divided into several sections, including:

  1. Introduction to distributed systems: This section provides an overview of distributed systems and their key characteristics.
  2. Theoretical models of distributed systems: This section covers the use of theoretical models, such as queuing theory and process and I/O models, to understand and analyze distributed systems.
  3. Message protocols: This section covers the use of message protocols, including delivery guarantees, causality, and reliable broadcast, to facilitate communication between components in a distributed system.
  4. Remote procedure calls: This section covers the use of remote procedure calls (RPCs) to invoke functions on a remote machine, the mechanics of remote calls such as marshaling, and cross-language RPC frameworks such as Thrift and gRPC.
  5. Distributed objects: This section covers distributed objects such as CORBA, RMI and DCOM.
  6. Distributed business components: This section covers general component technology as well as Enterprise Java Beans (EJBs).
  7. Services: This section covers services as a distribution paradigm between hype and revolution.
  8. Theoretical foundations of distributed systems: This section covers key concepts and theories that are relevant to the design of distributed systems, including the FLP theorem, time, causality, consensus, eventual consistency, and optimistic replication.
  9. Distributed services and algorithms I: This section covers the design and implementation of various distributed services and algorithms, including load balancing, message queues, caching, and consistent hashing.
  10. Distributed services and algorithms II: This section covers more advanced topics in distributed services, including persistence, transactions, eventual consistency, and coordination.
  11. Design of distributed systems: This section covers the methodology and principles for designing distributed systems, as well as examples of different architectures and design patterns.

1.2 Introduction to Distributed Systems

1.2.1 Overview

  1. Definition of a distributed system (DS)
  2. Challenges for developers working with DS
  3. Reasons for using a DS
  4. Examples of DS
  5. Characteristics of DS
  6. Middleware for DS
  7. Concepts and architectures in DS, including scale, parallelism, and latency
  8. Resources related to DS

1.2.2 Reasons for Distributing Systems

1.2.2.1 Definition of a Distributed System

A system that is made up of independent agents that interact with each other and produce a cohesive behavior. The interactions and events in this system happen concurrently and in parallel, occurring simultaneously and independently of one another. Such a system is often used to perform tasks that are too complex or large for a single agent to handle; the concurrent and parallel nature of the interactions allows the tasks to be processed efficiently and effectively.

1.2.2.2 Emergence

Four types of emergence: strong emergence, weak emergence, evolutionary emergence, and constructed emergence.

Emergent failure modes: Instances of cascading failures in constructed emergence.

1.2.3 Why are Distributed Systems complicated?

1.2.3.1 Why Distribute a System?

There are several reasons why an organization might choose to use a distributed system:

1.2.3.2 Examples of Distributed Systems

There are many examples of distributed systems, including:

1.2.4 Characteristics of Distributed Systems

1.2.4.1 Overview

1.2.4.2 Eight Fallacies of Distributed Computing

1.2.4.3 Programming Languages and Distributed Systems

There are two main approaches to programming languages and distributed systems: the transparency camp and the message camp.

The transparency camp focuses on hiding the complexity of a distributed system from the programmer. This can be achieved through techniques such as creating type-safe interfaces and calls, and hiding security, storage, and transactions behind frameworks such as .NET or Enterprise Java Beans (EJBs). This approach treats the distributed system as a programming model, rather than something that requires special handling.

The message camp, on the other hand, takes a more direct approach to programming distributed systems. This approach typically involves using a simple create, read, update, delete (CRUD) interface and using message content as the interface. Messages are often coarse-grained, meaning that they carry a large amount of data in a single message, often in the form of documents. Programmers in this camp deal with the complexity of remoteness directly, and architectures are often event-based or based on the representational state transfer (REST) model.

1.2.4.4 History of Distributed Systems

The history of distributed systems can be divided into several distinct periods:

1.2.4.5 Metcalfe’s Law

Metcalfe’s law is a principle that states that the value or utility of a network increases as the number of users in the network increases. This is because the more people who are using the network, the more useful it becomes as a platform for communication, collaboration, and the exchange of information and resources. The adoption rate of a network also tends to increase in proportion to the utility provided by the network, which is why companies often give away software or other products for free in order to increase the size of their user base and the value of their network.
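
Quantitatively, Metcalfe's law is usually written as the value of a network growing with the square of the number of users n, since n users can form n(n-1)/2 pairwise connections:

value \propto \frac{n(n-1)}{2} \sim n^{2}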

Metcalfe’s law is often cited as a factor that can contribute to the emergence of scale-free, or power law, distributions in networks. This type of distribution is characterized by a few nodes (or users) with many connections, and many nodes with only a few connections. The existence of network effects, in which the value of a network increases with the number of users, can help to explain why we don’t see many Facebooks or Googles – it can be difficult for new networks to gain traction and achieve the same level of utility as established networks with a large user base.

1.2.4.6 Security Topics for Distributed Systems

Security is an important concern in distributed systems, as they often handle sensitive data or perform critical functions. Some key security topics that are relevant to distributed systems include:

1.2.4.7 Theoretical Foundations of Distributed Systems

The theoretical foundations of distributed systems are a set of concepts and principles that form the basis for the design and analysis of these systems. Some of the key theoretical foundations of distributed systems include:

1.2.4.8 Distributed Systems Design Fields

The design of a distributed system involves addressing a number of common problems and considering various architectural factors in order to create a system that is scalable, reliable, and secure. Some key considerations in distributed system design include:

1.2.5 Middleware

1.2.5.1 An Introduction to Middleware

Middleware is software that sits between the operating system and the application layer of a distributed system, providing a layer of abstraction that enables communication and resource sharing among the various components of the system. Some key characteristics of middleware include:

The importance of middleware in distributed systems cannot be overstated. It is essential for enabling communication and resource sharing among the various components of the system, and for abstracting away the complexities of working with distributed systems. Without middleware, it would be much more difficult to develop distributed applications that are scalable, reliable, and secure.

1.2.5.2 Distribution Transparencies

Distribution transparencies are features that are designed to hide the complexities of working with distributed systems from the user or developer. Some key distribution transparencies include:

1.2.5.3 Classification of Middleware

Middleware can be classified into several categories based on the type of service it provides and the way it communicates with other components in a distributed system. Some common types of middleware include:

1.3 Theoretical Models of Distributed Systems

1.3.1 Overview

  1. Message passing theoretical model
  2. Distributed computing topologies
  3. Client-server systems, including critical points, architectures, processing and I/O models

1.3.2 Basic Principles

1.3.2.1 Synchronous vs. Asynchronous Systems

Synchronous and asynchronous systems are two types of distributed systems that differ in the way that they handle communication and the passage of time.

In a synchronous system, events are assumed to be delivered in a lockstep manner, with a fixed delay between the occurrence of an event and its delivery. This means that events are delivered at predetermined intervals, and the system can be designed to operate on the assumption that events will be delivered at these intervals.

Asynchronous systems do not have a fixed delay between the occurrence of an event and its delivery. Events may be delivered at any time, and the system must be able to handle this uncertainty. Asynchronous systems typically require more complex distributed algorithms to ensure correct operation, but they are generally easier to build and maintain than synchronous systems.

In practice, most distributed systems are asynchronous, with additional mechanisms such as failure detectors and randomization used to help ensure correct operation.

1.3.2.2 Properties of Message Protocols

Message protocol properties are characteristics that describe the desired behavior of a message passing protocol in a distributed system. These properties are used to ensure that the protocol operates correctly and achieves its intended goals.

Some common message protocol properties include:

1.3.2.3 Complexity of Distributed Algorithms

1.3.2.4 Failure Types

In distributed systems, there are several types of failures that can occur. These failures can have different impacts on the system and can require different approaches to handling them.

Some common types of failures in distributed systems include:

1.3.2.5 Distributed Computing Topologies

There are several types of distributed computing topologies that can be used to design distributed systems. These topologies can have different characteristics and can be used to achieve different goals, depending on the needs of the system.

Some common types of distributed computing topologies include:

1.3.2.6 Queuing Theory: Kendall Notation M/M/m/β/N/Q

The Kendall notation describes a queuing system by its arrival process (M = Markovian, i.e. memoryless), service time distribution (M), number of servers (m), buffer size (β), population size (N), and queuing discipline (Q). It is often used in the field of operations research to analyze the performance of systems that have a finite number of servers and a finite queue size.

1.3.3 Queuing Theory

1.3.3.1 Generalized Queuing Theory Terms (Henry Liu)

1.3.3.2 Little’s Law
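
Little's Law relates the average number of requests in a queuing system to the arrival rate and the time spent in the system:

L = \lambda \cdot W

where L is the average number of requests in the system, \lambda is the average arrival rate, and W is the average time a request spends in the system. For example, a server receiving 100 requests per second with an average response time of 0.2 seconds holds on average 100 \cdot 0.2 = 20 requests at any moment.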

1.3.3.3 Heijunka

Heijunka is a Japanese term that refers to the practice of leveling the production process by smoothing out fluctuations in demand and task sizes. It is often used in lean manufacturing and just-in-time (JIT) production systems to improve the efficiency and flow of work through a system.

The goal of Heijunka is to create a steady, predictable flow of work through the system by reducing variability in task sizes and demand. This can be achieved through a variety of methods, such as:

By leveling the production process and reducing variability in task sizes, Heijunka can help to improve the efficiency and flow of work through a system, and reduce the risk of bottlenecks or delays caused by large differences in task size.

1.3.3.4 Lessons Learned from Queuing Theory

1.3.3.5 Request Problem in Multi-Tier Networks

In a multi-tier network, the request problem refers to the fact that requests must travel through multiple layers or tiers of servers in order to be processed, and each layer adds its own processing time and potential delays to the overall response time.

The average response time in a multi-tier network is therefore the sum, over all layers, of the number of trips to a layer multiplied by the wait time at that layer (the time a request spends waiting for a server to become available), plus the sum of the service demands at each layer (the time it takes a server to process a request).
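
One way to formalize this (notation mine, not from the lecture), with trips_i, wait_i, and servicedemand_i for layer i of n layers:

averageresponsetime = \sum_{i=1}^{n} trips_i \cdot wait_i + \sum_{i=1}^{n} servicedemand_i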

It is important to note that in a multi-tier network, all requests are synchronous and may be in contention with each other, which means that wait times can occur due to contention for server resources. This can impact the overall efficiency of the system and may require the use of techniques such as caching or batching to reduce the number of requests that need to be processed.

1.3.3.6 Task Size Problem in Multi-Tier Networks

In a multi-tier network, the task size problem refers to the fact that differences in task size can cause delays and inefficiencies in the processing of requests.

In the case of pipeline stalls between nodes, large differences in task size can cause requests to be held up at one node while waiting for the next node to become available, leading to delays in the overall response time.

1.3.3.7 Theories & Model vs. Reality

When applying queuing theory models to real-world systems, there are several factors that can impact the accuracy and usefulness of the model. These include:

1.3.4 Client/Server Architectures

1.3.4.1 Critical Points in Client/Server Systems

On the client side:

On the server side:

Between both: Bandwidth/latency. The bandwidth and latency of the network connection between the client and the server can impact the performance and efficiency of the system.

1.3.4.2 Stateful vs Stateless Systems (The Stateful Server Problem)

The stateful server problem refers to the trade-offs that must be considered when designing and implementing a server-based system that maintains state information.

On the one hand, stateful servers have several advantages:

However, stateful servers also have some disadvantages:

Stateless design, on the other hand, stores all data in external storage such as databases or caches, which can make it easier to design and implement the system. However, in the event of failures, programming stateless systems can be more difficult, as all data must be retrieved from external storage.

1.3.4.3 Terminology for Client/Server Systems

1.3.4.4 Overarching Client/Server Architectures

1.3.5 Process and I/O Models

1.3.5.1 Different Process Models

1.3.5.2 Questions for Process Models

1.3.5.3 Amdahl’s Law

According to Amdahl’s Law, the maximum improvement in overall system performance that can be achieved by improving a particular part of the system is limited by the fraction of time that the improved part of the system is used. In other words, if only a small portion of the system’s workload is affected by the improvement, the overall improvement in performance will also be small.

speedup = \frac{1}{(1 - parallelfraction) + \frac{parallelfraction}{numberofprocessors}}

For example, if a particular part of a system is improved so that it runs twice as fast, but that part is only active 10% of the time, the overall execution time only drops from 1 to 0.95, a speedup of roughly 5% (even an infinitely fast version of that part could never yield more than about 11%). If the improved part accounts for 50% of the time, the same twofold improvement yields a much larger overall speedup of about 33%.
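
In the notation of the formula above, with a twofold improvement of a part that is active 10% respectively 50% of the time:

speedup = \frac{1}{(1 - 0.1) + \frac{0.1}{2}} = \frac{1}{0.95} \approx 1.05

speedup = \frac{1}{(1 - 0.5) + \frac{0.5}{2}} = \frac{1}{0.75} \approx 1.33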

Amdahl’s Law is often used to understand the potential benefits and limitations of optimizing or improving specific parts of a system. It can be a useful tool for determining how much resources should be invested in improving a particular part of the system, and for understanding the potential impact of those improvements on overall system performance.

1.3.5.4 Different I/O Models

1.3.5.5 Questions for I/O Models

1.4 Message Protocols

1.4.1 Overview

  1. Message protocols
    1. Delivery guarantees in point-to-point communication
    2. Reliable broadcast
    3. Request ordering and causality
  2. Programming client-server systems using sockets

1.4.2 Delivery Guarantees

1.4.2.1 The Role of Delivery Guarantees

Shop order example: Consider an online shop in which orders are placed and processed. The goal is to ensure that orders are delivered correctly and efficiently, regardless of any potential issues that may arise.

1.4.2.2 Why is TCP not Enough?

While TCP (Transmission Control Protocol) is a widely used networking protocol that provides a reliable communication channel between devices, it is not always sufficient on its own to ensure proper behavior in all situations. Here are some reasons why TCP may not be enough:

To address these and other issues, it may be necessary to use additional protocols or techniques, such as message-oriented middleware or application-level protocols, to ensure proper behavior in case of connection failures and to provide additional features and guarantees for message delivery.

1.4.2.3 Different Levels of Timeouts

1.4.2.4 Delivery Guarantees for RPCs

1.4.3 Idempotency

1.4.3.1 Overview

Idempotency is a property of operations or requests that ensures that they can be safely repeated without changing the result of the operation. In other words, if an operation is idempotent, it will have the same result whether it is performed once or multiple times.
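
A minimal Python sketch of the difference (a hypothetical bank-account example, not from the lecture): setting a value is idempotent, while incrementing is not.

```python
account = {"balance": 0}

def set_balance(value):
    # Idempotent: executing the same request twice leaves the same state
    account["balance"] = value

def add_to_balance(amount):
    # Not idempotent: a duplicated/retried request changes the result
    account["balance"] += amount

set_balance(100)
set_balance(100)    # retry is harmless: balance is still 100
add_to_balance(50)
add_to_balance(50)  # retry is not: balance is now 200, not 150
```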

1.4.3.2 Server State and Idempotency

Idempotency is an important property to consider when designing operations or requests that may be repeated or delivered multiple times, as it can help to ensure that the operation or request is able to be safely repeated without affecting the overall result. Here are some additional considerations related to idempotency and server state:

1.4.3.3 Implementing Delivery Guarantees for Idempotent Requests

1.4.3.4 Repeating Non-Idempotent Operations

If an operation is not idempotent, it means that it cannot be safely repeated multiple times and is likely to have unintended side effects. In this case, there are several measures that can be taken to ensure reliable communication:

It is important to carefully manage state on the server in order to avoid overloading the system with old replies. It may be necessary to set limits on how long to store old replies and when to discard them.
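
One common measure is a server-side reply cache keyed by a client-chosen request ID: a retransmitted non-idempotent request then returns the stored reply instead of being executed a second time. A sketch under these assumptions:

```python
replies = {}  # request ID -> stored reply; must be bounded/expired in practice

def handle(request_id, operation):
    # Duplicate request: return the remembered reply, do not re-execute
    if request_id in replies:
        return replies[request_id]
    result = operation()          # execute the non-idempotent operation once
    replies[request_id] = result  # remember the reply for later duplicates
    return result
```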

1.4.4 Order

1.4.4.1 Request Order in Multi-Point-Protocols

1.4.4.2 Request Ordering with Multiple Nodes

In a multi-node system, it may be necessary to use a reliable broadcast protocol to ensure that requests are processed in the desired order. Here are some examples of protocols that can be used for request ordering with multiple nodes:

1.4.4.3 Implementing Causal Ordered Broadcasts

1.4.4.4 Implementing Absolutely Ordered Casts

There are, however, disadvantages to these implementations:

1.4.5 Sockets

1.4.5.1 Properties of Sockets

Sockets are a programming interface that enables communication between networked computers. They are used to send and receive data over a network connection using either TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) connections.

Socket properties include:

1.4.5.2 Berkeley Sockets

Berkeley sockets provide a set of primitives or functions that can be used to create, manage, and manipulate network connections and communication channels.

Here is a brief description of the meaning of each of the Berkeley Sockets primitives:

1.4.5.3 Server Side Processing using Processes

Server-side processing using processes is a model for handling incoming requests in a networked system. It involves the following steps:

  1. Server: The server listens on a specified port, waiting for incoming connection requests.
  2. Client: A client connects to the server on an arbitrary port and establishes a connection between the client and the server.
  3. Server: The server accepts the connection request and spawns a new process to handle the request. The process is assigned to the same port as the original connection, and the server goes back to listening for new connection requests.

This model allows the server to scale to some degree, as it can handle multiple requests concurrently by spawning new processes to handle each request. However, process creation can be expensive, and there may be limits on the number of processes that can be created on a given system. An example of this model is traditional CGI processing in a web server, where a new process is spawned to handle each incoming request.

1.4.5.4 Server Side Processing using Threads

Server-side processing using threads is a model for handling incoming requests in a networked system that is similar to the process-based model, but uses threads instead of processes to handle requests. It involves the following steps:

  1. Server: The server listens on a specified port, waiting for incoming connection requests.
  2. Client: A client connects to the server on an arbitrary port and establishes a connection between the client and the server.
  3. Server: The server accepts the connection request and spawns a new thread to handle the request. The thread is assigned to the same port as the original connection, and the server goes back to listening for new connection requests.

This model allows the server to scale well, as it can handle multiple requests concurrently by spawning new threads to handle each request. Thread creation is less expensive than process creation, and it is possible to create a larger number of threads on a given system. An example of this model is servlet request processing in a servlet engine, also known as a “web container”.

In a threaded server, it is important to ensure that the functions that are used to handle incoming requests are re-entrant, or able to be safely called concurrently by multiple threads. This is because multiple threads may be executing the same function simultaneously, and the function must be able to handle this without causing problems.

To ensure that functions are re-entrant, it is important to avoid using unprotected global variables, as these can be modified concurrently by multiple threads, leading to potential race conditions and other problems. Instead, state should be kept on the stack, with each thread having its own copy of the state. This ensures that each thread has its own private version of the state, and can operate independently of other threads.

1.4.5.5 Design Considerations for Socket-Based Services

1.4.5.6 Stateless vs. Stateful Socket-Based Services

A stateless service is one in which the server does not store any information about previous requests or maintain any state between requests. This can have several advantages, including:

On the other hand, a stateful service is one in which the server stores information about previous requests and maintains state between requests. This can have several advantages, including:

However, stateful services can also have some drawbacks, including:

1.4.5.7 TCP SYN Flooding

TCP SYN flooding is a type of denial of service (DoS) attack that exploits a weakness in the TCP connection establishment process. In a normal TCP connection, the client sends a SYN (synchronize) packet to the server to initiate the connection, and the server responds with a SYN-ACK (synchronize-acknowledge) packet to confirm that the connection can be established. The client then sends an ACK (acknowledge) packet to complete the three-way handshake and establish the connection.

In a TCP SYN flooding attack, the attacker sends a large number of SYN packets to the server, either from a single machine or from a network of compromised machines. The server responds to each SYN packet with a SYN-ACK packet, but the client never completes the three-way handshake by sending an ACK packet. As a result, the server is left waiting for the ACK packet, and the resources used to track the half-open connections can be exhausted, preventing the server from accepting any new connections.

TCP SYN flooding attacks can be mitigated by implementing SYN cookies, which use cryptographic techniques to let the server complete the three-way handshake without storing state for half-open connections. Other measures, such as rate limiting and filtering of incoming SYN packets, can also help to prevent or mitigate the effects of TCP SYN flooding attacks.

1.4.5.8 Writing a Socket Client & Server

For the server:

  1. Define the port number of the service: The server should specify the port number on which it will listen for incoming connections. For example, the HTTP server typically listens on port 80.
  2. Allocate a server socket: The server creates a server socket and binds it to the specified port number. The server socket then listens for new connections.
  3. Accept an incoming connection: When a client attempts to connect to the server, the server socket accepts the connection and creates a new socket for the client connection.
  4. Get the input channel: The server reads the input channel from the socket to receive messages from the client.
  5. Parse the client message: The server parses the client message to understand the request being made.
  6. Get the output channel: The server gets the output channel from the socket, which is where it will send responses to the client.
  7. Do request processing: The server processes the client request, either by handling it directly or by creating a new thread to handle it.
  8. Create a response message: The server creates a response message, such as an HTTP response, to send back to the client.
  9. Write the message to the output channel: The server writes the response message to the output channel, which sends it back to the client.
  10. Read new messages or close the connection: The server can either read new messages from the client’s input channel or close the connection if it is no longer needed.

For the client:

  1. Define the hostname and port number of the server host
  2. Allocate a socket with the host and port parameters
  3. Obtain the input channel (for messages from the server) and output channel (for messages to the server) from the socket
  4. Create a message to send to the server, such as “GET /somefile.html HTTP/1.0”
  5. Write the message to the output channel to send it to the server
  6. Read the response from the input channel and display it
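
A minimal Python sketch of both sides (single-threaded and without error handling; the HTTP-style messages are only illustrative):

```python
import socket

def run_server(port=8080):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", port))              # define the port of the service
    srv.listen()                      # allocate the listening server socket
    while True:
        conn, _ = srv.accept()        # accept an incoming connection
        request = conn.recv(4096)     # get the input channel, read the message
        # ... parse the client message and do request processing here ...
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\n")  # write the response
        conn.close()                  # close the connection

def run_client(host="localhost", port=8080):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))        # hostname and port of the server host
    sock.sendall(b"GET /somefile.html HTTP/1.0\r\n\r\n")  # send the message
    print(sock.recv(4096).decode())   # read the response and display it
    sock.close()
```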

For a multithreaded client:

1.4.5.9 Distribution Transparency with Sockets

1.4.5.10 Infrastructure of Client-Server Systems

  1. Directory: Helps locate the server
  2. Proxy: Checks client authorization and routes requests through the firewall
  3. Firewall: Allows outgoing calls only
  4. Reverse proxy: Caches results, terminates SSL/TLS sessions, and authenticates clients
  5. Authentication server: Stores client data and authorizes clients
  6. Load balancer: Distributes requests across servers

1.5 Remote Procedure Calls

1.5.1 Overview

  1. Call versions: local, inter-process, and remote
  2. Mechanics of remote calls:
    1. Marshaling/serialization
    2. Data representation
    3. Message structure and schema evolution
    4. Interface definition language
    5. Tooling: generators
  3. Cross-language call infrastructures: Thrift and gRPC

1.5.2 Types of RPCs

1.5.2.1 Call Versions

  1. Local calls: made within a single process and do not require network communication
  2. Inter-process calls: Made between processes on the same machine and can be implemented using a variety of mechanisms, such as shared memory or pipes
  3. Remote calls: Made between processes on different machines and typically require network communication

1.5.2.2 Remote Calls vs. Remote Messages

1.5.2.3 In-Process (Local) Calls

1.5.2.4 Inter-Process Calls (IPC)

The good news is that inter-process communication occurs on the same hardware and typically uses the same language at sender and receiver, which reduces security issues and ensures that a system crash affects both sender and receiver (fail-stop semantics).

1.5.2.5 Remote Procedure Calls (RPC)

1.5.3 Implementing Remote Calls

1.5.3.1 Overview

1.5.3.2 Marshaling/Serialization

Marshaling/serialization is the process of converting parameters (basic types or objects) into a common transfer format (message) that can be transmitted over a network. At the target site, the message is transformed back into the original types or objects. There are several different approaches to marshaling/serialization, each with its own trade-offs:

Serialization to text is a method of converting data into a human-readable format, typically using a markup language such as XML. This approach can be less efficient in terms of the size of the resulting message, as it is generally less compact than binary serialization. It is important to be aware of language limits, such as the range of integers and floating-point numbers that can be represented in JavaScript, when using this approach. XML is a popular choice for language-independent encoding because it is widely supported and can represent a wide range of data types.

Serialization to binary is a method of converting data into a compact, machine-readable format. It requires a schema, which defines the structure of the message and the types of data it contains. This approach is generally more efficient in terms of the size of the resulting message, but it is not as flexible as text-based serialization because it requires a shared understanding of the schema between the sender and receiver. Binary serialization can also be used for language-independent encoding.
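
A small Python illustration of the trade-off (a hypothetical order record; the binary layout is an assumption): the same data serialized as self-describing text and as schema-dependent binary.

```python
import json
import struct

order = {"id": 42, "amount": 19.99}

# Text: human-readable and self-describing, but larger
text_msg = json.dumps(order).encode("utf-8")  # 27 bytes

# Binary: compact, but sender and receiver must share the schema
# ("unsigned 32-bit int followed by a 64-bit double") out of band
binary_msg = struct.pack("!Id", order["id"], order["amount"])  # 12 bytes

# Decoding the binary form is impossible without the schema
order_id, amount = struct.unpack("!Id", binary_msg)
```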

1.5.3.3 Schema Evolution

When data or functions change, it is important to consider how different versions of a system will coexist and communicate with each other.

Forward compatibility means that older receivers should be able to understand messages from newer senders. This allows newer versions of a system to be deployed without causing problems for existing clients.

Backward compatibility means that newer receivers should be able to understand messages from older senders. This allows older versions of a system to continue functioning even after newer versions have been introduced.

1.5.3.4 Keeping Compatibility when Evolving a Schema

For JSON

Forward compatibility means that older versions of a system should be able to understand messages from newer versions. Examples of changes that can be made to a system in a way that maintains forward compatibility:

Backward compatibility means that newer versions of a system should be able to understand messages from older versions. Examples of changes that can be made to a system in a way that maintains backward compatibility:

Full compatibility means that a change can be made to a system without breaking compatibility with either older or newer versions. Examples of changes that can be made to a system in a way that maintains full compatibility include:

Incompatibility means that a change to a system will break compatibility with either older or newer versions. Examples of changes that can break compatibility include:

1.5.3.5 Stubs and Skeletons

Stubs and skeletons are code that is used to facilitate communication between different systems, typically in the context of remote procedure calls (RPCs). Stubs are used by clients to initiate a remote call, while skeletons are used by servers to receive and process the remote call.

There are several ways to generate stubs and skeletons, including:

1.5.3.6 Finding an RPC Server

In a remote procedure call (RPC) system, a client needs to be able to locate the server in order to initiate a remote call.

Service:

  1. Start listening at port X: The server starts listening for incoming requests on a specific port.
  2. Tell portmapper about program, version, and port: The server registers itself with the portmapper, which is a service that keeps track of the programs and ports that are available on the system. The portmapper is typically a separate daemon that runs on the system.

Client:

  1. Ask portmapper for program, version: The client sends a request to the portmapper asking for the port number of the desired program and version.
  2. The portmapper responds with the port number where the desired program and version can be found.
  3. Send procedure call to service: The client sends the procedure call to the server on the specified port.

This process is known as “binding,” and there are several ways to handle it, including using inetd, DCE, or a Unix portmapper.

1.5.3.7 Factors to Consider when Choosing an IDL

1.6 Distributed Objects

1.6.1 Overview

  1. Fundamental Properties of Objects
  2. Local and remote object references
  3. Parameter passing
  4. Object invocation types
  5. Distributed Object Services
  6. Object Request Broker Architectures
  7. Interface Design
  8. Java RMI

1.6.2 Basic Principles

1.6.2.1 Objects vs. Abstract Data Types

There are some differences between the two:

1.6.2.2 Properties of Objects

Objects in an object-oriented (OO) programming language have many properties that can make them challenging to implement in a concurrent or distributed system.

1.6.2.3 Challenges for Remote Objects

Remote objects (ROs) are objects that are accessed and managed remotely, typically over a network. They are a key concept in distributed object systems, which are systems that enable objects on different machines to communicate and interact with each other.

1.6.2.4 What is a Remote Object?

1.6.3 Components of Distributed Objects

1.6.3.1 Object Models and Type Systems

CORBA:

CORBA is designed to provide language independence, so it defines its own types and does not allow user-defined classes to be used in the interfaces of remote objects.

Java RMI:

Java RMI allows user-defined classes to be used as value objects if they are serializable, but is specific to the Java programming language.

1.6.3.2 Accessing Remote Objects

1.6.3.3 The Broker Pattern

1.6.3.4 Remote Object Reference

1.6.3.5 Static vs Dynamic Invocation

Static Remote Method Invocation

Dynamic Invocation:

1.6.3.6 Asynchronous Invocations

1.6.4 Implementing Remote Objects

1.6.4.1 Main Distributed Object Services

1.6.4.2 Portable Interceptors

Interceptors can be used to transparently add additional (context) information to calls and transport it between object request brokers (ORBs).

1.6.4.3 Remote Interface Design

1.6.4.4 Problems with Remote Objects

1.6.5 Java RMI

1.6.5.1 Java RMI Request/Reply Protocols

JRMP (Java Remote Method Protocol) is the first protocol for RMI. It has the following characteristics:

RMI-IIOP (RMI over CORBA’s Internet Inter-Orb Protocol) has the following characteristics:

1.6.5.2 Java RMI Classes & Tools

1.6.5.3 Activation in Java RMI

Activation is an important feature in Java RMI (Remote Method Invocation) because it allows servers to transparently store servant state on persistent storage and recreate servants on demand. This helps the server control its resources against memory exhaustion and performance degradation.

1.6.5.4 Security in Java RMI

1.7 Distributed Business Components

1.7.1 Overview

1.7.2 Basic Principles

1.7.2.1 Problems With Object Interfaces

1.7.2.2 Component Based Processing

Enterprise components build on this:

1.7.2.3 From Objects to Components

  1. Object-oriented design involves isolated, monolithic applications that are not distributed.
  2. Distributed objects involve calls between applications, but can have management and performance issues with large numbers of small remote objects.
  3. Distributed systems involve multi-tier systems with point-to-point connectivity, but can be expensive and difficult to develop.
  4. Distributed components use a framework for pluggable business components and create a market for interoperable components. This approach addresses modeling, development, and deployment and aims to achieve re-use and lower development costs.

Components go beyond distributed systems and were designed to address some of the same issues as object-oriented development.

1.7.2.4 Business Components

1.7.2.5 Alternatives to Isomorphic Mapping

1.7.2.6 Generative Computing

1.7.2.7 Monolithic Software vs. Components

1.7.2.8 Objects vs. Components

1.7.2.9 Separation of Concerns and Context

1.7.3 Enterprise Java Beans

1.7.3.1 Overview

1.7.3.2 EJB Component Model

The EJB component model includes four types of EJBs: stateless session beans, stateful session beans, entity beans, and (stateless) message-driven beans.

These different EJB types allow for scalability, client code on the server side, asynchronous processing, and the representation of business data. Entity beans are persistent, while stateful session beans do not survive a server crash.

1.7.3.3 Session Beans (Stateful and Stateless)

1.7.3.4 Entity Beans (Deprecated Since 3.0)

1.7.3.5 Message-Driven Beans

1.7.3.6 Client View of EJBs

1.7.3.7 Local vs. Remote Interfaces

1.7.3.8 Separation of Concerns and Context in EJB

Separation of concerns is done through the EJB framework, separation of context is done through deployment:

1.7.3.9 The EJB Container

  1. A client invokes an entity bean interface.
  2. The container delegates the request to the entity bean business logic.
  3. The entity bean business logic delegates tasks such as transactions, persistence, and security to resources accessed through JNDI and a database.

At the point of interception, the container provides resource management, lifecycle, state management, transactions, and security services to the bean.

1.7.3.10 Containers and Threads

1.7.3.11 Entity Bean Container Contract

The Entity Bean-Container Contract defines a set of methods that the container can call on an entity bean in order to manage its lifecycle and access resources.

These methods include:

The actions that can be performed in these framework methods depend on the availability of a transactional context, object identity, local or remote view, and client security context.

1.7.3.12 Bean Managed vs. Container Managed Persistence

There are two main approaches to persistence in EJBs: bean-managed and container-managed.

Container-managed persistence is generally seen as the preferred approach going forward, as it is more portable and does not require adjustments to different datastores. Bean-managed persistence, on the other hand, is less portable and requires more effort to adapt to different datastores.

1.7.3.13 JNDI Naming Context

1.7.3.14 Deployment Descriptor

1.7.3.15 Security Support

Principal delegation:

“Run as” identity:

1.7.3.16 Transaction Modes

EJBs support several transaction modes, which allow developers to specify the level of transaction support required by the EJB.

The available transaction modes are:

1.7.3.17 Best Practices for EJBs

1.7.3.18 Shortcomings of EJBs

1.7.3.19 New Ways for Object-Relational Mapping

1.7.3.20 Lessons Learned from EJBs

1.8 Services

1.8.1 Overview

1.8.2 Timeline of Distributed Service Architectures

1.8.3 CORBA

1.8.3.1 CORBA Security Model

1.8.3.2 CORBA Core Properties

1.8.4 Web Services

1.8.4.1 Overview

1.8.4.2 Web Services Core Properties

1.8.4.3 UDDI Functionality

UDDI (Universal Description, Discovery, and Integration) is a registry that provides a “find and publish” API for distributed services. It works like this:

  1. Providers publish their services in a registry.
  2. Requesters search for the desired service in the UDDI (Universal Description, Discovery, and Integration) registry.
  3. The UDDI registry retrieves the provider location and WSDL (Web Service Description Language) service description for the requester.
  4. The requester creates a request based on the WSDL description.
  5. The requester sends the request to the provider using a specified transport protocol (such as SOAP over HTTP).

This type of architecture is called “service-oriented” because it uses a broker for service advertisement and lookup, and requester and provider bind dynamically with respect to the transport protocol used.

1.8.4.4 UDDI Content and Categories

The UDDI registry has three main categories of information:

1.8.4.5 WSDL Overview

1.8.4.6 WSDL Elements

It includes the following elements:

1.8.4.7 SOAP

There are several aspects that define its performance:

It has been found that internet transport time, especially in the absence of Quality of Service (QoS) measures, has a greater impact on overall request time than the size and interpretation effort of a textual format.

1.8.4.8 Security and Web Services

1.8.4.9 Reliable Messaging

Reliable B2B (Business-to-Business) messages require the following qualities:

SOAP and HTTP partially achieve this like so:

  1. The first application layer exchanges persistent messages with the requester.
  2. The requester sends a SOAP message with a message ID, sequence number, and QoS (Quality of Service) tag to the receiver.
  3. The receiver must send an acknowledgement.
  4. The receiver exchanges persistent messages with the second application layer.

1.8.4.10 Secure Messages

1.8.4.11 SAML

1.8.4.12 Transaction Models

Atomic transactions:

Activity transactions:

1.8.4.13 Stateful Web Services

1.8.4.14 Scaling Web Services

1.8.4.15 Why UDDI Failed

UDDI relied on the following assumptions:

These assumptions were problematic because:

To summarize, UDDI lacks technology to address the following issues:

1.8.4.16 Lessons Learned from Web Services and CORBA

1.8.5 SOA

1.8.5.1 SOA Core Properties

1.8.5.2 SOA Interface Design

1.8.5.3 SOA Blueprint Service Types

1.8.5.4 SOA vs. Microservices

SOA:

Microservices:

1.8.6 REST

1.8.6.1 RPC vs. REST

1.8.6.2 The Web’s Architecture

1.8.6.3 REST Maturity Model

1.8.6.4 REST Resource Archetypes

1.8.6.5 CRUD with REST

In RESTful web services, requests are made by a requestor to a representation of a resource. The HTTP methods (GET, POST, PUT, DELETE) are used to perform different actions on the resource:

  1. GET: Read a representation of the resource (safe and idempotent)
  2. POST: Create a new resource (not idempotent)
  3. PUT: Update or replace a resource (idempotent)
  4. DELETE: Remove a resource (idempotent)

The separation of updates and reads is a principle of good software design that has been around for a long time. It is known as the “command-query separation principle” and was made a requirement in the Eiffel programming language.

1.8.6.6 RESTful Web Features

RESTful web services have four key characteristics:

1.8.6.7 Critical Points with REST

1.8.6.8 GraphQL as a REST Alternative

1.8.7 Microservices

1.8.7.1 The Reasons for Microservice Adoption

1.8.7.2 Scalability Problems of Monolithic Applications

1.8.7.3 Scalability Benefits of Microservices

1.8.7.4 The Microservice Ecosystem

1.8.7.5 Microservice Design Patterns

1.8.8 Critical Points with Microservices

1.8.9 Serverless

1.8.9.1 Serverless Definition

1.8.9.2 Stateless Applications

1.8.9.3 Issues with Serverless

1.8.9.4 Serverless Design Patterns

1.9 Theoretical Foundations of Distributed Systems

1.9.1 Overview

1.9.2 Foundational Concepts

1.9.2.1 The Eight Fallacies of Distributed Computing

1.9.2.2 Analyzing Latency

1.9.2.3 Characteristics of Distributed Systems

1.9.3 Consistency

1.9.3.1 Liveness vs. Correctness

Correctness and liveness are two important properties of distributed systems that ensure they function as intended and make progress.

Both correctness and liveness are based on assumptions about failures and other conditions in the system, such as fairness and the presence of Byzantine errors. Ensuring that a distributed system has both correctness and liveness is critical for its success.

1.9.3.2 Liveness and Correctness in Practice

Here is an example of how correctness and liveness can be defined for an event-based system:

Correctness:

Liveness:

Failure Assumptions: Fail-stop model with fairness

In this example, the system is designed to ensure correctness by limiting the notifications a node receives to those it is subscribed to and ensuring that notifications are received only once. It is designed to ensure liveness by ensuring that a node will eventually start receiving notifications after subscribing. The system also makes assumptions about failures, such as the fail-stop model with fairness, which are used to ensure the correctness and liveness of the system.

1.9.3.3 Timing Models

In distributed systems, timing models refer to the way that events and communication between nodes are synchronized. There are three main types of timing models: synchronous, asynchronous, and partial synchronous.

1.9.3.4 The Fischer, Lynch and Paterson Result

1.9.4 The CAP Theorem

1.9.4.1 Overview

States that in the presence of network partitions, a system must choose either consistency or availability, but not both.

1.9.4.2 Preconditions for the CAP Theorem

1.9.4.3 Common Misconceptions of the CAP Theorem

1.9.4.4 The Modern View of the CAP Theorem

1.9.4.5 PACELC

PACELC: A more complete portrayal of the space of potential consistency tradeoffs for distributed database systems.

1.9.5 Failure

1.9.5.1 Technical Failures

Unfortunately, in most cases there is no failure detection service that can identify when these failures occur and allow others to take appropriate action. However, it is possible to develop a failure detection service that could detect even partitioning and other types of failures through the use of MIBs (Management Information Bases) and triangulation. Applications could also be designed to track themselves and restart if necessary.

1.9.5.2 Failure Types

1.9.5.3 Failure Models

Many protocols for achieving consistency and availability make assumptions about failure models. For example, transaction protocols may assume recovery behavior by their participants if the protocol terminates.

1.9.5.4 Failures and Timeouts

1.9.5.5 Failure Detectors

A failure detector (FD) is a mechanism used in distributed systems to detect failures of processes or machines. It is not required to be correct all the time, but it should provide the following quality of service (QoS):
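
For illustration, a simple heartbeat-based failure detector in Python (a sketch assuming periodic heartbeats and a fixed timeout; real detectors adapt the timeout and quantify their QoS in terms of detection time and mistake rate):

```python
import time

class HeartbeatFailureDetector:
    """Suspects a process if no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout
        self.last_seen = {}  # process ID -> time of last heartbeat

    def heartbeat(self, pid):
        # Called whenever a heartbeat message from `pid` arrives
        self.last_seen[pid] = time.monotonic()

    def is_suspected(self, pid):
        # Suspicion can be wrong (e.g. a slow network), which is why
        # failure detectors are allowed to make mistakes
        last = self.last_seen.get(pid)
        return last is None or time.monotonic() - last > self.timeout
```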

1.9.6 Clocks

1.9.6.1 Time in Distributed Systems

In distributed systems, there is no global time that is shared across all processes. Instead, different approaches are used to model time in these systems. These approaches include:

Logical time is modeled as partially ordered events within a process or between processes. It is used to represent the relative order of events in a distributed system, rather than the actual clock time at which they occurred.

Causal meta-data in the system can also order the events properly.

1.9.6.2 Consistent vs. Inconsistent Cuts

1.9.6.3 Event Clocks (Logical Clocks)

1.9.6.4 Lamport Logical Clock

1.9.6.5 Vector Clocks

1.9.6.6 Physical Interval Time (TrueTime)

1.9.6.7 Hybrid Clocks

Hybrid clocks are systems that combine elements of both logical and physical models of time in order to address the limitations of each approach. There are several reasons why hybrid clocks may be used in distributed systems:

1.9.6.8 Ordering in Distributed Event Systems

1.9.7 Consensus

1.9.7.1 Overview

Consensus is a process used by a group of processes to reach agreement on a specific value, based on their individual inputs. The objective of consensus is for all processes to decide on a value v that is present in the input set.

1.9.7.2 Algorithms for Consensus

These protocols offer trade-offs in terms of correctness, liveness (availability and progress), and performance:

1.9.7.3 Two-Phase Commit (2PC)

Steps:

  1. Preparation phase: In this phase, the processes prepare to commit to a decision. The coordinator process, which is responsible for coordinating the decision-making process, sends a “prepare to commit” message to each participant, and each participant replies once it is prepared to commit.
  2. Decision phase: Once all the processes have prepared to commit, the coordinator process sends a “commit” message to all the processes. This message instructs the processes to commit to the decision.
  3. Confirmation phase: Each process sends a “commit confirmation” message to the coordinator process, indicating that it has successfully committed to the decision.
  4. Finalization phase: Once all the processes have confirmed that they have committed to the decision, the coordinator process sends a “commit complete” message to all the processes, indicating that the decision has been successfully made.

Example:

  1. Imagine that there are three processes in a distributed system: A, B, and C.
  2. The coordinator process is A.
  3. The processes are deciding whether to commit to a new software update.
  4. In the preparation phase, A sends a “prepare to commit” message to B and C.
  5. B and C send “prepare to commit” messages back to A.
  6. In the decision phase, A sends a “commit” message to B and C.
  7. In the confirmation phase, B and C send “commit confirmation” messages back to A.
  8. In the finalization phase, A sends a “commit complete” message to B and C, indicating that the decision to commit to the software update has been successfully made.
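
A toy coordinator for this flow (a sketch of a fail-free run; a real implementation must persist its decision and handle timeouts and participant crashes). The `prepare`/`commit`/`abort` participant methods are assumptions for illustration:

```python
def two_phase_commit(participants):
    # Preparation phase: ask every participant to prepare
    if all(p.prepare() for p in participants):
        # Decision phase: everyone is prepared, so order the commit;
        # confirmation and finalization are acknowledgements in this sketch
        for p in participants:
            p.commit()
        return "committed"
    # Any "no" vote (or failure to prepare) aborts the transaction
    for p in participants:
        p.abort()
    return "aborted"
```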

Liveness and Correctness:

1.9.7.4 Quorum-Based Consensus

Steps:

  1. A decision is proposed by one of the processes in the distributed system.
  2. Each process in the system votes on the proposed decision.
  3. The votes are counted and checked against the quorum requirement. The quorum requirement is the minimum number of votes that must be received in favor of the decision in order for it to be approved.
  4. If the quorum requirement has been met, the decision is considered to have been approved.
  5. The processes move forward with implementing the decision.

Example:

  1. A group of five processes in a distributed system (A, B, C, D, and E) are deciding whether to commit to a new software update.
  2. Process A proposes the decision to update the software.
  3. Processes B, C, D, and E all vote on the proposed decision.
  4. The votes are counted and checked against the quorum requirement, which is set at three votes.
  5. Since a quorum of three votes has been received in favor of the decision, it is considered to have been approved.
  6. The processes move forward with implementing the software update.
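
The counting step of this example as a quick sketch (names are mine):

```python
def quorum_reached(votes, quorum):
    """votes: process -> True/False; quorum: minimum number of yes-votes."""
    return sum(1 for v in votes.values() if v) >= quorum

votes = {"A": True, "B": True, "C": True, "D": False, "E": False}
print(quorum_reached(votes, quorum=3))  # True: the update is approved
```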

Liveness and Correctness:

1.9.7.5 Paxos

Steps:

  1. Prepare (Phase 1a):
  2. Promise (Phase 1b):
  3. Accept! (Phase 2a):
  4. Accepted (Phase 2b):

Example:

  1. Prepare (Phase 1a):
  2. Promise (Phase 1b):
  3. Accept! (Phase 2a):
  4. Accepted (Phase 2b):

1.9.7.6 Raft

Raft is a distributed consensus protocol that allows a group of processes (called “replicas”) to agree on a value (“decide”) in the presence of failures. Raft divides the participants into three distinct roles: leader, follower, and candidate.

The protocol consists of the following steps:

  1. Leader Election:
  2. Log Replication:
  3. State Machine Update:

1.9.8 Broadcasting

1.9.8.1 Atomic Broadcast Conditions

A distributed algorithm that guarantees correct transmission of a message from a primary process to all other processes in a network or broadcast domain, including the primary.

It satisfies the following conditions:

It is widely used in distributed computing for group communication and defined as a reliable broadcast that satisfies total order.

1.9.8.2 Atomic Broadcast Protocol

Data:

Phases:

  1. Leader election/discovery: Members decide on a new leader and form a consistent view of the group.
  2. Synchronization/recovery: Leader gathers outstanding, uncommitted requests recorded at members and updates members missing certain data until all share the same state.
  3. Working: Leader proposes new transactions to the group, collects confirmations, and sends out commits.

1.9.8.3 Gossip Protocols

Gossip protocols are a class of distributed algorithms that rely on randomly chosen pairs of nodes in a network to exchange information about the state of the system. They are typically used for group membership, failure detection, and dissemination of information.

There are several key characteristics of gossip protocols:
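
A toy push-gossip round in Python (a sketch, not from the lecture): every node pushes what it knows to one randomly chosen peer, so an update spreads to almost all nodes in O(log n) rounds with high probability.

```python
import random

def gossip_round(states):
    """states: node ID -> set of known updates; performs one push round."""
    for node in list(states):
        peer = random.choice([n for n in states if n != node])
        states[peer] |= states[node]  # push everything this node knows

states = {n: set() for n in range(8)}
states[0].add("x=5")                  # a single node learns an update
for _ in range(4):                    # a few rounds usually suffice
    gossip_round(states)
print(sum("x=5" in s for s in states.values()), "of", len(states), "nodes")
```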

1.9.8.4 DWAL

A DWAL (Distributed Write-Ahead-Log) is a data structure that is used to ensure that updates to a distributed system are stored in a way that allows them to be recovered in case of system failure. It is a type of write-ahead log, which means that updates are written to the log before they are applied to the system’s state. This allows the updates to be replayed in the correct order after a system failure.
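
The core write-ahead idea in a few lines of Python (a local sketch; a distributed WAL additionally replicates log entries across nodes and truncates the log at checkpoints):

```python
import json

def apply_update(state, update):
    state[update["key"]] = update["value"]

def write(state, log_file, key, value):
    update = {"key": key, "value": value}
    log_file.write(json.dumps(update) + "\n")  # 1. append the update to the log
    log_file.flush()                           #    (a real WAL also fsyncs here)
    apply_update(state, update)                # 2. only then change the state

def recover(log_path):
    # After a crash, replay the log in order to rebuild the state
    state = {}
    with open(log_path) as f:
        for line in f:
            apply_update(state, json.loads(line))
    return state
```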

1.9.8.5 Design Components of DWALs

1.9.9 Replication

1.9.9.1 Properties of Replication Models

1.9.9.2 Single-Leader Replication

Steps:

  1. Client sends x=5 to Node 1 (master)
  2. Master updates Node 2 and Node 3 (followers)
  3. Client receives the changed value (or the old value, due to replication lag)

Advantages:

Disadvantages:

1.9.9.3 Eventually Consistent Reads

Eventual consistency model: Allows for a certain level of lag between updates to be propagated to all replicas

Steps:

  1. Client updates a value on the master replica node
  2. The master replica eventually propagates the update to the slave replica
  3. Client performs a stale read from the slave node, potentially returning an outdated value

1.9.9.4 Multi-Master Replication

Multi-Master Replication (MMR) is a type of replication in which multiple servers can accept write requests, allowing any server to act as a master. This means that updates can be made to any server, and the changes will be replicated to all other servers in the network. MMR can be used to improve the availability and scalability of a system, as it allows updates to be made to any server and allows multiple servers to handle write requests.

It also introduces the possibility of conflicts, as multiple servers may receive updates to the same data simultaneously. To resolve these conflicts, MMR systems typically use conflict resolution algorithms (last writer wins, keeping different versions, anti-entropy background merge/resolve) or allow the user to manually resolve conflicts.

Steps:

  1. Client 1 writes x=5 to Node 1 (master)
  2. Client 2 writes x=10 to Node 2 (master)
  3. The masters detect a conflict

Conflict types:

1.9.9.5 Leaderless Quorum Replication

Write:

Read:

1.9.9.6 Session Modes of Asynchronous Replication

The following guarantees seem to enable “sequential consistency” for a specific client, meaning that the program order of updates from this client is respected by the system. Clients can track these guarantees using vector clocks:

We can also derive session anomalies from this:

1.9.9.7 Global Modes of Replication

There are multiple different modes to choose from:

Each of them have their own trade-offs:

1.10 Distributed Services and Algorithms I

1.10.1 Overview

  1. Distributed Services
    1. Replication, Availability and Fault-Tolerance
    2. Global Server Load Balancing for PoPs
  2. Typical Cluster Services
    1. Fail-over and Load-Balancing
    2. Directory Services
    3. Cluster Scheduler (new)
  3. Distributed Operating Systems
  4. Example Services and Algorithms
    1. Distributed File System with replication and map/reduce
    2. Distributed (streaming) Log
    3. Distributed Cache with consistent hashing
    4. Distributed DB with sharding/partitioning functions
    5. Distributed Messaging with event notification and gossip

1.10.2 Types of Distributed Services

1.10.2.1 What is a Distributed Service?

Function provided by a distributed middleware with:

1.10.2.2 Services and Instances

1.10.2.3 Core Distributed Services

1.10.3 Availability

Is defined as:

Availability = \frac{Uptime_{agreed\ upon} - Downtime_{planned\ and\ unplanned}}{Uptime_{agreed\ upon}}

Continuous availability does not allow for planned downtime.

1.10.3.1 Typical Hardware Causes for Downtime

1.10.3.2 Availability through Redundancy

Across groups of resources:

Within a group of resources:

Between two resources:

For an individual resource:

1.10.3.3 3 Copy Disaster Recover Solution

1.10.4 Load Balancing

1.10.4.1 Serial vs. Redundant Availability

1.10.4.2 Global Server Load Balancing

1.10.4.3 Failover with Virtual IPs

Failover with one virtual IP:

Multi-site failover:

1.10.4.4 Failover, Load Balancing and Session State

Today: Stateless servers with state in a DB are the norm, but sticky sessions are still useful because session records would otherwise need to be replicated across servers.

Compromise: Replicate sessions between pairs of servers, then enable switching between them as failovers

1.10.4.5 P2P Load Balancing

1.10.5 Service Organization

1.10.5.1 Requirements of Distributed Name/Directory Systems

Functional Requirements:

Non-Functional Requirements:

Definition: Non-functional requirements are properties a system must exhibit, such as speed, reliability, availability, and security. Many systems fulfill their functional requirements but fail to meet the non-functional ones.

1.10.5.2 Design of Distributed Name/Directory Systems

Name Space:

Naming Service:

1.10.5.3 Examples of Naming Services

1.10.5.4 Google vs. Amazon

Amazon:

Services

Google:

Products

1.10.5.5 Best Practices for Designing Services

1.10.6 Caching

1.10.6.1 CQRS (Command Query Responsibility Segregation)

1.10.6.2 Requirements of Caching Services

1.10.6.3 Handling Changing Machine Counts in Caching Services

1.10.6.4 Consistent Hashing Algorithms

Simple Consistent Hashing Algorithm:

Dynamo Consistent Hashing Algorithm:
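A minimal sketch of a consistent hashing ring with virtual nodes, in the style of Dynamo (class and parameter names assumed):

```python
import bisect, hashlib

class ConsistentHashRing:
    """Hash ring with virtual nodes; each physical node owns many points."""

    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key):
        # A key is owned by the first virtual node clockwise from its hash.
        i = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
print(ring.lookup("user:42"))
# Adding or removing a node only remaps the keys adjacent to its virtual
# nodes, instead of rehashing everything as a plain "hash mod N" would.
```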

1.10.6.5 Cache Patterns

Pull:

Push:

Pre-warmed:

General consideration: Be aware of LRU or clocked invalidations, as the cache is mission-critical.
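A minimal sketch of the pull (cache-aside) pattern with a clocked (TTL-based) invalidation; names and the TTL value are assumptions:

```python
import time

cache = {}   # key -> (expiry_timestamp, value)
TTL = 60     # seconds until a cached entry is considered stale

def fetch_from_database(key):
    return f"row for {key}"  # stand-in for the real, slow data source

def get(key):
    """Pull (cache-aside): fill the cache lazily on a miss."""
    entry = cache.get(key)
    if entry and entry[0] > time.time():
        return entry[1]                       # cache hit
    value = fetch_from_database(key)          # cache miss: pull from source
    cache[key] = (time.time() + TTL, value)   # clocked invalidation via TTL
    return value

print(get("user:1"))  # miss -> hits the database
print(get("user:1"))  # hit  -> served from the cache
```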

1.10.6.6 Cache Design Considerations

1.10.7 Events

1.10.7.1 Local vs. Distributed Events

Local:

Distributed:

1.10.7.2 Asynchronous Event-Processing

1.10.7.3 Features of Event-Driven Interaction

1.10.7.4 Types of Message Oriented Middleware (MOMs)

Centralized Message-Oriented-Middleware:

Clustered Message-Oriented-Middleware:

Simple P2P Event Libraries:

Flooding Protocols:

1.10.7.5 ZeroMQ

1.10.8 Sharding

1.10.8.1 Horizontal vs. Vertical Sharding

Horizontal: Per (database) row, e.g. users 1-100 in shard 1, users 101-200 in shard 2, etc.

Vertical: Per (database) column, e.g. profile and email in shard 1, photos and messages in shard 2, etc.

1.10.8.2 Sharding Strategies

1.10.8.3 Horizontal Sharding Functions

Algorithms applied to the key (often: user ID) to create different groups (shards):
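Two common such functions, range-based and hash-based, in a minimal sketch (shard count and range size are assumptions):

```python
import hashlib

NUM_SHARDS = 4  # assumed fixed shard count for hash-based sharding

def range_shard(user_id, size=100):
    # Range-based: users 1-100 -> shard 0, users 101-200 -> shard 1, ...
    # New shards are simply added as the user population grows.
    return (user_id - 1) // size

def hash_shard(user_id):
    # Hash-based: spreads keys evenly across a fixed number of shards, but
    # changing NUM_SHARDS remaps almost every key (cf. consistent hashing).
    digest = hashlib.sha1(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(range_shard(42), hash_shard(42))  # 0 and some shard in 0..3
```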

1.10.8.4 Consequences of Sharding

1.10.9 Relationships

1.10.9.1 Overview

For example when an employee leaves:

1.10.9.2 Functional Requirements of Relationship Services

1.10.9.3 Why Relationship Services Failed

The good:

The bad:

1.11 Distributed Services and Algorithms II

1.11.1 Overview

1.11.2 Problems with Classic Concurrency

1.11.2.1 Why Truth is Expensive

1.11.2.2 Aspects of Classic Distributed Consistency

1.11.2.3 Persistent Object Representations

1.11.2.4 Mechanisms for Persistence

SQL Driver:

Object Relational Mapper (EJB/Hibernate):

Just storing an object is simple; doing it in a way that protects against concurrent access and system failures, and that works across different data stores, is much harder.

1.11.2.5 Persistent Object Mapping

1.11.2.6 Data Store Session Pooling

1.11.3 Locking

1.11.3.1 Locking Against Concurrent Access

Binary locks:

Modal locks (read/write locks):

Lock Granularity: The granularity of a lock (the scope of the resources it protects) affects the overall throughput of a system: the finer the granularity, the more concurrency is possible and the better the performance.

1.11.3.2 Optimistic Locking

Process:

  1. Lock a row, read it along with its timestamp, and then release the lock.
  2. Start a transaction.
  3. Write the data to the database.
  4. Acquire locks for all data read and compare the data timestamps.
  5. If one of them is newer, the transaction is rolled back; otherwise it is committed.
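A minimal sketch of this idea with a version counter standing in for the timestamp (schema assumed):

```python
class Row:
    def __init__(self, value):
        self.value, self.version = value, 0

def optimistic_update(row, transform):
    """Read without holding a lock, validate the version at commit time."""
    snapshot_value, snapshot_version = row.value, row.version  # read + version
    new_value = transform(snapshot_value)                      # work optimistically
    # Validation: commit only if nobody wrote in between
    # (a real system would hold a short lock around this check).
    if row.version != snapshot_version:
        return False                  # conflict -> caller retries (rollback)
    row.value, row.version = new_value, row.version + 1
    return True                       # committed

account = Row(100)
print(optimistic_update(account, lambda v: v - 30), account.value)  # True 70
```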

Advantages:

1.11.3.3 Serializability with Two-Phase Locking

Process:

  1. Allocate all locks
  2. Manipulate the data
  3. Release all locks

Advantages: Because all locks are allocated before any data manipulation and released only after the manipulation is complete, serializability is guaranteed.
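A minimal sketch of the two phases, using Python locks and a fixed lock order (which additionally avoids deadlocks):

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()

def transfer(balances, src, dst):
    # Growing phase: acquire ALL locks before touching any data.
    # A fixed global lock order (a before b) also prevents deadlocks.
    with lock_a, lock_b:
        balances[src] -= 10          # manipulate the data ...
        balances[dst] += 10          # ... while holding every lock
    # Shrinking phase: the with-block releases all locks only at the end.

balances = {"a": 100, "b": 0}
transfer(balances, "a", "b")
print(balances)  # {'a': 90, 'b': 10}
```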

1.11.3.4 Deadlocks

1.11.3.5 Distributed Deadlock Detection

Local wait-for-graphs:

Detection servers:

Distributed edge chasing algorithms:

Stochastic detection:

1.11.4 Transactions

1.11.4.1 Classic ACID Definitions

1.11.4.2 Transaction Properties and Mechanisms

1.11.4.3 Serializability and Isolation

Definition: States that the outcome of executing a set of transactions should be equivalent to some serial execution of those transactions.

Purpose: The purpose of serializability is to ensure that each transaction operates on the database as if it were running by itself, which maintains the consistency and correctness of the database.

Importance: Without serializability, ACID consistency is generally not guaranteed, making it a crucial component in maintaining the integrity of the database.

1.11.4.4 Transaction API

  1. The system starts in a consistent state
  2. Begin a transaction
  3. Modify objects

Commit transaction:

On error: Rollback:

1.11.4.5 Components of Distributed Transactions

Process

RPCs:

Components:

1.11.4.6 Service Context

1.11.4.7 Distributed Two-Phase Commit

Vote:

Completion:
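A minimal sketch of both phases from the coordinator’s point of view (the participant interface is an assumption):

```python
class Participant:
    """Stand-in for a resource manager; prepare() is its vote."""
    def __init__(self, ok=True):
        self.ok = ok
    def prepare(self):
        return self.ok        # "yes" vote = data is durably prepared
    def commit(self):
        print("committed")
    def abort(self):
        print("aborted")

def two_phase_commit(participants):
    # Voting phase: ask every participant to prepare; any "no" aborts.
    votes = [p.prepare() for p in participants]
    # Completion phase: the coordinator's decision is final and must be
    # logged durably before it is announced, so it survives a crash.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"

print(two_phase_commit([Participant(), Participant(ok=False)]))  # abort
```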

1.11.4.8 Failure Models of Distributed Transactions

Work Phase:

Voting Phase:

Commit Phase (Server Uncertainty):

1.11.4.9 Special Problems of Distributed Transactions

Resources:

Coordinator as a Single Point of Failure:

Heuristic Outcomes for Transactions:

1.11.4.10 Transaction Types

Flat Transactions:

Nested Transactions:

Long-running Transactions:

Compensating Transactions:

1.11.4.11 ANSI Transaction Levels

Problems:

  1. Dirty read: a transaction reads data another transaction has written but not yet committed
  2. Non-repeatable read: re-reading the same row within one transaction yields a different value
  3. Phantom read: re-running the same query within one transaction yields additional rows

Transaction Levels (weakest to strongest): Read Uncommitted, Read Committed, Repeatable Read, Serializable.

The higher the level, the more overhead is required.

1.11.5 Distributed Filesystems

1.11.5.1 Filesystem Block Order Guarantees

1.11.5.2 Eventually Consistent Storage Systems

1.11.5.3 Grid Storage vs. NAS/SAN

Grid Storage:

NAS/SAN:

To summarize:

1.11.6 NoSQL

1.11.6.1 Forces behind NoSQL

1.11.6.2 Scaling Problems of RDBMs

1.11.6.3 NoSQL Design Patterns

1.11.6.4 DynamoDB Design Principles

1.11.7 Beyond Relaxed Consistency

1.11.7.1 Overview

1.11.7.2 CALM Principle

1.11.7.3 CALM Operations

Logically Monotonic:

Non-monotonic:
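A small illustration of the distinction (example invented): set union is logically monotonic, while an “is everything there yet?” test is not:

```python
# Monotonic: set union -- more input never retracts earlier output, so
# replicas can apply updates in any order without coordination.
seen_a = {"x"} | {"y"}
seen_b = {"y"} | {"x"}
assert seen_a == seen_b            # order-independent result

# Non-monotonic: a negation/aggregate such as "have we seen everything?"
# can flip from True to False when a late update arrives, so it needs
# coordination (e.g. a barrier) before its answer is safe to act on.
def all_arrived(seen, expected):
    return expected <= seen        # subset test

print(all_arrived(seen_a, {"x", "y", "z"}))  # False until "z" arrives
```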

1.11.7.4 CRDTs

State-based CRDTs:

Operation-based CRDTs:

1.11.7.5 Bending the Problem

1.11.7.6 Examples of CRDTs

Counters:

Registers:

Sets:
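To make the counter and set flavors concrete, a minimal state-based G-Counter; a G-Set follows the same pattern with set union as the merge:

```python
class GCounter:
    """State-based grow-only counter: one slot per node, merge = pairwise max."""

    def __init__(self, node, n):
        self.node, self.slots = node, [0] * n

    def increment(self):
        self.slots[self.node] += 1   # each node only increments its own slot

    def value(self):
        return sum(self.slots)

    def merge(self, other):
        # Pairwise max is commutative, associative and idempotent,
        # so replicas converge regardless of merge order.
        self.slots = [max(a, b) for a, b in zip(self.slots, other.slots)]

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); b.increment(); b.increment()
a.merge(b); b.merge(a)        # merging in any order converges
print(a.value(), b.value())   # 3 3
```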

1.11.7.7 Distributed Coordination

Features:

Liveness and Correctness:

1.11.8 Distributed Coordination Services

1.11.8.1 Zookeeper API
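For a flavor of the API, a sketch using the third-party kazoo Python client; the host address, paths, and values here are assumptions:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Znodes form a filesystem-like tree. Ephemeral znodes vanish when the
# client session ends, which is the basis for locks, leader election
# and group membership.
zk.create("/app/leader", b"node-1", ephemeral=True, makepath=True)

# Watches fire once when the watched znode changes.
def on_change(event):
    print("leader changed:", event)

data, stat = zk.get("/app/leader", watch=on_change)
print(data, stat.version)
```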

1.11.8.2 Primary-Order Atomic Broadcast with Zab

1.11.8.3 Consistency Requirements for ABCast (Reliable Ordered Atomic Broadcast)

1.11.8.4 Primary Order

1.11.8.5 HA Transactions

1.12 Design of Distributed Systems

1.12.1 Overview

1.12.2 Design Principles for Distributed Systems

1.12.2.1 Overview

Know Your No. 1 Enemy: Latency!

1.12.2.2 Sharing Resources and Data

1.12.2.3 Connection Pooling

1.12.2.4 Horizontal Scaling/Parallelization

1.12.2.5 Caching and Replication

1.12.2.6 End-to-End Argument

1.12.2.7 Design Methodology

1.12.2.8 Uncomfortable Real-World Questions

1.12.3 Architecture Fields

1.12.3.1 Overview

1.12.3.2 Information Architecture (to analyze Caching)

Defines pieces of information to aggregate or integrate

| Data / changed by | Time | Personalization |
| --- | --- | --- |
| Country Codes | No (not often, reference data) | No |
| News | Yes (aging only) | No, but personal selections |
| Greeting | No | Yes |
| Message | Yes (slowly aging) | Yes |

1.12.3.3 Distribution Architecture

Tells the portal how to map/locate the fragments defined in the information architecture.

| Data Type | Source | Protocol | Port | Avg. Resp. | Worst Resp. | Downtimes | Max Conn. | Loadbal. | Security | Contact/SLA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| News | hostX | http/xml | 3000 | 100ms | 6 sec. | 17.00-17.20 | 100 | client | plain | Mrs.X/News-SLA |
| Research | hostY | RMI | 80 | 50ms | 500ms | 0.00-1.00 | 50 | server | SSL | Mr.Y/res-SLA |

Additional factors to consider:

Example results:

1.12.3.4 Service Access Layer

Is determined by the distribution architecture.

Simple Alternative: Sidecar, contains circuit breaker & service discovery

Advanced Alternative: Service mesh with separate data and control plane

1.12.3.5 Physical and Process Architecture

Physical Architecture:

Horizontally scalable application:

Vertically scalable application:

1.12.3.6 Architecture Validation

In the architecture validation phase these questions are answered: How does the architecture …

1.12.4 Improving an Existing Design

1.12.4.1 Problems with Naive Application Designs

The portal had no caching etc.

1.12.4.2 Improving the GUI Architecture

1.12.4.3 Improving the System Architecture

Use a system architecture diagram:

1.12.4.4 Lessons Learned from Naive Portal Designs

1.12.5 Fan-Out Architecture

1.12.5.1 Overview

Calls are parallel instead of serial.

1.12.5.2 Reliability Issues in Dependencies

1.12.5.3 Fragments

Pages: Unique to customers, cannot be re-used

Page fragments:

1.12.5.4 Latency Reduction & Tolerance

1.12.5.5 Avoiding Getting Stuck

1.12.6 Containing Failures

1.12.6.1 Circuit Breakers
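A minimal sketch of the pattern (thresholds assumed): after repeated failures the breaker opens and fails fast, then probes the dependency again after a cool-down:

```python
import time

class CircuitBreaker:
    """Stop calling a failing dependency; retry after a cool-down period."""

    def __init__(self, max_failures=3, reset_after=30):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, None

    def call(self, fn):
        if self.opened_at is not None:                    # circuit is open
            if time.time() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None                         # half-open: probe once
        try:
            result = fn()
            self.failures = 0                             # success closes it again
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.time()              # trip the breaker
            raise

breaker = CircuitBreaker()
# breaker.call(lambda: call_remote_service())  # wrap every dependency call
```

Failing fast protects callers from piling up threads on a dead dependency and gives the dependency time to recover.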

1.12.6.2 Bulkheads

1.12.6.3 Blast Reduction