Understanding Infrastructure Edge Computing. Alex Marcham
of data have a window of time in which they are most useful. If they cannot be processed and used to extract an actionable insight within that window, the value of that data decreases exponentially. Many real‐time applications rely on data of this type. In an industrial robotics control system, for example, instructing a robotic arm to move into position to catch a piece of falling material is of limited use if the command reaches the arm too late for it to act safely before the material falls.
Data velocity is the name given to this concept. If data for real‐time applications can be processed and used to extract insight within the shortest possible span of time since its creation or collection, that data and the resulting insight are able to provide their highest possible value to their end user. This processing must occur at a point of aggregation in terms of both network topology and compute resources, so that the analysis has the full context of relevant events and enough processing power to run at a rate acceptable to the application and its users.
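The decay in value described above can be sketched in a few lines of Python. This is only an illustrative model, not a formula from this book: the function name `data_value`, the constant value inside the useful window, and the half-life parameter are all assumptions chosen to make the exponential decrease concrete.

```python
def data_value(initial_value, age_ms, window_ms, half_life_ms):
    """Estimate the remaining value of a data item (illustrative model only).

    Assumptions not taken from the text: the value is constant inside the
    useful window and halves every `half_life_ms` once the window has passed.
    """
    if age_ms <= window_ms:
        return initial_value
    overdue_ms = age_ms - window_ms
    return initial_value * 0.5 ** (overdue_ms / half_life_ms)


# Hypothetical robot-arm command: useful for 10 ms, then a 5 ms half-life.
for age_ms in (5, 10, 15, 20, 40):
    print(f"age {age_ms:>2} ms -> value {data_value(100.0, age_ms, 10, 5):.2f}")
```

Under these assumed numbers, a command processed within its window keeps its full value, while one processed only a few half-lives late retains almost none, which is why processing close to the source matters for this class of data.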
2.5.4 Transport Cost
Particularly with emerging use cases such as distributed artificial intelligence (AI), the cost of transporting data from the device edge locations where it is generated to a data centre location where it can be processed in real time will present a growing challenge. This is not only a technical consideration, in that network operators must appropriately provision upstream bandwidth in the access and midhaul layers of the network; overprovisioning long‐haul network connectivity also places a significant operational expenditure (OPEX) and capital expenditure (CAPEX) burden on the network operator.
Infrastructure edge computing aims to address this challenge by moving the locations at which large amounts of data can undergo complex processing, for example, by distributed AI inferencing, to a set of locations which are positioned closer to the sources of this data than today’s centralised data centres. The shorter the distance over which the bulk of data must be transmitted, the lower the data transport cost can be for the network operator. This allows any use case reliant on moving such large volumes of data to be more economical and thus more practical to deploy and operate.
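As a rough illustration of this relationship, the sketch below compares the transport cost of sending the same volume of data to a nearby edge data centre and to a distant centralised one. The linear cost model, the function names, and every figure used are assumptions made for the example; real transport pricing depends on many other factors.

```python
def transport_cost(volume_gb, distance_km, cost_per_gb_km):
    """Cost of moving `volume_gb` over `distance_km` (assumed linear model)."""
    return volume_gb * distance_km * cost_per_gb_km


def compare_sites(volume_gb, site_distances_km, cost_per_gb_km):
    """Return {site name: transport cost} for each candidate processing site."""
    return {
        site: transport_cost(volume_gb, distance_km, cost_per_gb_km)
        for site, distance_km in site_distances_km.items()
    }


# All figures below are made up for illustration.
sites = {"edge data centre": 20, "centralised data centre": 1000}
costs = compare_sites(volume_gb=500, site_distances_km=sites, cost_per_gb_km=0.0001)
for site, cost in costs.items():
    print(f"{site}: {cost:.2f} cost units")
```

With these hypothetical inputs the cost scales directly with distance, so the closer site is cheaper by the same ratio as the distances themselves.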
2.5.5 Locality
The locality of a system describes both the physical and logical distances between key components of the system. In the context of infrastructure edge computing, the system we are most concerned with spans from a user located on the device edge to an application operating from an edge data centre at the infrastructure edge, a facility which itself is then connected to a regional data centre.
Locality is an important concept in system design. In many ways it is the summation of all of the previously described issues in this section; by addressing all of them, locality allows infrastructure edge computing to enable a new class of use case which generates large amounts of data and needs that data to be processed in a complex fashion in real time. This is the true driving factor behind the need for the infrastructure edge computing model: new use cases, in addition to useful augmentations of existing use cases, require the capabilities which it offers, and these use cases are valuable enough to make the design, deployment, and operation of infrastructure edge computing itself worthwhile.
2.6 Basic Edge Computing Operation
With an understanding of the basic terminology and history behind infrastructure edge computing, as well as the primary factors, beyond specific use cases, which are driving its design, deployment, and adoption, we can explore an example of how edge computing operates in practice. This example will describe how each of the primary factors is addressed by infrastructure edge computing, as well as how interoperation can occur between the device edge, infrastructure edge, and RNDCs to make a useful gradient of compute, storage, and network resources from end to end.
To begin, let’s explore the operation of an application which needs only device edge computing to function. In this scenario, all of the compute and storage resources required are provided by a local device, in this example, a smartphone. Any data that is required is generated locally and is not obtained from a remote location as the application operates, as it would be if the application were reliant on the cloud. The application is entirely self‐contained at the user’s device, and so operates as shown in Figure 2.3:
In this case, the application is limited by the capabilities of the device itself. All of the resources that the application requires, such as those needed to process data, display a complex 3D rendering to the user, or store data which results from the user’s actions, must be present on the local device and available to the application. If this requirement is not met, the application will either fail or its operation will be degraded, leaving the user with a suboptimal experience. Relying only on device resources requires devices to be powerful enough to provide everything required by any application which the user may wish to use. This is especially detrimental to mobile devices, which must be battery powered and so cannot support the dense compute and storage resources that may be needed.
The extent to which this is a drawback varies depending on the type of application and on the type of device in question. A lightweight application may operate exactly as intended on a device alone. An application whose requirements exceed the capabilities of the device, such as one performing high‐resolution real‐time computer vision for facial recognition on a battery‐powered mobile device, may either not operate at all or compromise the user experience, for example, by providing greatly reduced performance or poor battery life, to the extent that the application is unable to fulfil the needs of the user and so fails.
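The device‐only case can be sketched as a simple comparison between what a device offers and what an application needs. The `Resources` fields, the split into preferred and minimum requirements, and all of the figures below are hypothetical choices made for this example rather than anything defined in the text.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource profile for a device or an application."""
    compute_gflops: float
    storage_gb: float
    power_budget_w: float


def covers(available, needed):
    """True if every available resource is at least the needed amount."""
    return all(have >= need for have, need in zip(astuple(available), astuple(needed)))


def device_only_outcome(device, preferred, minimum):
    """Classify how an application behaves when only the device is used."""
    if covers(device, preferred):
        return "runs as intended"
    if covers(device, minimum):
        return "degraded"
    return "fails"


# Made-up figures for a smartphone and two applications.
phone = Resources(compute_gflops=50, storage_gb=64, power_budget_w=5)
notes_app = (Resources(1, 1, 0.5), Resources(0.5, 0.5, 0.2))
vision_app = (Resources(500, 10, 20), Resources(100, 5, 8))

print("notes app:", device_only_outcome(phone, *notes_app))
print("vision app:", device_only_outcome(phone, *vision_app))
```

In this sketch the lightweight application fits comfortably within the phone's resources, while the computer vision application exceeds even its minimum requirements on compute and power, mirroring the mismatch described above.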
Figure 2.3 Self‐contained application operating on device.
Next, we will add an RNDC to the same application. This addition opens up significant new functionality and opportunities for the application but also comes with its own set of drawbacks. The user’s device is connected to the remote data centre using internet connectivity. The device connects to a last mile network, in this example a fourth generation (4G) LTE cellular network, and uses this connection to exchange data with an application instance which is operating in the remote data centre. The application is now using a combination of device resources and data centre resources, most likely by utilising a public or private cloud service. Note, however, that the cloud is not a physical place in and of itself; it is a logical service which uses physical data centre locations and the resources present inside them to provide services to its users. This distinction will become increasingly important throughout this book as the infrastructure used by the cloud includes not only RNDCs but also IEDCs (see Figure 2.4).
In this case, the application is able to call on not just the local resources which are available at the device but also remote resources located within the remote data centre in order to perform its functions. These resources are primarily processing power and data storage, both of which can add capabilities and levels of performance to the application which the device alone is unable to support, and access to them often greatly enriches the user experience.
One difficulty with this case is that the RNDC is typically located a large distance away from the end user and their device. This imposes two challenges on the application. The first is transport: when the transmission of large amounts of data is required, that data is sent using long‐distance network connectivity which, if all other characteristics of the network are equal, is costlier and more prone to network congestion than the network connectivity which would be required for a shorter distance between a device and its serving data centre. The other challenge is latency: Should a real‐time