Node Computing

Welcome to Helsing’s blog! In this first post, our CTO introduces the “Node Computing” paradigm, the architectural north star for our software systems. We invite you to subscribe to the blog if you’d like to read more about Helsing’s company culture, AI research, software systems, and engineering practices.

Cloud Computing and Edge Computing

Two major trends in enterprise IT architecture are Cloud Computing and Edge Computing. Enabled by seminal breakthroughs in distributed storage and processing technology, Cloud Computing is often narrowly interpreted as “Big Data”, thus promoting centralisation of data storage and processing. In the limit, and by straightforward application of Conway’s law, Big Data also centralises organisational knowledge and ultimately decision-making capacity.

Edge Computing is the inverse: assuming a swarm of independent agents, each acting as a source of data, the Edge Computing paradigm promotes distribution of computation. The rationale for Edge Computing is straightforward: since algorithms and models are usually orders of magnitude smaller than data, Edge Computing yields lower latencies, in particular in environments with unreliable communication networks. Edge Computing also distributes the distillation of data into information and thus localises organisational knowledge and decision-making ability.

It is clear that neither Cloud Computing nor Edge Computing is a satisfactory architectural paradigm on its own: complex Cloud Computing systems collapse when communication networks become unstable, and pure Edge Computing systems suffer from a lack of coordination and cannot benefit from the compounding effects of shared knowledge. [By analogy, it is equally clear that neither fully centralised (ie, absolutist) nor fully distributed (ie, anarchic) organisational decision-making systems are desirable.]

Node Computing

Node Computing is a generalisation of Cloud and Edge Computing and reconciles their seemingly contradictory design goals. The Node Computing paradigm is applicable whenever data is produced by distributed agents, communication links are sometimes present and at other times unreliable or non-existent, and when decision-making benefits from central knowledge and coordination but still needs to happen in their absence.

The nodes in a Node Computing system can be tiny, embedded edge devices at one extreme, or scale-out compute clusters at the other. Nodes exchange both data (eg, in the form of real-time sensor streams, metadata, or human inputs) and computation (eg, as algorithms, models, or complex AI-enabled applications).
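
To make this concrete, here is a minimal sketch, in Rust, of what a node-to-node message might look like, with data and computation as the two kinds of payload. The type and field names are illustrative only and do not describe our actual interfaces.

    // Illustrative only: the two kinds of payload a node might exchange.
    enum Payload {
        // Observations produced at the edge, eg a sensor sample.
        Data { stream_id: u32, timestamp_ms: u64, bytes: Vec<u8> },
        // A packaged capability (model or algorithm) to run on another node.
        Computation { name: String, version: u32, artifact: Vec<u8> },
    }

    // An addressed message in a hypothetical node-to-node protocol.
    struct Message {
        from_node: u32,
        to_node: u32,
        payload: Payload,
    }

    fn describe(msg: &Message) -> String {
        match &msg.payload {
            Payload::Data { stream_id, .. } => {
                format!("node {} sends data on stream {} to node {}", msg.from_node, stream_id, msg.to_node)
            }
            Payload::Computation { name, version, .. } => {
                format!("node {} deploys {} v{} to node {}", msg.from_node, name, version, msg.to_node)
            }
        }
    }

    fn main() {
        let sensor_sample = Message {
            from_node: 7,
            to_node: 1,
            payload: Payload::Data { stream_id: 42, timestamp_ms: 1_700_000_000_000, bytes: vec![0u8; 16] },
        };
        let model_push = Message {
            from_node: 1,
            to_node: 7,
            payload: Payload::Computation { name: "detector".into(), version: 3, artifact: vec![] },
        };
        println!("{}", describe(&sensor_sample));
        println!("{}", describe(&model_push));
    }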

Node communication is adaptive vis-à-vis changes to the topology or quality of the communication network. When network links are good, high-fidelity, real-time information can be transmitted from edge systems to central cloud-based infrastructure in order to be indexed and processed by central systems. But when network links deteriorate, nodes must automatically reduce the fidelity of the information transmitted and enable suitable algorithmic and decision-making processes directly at the edge — while still maximising the quality of information and the system’s overall capabilities.
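
As a toy illustration of this behaviour, the sketch below picks a fidelity level for an outgoing stream from the bandwidth currently measured on a link. The thresholds and names are invented for the example; a real system would negotiate this continuously and per stream.

    // Invented thresholds, purely for illustration.
    #[derive(Debug)]
    enum Fidelity {
        Full,          // full-rate, high-resolution streams on healthy links
        Reduced,       // downsampled streams on degraded links
        SummariesOnly, // only compact, locally computed summaries when the link is nearly gone
    }

    fn select_fidelity(measured_kbps: u32) -> Fidelity {
        match measured_kbps {
            kbps if kbps >= 10_000 => Fidelity::Full,
            kbps if kbps >= 500 => Fidelity::Reduced,
            _ => Fidelity::SummariesOnly,
        }
    }

    fn main() {
        // As the link degrades, the node steps down fidelity instead of stalling,
        // keeping the most useful information flowing.
        for kbps in [50_000, 2_000, 64] {
            println!("{} kbps -> {:?}", kbps, select_fidelity(kbps));
        }
    }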

A Node Computing network is scale-invariant in the sense that it exhibits the same conceptual behaviour in a single node, in a group of interconnected nodes, or in the network as a whole. Scale-invariance is achieved by dynamically adapting to changes in system resources and topology, by rewiring the information flow as new nodes come online and others become disconnected, and by deploying and redeploying, configuring and reconfiguring suitable algorithmic capabilities on those nodes on which they achieve the greatest utility.
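
One way to picture the redeployment step is as a small planning loop that runs whenever the topology changes: score each capability against each reachable node and place it where it is most useful. The sketch below does this greedily; the utility function, node kinds, and capability names are all placeholders, not a description of our planner.

    use std::collections::HashMap;

    #[derive(Clone, Copy)]
    enum NodeKind {
        EmbeddedEdge,
        CloudCluster,
    }

    struct Node {
        id: u32,
        kind: NodeKind,
        online: bool,
    }

    // A stand-in utility score: heavy capabilities prefer clusters,
    // latency-critical ones prefer the edge.
    fn utility(capability: &str, node: &Node) -> i32 {
        match (capability, node.kind) {
            ("fleet-wide-fusion", NodeKind::CloudCluster) => 10,
            ("fleet-wide-fusion", NodeKind::EmbeddedEdge) => 2,
            ("local-detector", NodeKind::EmbeddedEdge) => 9,
            ("local-detector", NodeKind::CloudCluster) => 4,
            _ => 0,
        }
    }

    // Greedily place each capability on the best currently-online node.
    fn plan(capabilities: &[&str], nodes: &[Node]) -> HashMap<String, u32> {
        let mut placement = HashMap::new();
        for cap in capabilities {
            if let Some(best) = nodes
                .iter()
                .filter(|n| n.online)
                .max_by_key(|n| utility(cap, n))
            {
                placement.insert(cap.to_string(), best.id);
            }
        }
        placement
    }

    fn main() {
        let mut nodes = vec![
            Node { id: 1, kind: NodeKind::CloudCluster, online: true },
            Node { id: 2, kind: NodeKind::EmbeddedEdge, online: true },
        ];
        let caps = ["fleet-wide-fusion", "local-detector"];
        println!("{:?}", plan(&caps, &nodes));

        // The cloud node drops off the network: re-planning moves everything
        // that can still run to the remaining edge node.
        nodes[0].online = false;
        println!("{:?}", plan(&caps, &nodes));
    }

In practice such a planner would also have to respect resource limits, data locality, and the cost of redeployment itself, all of which the toy version ignores.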

Challenges

We have embraced the Node Computing paradigm due to its conceptual elegance, but also because it has helped us analyse and describe the technical as well as governance and user experience challenges we are solving in building our platforms.

Examples of engineering challenges include: information flow is fundamentally asynchronous, so nodes must make sense of out-of-order data under, at best, confluent semantics; networks and algorithms need to self-configure in response to changes in topology, bandwidth, and latency; data flow must be prioritised based on the global expected utility of the information transmitted; and software deployment and configuration systems need to guarantee compatibility of data and service APIs even when different nodes are upgraded and reconfigured asynchronously.
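
To give a flavour of the first of these challenges, the sketch below shows one classic way to tolerate out-of-order delivery: a merge rule that keeps only the newest update per track, so that nodes which have seen the same set of updates converge to the same state regardless of arrival order. The types and fields are made up for the example.

    use std::collections::HashMap;

    #[derive(Clone, Debug, PartialEq)]
    struct TrackUpdate {
        track_id: u32,
        timestamp_ms: u64,
        position: (f64, f64),
    }

    #[derive(Default, Debug, PartialEq)]
    struct TrackStore {
        latest: HashMap<u32, TrackUpdate>,
    }

    impl TrackStore {
        // Last-writer-wins per track: applying updates in any order gives the
        // same result, which is what makes the semantics confluent.
        fn apply(&mut self, update: TrackUpdate) {
            let newer = self
                .latest
                .get(&update.track_id)
                .map_or(true, |existing| update.timestamp_ms > existing.timestamp_ms);
            if newer {
                self.latest.insert(update.track_id, update);
            }
        }
    }

    fn main() {
        let updates = vec![
            TrackUpdate { track_id: 1, timestamp_ms: 100, position: (0.0, 0.0) },
            TrackUpdate { track_id: 1, timestamp_ms: 300, position: (2.0, 1.0) },
            TrackUpdate { track_id: 1, timestamp_ms: 200, position: (1.0, 0.5) },
        ];

        // Deliver the same updates in two different orders.
        let mut in_order = TrackStore::default();
        let mut reordered = TrackStore::default();
        for u in &updates {
            in_order.apply(u.clone());
        }
        for u in updates.iter().rev() {
            reordered.apply(u.clone());
        }
        assert_eq!(in_order, reordered); // same state despite out-of-order delivery
        println!("{:?}", in_order.latest.get(&1));
    }

Last-writer-wins is only the simplest member of a larger family of order-insensitive merge functions; the point is that the merge, not the arrival order, determines the resulting state.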

The overarching governance and user experience challenge is that the Node Computing system’s complex inner behaviour must surface as simple, interpretable, and transparent to its users, operators, and regulators. For example, auditors need logs and deep information and decision provenance so they can make sense of what the Node Computing system and its users did and why; system governance relies on pervasive permission and access control mechanisms; and users want to understand the lineage and fidelity of all information artefacts in order to build confidence in the system.
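
As one example of what such transparency could rest on, the sketch below attaches a small provenance record to each derived artefact, recording which node produced it, with which capability and version, and from which inputs, so that its lineage can be walked back to the original sources. All names and fields are hypothetical.

    // Metadata attached to a derived artefact (eg a fused track or a report).
    #[derive(Debug)]
    struct Provenance {
        artefact_id: u64,
        produced_by_node: u32,
        capability: String,
        capability_version: u32,
        derived_from: Vec<u64>, // artefact ids of the inputs
        fidelity: String,       // eg "full" or "reduced", as transmitted
    }

    // Walk the lineage of an artefact back to its sources.
    fn lineage(id: u64, records: &[Provenance]) -> Vec<u64> {
        let mut chain = vec![id];
        let mut frontier = vec![id];
        while let Some(current) = frontier.pop() {
            if let Some(rec) = records.iter().find(|r| r.artefact_id == current) {
                for &input in &rec.derived_from {
                    chain.push(input);
                    frontier.push(input);
                }
            }
        }
        chain
    }

    fn main() {
        let records = vec![Provenance {
            artefact_id: 3,
            produced_by_node: 1,
            capability: "fusion".into(),
            capability_version: 2,
            derived_from: vec![1, 2], // fused from two sensor artefacts
            fidelity: "reduced".into(),
        }];
        println!("{:?}", records[0]);
        // Artefact 3 traces back to the raw artefacts 1 and 2.
        println!("{:?}", lineage(3, &records));
    }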

Conclusion

The Node Computing paradigm is the conceptual north star for Helsing’s software architecture. Designing and implementing Node Computing systems is a crazy challenge with lots of opportunities for elegant solutions. We will post regular updates on our software and engineering practices on this blog — stay tuned.