Distribution and communication in software engineering environments. Application to the HELIOS Software Bus.


François-Christophe Jean¹, Marie-Christine Jaulent¹, Jerome Coignard² and Patrice Degoulet¹

¹ Medical Informatics Department, Broussais University Hospital, Paris, France
² Cap Sesa Tertiaire, Paris, France

Abstract

Modularity, distribution and integration are current trends in Software Engineering. To reach these goals, HELIOS, a distributed Software Engineering Environment dedicated to the medical field, has been designed and a prototype implemented. This environment relies on the collaboration of several well encapsulated Software Components. This paper presents the architecture chosen to allow communication between the different components and focuses on the implementation details of the Software Bus, the communication and integration vector of the currently running prototype.

1 Introduction

While current medical applications become more and more complex, they cannot be realized by a single team of programmers using traditional software development methods. The natural architecture of these applications can greatly benefit from the collaboration of individualized, well encapsulated and packaged tools inside a cooperative environment. Also, most new medical applications are in fact widely distributed, reflecting the natural organization of their targets. For instance, a Ward Information System could be composed of various multimedia workstations, database servers, and front ends to HIS or PACS, connected through a high speed local area network. The logical structure and behavior of such a network is highly dynamic: a workstation, a device or a service could be added or removed at any time, while the users (physicians, nurses, technical staff) dynamically share these hardware and software resources to access the different medical objects available. This modular and communicating architecture, which is well recognized at run-time, must also be present at design and implementation time. However, most of today's operational systems are designed in such a way that the tools that compose them remain tightly coupled, and thus dependent on one another. This is a major obstacle to the evolution and extension of such systems.

0195-4210/91/$5.00 © 1992 AMIA, Inc.

In parallel, modularity, integration, genericity and reusability are key words in software engineering. Tool integration is widely recognized as the foundation of future Software Engineering Environments (SEE) that aim at providing cohesive support for the software life cycle. Gathering tools inside independent and reusable modules, and dynamically integrating those components into a single operational system, allows the user to define his own view of the available software systems according to his needs [5]. Modularity enhances the evolution capabilities of the system: each component can be updated individually as technology evolves, avoiding the rewriting of the whole system.

Integrated programming environments have been widely touted as a way to increase productivity. An integrated environment is both a system in which all programs benefit from the same user interface and a system that defines a common understanding for the exchange and storage of data. In the past 20 years, a wide range of environments, using many ways to integrate their tools, has been developed. A first approach is integration through file systems (e.g., the UNIX SCCS and RCS tools). These tools are "integrated" in the sense that they operate on a common set of files, but each has a different interface and obeys different conventions. Another mode is integration through program databases, where a single database stores all relevant information about a system. In these systems the tools share common data structures that represent different aspects of the programs and their execution. A well known illustration of this technique is provided by the European ESPRIT project PCTE [1]. Finally, integration is possible through message management: a loosely coupled message passing facility helps through its simplicity, its ability to reuse existing tools, and its extensibility. This approach has been successfully taken in the Eureka Software Factory [8].
According to [6], a list of tool integration criteria can be drawn up, based on the analysis of the required interactions. Each tool must be able to interact directly with every other tool of the environment; tools must be able to share the available information dynamically; users must be able to access that information through a unique interface, whatever the tool used or its physical location. These requirements imply the cooperation of various tools, coming from different sources and possibly running on different operating systems, inside a unique but open framework that supports transparent calling mechanisms [2]. This paper presents the work carried out in the HELIOS European project to build a Medical Software Engineering Environment [4] based on such integration recommendations.

2 The HELIOS distributed environment

2.1 The Software Components

According to the concepts presented above, HELIOS is a distributed environment based on several independent, well encapsulated pieces of software called Software Components, connected to a Software Bus. Three main components, which are functionally strongly linked, form the kernel of the system: the Information System (built on top of an object oriented database), the Documentation Facilities (a hypermedia network of documentation) and the Interface Manager (the unique interface of the system). The kernel offers medical application developers all the tools necessary to build basic applications. Beside this kernel is a collection of highly specialized components, called services, that a developer can add (i.e., plug onto the system) to enhance the management capabilities concerning a specific modality of medical information. The first representative of these services is a Medical Image Processing Toolbox. Other services, like electrophysiological signal processing, speech recognition or medical language processing toolboxes, could be added later on.

From a logical point of view, a Software Component is a specialized computer program that has its own processing kernel but takes most of its input/output from the remaining Software Components. From a computational point of view, it is a composite entity made of two parts: an Interface Area, which supports the connection of the component to the Bus and handles all its communications with the outside, and a Core, itself a composite part made of several cooperating elements within an object-oriented architecture, which is responsible for the actual processing operations.

2.2 The Software Bus

The key point of this architecture is integration, which is supported by two complementary aspects. First, there is a common understanding between components on the semantics of the data exchanged through the messages. Exchanges take place over a logical communication stream, the Software Bus, which hides all heterogeneity and distribution aspects from the communicating parties. Second, a plug-in mechanism allows Service components to be added to the existing HELIOS environment without having been accounted for a priori by the whole system. This feature, used at component installation time, is fully supported by the Interface Area of each component.

2.3 Asynchronism in communications

From a communication point of view, HELIOS is based on a dynamic client-server architecture. A Software Component can thus be considered as a server when another component (the client) asks it for a resource through a message; it can be viewed as a client when, in turn, it asks another component for a service. Any Software Component can communicate directly with any or all of the others, independently of the type of their physical connection. For instance, when two communicating components run on the same host, local interprocess communication facilities are automatically used and the network primitives are bypassed. For the remainder of this paper we shall suppose that the different Software Components run on different UNIX-based machines linked through a local area network, as in the present implementation of the HELIOS Software Bus. Figure 1 illustrates this communication architecture.
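The dynamic client-server roles and the automatic choice between local IPC and network transport can be illustrated with a minimal sketch. It is not the HELIOS implementation: the `Component` class, its `services` table, and the `"ipc"`/`"tcp"` labels are hypothetical names introduced here, and the remote call is simulated by a direct function call.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A Software Component that can act as client and as server (sketch)."""
    name: str
    host: str
    services: dict = field(default_factory=dict)  # service name -> callable

    def transport_to(self, other):
        """Local IPC when both peers share a host, a network stream otherwise."""
        return "ipc" if self.host == other.host else "tcp"

    def ask(self, server, service, *args):
        """Client role: request `service` from `server`; returns (transport, result)."""
        return self.transport_to(server), server.services[service](*args)


if __name__ == "__main__":
    info = Component("information_system", "hostA",
                     {"lookup": lambda key: f"record:{key}"})
    ui = Component("interface_manager", "hostA",
                   {"display": lambda text: f"shown:{text}"})
    imaging = Component("image_toolbox", "hostB")

    print(ui.ask(info, "lookup", "p1"))       # ui is the client, same host
    print(info.ask(ui, "display", "p1"))      # roles reversed
    print(imaging.ask(info, "lookup", "p2"))  # peers on different hosts
```

The same object serves and requests, so no component is statically a client or a server; only the direction of a given message decides the role.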

Figure 1: The communication architecture

On each Software Component, a process belonging to the Interface Area, the Interface Area Daemon (IAD), is responsible for message management. Each IAD is physically linked to all the other IADs through streams mounted over TCP/IP socket-based connections. Each IAD is also linked to its component's core process by local Inter Process Communication (IPC) facilities. Two IPC channels are implemented between the core and the interface area: a control channel and a data channel. When an incoming message arrives on an open stream connection, the IAD extracts the data field from its envelope (see figure 2) and puts the object on the data channel. It then sends a message via the control channel in order to interrupt (UNIX signal) the Software Component application process and trigger the fetch of the available data. This mechanism allows the process to continue its own execution without having to issue unsuccessful receive requests, as it would if the system were based on a polling mechanism.

From the component process's point of view, one way to implement such a behavior is to base the system on an event-driven state model. The system's state must be altered each time an action is processed. Each action generates, as in the X Window System, a time-stamped event that is delivered via an event queue to the Software Component's process. For processes already based on event management, like the Interface Manager, the modification of the program control flow is quite straightforward. In that particular case, both user actions and messages from the other Software Components are treated as events. This is easily done by declaring the interruptions generated by the IAD as valid X events. With the appropriate event handler, these foreign events are treated exactly like mouse clicks or keyboard entries. For the other components, organized along the object paradigm, these messages are simply forwarded to their target classes through an abstraction boundary [9]. The latter is the interface that a Software Component presents to its clients. It determines the form in which resources provided by a component may be accessed. It constitutes the Software Component's protocol, which is mainly a set of method selectors.

At each Software Component level, an event is processed atomically. The process alters its state in such a way that a potential response can be handled appropriately when it arrives. Doing so, a client never waits for specific input from a server. However, stronger synchronization can be obtained, according to the requirements of the considered Software Component. If a component's process cannot continue its execution before receiving an answer message, a blocking form can be used to implement remote procedure calls on both sides.
The synchronization is then data-driven and implements a wait-by-necessity model (a process waits only for data to be returned). It is then the responsibility of the application program to determine which transactions require a blocking form and which do not. The solution adopted was to tie synchronization to message typing: messages that need a blocking form are grouped into dedicated subclasses. Each such subclass contains a class variable, passed via the control channel, that tells the process to remain blocked until the remote processing has completed.
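The two-channel delivery and the wait-by-necessity model can be sketched as follows. This is an illustration under stated assumptions, not the HELIOS code: in-process queues stand in for the IPC channels and the UNIX signal, a thread stands in for the remote component, and the names `Component`, `BlockingQuery`, `serve_once` and `send` are hypothetical.

```python
import queue
import threading


class Message:
    """Base class of the sketch; messages are non-blocking by default."""
    blocking = False  # class variable, the flag carried on the control channel

    def __init__(self, sender, body):
        self.sender = sender
        self.body = body


class Query(Message):
    """Request answered later; the client keeps running meanwhile."""


class BlockingQuery(Query):
    """Request whose sender must wait for the reply (wait by necessity)."""
    blocking = True


class Reply(Message):
    """Response to an earlier query."""


class Component:
    def __init__(self, name):
        self.name = name
        self.data_channel = queue.Queue()     # message objects
        self.control_channel = queue.Queue()  # "data available" notices

    def deliver(self, message):
        """IAD role: put the object on the data channel, then notify the core."""
        self.data_channel.put(message)
        self.control_channel.put(type(message).blocking)

    def next_event(self, timeout=None):
        """Core role: wait for a notice, then fetch the matching object."""
        self.control_channel.get(timeout=timeout)
        return self.data_channel.get_nowait()

    def has_event(self):
        return not self.control_channel.empty()


def serve_once(server, client):
    """Server side: handle one query atomically and send back a reply."""
    query = server.next_event(timeout=5)
    client.deliver(Reply(server.name, query.body.upper()))


def send(client, server, message):
    """Send a message; block for the reply only if its class requires it."""
    server.deliver(message)
    if type(message).blocking:
        return client.next_event(timeout=5)  # blocked until the reply arrives
    return None                              # the reply shows up later as an event


if __name__ == "__main__":
    ui, db = Component("interface_manager"), Component("information_system")
    worker = threading.Thread(target=serve_once, args=(db, ui))
    worker.start()
    reply = send(ui, db, BlockingQuery(ui.name, "patient record"))
    worker.join()
    print(reply.body)
```

With a plain `Query`, `send` returns immediately and the reply is picked up later by the client's normal event loop through `next_event`, which is the non-blocking behavior the text describes as the default.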

2.4 The messages as communication agents

2.4.1 Structure of a message

To guarantee a high level of abstraction during the interactions between the various HELIOS Software Components, the exchanged messages encapsulate the objects as they are defined and manipulated at each component level. This avoids the need to define explicitly an intermediate structure used only during the transfer of the object from one component to another. The complete structure of a message is presented in figure 2. Each message belongs to exactly one message class that encapsulates certain essential properties of the message. These classes are abstract classes that are specialized by concrete subclasses, which deal with the blocking form discussed previously. The generic classes implemented so far are: Assertion, a message that conveys information but requires no reply; Query, a request for information; Reply, a response to an earlier query. A message is thus an object containing variables such as the class of the message, the component that created the message, the destination of the message, the message header, and the message body, which consists of an array of application objects.

Figure 2: Structure of a message

2.4.2 External representation of the messages

Exchanging complex, structured information among different processes raises the problem that these objects may not have a unique machine representation. Well known cases are the different representations of integer and floating point numbers among processors. For complex structures, like composite objects, memory alignment problems must also be solved. To minimize these problems, HELIOS uses a standard external representation mechanism (independent of the machine's physical structure and thus portable) to code the different messages. The mechanism is based on the XDR (External Data Representation) protocol defined by SUN Microsystems and then adopted as a standard by the UNIX community.

When a Software Component wants to transmit a message to another component, it must serialize (or code) the different objects composing the message, which may be composite themselves. For the sender of the message, the serialization operation consists in appending the XDR representations of the message and object parts, also coding the structural relationships existing among these parts. The result of this operation is an information stream. This stream is then associated with the resource in charge of conveying the information to the receiver. At the other end, the receiver process can extract from the stream the XDR representations of the message parts and their relations. It can then apply to them the conversion functions inherent to the protocol to obtain the local representation of the message and its embedded objects. This process supposes a "logical association" between the coding and the decoding streams. The association is strong if the sender and receiver processes are running on the same machine, and weaker when a transport mechanism is needed to convey information from one machine to another.
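The serialization step can be illustrated with a small, hand-written subset of XDR (4-byte big-endian unsigned integers, and strings coded as a length followed by the bytes, zero-padded to a multiple of 4). This is a sketch under assumptions: the field order of `Message`, the numeric coding of the message class, and the restriction of the body to strings are choices made here for illustration and are not taken from figure 2.

```python
import struct
from dataclasses import dataclass, field


def xdr_uint(value):
    """XDR unsigned integer: 4 bytes, big-endian."""
    return struct.pack(">I", value)


def xdr_string(text):
    """XDR string: length, then the bytes, zero-padded to a multiple of 4."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + b"\x00" * padding


class Reader:
    """Sequential decoder over an XDR byte stream."""

    def __init__(self, stream):
        self.stream = stream
        self.offset = 0

    def uint(self):
        (value,) = struct.unpack_from(">I", self.stream, self.offset)
        self.offset += 4
        return value

    def string(self):
        length = self.uint()
        data = self.stream[self.offset:self.offset + length]
        self.offset += length + (4 - length % 4) % 4  # skip the padding
        return data.decode("ascii")


MESSAGE_CLASSES = ("Assertion", "Query", "Reply")


@dataclass
class Message:
    message_class: str
    source: str
    destination: str
    header: str
    body: list = field(default_factory=list)  # application objects (strings here)

    def encode(self):
        """Append the XDR representation of each part into one stream."""
        parts = [
            xdr_uint(MESSAGE_CLASSES.index(self.message_class)),
            xdr_string(self.source),
            xdr_string(self.destination),
            xdr_string(self.header),
            xdr_uint(len(self.body)),  # array: element count comes first
        ]
        parts += [xdr_string(item) for item in self.body]
        return b"".join(parts)

    @classmethod
    def decode(cls, stream):
        """Rebuild the local representation from the stream."""
        reader = Reader(stream)
        message_class = MESSAGE_CLASSES[reader.uint()]
        source = reader.string()
        destination = reader.string()
        header = reader.string()
        body = [reader.string() for _ in range(reader.uint())]
        return cls(message_class, source, destination, header, body)


if __name__ == "__main__":
    sent = Message("Query", "interface_manager", "information_system",
                   "get", ["patient", "record"])
    stream = sent.encode()
    print(len(stream), Message.decode(stream) == sent)
```

Because every item is padded to a 4-byte boundary and integers have a fixed byte order, the receiver can decode the stream without knowing the sender's word size, alignment or endianness.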

2.5 Transparent binding and location broking

The computers in the LAN system are identified by their hostnames. The application names of currently running Software Components are registered locally, with associated information, by a name server daemon listening for incoming connection requests on a well-known port. To plug a component, a connect request is issued by the user through the Interface Manager. The local host gets a local port number for the requested Software Component, sets up the control structures and buffer space associated with the connection, maps the remote hostname to the corresponding address, and sends to the daemon at the remote address a message containing the local Software Component name and the local port number used. On the remote side, this opening request is received by the daemon and forwarded to the invoked Software Component process if it is running. If it is not currently running, it is launched in the background via an automatic remote command procedure. Then, the associated structures are set up on the remote host and an acceptance message, containing the remote port number, is sent to the requesting Software Component on the local port passed in the initial message.

3 Results

A prototype of HELIOS has been built and currently serves as a testing platform for tool integration. A first version of the kernel, composed of the Information System, which relies on the GemStone object oriented database, and of the Interface Manager, built using the OSF/Motif windowing environment, was implemented in ANSI C. A first specialized toolbox, the Image Processing Toolbox, was also implemented in ANSI C. The current implementation of the Software Bus relies on TCP/IP but was strongly influenced by the OSI model. Three abstract layers, which encapsulate several OSI layers, were distinguished to help the implementation (see figure 3). The bottom layer, called the carriage layer, provides all the facilities for transporting messages from one Software Component to another. This layer is represented by communication primitives integrated in the host operating systems and was beyond the scope of HELIOS developments.

Figure 3: HELIOS communication layers

OSI layer                     Software Bus
7    Application              HELIOS message semantics
6    Presentation             XDR protocol
5    Session                  Synchronized full duplex
4    Transport                Transport control protocol
3    Network                  Internet protocol
2-1  Data link, Physical      IEEE 802.3 CSMA/CD

Above this datagram network, we have considered a layer distributed among the components, henceforth referred to as the components layer. This layer provides intercomponent communication and synchronization mechanisms. It is mainly represented by the cooperation of the IADs during a connection. The last layer, the environment layer, is made up of the application programs (core processes) that run using the facilities provided by the components layer. A first attempt to implement the Software Bus over heterogeneous operating systems was made by integrating DECnet as a possible transport protocol to plug components running on VAX/VMS machines. This effort will be followed up and will rely on OSI protocols, as soon as they are available in many environments, to integrate platforms coming from different hardware vendors into the HELIOS environment.
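The plug procedure of section 2.5 (local port allocation, opening request to the remote name server daemon, launch on demand, acceptance message) can be sketched with an in-memory simulation. No real sockets or remote commands are used; the `Host` class, the `plug` function, the dictionary used as acceptance message, and the starting port number 5000 are hypothetical choices made for this illustration.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Host:
    hostname: str
    registry: dict = field(default_factory=dict)  # running component -> port
    installed: set = field(default_factory=set)   # components that can be launched
    _ports: itertools.count = field(default_factory=lambda: itertools.count(5000))

    def allocate_port(self):
        return next(self._ports)

    def launch(self, component):
        """Start an installed component and register it with the name server."""
        if component not in self.installed:
            raise LookupError(f"{component} is not installed on {self.hostname}")
        self.registry[component] = self.allocate_port()
        return self.registry[component]

    def handle_open_request(self, target, requester, requester_port):
        """Name server daemon: forward to the target, launching it if needed."""
        port = self.registry.get(target)
        if port is None:
            port = self.launch(target)
        # Acceptance message sent back to the requester on its local port.
        return {"type": "accept", "to": (requester, requester_port),
                "remote_port": port}


def plug(network, local_host, local_component, remote_hostname, remote_component):
    """Connect `local_component` to `remote_component`; returns both port numbers."""
    local_port = local_host.allocate_port()
    remote_host = network[remote_hostname]  # stands in for hostname-to-address mapping
    acceptance = remote_host.handle_open_request(
        remote_component, local_component, local_port)
    return local_port, acceptance["remote_port"]


if __name__ == "__main__":
    workstation = Host("ws1")
    server = Host("srv1", installed={"image_toolbox"})
    network = {"ws1": workstation, "srv1": server}
    print(plug(network, workstation, "interface_manager", "srv1", "image_toolbox"))
    print(server.registry)
```

A second `plug` call for the same remote component finds it in the registry and reuses its port instead of launching it again, which mirrors the "forwarded if running, launched otherwise" rule of the text.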

4 Discussion and conclusion

Maintainability (reduction of the ripple effect of software changes) and the production of understandable software are recognized to benefit greatly from an architecture in which the software modules are independent. That means that, to perform its functions, no module requires detailed knowledge of other modules' functions. Having a Software Engineering system that is as open as possible is of great interest for the construction of medical applications, especially to make the integration of technological improvements as easy as possible. In HELIOS, the best way to achieve this goal seemed to be to organize the system as a cooperation of distributed, well encapsulated software components, which are close to Cox's Software ICs [3]. The value of such an architecture became apparent during the exploratory phase of the HELIOS project, where applications were broken into component parts and handled separately by different development teams. The Image Processing Toolbox, developed independently from the kernel, demonstrates the effectiveness of this approach.

Distribution, communication and resource management are the activities that constitute the major differences between distributed environments and conventional ones. Following [7], some directions to organize such distributed environments have been pointed out and integrated in the current HELIOS implementation. The use of a message based communication scheme allows, for instance, independence in module operations [8]. In fact, Software Components function autonomously (a major attribute of distributed systems) because a component interaction only occurs when a message is transmitted. The use of a logical bus communication scheme maximized accessibility (direct communication among objects) and allowed the flexibility of using object names for message communication, leading users to access resources by type of service. Specifying a virtual machine as the vehicle for achieving system requirements keeps concerns about hardware or software details, such as bus structure or network protocols, from interfering with the thought process of identifying desirable system properties, which are independent of the method of implementation.

However, a true bus master devoted to communications control has not been implemented so far. This absence of a central controller does not seem to be a limitation as long as the number of Software Components remains reasonable and, above all, the number of connected components does not change during a work session with HELIOS. If no additional component becomes available during an initialized session, the connection protocol can be drastically simplified.
In fact, we do not have to manage the sending of a signal from a component to the master at plug time, to maintain an address table in the master component, or to periodically send a list of the currently connected components with their logical addresses to all the other components. This avoids the use of a slow polling mechanism, which is time consuming for each component and increases network traffic. On the other hand, each component must itself manage its connections with all the other ones. If the number of components increases significantly, the virtual connections between components can become difficult to track. As future developments will focus on introducing new components in the HELIOS environment, it seems important to plan the implementation of a bus master, which could be integrated in the Information System. In that case, the messages will be considered as communication objects, with structure and behavior managed within the database. The Information System will then evolve from a central repository into an active object manager.

Acknowledgements

Partners of the project include Cap Sesa (prime), Broussais University Hospital, the German Center for Cancer Research in Heidelberg and the Geneva State Hospital. The development of HELIOS has benefited from the financial support of the Commission of European Communities (AIM 1004) and DIGITAL Equipment Corporation (external research program FR-018). We would like to thank Pauline de Vos and Jean-Jacques Nicole (DIGITAL) for their helpful comments on the HELIOS project, and Servio Corporation, which significantly helped the project by providing the latest versions of the GemStone product.

References

[1] Boudier G, Gallo F, Monot R, Thomas I: An overview of PCTE and PCTE+. In: Proceedings of the ACM SIGSOFT/SIGPLAN. ACM Press, Nov. 1988, pp. 248-257.

[2] Clement D: A Distributed Architecture for Programming Environments. SIGSOFT '90. Proceedings of the Fourth ACM SIGSOFT Symposium on Software Development Environments. Irvine, CA, Dec. 3-5, 1990, pp. 11-21.

[3] Cox BJ: Object Oriented Programming. An Evolutionary Approach. Addison-Wesley, Reading, MA, 1987.

[4] Degoulet P, Coignard J, Jean FC et al.: The HELIOS European project on software engineering. In: Software Engineering in Medical Informatics. Timmers T and Blum B (eds). Amsterdam: North-Holland, 1991 (in press).

[5] Meyers S: Difficulties in integrating Multiview Development Systems. IEEE Software 1991; 8(1): 49-57.

[6] Reiss S: Connecting Tools Using Message Passing in the Field Environment. IEEE Software 1990; 7(4): 57-66.

[7] Schneidewind NF: Distributed System Software Design Paradigm with Application to Computer Networks. IEEE Transactions on Software Engineering 1989; 15(4): 402-412.

[8] TDG: ESF Technical Reference Guide - Version 1.1. ESF (European Software Factory) Technical Administrative Team, 152 Hohenzollerdamm, D-1000 Berlin, Germany, July 1989.

[9] Wegner P: Concepts and Paradigms of Object-Oriented Programming. OOPS Messenger, ACM Press, 1990; 1(1): 8-87.
