Robotics middleware is the glue software that connects the components of a complex robot control system.
General concepts
As glue code, the middleware should be invisible, and introduce no overhead or extra constraints on the components. This is of course an unreachable (non-functional) design requirement, so compromises have to be made. Different middleware projects mostly differ in which compromises are made (implicitly, most often!) and in which robotics applications are being targeted.
As glue software, middleware should support the coupling of subsystems, which requires a fundamentally different software development skill from the decoupling that most software engineers are trained in. Indeed, decoupling is the major focus of class library development: one should make a library as independent of other libraries as possible. Middleware, on the other hand, must provide optimal support for coupling: it must make it possible to combine multiple, independently designed components in a way that satisfies system-level requirements.
The composition of subsystems into a new system is often a difficult task: designing the architecture of the system is hard, since it requires finding the optimal trade-offs between all system requirements and realising the optimal cooperation between all system components. There are currently almost no software tools, or internationally accepted standards and workflows, to support the job of the system designer.
Some of the problems to be solved when designing a composite system are:
- the composed system should have an interface that is not (much) more complex than the combination of all composing subsystems. Otherwise, the composite system offers no real design advantages to the human developer. In practice, this means that the composite system developer makes some design decisions that restrict the use of each of the components to only a part of its potential domain.
- the composed system should act to its users as one consistent, monolithic system in itself.
- building a system from reusable components requires balancing performance against ease of reuse: it is (or at least seems to be) easier to optimize performance if one is not restricted to using only pre-built components.
At a conceptual level, a complex robot controller has components that each serve one of the following four concerns:
- Communication: components must exchange information (data, events, commands,…), and how this exchange is done is an important property of the composite system.
- Computation: each component performs certain computations that are necessary to provide the functionality that is expected from the system.
- Configuration: components should be usable in more than one possible configuration (i.e., concrete settings for each of their variable parameters), but the amount of configuration is an important aspect of the design and the implementation of components and systems. Configuration is required at various moments in the lifetime of a software system: compile time, deployment time, run time,…
- Coordination: the activities in components have to be coordinated by something at the system level, in order to guarantee the expected behaviour and performance of the composed system. Coordination involves: decision making, scheduling, (de)activating subsystems and/or their interconnections,…
Whether these four above-mentioned primitive concepts are really minimal (i.e., one needs only these four concepts to cover all relevant system design aspects) and/or complete (i.e., these concepts cover all possible systems) is not so important in this discussion; the important thing is that, at systems level, the designer should benefit from a level of abstraction that is an appropriate trade-off between complexity (the fewer concepts are needed, the better) and flexibility (the more diverse systems can be represented by the conceptual primitives, the better). Again, the appropriate trade-off is not an absolute concept, so it will depend on many (non-functional) design requirements. As such, both the number and the nature of the primitive concepts, and the particular trade-off, are discriminating factors between different middleware projects.
Composing two or more components that each belong to one of these categories is an architectural design activity. It is often complex, in that it has to balance a large number of functional and non-functional requirements (performance, compositionality,…). The robotics research community has not yet come up with fully satisfying software frameworks, architectures, or methodologies to deal with the composition problem, but a large number of (open source) projects already exist, and they all claim to provide good solutions to this component composition problem, or at least to (implicitly described) parts of it. Nevertheless, many fundamental questions are still unsolved, or rather, still unnoticed within the robotics research community. This article presents an overview of some of the relevant issues to be considered in the design and use of such middleware, and also provides an annotated list of middleware projects with an evaluation of which design constraints they took (or did not take) into account, and of how well they perform.
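The four concerns listed above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `Component`, `Coordinator`, `scale` and `gain_stage` are hypothetical and do not come from any of the middleware projects discussed in this article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Component:
    """Hypothetical component that keeps the four concerns apart."""
    name: str
    compute: Callable[[float, Dict[str, float]], float]      # Computation
    config: Dict[str, float] = field(default_factory=dict)   # Configuration
    outbox: List[float] = field(default_factory=list)        # Communication (output port)
    active: bool = False                                      # set by the coordinator only

    def step(self, value: float) -> None:
        # Compute and publish only while the coordinator has activated us.
        if self.active:
            self.outbox.append(self.compute(value, self.config))


class Coordinator:
    """Coordination: decides which components run; does no computation itself."""

    def __init__(self, components: List[Component]) -> None:
        self.components = {c.name: c for c in components}

    def activate(self, name: str) -> None:
        self.components[name].active = True

    def deactivate(self, name: str) -> None:
        self.components[name].active = False


def scale(value: float, config: Dict[str, float]) -> float:
    """Pure computation; its behaviour is tuned only through configuration."""
    return value * config.get("gain", 1.0)


gain_stage = Component("gain_stage", compute=scale, config={"gain": 2.0})
coordinator = Coordinator([gain_stage])

gain_stage.step(1.5)                 # ignored: the component is not yet active
coordinator.activate("gain_stage")
gain_stage.step(1.5)                 # computed with gain 2.0 and published
```

Keeping the activation flag outside the computation is one possible design choice: the same `scale` function can then be reused under a different coordinator or with a different configuration without being modified.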
Composition of subsystems
How to optimally compose subsystems into a larger system is the core activity of system developers, but it remains more of an art than a science. The major challenge is to develop subsystems that are stable on their own, while still being easy to integrate into a larger system. There are four different ways of composing software components:
- linking object classes by providing explicit references to each other:
- composing object classes without them knowing about each other
- composing components
- composing software services:
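The first two styles in the list above can be contrasted in a short sketch. The classes `Logger`, `CoupledSensor` and `DecoupledSensor` are hypothetical names chosen for illustration; the second style is shown here with a simple callback (observer) mechanism, which is only one of several ways to compose classes that do not know about each other.

```python
from typing import Callable, List


class Logger:
    def __init__(self) -> None:
        self.lines: List[str] = []

    def write(self, text: str) -> None:
        self.lines.append(text)


class CoupledSensor:
    """Style 1: the sensor holds an explicit reference to a Logger."""

    def __init__(self, logger: Logger) -> None:
        self.logger = logger

    def publish(self, reading: float) -> None:
        self.logger.write(f"reading={reading}")


class DecoupledSensor:
    """Style 2: the sensor only knows that it has subscribers (plain callables)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[float], None]] = []

    def subscribe(self, callback: Callable[[float], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, reading: float) -> None:
        for callback in self._subscribers:
            callback(reading)


# Style 1: the wiring is fixed in the sensor's constructor.
log_a = Logger()
CoupledSensor(log_a).publish(0.5)

# Style 2: a third party (the system builder) does the wiring.
log_b = Logger()
sensor = DecoupledSensor()
sensor.subscribe(lambda r: log_b.write(f"reading={r}"))
sensor.publish(0.5)
```

In the second style neither class mentions the other, so either can be replaced without touching its counterpart; the cost is that the wiring now lives in a separate place that someone has to maintain.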
A composed system is stable if it can be used without the user having to know that it is a composed system in itself. Examples of commonly used compositions that are not stable are:
- Simulink blocks in feedback controllers: one often has to introduce explicit delay blocks; one cannot predict the overall performance on the basis of the performance of the individual blocks.
- Realtime aspects at the system level: only one of the components can really have the highest priority; schedulability of the activities in all components becomes exponentially harder to analyse, let alone to guarantee, when the number of components grows; IPC deadlocks become more likely, and more difficult to trace; formal verification becomes more difficult, since jitter and latency deteriorate in less predictable ways, compromising the ideal, abstract model of states with atomic and infinitely fast transitions and condition checks.
- Adding sensor processing or control blocks to a control loop: each new sensor can bring with it a device driver that requires a different sampling frequency, that provides a different spatial resolution, …
Robotics middleware projects
Player Project
- System organization/decomposition - Player is a device server (application server) with a collection of dynamically loadable device shared libraries. The main part of this server is a custom communication protocol that enables a client-server communication model. Player is implemented in C++ and uses the POSIX-compliant pthread interface for multithreading. Player can be viewed as a multithreaded application server that provides services to client programs, where the services are the devices used by those programs. Since devices can be loaded dynamically, Player can be considered an approach "some way in between monolithic application and modular middleware+services". A main Player server thread listens for client connections and spawns threads whenever a client program requests a service from a specific device. Each device (device program) has its own thread of execution. As can be seen from figure 1, the threads communicate via a shared global address space. Each device has an associated command buffer and data buffer. Whenever a client needs to access a specific device, it sends a command, which is queued in the command buffer and then read by the device. The same applies to the data buffer: the device writes data to the buffer and the client program reads it. Since Player does not implement any device locking mechanism, when multiple clients are connected to a Player server one client can overwrite the commands of another. This applies to commands and data, which are implemented as asynchronous, one-way continuous streams, but not to configuration requests, which are used to access specific hardware features and are implemented as a two-way synchronous request-reply interaction. One can also define the frequency at which a particular device provides data to the client; the default is 10 Hz.
Depending on the needs of a client, data can be served in PUSH or PULL mode. The default mode is PUSH, in which the server/devices send all available data to the client.
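The buffer mechanism described above, including the absence of device locking, can be illustrated with a minimal sketch. This is not Player's API: `DeviceBuffers` and `device_cycle` are hypothetical names, and modelling the command buffer as a single slot in which a newer command replaces an unread older one is an assumption made to keep the example short.

```python
import threading
from typing import Optional


class DeviceBuffers:
    """Hypothetical single-slot command and data buffers for one device."""

    def __init__(self) -> None:
        # The mutex protects memory access only; it does NOT give any
        # client exclusive ownership of the device.
        self._mutex = threading.Lock()
        self._command: Optional[dict] = None
        self._data: Optional[dict] = None

    def put_command(self, client: str, velocity: float) -> None:
        # Any client may write; a newer command replaces an unread older one.
        with self._mutex:
            self._command = {"from": client, "velocity": velocity}

    def take_command(self) -> Optional[dict]:
        with self._mutex:
            command, self._command = self._command, None
            return command

    def put_data(self, sample: dict) -> None:
        with self._mutex:
            self._data = sample

    def get_data(self) -> Optional[dict]:
        with self._mutex:
            return self._data


def device_cycle(buffers: DeviceBuffers, position: float, dt: float) -> float:
    """One device iteration: read the pending command, act, publish fresh data."""
    command = buffers.take_command()
    velocity = command["velocity"] if command else 0.0
    position += velocity * dt
    buffers.put_data({"position": position})
    return position


buffers = DeviceBuffers()
buffers.put_command("client_a", 1.0)
buffers.put_command("client_b", -2.0)    # silently replaces client_a's command
position = device_cycle(buffers, position=0.0, dt=0.1)   # dt for a 10 Hz cycle
latest = buffers.get_data()              # what a client would read next
```

After the cycle, only the command from `client_b` has had any effect, which is the overwrite behaviour the text warns about. The data a client reads is whatever was last written to the buffer, so it is always at least slightly older than the device's current state.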
- Communication approach - In its current version, Player is decomposed into two main parts. The first is the Player core, which comprises the core Player API, device drivers, driver loading code, configuration parsing code and the driver registry. The second is the transport layer, which is independent of the device drivers and can be any type of transport system. In the current version it is implemented as two libraries, libplayertcp and libplayerxdr, and is based on TCP communication using sockets (and message queues). Other transport types that can be used are JINI and CORBA based (that is, there is support for RMI and RPC).
- Platform/language support and other system features - Player supports plugin drivers and the simulation environments Stage and Gazebo. There is no device locking mechanism, and client-side data is always older than server-side data because of the "buffer sit" times. Devices are treated as files with read/write access, and most devices adopt the character device model. On the client side, programs communicate with the server/devices through their respective local proxies. Player is supported on most UNIX flavors and under Windows using Cygwin. Client programs can be written in any language that provides a socket mechanism, e.g. Tcl, Python, C, C++, Java, etc.
- Fault tolerance and robustness aspects - Although the Player project has been active for a long time, there have been no considerable developments to improve its fault-tolerance capabilities. One feature that can be counted is the libplayererror library, which can be used for error reporting. There are also some procedures for thread locking to avoid conflicts. From the perspective of robustness, it should be possible to implement planning, learning, state estimation or similar functionality for an application based on Player. In the future the authors want to implement a resource awareness attribute, i.e. enable programs to perform resource discovery and change their behavior according to the availability of resources.
RT-middleware Projects
- System organization/decomposition - RT-middleware (Robotics Technology Middleware) is a common platform standard for robots based on distributed object technology.[1] RT-middleware supports the construction of various networked robotic systems by integrating network-enabled robotic elements called RT-components. The specification of the RT-component is discussed and defined by the Object Management Group (OMG).[2]
- System organization/decomposition - In RT-middleware, robotic elements such as actuators are regarded as RT-components, and the whole robotic system is constructed by connecting those RT-components. This distributed architecture helps developers to reuse robotic elements and improves the reliability of the robotic system. Each RT-component has ports as endpoints for communicating with other RT-components. Every port has a type, and ports of the same type can be connected to each other. Each RT-component also has a state, so RT-components behave as state machines. The possible states are CREATED, INACTIVE, ACTIVE, and ERROR, and the states and behaviors are controlled by the execution context. If developers want to change the behavior of their RT-components, the execution context can be replaced at run time.
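The component model described above (typed ports, four states, and a separate execution context that drives the state) can be sketched as follows. This is not the OpenRTM-aist API: the class names, the event names (`initialize`, `activate`, `deactivate`, `fail`, `reset`) and the exact transition table are assumptions made for illustration; only the four state names come from the text.

```python
from enum import Enum


class State(Enum):
    CREATED = "CREATED"
    INACTIVE = "INACTIVE"
    ACTIVE = "ACTIVE"
    ERROR = "ERROR"


# Assumed transition table; the event names are illustrative only.
TRANSITIONS = {
    (State.CREATED, "initialize"): State.INACTIVE,
    (State.INACTIVE, "activate"): State.ACTIVE,
    (State.ACTIVE, "deactivate"): State.INACTIVE,
    (State.ACTIVE, "fail"): State.ERROR,
    (State.ERROR, "reset"): State.INACTIVE,
}


class Port:
    """A typed endpoint; only ports of the same type may be connected."""

    def __init__(self, name: str, data_type: type) -> None:
        self.name = name
        self.data_type = data_type
        self.peer = None

    def connect(self, other: "Port") -> None:
        if self.data_type is not other.data_type:
            raise TypeError(f"cannot connect {self.name} to {other.name}")
        self.peer, other.peer = other, self


class Component:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.CREATED


class ExecutionContext:
    """Drives component state; kept separate so that it could be replaced."""

    def fire(self, component: Component, event: str) -> State:
        key = (component.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in {component.state.name}")
        component.state = TRANSITIONS[key]
        return component.state


motor = Component("motor")
context = ExecutionContext()
history = [context.fire(motor, event).value
           for event in ("initialize", "activate", "fail", "reset")]

speed_out = Port("speed_out", float)
speed_in = Port("speed_in", float)
speed_out.connect(speed_in)          # same type: connection succeeds
```

Because the component holds only its state and the execution context holds the transition logic, a different context object could be substituted without changing the component, which mirrors the run-time replacement mentioned in the text.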
- Implementations - RT-middleware is only a standard for robotics platform software, so several implementations of RTM exist. RT-middleware does not guarantee interoperability between RTM implementations; however, bridging tools that mediate the communication can be created easily, because the state machines and their programming model are defined. Indeed, almost all RTM implementations already provide bridge tools (or native support) for communication with OpenRTM-aist. Implementations of RT-middleware include the following:
- "OpenRTM-aist" is an implementation of RT-middleware based on CORBA, developed by the National Institute of Advanced Industrial Science and Technology. Thanks to CORBA, OpenRTM-aist is available on multiple platforms (Windows, Linux, OS X, VxWorks, TOPPERS, QNX, ...) and in multiple languages (C++, Python, Java, ...).[3]
- "OpenRTM.NET" is an implementation of the RT-middleware for the .NET Framework platform.[4]
- "RTM on Android" is an implementation for Android OS that can communicate with OpenRTM-aist.[5]
- "RTC Lite" is a modified RT-middleware for embedded or resource-constrained systems such as PIC (Microchip), ARM, H8 (Renesas), and so on.[6]
- "RTM Safety" is a commercially available implementation of RTM designed to comply with the IEC 61508 standard.[7]
Urbi
Urbi is an open source cross-platform software platform in C++ used to develop applications for robotics and complex systems. It is based on the UObject distributed C++ component architecture. It also includes the urbiscript orchestration language which is a parallel and event-driven script language. UObject components can be plugged into urbiscript and appear as native objects that can be scripted to specify their interactions and data exchanges. UObjects can be linked to the urbiscript interpreter, or executed as autonomous processes in "remote" mode, either in another thread, another process, a machine on the local network, or a machine on a distant network.
MIRO
- System organization/decomposition - MIRO uses an abstract machine model, i.e. the system is divided into several distinct layers, as depicted in figure 3. The higher layers can only access the lower layers via their interfaces. In the case of MIRO, these layers are:
- MIRO device layer - this layer provides classes that interface with the hardware and abstract the low-level hardware details. These classes enable access to hardware resources via simple procedure calls.
- MIRO service layer - this layer provides service abstractions for sensors and actuators by means of the CORBA interface definition language (IDL). These services are implemented as network-transparent CORBA objects. The classes in this layer present the sensors and actuators as generic services. For example, the RangeSensor class defines functionality common to sensors that return range readings, such as sonars, lidars and other types of range finders.
- MIRO framework layer - this layer provides functional modules specific to robotics, such as mapping, localization and path planning.
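The three layers above can be illustrated with a small sketch. It does not use MIRO or CORBA: the service layer is modelled with a plain Python abstract base class instead of an IDL interface, and all names other than `RangeSensor` (`SonarDevice`, `LidarDevice`, `SonarService`, `LidarService`, `nearest_obstacle_m`) as well as the canned readings are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List


# Device layer: hypothetical hardware wrappers, each with its own raw format.
class SonarDevice:
    def read_raw_mm(self) -> List[int]:
        return [1200, 850, 3000]       # canned readings in millimetres


class LidarDevice:
    def read_raw_m(self) -> List[float]:
        return [1.25, 0.8, 2.5]        # canned readings in metres


# Service layer: one generic interface for every range-returning sensor.
class RangeSensor(ABC):
    @abstractmethod
    def ranges_m(self) -> List[float]:
        """Return the current range readings in metres."""


class SonarService(RangeSensor):
    def __init__(self, device: SonarDevice) -> None:
        self._device = device

    def ranges_m(self) -> List[float]:
        return [mm / 1000.0 for mm in self._device.read_raw_mm()]


class LidarService(RangeSensor):
    def __init__(self, device: LidarDevice) -> None:
        self._device = device

    def ranges_m(self) -> List[float]:
        return list(self._device.read_raw_m())


# Framework layer: robotics functionality written against the service
# interface only, never against a concrete device.
def nearest_obstacle_m(sensor: RangeSensor) -> float:
    return min(sensor.ranges_m())


sonar_nearest = nearest_obstacle_m(SonarService(SonarDevice()))
lidar_nearest = nearest_obstacle_m(LidarService(LidarDevice()))
```

The framework-level function works unchanged for both sensors because it depends only on the generic service, which is the point of restricting higher layers to the interfaces of lower ones.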
- Communication approach - For communication, MIRO relies on TAO, a C++ implementation of the CORBA standard. Subsystems/objects communicate according to a client-server model, which is one example of a distribution model. With respect to timing, both synchronous and asynchronous modes of communication are used. The system also supports event-driven communication, and both push and pull models are used. An overview of the layout and accessibility of each layer is depicted in figure 3.
- Platform/language support and other system features
[To be done]
- Fault tolerance and robustness aspects - MIRO does not provide any explicit fault-tolerance capabilities at the system level beyond those provided by the underlying middleware and the operating system (resource management, conflict resolution, etc.). It does add some exception handling of its own: there is a list of MIRO exceptions that indicate hardware problems, service call failures or malfunctions, and load problems. In addition, a so-called “logging service” with several levels of notification, usable during or after development, may improve the reliability of the software. To increase reliability and reduce the number of errors, partially automated code generation is provided: the IDL compiler generates all the code for communication and the underlying middleware services. Again, most of these features are not part of the SIS itself but rather of the facilities it relies on. The BAP (behaviors, action patterns, policy) framework can contribute to the robustness of applications based on MIRO. BAP proposes ways of combining simple behaviors to form complex ones. The principle used for creating complex behavior hierarchies is similar to that of a finite state machine (FSM), represented in figure 6. Action patterns, represented in figure 5, are composed of behaviors and “guards”, which can signal external events. The authors also claim that dynamic reconfiguration of policies is possible, which may further contribute to robustness. On the other hand, there is no indication of whether planning or learning capabilities can be implemented.
Wasp Actuator Sensor Protocol
[To be done]
[To be done]
OpenRDK
- System organization/decomposition -
- OpenRDK is a modular software framework focused on rapid development of distributed robotic systems. According to its developers, it was designed following users' advice and was used within their group for several years, in diverse applications with heterogeneous robots, before being released as open source.
- In OpenRDK the main entity is a software process called an agent. A module is a single thread inside the agent process; modules can be loaded and started dynamically once the agent process is running. The figure below shows an example: two agents are executed on two different machines and three modules run inside them. hwInterface retrieves data from a laser range finder and odometry from a robotic base; given this data, scanMatcher uses a scan-matching algorithm to estimate the robot positions over time; mapper uses the estimated robot positions, together with the laser scans, to build a map of the environment.
- An agent configuration is the list of which modules are instantiated, together with the values of their parameters and their interconnection layout. It is initially specified in a configuration file.
- Communication approach -
- Modules communicate using a blackboard-type object called the repository (see figure below), in which they publish some of their internal variables (parameters, inputs and outputs), called properties. A module defines its properties during initialization. After that, it can access its own and other modules' data, within the same agent or in other agents, through a global URL-like addressing scheme. Access to remote properties is transparent from the module's perspective; for local properties, access reduces to shared memory (OpenRDK provides built-ins for concurrency management).
- Special queue objects also reside in the repository and share the same global URL-like addressing scheme as other properties.
- In the figure below, the hwInterface module pushes laser scan and odometry objects into queues that are remotely accessed by the scanMatcher module, which in turn pushes the estimated poses into another queue for the mapper to access. Finally, the mapper updates a property that contains a map.
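The repository-based data flow described above can be sketched in a few lines. This is not the OpenRDK API: the `Repository` class, its method names, the `rdk://...` address strings and the placeholder "scan matching" are all assumptions made for illustration, and the three modules are called in sequence here rather than running as separate threads in separate agents.

```python
from collections import deque
from typing import Any, Deque, Dict, List

# Hypothetical URL-like addresses for two queues and one property.
SCANS = "rdk://agent1/hwInterface/out/scans"
POSES = "rdk://agent2/scanMatcher/out/poses"
MAP = "rdk://agent2/mapper/out/map"


class Repository:
    """Hypothetical blackboard: properties and queues keyed by URL-like strings."""

    def __init__(self) -> None:
        self._properties: Dict[str, Any] = {}
        self._queues: Dict[str, Deque[Any]] = {}

    def set_property(self, url: str, value: Any) -> None:
        self._properties[url] = value

    def get_property(self, url: str) -> Any:
        return self._properties[url]

    def push(self, url: str, item: Any) -> None:
        self._queues.setdefault(url, deque()).append(item)

    def pop_all(self, url: str) -> List[Any]:
        queue = self._queues.setdefault(url, deque())
        items = list(queue)
        queue.clear()
        return items


def hw_interface(repo: Repository) -> None:
    # Canned sensor data standing in for a laser range finder and odometry.
    repo.push(SCANS, {"odometry": 0.0, "ranges": [1.0, 1.2]})
    repo.push(SCANS, {"odometry": 0.5, "ranges": [0.9, 1.1]})


def scan_matcher(repo: Repository) -> None:
    for scan in repo.pop_all(SCANS):
        # Placeholder estimate: the odometry is passed through as the pose.
        repo.push(POSES, {"pose": scan["odometry"], "ranges": scan["ranges"]})


def mapper(repo: Repository) -> None:
    # The "map" is just a list of (pose, nearest range) pairs in this sketch.
    entries = [(p["pose"], min(p["ranges"])) for p in repo.pop_all(POSES)]
    repo.set_property(MAP, entries)


repo = Repository()
hw_interface(repo)
scan_matcher(repo)
mapper(repo)
result = repo.get_property(MAP)
```

Each module touches only the repository and a few address strings, never another module directly, so a module could be moved to a different agent by changing addresses alone, provided the repository hides the remote access as the text describes.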
OpenRDK also includes RConsole (RDK Console), which allows remote operation, inspection, parameter updates, etc.
References
- [1] Noriaki Ando, Takashi Suehiro, Kosei Kitagaki, Tetsuo Kotoku, Woo-Keun Yoon, "RT-Middleware: Distributed Component Middleware for RT (Robot Technology)", 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005), pp. 3555-3560, August 2005, Edmonton, Canada
- [2] Robotics Technology Component Specification version 1.1, Object Management Group (OMG)
- [3] OpenRTM-aist official website, http://www.openrtm.org/
- [4] SEC. Co., Ltd., SEC Robot Site, http://www.sec.co.jp/robot/download_rtm.html
- [5] SEC. Co., Ltd., SEC Robot Site, http://www.sec.co.jp/robot/download_rtm.html
- [6] SEC. Co., Ltd., SEC Robot Site, http://www.sec.co.jp/robot/download_rtm.html
- [7] SEC. Co., Ltd., RTM Safety, http://www.sec.co.jp/english/business/rtmsafety/index.html
See also
- RoSta: a European project reaching out to the robotics community to get clearer insights into robotics middleware and architectures.
- BRICs: a European project that attempts to establish best practices in robot development