Resilience in air transport - old version

From EngageWiki

Introduction

This section addresses the research theme “Resilience in Air Transportation”. This research theme is rather new both for the Air Transportation sector and for Complexity Science. The objective of this chapter is to develop a better understanding of Resilience in Air Transportation, as well as a complexity science perspective on this topic. This introduction focuses on the former. Section 5.1.1 addresses the general objective, Section 5.1.2 the definition of resilience in air transportation, and Section 5.1.3 its scope. The complexity science perspective is developed in Section 5.2 for the following three specific research lines of resilience in air transportation:

  1. Development of Resilience Metric.
  2. Top-down designing of Resilience.
  3. Modelling of resilience.

Within each of these research lines, different research challenges are identified and analysed. Finally, in Section 5.3 it will be illustrated how the Air Transportation resilience needs and complexity science approaches meet each other in specific Case Studies. The specific Case Studies are:

  • Mining historical data for assessment of resilience.
  • Agent-based modelling of hazards in air transportation.

General objective

The air transportation system is a complex socio-technical system that is constantly influenced by internal and external events. Every day, several times each day, and at different locations, the operation of the system is perturbed by disturbances of different nature and impact. These disturbances may interact with each other, potentially creating a cascade of adverse events that may span different spatial and time scales, ranging from affecting only one aircraft or crew up to a group of aircraft. In the current air transportation system, interacting disturbances usually have a small impact on the overall performance of the system, e.g. some flights are rerouted and some passengers are rescheduled. Besides that, events occur each day that do not fit within the procedures for which pilots and controllers are trained. Nevertheless, most problems are adequately solved. Due to this kind of resilience of commercial air transportation operations, almost all these events pass without any discomfort for passengers.

In some exceptional cases, however, the resilience of the air transportation system falls short, resulting in passenger discomfort. In some rare exceptional cases the discomfort is out of all proportion. There are two categories of such exceptional events: i) catastrophic accidents involving one or two aircraft; and ii) events that push the dynamics of the air transportation system far away from its point of operation and therefore dramatically affect the performance of the system. Examples of the latter are a terror attack causing the closure of air travel in a large area (e.g. 9/11 in 2001), a disease causing passengers to change their travel behaviour (e.g. SARS in 2003), or volcanic ash blocking air travel in a large area (e.g. Iceland volcano in 2010). Examples of the former are a fatal runway incursion (e.g. Linate runway collision in 2001), a fatal mid-air collision (e.g. Ueberlingen mid-air in 2002), and loss of control of an aircraft flying through a hazardous weather system (e.g. Air France crash in the Atlantic Ocean in 2009).

The examples above show a wide variety of consequences stemming from an event that escapes from the resilience of commercial air transportation operations. At the same time, these examples show that rare exceptional events also have important commonalities: each of these events involves both economy and safety aspects. The 9/11, SARS and volcano events happened as a result of precautionary safety measures (by authorities or by potential passengers) and implied large economic losses for airlines. The Linate, Ueberlingen and Air France accidents are examples of catastrophic outcomes in terms of passenger fatalities as well as hull losses. Because of the large economy and safety impacts, each of these rare exceptional events triggered in-depth studies towards a better understanding of why things happened and what can be learned from them for the further improvement of the air transportation system.

Learning from these rare exceptional events, which happened because resilience fell short, has played a key role in the evolution of the air transportation socio-technical system into the current one. However, these exceptional rare events have not been the only source of learning. An important complementary source of learning is formed by the many human operators who have daily experience in handling many situations that are not exactly covered by procedures. This means that human operators in the air transportation system have the possibility to learn from a much larger set of events, rather than only from catastrophic events.

The learning from incidents and accidents, as well as the more process-involved learning by human operators, has resulted in step-by-step improvements of the air transportation system. In practice this means that the resilience of the current air transportation operation has largely evolved from the learning experience of human operators, whereas the knowledge about rare unsafe events largely comes from safety analysis. This decoupled way of working has led to the extraordinary situation that commercial air transportation has become safer while, at the same time, system safety analysts have no objective approach for establishing which role resilience plays in realizing these high safety levels in combination with accommodating capacity, economy and environment requirements 24 hours a day and 365 days a year. Although we may believe that we have some qualitative understanding of resilience in the context of current air transportation, no quantitative results exist, and we are not able to assess whether system A is more resilient than system B.

This lack of objective insight regarding the embedding of resilience in the air transportation system makes it difficult for SESAR and NEXTGEN to implement resilience systematically in the design of the future air transportation system. Because resilience will remain crucial for the complex socio-technical air transportation system, the only way to escape from this restricted situation is to investigate systematically the role of resilience and how to design for it.

Definitions

Following Eurocontrol (2009), we adopt the following definition for resilience in air transportation: “Resilience is the intrinsic ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions.” Other definitions of resilience have been developed in the following four fields:

  • Ecology
  • Sociology
  • Organization Science
  • Safety Science

Ecology

Ecology has been one of the first fields of research where the concept of resilience has been successfully developed: see, for instance, the first works of Beddington et al. (1976) and Pimm (1984), up to more recent reviews of Gunderson and Pritchard (2002), or Berkes et al. (2003). In this context, resilience is defined as the capacity of an ecosystem to tolerate disturbance without collapsing into a qualitatively different state, which is controlled by a different set of processes. For instance, an ecosystem may be shocked by the entrance of a new animal, which may interact with the original species and change the availability of food and other resources inside the region; if the original species are able to adapt to the new conditions and thus survive, the ecosystem is defined as resilient. Therefore, a resilient ecosystem is characterized by its ability to withstand external influences or shocks and rebuild itself when necessary. When not only the moment of appearance but also the nature of the shocks is unexpected, they are usually called black swans (Taleb (2007)). Regarding ecosystems, black swans have three interesting characteristics: they are treated as outliers (that is, extremely rare events), they have a high impact on the system, and finally they are usually viewed as predictable in retrospect (Murawsky et al. (2009)). It is also interesting to note that these events also have positive effects: they modify not only the way the system works, but also the way we understand the system; in other words, they usually trigger a learning process.

Sociology

A complementary approach toward resilience has been developed in the social sciences, specifically in sociology (Berkes et al. (2003)) and human development (Luthar et al. (2000)). Here, resilience is defined as the dynamic process encompassing positive adaptation, i.e., leading to an improvement of the social and personal conditions of the individual, within the context of significant adversity. Implicit within this notion are two critical conditions: (1) exposure to significant threat or severe adversity; and (2) the achievement of positive adaptation despite major assaults on the developmental process. Although this definition may seem equivalent to the one presented above, social systems have an added capacity with respect to ecological ones: the human ability to anticipate and plan for the future. This ability allows human beings to act proactively and improve resilience before adverse events impact the system.

Organization Science

The third development of resilience concerns organizations. The building of resilient organizations has been proposed by Robb (2000), integrating ideas on the structure and dynamics of organizations that successfully survive and develop in complex and turbulent environments (Argyris and Schon (1996), Stace (1996)). In order to construct a resilient organization, the numerous parts or units composing its complex structure should be organized in two intermingled and integrated streams, which are referred to as the Performance system and the Adaptation system respectively. The Performance system is in charge of pursuing excellent performance in the short term, which is of course essential for the organization to survive in the market. While in classical organizations a Performance system is static, in the new view of resilient organizations the Performance system can be dynamically created and dissolved to respond to continuous changes in the environment. The management and coordination of this creation / dissolution process is performed by the Adaptation system. The Performance system forms the conservative part of the organization, trying to preserve the current status, whereas the Adaptation system is the reactive or even pro-active part that anticipates new market trends. Although both systems have to work together and have to be integrated in the whole organization, each of them should be characterized by a different set of architectures, skills and culture. A Performance system is oriented toward production and tasks, with clear procedures and analytical and rational thinking, in order to obtain a competitive advantage in the market in the short term. An Adaptation system, on the other hand, prioritizes innovation, experimentation and learning for the long term, focusing on the outside of the organization to detect uncertainties and changes in the environment.

Safety Science

In safety sciences, Hollnagel et al. (2006) introduced the concept of resilience engineering with the aim to address the human and organizational aspects well in the design of safety critical socio-technical systems. In a Socio-technical system, introduced in Emery and Trist (1960), there are complex interactions between humans, machines and environmental aspects. More recently, interaction of a socio-technical system with its environment has been identified as an essential ingredient (e.g. Badham et al. (2000)) of an open socio-technical system. Typically, these interactions are bidirectional: in order to be able to fulfil its objective in an ever-changing context, an open socio-technical system adapts to the environment and at the same time it influences that environment with its actions. In safety science it is commonly recognized that the established safety engineering approach falls short in adequately handling the challenges posed by the design of safety critical socio-technical systems, especially if open.

Scope

The scope of resilience in air transportation is large. In order to describe this in a systematic way, we explain the key dimensions, which are:

  • Multiple key performance areas (KPA’s)
  • Multiple human operators
  • Multiple stakeholders
  • Multiple time-scales
  • Multiple spatial layers
  • Emergent behaviour
  • Growing air traffic demands

Multiple key performance areas

The examples provided in the previous section show that the analysis of resilience involves at least the key performance areas economy and safety, and that improving one of the two may come at the cost of the other. This balancing aspect in the further improvement of the air transportation system also applies to the various other key performance areas in air transportation. For example, for an aircraft landing under significant wind conditions, the safety-preferred landing runway may differ from the noise-preferred landing runway. In such a situation the safety and environment KPA’s compete with each other. Another example is increasing robustness in flight scheduling to avoid delays, which may have a negative overall economic impact for airlines but may also have a positive effect for passengers because their chance of being on time is increased. Similar examples of competition exist for the other KPA’s; therefore resilience should be studied against the full spectrum of KPA’s in air transportation.

Multiple human operators

In air transportation, many interacting human operators and technical systems, functioning in different organizations at a variety of locations, work at the sharp end in assuring efficient and safe air transportation amidst various uncertainties and disturbances (e.g. delays, weather, system malfunctioning). Although procedures and regulations tend to specify and oversee sharp end working processes to a considerable extent, the flexibility of human operators appears to be essential for assuring efficient and safe operations in normal and rarer conditions (see FAA/Eurocontrol, 2011). In this way human operators play an essential role in the resilience of the socio-technical air transportation system. A good understanding of this human-invoked resilience is essential for the design of more automated and adaptive future air transportation. In line with the advances in automation of future air transportation, the roles and responsibilities of humans will change. Nevertheless, it is expected that the flexibility of human operators will remain essential for resilient performance of future air transportation.

Multiple stakeholders

Another challenge is that the analysis of resilience in air transportation typically involves multiple stakeholders. For example, both the 9/11 terror event and the Iceland volcanic ash event involved flight crews, air traffic controllers and passengers at the front level, and airlines and ATC centres plus several organizations at the management level. Similarly, the SARS event involved flight crews and passengers at the front level, and airlines and several organizations at the management level. The runway incursion event and the fatal mid-air collision involved flight crews and air traffic controllers at the front level, and airlines, ATC centres and some organizations at the management level. The Air France crash involved the flight crew at the front level, and their airline, the aircraft manufacturer and safety oversight at the management level.

Multiple time scales

As depicted in the air transportation resilience pyramid in Figure 5.1, one of the challenges of analyzing resilience in air transportation is that the relevant events extend along multiple time scales. These vary from normal activities, which may happen many times per flight hour, through acting upon potentially hazardous situations, which may happen once per thousand flight hours, to catastrophic accidents or large economic losses, which may happen once in a billion flight hours. The challenge is to identify and understand how interacting behaviour in the air transportation socio-technical system may influence resilience at various heights along the slope of the pyramid.

Figure 5.1 Air transportation resilience pyramid.

Multiple spatial layers

The air transportation socio-technical system is composed of a plethora of different elements, which are located on different scales and interact with each other. These scales have both a spatial and a temporal dimension; for instance, a conflict resolution covers a time scale of seconds and a spatial scale of a few nautical miles, while fleet planning may cover the whole European airspace over multiple days. While any external observer may easily identify these scales, it should also be noted that the elements of the air transportation socio-technical system are organized in more abstract layers, each one contributing to the dynamics of the system. These may include an airport capacity layer, an airspace capacity layer, a weather layer, etc.

Emergent behaviour

Due to the interactions between various human operators, technical systems and procedures, the air transportation socio-technical system exhibits emergent behaviour. Some examples are the propagation of disturbances through the network, such as reactionary delays; the impact on system performance of failures of elements that may seem independent (but which interact indirectly and lead to important consequences); or, in a future context, the behaviour emerging from different agents making decisions in a collaborative environment. At the same time, one should be aware that current aviation also works thanks to the explicit use of emergent behaviour for the better. Examples of this are the various control loops that are working in current ATM, within each aircraft itself and also those formed by the interplay between the aircraft crew and each ATM centre on its path. There is no doubt that the role of control loops will only increase for advanced ATM. Unfortunately, intended positive emergent behaviour may also have undesired negative effects. Typically one should even expect that emergent behaviour that is not well understood will tend to have negative effects. Only once emergent behaviour is well understood can it be exploited for the better. Appendix B gives an overview of safety specific emergent behaviour aspects.

Growing air traffic demands

Air Transportation has experienced an important and fast evolution in the last decades, with a constant growth in the number of flights, aircraft and airports. The market itself has also changed significantly: from being composed of a small number of national airlines to the recent appearance of many companies with new business models. In this context, the optimization of common airspace resources, along with stricter safety regulations, has reduced the flexibility of some actors, as well as their capacity to react to a changing environment, in turn reducing the resilience of the system. Even the definition of the "normal" events of the air transportation socio-technical system is not a trivial problem, and this is worsened by the evolving nature of air transportation: an event may be extremely rare today, but not so rare tomorrow.

Research lines

Define a metric or another quantitative measure of the degree to which a system is resilient

Problem statement

Although resilience is well studied in Ecology, Sociology, Organization Science and Safety Science, it is a rather unknown area in air transportation, both in practical application and in research. The definition of resilience varies significantly across the mentioned research domains. In air transportation, resilience cannot be managed until it can be measured (“Count it or forget it”). Therefore, the main objective of this research line is to find an appropriate quantitative measure of resilience in order to manage it in the context of ATM.

To address this objective, Section 5.2.1.2 gives definitions of the term metric in mathematics, economics and software development, and shows how the terms metric and measure correspond. Section 5.2.1.3 presents the research challenges in defining a quantitative measure of resilience in air transportation.

Literature review

Resilience in air transportation is a very rich research topic, both in terms of outstanding challenges and in available analysis techniques. This definitely makes it a very interesting topic of research. Eurocontrol (2009) has proposed the following resilience definition for air transportation: “Resilience is the intrinsic ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions.”

However, it is not clear how resilience in air transportation can be measured. Obviously, we should find or define a metric or performance indicator that quantifies the operability of the air transportation system and that can help to evaluate its resilience. When we decide to use a metric for the evaluation of resilience, it has to be checked whether the metric itself and the properties of the metric and topological spaces induced by it are advantageous for the desired evaluation. Mathematically (Kolmogorov and Fomin, 1970), a metric is a distance function; that is, a metric defines the distance between elements of a set, for instance between the elements of a set of performance indicator values. If metric functions offer no advantage for the evaluation of resilience, a performance objective function that incorporates the above-mentioned performance indicators should be constructed and a performance goal should be specified. It should also be noted that each metric is a measure in the sense of a quantitative measurement, but not each measure is a metric.
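
The distance-function view can be illustrated with a minimal sketch that checks the metric axioms numerically on a small set of performance indicator vectors. The indicator values (average delay, cancellations) and all function names are hypothetical and serve only as illustration; the squared difference is included as an example of a quantitative measure that is not a metric.

```python
import itertools
import math

def euclidean(a, b):
    """Euclidean distance between two indicator vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def squared_gap(a, b):
    """Squared Euclidean distance: a measure of difference, but not a metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def satisfies_metric_axioms(points, d, tol=1e-9):
    """Check the metric axioms for d on a finite set of points."""
    for a, b in itertools.product(points, repeat=2):
        if d(a, b) < -tol:                      # non-negativity
            return False
        if (d(a, b) <= tol) != (a == b):        # d(a, b) = 0 iff a = b
            return False
        if abs(d(a, b) - d(b, a)) > tol:        # symmetry
            return False
    for a, b, c in itertools.product(points, repeat=3):
        if d(a, c) > d(a, b) + d(b, c) + tol:   # triangle inequality
            return False
    return True

# Hypothetical daily indicator vectors: (average delay in minutes, cancellations)
days = [(12.0, 3.0), (15.5, 1.0), (40.0, 10.0)]
print(satisfies_metric_axioms(days, euclidean))

# Three collinear points expose the triangle-inequality violation of squared_gap.
line = [(0.0,), (1.0,), (2.0,)]
print(satisfies_metric_axioms(line, squared_gap))
```

A check on a finite sample can only refute the axioms, not prove them in general; it is meant as a quick screening of candidate distance functions.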

In the literature the term performance metric is often used (Deru and Torcellini, 2005). This leads to confusion, since it has a similar meaning to performance indicator. By a performance indicator we understand a measurable quantity that indicates some aspect of performance. The term metric is also used in economics in the sense of a business metric (Brue, 2002), which is a unit of measurement that provides a way to objectively quantify a process. Any measurement that helps management understand its operations might be a business metric: number of products completed per hour, percent of defects from a process, hours required to deliver a certain number of outputs or provide a service, and so on.

The term “metric” is not only defined in the context of ATM. IEEE (Std 610.12 from 1990) defines metric in the context of Software Engineering not as a distance function, but as a measure:

metric: A quantitative measure of the degree to which a system, component or process possesses a given attribute.

When we are talking about measures we are talking about scales and units. If we abstract from the units we can reduce the huge number of scales to only a few scale types (Ludewig and Lichter, 2010, sect. 1.8). The work of Stevens (1946) seems to be the original source for scale types. A scale type defines all constraints and attributes which are valid for all scales of this type. If all operations allowed on a scale A are also allowed on a scale B, and B defines an operation not allowed on A, then we say that scale B is stronger than scale A. This definition induces a strict order on the scale types:

  1. Nominal scale: This is a projection into an unordered set, e.g. the set of all callsigns. No order, not even a partial order, is defined between the elements of the set. We can sort the callsigns alphabetically, but this order is independent of the semantics of the elements.
  2. Ordinal scale: In this case we have an ordered set (e.g. wake vortex classes: light, medium, heavy, super-heavy), but no distances are defined between the elements. The median is defined for this type. Another example is the sequence of arrivals at runway 25R.
  3. Interval scale: This scale is stronger, because the difference between elements is defined, e.g. the arrival times of the sequence of arrivals at runway 25R form an interval scale. The difference between two arrival times is defined. The average value is defined. This, however, is not a ratio scale, because we use an arbitrary origin.
  4. Ratio scale (also called rational scale): In addition to the properties of the interval scale we have a defined (not arbitrary) origin, e.g. the flight time of a flight from Madrid to Frankfurt. The ratio between different elements is defined.

The absolute scale is also a ratio scale: the number itself is the measure. Normally the absolute scale is only defined for natural numbers. Therefore the absolute scale is not stronger than the ratio scale, because the ratio between arbitrary natural numbers is not defined within the natural numbers. In this reading the median is defined on an absolute scale, but the average is not.
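
The ordering of the scale types can be sketched as a lookup of which statistics are meaningful on each type. The table below follows the text for median, average and ratio; the entry for the mode and all names are assumptions added for illustration, and the ratio (rational) scale is labelled "ratio".

```python
import statistics

# Statistics that are meaningful on each scale type; each stronger scale
# inherits everything allowed on the weaker ones. The "mode" entry is an
# assumption added for illustration and is not taken from the text.
PERMITTED = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean", "ratio"},
}

def is_permitted(scale, statistic):
    """Return True if the statistic is meaningful on the given scale type."""
    return statistic in PERMITTED[scale]

# Ordinal example from the text: wake vortex classes in increasing order.
WAKE_ORDER = ["light", "medium", "heavy", "super-heavy"]

def ordinal_median(values, order):
    """Median of ordinal values; the lower median keeps the result a valid class."""
    ranks = sorted(order.index(v) for v in values)
    return order[statistics.median_low(ranks)]

# Hypothetical arrival sequence, reduced to wake vortex classes.
arrivals = ["medium", "heavy", "medium", "light", "super-heavy"]
print(ordinal_median(arrivals, WAKE_ORDER))
print(is_permitted("ordinal", "mean"))
```

Guarding computations with such a lookup is one way to avoid mixing scale types by accident, for example averaging wake vortex classes.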

Research challenges

Definition of a quantitative measure of resilience in air transportation.

The Air Transportation System is constantly influenced by internal and external events. Every day, several times each day, and at different locations, the operation of the system is perturbed by disturbances of different nature and impact. These disturbances may interact with each other, potentially creating a cascade of adverse events that may span different spatial and time scales, ranging from affecting only one aircraft or crew up to a group of aircraft. Finding a quantitative measure of resilience that captures and estimates the influence of all relevant internal and external events on the Air Transportation System is the main research challenge of this research line. The specification of these relevant events, and the estimation of possible but not yet known ones, plays the key role in the definition of a resilience measure. Hence, another challenge is the evaluation of the constructed quantitative measure, especially in the case of hypothesized events.

With respect to the different scale types, we currently do not know which scale type we will be able to define. The most suitable is of course a ratio (rational) scale. Having an interval scale would, however, already be more than we have today. We must be careful not to mix different scale types. If an ordinal scale is involved, arithmetic on the values is not meaningful. This restriction is very cumbersome if we want to compare our resilience measure with already known ATM key performance indicators.

Different definitions will be found to measure or quantify resilience and robustness. These “metrics” will be applied to different test data sets and to situations from historical data where we already have a feeling that the ATM system was resilient in one case and not in another.

Resilience as a capability to understand perturbations, Modelling of Resilience in Air Transportation

Problem Statement

While the definition of resilience as the ability to withstand shocks and rebuild itself when necessary is widely accepted, different definitions are adopted in different research fields. Most of the definitions lead to a "bottom-up" research strategy, since they try to set up the basic design elements a system needs in order to be considered resilient. When best practices or design methodologies are defined for a system to be resilient, the outcome of those research lines is a set of principles for re-designing the system.

However, some systems present certain design constraints that make them very difficult to change. Sometimes regulatory issues or organisational complexities require a different approach to resilience research: instead of analysing the resilience of a system through a bottom-up approach, the behaviour of the system in response to certain inputs could lead to a definition of resilience for that system. This is especially true in the case of (complex) systems that are not completely known, or of systems composed of such a quantity of elements and interactions that the construction of a complete map of their structure is not viable (think, for instance, of the brain).

In particular, the study of the response of the ATM system to "perturbations" could provide powerful insights into how resilient the system is. Those perturbations are sometimes a consequence of external factors (e.g. a drastic reduction of local capacity due to weather), and a network analysis of how those external perturbations propagate throughout the system would provide a definition of how resilient the system is against those perturbations.

However, the perturbations could also be due to an internal misalignment of some of the characteristics of the system. For instance, a capacity-demand imbalance would lead to local capacity regulations that would surely lead to perturbations, manifesting in the form of additional delay. The capacity of the system to absorb those imbalances would therefore be a proxy of how resilient the system is.
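
How a capacity-demand imbalance turns into additional delay can be sketched with a very simple deterministic queue. This is an illustrative assumption, not a model proposed by the source: flights that cannot be served in one hour are carried over to the next, and every flight still waiting at the end of an hour accrues one hour of delay. The hourly figures and the function name are hypothetical.

```python
def backlog_delay(demand, capacity):
    """Return (hourly backlog, total delay in aircraft-hours) for a
    deterministic queue in which unserved flights carry over to the next hour."""
    backlog = 0
    backlogs = []
    for d, c in zip(demand, capacity):
        # Flights waiting at the end of the hour; never negative.
        backlog = max(0, backlog + d - c)
        backlogs.append(backlog)
    # Each flight waiting at the end of an hour accrues one hour of delay.
    return backlogs, sum(backlogs)

# Hypothetical hourly demand and capacity; capacity drops in hours 2 and 3.
demand = [30, 40, 45, 35, 25, 20]
capacity = [40, 40, 30, 30, 40, 40]

backlogs, total_delay = backlog_delay(demand, capacity)
print(backlogs)      # [0, 0, 15, 20, 5, 0]
print(total_delay)   # 40
```

In this reading, a system that clears the backlog sooner, or accumulates fewer aircraft-hours of delay for the same imbalance, absorbs the perturbation better.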

It is important to mention that while this approach could be taken as a "black box" research approach, the study of the propagation of those perturbations, like spatial propagation, temporal propagation or time needed for a perturbation to be absorbed, could also provide design guidelines that, while they would be far from a bottom-up perspective, might nevertheless lead to important re-design principles.
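
The temporal propagation of a perturbation, and the time needed to absorb it, can be sketched for the reactionary delay of a single aircraft rotation. The model is a simplifying assumption made for illustration: each turnaround has a fixed schedule buffer, and the delay passed on to the next leg is whatever the buffer could not absorb. The buffer values and function names are hypothetical.

```python
def propagate_delay(initial_delay, buffers):
    """Propagate a primary delay (minutes) along consecutive legs of one
    aircraft rotation; buffers[i] is the slack available before the next leg."""
    delays = [initial_delay]
    for slack in buffers:
        # The next leg inherits only the delay the buffer could not absorb.
        delays.append(max(0, delays[-1] - slack))
    return delays

def legs_to_absorb(delays):
    """Index of the first leg that operates without delay, or None."""
    for leg, delay in enumerate(delays):
        if delay == 0:
            return leg
    return None

# A hypothetical 45-minute primary delay and five turnaround buffers.
delays = propagate_delay(45, [10, 15, 5, 20, 10])
print(delays)                   # [45, 35, 20, 15, 0, 0]
print(legs_to_absorb(delays))   # 4
```

The number of legs until the delay vanishes is one possible stand-in for the absorption time mentioned above; spatial propagation would additionally require tracking which airports the affected legs connect.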

Literature Review

As already introduced, a top-down approach to resilience analysis may be especially useful for the study of those systems that are too complex (in the sense of the number of elements and interactions composing them) for constructing a map of them that would allow the application of a bottom-up analysis. It is not surprising, therefore, that two of the fields where such a technique has been used for decades are psychology and neuroscience.

In psychology, resilient people have been described as having different characteristics, ranging from patience and tolerance of negative affect to a sense of humor. Yet how is it possible to define the resilience of a person exactly, in a quantitative way? The solution has been the creation of scales, and the analysis of the correlation of the resilience of subjects with some personal and social characteristics (Connor and Davidson, 2003; Parker et al., 1990).

In neuroscience, and in all its branches (like cognitive neuroscience or social neuroscience), a similar problem is found. As the structure of the human brain is too complex to be analyzed from a bottom-up perspective (in other words, it is not possible to know the function of a given neuron, except in some very special cases), a different perspective had to be developed. The objective here was two-fold. On the one side, to identify which part of the brain is responsible for performing a specific task, e.g., memory, text recognition, and so forth; on the other side, to understand the resilience of the brain, as its capacity to keep performing these tasks even when part of the responsible region is damaged. The approach followed was the analysis of available data about people who had suffered an injury in a specific region of the brain: the abilities lost were detected, as well as the relation between the severity of the injury and the loss of the ability. The two most notable examples are the discovery of Broca's area, linked to speech production (Dronkers et al., 2007), and the hippocampus, whose damage is related to retrograde amnesia.

Coming back to air transport, the resilience of the AT system to adverse weather is probably the field where the top-down approach has been most fruitful. The reasons lie in the complexity of weather phenomena and of the relationships between weather and ATM, which make a bottom-up approach unfeasible. There are several examples of research along this line; among them, Callaham et al. (2001) define an index to assess the impact of weather on en-route and terminal efficiency, and Sridhar and Swei (2006) analyze the impact of weather on delays in the National Airspace System.

Another important element in Air Traffic Management is the effect of controller workload on the safety of operations. Here again, developing an exact model connecting these two elements is a complex task, and a top-down approach, involving the analysis of real data about the actions of ATC operators, is one of the most promising options. Among others, Majumdar and Ochieng (2002), Averty et al. (2004), and Lee (2005) analyzed the relationships between different factors affecting controller workload and air sector capacities.

In all these examples, the system under analysis (whether a natural one or the air transport system) is studied by means of a top-down approach, involving the analysis of real data of the system under nominal conditions and under the influence of some kind of perturbation; the result is an assessment of the resilience of the system, which may give hints for strategies to improve it.

Research Challenges

Defining nominal values and perturbations. The task of analyzing the propagation of perturbations in the air transport system starts by defining quantitative metrics for perturbation assessment, which, in turn, requires the definition of a nominal value. This first step is far from trivial. For instance, suppose we want to define the nominal capacity of an airport; while we could use information about the design of the system (e.g., the theoretical capacity of a runway), this value would not account for all the constraints acting on the system (e.g., procedures, weather), and the resulting estimate would be unrealistic. Furthermore, in some cases such a nominal value may not be pre-defined. This is the case, for instance, of passenger mobility, which is not included as a design parameter of the system, but instead emerges from the interaction of its constituent elements. Once this nominal level has been established, it is necessary to define the perturbation as an increase (or decrease) relative to the nominal value. Yet no objective way of defining this perturbation has been found in the literature, and this step is usually left to the experience of the researcher.
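One data-driven way to make these two definitions concrete, offered only as a sketch: take the nominal value to be a robust statistic of historical observations (here the median) and express a perturbation as the relative deviation from it. The choice of the median, the function names and the sample figures are assumptions made for illustration; the literature cited here does not prescribe them.

```python
from statistics import median


def nominal_value(history):
    """Estimate a nominal level as the median of historical observations (assumed choice)."""
    if not history:
        raise ValueError("history must not be empty")
    return median(history)


def perturbation(observed, nominal):
    """Relative deviation of an observation from the nominal level (negative = below nominal)."""
    return (observed - nominal) / nominal


# Hypothetical hourly arrival counts at one airport; the value 22 stands for a disrupted hour.
hourly_arrivals = [38, 40, 41, 39, 40, 22, 40, 42]
nominal = nominal_value(hourly_arrivals)
drop = perturbation(22, nominal)
print(f"nominal = {nominal}, perturbation = {drop:+.0%}")
```

Using a median rather than a theoretical capacity means the estimate already reflects the procedural and weather constraints present in the data, at the price of depending on which historical period is taken as representative.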

Data gathering to support resilience assessment. While this task may seem simple, several barriers hinder its execution. Firstly, not all information is publicly available for research purposes; on the contrary, most of the required information is considered sensitive, for commercial and safety reasons. Secondly, even if all relevant data are gathered, a further and critically important step is the pre-processing of this information, in order to ensure its consistency and representativeness, and to identify and isolate abnormal events.

Investigate cause-effect relationships. The difficulty of identifying cause-effect relationships lies in the intrinsic non-linearity of the air transport system. First of all, one must resist the temptation to rely on expert judgment in this task, as the relevant relationships are exactly those not foreseen by an expert; data mining thus appears to be the most suitable solution. It is also important to go beyond simple correlations, as they may not represent a truly causal relationship. Several techniques drawn from the study of complex systems have been highlighted in Section 3; yet the reader should note that most of them have been developed for the study of univariate time series, and are not directly applicable to more complicated situations (for instance, the detection of causality between the structure of trajectories and safety levels).
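The warning about simple correlations can be illustrated with a small synthetic experiment. In the sketch below, two delay series are both driven by a shared weather signal and never influence each other, yet their correlation is high; once the common driver is removed, the correlation vanishes. All series, coefficients and names are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)
# Hypothetical common driver (e.g. a regional weather index).
weather = [rng.gauss(0, 1) for _ in range(500)]
# Delays at two airports: each depends on the weather only, not on the other airport.
delay_a = [w + 0.3 * rng.gauss(0, 1) for w in weather]
delay_b = [w + 0.3 * rng.gauss(0, 1) for w in weather]

r_raw = pearson(delay_a, delay_b)

# Remove the common driver and correlate what is left.
resid_a = [d - w for d, w in zip(delay_a, weather)]
resid_b = [d - w for d, w in zip(delay_b, weather)]
r_resid = pearson(resid_a, resid_b)

print(f"raw correlation: {r_raw:.2f}, after removing weather: {r_resid:.2f}")
```

In real data the common driver is rarely known in advance, which is one reason the causality-oriented techniques mentioned above are needed.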

Modelling of Resilience in Air Transportation

Problem statement

The socio-technical air transportation system shows a plethora of emergent behaviours, which may be classified as desired or undesired, typical or rare, risky or non-risky, and known or unknown. The key challenge is to learn to understand emergent behaviour and to use this knowledge in design strategies that mitigate undesirable emergent behaviours and promote desirable ones. To this end, the analysis of resilience in the current and future socio-technical air transportation system needs modelling approaches that account for the various key dimensions identified in the previous section. Implicitly, this means that there will be no silver bullet addressing all dimensions; instead, multiple complementary approaches need to be explored.

Literature Review

As has been explained in the introduction, the study of resilience from a complexity science perspective is quite recent. This means that it is not yet clear which techniques are most useful in modeling of resilience in air transportation. Part of the research is to identify which techniques are most useful. In view of this, the aim of this section is to give a rather broad overview of complexity science techniques that are of potential use in the modelling of resilience in current and future air transportation. Specific tools and techniques that may be of value in this context are:

  • Static analysis. The aim of static analysis is to identify topological (i.e., structural) metrics of the graph that are of special significance for the resilience of the system. For the analysis of a multilayer resilience graph, the most relevant are topological efficiency and the clustering coefficient. Topological efficiency is defined as the average of the inverse lengths of the shortest paths connecting all pairs of nodes (i.e., the reciprocal of the harmonic mean of those lengths). The higher this value, the easier it is to move from one node of the network to another; in other words, a high efficiency describes a system in which a perturbation can easily propagate and affect different parts of it.
  • Out-of-equilibrium physics (Van Vliet, 2008). Understanding the general characteristics of systems that exhibit behaviours close to criticality is crucial for designing management strategies to improve their performance. Several examples can be found in the literature, including the power transmission network (Carreras, 2004a; Carreras, 2004b), the street network of a city (Jiang, 2009), and more theoretical systems like sand piles. Nevertheless, little effort has been dedicated to applying this branch of physics to ATM;
  • Complex Networks (Boccaletti, 2006; Newman, 2003). This mathematical framework has been widely used in the last decade to understand the hidden topology of relations between elements of a large number of natural and man-made systems. Applications range from networks of personal relations to yeast genes, the Internet, and relations between economic agents. Complex Networks can help in understanding the relations between different agents of the ATM system at several levels: from interactions between aircraft and safety, to networks of flights connecting different cities through different airlines;
  • Swarm intelligence, natural and spatial computing (DeCastro, 2006). In the coming years, ATM will go through important changes: the responsibility for many decisions will be decentralized to aircraft or airports, which will create a need for more coordination and collective awareness. Such changes may lead to undesirable emergent behaviours, which should be predicted and managed to ensure the safety of the system;
  • Viability theory (Aubin, 1991) was originally developed to study dynamical systems that collapse or badly deteriorate if they leave a given subset of the state space. The objective is therefore to keep the system in the part of the state space where it can survive, i.e. where it is viable. In follow-up research (Aubin et al., 2002), viability theory has been extended to hybrid dynamical systems. More recently, Martin et al. (2011) have explained that viability theory provides a natural mathematical framework for the modeling and analysis of resilience in complex systems.
  • Agent-based simulations and Multi-Agent Systems. For the modelling and analysis of sociotechnical systems, it has become common practice to adopt an agent-based modelling (ABM) simulation approach. Bonabeau (2002) captures the benefits of ABM over other modelling techniques in three statements: (i) ABM captures emergent phenomena; (ii) ABM provides a natural description of a system; and (iii) ABM is flexible. It is clear, however, that the ability of ABM simulation to deal with emergent phenomena is what drives the other benefits. In Burmeister (1997) it is further argued that multiple interacting agent models are suited to domains that are functionally or geographically distributed into autonomous subsystems, where the subsystems exist in a dynamic environment and interact more flexibly. This makes ABM simulation a logical choice for the evaluation of advanced ATM designs. For example, Shah et al. (2005) showed that ABM simulation offers the capability to integrate cognitive and technology models with a description of their operating environment. Simulation of these individual models acting together can predict the results of transformations in procedures and technology. This emergent behaviour typically cannot be foreseen and evaluated by examining the behaviour of the individual agents alone.
  • Human performance modelling. In modelling a socio-technical system, a key element is to capture human agents through a stochastic dynamical model. Within air transportation such human performance models have been developed; overviews and comparisons of these models are provided in (Corker et al., 2005), (Blom et al., 2005) and (Foyle and Hooey, 2008). Of these human performance modelling approaches, MIDAS has specifically adopted the agent based modelling (ABM) framework in order to include human directed situation awareness of the world. Within TOPAZ, ABM has explicitly been embraced (Stroeve et al., 2003) in order to extend the human directed situation awareness (SA) model of Endsley (1995a) to a multi-agent SA propagation model. This model covers both human and technical agents. The motivation for developing this extension was twofold: 1) Endsley (1995b) showed that more than 60% of the causal factors underlying aircraft accidents involving major air carriers in the USA involved problems with proper SA; and 2) our finding that many hazards identified through brainstorming with pilots and controllers could be properly modelled through such a multi-agent SA propagation model. The multi-agent SA model of (Stroeve et al., 2003) makes explicit that in a multi-agent system, SA propagates from one agent to another. This is comparable to the game of Chinese whispers, in which a message passes from one person to the next. Just as errors may creep into the whispered message without the participants noticing, errors may creep into the SA of agents in a multi-agent system without the agents noticing.
  • Reachability analysis. In an ABM simulation, safety-critical events can be defined as events where the joint state of the simulated agents involved hits a certain subset of their joint state space. In systems theory, the estimation of the probability of reaching a given subset of the state space within a given time period is known as a problem of probabilistic reachability analysis, e.g. (Kurzhanski and Varaiya, 2002). Because of the huge dimensionality of a multi-agent model of a complex sociotechnical system, existing probabilistic reachability approaches, e.g. (Prandini and Hu, 2006), fall short. In safety-critical industries (e.g., nuclear, chemical), reachability analysis is addressed by methods known as dynamical approaches towards probabilistic risk analysis (PRA). For an overview of these dynamical methods in PRA, see (Labeau and Swaminathan, 2000). These dynamical PRA methods make explicit use of the fact that between two discrete events the dynamical evolution satisfies an ordinary differential equation. In stochastic control theory these are known as piecewise deterministic Markov processes (Davis, 1993), (Bujorianu and Lygeros, 2003). For proper safety modelling of air traffic operations, however, it is often necessary to incorporate Brownian motion in the piecewise deterministic Markov process models, e.g. to represent the effect of random wind disturbances on aircraft trajectories (Pola et al., 2003).
  • Generalised Stochastic Hybrid Process (GSHP). The class of systems which incorporates Brownian motion within piecewise deterministic Markov processes has been defined as a stochastic hybrid automaton (Bujorianu, 2004). Such an automaton has a hybrid state consisting of two components: a continuous valued state component and a discrete valued state component. The continuous state evolves according to a stochastic differential equation (SDE) whose vector field and drift factor depend on both hybrid state components. Switching from one discrete state to another is governed by a probability law or occurs when the continuous state hits a pre-specified boundary. Whenever a switching occurs, the hybrid state is reset instantly to a new state according to a probability measure which itself depends on the past hybrid state. Complementary dynamic and stochastic effects are induced by the interaction between the hybrid state components. A key quality of a stochastic hybrid automaton is that it generates a process named generalised stochastic hybrid process (GSHP), which satisfies the strong Markov property (Bujorianu and Lygeros, 2006), (Krystul et al., 2007).
  • GSHP generating Petri Nets. For the modelling of accident risk of safety-critical operations in the nuclear and chemical industries, the most advanced approaches use Petri nets as the model specification formalism, and stochastic analysis and Monte Carlo simulation to evaluate the specified model, e.g., [Labeau and Swaminathan, 2000]. Since their introduction as a systematic way to specify the large discrete event systems encountered in computer science, Petri nets have shown their usefulness for many practical applications in different industries, e.g. [David and Alla, 1994]. Various types of Petri net modelling have also found their way into reliability and safety applications, e.g. (Sadou and Demmou, 2009; Kleyner and Volovoi, 2009; Bouali et al., 2012; Ghazel, 2009).
  • Monte-Carlo simulations and Probabilistic Complex Networks. Uncertainty is always present in the Air Transportation System, due to both external causes (for instance, weather) and internal causes (instrument precision, equipment failures…). Monte-Carlo simulations are the standard way to account for this uncertainty when simulating the behaviour of a system. Other approaches are also available, such as adapting Complex Networks theories to non-deterministic analysis;
  • Stochastic Differential Equations (Oksendal, 2003) and Hybrid SDEs (Krystul et al., 2007). SDEs are widely used in physics and finance to describe processes with a stochastic (that is, not deterministic) part; they have the form of a differential equation in which one or more of the terms are related to some form of stochasticity, for instance white noise;
  • Hybrid Petri Nets, High-level Hybrid Petri Nets and Hybrid Automata (Wieting, 1996; Allam, 1996; David, 2001). Hybrid models make it possible to combine a continuous part, describing some physical process with continuous flows, with discrete logic and computation;
  • Bisimilarity. This refers to formally proven transformations of one formalism into another, e.g. (Van der Schaft, 2004). A bisimilarity transformation makes it possible to combine the specific theory and tools available for both formalisms. For the different classes of stochastic hybrid systems, bisimilarity relations have been developed by Everdij and Blom (2005; 2006; 2010). These bisimilarity relations are, for example, exploited in agent-based accident risk analysis (Blom et al., 2007).
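As a concrete illustration of the static analysis item in the list above, the sketch below computes topological efficiency for two small unweighted networks. It assumes the commonly used form of the metric, the average of 1/d(i, j) over all ordered pairs of distinct nodes, with unreachable pairs contributing zero. The two example networks and all function names are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def global_efficiency(adjacency):
    """Mean of 1/d(i, j) over all ordered pairs i != j; unreachable pairs contribute 0."""
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        dist = shortest_path_lengths(adjacency, source)
        total += sum(1.0 / d for target, d in dist.items() if target != source)
    return total / (n * (n - 1))


# Hypothetical 4-node networks: a hub-and-spoke layout and a simple chain.
star = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

print(f"star:  {global_efficiency(star):.3f}")
print(f"chain: {global_efficiency(chain):.3f}")
```

The hub-and-spoke layout scores higher than the chain, which matches the reading given in the list: shorter paths make it easier to move between nodes, and equally easier for a perturbation to spread.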

Research Challenges

Application of complexity techniques to resilience in air transportation. As has been explained in Section 5.1, resilience in air transportation has a very broad scope. At the same time, complexity science studies of resilience have only started recently. This combination makes a full study of the applicability of complexity techniques to resilience in air transportation demanding. Because such studies are so recent, one may nevertheless expect that there is low-hanging fruit that can be harvested in the short term by approaching the problem from a complexity science perspective. The key challenge then is to identify useful applications of this low-hanging fruit to the understanding of resilience in current and future air transportation. In analysing the behaviour of the air transportation system with the help of mathematical models, one should always be aware that the results obtained apply to the adopted model. By the very nature of a model, there will be differences between the model analysed and the true operation. The question then is how large these differences are and how much impact they have on the results obtained for the model. The effects of differences between model and reality should be assessed through sensitivity analysis and bias and uncertainty analysis.


Case Studies

Mining historical data for the assessment of resilience

Introduction

As has been explained in Section 5.2, the air transport socio-technical system is constantly influenced by internal and external events. Every day, several times each day, and in different locations at the same time, the operation of the system is perturbed by small disturbances. Even worse, these disturbances may interact with each other, creating a cascade of adverse events that may span different spatial and time scales, from affecting only one aircraft or crew up to a group of airways hit by a thunderstorm. At the same time, they usually have a small impact on the overall performance of the system, thanks to its resilience: aircraft and crews may be rescheduled, flights may be rerouted, and so forth. Yet this capacity to withstand perturbations is finite, and the situation is expected to get worse in the future. The optimization of common airspace resources, driven by increasing traffic, along with stricter safety regulations, has reduced the flexibility of some actors, as well as their capacity to react to a changing environment, in turn reducing the resilience of the system. As a result, for air transport to remain one of the pillars of our society, enabling the connectivity and mobility of European citizens, and to be both competitive with and complementary to other transportation modes, resilience should be clearly included in future ATM research and engineering. Network-based operational techniques to absorb non-ordinary and "black swan" events should be developed, aiming to retain acceptable performance metrics in any condition.

Problem Statement

One of the main problems to be faced in order to improve the resilience of the AT system is defining which events are "normal", and what impact abnormal events have on the dynamics of the system. Yet even the definition of which events are "normal", that is, taken into account in the design of the system, is not a trivial problem, and this is worsened by the ever-changing nature of ATM: an event may be extremely rare today, but not so rare tomorrow. Complex data analysis can help in this task by analyzing historical data. Relationships between different elements of the system can be assessed in terms of the effects that an abnormal behavior in one element can have on another. Such relationships can be represented in a network format. In turn, this would allow the identification of bottlenecks, or of critical elements, thus opening the door to innovative mitigation strategies and policies.

Analysis Phase

The initial data required for such an analysis would be historical ATM operational data. Initially, let us focus on the capacity of each airport, which can be mined from historical traffic data. Techniques like Granger Causality or Permutation Entropy Causality (see Section 3.2.1.2) can be used to detect relationships between the different airports of the network. In other words, we would be assessing whether a change in the capacity of airport A generates a change in the capacity of a second airport, B. When the relationships between all pairs of airports have been assessed (some of them will be present, while some pairs of airports may not be related at all), this information can be represented in a network format (see also Section 3.2.2.2). The following figure is an example of such a network representation.

Figure 5.2 Example of a network representation of resilience links.
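The pairwise assessment described in the analysis phase can be sketched with a minimal Granger-style test. The code below is an illustration under strong assumptions, not the procedure of any cited study: it uses a single lag, ordinary least squares, synthetic capacity series, and a hypothetical cut-off on the F statistic in place of a proper p-value. Stationarity checks and lag selection are left out.

```python
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(rows, y):
    """Residual sum of squares of an ordinary least-squares fit of y on the regressor rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((v - sum(b * ri for b, ri in zip(beta, r))) ** 2 for r, v in zip(rows, y))


def granger_f(cause, effect):
    """F statistic: does adding cause[t-1] improve the prediction of effect[t]?"""
    target = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    full = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - 3  # observations minus parameters of the full model
    return (rss_r - rss_f) / (rss_f / dof)


# Synthetic capacity deviations: airport B follows airport A with a one-step delay.
rng = random.Random(0)
cap_a = [rng.gauss(0, 1) for _ in range(300)]
cap_b = [0.0] + [0.8 * cap_a[t - 1] + 0.2 * rng.gauss(0, 1) for t in range(1, 300)]

f_ab, f_ba = granger_f(cap_a, cap_b), granger_f(cap_b, cap_a)
threshold = 10.0  # hypothetical cut-off; a real analysis would use the F distribution
links = {pair for pair, f in ((("A", "B"), f_ab), (("B", "A"), f_ba)) if f > threshold}
print(f"F(A->B) = {f_ab:.1f}, F(B->A) = {f_ba:.1f}, links = {links}")
```

Repeating the test for every ordered pair of airports yields the directed links of a network such as the one in Figure 5.2.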

Notice that the global (or macro-scale) analysis of the topology (i.e., structure) of the network can unveil relevant information about the resilience of the system as a whole. Specifically, in the previous figure, an event affecting airport A would spread to the other three airports, thus strongly affecting the dynamics of the whole system. In contrast, a perturbation in airport B would have limited impact. The same process can be expanded to cover several aspects of the operation. For instance, one may try to assess the impact of an adverse weather event on the capacity of airports. The following figure depicts such a situation.

Figure 5.3 Example of a network with two interacting layers.

Note that several interactions have been added. Firstly, there are links (relationships) across layers: in this case, a single meteorological phenomenon affects the capacities of different airports. There are also links between different nodes of the weather layer, representing situations in which the weather condition at one point (e.g., in a TMA) is related to the weather condition at another point (e.g., adjacent sectors). Again, the analysis of the system as a whole would provide information for fostering its resilience. Specifically, in this example, improving the equipment for bad-weather operations at airport A would result in smaller disturbances across the whole network.
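The reasoning about the two-layer network can be sketched as a reachability computation on a directed graph whose nodes carry a layer label. The topology below is hypothetical (the actual links of Figures 5.2 and 5.3 are not reproduced here); it only mirrors the qualitative description: airport A influences the other airports, airport B influences none, and a weather node acts on airport A.

```python
from collections import deque


def affected_nodes(links, source):
    """Return every node reachable from source by following directed influence links."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {source}


def airports_hit(links, source):
    """Names of the airport-layer nodes affected by a perturbation at source."""
    return sorted(name for layer, name in affected_nodes(links, source) if layer == "airport")


# Hypothetical two-layer network; each node is a (layer, name) pair.
links = {
    ("weather", "W1"): [("weather", "W2"), ("airport", "A")],
    ("weather", "W2"): [("airport", "C")],
    ("airport", "A"): [("airport", "B"), ("airport", "C"), ("airport", "D")],
    ("airport", "B"): [],
}

before = airports_hit(links, ("weather", "W1"))

# Assumed effect of better bad-weather equipment at A: the link W1 -> A disappears.
hardened = dict(links)
hardened[("weather", "W1")] = [("weather", "W2")]
after = airports_hit(hardened, ("weather", "W1"))

print("airports affected by W1 before:", before)
print("airports affected by W1 after: ", after)
```

In this toy network, hardening a single well-connected airport shrinks the set of airports reached by the weather event from four to one, which is the kind of insight the whole-network analysis is meant to provide.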

Agent-based modelling of hazards in air transportation

Introduction

Air Traffic Management (ATM) is a complex socio-technical system in which a large variety of human and technical agents interact with each other [Hollnagel and Woods, 2005]. Thanks to these interactions, the agents jointly cope in an intelligent manner with the various disturbances that may be caused by the environment. Resilience Engineering [Hollnagel et al., 2008, 2006] is the scientific discipline that studies the design of such intelligent socio-technical systems. Resilience indicates that operations and organisations are able to resist a wide variety of demands within their domains and thus should be able to recover from any condition in their domains that may disturb the stability of the operation or organisation. Hence, resilience engineering aims to address a wide range of nominal and non-nominal conditions. Resilience engineering has some common ground with hazard assessment. Nevertheless, there are also two significant differences:

  1. Resilience engineering emphasises much more the potential ways human agents in the joint cognitive system can respond in a flexible way to the various hazards, rather than assessing safety risks of these hazards.
  2. Focus of traditional hazard assessment (e.g., [Eurocontrol, 2004]) is on hazards that can be evaluated using linear causation mechanisms (e.g. fault/event trees); the consequence of which is that many human related hazards tend to fall out of sight.

The flexibility of human responses is especially important for responding well when the air traffic situation evolves into a condition for which the procedures are no longer unambiguous. From a resilience engineering perspective, this means that we should find out what these kinds of non-nominal conditions are and how humans anticipate the potential evolution from a nominal condition into a non-nominal one.

Problem statement

For a complex socio-technical system such as ATM, resilience engineering is at an early stage of development. In recent years, novel psychological model constructs have been studied to capture human cognition and its interaction with other joint cognitive system entities [Hollnagel et al., 2008, 2006]. A limitation of this work is the lack of a systematic approach to the modelling and simulation of all possible interactions in a complex socio-technical system. To support a more systematic analysis, the MAREA (Mathematical Approach towards Resilience Engineering in ATM) project aims to develop a mathematical modelling and analysis approach for resilience engineering in ATM. In the literature on modelling and analysis of complex socio-technical systems, agent-based modelling and simulation has emerged as a remarkably powerful approach. For this reason, the study of agent-based modelling of hazards in ATM is one of the main MAREA research streams.

Analysis phase

In a preparatory phase of this agent-based hazard modelling research, a large database of hazards in ATM has been compiled, including the ways in which pilots and controllers deal with them [Stroeve et al., 2011]. Subsequently, the agent-based modelling of hazards is organized in three phases.

During the first phase, 13 existing agent-based model constructs of the TOPAZ safety risk assessment methodology [Blom et al., 2001, 2006] are compared against the hazards in the database. This analysis indicates that 58% of the hazards in the ATM hazard database are modelled well, 11% are partly modelled, and 30% are not modelled [Stroeve et al., 2011]. It should be noted that within the TOPAZ methodology, the impact of unmodelled hazards on safety risk is evaluated using sensitivity and bias and uncertainty analysis [Everdij et al., 2006].

During the second phase, the same is done for existing agent-based model constructs developed by VU Amsterdam; 11 novel model constructs have been identified. The percentage of hazards found to be well modelled increased from 58% to 80%, the percentage partly modelled decreased from 11% to 7%, and the percentage not modelled decreased from 30% to 14%. This improvement was mainly due to the modelling of additional human performance-related hazards. In particular, the coverage of hazards related to pilot performance increased from 50% to 85%, and the coverage of hazards related to controller performance increased from 42% to 87%.

During the third phase, 14 complementary model constructs have been developed for the remaining hazards. These developments concerned human performance-related hazards as well as hazards in the ‘weather’ and ‘other’ clusters. The 38 model constructs from the three phases together are analysed with respect to the extent to which they model the various hazards in the database [Bosse et al., 2012]. The results indicate that the total set of model constructs is capable of modelling 92% of the hazards well and 6% partly, leaving only 2% not modelled.

In follow-up research on agent-based hazard modelling, the applicability of the 38 agent-based model constructs will be explored in more detail. To this end, the agent-based model constructs need to be tailored to the ATM domain, formalised and integrated. For example, when applying the model construct for handling of inconsistent information by a technical system, choices need to be made regarding the exact inputs and outputs that will be modelled. Such choices will generally depend on domain-specific aspects of the system under consideration. Regarding integration of the model constructs, we expect that this will lead to the modelling of safety-relevant scenarios that cannot be captured by individual model constructs alone. For example, bad weather in itself may not lead to a safety-relevant scenario, but in combination with an incorrect focus of the pilot’s attention (as modelled via the ‘goal-oriented attention’ model construct) it could. These types of safety-relevant scenarios can emerge when the separate model constructs are connected together and the global behaviour of the integrated system is studied (e.g., by simulation).

A validation of the agent-based model constructs will be pursued by performing ‘proof-of-concept simulations’, which qualitatively describe the ways in which hazards can evolve in ATM scenarios. The behaviour of the agent-based models will be evaluated for a second hazard set (as defined in [Stroeve et al., 2011]) and by having experts judge the plausibility of the resulting proof-of-concept simulations. Through a complementary study, a comparison will be made between our agent-based hazard modelling and the psychological modelling that has been triggered by [Hollnagel et al., 2008, 2005].

References

  • Allam M., Alla H., From Hybrid Petri Nets to Hybrid Automata, Technical Report 96.113, Laboratoire d’Automatique de Grenoble (Grenoble, France), (1996).
  • Aubin JP, 1991, Viability theory, Birkhauser, Basel
  • Aubin JP, Lygeros J, Quincampoix M, Sastry S, Seube N, 2002, Impulse differential inclusions: a viability approach to hybrid systems, IEEE Transactions on Automatic Control, Vol. 47, No. 1, January 2002.
  • Averty P, Collet C, Dittmar A, Athènes S, Vernet-Maury E, 2004. Mental Workload in Air Traffic Control: An Index Constructed from Field Tests. Aviation, Space, and Environmental Medicine, Vol. 75, pp. 333-341.
  • Bedau, M. (2002). Downward Causation and the Autonomy of Weak Emergence. Principia, Vol. 6, issue 1.
  • Berkes F., Colding J. and Folke C., 2003. Navigating social-ecological systems, Cambridge University Press
  • Blom, H.A.P., Bakker, G.J., Blanker, P.J.G., Daams, J., Everdij, M.H.C., and Klompstra, M.B. (2001). Accident risk assessment for advanced air traffic management. In G. L. Donohue & A. G. Zellweger (Eds.), Air Transport Systems Engineering, AIAA, pp. 463-480.
  • Blom HAP, Corker KM, Stroeve SH, 2005. On the integration of human performance and collision risk simulation models of runway operation, Proc. 6th USA/Europe Air Traffic Management R&D Seminar, Baltimore, USA, June 2005.
  • Blom HAP, Krystul J, Bakker GJ, Klompstra MB and Klein Obbink B., 2007. Free flight collision risk estimation by sequential Monte Carlo simulation. In: Cassandras CG, Lygeros J, editors. Stochastic hybrid systems; recent developments and research trends. CRC Press, Boca Raton, USA, pp. 249-281.
  • Blom, H.A.P., Stroeve, S.H., and Jong, H.H. de (2006). Safety risk assessment by Monte Carlo simulation of complex safety critical operations. In F. Redmill & T. Anderson (Eds.), Developments in Risk-based Approaches to Safety: Proc. of the 14th Safety-critical Systems Symposium, Bristol, Springer, UK.
  • Boccaletti S., Latora V., Moreno Y., Chavez M. and Hwang D.-U., (2006). Complex networks: Structure and dynamics, Physics Reports, Vol. 424, pp. 175-308.
  • Bonabeau, E., 2002. Agent-based modeling: methods and techniques for simulating human systems. Proc. of the National Academy of Sciences of the USA. Vol. 99, pp. 7280-7287.
  • Bosse, T., Sharpanskykh, A., Treur, J., Blom, H.A.P., and Stroeve, S. (2012). Library of Existing VU Model Constructs. Technical report D2.1 for the SESAR WP-E project MAREA.
  • Bosse, T., Sharpanskykh, A., Treur, J., Blom, H.A.P., and Stroeve, S. (2012). Modelling of Human Performance-Related Hazards in ATM. In: Air Transport and Operations – Proc. 3rd Int. Air Transport and Operations Symposium. IOS Press, Amsterdam.
  • Bosse, T., Sharpanskykh, A., Treur, J., Blom, H.A.P., and Stroeve, S. (2012). New Model Constructs for Hazard Coverage. Technical report D2.2 for the SESAR WP-E project MAREA.
  • Bouali M, Barger P and Schon W., 2012. Backward reachability of Colored Petri Nets for systems diagnosis. Reliability Engineering & System Safety, Vol. 99, pp. 1-14
  • Brue, G., 2002. Six Sigma for Managers, McGraw-Hill, N.Y.
  • Bujorianu ML and Lygeros J., 2003. Reachability questions in piecewise deterministic Markov processes. In: Maler O, Pnueli A, editors. Proc. of Hybrid Systems Computation and Control, Springer, Berlin, pp. 126-140
  • Bujorianu ML., 2004. Extended stochastic hybrid systems. In: Maler O, Pnueli A, editors. Proceedings of Hybrid Systems Computation and Control. Springer, Berlin, pp. 234-249.
  • Bujorianu ML, Lygeros J., 2006. Towards a general theory of stochastic hybrid systems. In: Blom HAP, Lygeros J, editors. Stochastic Hybrid Systems: Theory and Safety Critical Applications. Springer, Berlin, pp. 3-30
  • Burmeister, B., Haddadi, A. and Matylis, G., 1997. Applications of multi-agent systems in traffic and transportation. IEE Proceedings - Software Engineering, Vol. 144, pp. 51-60.
  • Callaham M.B., DeArmon J.S., Arlene M.C., Goodfriend J.H., Moch-Mooney D., Solomos G.H., Assessing NAS performance: normalizing the effects of weather, Proc. 4th USA/Europe Air Traffic Management R&D Seminar.
  • Carreras B., Lynch V., Dobson I. and Newman D., (2004) Complex dynamics of blackouts in power transmission system, Chaos: An Interdisciplinary Journal of Nonlinear Science, Vol. 14.
  • Carreras B, Newman D.E., Dobson I. and Poole A.B. (2004), Evidence for self-organized criticality in a time series of electric power system blackouts, IEEE Transactions on Circuits and Systems Part 1, Vol. 51, pp. 1733-1740.
  • Cheng V., Crawford L., Menon P., 2002. Air traffic control using genetic search techniques, Proc. 1999 IEEE Int. Conference Control Applications.
  • Comfort LK., Boin A. and Demchak CC., 2010. Designing Resilience: Preparing for Extreme Events, University of Pittsburgh Press
  • ComplexWorld, ATM White Paper/ Position Paper: Resilience and Adaptation in the context of Air Transportation, http://www.complexworld.eu/complexworld-network-position-paper/
  • Connor KM, Davidson JRT, 2003. Development of a new resilience scale: the Connor-Davidson resilience scale. Depression and Anxiety, Vol. 18, pp. 76-82.
  • Corker, K.M., Blom, H.A.P., & Stroeve, S.H. (2005). Study on the integration of human performance and accident risk assessment models: Air-MIDAS and TOPAZ. Proc. Int. Symposium Aviation Psychology 2005, Dayton (OH), USA, pp. 147-152.
  • David R and Alla H., 1994. Petri Nets for the modeling of dynamic systems - A survey. Automatica, Vol. 30, pp. 175-202.
  • David R, Discrete, Continuous and Hybrid Petri Nets, 2001, Springer.
  • Davis MHA, 1993. Markov Models and Optimization. Chapman & Hall, London, UK.
  • DeCastro L.N. (2006), Fundamentals of Natural Computing: Basic Concepts, Algorithms, and Applications, Chapman and Hall/CRC.
  • Deru, M. and Torcellini, P., 2005. Performance Metrics Research Project – Final Report, Technical Report, NREL/TP-550-38700
  • Dronkers NF, Plaisant O, Iba-Zizen MT, Cabanis EA, 2007. Paul Broca's historic cases: high resolution MR imaging of the brain of Leborgne and Lelong. Brain Vol. 130, pp. 1423-1441.
  • Endsley MR., 1995a. Toward a Theory of Situation Awareness in Dynamic Systems. Human Factors: The Journal of the Human Factors and Ergonomics Society. Vol. 37, pp. 32-64.
  • Endsley MR., 1995b. A taxonomy of situation awareness errors. In: Fuller R, Johnston N, McDonald N, editors. Human factors in aviation operations. Ashgate Publishing, Aldershot, UK, pp. 287-292
  • Erzberger H., Nedell W., 1989. Design of automated system for management of arrival traffic, NASA Technical Memorandum 102201
  • Eurocontrol (2004). Air navigation system safety assessment methodology. SAF.ET1.ST03.1000-MAN-01, edition 2.0.
  • Everdij MHC and Blom HAP, 2005. Piecewise deterministic Markov processes represented by Dynamically Coloured Petri Nets. Stochastics. 77:1-29
  • Everdij, M.H.C., Blom, H.A.P., and Stroeve, S.H. (2006). Structured assessment of bias and uncertainty in Monte Carlo simulated accident risk, Proc. 8th Int. Conf. on Probabilistic Safety Assessment and Management (PSAM8), May 2006, New Orleans, USA.
  • Everdij MHC and Blom HAP, 2006. Hybrid Petri Nets with diffusion that have into-mappings with generalised stochastic hybrid processes. In: Blom HAP, Lygeros J, editors. Stochastic Hybrid Systems: Theory and Safety Critical Applications. Berlin, Germany: Springer; p. 31-63
  • Everdij MHC and Blom HAP, 2010. Hybrid state Petri nets which have the analysis power of stochastic hybrid systems and the formal verification power of automata. In: Pawlewski P, editor. Petri Nets. Vienna, Austria: I-Tech Education and Publishing; 2010. p. 227-52
  • Eurocontrol, 2009. A white paper on Resilience Engineering for ATM. September 2009.
  • Folke C. et al., 2002. Resilience and sustainable development: building adaptive capacity in a world of transformations, A Journal of the Human Environment, Vol. 31, pp. 437-440.
  • Foyle, D.C., and Hooey, B.L. (Eds.). (2008). Human performance modeling in aviation. Boca Raton (FL), USA: CRC Press.
  • Ghazel M., 2009. Using stochastic Petri nets for level-crossing collision risk assessments. IEEE Transactions on Intelligent Transportation Systems, Vol. 10, pp. 668-677.
  • Gilbo E., 2002. Optimizing airport capacity utilization in air traffic flow management subject to constraints at arrival and departure fixes, IEEE Transactions Control Systems Technology.
  • Gilbo E., Kenneth H., 2000. Collaborative Optimization of Airport Arrival and Departure Traffic Flow Management Strategies for CDM, 3rd USA/Europe Air Traffic Management R&D Seminar, Napoli, Italy, 2000, June 13-16
  • Gilbo E., Smith S., 2011. Monitoring and Alerting Congestion at Airports and Sectors under Uncertainty in Traffic Demand Predictions, Air Traffic Control Quarterly, Vol. 19, pp. 83-113
  • Gluchshenko O., 2011. Dynamic Usage of Capacity for Arrivals and Departures in Queue Minimization, Proc. 2011 IEEE Int. Conf. on Control Applications (CCA), pp. 139-146
  • Gunderson LH., Carpenter SR., Folke C., Olsson P. and Peterson GD., 2006. Water RATs in lake and wetland social-ecological systems, Ecology and Society 11 (1), art. 16
  • Gunderson LH and Pritchard L., 2002. Resilience and the behavior of large-scale systems, Island Press
  • Helmke H., Hann R., Müller D., Uebbing-Rumke M., Wittkowski D., 2009. Time-Based Arrival Management for Dual Threshold Operation and Continuous Descent Approaches, Proc. 8th USA/Europe ATM R&D Seminar, Napa, California, USA
  • Heidt, A., and Gluchshenko, O. (2012). From uncertainty to robustness and system's resilience in ATM: a case-study. Air Transport and Operations Symposium 2012 (ATOS), Delft, June, 2012.
  • Hollnagel, E., and Woods, D.D. (2005). Joint cognitive systems: Foundations of cognitive systems engineering. CRC Press, Boca Raton (FL), USA.
  • Hollnagel, E., Nemeth, C.P., and Dekker, S. (2008). Resilience Engineering Perspectives, Volume 1: Remaining sensitive to the possibility of failure. Ashgate, Aldershot, England.
  • Hollnagel, E., Woods, D.D., and Leveson, N. (2006). Resilience engineering: Concepts and precepts. Ashgate, Aldershot, England.
  • Jackson S., 2010. Architecting Resilient Systems: Accident Avoidance and Survival and Recovery from Disruptions, Wiley
  • Jiang B., Liu C., 2009, Street-based Topological Representations and Analyses for Predicting Traffic Flow in GIS, International Journal of Geographical Information Science 23, (9).
  • Kleyner A and Volovoi V., 2009. Application of Petri nets to reliability prediction of occupant safety systems with partial detection and repair. Reliability Engineering & System Safety, Vol. 95, pp. 606-613
  • Kolmogorov, A.N. and Fomin, S.V., 1970. Introductory real analysis. Prentice-Hall, Inc., Englewood Cliffs, N. J., ISBN 0-486-61226-0
  • Krystul J, Blom HAP and Bagchi A., 2007. Stochastic hybrid processes as solutions to stochastic differential equations. In: Cassandras CG, Lygeros J, editors. Stochastic hybrid systems: Recent developments and research trends: CRC Press; p. 15-45.
  • Kurzhanski AB and Varaiya P., 2002. On reachability under uncertainty. SIAM Journal on Control and Optimization. Vol. 41, pp. 181-216.
  • Labeau PE, and Swaminathan S., 2000. Dynamic reliability: towards an integrated platform for probabilistic risk assessment. Reliability Engineering and System Safety, Vol. 68, pp. 219-254.
  • Lee PU, 2005. A Non-Linear Relationship between Controller Workload and Traffic Count. Proc. of the Human Factors and Ergonomics Society Annual Meeting.
  • Ludewig J., Lichter H., 2010. Software Engineering (in German), dpunkt.verlag, 2nd ed.
  • Luthar SS, Cicchetti D. and Becker B., 2000. The Construct of Resilience: A Critical Evaluation and Guidelines for Future Work, Child Dev. 71 (3): 543-562
  • Majumdar A, Ochieng WY, 2002. Factors Affecting Air Traffic Controller Workload: Multivariate Analysis Based on Simulation Modeling of Controller Workload. Journal of the Transportation Research Board 1788, pp. 58-69.
  • Martin S, Deffuant G, Calabrese JM, 2011, Defining resilience mathematically: from attractors to viability, In: Viability and resilience of complex systems, Eds: G. Deffuant and N. Gilbert, Springer, 2011.
  • Neuman F., Erzberger H., 1990. Analysis of sequencing and scheduling methods for arrival traffic, NASA Technical Memorandum, Moffett Field, California
  • Newman M.E.J. (2003), The structure and function of complex networks, SIAM Review 45, (2003) pp. 167-256.
  • Oksendal BK, 2003. Stochastic differential equations, Springer, 5th printing, 2010.
  • Parker GR, Cowen EL, Work W., Wyman PA, 1990. Test correlates of stress resilience among urban school children. Journal of Primary Prevention 11 (1), pp. 19-35.
  • Pola G, Bujorianu ML and Di Benedetto MD, 2003. Stochastic hybrid models: an overview with applications to air traffic management. Proc. IFAC Conf Analysis and Design of Hybrid Systems (ADHS). Saint-Malo, Brittany, France
  • Prandini M and Hu J., 2006. A stochastic approximation method for reachability computations. In: Blom HAP, Lygeros J, editors. Stochastic Hybrid Systems: Theory and Safety Critical Applications. Berlin, Germany: Springer
  • Rochlin GI, La Porte TR and Roberts KH., 1987. The Self-Designing High-Reliability Organization: Aircraft Carrier Flight Operations at Sea, Naval War College Review 40: 76-90
  • Sadou N and Demmou H., 2009. Reliability analysis of discrete event dynamic systems with Petri nets. Reliability Engineering & System Safety. 94:1848-61
  • SESAR consortium, 2007. The ATM Target Concept D3, DLM-0612-001-02-00
  • Shah, A.P., Pritchett, A.R., Feigh, K.M., Kalaver, S.A., Jadhav, A., Corker, K.M., et al., 2005. Analyzing air traffic management systems using agent based modeling and simulation. Proceedings of the 6th USA/Europe Air Traffic Management R&D Seminar, Baltimore, USA
  • Sridhar B, Swei SSM, 2006. Relationship between Weather, Traffic and Delay Based on Empirical Methods. Proc. ATIO Conference 2006.
  • Stevens SS (1946): On the Theory of Scales of Measurement. Science, Vol. 103, pp. 677-680.
  • Stroeve SH, Blom HAP, Bakker GJ., 2003. Multi-agent situation awareness error evolution in accident risk modelling. 5th USA/Europe Air Traffic Management R&D Seminar. Budapest, Hungary
  • Stroeve, S.H., Everdij, M.H.C., and Blom, H.A.P. (2011). Hazards in ATM: model constructs, coverage and human responses. Technical report D1.2 for the SESAR WP-E project MAREA.
  • Stroeve, S.H., Everdij, M.H.C., and Blom, H.A.P. (2011). Studying hazards for resilience modelling in ATM - Mathematical approach towards resilience engineering. In D. Schaefer (Ed.), Proc. of the SESAR Innovation Days 2011. Eurocontrol, Brussels.
  • Van Der Schaft AJ, 2004. Equivalence of dynamical systems by bisimulation, IEEE Transactions on Automatic Control, Vol. 49, pp. 2160-2172.
  • Van Vliet C.M. (2008), Equilibrium and non-equilibrium statistical mechanics, World Scientific Publishing.
  • Völkers U., 1990. Arrival planning and sequencing with COMPAS-OP at the Frankfurt ATC-Center, The 1990 American Control Conference, San Diego (CA).
  • Wang TC., Li YJ, 2011. Optimal Scheduling and Speed Adjustment in En Route Sector for Arriving Airplanes, Journal of Aircraft, Vol. 48, No. 2
  • Wu CL., 2010. Airline Operations and Delay Management, Ashgate Publishing Ltd.
  • Wieting R. (1996), Hybrid High-Level Nets, Proc. 1996 Winter Simulation Conf., Coronado (CA), pp. 848-855.

This project has received funding from the SESAR Joint Undertaking under the European Union’s Horizon 2020 research and innovation programme under grant agreement No 783287.