Modelling Wide Area Network Traffic of a High-Energy Physics Collaboration

Christoph von Praun - CERN / IT Division

Introduction

Today's work of large scientific collaborations depends on worldwide telecommunications and in particular on the worldwide disposability of large data volumes. This dependence on efficient telecommunication infrastructure is going to increase in the future, especially for research fields where large amounts of experimental data are involved. In the following, we particularly refer to such systems in the High-Energy Physics domain but research collaborations in other scientific domains are faced with similar requirements.

For computing in the context of the Large Hadron Collider at CERN in 2005 (LHC), the development process of future distributed data processing systems has produced several so-called computing models. A computing model in this context is understood as a general proposal for the architecture of a future large scale data processing system. The planning timescale is thereby 5 years and more. The models vary in their degree of worldwide distribution of data and processing capabilities. Feasibility studies of such models will be mainly based on detailed simulations. Unlike today's physics data analysis systems, the implementation and operation of future physics data analysis systems will be largely determined through the availability of wide area data communications.

The work presented in the following could be understood as a preliminary step towards the refinement and simulation of such computing models. The terms of reference can be defined as follows:

"Evaluate the network bandwidth consumption issued through multiple physicists in a non-trivial, routed network. The physicists work in different time zones and their wide area network utilisation profile is aligned to a working day."

The work was carried out during my stay at the Caltech HEP department in June 1998.

Background

The modelling and simulation of distributed computing systems has been investigated in CERN's Physics Data Processing group previously and a framework for the Simulation of Distributed Architectures (SoDA) has been developed. SoDA has been predominantly used for the simulation of local-scale distributed systems with a deterministic workload profile. The present studies should investigate if SoDA is also suited to address particular characteristics of wide area networks such as

The problem naturally suggests to separate the specification of users (who issue the workload) and network resources (which handle the workload). We will address both aspects separately in the following.

Previous studies of a similar scenario have been carried out by J. Bunn using the modelling tool ModNet.

Specification of the Users

The workload is understood as the entirety of data transfer requests that are addressed to a (dedicated) wide area network. The current model understands the total workload as mutual data exchanges among physics institutes, e.g. Caltech and CERN. A particular data exchange results thereby from a wide area utilisation profile of individual physicists. In the case of the given model, such an individual utilisation profile is characterised by 'wide area network tasks of an average physicist in 2005' as proposed by Stu Loken. A task represents a wide area network data transfer / session of an individual physicists related to a certain purpose. The following table lists some of the tasks and their characterising attributes:

Conferencing

Coffee Room

Seminar

Email

...

duration

2.0

0.5

0.4

2.0

[h]

max. bandwidth send

52.0

1000.0

200.0

-

[kbit/s]

max. bandwidth receive

460.0

1000.0

800.0

-

[kbit/s]

volume send

-

-

-

36.0

[Mbit]

volume receive

-

-

-

144.0

[Mbit]

priority send

1.0

1.0

1.0

1.0

[weight]

priority receive

1.0

1.0

1.0

1.0

[weight]

lots per day

2

1

1

60

[#]

start daytime

8.0

10.0

9.0

8.0

[daytime]

stop daytime

18.5

11.0

17.0

18.0

[daytime]

timezone

-8

-8

-8

-8

[GMT+x]

source

Caltech

Caltech

Caltech

Caltech

destination

CERN

CERN

CERN

CERN

Table1: Typical wide area network tasks and their attributes.

As a preliminary heuristic, the total data transmission requirements of a task of limited duration are characterised either through an upper bandwidth or a volume constraint. The execution of a task is limited to a certain period per working day (start daytime and stop daytime). The execution of the task can be sparsed over several lots per day (lots per day). Each lot accounts for an equal share of a the total requirements. The lots are uniformly distributed over the considered period of a day. If some requirement can not be fulfilled or can be only fulfilled with delay (i.e. time that exceeded stop daytime), the respective task collects statistics on this.

We have explained how tasks serve to model the behaviour of an 'average physicist'. In order to extrapolate the behaviour of multiple users, a workgroup is conceived as a set of users, an institute in turn is understood as a set of workgroups.The total workload issued per institute is thus the sum/overlap of individual workloads induced by workgroups and users.

WPE2D.GIF (12527 bytes)

Figure1: The component model for entities that issue/determine the workload. Components of this category are depicted red.

The structure and properties of entities who issue workloads are assumed to be static during a simulationl and are thus defined in terms of a SoDA component model. The aggregation of tasks to users, users to workgroups and workgroups to institutes is modelled through a component hierarchy as illustrated in figure 1. The hierarchy can be grasped as a composition of behaviour, i.e. the behaviour of an institute shall be composed by the behaviours of its workgroups etc.. Behaviour in this context is understood as the issuing of processes that represent the actual workload. According to the task descriptions in table 1, such behaviour can be characterised by the frequence, the volume of data emission and further parameters. In general, the behaviour of a component parameterised through its attributes. The attributes of a component instance are intialised at creation time through a configuration file. The following configuration file excerpt illustrates for example the intialisation information of the task name 'phy01cernCoffeeRoom':

...
[phy01cernCoffeeRoom]
duration             =   0.5     # [h/day]
bandwidthSend        =  1000.0   # [kbit/s]
bandwidthReceive     =  1000.0   # [kbit/s]
requirementsSend     =  -1.0     # [Mbit]
requirementsReceive  =  -1.0     # [Mbit]
granularity          =  1        # [#]
startTimeOfDay       =  10.0     # [local time]
stopTimeOfDay        =  11.0      # [local time]
...

Specification of the Network Resources

Network resources specify all constituents that jointly carry out the requests / workloads. The scenario presented here is a strongly simplified image of reality. It demonstrates nevertheless how SoDA modelling concepts can account for particular structural characteristics of wide area networks. We illustrate complex entities as clouds. In the given model, one of the clouds, namely the Network entity, is refined and an explicit component model is defined for it. The other clouds in the figure are characterised by standard components taken from the SoDA library. These components offer a standard behaviour that can be customised to a certain degree through parameters. These clouds can be refined if further investigation is required, e.g. if a bottleneck is suspected.

WPE29.GIF (5987 bytes)

Figure 2: Simplified scenario of a wide area network that connects High-Energy Physics sites

The logical network topology foresees full connectivity among the sites Caltech, CERN and Fermilab. Every logical end-end transfer crosses the Network entity. Within the Network entity, end-end connections are resolved and a physical transmission path is determined according to routing strategies. A physical path corresponds to a series of links (e.g. Washington Cern) and routers (e.g . Router cernusa).

The distinction between logical and physical structure of a wide area network, is represented by a hierarchy of components in the SoDA model. One component class is foreseen for each of the abstractions Network, Router and Link.

WPE2E.GIF (11368 bytes)

Figure 3: The component model of the wide area network indicates a hierarchical structure. Components that represent logical elements in the network featuring a high level of abstraction, can be found up in the hierarchy, e.g. the component instance 'wan' of class Network. The behaviour of this high-level component is composed by the behaviour of further sub-components that represent a lower level of abstraction. Theses are of class Router (e.g. instance 'cernusa') and of class Link (e.g. instance 'lkCernWas'). The components of class Router and Link in turn have their behaviour determined through a standard SoDA library component which is instanced from class SharedResource.

Interaction between Users and Network Resources

So far, entities that issue workloads and that handle workloads have been defined in component models. The actual workload as dynamic element is featured by the SoDA modelling concept of processes. The only entities that actually issue processes are those instanced from class Task. It should however be emphasized that the existence and intialisation information of a Task object depends on the hierarchy of Users - Workgroups and Institutes as discussed in the previous section. Thus also those objects of class User, Workgroup and Institute are in some sense 'indirectly' characterising the workload.

WPE28.GIF (12671 bytes)  

WPE36.GIF (6983 bytes)

Figure 4: The interaction between components that issue workload (red) and components the handle workload (blue) is modelled by processes.

Tasks do thus issue  processes of type TaskProcess in certain intervals in simulation time. The length of the interval is determined by a heuristic. The heuristic calculates the value from the attributes 'lots per day', 'start daytime' and 'end daytime'. In addition, the heuristic foresees that no TaskProcess may be created until the previous instance has finished execution (see process model at the right hand side of figure 3). The process model illustrates that a TaskProcess process is split into several sub-processes that execute concurrently on all components that belong to a certain network path. This heuristics of concurrent execution is feasible if the TaskProcess itself represents a streamed data transfer. For small data drops of data issued by an interactive session, a different heuristic may be necessary.

Simulation and Analysis

The behaviour of a SoDA model can be observed either during its evaluation or after a simulation run. The following examples of visualised result data illustrate either way.

WPE1B.GIF (8064 bytes)

Figure 5: The utilisation of a particular network link over 24 hours. One clearly recognises certain prominent, i.e. bandwidth intensive, tasks such as the Coffee Hour between 10 and 10:30 and the Virtual Reality session around 17:30 in the afternoon. The background data transfers during the night result from the Background Analysis task that was characterised by a 24 hours duration, small bandwidth requirements and a fine granularity. The averaged utilisation over the 24 hours period is in exact concordance with figures that can be easily calculated from the task descriptions. The ordinate in the figure indicates a probability [%], which results from the algorithm that generated the histogram after the simulation run. The y-scale can be easily (linearly) converted to utilisation figures or absolute bandwidth consumption.

WPE1D.GIF (44646 bytes)

Figure 6: The behaviour of a Router component as it is observed during the simulation of the model with the SoDA Performance Monitor.

The displayed segment illustrates the values of various observable attributes of the component during a working day such that the leftmost ticks of the graph represent the status at midnight. The straight black line illustrates the daytime which increases steadily up to 24 hours. In the simple model, a router is conceived as a backplane with limited capacity where all data transmissions pass through. The utilisation of this backplane is determined in intervals of 192 simulated seconds. The value is depicted as light blue graph. The capacity of the backplane (represented by a standard SoDA component of type SharedResource) has been laid out to a reasonable value such that utilisations up to 75% could be observed for a model with 4 physicists in working in two different time zones. The long-term average utilisation is indicated by the purple line. The brown line indicates the number of concurrent streams that induce traffic on the router. The green line indicates the absolute usability of the resource. As a heuristic, the router component decreases its bandwidth capabilities in case of sustained high utilisation. This heuristic should meet observations of packet loss and flow control mechanisms in transport protocols. It was used experimentally here and is subject to change.

Future Work

During the exercise, it turned out that modelling the wide area scenario at hand is mainly determined by two aspects:

Structural Aspect Structure in this context addresses the transfer, allotment and distribution of any kind of workload from the issueing component of the workload to the handling component. In the current model, the complex lifecycle of the workload is characterised in terms of SoDA processes that 'flow' from a hierarchy of load issueing components through a hierarchy of load handling components (see figure 7)
Qualitative Aspect The qualitative aspect of a model addresses how workload is created and handled. This is determined by heuristics that determine the behaviour of load issueing and load-handling components.

WPE2C.GIF (19664 bytes)

Figure 7: The structural aspect of a model: many tasks are effected (through workload processes) on many resources concurrently

It is demonstrated that the SoDA modelling approach can efficiently represent the structural aspect of modelling. A strategy for a continuation of this work would thus rather focus on the development of the qualitative aspect, i.e. the definition of heuristics.

The applied heuristics for tasks featuring application network protocol and users (being a collection of tasks that are pursued concurrently), are insufficient to represent and  understand today's observed network behaviour with the developed model. In literature, different approaches have been taken to match wide area traffic with formal methods. These approaches range from a strict empiric proceeding to a mathematically sound formalisation with different flavours of distributions. Initial approaches have been started to characterise the behaviour of a user who is confronted with a finite 'resource' with concepts from game theory.

In addition to heuristics for load issueing elements, it is feasible to implement non-trivial heuristics for workload handling components such as Router components. The observation of the Router component with the SoDA Performance Monitor already suggested a deterioration of its capability in case of lasting high utilisation. It should however be emphasised that the implementation of complex heuristics on both sides - the load issueing and the load handling side - may 'overcook' the actual problem. This may render the behaviour of the model 'incomprehensible'. Finally, traces of network traffic and their empirical analysis may already take a deterioration of certain load handling component into account. We thus suggest to develop heuristics rather on the origin side of the workload, i.e. heuristics for the behaviour of users and application protocols.

References

[1] Christoph von Praun: Modelling and Simulation of Wide Area Data Communications. A talk given at the CMS Computing Steering Board on 19/06/98.
[2] SoDA Web pages at http://wwwinfo.cern.ch/~praun/soda/Welcome.html.