Cloud Based Monte Carlo Simulations (of Repast models)

1 Introduction

As introduced earlier, a parameter sweep application is one in which the same code is run multiple times, each time with a unique set of input parameter values. This includes varying one parameter over a range of values and varying multiple parameters over a large multidimensional space. Examples of parameter sweep applications include Monte Carlo simulations and parameter space searches.
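
As a small illustration of the idea, the sketch below enumerates every combination of a parameter grid and runs the same function once per combination. The parameter names (`agent_count`, `interaction_rate`) and the `run_model` function are hypothetical stand-ins, not part of the application described in this paper.

```python
import itertools

def parameter_sweep(grid):
    """Yield one dict of parameter values per point of the grid."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def run_model(params):
    """Hypothetical stand-in for one run of the real simulation code."""
    return params["agent_count"] * params["interaction_rate"]

# Hypothetical parameter space: 2 x 3 = 6 combinations.
grid = {
    "agent_count": [100, 200],
    "interaction_rate": [0.1, 0.2, 0.3],
}

# The same code is run once for every unique set of input values.
results = [(params, run_model(params)) for params in parameter_sweep(grid)]
```

Each entry of `results` pairs one parameter set with the output of its run, so the runs are independent and could be dispatched to different nodes.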

In this paper, we present a distributed implementation of a network-based Monte Carlo simulation application using Offspring on Enterprise Clouds.

2 Preliminary

2.1 Monte Carlo Simulation

A Monte Carlo method is a technique that involves using random numbers and probability to solve problems.

Computer simulation uses computer models to imitate real life or to make predictions. When you create a model with a spreadsheet like Excel, you have a certain number of input parameters and a few equations that use those inputs to give you a set of outputs (or response variables). This type of model is usually deterministic, meaning that you get the same results no matter how many times you recalculate.

Monte Carlo simulation is a method for iteratively evaluating a deterministic model using sets of random numbers as inputs. This method is often used when the model is complex, nonlinear, or involves more than just a couple of uncertain parameters. A simulation can typically involve over 10,000 evaluations of the model, a task which in the past was only practical using supercomputers.
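
In formula terms, and assuming the random inputs x_1, ..., x_N are independent draws and f denotes the deterministic model (both symbols are introduced here for illustration), the simulation estimates the expected output by a sample mean whose uncertainty shrinks with the square root of the number of evaluations:

```latex
\hat{\mu}_N = \frac{1}{N}\sum_{i=1}^{N} f(x_i), \qquad
s_N^2 = \frac{1}{N-1}\sum_{i=1}^{N}\bigl(f(x_i)-\hat{\mu}_N\bigr)^2, \qquad
\operatorname{SE}(\hat{\mu}_N) \approx \frac{s_N}{\sqrt{N}}
```

With N = 10,000 evaluations the standard error is roughly s_N / 100, which is why a large number of model evaluations is usually needed.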

The steps of a very simple Monte Carlo simulation are:

Step 1. Create a parametric model

Step 2. Define an acceptance criterion d

Step 3. Generate a set of random inputs

Step 4. Evaluate the model and store the results

Step 5. Repeat steps 3 and 4 n times

Step 6. Analyze the results against the criterion d; if it is not satisfied, go to step 3

Step 7. Terminate the simulation
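
The steps above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the model (a dart-throwing estimate of pi), the reading of the criterion d as a threshold on the standard error of the mean, and names such as `simulate` and `max_rounds` are all assumptions made for the example.

```python
import random
import statistics

def model(x, y):
    """Step 1: a deterministic parametric model (hypothetical example).

    Returns 4.0 if the point lies inside the unit quarter circle, else 0.0,
    so the mean over uniform random inputs approximates pi.
    """
    return 4.0 if x * x + y * y <= 1.0 else 0.0

def simulate(d=0.02, n=1000, max_rounds=200, seed=42):
    """Run batches of n evaluations until the standard error is at most d."""
    rng = random.Random(seed)       # Step 2: d is passed in as the criterion
    results = []
    stderr = float("inf")
    for _ in range(max_rounds):     # safety bound, not part of the steps
        for _ in range(n):          # Step 5: repeat steps 3 and 4 n times
            x, y = rng.random(), rng.random()   # Step 3: random inputs
            results.append(model(x, y))         # Step 4: evaluate and store
        # Step 6: compare the results with the criterion d
        stderr = statistics.stdev(results) / len(results) ** 0.5
        if stderr <= d:
            break                   # Step 7: terminate
    return statistics.fmean(results), stderr, len(results)

estimate, stderr, count = simulate()
```

The batch of n evaluations between two checks of the criterion is the natural unit of work to hand to a remote node.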

2.2 Offspring

Offspring has been designed to support researchers in combinatorial optimization in quickly deploying their algorithms on a distributed computing infrastructure.

Offspring delivers to users (i) an environment through which to run, monitor, and control distributed applications; (ii) a thin distribution middleware that takes care of interacting with the Enterprise Cloud; and (iii) a reference model for implementing such applications. The environment is fully customizable by using plug-ins that: (i) expose control endpoints in order to let the environment and the user visually control the execution of the algorithm; (ii) embed a distribution engine in charge of controlling the execution of the application; (iii) provide the user interface support for configuring and monitoring the execution of the application. The environment is able to load and manage multiple plug-ins and multiple applications at the same time.

Offspring provides two different integration models for building distributed applications:

  • It is possible to develop a complete plug-in and thereby take finer control over how the environment interacts with the Cloud.
  • It is possible to define only the distribution logic of the application, which provides the environment with the tasks that need to be executed at each iteration.

The first approach is more powerful but requires the users to know the APIs exposed by the Enterprise Cloud. The second approach makes the use of the Cloud completely transparent to the users and hence has been chosen for developing the Monte Carlo Simulation Model in this work.

2.3 Aneka

Offspring relies on Aneka to distribute the computation of applications. The Offspring-Aneka architecture is shown below:

3 Implementation

We ported the Monte Carlo model to a distributed version because (i) we wanted to be able to run the algorithm with a reasonable number of runs in order to take advantage of the network-based model; and (ii) we wanted to apply smart migration strategies among different populations of individuals that evolved using different network topologies.


The figures above describe the object model exposed by Offspring for implementing the Monte Carlo method. A strategy provides a collection of tasks that are executed on the distribution middleware by the strategy controller. To implement the MC Strategy plug-in, we defined one concrete class for the strategy (MCStrategy) and one for the single task (MCTask). In this section, we describe how these components interact.

3.1 Remote Node Execution

Executing the Monte Carlo simulation on a single node has no special requirements: it is only necessary to start the MC algorithm with the proper input and configuration parameters and to collect the results of the execution. Given this, the MCTask class is a very thin software layer that performs the following operations: (i) retrieves the input files and the executable; (ii) starts a process and runs the Monte Carlo simulation; (iii) waits for termination and collects the generated results.
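
The three operations can be sketched as a thin wrapper around a child process. The class below only borrows the name MCTask from the text; its constructor arguments, the `execute` method, the `input.txt` file name, and the toy command used in the usage lines are assumptions for illustration and do not reflect the actual Offspring or Aneka APIs.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

class MCTask:
    """Hypothetical sketch of a task that wraps one simulation run."""

    def __init__(self, command, input_text):
        self.command = command        # executable plus its arguments
        self.input_text = input_text  # contents of the input file
        self.output = None
        self.exit_code = None

    def execute(self):
        with tempfile.TemporaryDirectory() as workdir:
            # (i) stage the input file in a private working directory
            input_path = Path(workdir) / "input.txt"
            input_path.write_text(self.input_text)
            # (ii) start the process and (iii) wait for it to terminate
            completed = subprocess.run(
                self.command + [str(input_path)],
                cwd=workdir,
                capture_output=True,
                text=True,
                check=False,
            )
            # (iii) collect what the run produced
            self.exit_code = completed.returncode
            self.output = completed.stdout
        return self.exit_code == 0

# Usage with a toy "simulation" that sums the numbers in its input file.
script = "import sys; print(sum(float(x) for x in open(sys.argv[1]).read().split()))"
task = MCTask([sys.executable, "-c", script], "1.5 2.5")
ok = task.execute()
```

Keeping the task this thin means the simulation executable itself needs no knowledge of the distribution middleware.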

3.2 Implementation of the Distribution Strategy

The concrete implementation of the strategy is the MCStrategy class, which defines the distribution and coordination logic of our simulation. It provides the tasks that the StrategyController executes on Aneka. It controls the evolution of each iteration, merges the results obtained from the execution of the tasks, and performs statistical analysis of the data.

The figure above shows the interaction between the StrategyController and the MCStrategy at runtime. The main execution flow is a sequence of iterations, and for each iteration the controller queries the strategy for tasks to be executed. This execution model fits Monte Carlo simulations well, since they are iterative by nature.

The StrategyController class authenticates with Aneka by using the credentials obtained from the Offspring environment, initializes the distributed application, and then initializes the strategy. During initialization, the strategy configures the initial parameters and prepares the data common to all tasks, such as the executable that runs the Monte Carlo simulation. The main loop of the controller executes iterations until the strategy sets its Complete property to true. In each iteration the controller repeatedly asks the strategy for new tasks to execute until the strategy returns a null task. The controller then puts itself in waiting mode. At the same time, a monitoring thread is responsible for collecting the tasks that have completed their execution and, according to their status, forwarding them to the strategy. The strategy merges the results of each task with the current results and updates its statistics. For each task collected, the controller queries the strategy to find out whether the current iteration has completed. If it has, the control thread is woken up and execution proceeds to the next iteration.
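
The control flow described above can be sketched with a local thread pool standing in for Aneka. Only the names StrategyController, MCStrategy, and the Complete property (here `complete`) come from the text; every method name, the toy task that averages uniform random numbers, and the use of a completion callback in place of a dedicated monitoring thread are assumptions. Authentication and handling of failed tasks are omitted.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor

class MCStrategy:
    """Hypothetical strategy: averages random samples over several iterations."""

    def __init__(self, iterations=3, tasks_per_iteration=4, samples=1000):
        self.iterations = iterations
        self.tasks_per_iteration = tasks_per_iteration
        self.samples = samples
        self.iteration = self.issued = self.collected = 0
        self.total, self.count = 0.0, 0
        self.complete = False              # mirrors the Complete property

    def next_task(self):
        """Return the next task of this iteration, or None when there is none."""
        if self.issued == self.tasks_per_iteration:
            return None
        seed = self.iteration * self.tasks_per_iteration + self.issued
        self.issued += 1
        samples = self.samples

        def task():
            rng = random.Random(seed)
            return sum(rng.random() for _ in range(samples)), samples
        return task

    def task_completed(self, result):
        """Merge one task's result into the running statistics."""
        subtotal, n = result
        self.total += subtotal
        self.count += n
        self.collected += 1

    def iteration_completed(self):
        return self.collected == self.tasks_per_iteration

    def next_iteration(self):
        self.iteration += 1
        self.issued = self.collected = 0
        self.complete = self.iteration == self.iterations


class StrategyController:
    """Hypothetical controller: owns all concurrency, the strategy owns none."""

    def __init__(self, strategy, workers=4):
        self.strategy = strategy
        self.workers = workers
        self.lock = threading.Lock()
        self.iteration_done = threading.Event()

    def _collect(self, future):
        # Plays the role of the monitoring thread: forward the finished task
        # to the strategy and wake the control thread if the iteration is over.
        with self.lock:
            self.strategy.task_completed(future.result())
            if self.strategy.iteration_completed():
                self.iteration_done.set()

    def run(self):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            while not self.strategy.complete:
                self.iteration_done.clear()
                while True:                      # ask for tasks until None
                    with self.lock:
                        task = self.strategy.next_task()
                    if task is None:
                        break
                    pool.submit(task).add_done_callback(self._collect)
                self.iteration_done.wait()       # waiting mode
                with self.lock:
                    self.strategy.next_iteration()
        return self.strategy.total / self.strategy.count

strategy = MCStrategy()
estimate = StrategyController(strategy).run()
```

Note that the strategy class contains no locks or threads; all synchronization lives in the controller, which is the separation of concerns the design aims for.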

This architecture concentrates concurrency and distributed middleware management within the strategy controller, keeping the definition of a strategy simple and concerned only with the implemented algorithm.

4 References

[1] The first article.

[2] http://www.manjrasoft.com

[3] Vecchiola, Christian; Kirley, Michael; Buyya, Rajkumar, "Multi-Objective Problem Solving With Offspring on Enterprise Clouds", March 2009

[4] Wittwer, J.W., "Monte Carlo Simulation Basics" From Vertex42.com, June 1, 2004, http://vertex42.com/ExcelArticles/mc/MonteCarloSimulation.html

[5] The 2nd article.

Attachments:
OffspringAnekaArch.jpg (47.13 KB)
OffspringStrategy.jpg (29.32 KB)
MonteCarloStrategy.png (2.84 KB)
OffspringStrategyExecution.jpg (28.54 KB)
