| [2ab0b0] | 1 | /* | 
|---|
|  | 2 | * Project: MoleCuilder | 
|---|
|  | 3 | * Description: creates and alters molecular systems | 
|---|
|  | 4 | * Copyright (C)  2012 University of Bonn. All rights reserved. | 
|---|
|  | 5 | * Please see the LICENSE file or "Copyright notice" in builder.cpp for details. | 
|---|
|  | 6 | */ | 
|---|
|  | 7 |  | 
|---|
|  | 8 | /** | 
|---|
|  | 9 | * \file automation.dox | 
|---|
|  | 10 | * | 
|---|
|  | 11 | * Created on: May 13, 2012 | 
|---|
|  | 12 | *    Author: heber | 
|---|
|  | 13 | */ | 
|---|
|  | 14 |  | 
|---|
|  | 15 | /** \page automation Automation | 
|---|
|  | 16 | * | 
|---|
|  | 17 | *  This page explains the (Fragmentation) Automation framework. The framework is | 
|---|
|  | 18 | *  meant to outsource all (server/client) operations that are required to actually | 
|---|
|  | 19 | *  calculate all the fragments that are created by the \ref Fragmentation structure. | 
|---|
|  | 20 | *  The general design is a server/client/controller ansatz. Server and client are | 
|---|
|  | 21 | *  external programs whereas the controller is eventually merged into the main | 
|---|
|  | 22 | *  MoleCuilder code. | 
|---|
|  | 23 | * | 
|---|
|  | 24 | *  The \ref FragmentJob's are handed to a server implemented in \ref FragmentScheduler. | 
|---|
|  | 25 | *  Many clients, implemented in \ref PoolWorker, may connect to it and work on | 
|---|
|  | 26 | *  these jobs; when finished, they send back a \ref FragmentResult. These can be | 
|---|
|  | 27 | *  retrieved from the server. Sending jobs, shutting down, getting results, and | 
|---|
|  | 28 | *  checking on present results is done via a controller in \ref FragmentController. | 
|---|
|  | 29 | * | 
|---|
|  | 30 | *  Technically, everything is implemented via boost::asio for synchronous and asynchronous | 
|---|
|  | 31 | *  input/output operations. Also, boost::serialization is essential to send | 
|---|
|  | 32 | *  and receive \ref FragmentJob's and \ref FragmentResult's over the net. | 
|---|
|  | 33 | * | 
|---|
|  | 34 | *  A number of asynchronous operations (i.e. reads and writes) are combined | 
|---|
|  | 35 | *  into a so-called \ref Operation. This can be either a \ref SyncOperation or | 
|---|
|  | 36 | *  an \ref AsyncOperation. The latter needs a staggered list of callback functions: | 
|---|
|  | 37 | *  as soon as one asynchronous operation is done, the called function places | 
|---|
|  | 38 | *  the next operation into boost::asio's io_service. | 
|---|
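
The staggered callback chain described above can be sketched without Boost. This is a minimal illustration, not the framework's code: `TinyIoService` is a hypothetical stand-in for boost::asio's io_service (a plain FIFO of handlers), and `ChainedOperation` with its three step names is invented for the example. Each handler does its work and then places the next handler into the queue.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for boost::asio's io_service: a FIFO of pending handlers.
class TinyIoService {
public:
  void post(std::function<void()> handler) { handlers_.push_back(std::move(handler)); }

  // Runs handlers until none are left; a handler may post further handlers.
  void run() {
    while (!handlers_.empty()) {
      std::function<void()> next = std::move(handlers_.front());
      handlers_.pop_front();
      next();
    }
  }

private:
  std::deque<std::function<void()>> handlers_;
};

// Hypothetical asynchronous operation made of three staggered steps.
class ChainedOperation {
public:
  ChainedOperation(TinyIoService &io, std::vector<std::string> &log) : io_(io), log_(log) {}

  // Calling the functor only schedules the first step.
  void operator()() { io_.post([this] { handle_connect(); }); }

private:
  void handle_connect() { log_.push_back("connect"); io_.post([this] { handle_write(); }); }
  void handle_write() { log_.push_back("write"); io_.post([this] { handle_read(); }); }
  void handle_read() { log_.push_back("read"); }  // last step, nothing more to schedule

  TinyIoService &io_;
  std::vector<std::string> &log_;
};

// Runs one operation to completion and returns the order in which its steps ran.
std::vector<std::string> run_chained_operation() {
  TinyIoService io;
  std::vector<std::string> log;
  ChainedOperation op(io, log);
  op();      // schedules handle_connect only
  io.run();  // executes the whole chain
  return log;
}
```

In the real framework the handlers would be completion handlers of boost::asio reads and writes; the sketch only shows how each step hands over to the next.
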
|  | 39 | * | 
|---|
|  | 40 | *  In the following we explain these structures in more detail. | 
|---|
|  | 41 | * | 
|---|
|  | 42 | *  \section automation-serverclientcontroller Server, Client, and Controller | 
|---|
|  | 43 | * | 
|---|
|  | 44 | *  \subsection automation-serverclientcontroller-server Server | 
|---|
|  | 45 | * | 
|---|
|  | 46 | *  The main workload of the server is implemented in the \ref FragmentScheduler. | 
|---|
|  | 47 | *  It listens on two ports: one for connecting workers, the other for a | 
|---|
|  | 48 | *  controller. | 
|---|
|  | 49 | *  This scheduler contains a pool of workers, \ref WorkerPool, and a queue of | 
|---|
|  | 50 | *  jobs, \ref FragmentQueue. The former contains all clients and knows which | 
|---|
|  | 51 | *  one is busy and which one is currently idling. The latter contains all | 
|---|
|  | 52 | *  \ref FragmentJob's and \ref FragmentResult's that are to be sent to idling | 
|---|
|  | 53 | *  clients or have been received from once busy clients. | 
|---|
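
The interplay of worker pool and job queue can be illustrated as follows. This is a sketch under assumptions: `SketchWorkerPool`, `SketchQueue`, `QueuedJob` and `dispatch()` are hypothetical names, workers are identified by a plain address string, and nothing is actually sent over the network.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <utility>

// All names below are illustrative and not the MoleCuilder API.
struct QueuedJob { int id; std::string payload; };

// Knows every enrolled worker and whether it is busy or idle.
class SketchWorkerPool {
public:
  void enroll(const std::string &address) { busy_[address] = false; }

  // Marks one idle worker as busy and reports its address; false if none is idle.
  bool claimIdle(std::string &address) {
    for (auto &entry : busy_) {
      if (!entry.second) { entry.second = true; address = entry.first; return true; }
    }
    return false;
  }

  void markIdle(const std::string &address) { busy_.at(address) = false; }

private:
  std::map<std::string, bool> busy_;  // address -> busy flag
};

// Holds jobs waiting for a worker and results received from workers.
class SketchQueue {
public:
  void pushJob(QueuedJob job) { jobs_.push_back(std::move(job)); }
  bool hasJob() const { return !jobs_.empty(); }
  QueuedJob popJob() { QueuedJob job = std::move(jobs_.front()); jobs_.pop_front(); return job; }
  void storeResult(int id, std::string result) { results_[id] = std::move(result); }
  std::size_t resultCount() const { return results_.size(); }

private:
  std::deque<QueuedJob> jobs_;
  std::map<int, std::string> results_;
};

// Hands queued jobs to idle workers and returns how many jobs were dispatched.
int dispatch(SketchWorkerPool &pool, SketchQueue &queue) {
  int sent = 0;
  std::string worker;
  while (queue.hasJob() && pool.claimIdle(worker)) {
    const QueuedJob job = queue.popJob();
    (void)job;  // a real server would serialize the job and send it to `worker`
    ++sent;
  }
  return sent;
}
```

Jobs that find no idle worker simply stay queued until a worker is marked idle again.
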
|  | 54 | * | 
|---|
|  | 55 | *  \subsection automation-serverclientcontroller-client Client | 
|---|
|  | 56 | * | 
|---|
|  | 57 | *  Clients are mainly implemented in \ref PoolWorker. They connect to a server | 
|---|
|  | 58 | *  and enroll in its \ref WorkerPool. They listen on an individual port whose | 
|---|
|  | 59 | *  address is sent to the server on enrollment. At any time the server may | 
|---|
|  | 60 | *  contact the client and send it a job. There are two kinds of jobs: | 
|---|
|  | 61 | *  -# NoJob | 
|---|
|  | 62 | *  -# any other | 
|---|
|  | 63 | *  \ref FragmentJob::NoJob just tells the client to shut down. Any other job | 
|---|
|  | 64 | *  is processed via FragmentJob::Work() and the resulting \ref FragmentResult is sent | 
|---|
|  | 65 | *  back to the server. | 
|---|
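
The two kinds of jobs lead to a very small client loop, sketched here. The class names `SketchJob`, `SketchNoJob`, `EchoJob` and the `isNoJob()` query are assumptions made for this example; the real \ref FragmentJob interface may look different.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Illustrative job hierarchy; the real FragmentJob interface may differ.
struct SketchJob {
  virtual ~SketchJob() = default;
  virtual bool isNoJob() const { return false; }
  virtual std::string Work() = 0;
};

// Carries no work; only tells the client to shut down.
struct SketchNoJob : SketchJob {
  bool isNoJob() const override { return true; }
  std::string Work() override { return std::string(); }
};

// Any other job: does some work and produces a result string.
struct EchoJob : SketchJob {
  explicit EchoJob(std::string text) : text_(std::move(text)) {}
  std::string Work() override { return "done: " + text_; }
  std::string text_;
};

// Processes incoming jobs in order until a NoJob arrives and returns the
// results that would be sent back to the server.
std::vector<std::string> runWorker(const std::vector<std::unique_ptr<SketchJob>> &incoming) {
  std::vector<std::string> results;
  for (const auto &job : incoming) {
    if (job->isNoJob()) break;       // shutdown request, stop working
    results.push_back(job->Work());  // work on the job, keep its result
  }
  return results;
}
```
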
|  | 66 | * | 
|---|
|  | 67 | *  \subsection automation-serverclientcontroller-controller Controller | 
|---|
|  | 68 | * | 
|---|
|  | 69 | *  The Controller is an external program that connects to the server via | 
|---|
|  | 70 | *  a different port than the clients to give it individual commands. | 
|---|
|  | 71 | *  The list of commands is as follows: | 
|---|
|  | 72 | *  -# createjobs: create a test job | 
|---|
|  | 73 | *  -# addjobs: create a (mpqc) job by reading a file | 
|---|
|  | 74 | *  -# getnextid: get a bunch of unique job ids from the server | 
|---|
|  | 75 | *  -# receiveresults: receive all currently present results | 
|---|
|  | 76 | *  -# checkresults: get information on waiting jobs and present results | 
|---|
|  | 77 | *  -# receivempqcresults: receive and combine all results as Mpqc jobs | 
|---|
|  | 78 | *  -# removeall: server should remove all workers from its pool | 
|---|
|  | 79 | *  -# shutdown: server should shut down if pool is empty | 
|---|
|  | 80 | * | 
|---|
|  | 81 | *  \subsection automation-serverclientcontroller-operations Operation | 
|---|
|  | 82 | * | 
|---|
|  | 83 | *  An \ref Operation is implemented as a functor: all internally | 
|---|
|  | 84 | *  required information is given to the Operation in its constructor, while the | 
|---|
|  | 85 | *  operator() function only receives information required for its | 
|---|
|  | 86 | *  specific functionality (here the address to connect to). | 
|---|
|  | 87 | *  An \ref Operation is a collection of reads and writes arranged such that two | 
|---|
|  | 88 | *  sides (e.g. client/server, controller/server) understand each other: | 
|---|
|  | 89 | *  each read of the Operation meets a write on the other side and vice versa. | 
|---|
|  | 90 | *  Therefore, the operations are structured into three different groups: | 
|---|
|  | 91 | *  -# client | 
|---|
|  | 92 | *  -# controller | 
|---|
|  | 93 | *  -# server | 
|---|
|  | 94 | *  They all simply derive from either \ref SyncOperation or \ref AsyncOperation | 
|---|
|  | 95 | *  and implement an AsyncOperation::handle_connect() which is called by the | 
|---|
|  | 96 | *  base class after the connection to the other side has been established. | 
|---|
|  | 97 | *  More callback functions may be implemented in the derived class | 
|---|
|  | 98 | *  depending on whether the asynchronous write or read needs to be followed | 
|---|
|  | 99 | *  up by further operations. To finish the connection, | 
|---|
|  | 100 | *  AsyncOperation::handle_FinishOperation() is called, which terminates the | 
|---|
|  | 101 | *  operation and calls the callback handler for success or failure. | 
|---|
|  | 102 | *  This function must also be called in case of failure so that the | 
|---|
|  | 103 | *  failure callback is invoked. | 
|---|
|  | 104 | * | 
|---|
|  | 105 | *  The callback functions that are activated in case of success or failure | 
|---|
|  | 106 | *  are given to the Operation in its constructor. | 
|---|
|  | 107 | * | 
|---|
|  | 108 | *  Only the \ref WorkerAddress to connect to is given in AsyncOperation::operator(). | 
|---|
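
The functor layout of an Operation can be sketched as follows. The sketch is synchronous for brevity and all names except `handle_FinishOperation` are assumptions: `SketchAddress` stands in for \ref WorkerAddress, and the `connector` callable replaces the actual network connection so that the example runs without Boost.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical address type standing in for WorkerAddress.
struct SketchAddress { std::string host; std::string service; };

class SketchOperation {
public:
  // Everything the operation needs internally, including the success and
  // failure callbacks, is passed to the constructor.
  SketchOperation(std::function<bool(const SketchAddress &)> connector,
                  std::function<void()> onSuccess,
                  std::function<void()> onFailure)
    : connector_(std::move(connector)),
      onSuccess_(std::move(onSuccess)),
      onFailure_(std::move(onFailure)) {}

  // Only the address to connect to is passed at call time.
  void operator()(const SketchAddress &address) {
    const bool connected = connector_(address);
    handle_FinishOperation(connected);  // called on success and on failure
  }

private:
  // Terminates the operation and fires exactly one of the two callbacks.
  void handle_FinishOperation(bool ok) {
    if (ok) onSuccess_(); else onFailure_();
  }

  std::function<bool(const SketchAddress &)> connector_;
  std::function<void()> onSuccess_;
  std::function<void()> onFailure_;
};
```

Because the callbacks are fixed at construction, the same operation object can be invoked for several addresses, for example `op(SketchAddress{"localhost", "1026"})`.
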
|  | 109 | * | 
|---|
|  | 110 | *  \subsection automation-serverclientcontroller-operationqueue Operations queue | 
|---|
|  | 111 | * | 
|---|
|  | 112 | *  As operations are usually asynchronous ones, they should not keep the | 
|---|
|  | 113 | *  executing code waiting. For this purpose there is an \ref OperationQueue. | 
|---|
|  | 114 | *  New Operations are created in a straightforward manner and simply pushed | 
|---|
|  | 115 | *  into the OperationQueue, which takes care of executing them sequentially. | 
|---|
|  | 116 | *  Both server and client have such an \ref OperationQueue. | 
|---|
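
One way to obtain sequential execution without blocking the caller is to start the next operation only when the current one reports completion. The sketch below assumes exactly that; `SketchOperationQueue` and the `done` callback convention are hypothetical and need not match how \ref OperationQueue is implemented.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

class SketchOperationQueue {
public:
  // An operation receives a `done` callback that it must call when finished.
  using Operation = std::function<void(std::function<void()>)>;

  // Returns immediately; the operation starts now only if nothing else is running.
  void push(Operation op) {
    pending_.push_back(std::move(op));
    if (!running_) launchNext();
  }

  bool idle() const { return !running_ && pending_.empty(); }

private:
  // Starts the next pending operation, or marks the queue as not running.
  void launchNext() {
    if (pending_.empty()) { running_ = false; return; }
    running_ = true;
    Operation op = std::move(pending_.front());
    pending_.pop_front();
    op([this] { launchNext(); });  // completion of `op` triggers the next one
  }

  std::deque<Operation> pending_;
  bool running_ = false;
};
```

An operation that completes asynchronously stores `done` and calls it later; until then, further pushed operations wait in the queue.
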
|  | 117 | * | 
|---|
|  | 118 | *  \subsection automation-serverclientcontroller-listener Listener | 
|---|
|  | 119 | * | 
|---|
|  | 120 | *  The \ref Listener is an important component of both the | 
|---|
|  | 121 | *  server and the client, as each needs to listen on a specific port for incoming | 
|---|
|  | 122 | *  connections. The Listener component implements this functionality. Incoming | 
|---|
|  | 123 | *  requests are handled via a number of callback functions, in much the same way | 
|---|
|  | 124 | *  as with the Operations. | 
|---|
|  | 125 | * | 
|---|
|  | 126 | *  \subsubsection automation-serverclientcontroller-listener-pool Pool listener | 
|---|
|  | 127 | * | 
|---|
|  | 128 | *  The pool listener is the \ref Listener component of the client that listens | 
|---|
|  | 129 | *  for incoming connections from the server that sends it jobs. | 
|---|
|  | 130 | * | 
|---|
|  | 131 | *  \subsubsection automation-serverclientcontroller-listener-worker Worker listener | 
|---|
|  | 132 | * | 
|---|
|  | 133 | *  The \ref FragmentScheduler has two \ref Listener components. The first listens for | 
|---|
|  | 134 | *  incoming connections from clients. Here, they can enroll in and also remove | 
|---|
|  | 135 | *  themselves from the worker pool contained in the server. The client always first | 
|---|
|  | 136 | *  sends its own address, and the server checks whether it is contained in the pool. | 
|---|
|  | 137 | *  Only afterwards may the client send a valid command. | 
|---|
|  | 138 | * | 
|---|
|  | 139 | *  \subsubsection automation-serverclientcontroller-listener-controller Controller listener | 
|---|
|  | 140 | * | 
|---|
|  | 141 | *  The second \ref Listener component of the \ref FragmentScheduler listens to | 
|---|
|  | 142 | *  incoming connections from the controller to execute its commands. | 
|---|
|  | 143 | *  The controller initially sends one of the \ref ControllerChoices, i.e. an enum | 
|---|
|  | 144 | *  that encodes a certain command, followed by further arguments such as serialized | 
|---|
|  | 145 | *  jobs. | 
|---|
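
Dispatching on such a command enum could look like the sketch below. The enumerator names in `SketchChoice`, the `SketchServerState` struct and the reply strings are all invented for illustration; only the behaviour of the commands (removeall empties the pool, shutdown only succeeds with an empty pool) follows the command list given for the controller.

```cpp
#include <cassert>
#include <string>

// Hypothetical enumerators; the real ControllerChoices values may be named differently.
enum class SketchChoice { CheckResults, ReceiveResults, RemoveAll, Shutdown, Unknown };

// Minimal stand-in for the state a server keeps.
struct SketchServerState {
  int waitingJobs = 0;
  int presentResults = 0;
  int workers = 0;
  bool running = true;
};

// Executes one controller command against the server state and returns a short reply.
std::string handleCommand(SketchChoice choice, SketchServerState &state) {
  switch (choice) {
    case SketchChoice::CheckResults:
      return std::to_string(state.waitingJobs) + " waiting, " +
             std::to_string(state.presentResults) + " present";
    case SketchChoice::ReceiveResults:
      state.presentResults = 0;  // all present results are handed to the controller
      return "results sent";
    case SketchChoice::RemoveAll:
      state.workers = 0;
      return "pool emptied";
    case SketchChoice::Shutdown:
      if (state.workers != 0) return "pool not empty";  // refuse while workers remain
      state.running = false;
      return "shutting down";
    case SketchChoice::Unknown:
      break;
  }
  return "unknown command";
}
```
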
|  | 146 | * | 
|---|
|  | 147 | *  \section automation-jobsresults Jobs and Results | 
|---|
|  | 148 | * | 
|---|
|  | 149 | *  \ref FragmentJob and \ref FragmentResult are the internal core of the | 
|---|
|  | 150 | *  automation framework and are handed around via the boost::serialization | 
|---|
|  | 151 | *  mechanism between controller, client, and server. Each job is | 
|---|
|  | 152 | *  identified by a unique \ref JobId. The pool of job ids is managed by the | 
|---|
|  | 153 | *  server, and ids are requested by the controller, which creates a \ref | 
|---|
|  | 154 | *  FragmentJob and sends it to the server. \ref FragmentResult's are created | 
|---|
|  | 155 | *  by the client that has worked successfully on a job and that sends the result | 
|---|
|  | 156 | *  back to the server. Eventually, the controller receives the results and | 
|---|
|  | 157 | *  performs further computations (e.g. combination of energy and forces in the | 
|---|
|  | 158 | *  case of our \ref Fragmentation jobs). | 
|---|
|  | 159 | * | 
|---|
|  | 160 | *  \subsection automation-jobsresults-jobs Jobs | 
|---|
|  | 161 | * | 
|---|
|  | 162 | *  A \ref FragmentJob has a unique \ref JobId and a specific Work() operation | 
|---|
|  | 163 | *  that tells the client what to do. SystemCommandJob is a typical derived class | 
|---|
|  | 164 | *  that executes a system command on a temporarily created file, retrieves | 
|---|
|  | 165 | *  its output, places it into a string, and sends it back as the result. | 
|---|
|  | 166 | *  Specifically, it has a virtual \ref SystemCommandJob::extractResult() function | 
|---|
|  | 167 | *  to extract a specific object from the result string of the system command | 
|---|
|  | 168 | *  and to place it in serialized form into the \ref FragmentResult's string. | 
|---|
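
The role of the virtual extraction hook can be sketched as follows. Everything here is an assumption made for illustration: `SketchCommandJob`, `SketchResult` and `EnergyJob` are hypothetical, `runCommand()` replaces the actual launch of an external program, the program output is made up, and the extracted value is kept as plain text instead of being serialized.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Illustrative result record: job id, exit code, and the extracted string.
struct SketchResult { int id; int exitCode; std::string data; };

class SketchCommandJob {
public:
  explicit SketchCommandJob(int id) : id_(id) {}
  virtual ~SketchCommandJob() = default;

  // Runs the command, then lets the derived class condense the raw output.
  SketchResult Work() {
    std::string output;
    const int exitCode = runCommand(output);
    return SketchResult{id_, exitCode, extractResult(output)};
  }

protected:
  // Stand-in for executing a system command on a temporary input file.
  virtual int runCommand(std::string &output) = 0;

  // Default behaviour: pass the raw output on unchanged.
  virtual std::string extractResult(const std::string &output) { return output; }

private:
  int id_;
};

// Pretends a program printed key/value lines and keeps only the energy value.
class EnergyJob : public SketchCommandJob {
public:
  using SketchCommandJob::SketchCommandJob;

protected:
  int runCommand(std::string &output) override {
    output = "iterations 12\nenergy -76.0267\n";  // made-up program output
    return 0;
  }

  std::string extractResult(const std::string &output) override {
    std::istringstream tokens(output);
    std::string key, value;
    while (tokens >> key >> value) {
      if (key == "energy") return value;
    }
    return std::string();  // no energy line found
  }
};
```
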
|  | 169 | * | 
|---|
|  | 170 | *  \subsection automation-jobsresults-results Results | 
|---|
|  | 171 | * | 
|---|
|  | 172 | *  A \ref FragmentResult has a unique \ref JobId that has to match that of a previously | 
|---|
|  | 173 | *  present \ref FragmentJob. It contains a string that may hold either the output | 
|---|
|  | 174 | *  of a job or serialized information. It also | 
|---|
|  | 175 | *  stores the exit code of e.g. a \ref SystemCommandJob. | 
|---|
|  | 176 | * | 
|---|
|  | 177 | * \date 2012-05-13 | 
|---|
|  | 178 | * | 
|---|
|  | 179 | */ | 
|---|