/*
 * Project: MoleCuilder
 * Description: creates and alters molecular systems
 * Copyright (C)  2012 University of Bonn. All rights reserved.
 * Please see the LICENSE file or "Copyright notice" in builder.cpp for details.
 */

/**
 * \file automation.dox
 *
 * Created on: May 13, 2012
 *    Author: heber
 */

/** \page automation Automation
 *
 *  This page explains the (Fragmentation) Automation framework. The framework is
 *  meant to outsource all (server/client) operations that are required to actually
 *  calculate all the fragments that are created by the \ref Fragmentation structure.
 *  The general design is a server/client/controller ansatz. Server and client are
 *  external programs whereas the controller is eventually merged into the main
 *  MoleCuilder code.
 *
 *  The fragments are handed out as jobs to a server in \ref FragmentScheduler.
 *  Many clients in \ref PoolWorker may connect to it and work on these
 *  \ref FragmentJob's. When finished, they send back a \ref FragmentResult, which
 *  can be retrieved from the server. Sending jobs, shutting down, getting results,
 *  and checking on present results are done via a controller in \ref FragmentController.
 *
 *  Technically, everything is implemented via boost::asio for synchronous and
 *  asynchronous input/output operations. Also, boost::serialization is essential
 *  to send and receive \ref FragmentJob's and \ref FragmentResult's over the network.
 *
 *  A number of asynchronous operations (i.e. reads and writes) are combined
 *  into a so-called \ref Operation. This can be either a \ref SyncOperation or
 *  an \ref AsyncOperation. The latter needs a staggered list of callback functions:
 *  as soon as one asynchronous operation is done, the called function places
 *  the next operation into boost::asio's io_service.
 *
 *  In the following we explain these structures in more detail.
 *
 *  \section automation-serverclientcontroller Server, Client, and Controller
 *
 *  \subsection automation-serverclientcontroller-server Server
 *
 *  The main workload of the server is implemented in the \ref FragmentScheduler.
 *  It listens on two ports: one for connecting workers, the other for a
 *  controller.
 *  This scheduler contains a pool of workers, \ref WorkerPool, and a queue of
 *  jobs, \ref FragmentQueue. The former contains all clients and knows which
 *  ones are busy and which are currently idle. The latter contains all
 *  \ref FragmentJob's and \ref FragmentResult's that are to be sent to idle
 *  clients or have been received from previously busy clients.
 *
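 *  The bookkeeping described above can be sketched with the standard library
 *  alone. The names WorkerPoolSketch and JobQueueSketch, their members, and the
 *  use of plain strings for addresses and payloads are assumptions made for
 *  illustration; they are not the interfaces of \ref WorkerPool or
 *  \ref FragmentQueue.
 *
```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-ins: a worker is named by "host:port", a job by a number.
typedef std::string Address;
typedef std::size_t JobId;

// Tracks which enrolled workers are idle and which are busy.
class WorkerPoolSketch {
public:
  // Returns false if the worker is already enrolled.
  bool enroll(const Address &a) {
    if (busy_.count(a) != 0) return false;
    return idle_.insert(a).second;
  }
  // Picks an idle worker and marks it busy; returns false if none is idle.
  bool takeIdle(Address &out) {
    if (idle_.empty()) return false;
    out = *idle_.begin();
    idle_.erase(idle_.begin());
    busy_.insert(out);
    return true;
  }
  // Marks a busy worker as idle again, e.g. after its result has arrived.
  void release(const Address &a) {
    if (busy_.erase(a) != 0) idle_.insert(a);
  }
  std::size_t idleCount() const { return idle_.size(); }
  std::size_t busyCount() const { return busy_.size(); }

private:
  std::set<Address> idle_;
  std::set<Address> busy_;
};

// Holds jobs waiting for a worker and results waiting for the controller.
class JobQueueSketch {
public:
  void pushJob(JobId id, const std::string &payload) {
    waiting_.push_back(std::make_pair(id, payload));
  }
  bool hasJob() const { return !waiting_.empty(); }
  // Removes and returns the oldest waiting job.
  std::pair<JobId, std::string> popJob() {
    assert(!waiting_.empty());
    std::pair<JobId, std::string> job = waiting_.front();
    waiting_.pop_front();
    return job;
  }
  void pushResult(JobId id, const std::string &result) { results_[id] = result; }
  std::size_t resultCount() const { return results_.size(); }

private:
  std::deque<std::pair<JobId, std::string> > waiting_;
  std::map<JobId, std::string> results_;
};
```
 *
 *  In this sketch a scheduler would call takeIdle() whenever hasJob() is true,
 *  send the popped job to the returned address, and call release() once the
 *  matching result has been stored with pushResult().
 *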
 *  \subsection automation-serverclientcontroller-client Client
 *
 *  Clients are mainly implemented in \ref PoolWorker. They connect to a server
 *  and enroll in its \ref WorkerPool. They listen on an individual port whose
 *  address is sent to the server on enrollment. At any time the server may
 *  contact the client and send it a job. There are two kinds of jobs:
 *  -# NoJob
 *  -# any other
 *
 *  \ref FragmentJob::NoJob just tells the client to shut down. Any other job
 *  is processed via FragmentJob::Work() and the result thereby created is sent
 *  back to the server.
 *
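 *  The distinction between the two kinds of jobs can be sketched as follows.
 *  Only the names NoJob and Work() come from the text above; JobSketch,
 *  NoJobSketch, EchoJobSketch, and runClient() are hypothetical, and a vector
 *  of jobs stands in for what the server would send over the network.
 *
```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified job hierarchy.
struct JobSketch {
  virtual ~JobSketch() {}
  virtual bool isNoJob() const { return false; }
  virtual std::string Work() = 0;
};

// The job that only tells the client to shut down.
struct NoJobSketch : JobSketch {
  bool isNoJob() const override { return true; }
  std::string Work() override { return std::string(); }
};

// Any other job: here it just echoes its text as the result.
struct EchoJobSketch : JobSketch {
  explicit EchoJobSketch(const std::string &text) : text_(text) {}
  std::string Work() override { return "echo:" + text_; }
  std::string text_;
};

// Works on incoming jobs in order until a NoJob arrives. Returns the results
// that a real client would send back to the server.
std::vector<std::string> runClient(
    const std::vector<std::shared_ptr<JobSketch> > &incoming) {
  std::vector<std::string> sent;
  for (std::size_t i = 0; i < incoming.size(); ++i) {
    assert(incoming[i]);
    if (incoming[i]->isNoJob()) break;  // shut down, ignore anything after
    sent.push_back(incoming[i]->Work());
  }
  return sent;
}
```
 *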
 *  \subsection automation-serverclientcontroller-controller Controller
 *
 *  The Controller is an external program that connects to the server via
 *  a different port than the clients to give it individual commands.
 *  The list of commands is as follows:
 *  -# createjobs: creates a test job
 *  -# addjobs: creates a (mpqc) job by reading a file
 *  -# getnextid: gets a batch of unique job ids from the server
 *  -# receiveresults: receives all currently present results
 *  -# checkresults: gets information on waiting jobs and present results
 *  -# receivempqcresults: receives and combines all results as Mpqc jobs
 *  -# removeall: server should remove all workers from its pool
 *  -# shutdown: server should shut down if the pool is empty
 *
 *  \subsection automation-serverclientcontroller-operations Operation
 *
 *  An \ref Operation is implemented as a functor, i.e. all internally
 *  required information is given to the Operation in its constructor, while the
 *  operator() function only receives information required for its
 *  specific functionality (here the address to connect to).
 *  An \ref Operation is a collection of reads and writes such that two
 *  sides (e.g. client/server, controller/server) understand each other:
 *  the reads of the Operation meet writes on the other side and vice versa.
 *  Therefore, the operations are structured into three different groups:
 *  -# client
 *  -# controller
 *  -# server
 *
 *  They all simply derive from either \ref SyncOperation or \ref AsyncOperation
 *  and implement an AsyncOperation::handle_connect() which is called by the
 *  base class after the connection to the other side has been established.
 *  More callback functions may be implemented in the derived class
 *  depending on whether the asynchronous write or read needs to be followed
 *  up by further operations. To finish the connection,
 *  AsyncOperation::handle_FinishOperation() is called, which terminates the
 *  operation and calls callback handlers in case of success or failure.
 *  This function must also be called in case of failure so that the
 *  correct callback function is invoked.
 *
 *  The callback functions that are activated in case of success or failure
 *  are given to the Operation in its constructor.
 *
 *  Only the \ref WorkerAddress to connect to is given in AsyncOperation::operator().
 *
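 *  The functor pattern described above can be sketched without boost::asio.
 *  OperationSketch and AddressSketch are hypothetical names, and the connector
 *  argument is an assumption that stands in for the actual network connect.
 *  The sketch only shows how callbacks fixed at construction, operator(),
 *  handle_connect(), and handle_FinishOperation() relate to each other.
 *
```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical address type; the framework itself uses WorkerAddress.
struct AddressSketch {
  std::string host;
  std::string service;
};

// Callbacks are fixed at construction, only the address goes to operator().
class OperationSketch {
public:
  typedef std::function<void()> Callback;
  typedef std::function<bool(const AddressSketch &)> Connector;

  OperationSketch(Connector connector, Callback onSuccess, Callback onFailure)
      : connector_(connector), onSuccess_(onSuccess), onFailure_(onFailure) {
    assert(static_cast<bool>(connector_));
  }
  virtual ~OperationSketch() {}

  void operator()(const AddressSketch &address) {
    if (connector_(address))
      handle_connect();
    else
      handle_FinishOperation(false);  // failure also ends in the finish step
  }

protected:
  // Derived classes would start their reads and writes here.
  virtual void handle_connect() { handle_FinishOperation(true); }

  // Always the last step: picks the success or the failure callback.
  void handle_FinishOperation(bool ok) {
    if (ok)
      onSuccess_();
    else
      onFailure_();
  }

private:
  Connector connector_;
  Callback onSuccess_;
  Callback onFailure_;
};
```
 *
 *  A derived operation in this sketch would override handle_connect(), perform
 *  its exchange, and end every path with handle_FinishOperation().
 *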
 *  \subsection automation-serverclientcontroller-operationqueue Operations queue
 *
 *  As operations are usually asynchronous, they should not keep the
 *  executing code waiting. For this purpose there is an \ref OperationQueue.
 *  New Operations are created in a straightforward manner and simply pushed
 *  into the OperationQueue, which takes care of their sequential execution.
 *  Both server and client have such an \ref OperationQueue.
 *
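 *  One way to obtain sequential execution without blocking the caller is
 *  sketched below. It is not the \ref OperationQueue implementation: the name
 *  OperationQueueSketch and the convention that every operation receives a
 *  "done" callback are assumptions made for this example.
 *
```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// An operation receives a "done" callback and must call it once it has
// finished, possibly much later (hypothetical signature).
typedef std::function<void(std::function<void()>)> AsyncOpSketch;

class OperationQueueSketch {
public:
  OperationQueueSketch() : running_(false) {}

  // Returns immediately; the operation starts once all earlier ones are done.
  void push(AsyncOpSketch op) {
    assert(static_cast<bool>(op));
    pending_.push_back(std::move(op));
    if (!running_) startNext();
  }

  bool idle() const { return !running_ && pending_.empty(); }

private:
  // Starts the oldest pending operation, or marks the queue as not running.
  void startNext() {
    if (pending_.empty()) {
      running_ = false;
      return;
    }
    running_ = true;
    AsyncOpSketch op = std::move(pending_.front());
    pending_.pop_front();
    op([this]() { startNext(); });
  }

  std::deque<AsyncOpSketch> pending_;
  bool running_;
};
```
 *
 *  The queue object has to outlive all pending "done" callbacks, since they
 *  refer back to it.
 *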
 *  \subsection automation-serverclientcontroller-listener Listener
 *
 *  The \ref Listener is an essential component of both the server and the
 *  client, as each needs to listen on a specific port for incoming
 *  connections. The Listener component implements this functionality.
 *  Incoming requests are handled via a number of callback functions, in much
 *  the same way as with the Operations.
 *
 *  \subsubsection automation-serverclientcontroller-listener-pool Pool listener
 *
 *  The pool listener is the \ref Listener component of the client that listens
 *  for incoming connections from the server that sends it jobs.
 *
 *  \subsubsection automation-serverclientcontroller-listener-worker Worker listener
 *
 *  The \ref FragmentScheduler has two \ref Listener components. One listens for
 *  incoming connections from clients. Here, they can enroll in and also remove
 *  themselves from the worker pool contained in the server. The client always first
 *  sends its own address, and the server checks whether it is contained in the pool.
 *  Only afterwards may the client send a valid command.
 *
 *  \subsubsection automation-serverclientcontroller-listener-controller Controller listener
 *
 *  The second \ref Listener component of the \ref FragmentScheduler listens for
 *  incoming connections from the controller to execute its commands.
 *  The controller initially gives one of the \ref ControllerChoices, i.e. an enum
 *  that encodes a certain command, followed by further arguments such as serialized
 *  jobs.
 *
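 *  The "choice first, arguments afterwards" layout can be sketched as follows.
 *  The enumerators of ChoiceSketch merely mirror the command names listed for
 *  the controller; the actual \ref ControllerChoices values may be named and
 *  ordered differently. The text encoding is likewise an assumption and stands
 *  in for the boost::serialization format.
 *
```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical enumerators, one per controller command.
enum ChoiceSketch {
  NoChoice, CreateJobs, AddJobs, GetNextId, ReceiveResults,
  CheckResults, ReceiveMpqcResults, RemoveAll, Shutdown
};

// Maps a command name to its enum value; unknown names give NoChoice.
ChoiceSketch parseChoice(const std::string &command) {
  static const std::map<std::string, ChoiceSketch> table = {
    {"createjobs", CreateJobs}, {"addjobs", AddJobs},
    {"getnextid", GetNextId}, {"receiveresults", ReceiveResults},
    {"checkresults", CheckResults},
    {"receivempqcresults", ReceiveMpqcResults},
    {"removeall", RemoveAll}, {"shutdown", Shutdown}
  };
  std::map<std::string, ChoiceSketch>::const_iterator it = table.find(command);
  return it == table.end() ? NoChoice : it->second;
}

// Controller side: the choice comes first, the arguments follow.
std::string encodeRequest(ChoiceSketch choice, const std::string &arguments) {
  assert(choice != NoChoice);
  std::ostringstream out;
  out << static_cast<int>(choice) << ' ' << arguments;
  return out.str();
}

// Server side: reads the choice, then hands back the remaining arguments.
// Returns false if the message does not start with a known choice.
bool decodeRequest(const std::string &message, ChoiceSketch &choice,
                   std::string &arguments) {
  std::istringstream in(message);
  int value = 0;
  if (!(in >> value) || value <= NoChoice || value > Shutdown) return false;
  choice = static_cast<ChoiceSketch>(value);
  in.get();                            // skip the single separating space
  std::getline(in, arguments, '\0');   // everything that is left
  return true;
}
```
 *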
 *  \section automation-jobsresults Jobs and Results
 *
 *  \ref FragmentJob and \ref FragmentResult are the internal core of the
 *  automation framework. They are handed around via the boost::serialization
 *  mechanism between controller, client, and server. Each job is
 *  identified by a unique \ref JobId. The pool of job ids is managed by the
 *  server, and the ids are requested by the controller, which creates a
 *  \ref FragmentJob and sends it to the server. \ref FragmentResult's are created
 *  by the client that has worked successfully on a job and sends the result
 *  back to the server. Eventually, the controller receives the results and
 *  performs further computations (e.g. combination of energy and forces in the
 *  case of our \ref Fragmentation jobs).
 *
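 *  A server-side pool of job ids could be as simple as a counter, as in the
 *  following sketch. JobIdPoolSketch, its reserve() function, and the choice
 *  to start counting at 1 are assumptions for illustration, not the actual
 *  handling of \ref JobId in the server.
 *
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::size_t JobIdSketch;

// Hands out ids that are never reused, so every job gets a unique one.
class JobIdPoolSketch {
public:
  JobIdPoolSketch() : next_(1) {}

  // Reserves count consecutive ids for the controller and returns them.
  // A request for zero ids is treated as a caller error in this sketch.
  std::vector<JobIdSketch> reserve(std::size_t count) {
    assert(count > 0);
    std::vector<JobIdSketch> ids;
    ids.reserve(count);
    for (std::size_t i = 0; i < count; ++i) ids.push_back(next_++);
    return ids;
  }

private:
  JobIdSketch next_;
};
```
 *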
 *  \subsection automation-jobsresults-jobs Jobs
 *
 *  A \ref FragmentJob has a unique \ref JobId and a specific Work() operation
 *  that tells the client what to do. SystemCommandJob is a typical derived class
 *  that executes a system command on a temporarily created file, retrieves
 *  its output, places it into a string, and sends it back as the result.
 *  Specifically, it has a virtual \ref SystemCommandJob::extractResult() function
 *  to extract a specific object from the result string of the system command
 *  and to place it in serialized form into the \ref FragmentResult's string.
 *
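 *  The interplay of Work() and extractResult() can be sketched as below. The
 *  classes CommandJobSketch, EnergyJobSketch, and ResultSketch are hypothetical,
 *  as is the "energy:" output line. To keep the example self-contained, no
 *  command is actually executed: a runner function object stands in for
 *  writing the temporary file and running the system command.
 *
```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical result: job id, exit code, and the extracted string.
struct ResultSketch {
  std::size_t id;
  int exitcode;
  std::string result;
};

// Stand-in for running a command on a temporary input file: receives the
// command and the file contents, fills the output, returns the exit code.
typedef std::function<int(const std::string &, const std::string &,
                          std::string &)> RunnerSketch;

class CommandJobSketch {
public:
  CommandJobSketch(std::size_t id, const std::string &command,
                   const std::string &input, RunnerSketch runner)
      : id_(id), command_(command), input_(input), runner_(runner) {}
  virtual ~CommandJobSketch() {}

  // Runs the command and packs the extracted part of its output.
  ResultSketch Work() {
    assert(static_cast<bool>(runner_));
    std::string output;
    const int exitcode = runner_(command_, input_, output);
    ResultSketch r = {id_, exitcode, extractResult(output)};
    return r;
  }

protected:
  // Default: pass the raw output through unchanged.
  virtual std::string extractResult(const std::string &output) { return output; }

private:
  std::size_t id_;
  std::string command_;
  std::string input_;
  RunnerSketch runner_;
};

// Derived job that keeps only the value of the first "energy:" line.
class EnergyJobSketch : public CommandJobSketch {
public:
  EnergyJobSketch(std::size_t id, const std::string &command,
                  const std::string &input, RunnerSketch runner)
      : CommandJobSketch(id, command, input, runner) {}

protected:
  std::string extractResult(const std::string &output) override {
    const std::string key = "energy:";
    std::istringstream lines(output);
    std::string line;
    while (std::getline(lines, line))
      if (line.compare(0, key.size(), key) == 0) return line.substr(key.size());
    return std::string();
  }
};
```
 *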
 *  \subsection automation-jobsresults-results Results
 *
 *  A \ref FragmentResult has a unique \ref JobId that has to match that of a
 *  previously present \ref FragmentJob. It contains a string that may hold either
 *  the output of a job or serialized information. It also
 *  stores the exit code of e.g. a \ref SystemCommandJob.
 *
 * \date 2012-05-13
 *
 */