/*
 * Project: MoleCuilder
 * Description: creates and alters molecular systems
 * Copyright (C) 2012 University of Bonn. All rights reserved.
 * Please see the LICENSE file or "Copyright notice" in builder.cpp for details.
 */

/**
 * \file jobmarket.dox
 *
 * Created on: May 13, 2012
 *     Author: heber
 */

/** \page jobmarket JobMarket
 *
 * This page explains the (Fragmentation) Automation framework. The framework
 * outsources all (server/client) operations required to actually calculate
 * the fragments that are created by the \ref Fragmentation structure.
 * The general design follows a server/client/controller approach. Server and
 * client are external programs, whereas the controller is eventually merged
 * into the main MoleCuilder code.
 *
 * The fragments are handed out as \ref FragmentJob's to a server, implemented
 * in \ref FragmentScheduler. Many clients, implemented in \ref PoolWorker, may
 * connect to it and work on these jobs. When finished, they send back a
 * \ref FragmentResult, which can then be retrieved from the server. Sending
 * jobs, shutting down, getting results, and checking on present results is
 * done via a controller in \ref FragmentController.
 *
 * Technically, everything is implemented via boost::asio for both
 * asynchronous and synchronous input/output operations. Also,
 * boost::serialization is essential to send and receive \ref FragmentJob's
 * and \ref FragmentResult's over the net.
 *
 * A number of asynchronous operations (i.e. reads and writes) are combined
 * into a so-called \ref Operation. This can be either a \ref SyncOperation or
 * an \ref AsyncOperation. The latter needs a staggered list of callback
 * functions: as soon as one asynchronous operation is done, the called
 * function places the next operation into boost::asio's io_service.
 *
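 * The staggered callbacks described above can be sketched without boost. This
 * is only an illustration: MiniIoService is a hypothetical stand-in for
 * boost::asio's io_service, and ChainedOperation with its handler names is
 * invented for this page, not taken from the JobMarket sources.
 *
```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for boost::asio's io_service: a FIFO of handlers
// that run() executes one after another.
class MiniIoService {
public:
  void post(std::function<void()> handler) {
    handlers_.push_back(std::move(handler));
  }
  void run() {
    while (!handlers_.empty()) {
      std::function<void()> next = std::move(handlers_.front());
      handlers_.pop_front();
      next();
    }
  }
private:
  std::deque<std::function<void()>> handlers_;
};

// Hypothetical asynchronous operation: every handler performs one step and
// then places the following step into the service instead of calling it.
class ChainedOperation {
public:
  explicit ChainedOperation(MiniIoService &service) : service_(service) {}
  void start() { service_.post([this] { handle_connect(); }); }
  const std::vector<std::string> &trace() const { return trace_; }
private:
  void handle_connect() {
    trace_.push_back("connect");
    service_.post([this] { handle_write(); });
  }
  void handle_write() {
    trace_.push_back("write");
    service_.post([this] { handle_read(); });
  }
  void handle_read() {
    trace_.push_back("read");
    service_.post([this] { handle_finish(); });
  }
  void handle_finish() { trace_.push_back("finish"); }

  MiniIoService &service_;
  std::vector<std::string> trace_;
};

// Runs one operation to completion and returns the order of its steps.
std::vector<std::string> runChainedOperation() {
  MiniIoService service;
  ChainedOperation operation(service);
  operation.start();
  assert(operation.trace().empty());  // nothing happens before run()
  service.run();
  return operation.trace();
}
```
 *
 * In this sketch no handler calls the next one directly. Each step returns to
 * the service loop first, which is what keeps the calling code from blocking.
 *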
 * In the following we explain these structures in more detail.
 *
 * \section jobmarket-serverclientcontroller Server, Client, and Controller
 *
 * \subsection jobmarket-serverclientcontroller-server Server
 *
 * The main workload of the server is implemented in the \ref FragmentScheduler.
 * It listens on two ports: one for connecting workers, the other for a
 * controller.
 * This scheduler contains a pool of workers, \ref WorkerPool, and a queue of
 * jobs, \ref FragmentQueue. The former contains all clients and knows which
 * ones are busy and which are currently idle. The latter contains all
 * \ref FragmentJob's that are to be sent to idle clients and all
 * \ref FragmentResult's that have been received from formerly busy clients.
 *
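 * The interplay of pool and queue might look roughly as follows. This is a
 * minimal sketch under stated assumptions: SketchWorkerPool,
 * SketchFragmentQueue, and assignJobs() are hypothetical names, jobs are
 * reduced to integer ids, and the real classes certainly carry more state.
 *
```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical pool: remembers for every enrolled worker address whether
// the worker is currently busy.
class SketchWorkerPool {
public:
  void enroll(const std::string &address) { busy_[address] = false; }
  bool hasIdle() const {
    for (const auto &entry : busy_)
      if (!entry.second) return true;
    return false;
  }
  // Marks the first idle worker as busy and returns its address.
  std::string takeIdle() {
    for (auto &entry : busy_) {
      if (!entry.second) {
        entry.second = true;
        return entry.first;
      }
    }
    assert(false && "takeIdle() requires hasIdle()");
    return std::string();
  }
  void markIdle(const std::string &address) { busy_.at(address) = false; }
private:
  std::map<std::string, bool> busy_;
};

// Hypothetical queue: job ids waiting for a worker plus results by job id.
class SketchFragmentQueue {
public:
  void pushJob(int id) { waiting_.push_back(id); }
  bool hasJob() const { return !waiting_.empty(); }
  int popJob() {
    const int id = waiting_.front();
    waiting_.pop_front();
    return id;
  }
  void pushResult(int id, const std::string &output) { results_[id] = output; }
  std::size_t waitingJobs() const { return waiting_.size(); }
  std::size_t presentResults() const { return results_.size(); }
private:
  std::deque<int> waiting_;
  std::map<int, std::string> results_;
};

// Hands waiting jobs to idle workers; returns which worker got which job.
std::map<std::string, int> assignJobs(SketchWorkerPool &pool,
                                      SketchFragmentQueue &queue) {
  std::map<std::string, int> assignment;
  while (pool.hasIdle() && queue.hasJob())
    assignment[pool.takeIdle()] = queue.popJob();
  return assignment;
}
```
 *
 * A worker that has returned its result would be marked idle again via
 * markIdle(), after which the next call to assignJobs() can give it a new job.
 *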
 * \subsection jobmarket-serverclientcontroller-client Client
 *
 * Clients are mainly implemented in \ref PoolWorker. They connect to a server
 * and enroll in its \ref WorkerPool. Each client listens on an individual port
 * whose address is sent to the server on enrollment. At any time the server
 * may contact the client and send it a job. There are two kinds of jobs:
 * -# NoJob
 * -# any other
 *
 * \ref FragmentJob::NoJob just tells the client to shut down. Any other job
 * is worked on via FragmentJob::Work() and the result created thereby is sent
 * back to the server.
 *
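 * The client's handling of the two kinds of jobs can be illustrated as
 * follows. The sketch assumes that NoJob can be modelled as a reserved job id
 * (kNoJobId), which may differ from the actual representation. SketchJob,
 * SketchResult, work(), and runWorker() are hypothetical names.
 *
```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption: NoJob is modelled here as a reserved job id.
const int kNoJobId = -1;

struct SketchJob {
  int id;
  std::string input;
};

struct SketchResult {
  int id;
  std::string output;
};

// Stand-in for FragmentJob::Work(): turns the job's input into a result
// that carries the same id as the job.
SketchResult work(const SketchJob &job) {
  assert(job.id != kNoJobId);
  SketchResult result = {job.id, "processed " + job.input};
  return result;
}

// Handles jobs in arrival order and stops at the first NoJob. Returns the
// results that would be sent back to the server.
std::vector<SketchResult> runWorker(const std::vector<SketchJob> &incoming) {
  std::vector<SketchResult> sent;
  for (const SketchJob &job : incoming) {
    if (job.id == kNoJobId) break;  // shutdown request, stop working
    sent.push_back(work(job));
  }
  return sent;
}
```
 *
 * Jobs that arrive after the NoJob are never worked on in this sketch, since
 * the client has already shut down.
 *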
 * \subsection jobmarket-serverclientcontroller-controller Controller
 *
 * The Controller is an external program that connects to the server via
 * a different port than the clients to give it individual commands.
 * The list of commands is as follows:
 * -# createjobs: create a test job
 * -# addjobs: create an (mpqc) job by reading a file
 * -# getnextid: get a bunch of unique job ids from the server
 * -# receiveresults: receive all currently present results
 * -# checkresults: get information on waiting jobs and present results
 * -# receivempqcresults: receive and combine all results as Mpqc jobs
 * -# removeall: server should remove all workers from its pool
 * -# shutdown: server should shut down if the pool is empty
 *
 * \subsection jobmarket-serverclientcontroller-operations Operation
 *
 * An \ref Operation is implemented as a functor, i.e. all internally
 * required information is given to the Operation in its constructor, while
 * the operator() function only receives information required for its
 * specific functionality (here the address to connect to).
 * An \ref Operation is a collection of reads and writes such that two
 * sides (e.g. client/server, controller/server) understand each other:
 * the reads of the Operation meet writes on the other side and vice versa.
 * Therefore, the operations are structured into three different groups:
 * -# client
 * -# controller
 * -# server
 *
 * They all simply derive from either \ref SyncOperation or \ref AsyncOperation
 * and implement AsyncOperation::handle_connect(), which is called by the
 * base class after the connection to the other side has been established.
 * More callback functions may be implemented in the derived class,
 * depending on whether the asynchronous write or read needs to be followed
 * up by further operations. To finish the connection,
 * AsyncOperation::handle_FinishOperation() is called, which terminates the
 * operation and calls the callback handler for success or failure.
 * This function must also be called in case of failure so that the
 * failure callback is invoked.
 *
 * The callback functions that are activated in case of success or failure
 * are given to the Operation in its constructor.
 *
 * Only the \ref WorkerAddress to connect to is given in AsyncOperation::operator().
 *
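 * The functor layout described in this subsection can be sketched as below.
 * The class and member names (SketchOperation, SketchSendJobOperation,
 * SketchAddress) are hypothetical, networking is replaced by trivial checks,
 * and the signatures of the real AsyncOperation may well differ.
 *
```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for WorkerAddress.
struct SketchAddress {
  std::string host;
  std::string service;
};

// Base class: callbacks are fixed in the constructor, only the address is
// passed to operator().
class SketchOperation {
public:
  typedef std::function<void()> Callback;

  SketchOperation(Callback onSuccess, Callback onFailure)
      : onSuccess_(std::move(onSuccess)),
        onFailure_(std::move(onFailure)),
        finished_(false) {}
  virtual ~SketchOperation() {}

  void operator()(const SketchAddress &address) {
    // A real operation would start an asynchronous connect here. The sketch
    // treats an empty host name as a failed connection.
    if (address.host.empty()) {
      handle_FinishOperation(false);
      return;
    }
    handle_connect(address);
  }

protected:
  // Implemented by derived operations; called once the connection exists.
  virtual void handle_connect(const SketchAddress &address) = 0;

  // Must be reached on success and on failure, exactly once.
  void handle_FinishOperation(bool success) {
    assert(!finished_);
    finished_ = true;
    if (success) onSuccess_(); else onFailure_();
  }

private:
  Callback onSuccess_;
  Callback onFailure_;
  bool finished_;
};

// Derived operation that pretends to write a serialized job.
class SketchSendJobOperation : public SketchOperation {
public:
  SketchSendJobOperation(std::string job, Callback onSuccess, Callback onFailure)
      : SketchOperation(std::move(onSuccess), std::move(onFailure)),
        job_(std::move(job)) {}
protected:
  void handle_connect(const SketchAddress &) override {
    // The pretended write succeeds whenever there is something to send.
    handle_FinishOperation(!job_.empty());
  }
private:
  std::string job_;
};
```
 *
 * Routing both outcomes through handle_FinishOperation() keeps the decision
 * of which callback to invoke in a single place.
 *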
 * \subsection jobmarket-serverclientcontroller-operationqueue Operations queue
 *
 * As operations are usually asynchronous, they should not keep the
 * executing code waiting. For this purpose there is an \ref OperationQueue.
 * New Operations are created in a straightforward manner and simply pushed
 * into the OperationQueue, which takes care of executing them sequentially.
 * Both server and client have such an \ref OperationQueue.
 *
 * \subsection jobmarket-serverclientcontroller-listener Listener
 *
 * The \ref Listener is an important component for both the server and the
 * client, as each needs to listen on a specific port for incoming
 * connections. The Listener component implements this functionality.
 * Incoming requests are handled via a number of callback functions, in much
 * the same way as with the Operation's.
 *
 * \subsubsection jobmarket-serverclientcontroller-listener-pool Pool listener
 *
 * The pool listener is the \ref Listener component of the client that listens
 * for incoming connections from the server that sends it jobs.
 *
 * \subsubsection jobmarket-serverclientcontroller-listener-worker Worker listener
 *
 * The \ref FragmentScheduler has two \ref Listener components. The first
 * listens for incoming connections from clients. Here, clients can enroll in
 * and also remove themselves from the worker pool contained in the server.
 * The client always first sends its own address, and the server checks
 * whether this address is contained in the pool. Only afterwards may the
 * client send a valid command.
 *
 * \subsubsection jobmarket-serverclientcontroller-listener-controller Controller listener
 *
 * The second \ref Listener component of the \ref FragmentScheduler listens for
 * incoming connections from the controller to execute its commands.
 * The controller initially sends one of the \ref ControllerChoices, i.e. an
 * enum that encodes a certain command, followed by further arguments such as
 * serialized jobs.
 *
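 * The "command first, arguments afterwards" layout can be illustrated with a
 * plain text encoding. Everything here is an assumption made for the sketch:
 * the enumerator names and their order, the SketchRequest type, and the
 * whitespace-separated wire format. The real code sends the enum and the
 * arguments via boost::serialization.
 *
```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical enum mirroring the command list; names and values are
// assumptions, not the actual ControllerChoices.
enum SketchControllerChoice {
  CreateJobs,
  AddJobs,
  GetNextId,
  ReceiveResults,
  CheckResults,
  ReceiveMpqcResults,
  RemoveAll,
  Shutdown
};

struct SketchRequest {
  SketchControllerChoice choice;
  std::vector<std::string> arguments;
};

// Writes the command first, then all arguments separated by blanks.
// Arguments must not contain blanks in this simple format.
std::string encodeRequest(const SketchRequest &request) {
  std::ostringstream out;
  out << static_cast<int>(request.choice);
  for (const std::string &argument : request.arguments) {
    assert(argument.find(' ') == std::string::npos);
    out << ' ' << argument;
  }
  return out.str();
}

// Reads the command first and rejects unknown values before looking at any
// argument. Returns false if the message is not a valid request.
bool decodeRequest(const std::string &message, SketchRequest &request) {
  std::istringstream in(message);
  int value = -1;
  if (!(in >> value) || value < CreateJobs || value > Shutdown) return false;
  request.choice = static_cast<SketchControllerChoice>(value);
  request.arguments.clear();
  std::string argument;
  while (in >> argument) request.arguments.push_back(argument);
  return true;
}
```
 *
 * Because the command comes first, the receiving side can decide how many
 * and what kind of arguments to expect before reading them.
 *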
 * \section jobmarket-jobsresults Jobs and Results
 *
 * \ref FragmentJob and \ref FragmentResult are the internal core of the
 * jobmarket framework. They are handed around between controller, client,
 * and server via the boost::serialization mechanism. Each job is uniquely
 * identified by a \ref JobId. The pool of job ids is managed by the
 * server, and the ids are requested by the controller, who creates a
 * \ref FragmentJob and sends it to the server. \ref FragmentResult's are
 * created by the client who has worked successfully on a job and who sends
 * the result back to the server. Eventually, the controller receives the
 * results and performs further computations (e.g. combination of energy and
 * forces in the case of our \ref Fragmentation jobs).
 *
 * \subsection jobmarket-jobsresults-jobs Jobs
 *
 * A \ref FragmentJob has a unique \ref JobId and a specific Work() operation
 * that tells the client what to do. SystemCommandJob is a typical derivation
 * that executes a system command on a temporarily created file, retrieves
 * its output, places it into a string, and sends it back as the result.
 * Specifically, it has a virtual \ref SystemCommandJob::extractResult() function
 * to extract a specific object from the result string of the system command
 * and to place it in serialized form into the \ref FragmentResult's string.
 *
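 * The division of labour between Work() and extractResult() might be
 * sketched as follows. To keep the example self-contained, the system
 * command is replaced by an injected Runner function, which is an assumption
 * of this sketch. All class names and the "energy:" output format are
 * hypothetical as well.
 *
```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

struct SketchFragmentResult {
  int id;
  int exitCode;
  std::string payload;
};

class SketchSystemCommandJob {
public:
  // Stand-in for running a system command on a temporary file: receives
  // the input, fills the output string, and returns the exit code.
  typedef std::function<int(const std::string &, std::string &)> Runner;

  SketchSystemCommandJob(int id, std::string input, Runner runner)
      : id_(id), input_(std::move(input)), runner_(std::move(runner)) {
    assert(runner_);
  }
  virtual ~SketchSystemCommandJob() {}

  // Runs the command, lets the derived class pick what it needs from the
  // output, and packs everything into a result with the job's id.
  SketchFragmentResult Work() const {
    std::string output;
    const int exitCode = runner_(input_, output);
    SketchFragmentResult result = {id_, exitCode, extractResult(output)};
    return result;
  }

protected:
  // Default: pass the complete command output through unchanged.
  virtual std::string extractResult(const std::string &output) const {
    return output;
  }

private:
  int id_;
  std::string input_;
  Runner runner_;
};

// Derived job that keeps only the value following an "energy:" marker.
class SketchEnergyJob : public SketchSystemCommandJob {
public:
  SketchEnergyJob(int id, std::string input, Runner runner)
      : SketchSystemCommandJob(id, std::move(input), std::move(runner)) {}
protected:
  std::string extractResult(const std::string &output) const override {
    const std::string marker = "energy:";
    const std::string::size_type start = output.find(marker);
    if (start == std::string::npos) return std::string();
    const std::string::size_type begin =
        output.find_first_not_of(' ', start + marker.size());
    if (begin == std::string::npos) return std::string();
    const std::string::size_type end = output.find('\n', begin);
    return output.substr(
        begin, end == std::string::npos ? std::string::npos : end - begin);
  }
};
```
 *
 * The base class owns the control flow, so a new kind of job only needs to
 * override extractResult() to change what ends up in the result string.
 *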
 * \subsection jobmarket-jobsresults-results Results
 *
 * A \ref FragmentResult has a unique \ref JobId that has to match a
 * previously present \ref FragmentJob. It contains a string that may hold
 * either the output of a job or serialization information. It also
 * stores the exit code of e.g. a \ref SystemCommandJob.
 *
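 * The bookkeeping implied by unique job ids and matching results can be
 * sketched as follows. SketchJobBook is a hypothetical helper. It assumes
 * that ids are consecutive integers handed out by the server and that at
 * most one result is stored per job, neither of which is confirmed by the
 * text above.
 *
```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

class SketchJobBook {
public:
  SketchJobBook() : nextId_(0) {}

  // Returns `count` ids that have never been handed out before.
  std::vector<int> reserveIds(std::size_t count) {
    std::vector<int> ids;
    for (std::size_t i = 0; i < count; ++i) ids.push_back(nextId_++);
    return ids;
  }

  // Registers a job under a previously reserved id.
  bool addJob(int id) {
    if (id < 0 || id >= nextId_) return false;  // id was never handed out
    return jobs_.insert(id).second;             // false if already present
  }

  // Stores a result only if a job with the same id is present and has no
  // result yet.
  bool addResult(int id, int exitCode, const std::string &output) {
    if (jobs_.count(id) == 0 || results_.count(id) != 0) return false;
    results_[id] = Entry{exitCode, output};
    return true;
  }

  // Exit code of a stored result; the result must exist.
  int exitCodeOf(int id) const {
    assert(results_.count(id) == 1);
    return results_.find(id)->second.exitCode;
  }

private:
  struct Entry {
    int exitCode;
    std::string output;
  };

  int nextId_;
  std::set<int> jobs_;
  std::map<int, Entry> results_;
};
```
 *
 * Rejecting results without a matching job keeps stray or duplicate replies
 * from showing up among the results the controller later retrieves.
 *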
 * \date 2012-05-18
 *
 */