/*
 * Project: MoleCuilder
 * Description: creates and alters molecular systems
 * Copyright (C) 2012 University of Bonn. All rights reserved.
 * Please see the LICENSE file or "Copyright notice" in builder.cpp for details.
 */

/**
 * \file jobmarket.dox
 *
 * Created on: May 13, 2012
 * Author: heber
 */

/** \page jobmarket JobMarket
 *
 * This page explains the (Fragmentation) Automation framework. The framework is
 * meant to outsource all (server/client) operations that are required to actually
 * calculate all the fragments that are created by the \ref Fragmentation structure.
 * The general design follows a server/client/controller approach. Server and client
 * are external programs, whereas the controller is eventually merged into the main
 * MoleCuilder code.
 *
 * The fragments are handed as \ref FragmentJob's to a server implemented in
 * \ref FragmentScheduler. Many clients, implemented in \ref PoolWorker, may
 * connect to it and work on these jobs. When finished, they send back a
 * \ref FragmentResult. These results can be retrieved from the server. Sending
 * jobs, shutting down, getting results, and checking on present results is
 * done via a controller in \ref FragmentController.
 *
 * Technically, everything is implemented via boost::asio for synchronous and
 * asynchronous input/output operations. Also, boost::serialization is essential
 * to send and receive \ref FragmentJob's and \ref FragmentResult's over the net.
 *
 * A number of asynchronous operations (i.e. reads and writes) are combined
 * into a so-called \ref Operation. This can be either a \ref SyncOperation or
 * an \ref AsyncOperation. The latter needs a staggered list of callback
 * functions: as soon as one asynchronous operation is done, the called function
 * places the next operation into boost::asio's io_service.
 *
 * In the following we explain these structures in more detail.
 *
 * \section jobmarket-serverclientcontroller Server, Client, and Controller
 *
 * \subsection jobmarket-serverclientcontroller-server Server
 *
 * The main workload of the server is implemented in the \ref FragmentScheduler.
 * It listens on two ports: one for connecting workers, the other for a
 * controller.
 * This scheduler contains a pool of workers, \ref WorkerPool, and a queue of
 * jobs, \ref FragmentQueue. The former contains all clients and knows which
 * ones are busy and which are currently idle. The latter contains all
 * \ref FragmentJob's that are to be sent to idle clients and all
 * \ref FragmentResult's that have been received from formerly busy clients.
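 *
 * The interplay of pool and queue described above can be sketched as follows.
 * This is a minimal illustration, not the actual implementation: the names
 * JobSketch, WorkerPoolSketch, FragmentQueueSketch and dispatchOne() are
 * hypothetical, and the choice to hand out the oldest waiting job first is an
 * assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical stand-in for a FragmentJob: an id plus an opaque payload.
struct JobSketch {
  int id;
  std::string payload;
};

// Tracks enrolled workers by address and whether each one is busy.
class WorkerPoolSketch {
public:
  void enroll(const std::string& address) { busy_[address] = false; }
  void markIdle(const std::string& address) {
    auto it = busy_.find(address);
    if (it != busy_.end()) it->second = false;
  }
  // Picks some idle worker, marks it busy, and reports its address.
  bool takeIdle(std::string& address) {
    for (auto& entry : busy_) {
      if (!entry.second) {
        entry.second = true;
        address = entry.first;
        return true;
      }
    }
    return false;
  }
private:
  std::map<std::string, bool> busy_;
};

// Holds waiting jobs in arrival order and results keyed by job id.
class FragmentQueueSketch {
public:
  void pushJob(const JobSketch& job) { waiting_.push_back(job); }
  bool hasJob() const { return !waiting_.empty(); }
  JobSketch popJob() {
    assert(hasJob());
    JobSketch job = waiting_.front();
    waiting_.pop_front();
    return job;
  }
  void pushResult(int id, const std::string& output) { results_[id] = output; }
  std::size_t presentResults() const { return results_.size(); }
private:
  std::deque<JobSketch> waiting_;
  std::map<int, std::string> results_;
};

// One scheduling step: pair the oldest waiting job with an idle worker.
// Returns false and leaves the queue untouched if either is missing.
bool dispatchOne(WorkerPoolSketch& pool, FragmentQueueSketch& queue,
                 std::string& address, JobSketch& job) {
  if (!queue.hasJob()) return false;
  if (!pool.takeIdle(address)) return false;
  job = queue.popJob();
  return true;
}
```

 * In this sketch the queue is checked before a worker is taken, so a failed
 * step never leaves a worker marked busy without a job.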
 *
 * \subsection jobmarket-serverclientcontroller-client Client
 *
 * Clients are mainly implemented in \ref PoolWorker. They connect to a server
 * and enroll in its \ref WorkerPool. They listen on an individual port whose
 * address is sent to the server on enrollment. At any time the server may
 * contact the client and send it a job. There are two kinds of jobs:
 *  -# NoJob
 *  -# any other
 * \ref FragmentJob::NoJob just tells the client to shut down. Any other job
 * is worked on via FragmentJob::Work() and the result thereby created is sent
 * back to the server.
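 *
 * The client-side distinction between the two kinds of jobs might look like
 * the sketch below. All types here (ResultSketch, JobSketch, NoJobSketch,
 * EchoJobSketch) and the function handleJob() are hypothetical, and using a
 * sentinel id of -1 to mark NoJob is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: job id, captured output, and an exit code.
struct ResultSketch {
  int id;
  std::string output;
  int exitCode;
};

// Assumed sentinel id that marks the shutdown job.
const int NoJobId = -1;

// Hypothetical job base class with a virtual Work().
class JobSketch {
public:
  explicit JobSketch(int id) : id_(id) {}
  virtual ~JobSketch() {}
  int id() const { return id_; }
  virtual ResultSketch Work() const = 0;
private:
  int id_;
};

// The shutdown marker; its Work() is never expected to run.
class NoJobSketch : public JobSketch {
public:
  NoJobSketch() : JobSketch(NoJobId) {}
  ResultSketch Work() const override {
    assert(false && "NoJob must not be worked on");
    return ResultSketch{NoJobId, "", 1};
  }
};

// A trivial job for illustration that returns its text unchanged.
class EchoJobSketch : public JobSketch {
public:
  EchoJobSketch(int id, const std::string& text) : JobSketch(id), text_(text) {}
  ResultSketch Work() const override { return ResultSketch{id(), text_, 0}; }
private:
  std::string text_;
};

// Returns true if a result was produced and should be sent back to the
// server, false if the client should shut down instead.
bool handleJob(const JobSketch& job, ResultSketch& result) {
  if (job.id() == NoJobId) return false;
  result = job.Work();
  return true;
}
```

 * A client loop built on this would call handleJob() for every received job
 * and leave the loop as soon as it returns false.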
 *
 * \subsection jobmarket-serverclientcontroller-controller Controller
 *
 * The Controller is an external program that connects to the server via
 * a different port than the clients to give it individual commands.
 * The list of commands is as follows:
 *  -# createjobs: create a test job
 *  -# addjobs: create an (mpqc) job by reading a file
 *  -# getnextid: get a bunch of unique job ids from the server
 *  -# receiveresults: receive all currently present results
 *  -# checkresults: get information on waiting jobs and present results
 *  -# receivempqcresults: receive and combine all results as Mpqc jobs
 *  -# removeall: server should remove all workers from its pool
 *  -# shutdown: server should shut down if the pool is empty
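 *
 * As an illustration of how such command names could be mapped onto an enum,
 * consider the sketch below. CommandSketch, parseCommand() and mayShutDown()
 * are hypothetical names; the enumerators of the real \ref ControllerChoices
 * may be named and ordered differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical enum with one value per command listed above.
enum CommandSketch {
  CreateJobs,
  AddJobs,
  GetNextId,
  ReceiveResults,
  CheckResults,
  ReceiveMpqcResults,
  RemoveAll,
  Shutdown,
  UnknownCommand
};

// Translates a command name into its enum value.
CommandSketch parseCommand(const std::string& name) {
  static const std::map<std::string, CommandSketch> table = {
    {"createjobs", CreateJobs},
    {"addjobs", AddJobs},
    {"getnextid", GetNextId},
    {"receiveresults", ReceiveResults},
    {"checkresults", CheckResults},
    {"receivempqcresults", ReceiveMpqcResults},
    {"removeall", RemoveAll},
    {"shutdown", Shutdown}
  };
  const auto it = table.find(name);
  return it == table.end() ? UnknownCommand : it->second;
}

// Per the list above, a shutdown is only honored once the pool is empty.
bool mayShutDown(CommandSketch command, std::size_t workersInPool) {
  return command == Shutdown && workersInPool == 0;
}
```

 * With this sketch a controller would typically issue removeall first and
 * shutdown afterwards, since the latter is refused while workers are enrolled.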
 *
 * \subsection jobmarket-serverclientcontroller-operations Operation
 *
 * An \ref Operation is implemented as a functor, i.e. all internally
 * required information is given to the Operation in its constructor, and the
 * operator() function only receives information required for its
 * specific functionality (here the address to connect to).
 * An \ref Operation is a collection of reads and writes such that two
 * sides (e.g. client/server, controller/server) understand each other:
 * the reads of an Operation meet writes on the other side and vice versa.
 * Therefore, the operations are structured into three different groups:
 *  -# client
 *  -# controller
 *  -# server
 * They all simply derive from either \ref SyncOperation or \ref AsyncOperation
 * and implement an AsyncOperation::handle_connect() which is called by the
 * base class after the connection to the other side has been established.
 * More callback functions may be implemented in the derived class
 * depending on whether the asynchronous write or read needs to be followed
 * up by further operations. To finish the connection,
 * AsyncOperation::handle_FinishOperation() is called, which terminates the
 * operation and calls the callback handler for success or for failure.
 * This function must also be called in case of failure so that the
 * failure callback is invoked.
 *
 * The callback functions that are activated in case of success or failure
 * are given to the Operation in its constructor.
 *
 * Only the \ref WorkerAddress to connect to is given in AsyncOperation::operator().
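 *
 * The staggered callback chain can be illustrated without any networking.
 * In the sketch below, LoopSketch is a hypothetical stand-in for
 * boost::asio's io_service that merely runs queued handlers in order, and
 * OperationSketch is a hypothetical operation with the stages connect, write,
 * and finish. Treating an empty address as a failed connection is an
 * assumption made only to exercise the failure path.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for an io_service: handlers are queued, then run in order.
class LoopSketch {
public:
  void post(std::function<void()> handler) {
    handlers_.push_back(std::move(handler));
  }
  void run() {
    while (!handlers_.empty()) {
      std::function<void()> handler = std::move(handlers_.front());
      handlers_.pop_front();
      handler();
    }
  }
private:
  std::deque<std::function<void()> > handlers_;
};

// Hypothetical asynchronous operation with three stages.
class OperationSketch {
public:
  // Everything but the address is fixed at construction time.
  OperationSketch(LoopSketch& loop, std::function<void()> onSuccess,
                  std::function<void()> onFailure)
    : loop_(loop),
      onSuccess_(std::move(onSuccess)),
      onFailure_(std::move(onFailure)) {}

  // The call operator only receives the address to connect to.
  void operator()(const std::string& address) {
    address_ = address;
    loop_.post([this] { handle_connect(); });
  }

  std::vector<std::string> trace;  // records the order of the stages

private:
  void handle_connect() {
    trace.push_back("connect");
    if (address_.empty()) {
      // Failure path: still go through the finishing step.
      loop_.post([this] { handle_FinishOperation(false); });
    } else {
      loop_.post([this] { handle_write(); });
    }
  }
  void handle_write() {
    trace.push_back("write");
    loop_.post([this] { handle_FinishOperation(true); });
  }
  void handle_FinishOperation(bool success) {
    trace.push_back("finish");
    if (success) onSuccess_(); else onFailure_();
  }

  LoopSketch& loop_;
  std::function<void()> onSuccess_;
  std::function<void()> onFailure_;
  std::string address_;
};
```

 * Each stage only posts its successor and returns, so the caller of
 * operator() is never blocked; nothing happens until the loop is run.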
 *
 * \subsection jobmarket-serverclientcontroller-operationqueue Operations queue
 *
 * As operations are usually asynchronous, they should not keep the
 * executing code waiting. For this purpose there is an \ref OperationQueue.
 * New Operations are created in a straightforward manner and simply pushed
 * into the OperationQueue, which takes care of their sequential execution.
 * Both server and client have such an \ref OperationQueue.
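 *
 * A queue that runs pushed operations strictly one after another, without
 * blocking the caller, could be sketched as below. OperationQueueSketch is a
 * hypothetical name, and the convention that each operation receives a
 * "done" callback to signal its completion is an assumption of this sketch.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

class OperationQueueSketch {
public:
  typedef std::function<void()> Done;
  // An operation is started with a callback it must invoke when finished.
  typedef std::function<void(Done)> Op;

  // Enqueues the operation and starts it right away if nothing is running.
  void push(Op op) {
    pending_.push_back(std::move(op));
    if (!running_) startNext();
  }
  bool idle() const { return !running_; }

private:
  // Starts the oldest pending operation; its completion starts the next.
  void startNext() {
    if (pending_.empty()) {
      running_ = false;
      return;
    }
    running_ = true;
    Op op = std::move(pending_.front());
    pending_.pop_front();
    op([this] { startNext(); });
  }

  std::deque<Op> pending_;
  bool running_ = false;
};
```

 * push() returns immediately in either case; a second operation pushed while
 * the first is still in flight simply waits in the queue.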
 *
 * \subsection jobmarket-serverclientcontroller-listener Listener
 *
 * The \ref Listener is an important component for both the server and the
 * client, as each needs to listen on a specific port for incoming
 * connections. The Listener component implements this functionality. Incoming
 * requests are handled via a number of callback functions in much the same
 * way as with the Operation's.
 *
 * \subsubsection jobmarket-serverclientcontroller-listener-pool Pool listener
 *
 * The pool listener is the \ref Listener component of the client that listens
 * for incoming connections from the server that sends it jobs.
 *
 * \subsubsection jobmarket-serverclientcontroller-listener-worker Worker listener
 *
 * The \ref FragmentScheduler has two \ref Listener components. The first one
 * listens for incoming connections from clients. Here, they can enroll in and
 * also remove themselves from the worker pool contained in the server. The
 * client always first sends its own address, which is checked for presence in
 * the pool. Only afterwards may the client send a valid command.
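 *
 * The address-first handshake can be sketched as follows. The names
 * WorkerCommandSketch and WorkerListenerSketch are hypothetical, and the
 * validity rule used here (enrolling requires an unknown address, removing
 * requires a known one) is an assumption about what counts as a valid
 * command.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical commands a worker may send after its address.
enum WorkerCommandSketch { EnrollWorker, RemoveWorker };

class WorkerListenerSketch {
public:
  // The address is checked against the pool first; the command is applied
  // only if it is valid for that state. Returns true if it was applied.
  bool handle(const std::string& address, WorkerCommandSketch command) {
    const bool known = pool_.count(address) != 0;
    switch (command) {
      case EnrollWorker:
        if (known) return false;
        pool_.insert(address);
        return true;
      case RemoveWorker:
        if (!known) return false;
        pool_.erase(address);
        return true;
    }
    return false;
  }
  std::size_t size() const { return pool_.size(); }

private:
  std::set<std::string> pool_;
};
```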
 *
 * \subsubsection jobmarket-serverclientcontroller-listener-controller Controller listener
 *
 * The second \ref Listener component of the \ref FragmentScheduler listens for
 * incoming connections from the controller to execute its commands.
 * The controller initially sends one of the \ref ControllerChoices, i.e. an enum
 * that encodes a certain command, followed by further arguments such as serialized
 * jobs.
 *
 * \section jobmarket-jobsresults Jobs and Results
 *
 * \ref FragmentJob and \ref FragmentResult are the internal core of the
 * jobmarket framework. They are handed around via the boost::serialization
 * mechanism between controller, client, and server. Each job is uniquely
 * identified by a \ref JobId. The pool of job ids is managed by the
 * server and the ids are requested by the controller, which creates a
 * \ref FragmentJob and sends it to the server. \ref FragmentResult's are created
 * by the client that has worked successfully on a job and sends the result
 * back to the server. Eventually, the controller receives the results and
 * performs further computations (e.g. combination of energy and forces in the
 * case of our \ref Fragmentation jobs).
 *
 * \subsection jobmarket-jobsresults-jobs Jobs
 *
 * A \ref FragmentJob has a unique \ref JobId and a specific Work() operation
 * that tells the client what to do. SystemCommandJob is a typical derivation
 * that executes a system command on a temporarily created file, retrieves
 * its output, places it into a string, and sends it back as the result.
 * Specifically, it has a virtual \ref SystemCommandJob::extractResult() function
 * to extract a specific object from the result string of the system command
 * and to place it in serialized form into the \ref FragmentResult's string.
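 *
 * The role of the virtual extraction hook can be sketched as below. To stay
 * self-contained, the hypothetical SystemCommandJobSketch replays canned
 * output instead of actually running a command on a temporary file. The
 * derived EnergyJobSketch and the "Total energy:" output line are invented
 * for illustration, and the extracted value is stored as plain text rather
 * than in serialized form.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result type: job id, extracted output, and an exit code.
struct ResultSketch {
  int id;
  std::string output;
  int exitCode;
};

class SystemCommandJobSketch {
public:
  SystemCommandJobSketch(int id, const std::string& cannedOutput,
                         int cannedExitCode)
    : id_(id), cannedOutput_(cannedOutput), cannedExitCode_(cannedExitCode) {}
  virtual ~SystemCommandJobSketch() {}

  // A real job would run the command here; this sketch replays canned
  // output and passes it through the extraction hook.
  ResultSketch Work() const {
    return ResultSketch{id_, extractResult(cannedOutput_), cannedExitCode_};
  }

protected:
  // Default: hand the raw command output through unchanged.
  virtual std::string extractResult(const std::string& raw) const {
    return raw;
  }

private:
  int id_;
  std::string cannedOutput_;
  int cannedExitCode_;
};

// Hypothetical derivation that keeps only the value after "Total energy:".
class EnergyJobSketch : public SystemCommandJobSketch {
public:
  EnergyJobSketch(int id, const std::string& cannedOutput, int cannedExitCode)
    : SystemCommandJobSketch(id, cannedOutput, cannedExitCode) {}

protected:
  std::string extractResult(const std::string& raw) const override {
    const std::string key = "Total energy:";
    const std::string::size_type start = raw.find(key);
    if (start == std::string::npos) return "";
    std::istringstream rest(raw.substr(start + key.size()));
    std::string value;
    rest >> value;  // first whitespace-delimited token after the key
    return value;
  }
};
```

 * Only the extraction step differs between derived jobs in this sketch; the
 * surrounding Work() logic stays in the base class.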
 *
 * \subsection jobmarket-jobsresults-results Results
 *
 * A \ref FragmentResult has a unique \ref JobId that has to match a previously
 * present \ref FragmentJob. It contains a string that may hold either the
 * output of a job or serialized information. It also stores the exit code
 * of e.g. a \ref SystemCommandJob.
 *
 * \date 2012-05-18
 *
 */