Design and configuration of distributed job processing systems
T. Risse. Darmstadt University of Technology, (2006)
Abstract
A key criterion in the design, procurement and use of computer systems is performance. Performance typically means the throughput and response time of a system. The effects of poorly performing systems range from dissatisfied users to high penalties for companies due to missed processing deadlines. As a result of continuously increasing hardware performance, companies often solve performance problems by replacing existing hardware with faster machines. One consequence can be that they achieve a performance increase, but the overall performance increase is less than expected. The reason for this is that the combination of hardware and software does not match. For system designers it would be helpful to have a systematic method which supports them in the design of new systems and in the extension of existing systems. The need for a systematic configuration method is motivated by a typical B2B application from the financial industry. Banks have to deal with several payment messages standards like EDIFACT or S.W.I.F.T. which have to be converted into the banks' internal representation for further processing. Such converters have to handle message size ranging from some 100 bytes to about 60 MB and have to fulfil certain performance requirements. To achieve the performance goals, identification of the hardware and software configuration is an important step in the implementation of a distributed message converter system. This thesis presents a systematic approach for the cost performance analysis of distributed job processing systems based on given requirements on throughput and system response time. Our method allows us to search for suitable configurations while minimizing the use of expensive methods for performance evaluation to the largest degree. The method is organized into a hardware and a software configuration step. For each of these configuration steps algorithms were developed. For the hardware configuration step we first approximate single host performance by a coarse model that requires few, inexpensive to obtain, key parameters. Based on it we perform the hardware selection and determine the workload distribution for the selected host configuration. The workload distribution and the hardware configuration are used to build a Layered Queueing Network model (LQN) of the complete system. It is used to determine a software configuration that actually achieves the performance that has been predicted given the hardware configuration. Since evaluations of the complete model are rather expensive, we use a greedy heuristic, which tries to minimize the number of model evaluations required. We have used our method to configure a large distributed system in order to demonstrate the scalability of the method. For a smaller system configuration we compared the predicted results with real system measurements. The verification on the real system shows that the method could be applied successfully to configure a distributed system to reach maximum performance. As we are using queueing networks for system performance modeling, our system configuration method is based on average system performance values. Hence runtime deviations are not covered during the system design phase and have to be handled during runtime by a scheduler to distribute incoming jobs in an optimal way among the hosts. In our case of the EDI message converter it turned out that the standard online scheduling method doesn't fulfil all requirements. Hence we adapted the Bin Stretching scheduling approach to fulfil the functional requirements of deadline processing and priority processing as well as the system performance requirements of low system response times and high system throughput. The algorithm behavior has been analyzed by simulation in different scenarios corresponding to different message distributions. The simulation results shows that the modified Bin-Stretching strategy generally gives better results than the well known list scheduling in the FCFS variety. We were also able to verify on our real message converter system the general good behavior of our algorithm.
%0 Thesis
%1 DBLP:phd/de/Risse2006
%A Risse, Thomas
%D 2006
%K configuration design myown performance_modeling processing queueingnetworks
%T Design and configuration of distributed job processing systems
%U https://tuprints.ulb.tu-darmstadt.de/id/eprint/665
%X A key criterion in the design, procurement and use of computer systems is performance. Performance typically means the throughput and response time of a system. The effects of poorly performing systems range from dissatisfied users to high penalties for companies due to missed processing deadlines. As a result of continuously increasing hardware performance, companies often solve performance problems by replacing existing hardware with faster machines. One consequence can be that they achieve a performance increase, but the overall performance increase is less than expected. The reason for this is that the combination of hardware and software does not match. For system designers it would be helpful to have a systematic method which supports them in the design of new systems and in the extension of existing systems. The need for a systematic configuration method is motivated by a typical B2B application from the financial industry. Banks have to deal with several payment messages standards like EDIFACT or S.W.I.F.T. which have to be converted into the banks' internal representation for further processing. Such converters have to handle message size ranging from some 100 bytes to about 60 MB and have to fulfil certain performance requirements. To achieve the performance goals, identification of the hardware and software configuration is an important step in the implementation of a distributed message converter system. This thesis presents a systematic approach for the cost performance analysis of distributed job processing systems based on given requirements on throughput and system response time. Our method allows us to search for suitable configurations while minimizing the use of expensive methods for performance evaluation to the largest degree. The method is organized into a hardware and a software configuration step. For each of these configuration steps algorithms were developed. For the hardware configuration step we first approximate single host performance by a coarse model that requires few, inexpensive to obtain, key parameters. Based on it we perform the hardware selection and determine the workload distribution for the selected host configuration. The workload distribution and the hardware configuration are used to build a Layered Queueing Network model (LQN) of the complete system. It is used to determine a software configuration that actually achieves the performance that has been predicted given the hardware configuration. Since evaluations of the complete model are rather expensive, we use a greedy heuristic, which tries to minimize the number of model evaluations required. We have used our method to configure a large distributed system in order to demonstrate the scalability of the method. For a smaller system configuration we compared the predicted results with real system measurements. The verification on the real system shows that the method could be applied successfully to configure a distributed system to reach maximum performance. As we are using queueing networks for system performance modeling, our system configuration method is based on average system performance values. Hence runtime deviations are not covered during the system design phase and have to be handled during runtime by a scheduler to distribute incoming jobs in an optimal way among the hosts. In our case of the EDI message converter it turned out that the standard online scheduling method doesn't fulfil all requirements. Hence we adapted the Bin Stretching scheduling approach to fulfil the functional requirements of deadline processing and priority processing as well as the system performance requirements of low system response times and high system throughput. The algorithm behavior has been analyzed by simulation in different scenarios corresponding to different message distributions. The simulation results shows that the modified Bin-Stretching strategy generally gives better results than the well known list scheduling in the FCFS variety. We were also able to verify on our real message converter system the general good behavior of our algorithm.
@phdthesis{DBLP:phd/de/Risse2006,
abstract = {A key criterion in the design, procurement and use of computer systems is performance. Performance typically means the throughput and response time of a system. The effects of poorly performing systems range from dissatisfied users to high penalties for companies due to missed processing deadlines. As a result of continuously increasing hardware performance, companies often solve performance problems by replacing existing hardware with faster machines. One consequence can be that they achieve a performance increase, but the overall performance increase is less than expected. The reason for this is that the combination of hardware and software does not match. For system designers it would be helpful to have a systematic method which supports them in the design of new systems and in the extension of existing systems. The need for a systematic configuration method is motivated by a typical B2B application from the financial industry. Banks have to deal with several payment messages standards like EDIFACT or S.W.I.F.T. which have to be converted into the banks' internal representation for further processing. Such converters have to handle message size ranging from some 100 bytes to about 60 MB and have to fulfil certain performance requirements. To achieve the performance goals, identification of the hardware and software configuration is an important step in the implementation of a distributed message converter system. This thesis presents a systematic approach for the cost performance analysis of distributed job processing systems based on given requirements on throughput and system response time. Our method allows us to search for suitable configurations while minimizing the use of expensive methods for performance evaluation to the largest degree. The method is organized into a hardware and a software configuration step. For each of these configuration steps algorithms were developed. For the hardware configuration step we first approximate single host performance by a coarse model that requires few, inexpensive to obtain, key parameters. Based on it we perform the hardware selection and determine the workload distribution for the selected host configuration. The workload distribution and the hardware configuration are used to build a Layered Queueing Network model (LQN) of the complete system. It is used to determine a software configuration that actually achieves the performance that has been predicted given the hardware configuration. Since evaluations of the complete model are rather expensive, we use a greedy heuristic, which tries to minimize the number of model evaluations required. We have used our method to configure a large distributed system in order to demonstrate the scalability of the method. For a smaller system configuration we compared the predicted results with real system measurements. The verification on the real system shows that the method could be applied successfully to configure a distributed system to reach maximum performance. As we are using queueing networks for system performance modeling, our system configuration method is based on average system performance values. Hence runtime deviations are not covered during the system design phase and have to be handled during runtime by a scheduler to distribute incoming jobs in an optimal way among the hosts. In our case of the EDI message converter it turned out that the standard online scheduling method doesn't fulfil all requirements. Hence we adapted the Bin Stretching scheduling approach to fulfil the functional requirements of deadline processing and priority processing as well as the system performance requirements of low system response times and high system throughput. The algorithm behavior has been analyzed by simulation in different scenarios corresponding to different message distributions. The simulation results shows that the modified Bin-Stretching strategy generally gives better results than the well known list scheduling in the FCFS variety. We were also able to verify on our real message converter system the general good behavior of our algorithm.},
added-at = {2016-10-19T11:49:26.000+0200},
author = {Risse, Thomas},
bibsource = {dblp computer science bibliography, http://dblp.org},
biburl = {https://www.bibsonomy.org/bibtex/2005bd5d6ccda70d84278e949a8543269/trisse69},
interhash = {b3f7c71ee483ddcab41a839d44bb6348},
intrahash = {005bd5d6ccda70d84278e949a8543269},
keywords = {configuration design myown performance_modeling processing queueingnetworks},
school = {Darmstadt University of Technology},
timestamp = {2024-03-28T15:49:16.000+0100},
title = {Design and configuration of distributed job processing systems},
url = {https://tuprints.ulb.tu-darmstadt.de/id/eprint/665},
urn = {urn:nbn:de:tuda-tuprints-6654},
year = 2006
}