Essay Example on Hybrid Scheduling Techniques That Use Static Allocation Policies









IV. HYBRID ALGORITHMS

Hybrid scheduling techniques combine a static allocation policy with a dynamic strategy that adjusts to deviations from the predicted timings. Such deviations arise from several real-time factors, such as prediction errors, resource failures, or concurrent applications. A hybrid scheme starts from an initial static mapping and relies on a dynamic policy to cope with runtime issues of communication and processing. For example, a dynamic scheduler can use the priorities defined by a static algorithm such as HEFT to decide which task to schedule first when more than one task becomes available. Static and dynamic strategies for the outer product computation are considered in [8, 9], and [5] focuses on the design and analysis of static, dynamic, and hybrid schemes for matrix multiplication. Several divide-and-conquer schemes exist for matrix multiplication, such as Strassen's algorithm, in which successive steps can be viewed as a sequence of phases, each made of independent tasks that share data.
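The idea of a dynamic scheduler that borrows static priorities can be sketched as follows. This is only an illustration under simplifying assumptions: the priority is a HEFT-style upward rank computed from estimated task costs alone (communication costs are ignored), a single processor is assumed, and the task graph, the cost values, and the names `upward_ranks` and `hybrid_order` are hypothetical.

```python
from functools import lru_cache


def upward_ranks(succ, cost):
    """Static priority: a task's cost plus the longest chain of costs below it.

    succ maps each task to the list of its successors; cost maps each task to
    an estimated execution time. Communication costs are ignored (assumption).
    """
    @lru_cache(maxsize=None)
    def rank(task):
        return cost[task] + max((rank(s) for s in succ[task]), default=0.0)

    return {t: rank(t) for t in succ}


def hybrid_order(succ, cost):
    """Dynamic loop: tasks become ready at run time, and the choice among
    ready tasks is made with the static ranks computed beforehand."""
    ranks = upward_ranks(succ, cost)
    pending = {t: 0 for t in succ}  # number of unfinished predecessors
    for t in succ:
        for s in succ[t]:
            pending[s] += 1
    ready = [t for t in succ if pending[t] == 0]
    order = []
    while ready:
        task = max(ready, key=lambda t: ranks[t])  # static priority decides
        ready.remove(task)
        order.append(task)
        for s in succ[task]:  # completing a task may release its successors
            pending[s] -= 1
            if pending[s] == 0:
                ready.append(s)
    return order


# Hypothetical four-task graph: a -> c, b -> c, b -> d
succ = {"a": ["c"], "b": ["c", "d"], "c": [], "d": []}
cost = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 5.0}
order = hybrid_order(succ, cost)  # ['b', 'd', 'a', 'c']
```

Tasks a and b are both ready at the start. The static rank of b (7.0) exceeds that of a (4.0), so b runs first even though which tasks are ready is only discovered at run time.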

This context is well suited to comparing static and dynamic schemes and to defining a hybrid scheme that takes advantage of the best of both worlds. Analyzing the performance of an allocation strategy is difficult because of the large number of parameters, and designing a dynamic strategy that accounts for data reuse is a tedious task. The proposed hybrid strategy will consider varying processor speeds and use information on the performance of the resources to define a static allocation scheme whose decisions can be modified at runtime, based on the state of both the system and the applications. In environments where independent tasks work on independent data, replication strategies are used to achieve a good makespan (overall completion time) while avoiding too many executing tasks. The decision to allocate a task should be based on the estimated processing speed of the resources and on the location of the data, so that unnecessary communication costs are avoided. In the context of matrix multiplication, many studies have focused on comparing different schedulers for dense computations on heterogeneous systems [11]. Hybrid techniques for the scheduling problem with precedence constraints are proposed and analyzed in [10], and [12] analyzes the communication cost incurred by matrix multiplication when memory is limited.
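A data-aware dynamic allocation of this kind can be sketched with a small simulation. The sketch is not one of the schedulers studied in the cited works. It assumes an outer product made of n*n unit tasks (i, j), each needing blocks a[i] and b[j], and it assumes that communication time is hidden behind computation and that a processor keeps every block it has received. The function name `data_aware_schedule` and the speed values are hypothetical.

```python
import heapq


def data_aware_schedule(n, speeds):
    """Simulate a dynamic allocation of the n*n tasks (i, j) of an outer product.

    Whenever a processor becomes idle, it takes the remaining task that needs
    the fewest blocks it does not already hold. Faster processors become idle
    more often and therefore receive more tasks.
    Returns (makespan, total number of blocks sent).
    """
    remaining = {(i, j) for i in range(n) for j in range(n)}
    rows = [set() for _ in speeds]  # a-blocks held by each processor
    cols = [set() for _ in speeds]  # b-blocks held by each processor
    # Event queue of (time at which the processor becomes idle, processor index)
    idle = [(0.0, p) for p in range(len(speeds))]
    heapq.heapify(idle)
    sent = 0
    makespan = 0.0
    while remaining:
        now, p = heapq.heappop(idle)

        def missing(task):
            i, j = task
            return (i not in rows[p]) + (j not in cols[p])

        # sorted() only makes the tie-breaking deterministic
        task = min(sorted(remaining), key=missing)
        remaining.remove(task)
        sent += missing(task)
        rows[p].add(task[0])
        cols[p].add(task[1])
        finish = now + 1.0 / speeds[p]  # unit task on a processor of given speed
        makespan = max(makespan, finish)
        heapq.heappush(idle, (finish, p))
    return makespan, sent


# Hypothetical platform: one processor twice as fast as the other
makespan, sent = data_aware_schedule(4, [1.0, 2.0])
```

With a single processor, every block is sent exactly once, so the communication volume is 2n. With several processors, the greedy choice keeps the volume between 2n and 2n times the number of processors.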

V. ALGORITHM DESCRIPTION

Hybrid scheduling techniques are used to analyze some very challenging scheduling problems with predefined constraints. In the proposed scheme, we use independent tasks that share data, and we analyze both the makespan (overall completion time) and the communication cost. The amount of communication required to perform matrix multiplication has been analyzed in [18], where a lower bound on the communication cost is derived. Static algorithms for matrix multiplication that meet these lower bounds have been designed. We will analyze hybrid schemes that match the lower bounds on communication cost while behaving well in practice. The objective of the static algorithm discussed here is to balance the computation between the processors in order to reach the optimal makespan while reducing the amount of communication. Before the first partial product, we assume a static heuristic that partitions the tasks between the processors according to an estimate of their speeds. Let us assume that the areas allocated to the different processors are rectangles, i.e., $W_k = m_{\text{line}} \times m_{\text{row}}$. The volume of communication $C_k$ of processor $k$ then corresponds to the half perimeter of the rectangle $W_k$, and its processing cost is proportional to the area of $W_k$. The total amount of communication is $C = N \sum_k C_k + N^2$: the factor $N$ comes from the $N$ phases, and the $N^2$ term represents the cost of transferring the matrix $C$ at the end of the computation.
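The cost model above (half perimeter per rectangle, N phases, plus N^2 for the final transfer of the result matrix) can be checked on a small case. The sketch below assumes that rectangle sides are measured in blocks of an N x N block matrix. The helper names and the two example partitions are illustrative and not taken from the source.

```python
def half_perimeter(m_line, m_row):
    """Per-phase communication volume of one rectangle: height plus width."""
    return m_line + m_row


def total_communication(rectangles, n):
    """Total volume: n phases times the sum of half perimeters, plus n^2."""
    per_phase = sum(half_perimeter(h, w) for h, w in rectangles)
    return n * per_phase + n * n


def target_areas(speeds):
    """Share of the work for each processor, proportional to its estimated speed."""
    total = sum(speeds)
    return [s / total for s in speeds]


n = 4
squares = [(2, 2)] * 4  # four 2 x 2 rectangles tiling the 4 x 4 block matrix
strips = [(1, 4)] * 4   # four 1 x 4 rectangles tiling the same matrix

c_squares = total_communication(squares, n)  # 4 * 16 + 16 = 80
c_strips = total_communication(strips, n)    # 4 * 20 + 16 = 96
shares = target_areas([1.0, 1.0, 2.0])       # [0.25, 0.25, 0.5]
```

Both partitions give every processor the same area, hence the same processing cost, but the square-shaped rectangles communicate less. This is why the static heuristic looks for rectangles with a small perimeter.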

The major problem is to achieve a perfect load balance, which can be addressed by partitioning a square into rectangles of prescribed areas. This problem has already been studied in [15, 16, 17]. In [15], the proposed COLUMNBASED algorithm takes a set of target values $S_i$ summing to 1 and returns a partition of the unit square into rectangles, each of area $S_i$, while aiming to minimize the total perimeter of the rectangles. COLUMNBASED has been proven to be a 7/4-approximation algorithm, and in practice its approximation ratio is often below 1.1. Nagamochi et al. propose a different algorithm, DIVIDEANDCONQUER, whose approximation ratio is 5/4 and which is as efficient in practice. A slight variation of DIVIDEANDCONQUER achieves an approximation ratio of $2/\sqrt{3}$ when the areas of the rectangles are mostly balanced [17].

These approximation ratios are of interest for our problem. However, because of rounding errors, STATICDIVIDEANDCONQUER turns out to perform relatively poorly with respect to makespan minimization, even when processing speeds are known and constant over time. As expected, the communication ratio is much better than the worst-case bound of 5/4, but the makespan ratio, which would be 1 if no rounding were needed, is much worse than expected. In particular, the makespan ratio can be as high as 1.38 on heterogeneous platforms. Moreover, the situation is observed to get worse as processors are added. The reason behind these results is that COLUMNBASED and DIVIDEANDCONQUER are designed for the continuous case. On the other hand, the block size needs to represent a good trade-off between a large granularity, to fully exploit accelerators such as GPUs, and a fine granularity, to behave well on regular cores; block sizes of order 1000 are therefore required. For a relatively small number of blocks, COLUMNBASED and DIVIDEANDCONQUER return the same optimal results.
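To make the continuous partitioning problem concrete, here is a simplified sketch in the spirit of a column-based partition. It is not a reproduction of the algorithm of [15]: it assumes that the columns are contiguous groups of the sorted areas and picks the grouping with a small dynamic program. The function names are hypothetical.

```python
import math


def column_partition(areas):
    """Split the sorted areas into contiguous columns of the unit square.

    A column holding k rectangles of total area w has width w and height 1,
    so its rectangles contribute k * w + 1 to the sum of half perimeters.
    Returns (cost, columns), where columns is a list of lists of areas.
    """
    a = sorted(areas)
    n = len(a)
    prefix = [0.0]
    for x in a:
        prefix.append(prefix[-1] + x)
    best = [0.0] + [float("inf")] * n  # best[i]: minimal cost for a[:i]
    cut = [0] * (n + 1)                # cut[i]: start of the last column of a[:i]
    for i in range(1, n + 1):
        for j in range(i):
            cost = best[j] + (i - j) * (prefix[i] - prefix[j]) + 1.0
            if cost < best[i]:
                best[i], cut[i] = cost, j
    columns, i = [], n
    while i > 0:  # walk the cuts backwards to recover the columns
        columns.append(a[cut[i]:i])
        i = cut[i]
    columns.reverse()
    return best[n], columns


def perimeter_lower_bound(areas):
    """Sum of half perimeters if every rectangle were a square of its area."""
    return sum(2.0 * math.sqrt(s) for s in areas)


# Four processors of equal speed: two columns of two squares each
cost, columns = column_partition([0.25] * 4)  # cost == 4.0
ratio = cost / perimeter_lower_bound([0.25] * 4)  # 1.0
```

The sketch works with real-valued areas. Mapping the resulting rectangles onto a grid of blocks requires rounding their sides to integers, which is the step the text identifies as the source of load imbalance.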
