CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE. Concurrency Computat.: Pract. Exper. (2015). Published online in Wiley Online Library.
Another scheduler is the dynamic priority scheduler, which utilizes parallel scheduling; that is, it applies any of the previous methods together with a priority condition. This condition differs from one algorithm to another, but most priority algorithms use job deadline, pricing, or other thresholds .
In 2012, Facebook introduced Corona , a new scheduling technique. Corona uses two management levels: (1) cluster manager: a single cluster manager keeps track of the resources and manages the free slots; (2) job tracker: a dedicated job tracker is assigned to each job. Corona relies on the requesting client to process small jobs in order to increase cluster performance.
The authors of  note that the number of jobs with small weights exceeds the number of jobs with large weights. Their algorithm therefore depends on job weights and uses a round-robin algorithm for scheduling. The idea of the round-robin algorithm is to partition the job queue into different sub-queues and to mark each sub-queue according to its weight. Improved Weighted Round Robin runs the jobs in an unweighted queue in the first round. After running the first task of each job, the scheduler classifies the jobs into weighted sub-queues according to their weights.
To maintain the fairness principle, the scheduler allocates the resources to the sub-queues in equal percentages, so that small-weighted sub-queues receive the same share as large-weighted sub-queues. Experimental results show that the scheduler has a significant impact on load balancing among sub-queues, which benefits the Hadoop platform as a whole.
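The sub-queue classification and equal allocation described above can be sketched as follows; the function names and the way weights are observed after a job's first task are illustrative assumptions, not the authors' implementation:

```python
from collections import defaultdict, deque

def classify_jobs(jobs, weight_of):
    """Partition jobs into sub-queues keyed by weight.

    jobs      -- iterable of job identifiers
    weight_of -- function returning the weight observed after a job's
                 first task has run (an illustrative assumption here)
    """
    sub_queues = defaultdict(deque)
    for job in jobs:
        sub_queues[weight_of(job)].append(job)
    return sub_queues

def allocate_slots(sub_queues, total_slots):
    """Give each sub-queue an equal share of the map slots,
    regardless of whether its weight class is small or large."""
    if not sub_queues:
        return {}
    share = total_slots // len(sub_queues)
    return {weight: share for weight in sub_queues}

queues = classify_jobs(["j1", "j2", "j3"],
                       weight_of=lambda j: 1 if j == "j3" else 2)
print(allocate_slots(queues, 10))  # two weight classes -> 5 slots each
```

Allocating by sub-queue rather than by individual job is what keeps the many small-weight jobs from crowding out the few large-weight ones.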
The authors of  observe that the order in which jobs are executed has an impact on the overall execution time of all jobs. They therefore concentrate on ordering jobs to achieve the best system performance.
They proposed the Balanced Pool algorithm, which applies the capacity scheduler principle on top of the Johnson algorithm. Balanced Pool partitions the jobs into two queues. It then partitions the resources into two pools, each keeping enough resources to execute its jobs, and applies the Johnson algorithm to each pool. This saves about 20% of the total job execution time.
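For reference, Johnson's rule orders jobs on a two-stage pipeline (here one can read the stages as map and reduce times) to minimize makespan. A minimal sketch, with illustrative job timings:

```python
def johnson_order(jobs):
    """Johnson's rule for a two-stage pipeline.

    jobs -- list of (name, stage1_time, stage2_time) tuples.
    Jobs whose first stage is shorter than their second go first,
    sorted ascending by stage-1 time; the remaining jobs go last,
    sorted descending by stage-2 time.
    """
    first = sorted((j for j in jobs if j[1] < j[2]), key=lambda j: j[1])
    last = sorted((j for j in jobs if j[1] >= j[2]),
                  key=lambda j: j[2], reverse=True)
    return [name for name, _, _ in first + last]

order = johnson_order([("a", 3, 6), ("b", 5, 2), ("c", 1, 2)])
print(order)  # ['c', 'a', 'b']
```

Balanced Pool's contribution is not the rule itself but applying it separately to two pools that each hold enough resources for their queue.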
The authors of  proposed a constraint-based scheduler that takes the job deadline as input. For each submitted job, it calculates the minimum number of tasks that are adequate to continue processing. It then decides either to continue the job or to reject it, to avoid wasting resources on jobs that cannot complete.
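A minimal sketch of such a deadline-based admission check; the formula is an illustrative assumption (how many parallel task slots are needed to finish the remaining work before the deadline), not the authors' exact model:

```python
import math

def min_tasks_needed(remaining_work, per_task_rate, time_to_deadline):
    """Minimum number of parallel tasks needed to finish
    remaining_work (e.g. input records) before the deadline, assuming
    each task processes per_task_rate records per second."""
    if time_to_deadline <= 0:
        return math.inf  # deadline already passed: cannot complete
    return math.ceil(remaining_work / (per_task_rate * time_to_deadline))

def admit(job, free_slots):
    """Continue the job only if the cluster can meet its deadline;
    otherwise reject it to avoid wasting resources."""
    needed = min_tasks_needed(job["work"], job["rate"], job["deadline"])
    return needed <= free_slots

job = {"work": 10_000, "rate": 50, "deadline": 20}  # needs 10 tasks
print(admit(job, free_slots=12))  # True
```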
The authors of  proposed a new technique, Purlieus, to improve data locality in a MapReduce framework that runs on clusters such as a cloud server. Purlieus provisions VMs in a data-locality-aware manner guided by job characteristics. It classifies jobs into three types, each with its own procedure for provisioning VMs.
The authors of  mathematically analyzed the impact of factors such as the number of tasks and nodes on data locality. They used the term 'goodness' for the percentage of local map tasks. The proposed optimal scheduler builds a cost matrix that assigns zero if a task is local and one otherwise. It then uses the linear sum assignment problem (LSAP) to build its scheduler, where LSAP assigns N items to N workers. In some cases the algorithm pads the matrix to make it square, which simplifies using LSAP to build the optimal scheduler.
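The LSAP step can be sketched with a tiny brute-force solver; the 0/1 locality costs come from the description above, while the solver itself is an illustrative stand-in (a real scheduler would use the Hungarian algorithm, which runs in polynomial time):

```python
from itertools import permutations

def locality_cost_matrix(tasks, nodes, is_local):
    """cost[i][j] = 0 if task i's input is local to node j, else 1."""
    return [[0 if is_local(t, n) else 1 for n in nodes] for t in tasks]

def solve_lsap(cost):
    """Brute-force LSAP on a small square matrix: returns the
    assignment (task index -> node index) with minimum total cost."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return list(best)

# Tasks t0..t2; t0's block lives on n1, t1's on n0, t2's on n2.
local = {("t0", "n1"), ("t1", "n0"), ("t2", "n2")}
cost = locality_cost_matrix(["t0", "t1", "t2"], ["n0", "n1", "n2"],
                            lambda t, n: (t, n) in local)
print(solve_lsap(cost))  # [1, 0, 2]: every task lands on a local node
```

The total cost of the returned assignment equals the number of non-local tasks, so minimizing it directly maximizes the 'goodness' percentage.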
The authors of  proposed the locality-aware reduce task scheduling algorithm, a new practical scheduler that improves MapReduce performance by scheduling the reduce phase. The scheduler waits until all mappers have finished. It then identifies a sweet spot combining all information about the sizes of the partitions produced by the mappers. Based on this information, the scheduler trades off between shuffling early and collecting more information.
The authors of  proposed a new scheduler that estimates execution time and prefetches input data before assigning new tasks to a computing node. The algorithm also proposes a new shuffle scheme to improve system performance. The predictive scheduler (another name for the prefetching scheduler) predicts which data blocks are needed and forwards them to the tasks that need them. The prediction module predicts three types of information: (1) the finish time of current tasks, (2) the waiting tasks to be assigned to a slave node, and (3) the time duration of the waiting tasks. The algorithm gives good results, especially in reducing input data transfers.
The authors in  proposed a new scheduler adapted for MapReduce clusters with multiple users. The scheduler relies on real-time prediction to reduce the data-locality weight of unprocessed tasks. It starts by predicting the resource consumption of unprocessed tasks from the currently processing tasks. It then calculates the data-locality weight for each job to avoid the small-job problem. In this way the algorithm tries to make better use of the MapReduce cluster resources; the experiments show promising results compared with the traditional MapReduce scheduler.
The LiPS system, introduced in , promises cost-efficient co-scheduling of data and tasks for MapReduce in a cloud environment using a linear programming technique.
A hierarchical MapReduce scheduler for hybrid data centers is presented in . It handles task scheduling in integrated virtualized and non-virtualized environments.
3. MAPREDUCE SCHEDULING ALGORITHMS
MapReduce framework is one of the major big data management systems introduced by Google in
2004. It is mainly based on parallel processing that exploits the massive computing infrastructure available, in order to handle many of the major issues introduced by the rising use of big data. The MapReduce framework provides features such as user-defined functions, automatic parallelization and distribution, fault tolerance, and high availability through data replication. Hadoop is an Apache open-source implementation of the MapReduce parallel framework. It hides the details of parallel processing and supports fault tolerance by replicating data onto multiple nodes. The MapReduce framework follows a master/slave architecture, where the job tracker is responsible for scheduling, monitoring, and re-executing failed tasks on slave machines, and the task tracker is responsible for processing the tasks assigned to it by the job tracker.
As is clear from the name of the MapReduce methodology, it deals with big data management in two phases: the map phase and the reduce phase. Because all the processing that involves the scheduling work takes place in the map phase, and because this paper evaluates a set of different schedulers (which are part of the map phase), we limit this paper to the map phase only. Figure 1 illustrates a high-level overview of the MapReduce methodology.
In this paper, a set of scheduling algorithms is considered and their performance is tested over a virtualized environment. Their results are then compared according to a number of criteria. The set of schedulers includes the FIFO scheduler, the MM scheduler, the Delay scheduler, and our proposed scheduler, the MTL. This set is chosen to span a group of well-known and widely used schedulers as well as schedulers with different architectures. The following briefly illustrates how each of these schedulers works.
3.1. First in first out scheduler
The first in first out (FIFO) scheduler, Hadoop's default, attempts to increase data locality [7, 23]. The FIFO scheduler works using a regular FIFO queue. Each loaded job is decomposed into a set of tasks, which are loaded into the queue one by one. The FIFO scheduler processes the first task submitted and assigns
Figure 1. MapReduce framework (Adapted from MapReduce Systems ).
it a free slot as one becomes available. After handling the first task, it handles the next task in the queue and keeps doing so until the last task finishes, in a first come, first served fashion. Hence, the jobs are processed in their order of arrival, first come, first served.
Looking at the internal details of this algorithm: a slave node has free map slots. Once a heartbeat is sent, the scheduler checks for local map tasks in the first job and assigns them to those free map slots. If the slave node has no data-local task, the scheduler assigns only one non-local map task to this node during that heartbeat interval.
The FIFO scheduler has several drawbacks that are important to discuss. Although the shared cluster appears to give a fair share to scheduled tasks, the scheduled tasks cannot come from another job for the current node. In other words, the node does not get a fair chance to exploit data locality. Another drawback of the FIFO scheduler is that running small ad hoc jobs on large clusters leads to low data locality. The following example illustrates this second disadvantage. Suppose a job of 10 map tasks is run on a cluster of 200 nodes. Every map task needs an input block, so the job needs 10 input blocks. If every input block exists on an individual node and has two additional replicas on different nodes, then the number of nodes containing input data reaches 30 at most. Hence, low data locality is achieved, because up to 170 nodes have no local input data to process.
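The arithmetic of this example can be checked directly (three copies per block, i.e. one original plus two replicas, matches the example above; the numbers are the example's, not measurements):

```python
def max_fraction_local(num_map_tasks, replication, cluster_size):
    """Upper bound on the fraction of cluster nodes that can hold a
    local input block: each of the job's blocks exists on at most
    `replication` distinct nodes."""
    nodes_with_data = min(num_map_tasks * replication, cluster_size)
    return nodes_with_data / cluster_size

# 10 map tasks, 3 copies of each block, 200 nodes:
print(max_fraction_local(10, 3, 200))  # 0.15 -> at most 30 of 200 nodes
```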
3.2. Matchmaking scheduler
The matchmaking (MM) algorithm  gives every slave node a chance to process local tasks. In fact, this scheduler gives every job a fair share of the cluster over a predefined time slot. Predefining the map slot helps prevent any greedy job from reserving resources for long periods, which might lead other jobs to starvation. Unlike the FIFO scheduler, the MM scheduler allows nodes
to process local tasks from other jobs in the queue, hence solving the main problem that FIFO suffers from. Looking at the internal details of the MM scheduler: when the scheduler does not find a local task in the first job, it searches for local map tasks for that node in the next job. If no further local tasks are found in this heartbeat, no non-local task is assigned, giving a fair share to all tasks. In the second heartbeat cycle, if a node still finds no local tasks for its free slots, then to avoid starvation the MM scheduler assigns exactly one non-local map task to that node per heartbeat interval. When a new job arrives, the MM algorithm clears all markers on the nodes and starts from the beginning, because the new job's tasks may exist on marked nodes.
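A simplified sketch of this per-heartbeat decision; the classes and the heartbeat counter are illustrative bookkeeping, and marker clearing on job arrival is omitted for brevity:

```python
class Job:
    def __init__(self, name, local_nodes, pending=1):
        self.name = name
        self.local_nodes = set(local_nodes)  # nodes holding this job's data
        self.pending_tasks = pending

class Node:
    def __init__(self, name):
        self.name = name
    def has_local_task(self, job):
        return self.name in job.local_nodes and job.pending_tasks > 0

def mm_assign(node, job_queue, heartbeat_count):
    """Matchmaking decision for one free map slot: scan every job for
    a local task; only after a second fruitless heartbeat assign a
    single non-local task to avoid starvation."""
    for job in job_queue:                  # unlike FIFO, scan all jobs
        if node.has_local_task(job):
            return job.name, True
    if heartbeat_count >= 2:               # second heartbeat, still nothing
        for job in job_queue:
            if job.pending_tasks > 0:
                return job.name, False     # exactly one non-local task
    return None                            # stay idle this heartbeat

jobs = [Job("j1", {"n2"}), Job("j2", {"n1"})]
print(mm_assign(Node("n1"), jobs, 1))  # ('j2', True): local task found
print(mm_assign(Node("n3"), jobs, 1))  # None: wait one more heartbeat
```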
The MM scheduling algorithm solves the problems that the FIFO scheduling algorithm suffers from. As mentioned earlier, it gives a fair share for all scheduled tasks by improving the data locality.
However, the MM scheduler suffers from a shortcoming in its execution time: it does not assign non-local tasks until a new heartbeat arrives, which costs more time on large clusters. Another shortcoming appears when new jobs arrive: the MM scheduler clears all markers and starts from the beginning. This is costly in frameworks with high job-arrival rates, such as Facebook's.
3.3. Delay scheduler
The Delay algorithm  is proposed to solve some of the problems of the FIFO scheduler. Specifically, like MM, the Delay scheduler gives every job a fair share of the cluster over a predefined time slot. Predefining the map slot helps prevent any greedy job from reserving resources for long periods of time, which might lead to job starvation. The main idea of the Delay algorithm is that when a node sends a heartbeat to request a map task and the first job cannot provide a local task, the scheduler delays that job and considers the next one. To ensure fairness among all scheduled jobs and avoid job starvation, a maximum delay time limits how long a job can be delayed.
This scheduler solves problems appearing in FIFO; however, it creates new ones. The first issue is efficiency: the Delay scheduler does not work well when map slots are freed slowly (i.e., map slots need more time to process map tasks). The second issue appears when no local task is found for a node: the scheduler assigns multiple non-local tasks to that node if it has multiple free slots in that heartbeat. Consequently, the Delay scheduler in some cases performs worse than the default FIFO.
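The skip-then-yield behavior can be sketched as follows; the skip counter `delays` and the per-skip accounting are illustrative assumptions (the original algorithm bounds delay by time, here approximated by a count of skipped scheduling opportunities):

```python
def delay_assign(node_has_local, job_queue, delays, max_delay):
    """Delay scheduling for one free slot.

    node_has_local -- function(job) -> True if the node holds the
                      job's input data
    delays         -- dict tracking how often each job was skipped
    A job without a local task is skipped until it has waited
    max_delay times, then it runs non-locally to avoid starvation.
    """
    for job in job_queue:
        if node_has_local(job):
            delays[job] = 0
            return job, True               # local task: run it now
        delays[job] = delays.get(job, 0) + 1
        if delays[job] > max_delay:        # waited long enough:
            delays[job] = 0
            return job, False              # run non-locally
    return None                            # all jobs delayed this round

delays = {}
print(delay_assign(lambda j: j == "j2", ["j1", "j2"], delays, 2))
# ('j2', True): j1 is skipped (delayed), j2 has local data
```

Note that this per-slot loop is exactly where the second drawback arises: with several free slots in one heartbeat, repeated calls can hand the same node several non-local tasks.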
3.4. Multithreading locality scheduler
The multithreading locality (MTL) scheduler uses a multithreaded architecture to perform all scheduling, where each thread is assigned to a predetermined block of the cluster. The MTL scheduler differs from all the previous schedulers in how it synchronizes these blocks, which is achieved through multithreaded scheduling. It applies multithreaded scheduling on a cluster of nodes in one scheduling direction, from the wait job queue to the cluster nodes (unlike the other schedulers, which work from the nodes to the job queue). This makes the scheduling more appropriate and feasible, because the MTL scheduler tries to find the best node for a corresponding job. In fact, it has an advantage over the other existing schedulers in solving data locality problems by parallel searching using multithreading techniques .
The MTL starts by dividing the cluster into N blocks, each containing a number of commodity machines to process and store input data. Each block is scheduled by a dedicated thread that schedules the jobs in the wait queue. Once a new job arrives at the cluster, the MapReduce scheduler contacts the Namenode to determine the rack that includes the largest proportion of data-local tasks for this job.
When a job is to be processed, the threads start searching their blocks' nodes for local map tasks, where each thread takes the information about the current task and searches within its own block.
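The parallel search can be sketched as one thread per cluster block scanning its own nodes for the task's data; this is an illustration of the idea, not the authors' implementation, and all names are assumptions:

```python
import threading
import queue

def mtl_search(blocks, data_nodes):
    """One thread per cluster block scans that block's nodes for the
    current task's input data and reports every data-local candidate.

    blocks     -- {block_id: [node names in that block]}
    data_nodes -- set of node names holding the task's input data
    """
    results = queue.Queue()  # thread-safe collection of matches

    def scan(block_id, nodes):
        for node in nodes:
            if node in data_nodes:        # found a data-local node
                results.put((block_id, node))

    threads = [threading.Thread(target=scan, args=(b, ns))
               for b, ns in blocks.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                          # wait for all block scans

    found = []
    while not results.empty():
        found.append(results.get())
    return sorted(found)                  # all candidate local nodes

cluster = {"b0": ["n0", "n1"], "b1": ["n2", "n3"]}
print(mtl_search(cluster, {"n1", "n3"}))  # [('b0', 'n1'), ('b1', 'n3')]
```

Because every block is scanned concurrently, the search time grows with the size of the largest block rather than the whole cluster, which is the locality advantage claimed for MTL above.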