[Linux 2.6 kernel: process scheduling] [04.07.08]

  The Linux 2.6 kernel made substantial changes to process scheduling.
  1. Introduction of scheduling domains

  Each CPU has a "base" scheduling domain (struct sched_domain). The macros cpu_sched_domain(i) and this_sched_domain() give access to a CPU's base domain. The domain hierarchy is built on top of these base domains through the ->parent pointer. The ->parent chain must end with a NULL pointer, and the domain structures should be allocated per CPU, because they are updated without locking.
  Each scheduling domain spans a number of CPUs (stored in ->span). The span of a scheduling domain means: "balance the load among these CPUs." A domain's span must be a superset of the span of its child domain, and the base domain of CPU i must span at least CPU i. The top-level domain of the hierarchy spans all CPUs in the system.
  Each scheduling domain must have one or more CPU groups (struct sched_group), organised as a circular, one-way linked list starting at the ->groups pointer. The union of the cpumasks of these groups must equal the domain's span. The intersection of the cpumasks of any two groups must be the empty set (no CPU belongs to two groups). The group that ->groups points to must contain the CPU that owns the domain. Once set up, groups contain only read-only data, so they may be shared among CPUs.
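
  The rules above can be written down as a small consistency check. This is a minimal sketch, not the kernel's code: the struct layouts are heavily simplified, cpumask_t is assumed to be a 64-bit word with one bit per CPU, and domain_is_valid() is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: at most 64 CPUs, one bit per CPU. */
typedef uint64_t cpumask_t;

struct sched_group {
    struct sched_group *next;    /* circular, one-way linked list */
    cpumask_t cpumask;           /* CPUs covered by this group */
};

struct sched_domain {
    struct sched_domain *parent; /* NULL at the top of the hierarchy */
    struct sched_group *groups;  /* first group; must contain the owning CPU */
    cpumask_t span;              /* CPUs balanced by this domain */
};

/* Hypothetical helper: does domain sd, owned by CPU cpu, obey the rules?
 * The group list is assumed to be properly circular. */
static bool domain_is_valid(const struct sched_domain *sd, int cpu)
{
    assert(cpu >= 0 && cpu < 64);

    cpumask_t self = (cpumask_t)1 << cpu;
    cpumask_t seen = 0;
    const struct sched_group *g = sd->groups;

    if (g == NULL || !(g->cpumask & self))
        return false;            /* first group must contain the owning CPU */
    do {
        if (g->cpumask & seen)
            return false;        /* any two groups must be disjoint */
        seen |= g->cpumask;
        g = g->next;
    } while (g != sd->groups);

    if (seen != sd->span)
        return false;            /* union of the groups must equal the span */
    if (sd->parent && (sd->span & ~sd->parent->span))
        return false;            /* parent must span at least the child */
    return (sd->span & self) != 0;
}
```

  For a two-CPU domain with one group per CPU, the check passes when called for the CPU in the first group and fails if the two group masks overlap.
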
  Balancing within a scheduling domain happens between its groups; each group is treated as a single entity. The load of a group is defined as the sum of the loads of its member CPUs. Only when the load between groups becomes unbalanced are tasks moved from one group to another.
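
  The definition of group load can be sketched in a few lines. The code below is only an illustration under stated assumptions: NR_CPUS, the 64-bit cpumask_t, the per-CPU load array and the helper busiest_group() are hypothetical and do not reproduce the kernel's actual balancing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8                 /* hypothetical machine size */

typedef uint64_t cpumask_t;       /* assumption: one bit per CPU */

/* Load of a group = sum of the loads of its member CPUs. */
static unsigned long group_load(cpumask_t group, const unsigned long *cpu_load)
{
    unsigned long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (group & ((cpumask_t)1 << cpu))
            sum += cpu_load[cpu];
    return sum;
}

/* Index of the most heavily loaded group; ngroups must be at least 1. */
static size_t busiest_group(const cpumask_t *groups, size_t ngroups,
                            const unsigned long *cpu_load)
{
    size_t best = 0;

    assert(ngroups > 0);
    for (size_t i = 1; i < ngroups; i++)
        if (group_load(groups[i], cpu_load) > group_load(groups[best], cpu_load))
            best = i;
    return best;
}
```

  A balancer built on this would compare the busiest group with the group of the current CPU and move tasks only if the difference is large enough.
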
  In kernel/sched.c, rebalance_tick runs periodically on each CPU. It takes the CPU's base scheduling domain and checks whether that domain is due for rebalancing. If so, it runs load_balance on that domain. rebalance_tick then checks the parent scheduling domain (if any), then the parent's parent, and so on up the hierarchy.
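
  The walk up the hierarchy can be sketched as follows. This is a simplified model rather than the code in kernel/sched.c: it assumes each domain records when it was last balanced and how many ticks should pass between attempts, and load_balance() here is a stand-in that only counts calls through the hypothetical nr_balanced field.

```c
#include <assert.h>
#include <stddef.h>

struct sched_domain {
    struct sched_domain *parent;    /* NULL at the top of the hierarchy */
    unsigned long last_balance;     /* tick of the last balance attempt */
    unsigned long balance_interval; /* ticks between attempts */
    unsigned int nr_balanced;       /* hypothetical counter for this sketch */
};

/* Stand-in for the real balancing work. */
static void load_balance(struct sched_domain *sd)
{
    sd->nr_balanced++;
}

/* Called once per tick with the CPU's base domain. */
static void rebalance_tick(struct sched_domain *base, unsigned long now)
{
    for (struct sched_domain *sd = base; sd != NULL; sd = sd->parent) {
        assert(sd->balance_interval > 0);
        if (now - sd->last_balance >= sd->balance_interval) {
            load_balance(sd);
            sd->last_balance = now;
        }
    }
}
```

  Giving higher levels a longer interval means cheap, local balancing happens often, while balancing across the whole machine happens rarely.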

  The "base" domain spans the first level of the hierarchy. With SMT, it spans all the sibling (virtual) CPUs of one physical CPU, and each group contains a single virtual CPU.

  The Linux 2.6 kernel also supports a different approach to multiprocessing: NUMA (Non-Uniform Memory Access). In this approach, memory and processors are interconnected, but for each processor some memory is "near" and some is "farther away". This means that under memory contention, the "nearer" processor has priority in using its nearby memory. The 2.6 kernel provides a set of functions that define the topological relationship between memory and processors. The scheduler can use this information to allocate local memory to a task. This reduces the bottleneck caused by memory contention and improves throughput.

  With SMP, the parent of the base domain spans all physical CPUs in the node, and each group is a single physical CPU. With NUMA, the parent of the SMP domain spans the whole machine, and each group holds the cpumask of one node. Other arrangements are possible as well, such as multi-level NUMA or an Opteron-style layout.
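
  To make the three levels concrete, the spans for one CPU can be computed for a made-up machine. Everything in this sketch is an assumption for illustration: the topology constants, the contiguous CPU numbering, and the helpers smt_span(), node_span() and numa_span() do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical topology with contiguous CPU numbering. */
#define THREADS_PER_CORE 2
#define CORES_PER_NODE   2
#define NR_NODES         2
#define NR_CPUS (THREADS_PER_CORE * CORES_PER_NODE * NR_NODES)

typedef uint64_t cpumask_t;       /* assumption: one bit per CPU */

/* Mask of `width` consecutive CPUs, aligned to `width`, that contains `cpu`. */
static cpumask_t aligned_span(int cpu, int width)
{
    assert(cpu >= 0 && cpu < NR_CPUS && width > 0 && width < 64);

    int first = cpu - cpu % width;
    return (((cpumask_t)1 << width) - 1) << first;
}

/* Base (SMT) domain: the sibling threads of one physical CPU. */
static cpumask_t smt_span(int cpu)
{
    return aligned_span(cpu, THREADS_PER_CORE);
}

/* SMP domain: all CPUs of the node. */
static cpumask_t node_span(int cpu)
{
    return aligned_span(cpu, THREADS_PER_CORE * CORES_PER_NODE);
}

/* NUMA domain: the whole machine. */
static cpumask_t numa_span(int cpu)
{
    return aligned_span(cpu, NR_CPUS);
}
```

  For CPU 5 on this machine the spans are 0x30, 0xF0 and 0xFF, so each level is a superset of the one below it, as the rules for ->span require.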

  2. The new O(1) scheduling algorithm

  A) Advantages of the ultra-scalable O(1) scheduler:
  - Good interactive performance even under high load.
  - Good scheduling and wakeup performance when only 1-2 processes are runnable.
  - Fairness: no process gets an unfairly large amount of CPU time.
  - Priorities: important tasks run at high priority, less important tasks at lower priority.
  - SMP efficiency: no CPU stays idle while there are tasks to run.
  - SMP affinity: a process stays on one CPU as far as possible and is not migrated between CPUs frequently.

  B) Changes in the new scheduling algorithm:
  - A fully O(1) scheduling algorithm: the recalculation loop and the goodness loop have been removed.
  - True SMP scalability: the global runqueue_lock is gone; each CPU has its own runqueue and its own lock. Two independent tasks on two different CPUs can be woken up, scheduled, and context-switched at the same time (in parallel) without any interlocking. All scheduling-related data is partitioned as far as possible.
  - Better SMP affinity: a particular weakness of the old scheduler is that tasks bounce between CPUs at random when higher-priority or real-time tasks are present. The reason is that the timeslice recalculation loop requires every currently running task to use up its timeslice first; new timeslices are computed only after all processes have exhausted theirs. On a multi-CPU system, processes that have run out of time must wait for the recalculation before they get a new timeslice. While they wait, some CPUs may sit idle with no task to run. Only when the last task that still had time left has used up its timeslice is the recalculation loop triggered, and only then can the other CPUs continue working, after idling for a number of clock ticks. The more CPUs there are, the worse this effect becomes.
  The same situation also causes bouncing. When the global runqueue is squeezed for "remaining time", idle CPUs start running tasks that are not affine to them, because the tasks affine to them have already used up their timeslices. In other words, an idle CPU picks up a waiting process that still has timeslice left but last ran on another CPU, so processes "migrate" between processors.

  The new scheduler solves this problem by handing out timeslices on a per-CPU basis, which removes both the global synchronization and the recalculation loop.
  - Batch scheduling.
  - Smoother handling of overload: processes that exceed the CPU's capacity have their priority lowered, so that a heavily loaded system neither breaks down nor produces scheduling storms.
  - O(1) RT (real-time) scheduling.
  - After fork(), the newly created child process runs before the parent process.

  C) How the new scheduling mechanism works:
  - Each CPU has two priority queues: an "active" queue and an "expired" queue. The active queue contains all tasks that are affine to this CPU and still have timeslice left. The expired queue contains all tasks that have used up their timeslices (it is kept sorted as well). Neither queue is accessed directly; both are reached through two pointers in the per-CPU runqueue structure. The scheduler no longer needs to scan all tasks every time; instead, a task is placed on the active queue when it becomes ready to run. When the scheduler runs, it simply picks the best task on the active queue. Scheduling can therefore be done in constant time. When a task runs, it is given a timeslice, that is, a period during which it may use the CPU before giving it up to another thread. When its timeslice is used up, the task is moved to the expired queue, where it is placed according to its priority. When all tasks on the active queue have finished running, the two pointers are swapped: the former expired queue becomes the active queue, and the now empty former active queue becomes the new expired queue. The active queue is therefore always an already sorted queue of tasks, which is what makes scheduling fast.

  - A 64-bit bitmap is used as a cache of the queue indices that are in use. Finding the highest-priority task therefore takes just two x86 BSFL bit-search instructions.
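
  The bitmap lookup can be sketched in portable C. This is an illustration, not the kernel's implementation: it assumes 64 priority levels stored in two 32-bit words with lower numbers meaning higher priority, it uses the GCC/Clang builtin __builtin_ctz in place of the BSFL instruction, and struct prio_bitmap with its helpers is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NR_PRIO 64                /* assumption: 64 priority levels */

/* Bit p is set when the queue for priority p is non-empty. */
struct prio_bitmap {
    uint32_t word[2];
};

static void bitmap_set(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < NR_PRIO);
    b->word[prio / 32] |= (uint32_t)1 << (prio % 32);
}

static void bitmap_clear(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < NR_PRIO);
    b->word[prio / 32] &= ~((uint32_t)1 << (prio % 32));
}

/* Lowest set bit, i.e. the highest priority in use; -1 if nothing is queued.
 * At most two bit searches are needed, one per 32-bit word. */
static int bitmap_first(const struct prio_bitmap *b)
{
    if (b->word[0])
        return __builtin_ctz(b->word[0]);
    if (b->word[1])
        return 32 + __builtin_ctz(b->word[1]);
    return -1;
}
```

  The cost of bitmap_first() does not depend on how many tasks are queued, which is the property the O(1) claim rests on.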

  Splitting the tasks into two queues allows an arbitrary number of active and expired tasks, and a task's timeslice can be recalculated immediately when it is used up. Because the queues are always accessed through the two pointers in the runqueue, switching the two queues only means exchanging two pointers and is therefore very fast.
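
  The interplay of the two queues and the pointer swap can be sketched as below. This is a simplified model under stated assumptions, not the kernel's runqueue: NR_PRIO is kept small, priorities are scanned with a loop where the kernel would use a bitmap, picking a task removes it from the active queue, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PRIO 8                 /* hypothetical; 0 is the highest priority */

struct task {
    struct task *next;
    int prio;
    int timeslice;                /* remaining ticks */
};

struct prio_array {
    int nr_active;
    struct task *head[NR_PRIO];   /* one FIFO list per priority */
    struct task *tail[NR_PRIO];
};

struct runqueue {
    struct prio_array *active, *expired;
    struct prio_array arrays[2];
};

static void rq_init(struct runqueue *rq)
{
    *rq = (struct runqueue){0};
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* Append t to the list for its priority. */
static void enqueue(struct prio_array *a, struct task *t)
{
    assert(t->prio >= 0 && t->prio < NR_PRIO);
    t->next = NULL;
    if (a->tail[t->prio])
        a->tail[t->prio]->next = t;
    else
        a->head[t->prio] = t;
    a->tail[t->prio] = t;
    a->nr_active++;
}

/* Remove and return the first task of the highest non-empty priority. */
static struct task *dequeue_best(struct prio_array *a)
{
    for (int p = 0; p < NR_PRIO; p++) {
        struct task *t = a->head[p];
        if (t) {
            a->head[p] = t->next;
            if (!a->head[p])
                a->tail[p] = NULL;
            a->nr_active--;
            return t;
        }
    }
    return NULL;
}

/* Pick the next task; if the active queue is empty, swap the two pointers. */
static struct task *pick_next(struct runqueue *rq)
{
    if (rq->active->nr_active == 0) {
        struct prio_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
    }
    return dequeue_best(rq->active);
}

/* t has used up its timeslice: refill it at once and park t on expired. */
static void expire(struct runqueue *rq, struct task *t, int new_slice)
{
    t->timeslice = new_slice;
    enqueue(rq->expired, t);
}
```

  No loop over all tasks is ever needed to hand out new timeslices: each task gets its new slice at the moment it expires, and the swap itself is three pointer assignments.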
