Hadoop mailing list
For our CPU-bound application, I set the value of
mapred.tasktracker.tasks.maximum (number of map tasks per tasktracker)
equal to the number of CPUs on a tasktracker. Unfortunately, I think
this value has to be set per cluster, not per machine. This is okay
for us because our machines have similar hardware, but it might be a
problem if your machines have different numbers of CPUs.
I created HADOOP-1245 a long time ago for this problem, but I've since
heard that Hadoop uses only the cluster value for maps per
tasktracker, not the hybrid model I describe. In any case, I never
did any work on fixing it because I don't need heterogeneous clusters.
So Hadoop does not let you control, on a per-server basis, things like how many jobtrackers to run... I see.
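
For reference, the setting described in the quoted message might look roughly like the fragment below. This is only a sketch: it assumes the property is set in hadoop-site.xml, and the value 4 is a made-up CPU count for illustration. Per the message, the value is understood to apply to the whole cluster rather than to each machine.

```xml
<!-- Sketch of a hadoop-site.xml fragment (assumed location). -->
<!-- The value 4 is a hypothetical CPU count per tasktracker, not from the message. -->
<property>
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>4</value>
</property>
```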