linux/arch
Linus Torvalds e56b3bc794 cpu masks: optimize and clean up cpumask_of_cpu()
Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.

Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
creating a huge array of constant bitmasks, realize that the zero words
can be shared.

In other words, on a 64-bit architecture, we only ever need 64 of these
arrays - with a different bit set in one single word (with enough zero
words around it so that we can create any bitmask by just offsetting in
that big array). And then we just put enough zeroes around it that we
can point every single cpumask to be one of those things.

So when we have 4k CPUs, instead of having 4k arrays (of 4k bits each,
with one bit set in each array - 2MB memory total), we have exactly 64
arrays instead, each 8k bits in size (64kB total).
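
As a quick check of the arithmetic above, here is a minimal sketch that
recomputes both totals. The helper names (naive_bytes, shared_bytes) are
hypothetical and not part of the kernel; the "twice the mask size" row
length simply follows the 8k-bit figure quoted above.

```c
#include <stddef.h>

/* Hypothetical helpers, not kernel code: they only redo the arithmetic. */
enum { BITS_PER_BYTE = 8 };

/* Naive layout: one full nr_cpus-bit mask for every CPU. */
static size_t naive_bytes(size_t nr_cpus)
{
        return nr_cpus * (nr_cpus / BITS_PER_BYTE);
}

/*
 * Shared layout as described above: one array per bit position in a
 * word, each array twice the mask size (8k bits for 4k CPUs).
 */
static size_t shared_bytes(size_t nr_cpus, size_t bits_per_long)
{
        return bits_per_long * (2 * nr_cpus / BITS_PER_BYTE);
}
```

With nr_cpus = 4096 and bits_per_long = 64, naive_bytes() gives 2097152
(2MB) and shared_bytes() gives 65536 (64kB).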

And then we just point cpumask(n) to the right position (which we can
calculate dynamically). Once we have the right arrays, getting
"cpumask(n)" ends up being:

  static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
  {
          const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
          p -= cpu / BITS_PER_LONG;
          return (const cpumask_t *)p;
  }
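
To see why the backwards offset works, here is a small userspace sketch
of the same idea. It is an illustration under assumptions, not the kernel
code: NR_CPUS is fixed at 256, the table is a flat array filled at run
time rather than a constant 2D array, and the names bit_bitmap,
init_bit_bitmap, get_mask and mask_is_single_bit are made up for this
example. Row 0 is all zeroes; row 1 + b has bit b set in its first word
only, so stepping back k words from the start of a row yields a mask
whose word k holds the bit and whose other words are zero.

```c
#include <limits.h>
#include <stddef.h>

#define NR_CPUS        256      /* assumption for this sketch */
#define BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define NR_LONGS       ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* BITS_PER_LONG + 1 rows of NR_LONGS words; row 0 stays all-zero. */
static unsigned long bit_bitmap[(BITS_PER_LONG + 1) * NR_LONGS];

/* Set bit b in the first word of row 1 + b; everything else stays 0. */
static void init_bit_bitmap(void)
{
        for (size_t bit = 0; bit < BITS_PER_LONG; bit++)
                bit_bitmap[(1 + bit) * NR_LONGS] = 1UL << bit;
}

/* Same shape as get_cpu_mask(): pick the row, then step back. */
static const unsigned long *get_mask(unsigned int cpu)
{
        const unsigned long *p =
                &bit_bitmap[(1 + cpu % BITS_PER_LONG) * NR_LONGS];

        p -= cpu / BITS_PER_LONG;
        return p;
}

/* Returns 1 if the mask for 'cpu' has exactly bit 'cpu' set, else 0. */
static int mask_is_single_bit(unsigned int cpu)
{
        const unsigned long *mask = get_mask(cpu);

        for (size_t w = 0; w < NR_LONGS; w++) {
                unsigned long want = 0;

                if (w == cpu / BITS_PER_LONG)
                        want = 1UL << (cpu % BITS_PER_LONG);
                if (mask[w] != want)
                        return 0;
        }
        return 1;
}
```

After calling init_bit_bitmap() once, mask_is_single_bit(cpu) should
return 1 for every cpu below NR_CPUS. The words read before the set bit
come from the tail of the previous row, which is always zero.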

This brings other advantages and simplifications as well:

 - we are not wasting memory that is just filled with a single bit in
   various different places

 - we don't need all those games to re-create the arrays in some dense
   format, because they're already going to be dense enough.

If we compile a kernel for up to 4k CPUs, "wasting" that 64kB of memory
is a non-issue (especially since by doing this "overlapping" trick we
probably get better cache behaviour anyway).

[ mingo@elte.hu:

  Converted Linus's mails into a commit. See:

     http://lkml.org/lkml/2008/7/27/156
     http://lkml.org/lkml/2008/7/28/320

  Also applied a family filter - which also has the side-effect of leaving
  out the bits where Linus calls me an idio... Oh, never mind ;-)
]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 22:20:41 +02:00
..
alpha [PATCH] sanitize __user_walk_fd() et.al. 2008-07-26 20:53:34 -04:00
arm Merge master.kernel.org:/home/rmk/linux-2.6-arm 2008-07-27 16:46:08 -07:00
avr32 avr32: some mmc/sd cleanups 2008-07-27 13:57:36 +02:00
blackfin Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6 2008-07-26 13:23:17 -07:00
cris cris: use generic show_mem() 2008-07-26 12:00:11 -07:00
frv frv: use generic show_mem() 2008-07-26 12:00:11 -07:00
h8300 h8300: use generic show_mem() 2008-07-26 12:00:11 -07:00
ia64 KVM: ia64: Fix irq disabling leak in error handling code 2008-07-27 11:35:32 +03:00
m32r m32r: use generic show_mem() 2008-07-26 12:00:11 -07:00
m68k m68k: use generic show_mem() 2008-07-26 12:00:11 -07:00
m68knommu m68knommu: use generic show_mem() 2008-07-26 12:00:11 -07:00
mips mips: use generic show_mem() 2008-07-26 12:00:11 -07:00
mn10300 mn10300: use generic show_mem() 2008-07-26 12:00:11 -07:00
parisc [PATCH] sanitize __user_walk_fd() et.al. 2008-07-26 20:53:34 -04:00
powerpc KVM: ppc: fix invalidation of large guest pages 2008-07-27 12:02:05 +03:00
s390 KVM: s390: Fix possible host kernel bug on lctl(g) handling 2008-07-27 11:36:20 +03:00
sh sh: use generic show_mem() 2008-07-26 12:00:10 -07:00
sparc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 2008-07-25 17:33:34 -07:00
sparc64 sparc64: use generic show_mem() 2008-07-26 12:00:10 -07:00
um um: use generic show_mem() 2008-07-26 12:00:10 -07:00
x86 cpu masks: optimize and clean up cpumask_of_cpu() 2008-07-28 22:20:41 +02:00
xtensa xtensa: use generic show_mem() 2008-07-26 12:00:10 -07:00
.gitignore arch: Ignore arch/i386 and arch/x86_64 2008-01-19 21:29:39 -08:00
Kconfig tracehook: CONFIG_HAVE_ARCH_TRACEHOOK 2008-07-26 12:00:09 -07:00