linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-26 22:21:42 +00:00

Author	SHA1	Message	Date
Christoph Lameter	c24f21bda8	[PATCH] zoned vm counters: remove useless struct wbs Remove writeback state We can remove some functions now that were needed to calculate the page state for writeback control since these statistics are now directly available. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:36 -07:00
Christoph Lameter	d2c5e30c9a	[PATCH] zoned vm counters: conversion of nr_bounce to per zone counter Conversion of nr_bounce to a per zone counter nr_bounce is only used for proc output. So it could be left as an event counter. However, the event counters may not be accurate and nr_bounce is categorizing types of pages in a zone. So we really need this to also be a per zone counter. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:36 -07:00
Christoph Lameter	fd39fc8561	[PATCH] zoned vm counters: conversion of nr_unstable to per zone counter Conversion of nr_unstable to a per zone counter We need to do some special modifications to the nfs code since there are multiple cases of disposition and we need to have a page ref for proper accounting. This converts the last critical page state of the VM and therefore we need to remove several functions that were depending on GET_PAGE_STATE_LAST in order to make the kernel compile again. We are only left with event type counters in page state. [akpm@osdl.org: bugfixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:36 -07:00
Christoph Lameter	ce866b34ae	[PATCH] zoned vm counters: conversion of nr_writeback to per zone counter Conversion of nr_writeback to per zone counter. This removes the last page_state counter from arch/i386/mm/pgtable.c so we drop the page_state from there. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:35 -07:00
Christoph Lameter	b1e7a8fd85	[PATCH] zoned vm counters: conversion of nr_dirty to per zone counter This makes nr_dirty a per zone counter. Looping over all processors is avoided during writeback state determination. The counter aggregation for nr_dirty had to be undone in the NFS layer since we summed up the page counts from multiple zones. Someone more familiar with NFS should probably review what I have done. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:35 -07:00
Christoph Lameter	df849a1529	[PATCH] zoned vm counters: conversion of nr_pagetables to per zone counter Conversion of nr_page_table_pages to a per zone counter [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:35 -07:00
Christoph Lameter	9a865ffa34	[PATCH] zoned vm counters: conversion of nr_slab to per zone counter - Allows reclaim to access counter without looping over processor counts. - Allows accurate statistics on how many pages are used in a zone by the slab. This may become useful to balance slab allocations over various zones. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:35 -07:00
Christoph Lameter	34aa1330f9	[PATCH] zoned vm counters: zone_reclaim: remove /proc/sys/vm/zone_reclaim_interval The zone_reclaim_interval was necessary because we were not able to determine how many unmapped pages exist in a zone. Therefore we had to scan in intervals to figure out if any pages were unmapped. With the zoned counters and NR_ANON_PAGES we now know the number of pagecache pages and the number of mapped pages in a zone. So we can simply skip the reclaim if there is an insufficient number of unmapped pages. We use SWAP_CLUSTER_MAX as the boundary. Drop all support for /proc/sys/vm/zone_reclaim_interval. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:35 -07:00
Christoph Lameter	f3dbd34460	[PATCH] zoned vm counters: split NR_ANON_PAGES off from NR_FILE_MAPPED The current NR_FILE_MAPPED is used by zone reclaim and the dirty load calculation as the number of mapped pagecache pages. However, that is not true. NR_FILE_MAPPED includes the mapped anonymous pages. This patch separates those and therefore allows an accurate tracking of the anonymous pages per zone. It then becomes possible to determine the number of unmapped pages per zone and we can avoid scanning for unmapped pages if there are none. Also it may now be possible to determine the mapped/unmapped ratio in get_dirty_limit. Isnt the number of anonymous pages irrelevant in that calculation? Note that this will change the meaning of the number of mapped pages reported in /proc/vmstat /proc/meminfo and in the per node statistics. This may affect user space tools that monitor these counters! NR_FILE_MAPPED works like NR_FILE_DIRTY. It is only valid for pagecache pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:35 -07:00
Christoph Lameter	bf02cf4b6c	[PATCH] zoned vm counters: remove NR_FILE_MAPPED from scan control structure We can now access the number of pages in a mapped state in an inexpensive way in shrink_active_list. So drop the nr_mapped field from scan_control. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:35 -07:00
Christoph Lameter	347ce434d5	[PATCH] zoned vm counters: conversion of nr_pagecache to per zone counter Currently a single atomic variable is used to establish the size of the page cache in the whole machine. The zoned VM counters have the same method of implementation as the nr_pagecache code but also allow the determination of the pagecache size per zone. Remove the special implementation for nr_pagecache and make it a zoned counter named NR_FILE_PAGES. Updates of the page cache counters are always performed with interrupts off. We can therefore use the __ variant here. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Christoph Lameter	65ba55f500	[PATCH] zoned vm counters: convert nr_mapped to per zone counter nr_mapped is important because it allows a determination of how many pages of a zone are not mapped, which would allow a more efficient means of determining when we need to reclaim memory in a zone. We take the nr_mapped field out of the page state structure and define a new per zone counter named NR_FILE_MAPPED (the anonymous pages will be split off from NR_MAPPED in the next patch). We replace the use of nr_mapped in various kernel locations. This avoids the looping over all processors in try_to_free_pages(), writeback, reclaim (swap + zone reclaim). [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Christoph Lameter	2244b95a7b	[PATCH] zoned vm counters: basic ZVC (zoned vm counter) implementation Per zone counter infrastructure The counters that we currently have for the VM are split per processor. The processor however has not much to do with the zone these pages belong to. We cannot tell f.e. how many ZONE_DMA pages are dirty. So we are blind to potentially inbalances in the usage of memory in various zones. F.e. in a NUMA system we cannot tell how many pages are dirty on a particular node. If we knew then we could put measures into the VM to balance the use of memory between different zones and different nodes in a NUMA system. For example it would be possible to limit the dirty pages per node so that fast local memory is kept available even if a process is dirtying huge amounts of pages. Another example is zone reclaim. We do not know how many unmapped pages exist per zone. So we just have to try to reclaim. If it is not working then we pause and try again later. It would be better if we knew when it makes sense to reclaim unmapped pages from a zone. This patchset allows the determination of the number of unmapped pages per zone. We can remove the zone reclaim interval with the counters introduced here. Futhermore the ability to have various usage statistics available will allow the development of new NUMA balancing algorithms that may be able to improve the decision making in the scheduler of when to move a process to another node and hopefully will also enable automatic page migration through a user space program that can analyse the memory load distribution and then rebalance memory use in order to increase performance. The counter framework here implements differential counters for each processor in struct zone. The differential counters are consolidated when a threshold is exceeded (like done in the current implementation for nr_pageache), when slab reaping occurs or when a consolidation function is called. Consolidation uses atomic operations and accumulates counters per zone in the zone structure and also globally in the vm_stat array. VM functions can access the counts by simply indexing a global or zone specific array. The arrangement of counters in an array also simplifies processing when output has to be generated for /proc/. Counters can be updated by calling inc/dec_zone_page_state or _inc/dec_zone_page_state analogous to _page_state. The second group of functions can be called if it is known that interrupts are disabled. Special optimized increment and decrement functions are provided. These can avoid certain checks and use increment or decrement instructions that an architecture may provide. We also add a new CONFIG_DMA_IS_NORMAL that signifies that an architecture can do DMA to all memory and therefore ZONE_NORMAL will not be populated. This is only currently set for IA64 SGI SN2 and currently only affects node_page_state(). In the best case node_page_state can be reduced to retrieving a single counter for the one zone on the node. [akpm@osdl.org: cleanups] [akpm@osdl.org: export vm_stat[] for filesystems] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Christoph Lameter	f6ac2354d7	[PATCH] zoned vm counters: create vmstat.c/.h from page_alloc.c/.h NOTE: ZVC are not the lightweight event counters. ZVCs are reliable whereas event counters do not need to be. Zone based VM statistics are necessary to be able to determine what the state of memory in one zone is. In a NUMA system this can be helpful for local reclaim and other memory optimizations that may be able to shift VM load in order to get more balanced memory use. It is also useful to know how the computing load affects the memory allocations on various zones. This patchset allows the retrieval of that data from userspace. The patchset introduces a framework for counters that is a cross between the existing page_stats --which are simply global counters split per cpu-- and the approach of deferred incremental updates implemented for nr_pagecache. Small per cpu 8 bit counters are added to struct zone. If the counter exceeds certain thresholds then the counters are accumulated in an array of atomic_long in the zone and in a global array that sums up all zone values. The small 8 bit counters are next to the per cpu page pointers and so they will be in high in the cpu cache when pages are allocated and freed. Access to VM counter information for a zone and for the whole machine is then possible by simply indexing an array (Thanks to Nick Piggin for pointing out that approach). The access to the total number of pages of various types does no longer require the summing up of all per cpu counters. Benefits of this patchset right now: - Ability for UP and SMP configuration to determine how memory is balanced between the DMA, NORMAL and HIGHMEM zones. - loops over all processors are avoided in writeback and reclaim paths. We can avoid caching the writeback information because the needed information is directly accessible. - Special handling for nr_pagecache removed. - zone_reclaim_interval vanishes since VM stats can now determine when it is worth to do local reclaim. - Fast inline per node page state determination. - Accurate counters in /sys/devices/system/node/node*/meminfo. Current counters are counting simply which processor allocated a page somewhere and guestimate based on that. So the counters were not useful to show the actual distribution of page use on a specific zone. - The swap_prefetch patch requires per node statistics in order to figure out when processors of a node can prefetch. This patch provides some of the needed numbers. - Detailed VM counters available in more /proc and /sys status files. References to earlier discussions: V1 http://marc.theaimsgroup.com/?l=linux-kernel&m=113511649910826&w=2 V2 http://marc.theaimsgroup.com/?l=linux-kernel&m=114980851924230&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115014697910351&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767318740&w=2 Performance tests with AIM7 did not show any regressions. Seems to be a tad faster even. Tested on ia64/NUMA. Builds fine on i386, SMP / UP. Includes fixes for s390/arm/uml arch code. This patch: Move counter code from page_alloc.c/page-flags.h to vmstat.c/h. Create vmstat.c/vmstat.h by separating the counter code and the proc functions. Move the vm_stat_text array before zoneinfo_show. [akpm@osdl.org: s390 build fix] [akpm@osdl.org: HOTPLUG_CPU build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Adrian Bunk	672b2714ae	[PATCH] fix ISTALLION=y drivers/char/istallion.c: In function âstli_initbrdsâ: drivers/char/istallion.c:4150: error: implicit declaration of function âstli_parsebrdâ drivers/char/istallion.c:4150: error: âstli_brdspâ undeclared (first use in this function) drivers/char/istallion.c:4150: error: (Each undeclared identifier is reported only once drivers/char/istallion.c:4150: error: for each function it appears in.) drivers/char/istallion.c:4164: error: implicit declaration of function âstli_argbrdsâ While I was at it, I also removed the #ifdef MODULE around the initialation code to allow it to perhaps work when built into the kernel and made a needlessly global function static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Andrew Morton	e09793bb91	[PATCH] msr.c: use register_hotcpu_notifier() register_cpu_notifier() cannot do anything in a module, in a !CONFIG_HOTPLUG_CPU kernel. Cc: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Ingo Molnar	1017f6afd5	[PATCH] fix platform_device_put/del mishaps This fixes drivers/char/pc8736x_gpio.c and drivers/char/scx200_gpio.c to use the platform_device_del/put ops correctly. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Ingo Molnar	491d525ff1	[PATCH] fix drivers/video/imacfb.c compilation Fix build error on x86_64. There's nothing even remotely close to imacmp_seg in the kernel, so I removed the whole line. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Edgar Hucek <hostmaster@ed-soft.at> Cc: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:34 -07:00
Linus Torvalds	501b7c77de	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: remove redundant NULL checks in ocfs2_direct_IO_get_blocks() ocfs2: clean up some osb fields ocfs2: fix init of uuid_net_key ocfs2: silence a debug print ocfs2: silence ENOENT during lookup of broken links ocfs2: Cleanup message prints ocfs2: silence -EEXIST from ocfs2_extent_map_insert/lookup [PATCH] fs/ocfs2/dlm/dlmrecovery.c: make dlm_lockres_master_requery() static ocfs2: warn the user on a dead timeout mismatch ocfs2: OCFS2_FS must depend on SYSFS ocfs2: Compile-time disabling of ocfs2 debugging output. configfs: Clear up a few extra spaces where there should be TABs. configfs: Release memory in configfs_example.	2006-06-29 17:44:21 -07:00
Linus Torvalds	74e651f0aa	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) [TIPC]: Initial activation message now includes TIPC version number [TIPC]: Improve response to requests for node/link information [TIPC]: Fixed skb_under_panic caused by tipc_link_bundle_buf [IrDA]: Fix the AU1000 FIR dependencies [IrDA]: Fix RCU lock pairing on error path [XFRM]: unexport xfrm_state_mtu [NET]: make skb_release_data() static [NETFILTE] ipv4: Fix typo (Bugzilla #6753) [IrDA]: MCS7780 usb_driver struct should be static [BNX2]: Turn off link during shutdown [BNX2]: Use dev_kfree_skb() instead of the _irq version [ATM]: basic sysfs support for ATM devices [ATM]: [suni] change suni_init to __devinit [ATM]: [iphase] should be __devinit not __init [ATM]: [idt77105] should be __devinit not __init [BNX2]: Add NETIF_F_TSO_ECN [NET]: Add ECN support for TSO [AF_UNIX]: Datagram getpeersec [NET]: Fix logical error in skb_gso_ok [PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000 ...	2006-06-29 17:43:43 -07:00
Allan Stephens	0702056f9f	[TIPC]: Initial activation message now includes TIPC version number Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 17:08:20 -07:00
Allan Stephens	ea13847b24	[TIPC]: Improve response to requests for node/link information Now allocates reply space for "get links" request based on number of actual links, not number of potential links. Also, limits reply to "get links" and "get nodes" requests to 32KB to match capabilities of tipc-config utility that issued request. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com>	2006-06-29 17:08:15 -07:00
Allan Stephens	e49060c7ca	[TIPC]: Fixed skb_under_panic caused by tipc_link_bundle_buf Now determines tailroom of bundle buffer by directly inspection of buffer. Previously, buffer was assumed to have a max capacity equal to the link MTU, but the addition of link MTU negotiation means that the link MTU can increase after the bundle buffer is allocated. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 17:08:10 -07:00
Adrian Bunk	caf430f371	[IrDA]: Fix the AU1000 FIR dependencies AU1000 FIR is broken, it should depend on SOC_AU1000. Spotted by Jean-Luc Leger. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 17:03:19 -07:00
Josh Triplett	1bc1731133	[IrDA]: Fix RCU lock pairing on error path irlan_client_discovery_indication calls rcu_read_lock and rcu_read_unlock, but returns without unlocking in an error case. Fix that by replacing the return with a goto so that the rcu_read_unlock always gets executed. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Samuel Ortiz samuel@sortiz.org <> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 17:02:31 -07:00
Adrian Bunk	244055fdc8	[XFRM]: unexport xfrm_state_mtu This patch removes the unused EXPORT_SYMBOL(xfrm_state_mtu). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:33 -07:00
Adrian Bunk	5bba17127e	[NET]: make skb_release_data() static skb_release_data() no longer has any users in other files. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:30 -07:00
Matt LaPlante	c22751b73a	[NETFILTE] ipv4: Fix typo (Bugzilla #6753 ) This patch fixes bugzilla #6753, a typo in the netfilter Kconfig Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:28 -07:00
Adrian Bunk	7263ade1e1	[IrDA]: MCS7780 usb_driver struct should be static This patch makes a needlessly global struct static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:26 -07:00
Michael Chan	6c4f095eae	[BNX2]: Turn off link during shutdown Minor change in shutdown logic to effect a link down. Update version to 1.4.43. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:24 -07:00
Michael Chan	745720e583	[BNX2]: Use dev_kfree_skb() instead of the _irq version Change all dev_kfree_skb_irq() and dev_kfree_skb_any() to dev_kfree_skb(). These calls are never used in irq context. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:21 -07:00
Roman Kagan	656d98b09d	[ATM]: basic sysfs support for ATM devices Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:19 -07:00
Chas Williams	d17f086550	[ATM]: [suni] change suni_init to __devinit Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:17 -07:00
Chas Williams	249c14b55c	[ATM]: [iphase] should be __devinit not __init Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:14 -07:00
Chas Williams	b47eb0eb9b	[ATM]: [idt77105] should be __devinit not __init Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:12 -07:00
Michael Chan	b11d621352	[BNX2]: Add NETIF_F_TSO_ECN Add NETIF_F_TSO_ECN feature for all bnx2 hardware. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:10 -07:00
Michael Chan	b0da853703	[NET]: Add ECN support for TSO In the current TSO implementation, NETIF_F_TSO and ECN cannot be turned on together in a TCP connection. The problem is that most hardware that supports TSO does not handle CWR correctly if it is set in the TSO packet. Correct handling requires CWR to be set in the first packet only if it is set in the TSO header. This patch adds the ability to turn on NETIF_F_TSO and ECN using GSO if necessary to handle TSO packets with CWR set. Hardware that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev-> features flag. All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set. If the output device does not have the NETIF_F_TSO_ECN feature set, GSO will split the packet up correctly with CWR only set in the first segment. With help from Herbert Xu <herbert@gondor.apana.org.au>. Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock flag is completely removed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:08 -07:00
Catherine Zhang	877ce7c1b3	[AF_UNIX]: Datagram getpeersec This patch implements an API whereby an application can determine the label of its peer's Unix datagram sockets via the auxiliary data mechanism of recvmsg. Patch purpose: This patch enables a security-aware application to retrieve the security context of the peer of a Unix datagram socket. The application can then use this security context to determine the security context for processing on behalf of the peer who sent the packet. Patch design and implementation: The design and implementation is very similar to the UDP case for INET sockets. Basically we build upon the existing Unix domain socket API for retrieving user credentials. Linux offers the API for obtaining user credentials via ancillary messages (i.e., out of band/control messages that are bundled together with a normal message). To retrieve the security context, the application first indicates to the kernel such desire by setting the SO_PASSSEC option via getsockopt. Then the application retrieves the security context using the auxiliary data mechanism. An example server application for Unix datagram socket should look like this: toggle = 1; toggle_len = sizeof(toggle); setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len); recvmsg(sockfd, &msg_hdr, 0); if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) { cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr); if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) && cmsg_hdr->cmsg_level == SOL_SOCKET && cmsg_hdr->cmsg_type == SCM_SECURITY) { memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext)); } } sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow a server socket to receive security context of the peer. Testing: We have tested the patch by setting up Unix datagram client and server applications. We verified that the server can retrieve the security context using the auxiliary data mechanism of recvmsg. Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com> Acked-by: Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:06 -07:00
Herbert Xu	d6b4991ad5	[NET]: Fix logical error in skb_gso_ok The test in skb_gso_ok is backwards. Noticed by Michael Chan <mchan@broadcom.com>. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:04 -07:00
Shuya MAEDA	4ee303dfea	[PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000 Signed-off-by: Shuya MAEDA <maeda-sxb@necst.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:01 -07:00
Herbert Xu	3d3a853379	[NET]: Make illegal_highdma more anal Rather than having illegal_highdma as a macro when HIGHMEM is off, we can turn it into an inline function that returns zero. This will catch callers that give it bad arguments. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:59 -07:00
Sridhar Samudrala	47da8ee681	[TCP]: Export accept queue len of a TCP listening socket via rx_queue While debugging a TCP server hang issue, we noticed that currently there is no way for a user to get the acceptq backlog value for a TCP listen socket. All the standard networking utilities that display socket info like netstat, ss and /proc/net/tcp have 2 fields called rx_queue and tx_queue. These fields do not mean much for listening sockets. This patch uses one of these unused fields(rx_queue) to export the accept queue len for listening sockets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:57 -07:00
Darrel Goeddel	c7bdb545d2	[NETLINK]: Encapsulate eff_cap usage within security framework. This patch encapsulates the usage of eff_cap (in netlink_skb_params) within the security framework by extending security_netlink_recv to include a required capability parameter and converting all direct usage of eff_caps outside of the lsm modules to use the interface. It also updates the SELinux implementation of the security_netlink_send and security_netlink_recv hooks to take advantage of the sid in the netlink_skb_params struct. This also enables SELinux to perform auditing of netlink capability checks. Please apply, for 2.6.18 if possible. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:55 -07:00
Herbert Xu	576a30eb64	[NET]: Added GSO header verification When GSO packets come from an untrusted source (e.g., a Xen guest domain), we need to verify the header integrity before passing it to the hardware. Since the first step in GSO is to verify the header, we can reuse that code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this bit set can only be fed directly to devices with the corresponding bit NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb is fed to the GSO engine which will allow the packet to be sent to the hardware if it passes the header check. This patch changes the sg flag to a full features flag. The same method can be used to implement TSO ECN support. We simply have to mark packets with CWR set with SKB_GSO_ECN so that only hardware with a corresponding NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment the packet, or segment the first MTU and pass the rest to the hardware for further segmentation. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:53 -07:00
Patrick McHardy	68c1692e3e	[NETFILTER]: statistic match: add missing Kconfig help text Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:50 -07:00
Patrick McHardy	ef47c6a7b8	[NETFILTER]: ip_queue/nfnetlink_queue: drop bridge port references when dev disappears When a device that is acting as a bridge port is unregistered, the ip_queue/nfnetlink_queue notifier doesn't check if its one of physindev/physoutdev and doesn't release the references if it is. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:48 -07:00
Jorge Matias	1c7e47726a	[NETFILTER]: xt_sctp: fix --chunk-types matching xt_sctp uses an incorrect header offset when --chunk-types is used. Signed-off-by: Jorge Matias <jorge.matias@motorola.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:46 -07:00
Yuri Gushin	9abdcf6b6c	[NETFILTER]: xt_tcpudp: fix double unregistration in error path "xt_unregister_match(AF_INET, &tcp_matchstruct)" is called twice, leaving "udp_matchstruct" registered, in case of a failure in the registration of the udp6 structure. Signed-off-by: Yuri Gushin <yuri@ecl-labs.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:44 -07:00
Yasuyuki Kozakai	40a839fdbd	[NETFILTER]: nf_conntrack: Fix undefined references to local_bh_* CC net/netfilter/nf_conntrack_proto_sctp.o net/netfilter/nf_conntrack_proto_sctp.c: In function `sctp_print_conntrack': net/netfilter/nf_conntrack_proto_sctp.c:206: warning: implicit declaration of function `local_bh_disable' net/netfilter/nf_conntrack_proto_sctp.c:208: warning: implicit declaration of function `local_bh_enable' CC net/netfilter/nf_conntrack_netlink.o net/netfilter/nf_conntrack_netlink.c: In function `ctnetlink_dump_table': net/netfilter/nf_conntrack_netlink.c:429: warning: implicit declaration of function `local_bh_disable' net/netfilter/nf_conntrack_netlink.c:452: warning: implicit declaration of function `local_bh_enable' Spotted by Toralf Förster Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:42 -07:00
Patrick McHardy	da298d3a4f	[NETFILTER]: x_tables: fix xt_register_table error propagation When xt_register_table fails the error is not properly propagated back. Based on patch by Lepton Wu <ytht.net@gmail.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:40 -07:00

1 2 3 4 5 ...

31383 Commits