Merge branch 'page_pool-stats'
Joe Damato says:

====================
page_pool: Add stats counters

Greetings:

Welcome to v9. This revision adds a commit which updates the page_pool
documentation to describe the stats API, structures, and fields. Additionally,
this revision contains a minor cosmetic change suggested by Saeed in
page_pool_recycle_in_ring in commit 2: "page_pool: Add recycle stats", which
removes an unnecessary #ifdef. There are no functional changes in this
revision.

Benchmark output from the v7 cover [1] is pasted below, as it is still
relevant since no functional changes have been made in this revision:

Benchmarks have been re-run. As always, results between runs are highly
variable; you'll find results showing that stats disabled are both faster and
slower than stats enabled in back to back benchmark runs. Raw benchmark output
with stats off [2] and stats on [3] are available for examination.

Test system:
  - 2x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
  - 2 NUMA zones, with 18 cores per zone and 2 threads per core

bench_page_pool_simple results, loops=200000000

test name                       stats enabled           stats disabled
                                cycles  nanosec         cycles  nanosec
for_loop                        0       0.335           0       0.336
atomic_inc                      14      6.106           13      6.022
lock                            30      13.365          32      13.968
no-softirq-page_pool01          75      32.884          74      32.308
no-softirq-page_pool02          79      34.696          74      32.302
no-softirq-page_pool03          110     48.005          105     46.073
tasklet_page_pool01_fast_path   14      6.156           14      6.211
tasklet_page_pool02_ptr_ring    41      18.028          39      17.391
tasklet_page_pool03_slow        107     46.646          105     46.123

bench_page_pool_cross_cpu results, loops=20000000 returning_cpus=4:

test name                       stats enabled           stats disabled
                                cycles  nanosec         cycles  nanosec
page_pool_cross_cpu CPU(0)      3973    1731.596        4015    1750.015
page_pool_cross_cpu CPU(1)      3976    1733.217        4022    1752.864
page_pool_cross_cpu CPU(2)      3973    1731.615        4016    1750.433
page_pool_cross_cpu CPU(3)      3976    1733.218        4021    1752.806
page_pool_cross_cpu CPU(4)      994     433.305         1005    438.217
page_pool_cross_cpu average     3378    -               3415    -

bench_page_pool_cross_cpu results, loops=20000000 returning_cpus=8:

test name                       stats enabled           stats disabled
                                cycles  nanosec         cycles  nanosec
page_pool_cross_cpu CPU(0)      6969    3037.488        6909    3011.463
page_pool_cross_cpu CPU(1)      6974    3039.469        6913    3012.961
page_pool_cross_cpu CPU(2)      6969    3037.575        6910    3011.585
page_pool_cross_cpu CPU(3)      6974    3039.415        6913    3012.961
page_pool_cross_cpu CPU(4)      6969    3037.288        6909    3011.368
page_pool_cross_cpu CPU(5)      6972    3038.732        6913    3012.920
page_pool_cross_cpu CPU(6)      6969    3037.350        6909    3011.386
page_pool_cross_cpu CPU(7)      6973    3039.356        6913    3012.921
page_pool_cross_cpu CPU(8)      871     379.934         864     376.620
page_pool_cross_cpu average     6293    -               6239    -

Thanks.

[1]: https://lore.kernel.org/all/1645810914-35485-1-git-send-email-jdamato@fastly.com/
[2]: 327dcd71d1/v7_stats_disabled
[3]: 327dcd71d1/v7_stats_enabled

v8 -> v9:
  - Add documentation about the page_pool_get_stats API, stats structures,
    and fields to Documentation/networking/page_pool.rst.
  - Remove unnecessary #ifdef in page_pool_recycle_in_ring.

v7 -> v8:
  - Rename mlx5 ethtool stats so that users have a better idea of their
    meaning.

v6 -> v7:
  - Stats split out into two structs: a single per-page pool struct for
    allocation path stats and a per-cpu pointer for recycle path stats.
  - page_pool_get_stats updated to use a wrapper struct so allocation and
    recycle stats can be gathered with a single argument.
  - Placement of structs adjusted.
  - mlx5 driver modified to use the page_pool_get_stats API.

v5 -> v6:
  - Per-cpu page_pool_stats struct pointer is now marked as
    ____cacheline_aligned_in_smp. Placement of the field in the struct is
    unchanged; it is the last field.

v4 -> v5:
  - Fixed the description of the kernel option in Kconfig.
  - Squashed commits 1-10 from v4 into a single commit for easier review.
  - Changed the comment style of the comment for the this_cpu_inc_alloc_stat
    macro.
  - Changed the return type of page_pool_get_stats from
    struct page_pool_stat * to bool.

v3 -> v4:
  - Restructured stats to be per-cpu per-pool.
  - Global stats and proc file were removed.
  - Exposed an API (page_pool_get_stats) for batching the pool stats.

v2 -> v3:
  - patch 8/10 ("Add stat tracking cache refill") fixed placement of counter
    increment.
  - patch 10/10 ("net-procfs: Show page pool stats in proc") updated:
      - fix unused label warning from kernel test robot,
      - fixed page_pool_seq_show to only display the refill stat once,
      - added a remove_proc_entry for page_pool_stat to dev_proc_net_exit.

v1 -> v2:
  - A new kernel config option has been added, which defaults to N,
    preventing this code from being compiled in by default.
  - The stats structure has been converted to a per-cpu structure.
  - The stats are now exported via proc (/proc/net/page_pool_stat).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
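As a quick orientation before the diff, here is a minimal sketch of the consumer
side of the API this series adds. Only page_pool_get_stats(), struct
page_pool_stats, and its alloc_stats/recycle_stats fields come from the series
itself; the helper name and the printed message are illustrative:

    #include <linux/printk.h>
    #include <net/page_pool.h>

    #ifdef CONFIG_PAGE_POOL_STATS
    /* Gather the pool's counters and print two of them; a real driver
     * would typically copy the totals into its own ethtool counters,
     * as the mlx5 changes in this series do.
     */
    static void drv_report_page_pool_stats(struct page_pool *pool)
    {
            struct page_pool_stats stats = { 0 };

            if (!page_pool_get_stats(pool, &stats))
                    return;

            pr_info("pp alloc fast: %llu, recycled to ptr ring: %llu\n",
                    (unsigned long long)stats.alloc_stats.fast,
                    (unsigned long long)stats.recycle_stats.ring);
    }
    #endif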
Documentation/networking/page_pool.rst

@@ -105,6 +105,47 @@ a page will cause no race conditions is enough.

Please note the caller must not use data area after running
page_pool_put_page_bulk(), as this function overwrites it.

* page_pool_get_stats(): Retrieve statistics about the page_pool. This API
  is only available if the kernel has been configured with
  ``CONFIG_PAGE_POOL_STATS=y``. A pointer to a caller allocated ``struct
  page_pool_stats`` structure is passed to this API which is filled in. The
  caller can then report those stats to the user (perhaps via ethtool,
  debugfs, etc.). See below for an example usage of this API.

Stats API and structures
------------------------

If the kernel is configured with ``CONFIG_PAGE_POOL_STATS=y``, the API
``page_pool_get_stats()`` and structures described below are available. It
takes a pointer to a ``struct page_pool`` and a pointer to a ``struct
page_pool_stats`` allocated by the caller.

The API will fill in the provided ``struct page_pool_stats`` with
statistics about the page_pool.

The stats structure has the following fields::

    struct page_pool_stats {
        struct page_pool_alloc_stats alloc_stats;
        struct page_pool_recycle_stats recycle_stats;
    };

The ``struct page_pool_alloc_stats`` has the following fields:
  * ``fast``: successful fast path allocations
  * ``slow``: slow path order-0 allocations
  * ``slow_high_order``: slow path high order allocations
  * ``empty``: ptr ring is empty, so a slow path allocation was forced.
  * ``refill``: an allocation which triggered a refill of the cache
  * ``waive``: pages obtained from the ptr ring that cannot be added to
    the cache due to a NUMA mismatch.

The ``struct page_pool_recycle_stats`` has the following fields:
  * ``cached``: recycling placed page in the page pool cache
  * ``cache_full``: page pool cache was full
  * ``ring``: page placed into the ptr ring
  * ``ring_full``: page released from page pool because the ptr ring was full
  * ``released_refcnt``: page released (and not recycled) because refcnt > 1

Coding examples
===============

@@ -157,6 +198,21 @@ NAPI poller
        }
    }

Stats
-----

.. code-block:: c

    #ifdef CONFIG_PAGE_POOL_STATS
    /* retrieve stats */
    struct page_pool_stats stats = { 0 };
    if (page_pool_get_stats(page_pool, &stats)) {
        /* perhaps the driver reports statistics with ethtool */
        ethtool_print_allocation_stats(&stats.alloc_stats);
        ethtool_print_recycle_stats(&stats.recycle_stats);
    }
    #endif

Driver unload
-------------

drivers/net/ethernet/mellanox/mlx5/core/en_stats.c

@@ -37,6 +37,10 @@
#include "en/ptp.h"
#include "en/port.h"

#ifdef CONFIG_PAGE_POOL_STATS
#include <net/page_pool.h>
#endif

static unsigned int stats_grps_num(struct mlx5e_priv *priv)
{
        return !priv->profile->stats_grps_num ? 0 :

@@ -183,6 +187,19 @@ static const struct counter_desc sw_stats_desc[] = {
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_congst_umr) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_arfs_err) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_recover) },
#ifdef CONFIG_PAGE_POOL_STATS
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_fast) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_slow) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_slow_high_order) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_empty) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_refill) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_waive) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_cached) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_cache_full) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_ring) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_ring_full) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_released_ref) },
#endif
#ifdef CONFIG_MLX5_EN_TLS
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_tls_decrypted_packets) },
        { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_tls_decrypted_bytes) },

@@ -349,6 +366,19 @@ static void mlx5e_stats_grp_sw_update_stats_rq_stats(struct mlx5e_sw_stats *s,
        s->rx_congst_umr += rq_stats->congst_umr;
        s->rx_arfs_err += rq_stats->arfs_err;
        s->rx_recover += rq_stats->recover;
#ifdef CONFIG_PAGE_POOL_STATS
        s->rx_pp_alloc_fast += rq_stats->pp_alloc_fast;
        s->rx_pp_alloc_slow += rq_stats->pp_alloc_slow;
        s->rx_pp_alloc_empty += rq_stats->pp_alloc_empty;
        s->rx_pp_alloc_refill += rq_stats->pp_alloc_refill;
        s->rx_pp_alloc_waive += rq_stats->pp_alloc_waive;
        s->rx_pp_alloc_slow_high_order += rq_stats->pp_alloc_slow_high_order;
        s->rx_pp_recycle_cached += rq_stats->pp_recycle_cached;
        s->rx_pp_recycle_cache_full += rq_stats->pp_recycle_cache_full;
        s->rx_pp_recycle_ring += rq_stats->pp_recycle_ring;
        s->rx_pp_recycle_ring_full += rq_stats->pp_recycle_ring_full;
        s->rx_pp_recycle_released_ref += rq_stats->pp_recycle_released_ref;
#endif
#ifdef CONFIG_MLX5_EN_TLS
        s->rx_tls_decrypted_packets += rq_stats->tls_decrypted_packets;
        s->rx_tls_decrypted_bytes += rq_stats->tls_decrypted_bytes;

@@ -455,6 +485,35 @@ static void mlx5e_stats_grp_sw_update_stats_qos(struct mlx5e_priv *priv,
        }
}

#ifdef CONFIG_PAGE_POOL_STATS
static void mlx5e_stats_update_stats_rq_page_pool(struct mlx5e_channel *c)
{
        struct mlx5e_rq_stats *rq_stats = c->rq.stats;
        struct page_pool *pool = c->rq.page_pool;
        struct page_pool_stats stats = { 0 };

        if (!page_pool_get_stats(pool, &stats))
                return;

        rq_stats->pp_alloc_fast = stats.alloc_stats.fast;
        rq_stats->pp_alloc_slow = stats.alloc_stats.slow;
        rq_stats->pp_alloc_slow_high_order = stats.alloc_stats.slow_high_order;
        rq_stats->pp_alloc_empty = stats.alloc_stats.empty;
        rq_stats->pp_alloc_waive = stats.alloc_stats.waive;
        rq_stats->pp_alloc_refill = stats.alloc_stats.refill;

        rq_stats->pp_recycle_cached = stats.recycle_stats.cached;
        rq_stats->pp_recycle_cache_full = stats.recycle_stats.cache_full;
        rq_stats->pp_recycle_ring = stats.recycle_stats.ring;
        rq_stats->pp_recycle_ring_full = stats.recycle_stats.ring_full;
        rq_stats->pp_recycle_released_ref = stats.recycle_stats.released_refcnt;
}
#else
static void mlx5e_stats_update_stats_rq_page_pool(struct mlx5e_channel *c)
{
}
#endif

static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw)
{
        struct mlx5e_sw_stats *s = &priv->stats.sw;

@@ -465,8 +524,11 @@ static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw)
        for (i = 0; i < priv->stats_nch; i++) {
                struct mlx5e_channel_stats *channel_stats =
                        priv->channel_stats[i];

                int j;

                mlx5e_stats_update_stats_rq_page_pool(priv->channels.c[i]);

                mlx5e_stats_grp_sw_update_stats_rq_stats(s, &channel_stats->rq);
                mlx5e_stats_grp_sw_update_stats_xdpsq(s, &channel_stats->rq_xdpsq);
                mlx5e_stats_grp_sw_update_stats_ch_stats(s, &channel_stats->ch);

@@ -1887,6 +1949,19 @@ static const struct counter_desc rq_stats_desc[] = {
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, congst_umr) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, arfs_err) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, recover) },
#ifdef CONFIG_PAGE_POOL_STATS
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_fast) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_slow) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_slow_high_order) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_empty) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_refill) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_waive) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_cached) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_cache_full) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_ring) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_ring_full) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_released_ref) },
#endif
#ifdef CONFIG_MLX5_EN_TLS
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, tls_decrypted_packets) },
        { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, tls_decrypted_bytes) },

drivers/net/ethernet/mellanox/mlx5/core/en_stats.h

@@ -205,7 +205,19 @@ struct mlx5e_sw_stats {
        u64 ch_aff_change;
        u64 ch_force_irq;
        u64 ch_eq_rearm;

#ifdef CONFIG_PAGE_POOL_STATS
        u64 rx_pp_alloc_fast;
        u64 rx_pp_alloc_slow;
        u64 rx_pp_alloc_slow_high_order;
        u64 rx_pp_alloc_empty;
        u64 rx_pp_alloc_refill;
        u64 rx_pp_alloc_waive;
        u64 rx_pp_recycle_cached;
        u64 rx_pp_recycle_cache_full;
        u64 rx_pp_recycle_ring;
        u64 rx_pp_recycle_ring_full;
        u64 rx_pp_recycle_released_ref;
#endif
#ifdef CONFIG_MLX5_EN_TLS
        u64 tx_tls_encrypted_packets;
        u64 tx_tls_encrypted_bytes;

@@ -352,6 +364,19 @@ struct mlx5e_rq_stats {
        u64 congst_umr;
        u64 arfs_err;
        u64 recover;
#ifdef CONFIG_PAGE_POOL_STATS
        u64 pp_alloc_fast;
        u64 pp_alloc_slow;
        u64 pp_alloc_slow_high_order;
        u64 pp_alloc_empty;
        u64 pp_alloc_refill;
        u64 pp_alloc_waive;
        u64 pp_recycle_cached;
        u64 pp_recycle_cache_full;
        u64 pp_recycle_ring;
        u64 pp_recycle_ring_full;
        u64 pp_recycle_released_ref;
#endif
#ifdef CONFIG_MLX5_EN_TLS
        u64 tls_decrypted_packets;
        u64 tls_decrypted_bytes;

include/net/page_pool.h

@@ -84,6 +84,48 @@ struct page_pool_params {
        void *init_arg;
};

#ifdef CONFIG_PAGE_POOL_STATS
struct page_pool_alloc_stats {
        u64 fast; /* fast path allocations */
        u64 slow; /* slow-path order 0 allocations */
        u64 slow_high_order; /* slow-path high order allocations */
        u64 empty; /* failed refills due to empty ptr ring, forcing
                    * slow path allocation
                    */
        u64 refill; /* allocations via successful refill */
        u64 waive;  /* failed refills due to numa zone mismatch */
};

struct page_pool_recycle_stats {
        u64 cached;     /* recycling placed page in the cache. */
        u64 cache_full; /* cache was full */
        u64 ring;       /* recycling placed page back into ptr ring */
        u64 ring_full;  /* page was released from page-pool because
                         * PTR ring was full.
                         */
        u64 released_refcnt; /* page released because of elevated
                              * refcnt
                              */
};

/* This struct wraps the above stats structs so users of the
 * page_pool_get_stats API can pass a single argument when requesting the
 * stats for the page pool.
 */
struct page_pool_stats {
        struct page_pool_alloc_stats alloc_stats;
        struct page_pool_recycle_stats recycle_stats;
};

/*
 * Drivers that wish to harvest page pool stats and report them to users
 * (perhaps via ethtool, debugfs, or another mechanism) can allocate a
 * struct page_pool_stats and call page_pool_get_stats to get stats for
 * the specified pool.
 */
bool page_pool_get_stats(struct page_pool *pool,
                         struct page_pool_stats *stats);
#endif

struct page_pool {
        struct page_pool_params p;

@@ -96,6 +138,11 @@ struct page_pool {
        unsigned int frag_offset;
        struct page *frag_page;
        long frag_users;

#ifdef CONFIG_PAGE_POOL_STATS
        /* these stats are incremented while in softirq context */
        struct page_pool_alloc_stats alloc_stats;
#endif
        u32 xdp_mem_id;

        /*

@@ -126,6 +173,10 @@ struct page_pool {
         */
        struct ptr_ring ring;

#ifdef CONFIG_PAGE_POOL_STATS
        /* recycle stats are per-cpu to avoid locking */
        struct page_pool_recycle_stats __percpu *recycle_stats;
#endif
        atomic_t pages_state_release_cnt;

        /* A page_pool is strictly tied to a single RX-queue being

net/Kconfig

@@ -434,6 +434,19 @@ config NET_DEVLINK
config PAGE_POOL
        bool

config PAGE_POOL_STATS
        default n
        bool "Page pool stats"
        depends on PAGE_POOL
        help
          Enable page pool statistics to track page allocation and recycling
          in page pools. This option incurs additional CPU cost in allocation
          and recycle paths and additional memory cost to store the statistics.
          These statistics are only available if this option is enabled and if
          the driver using the page pool supports exporting this data.

          If unsure, say N.

config FAILOVER
        tristate "Generic failover module"
        help

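Since both the stats structures and page_pool_get_stats() exist only when this
option is set, a driver will usually confine the dependency to one small helper
with an empty fallback, which is the pattern the mlx5 changes above follow. A
minimal sketch of that pattern; the helper and the rx-queue type are made up
for illustration:

    #include <net/page_pool.h>

    struct drv_rxq {                        /* hypothetical driver rx-queue type */
            struct page_pool *page_pool;
    };

    #ifdef CONFIG_PAGE_POOL_STATS
    static void drv_update_pp_stats(struct drv_rxq *rxq)
    {
            struct page_pool_stats stats = { 0 };

            if (!page_pool_get_stats(rxq->page_pool, &stats))
                    return;

            /* copy stats.alloc_stats.* and stats.recycle_stats.* into
             * the driver's own counters here
             */
    }
    #else
    static void drv_update_pp_stats(struct drv_rxq *rxq)
    {
    }
    #endif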
net/core/page_pool.c

@@ -26,6 +26,45 @@

#define BIAS_MAX LONG_MAX

#ifdef CONFIG_PAGE_POOL_STATS
/* alloc_stat_inc is intended to be used in softirq context */
#define alloc_stat_inc(pool, __stat) (pool->alloc_stats.__stat++)
/* recycle_stat_inc is safe to use when preemption is possible. */
#define recycle_stat_inc(pool, __stat)                                          \
        do {                                                                    \
                struct page_pool_recycle_stats __percpu *s = pool->recycle_stats; \
                this_cpu_inc(s->__stat);                                        \
        } while (0)

bool page_pool_get_stats(struct page_pool *pool,
                         struct page_pool_stats *stats)
{
        int cpu = 0;

        if (!stats)
                return false;

        memcpy(&stats->alloc_stats, &pool->alloc_stats, sizeof(pool->alloc_stats));

        for_each_possible_cpu(cpu) {
                const struct page_pool_recycle_stats *pcpu =
                        per_cpu_ptr(pool->recycle_stats, cpu);

                stats->recycle_stats.cached += pcpu->cached;
                stats->recycle_stats.cache_full += pcpu->cache_full;
                stats->recycle_stats.ring += pcpu->ring;
                stats->recycle_stats.ring_full += pcpu->ring_full;
                stats->recycle_stats.released_refcnt += pcpu->released_refcnt;
        }

        return true;
}
EXPORT_SYMBOL(page_pool_get_stats);
#else
#define alloc_stat_inc(pool, __stat)
#define recycle_stat_inc(pool, __stat)
#endif

static int page_pool_init(struct page_pool *pool,
                          const struct page_pool_params *params)
{

@@ -73,6 +112,12 @@ static int page_pool_init(struct page_pool *pool,
            pool->p.flags & PP_FLAG_PAGE_FRAG)
                return -EINVAL;

#ifdef CONFIG_PAGE_POOL_STATS
        pool->recycle_stats = alloc_percpu(struct page_pool_recycle_stats);
        if (!pool->recycle_stats)
                return -ENOMEM;
#endif

        if (ptr_ring_init(&pool->ring, ring_qsize, GFP_KERNEL) < 0)
                return -ENOMEM;

@@ -117,8 +162,10 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
        int pref_nid; /* preferred NUMA node */

        /* Quicker fallback, avoid locks when ring is empty */
        if (__ptr_ring_empty(r))
        if (__ptr_ring_empty(r)) {
                alloc_stat_inc(pool, empty);
                return NULL;
        }

        /* Softirq guarantee CPU and thus NUMA node is stable. This,
         * assumes CPU refilling driver RX-ring will also run RX-NAPI.

@@ -145,14 +192,17 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
                         * This limit stress on page buddy alloactor.
                         */
                        page_pool_return_page(pool, page);
                        alloc_stat_inc(pool, waive);
                        page = NULL;
                        break;
                }
        } while (pool->alloc.count < PP_ALLOC_CACHE_REFILL);

        /* Return last page */
        if (likely(pool->alloc.count > 0))
        if (likely(pool->alloc.count > 0)) {
                page = pool->alloc.cache[--pool->alloc.count];
                alloc_stat_inc(pool, refill);
        }

        return page;
}

@@ -166,6 +216,7 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
        if (likely(pool->alloc.count)) {
                /* Fast-path */
                page = pool->alloc.cache[--pool->alloc.count];
                alloc_stat_inc(pool, fast);
        } else {
                page = page_pool_refill_alloc_cache(pool);
        }

@@ -239,6 +290,7 @@ static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
                return NULL;
        }

        alloc_stat_inc(pool, slow_high_order);
        page_pool_set_pp_info(pool, page);

        /* Track how many pages are held 'in-flight' */

@@ -293,10 +345,12 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
        }

        /* Return last page */
        if (likely(pool->alloc.count > 0))
        if (likely(pool->alloc.count > 0)) {
                page = pool->alloc.cache[--pool->alloc.count];
        else
                alloc_stat_inc(pool, slow);
        } else {
                page = NULL;
        }

        /* When page just alloc'ed is should/must have refcnt 1. */
        return page;

@@ -394,7 +448,12 @@ static bool page_pool_recycle_in_ring(struct page_pool *pool, struct page *page)
        else
                ret = ptr_ring_produce_bh(&pool->ring, page);

        return (ret == 0) ? true : false;
        if (!ret) {
                recycle_stat_inc(pool, ring);
                return true;
        }

        return false;
}

/* Only allow direct recycling in special circumstances, into the

@@ -405,11 +464,14 @@ static bool page_pool_recycle_in_ring(struct page_pool *pool, struct page *page)
static bool page_pool_recycle_in_cache(struct page *page,
                                       struct page_pool *pool)
{
        if (unlikely(pool->alloc.count == PP_ALLOC_CACHE_SIZE))
        if (unlikely(pool->alloc.count == PP_ALLOC_CACHE_SIZE)) {
                recycle_stat_inc(pool, cache_full);
                return false;
        }

        /* Caller MUST have verified/know (page_ref_count(page) == 1) */
        pool->alloc.cache[pool->alloc.count++] = page;
        recycle_stat_inc(pool, cached);
        return true;
}

@@ -459,6 +521,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
         * doing refcnt based recycle tricks, meaning another process
         * will be invoking put_page.
         */
        recycle_stat_inc(pool, released_refcnt);
        /* Do not replace this with page_pool_return_page() */
        page_pool_release_page(pool, page);
        put_page(page);

@@ -472,6 +535,7 @@ void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
        page = __page_pool_put_page(pool, page, dma_sync_size, allow_direct);
        if (page && !page_pool_recycle_in_ring(pool, page)) {
                /* Cache full, fallback to free pages */
                recycle_stat_inc(pool, ring_full);
                page_pool_return_page(pool, page);
        }
}

@@ -620,6 +684,9 @@ static void page_pool_free(struct page_pool *pool)
        if (pool->p.flags & PP_FLAG_DMA_MAP)
                put_device(pool->p.dev);

#ifdef CONFIG_PAGE_POOL_STATS
        free_percpu(pool->recycle_stats);
#endif
        kfree(pool);
}