forked from Minki/linux
b399c46ea0
Pull media updates from Mauro Carvalho Chehab: - a new jpeg codec driver for Samsung Exynos (jpeg-hw-exynos4) - a new dvb frontend for ds2103 chipset (m88ds2103) - a new sensor driver for Samsung S5K5BAF UXGA (s5k5baf) - new drivers for R-Car VSP1 - a new radio driver: radio-raremono - a new tuner driver for ts2022 chipset (m88ts2022) - the analog part of em28xx is now a separate module that only load/runs if the device is not a pure digital TV device - added a staging driver for bcm2048 radio devices - the omap 2 video driver (omap24xx) was moved to staging. This driver is for an old hardware and uses a deprecated Kernel internal API. If nobody cares enough to fix it, it would be removed on a couple Kernel releases - the sn9c102 driver was moved to staging. This driver was replaced by gspca, and disabled on some distros, as almost all devices are known to work properly with gspca. It should be removed from kernel on a couple Kernel releases - lots of driver fixes, improvements and cleanups * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (421 commits) [media] media: v4l2-dev: fix video device index assignment [media] rc-core: reuse device numbers [media] em28xx-cards: properly initialize the device bitmap [media] Staging: media: Fix line length exceeding 80 characters in as102_drv.c [media] Staging: media: Fix line length exceeding 80 characters in as102_fe.c [media] Staging: media: Fix quoted string split across line in as102_fe.c [media] media: st-rc: Add reset support [media] m2m-deinterlace: fix allocated struct type [media] radio-usb-si4713: fix sparse non static symbol warnings [media] em28xx-audio: remove needless check before usb_free_coherent() [media] au0828: Fix sparse non static symbol warning Revert "[media] go7007-usb: only use go->dev after allocated" [media] em28xx-audio: provide an error code when URB submit fails [media] em28xx: fix check for audio only usb interfaces when changing the usb alternate setting [media] em28xx: fix usb alternate setting for analog and digital video endpoints > 0 [media] em28xx: make 'em28xx_ctrl_ops' static em28xx-alsa: Fix error patch for init/fini [media] em28xx-audio: flush work at .fini [media] drxk: remove the option to load firmware asynchronously [media] em28xx: adjust period size at runtime ...
206 lines
6.4 KiB
Plaintext
206 lines
6.4 KiB
Plaintext
|
|
The Resource Counter
|
|
|
|
The resource counter, declared at include/linux/res_counter.h,
|
|
is supposed to facilitate the resource management by controllers
|
|
by providing common stuff for accounting.
|
|
|
|
This "stuff" includes the res_counter structure and routines
|
|
to work with it.
|
|
|
|
|
|
|
|
1. Crucial parts of the res_counter structure
|
|
|
|
a. unsigned long long usage
|
|
|
|
The usage value shows the amount of a resource that is consumed
|
|
by a group at a given time. The units of measurement should be
|
|
determined by the controller that uses this counter. E.g. it can
|
|
be bytes, items or any other unit the controller operates on.
|
|
|
|
b. unsigned long long max_usage
|
|
|
|
The maximal value of the usage over time.
|
|
|
|
This value is useful when gathering statistical information about
|
|
the particular group, as it shows the actual resource requirements
|
|
for a particular group, not just some usage snapshot.
|
|
|
|
c. unsigned long long limit
|
|
|
|
The maximal allowed amount of resource to consume by the group. In
|
|
case the group requests for more resources, so that the usage value
|
|
would exceed the limit, the resource allocation is rejected (see
|
|
the next section).
|
|
|
|
d. unsigned long long failcnt
|
|
|
|
The failcnt stands for "failures counter". This is the number of
|
|
resource allocation attempts that failed.
|
|
|
|
c. spinlock_t lock
|
|
|
|
Protects changes of the above values.
|
|
|
|
|
|
|
|
2. Basic accounting routines
|
|
|
|
a. void res_counter_init(struct res_counter *rc,
|
|
struct res_counter *rc_parent)
|
|
|
|
Initializes the resource counter. As usual, should be the first
|
|
routine called for a new counter.
|
|
|
|
The struct res_counter *parent can be used to define a hierarchical
|
|
child -> parent relationship directly in the res_counter structure,
|
|
NULL can be used to define no relationship.
|
|
|
|
c. int res_counter_charge(struct res_counter *rc, unsigned long val,
|
|
struct res_counter **limit_fail_at)
|
|
|
|
When a resource is about to be allocated it has to be accounted
|
|
with the appropriate resource counter (controller should determine
|
|
which one to use on its own). This operation is called "charging".
|
|
|
|
This is not very important which operation - resource allocation
|
|
or charging - is performed first, but
|
|
* if the allocation is performed first, this may create a
|
|
temporary resource over-usage by the time resource counter is
|
|
charged;
|
|
* if the charging is performed first, then it should be uncharged
|
|
on error path (if the one is called).
|
|
|
|
If the charging fails and a hierarchical dependency exists, the
|
|
limit_fail_at parameter is set to the particular res_counter element
|
|
where the charging failed.
|
|
|
|
d. int res_counter_charge_locked
|
|
(struct res_counter *rc, unsigned long val, bool force)
|
|
|
|
The same as res_counter_charge(), but it must not acquire/release the
|
|
res_counter->lock internally (it must be called with res_counter->lock
|
|
held). The force parameter indicates whether we can bypass the limit.
|
|
|
|
e. u64 res_counter_uncharge[_locked]
|
|
(struct res_counter *rc, unsigned long val)
|
|
|
|
When a resource is released (freed) it should be de-accounted
|
|
from the resource counter it was accounted to. This is called
|
|
"uncharging". The return value of this function indicate the amount
|
|
of charges still present in the counter.
|
|
|
|
The _locked routines imply that the res_counter->lock is taken.
|
|
|
|
f. u64 res_counter_uncharge_until
|
|
(struct res_counter *rc, struct res_counter *top,
|
|
unsigned long val)
|
|
|
|
Almost same as res_counter_uncharge() but propagation of uncharge
|
|
stops when rc == top. This is useful when kill a res_counter in
|
|
child cgroup.
|
|
|
|
2.1 Other accounting routines
|
|
|
|
There are more routines that may help you with common needs, like
|
|
checking whether the limit is reached or resetting the max_usage
|
|
value. They are all declared in include/linux/res_counter.h.
|
|
|
|
|
|
|
|
3. Analyzing the resource counter registrations
|
|
|
|
a. If the failcnt value constantly grows, this means that the counter's
|
|
limit is too tight. Either the group is misbehaving and consumes too
|
|
many resources, or the configuration is not suitable for the group
|
|
and the limit should be increased.
|
|
|
|
b. The max_usage value can be used to quickly tune the group. One may
|
|
set the limits to maximal values and either load the container with
|
|
a common pattern or leave one for a while. After this the max_usage
|
|
value shows the amount of memory the container would require during
|
|
its common activity.
|
|
|
|
Setting the limit a bit above this value gives a pretty good
|
|
configuration that works in most of the cases.
|
|
|
|
c. If the max_usage is much less than the limit, but the failcnt value
|
|
is growing, then the group tries to allocate a big chunk of resource
|
|
at once.
|
|
|
|
d. If the max_usage is much less than the limit, but the failcnt value
|
|
is 0, then this group is given too high limit, that it does not
|
|
require. It is better to lower the limit a bit leaving more resource
|
|
for other groups.
|
|
|
|
|
|
|
|
4. Communication with the control groups subsystem (cgroups)
|
|
|
|
All the resource controllers that are using cgroups and resource counters
|
|
should provide files (in the cgroup filesystem) to work with the resource
|
|
counter fields. They are recommended to adhere to the following rules:
|
|
|
|
a. File names
|
|
|
|
Field name File name
|
|
---------------------------------------------------
|
|
usage usage_in_<unit_of_measurement>
|
|
max_usage max_usage_in_<unit_of_measurement>
|
|
limit limit_in_<unit_of_measurement>
|
|
failcnt failcnt
|
|
lock no file :)
|
|
|
|
b. Reading from file should show the corresponding field value in the
|
|
appropriate format.
|
|
|
|
c. Writing to file
|
|
|
|
Field Expected behavior
|
|
----------------------------------
|
|
usage prohibited
|
|
max_usage reset to usage
|
|
limit set the limit
|
|
failcnt reset to zero
|
|
|
|
|
|
|
|
5. Usage example
|
|
|
|
a. Declare a task group (take a look at cgroups subsystem for this) and
|
|
fold a res_counter into it
|
|
|
|
struct my_group {
|
|
struct res_counter res;
|
|
|
|
<other fields>
|
|
}
|
|
|
|
b. Put hooks in resource allocation/release paths
|
|
|
|
int alloc_something(...)
|
|
{
|
|
if (res_counter_charge(res_counter_ptr, amount) < 0)
|
|
return -ENOMEM;
|
|
|
|
<allocate the resource and return to the caller>
|
|
}
|
|
|
|
void release_something(...)
|
|
{
|
|
res_counter_uncharge(res_counter_ptr, amount);
|
|
|
|
<release the resource>
|
|
}
|
|
|
|
In order to keep the usage value self-consistent, both the
|
|
"res_counter_ptr" and the "amount" in release_something() should be
|
|
the same as they were in the alloc_something() when the releasing
|
|
resource was allocated.
|
|
|
|
c. Provide the way to read res_counter values and set them (the cgroups
|
|
still can help with it).
|
|
|
|
c. Compile and run :)
|