Commit Graph

47183 Commits

Author SHA1 Message Date
Maheshwar Ajja
4ad1b0d410 media: v4l2-ctrls: Add encoder constant quality control
When V4L2_CID_MPEG_VIDEO_BITRATE_MODE value is
V4L2_MPEG_VIDEO_BITRATE_MODE_CQ, encoder will produce
constant quality output indicated by
V4L2_CID_MPEG_VIDEO_CONSTANT_QUALITY control value.
Encoder will choose appropriate quantization parameter
and bitrate to produce requested frame quality level.

Signed-off-by: Maheshwar Ajja <majja@codeaurora.org>
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:29 +02:00
Ezequiel Garcia
54889c51b8 media: uapi: h264: Rename and clarify PPS_FLAG_SCALING_MATRIX_PRESENT
Applications are expected to fill V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX
if a non-flat scaling matrix applies to the picture. This is the case if
SPS scaling_matrix_present_flag or PPS pic_scaling_matrix_present_flag
are set, and should be handled by applications.

On one hand, the PPS bitstream syntax element signals the presence of a
Picture scaling matrix modifying the Sequence (SPS) scaling matrix.
On the other hand, our flag should indicate if the scaling matrix
V4L2 control is applicable to this request.

Rename the flag from PPS_FLAG_PIC_SCALING_MATRIX_PRESENT to
PPS_FLAG_SCALING_MATRIX_PRESENT, to avoid mixing this flag with
bitstream syntax element pic_scaling_matrix_present_flag,
and clarify the meaning of our flag.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
d935856317 media: uapi: h264: Clean slice invariants syntax elements
The H.264 specification requires in section 7.4.3 "Slice header semantics",
that the following values shall be the same in all slice headers:

  pic_parameter_set_id
  frame_num
  field_pic_flag
  bottom_field_flag
  idr_pic_id
  pic_order_cnt_lsb
  delta_pic_order_cnt_bottom
  delta_pic_order_cnt[ 0 ]
  delta_pic_order_cnt[ 1 ]
  sp_for_switch_flag
  slice_group_change_cycle

These bitstream fields are part of the slice header, and therefore
passed redundantly on each slice. The purpose of the redundancy
is to make the codec fault-tolerant in network scenarios.

This is of course not needed to be reflected in the V4L2 controls,
given the bitstream has already been parsed by applications.
Therefore, move the redundant fields to the per-frame decode
parameters control (DECODE_PARAMS).

Field 'pic_parameter_set_id' is simply removed in this case,
because the PPS control must currently contain the active PPS.

Syntax elements dec_ref_pic_marking() and those related
to pic order count, remain invariant as well, and therefore,
the fields dec_ref_pic_marking_bit_size and pic_order_cnt_bit_size
are also common to all slices.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
2287c5e65c media: uapi: h264: Clarify SLICE_BASED mode
Currently, the SLICE_BASED and FRAME_BASED modes documentation
is misleading and not matching the intended use-cases.

Drop non-required fields SLICE_PARAMS 'start_byte_offset' and
DECODE_PARAMS 'num_slices' and clarify the decoding modes in the
documentation.

On SLICE_BASED mode, a single slice is expected per OUTPUT buffer,
and therefore 'start_byte_offset' is not needed (since the offset
to the slice is the start of the buffer).

This mode requires the use of CAPTURE buffer holding, and so
the number of slices shall not be required.

On FRAME_BASED mode, the devices are expected to take care of slice
parsing. Neither SLICE_PARAMS are required (and shouldn't be
exposed by frame-based drivers), nor the number of slices.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
f6f0d58edf media: uapi: h264: Drop SLICE_PARAMS 'size' field
The SLICE_PARAMS control is intended for slice-based
devices. In this mode, the OUTPUT buffer contains
a single slice, and so the buffer's plane payload size
can be used to query the slice size.

To reduce the API surface drop the size from the
SLICE_PARAMS control.

A follow-up change will remove other members in SLICE_PARAMS
so we don't need to add padding fields here.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
f9879eb378 media: uapi: h264: Increase size of DPB entry pic_num
DPB entry PicNum maximum value is 2*MaxFrameNum for interlaced
content (field_pic_flag=1).

As specified, MaxFrameNum is 2^(log2_max_frame_num_minus4 + 4)
and log2_max_frame_num_minus4 is in the range of 0 to 12,
which means pic_num should be a 32-bit field.

The v4l2_h264_dpb_entry struct needs to be padded to avoid a hole,
which might be also useful to allow future uAPI extensions.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
c02ff21952 media: uapi: h264: Clean DPB entry interface
As discussed recently, the current interface for the
Decoded Picture Buffer is not enough to properly
support field coding.

This commit introduces enough semantics to support
frame and field coding, and to signal how DPB entries
are "used for reference".

Reserved fields will be added by a follow-up commit.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
4245232fa6 media: uapi: h264: Increase size of 'first_mb_in_slice' field
Slice header syntax element 'first_mb_in_slice' can point
to the last macroblock, currently the field can only reference
65536 macroblocks which is insufficient for 8K videos.

Although unlikely, a 8192x4320 video (where macroblocks are 16x16),
would contain 138240 macroblocks on a frame.

As per the H264 specification, 'first_mb_in_slice' can be up to
PicSizeInMbs - 1, so increase the size of the field to 32-bits.

Note that v4l2_ctrl_h264_slice_params struct will be modified
in a follow-up commit, and so we defer its 64-bit padding.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Philipp Zabel
fb92c56312 media: uapi: h264: Clarify pic_order_cnt_bit_size field
Since pic_order_cnt_bit_size is not a syntax element itself, explicitly
state that it is the total size in bits of the pic_order_cnt_lsb,
delta_pic_order_cnt_bottom, delta_pic_order_cnt[0], and
delta_pic_order_cnt[1] syntax elements contained in the slice.

[Ezequiel: rebase]

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
eb44c6c9c2 media: uapi: h264: Split prediction weight parameters
The prediction weight parameters are only required under
certain conditions, which depend on slice header parameters.

As specified in section 7.3.3 Slice header syntax, the prediction
weight table is present if:

((weighted_pred_flag && (slice_type == P || slice_type == SP)) || \
(weighted_bipred_idc == 1 && slice_type == B))

Given its size, it makes sense to move this table to its control,
so applications can avoid passing it if the slice doesn't specify it.

Before this change struct v4l2_ctrl_h264_slice_params was 960 bytes.
With this change, it's 188 bytes and struct v4l2_ctrl_h264_pred_weight
is 772 bytes.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Ezequiel Garcia
cefdf80584 media: uapi: h264: Further clarify scaling lists order
Commit 0b0393d59e ("media: uapi: h264: clarify
expected scaling_list_4x4/8x8 order") improved the
documentation on H264 scaling lists order.

This commit improves the documentation by clarifying
that the lists themselves are expected in raster scan order.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:28 +02:00
Jernej Skrabec
e000e1fa4b media: uapi: h264: Update reference lists
When dealing with interlaced frames, reference lists must tell if
each particular reference is meant for top or bottom field. This info
is currently not provided at all in the H264 related controls.

Change reference lists to hold a structure, which specifies
an index into the DPB array and the field/frame specification
for the picture.

Currently the only user of these lists is Cedrus which is just compile
fixed here. Actual usage of will come in a following commit.

Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:27 +02:00
Sakari Ailus
e4cf8c58af media: Documentation: media: Document how to write camera sensor drivers
While we have had some example drivers, there has been up to date no
formal documentation on how camera sensor drivers should be written; what
are the practices, why, and where they apply.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:27 +02:00
Jordan Hand
9eb88a819f media: ipu3.rst: Format media-ctl and yavta commands as code blocks
Fix improper line breaks and format all example yavta and media-ctl
commands as code blocks to improve readability.

Signed-off-by: Jordan Hand <jorhand@linux.microsoft.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:27 +02:00
Jacopo Mondi
09e0046036 media: dt-bindings: media: ov5647: Document clock-noncontinuous
Document the optional clock-noncontinuous endpoint property that
allows enabling MIPI CSI-2 non-continuous clock operations.

Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:27 +02:00
Jacopo Mondi
a541298877 media: dt-bindings: media: ov5647: Document pwdn-gpios
Document in dt-schema bindings for the ov5647 sensor the optional
'pwdn-gpios' property.

Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:27 +02:00
Jacopo Mondi
93d087f8e6 media: dt-bindings: media: ov5647: Convert to json-schema
Convert the ov5647 image sensor bindings to DT schema and add
the file entry to MAINTAINERS.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:26 +02:00
Jonathan Bakker
31163906f1 media: dt-bindings: media: Correct samsung-fimc parallel port numbering
The parallel port nodes should be numbered 1 and 2, not 0 and 1
for A and B respectively.  The driver has always implemented 1
and 2 and the in-tree Goni DTS uses 1 as port A as well.  Update
the documentation to match this behaviour.

Signed-off-by: Jonathan Bakker <xc-racer2@live.ca>
Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01 14:13:26 +02:00
Dafna Hirschfeld
47ad02d12e media: Documentation: v4l: move table of v4l2_pix_format(_mplane) flags to pixfmt-v4l2.rst
The table of the flags of the structs
v4l2_pix_format(_mplane) is currently in pixfmt-reserved.rst
which is wrong, it should be in pixfmt-v4l2.rst

Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-08-29 08:30:13 +02:00
Xia Jiang
3e66e1d8e3 media: dt-bindings: Add jpeg enc device tree node document
Add jpeg enc device tree node document.

Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Xia Jiang <xia.jiang@mediatek.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-08-28 15:35:53 +02:00
Sowjanya Komatineni
b73be49942 media: dt-bindings: tegra: Update VI and CSI bindings with port info
Update VI and CSI bindings to add port and endpoint nodes as per
media video-interfaces DT binding document.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-08-28 15:10:10 +02:00
Hans Verkuil
b305dfe2e9 media: videodev2.h: RGB BT2020 and HSV are always full range
The default RGB quantization range for BT.2020 is full range (just as for
all the other RGB pixel encodings), not limited range.

Update the V4L2_MAP_QUANTIZATION_DEFAULT macro and documentation
accordingly.

Also mention that HSV is always full range and cannot be limited range.

When RGB BT2020 was introduced in V4L2 it was not clear whether it should
be limited or full range, but full range is the right (and consistent)
choice.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-08-26 16:39:35 +02:00
Hans Verkuil
1c5a9be98e media: dev-sliced-vbi.rst: fix wrong type
The documentation reported service_set as a __u32, but according to
videodev2.h it is a __u16. Correct the documentation.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-08-26 16:39:01 +02:00
Linus Torvalds
50f6c7dbd9 Merge tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes and small updates all around the place:

   - Fix mitigation state sysfs output

   - Fix an FPU xstate/sxave code assumption bug triggered by
     Architectural LBR support

   - Fix Lightning Mountain SoC TSC frequency enumeration bug

   - Fix kexec debug output

   - Fix kexec memory range assumption bug

   - Fix a boundary condition in the crash kernel code

   - Optimize porgatory.ro generation a bit

   - Enable ACRN guests to use X2APIC mode

   - Reduce a __text_poke() IRQs-off critical section for the benefit of
     PREEMPT_RT"

* tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Acquire pte lock with interrupts enabled
  x86/bugs/multihit: Fix mitigation reporting when VMX is not in use
  x86/fpu/xstate: Fix an xstate size check warning with architectural LBRs
  x86/purgatory: Don't generate debug info for purgatory.ro
  x86/tsr: Fix tsc frequency enumeration bug on Lightning Mountain SoC
  kexec_file: Correctly output debugging information for the PT_LOAD ELF header
  kexec: Improve & fix crash_exclude_mem_range() to handle overlapping ranges
  x86/crash: Correct the address boundary of function parameters
  x86/acrn: Remove redundant chars from ACRN signature
  x86/acrn: Allow ACRN guest to use X2APIC mode
2020-08-15 10:38:03 -07:00
Linus Torvalds
b07175dc41 Merge tag 'devicetree-fixes-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
 "Another round of 'allOf' removals and whitespace clean-ups of schemas"

* tag 'devicetree-fixes-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: Remove more cases of 'allOf' containing a '$ref'
  dt-bindings: Whitespace clean-ups in schema files
2020-08-15 08:19:58 -07:00
Linus Torvalds
1a5d9dbbaf Merge tag 'pm-5.9-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull one more power management update from Rafael Wysocki:
 "Modify the intel_pstate driver to allow it to work in the passive mode
  with hardware-managed P-states (HWP) enabled"

* tag 'pm-5.9-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Implement passive mode with HWP enabled
2020-08-15 08:17:01 -07:00
Linus Torvalds
884e0d3dd5 Merge tag 'mfd-next-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
 "Core Frameworks
   - Make better attempt at matching device with the correct OF node
   - Allow batch removal of hierarchical sub-devices

  New Drivers
   - Add STM32 Clocksource driver
   - Add support for Khadas System Control Microcontroller

  Driver Removal
   - Remove unused driver for TI's SMSC ECE1099

  New Device Support
   - Add support for Intel Emmitsburg PCH to Intel LPSS PCI
   - Add support for Intel Tiger Lake PCH-H to Intel LPSS PCI
   - Add support for Dialog DA revision to Dialog DA9063

  New Functionality
   - Add support for AXP803 to be probed by I2C

  Fix-ups
   - Numerous W=1 warning fixes
   - Device Tree changes (stm32-lptimer, gateworks-gsc, khadas,mcu, stmfx, cros-ec, j721e-system-controller)
   - Enabled Regmap 'fast I/O' in stm32-lptimer
   - Change BUG_ON to WARN_ON in arizona-core
   - Remove superfluous code/initialisation (madera, max14577)
   - Trivial formatting/spelling issues (madera-core, madera-i2c, da9055, max77693-private)
   - Switch to of_platform_populate() in sprd-sc27xx-spi
   - Expand out set/get brightness/pwm macros in lm3533-ctrlbank
   - Disable IRQs on suspend in motorola-cpcap
   - Clean-up error handling in intel_soc_pmic_mrfld
   - Ensure correct removal order of sub-devices in madera
   - Many s/HTTP/HTTPS/ link changes
   - Ensure name used with Regmap is unique in syscon

  Bug Fixes
   - Properly 'put' clock on unbind and error in arizona-core
   - Fix revision handling in da9063
   - Fix 'assignment of read-only location' error in kempld-core
   - Avoid using the Regmap API when atomic in rn5t618
   - Redefine volatile register description in rn5t618
   - Use locking to protect event handler in dln2"

* tag 'mfd-next-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (76 commits)
  mfd: syscon: Use a unique name with regmap_config
  mfd: Replace HTTP links with HTTPS ones
  mfd: dln2: Run event handler loop under spinlock
  mfd: madera: Improve handling of regulator unbinding
  mfd: mfd-core: Add mechanism for removal of a subset of children
  mfd: intel_soc_pmic_mrfld: Simplify the return expression of intel_scu_ipc_dev_iowrite8()
  mfd: max14577: Remove redundant initialization of variable current_bits
  mfd: rn5t618: Fix caching of battery related registers
  mfd: max77693-private: Drop a duplicated word
  mfd: da9055: pdata.h: Drop a duplicated word
  mfd: rn5t618: Make restart handler atomic safe
  mfd: kempld-core: Fix 'assignment of read-only location' error
  mfd: axp20x: Allow the AXP803 to be probed by I2C
  mfd: da9063: Add support for latest DA silicon revision
  mfd: da9063: Fix revision handling to correctly select reg tables
  dt-bindings: mfd: st,stmfx: Remove I2C unit name
  dt-bindings: mfd: ti,j721e-system-controller.yaml: Add J721e system controller
  mfd: motorola-cpcap: Disable interrupt for suspend
  mfd: smsc-ece1099: Remove driver
  mfd: core: Add OF_MFD_CELL_REG() helper
  ...
2020-08-15 08:09:38 -07:00
Rafael J. Wysocki
f3db6de55e Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: intel_pstate: Implement passive mode with HWP enabled
2020-08-14 19:39:35 +02:00
Rob Herring
5f0b06da5c dt-bindings: Remove more cases of 'allOf' containing a '$ref'
Another wack-a-mole pass of killing off unnecessary 'allOf + $ref'
usage.

json-schema versions draft7 and earlier have a weird behavior in that
any keywords combined with a '$ref' are ignored (silently). The correct
form was to put a '$ref' under an 'allOf'. This behavior is now changed
in the 2019-09 json-schema spec and '$ref' can be mixed with other
keywords. The json-schema library doesn't yet support this, but the
tooling now does a fixup for this and either way works.

This has been a constant source of review comments, so let's change this
treewide so everyone copies the simpler syntax.

Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-14 09:28:52 -06:00
Rob Herring
f516fb704d dt-bindings: Whitespace clean-ups in schema files
Clean-up incorrect indentation, extra spaces, long lines, and missing
EOF newline in schema files. Most of the clean-ups are for list
indentation which should always be 2 spaces more than the preceding
keyword.

Found with yamllint (which I plan to integrate into the checks).

Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-clk@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-spi@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: linux-remoteproc@vger.kernel.org
Cc: linux-hwmon@vger.kernel.org
Cc: linux-i2c@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: linux-iio@vger.kernel.org
Cc: linux-input@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: alsa-devel@alsa-project.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Cc: netdev@vger.kernel.org
Cc: linux-rtc@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-14 08:55:58 -06:00
Linus Torvalds
a1d21081a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Some merge window fallout, some longer term fixes:

   1) Handle headroom properly in lapbether and x25_asy drivers, from
      Xie He.

   2) Fetch MAC address from correct r8152 device node, from Thierry
      Reding.

   3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg,
      from Rouven Czerwinski.

   4) Correct fdputs in socket layer, from Miaohe Lin.

   5) Revert troublesome sockptr_t optimization, from Christoph Hellwig.

   6) Fix TCP TFO key reading on big endian, from Jason Baron.

   7) Missing CAP_NET_RAW check in nfc, from Qingyu Li.

   8) Fix inet fastreuse optimization with tproxy sockets, from Tim
      Froidcoeur.

   9) Fix 64-bit divide in new SFC driver, from Edward Cree.

  10) Add a tracepoint for prandom_u32 so that we can more easily
      perform usage analysis. From Eric Dumazet.

  11) Fix rwlock imbalance in AF_PACKET, from John Ogness"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  net: openvswitch: introduce common code for flushing flows
  af_packet: TPACKET_V3: fix fill status rwlock imbalance
  random32: add a tracepoint for prandom_u32()
  Revert "ipv4: tunnel: fix compilation on ARCH=um"
  net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus
  net: ethernet: stmmac: Disable hardware multicast filter
  net: stmmac: dwmac1000: provide multicast filter fallback
  ipv4: tunnel: fix compilation on ARCH=um
  vsock: fix potential null pointer dereference in vsock_poll()
  sfc: fix ef100 design-param checking
  net: initialize fastreuse on inet_inherit_port
  net: refactor bind_bucket fastreuse into helper
  net: phy: marvell10g: fix null pointer dereference
  net: Fix potential memory leak in proto_register()
  net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
  ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc()
  net/nfc/rawsock.c: add CAP_NET_RAW check.
  hinic: fix strncpy output truncated compile warnings
  drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check
  net/tls: Fix kmap usage
  ...
2020-08-13 20:03:11 -07:00
Linus Torvalds
e764a1e323 Merge branch 'i2c/for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:

 - bus recovery can now be given a pinctrl handle and the I2C core will
   do all the steps to switch to/from GPIO which can save quite some
   boilerplate code from drivers

 - "fallthrough" conversion

 - driver updates, mostly ID additions

* 'i2c/for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (32 commits)
  i2c: iproc: fix race between client unreg and isr
  i2c: eg20t: use generic power management
  i2c: eg20t: Drop PCI wakeup calls from .suspend/.resume
  i2c: mediatek: Fix i2c_spec_values description
  i2c: mediatek: Add i2c compatible for MediaTek MT8192
  dt-bindings: i2c: update bindings for MT8192 SoC
  i2c: mediatek: Add access to more than 8GB dram in i2c driver
  i2c: mediatek: Add apdma sync in i2c driver
  i2c: i801: Add support for Intel Tiger Lake PCH-H
  i2c: i801: Add support for Intel Emmitsburg PCH
  i2c: bcm2835: Replace HTTP links with HTTPS ones
  Documentation: i2c: dev: 'block process call' is supported
  i2c: at91: Move to generic GPIO bus recovery
  i2c: core: treat EPROBE_DEFER when acquiring SCL/SDA GPIOs
  i2c: core: add generic I2C GPIO recovery
  dt-bindings: i2c: add generic properties for GPIO bus recovery
  i2c: rcar: avoid race when unregistering slave
  i2c: tegra: Avoid tegra_i2c_init_dma() for Tegra210 vi i2c
  i2c: tegra: Fix runtime resume to re-init VI I2C
  i2c: tegra: Fix the error path in tegra_i2c_runtime_resume
  ...
2020-08-13 18:41:00 -07:00
Linus Torvalds
dddcbc139e Merge tag 'docs-5.9-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
 "A handful of obvious fixes that wandered in during the merge window"

* tag 'docs-5.9-2' of git://git.lwn.net/linux:
  Documentation/locking/locktypes: fix the typo
  doc/zh_CN: resolve undefined label warning in admin-guide index
  doc/zh_CN: fix title heading markup in admin-guide cpu-load
  docs: remove the 2.6 "Upgrading I2C Drivers" guide
  docs: Correct the release date of 5.2 stable
  mailmap: Update comments for with format and more detalis
  docs: cdrom: Fix a typo and rst markup
  Doc: admin-guide: use correct legends in kernel-parameters.txt
  Documentation/features: refresh RISC-V arch support files
  documentation: coccinelle: Improve command example for make C={1,2}
  Core-api: Documentation: Replace deprecated :c:func: Usage
  Dev-tools: Documentation: Replace deprecated :c:func: Usage
  Filesystems: Documentation: Replace deprecated :c:func: Usage
  docs: trace: fix a typo
2020-08-13 13:57:45 -07:00
Huang Shijie
1edcd4675e Documentation/locking/locktypes: fix the typo
We have three categories locks, not two.

Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200813060220.18199-1-sjhuang@iluvatar.ai
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-08-13 14:47:38 -06:00
Alexander A. Klimov
4f4ed4543e mfd: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-08-13 07:50:59 +01:00
Fabio Estevam
a3f673d009 dt-bindings: mfd: st,stmfx: Remove I2C unit name
Remove the I2C unit name to fix the following build warning with
'make dt_binding_check':

Warning (unit_address_vs_reg): /example-0/i2c@0: node has a unit name, but no reg or ranges property

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-08-13 07:49:46 +01:00
Roger Quadros
e9faaf056d dt-bindings: mfd: ti,j721e-system-controller.yaml: Add J721e system controller
Add DT binding schema for J721e system controller.

Signed-off-by: Roger Quadros <rogerq@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-08-13 07:49:44 +01:00
Michael Walle
7d2594cd1f mfd: smsc-ece1099: Remove driver
This MFD driver has no user. The keypad driver of this device never made
it into the kernel. Therefore, this driver is useless. Remove it.

Signed-off-by: Michael Walle <michael@walle.cc>
Cc: Sourav Poddar <sourav.poddar@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-08-13 07:49:40 +01:00
Linus Torvalds
dc06fe51d2 Merge tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
 "Not much this cycle - mostly non urgent driver fixes:

   - ds1374: use watchdog core

   - pcf2127: add alarm and pcf2129 support"

* tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: pcf2127: fix alarm handling
  rtc: pcf2127: add alarm support
  rtc: pcf2127: add pca2129 device id
  rtc: max77686: Fix wake-ups for max77620
  rtc: ds1307: provide an indication that the watchdog has fired
  rtc: ds1374: remove unused define
  rtc: ds1374: fix RTC_DRV_DS1374_WDT dependencies
  rtc: cleanup obsolete comment about struct rtc_class_ops
  rtc: pl031: fix set_alarm by adding back call to alarm_irq_enable
  rtc: ds1374: wdt: Use watchdog core for watchdog part
  rtc: Replace HTTP links with HTTPS ones
  rtc: goldfish: Enable interrupt in set_alarm() when necessary
  rtc: max77686: Do not allow interrupt to fire before system resume
  rtc: imxdi: fix trivial typos
  rtc: cpcap: fix range
2020-08-12 17:17:00 -07:00
Linus Torvalds
8cd84b7096 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
 "PPC:
   - Improvements and bugfixes for secure VM support, giving reduced
     startup time and memory hotplug support.

   - Locking fixes in nested KVM code

   - Increase number of guests supported by HV KVM to 4094

   - Preliminary POWER10 support

  ARM:
   - Split the VHE and nVHE hypervisor code bases, build the EL2 code
     separately, allowing for the VHE code to now be built with
     instrumentation

   - Level-based TLB invalidation support

   - Restructure of the vcpu register storage to accomodate the NV code

   - Pointer Authentication available for guests on nVHE hosts

   - Simplification of the system register table parsing

   - MMU cleanups and fixes

   - A number of post-32bit cleanups and other fixes

  MIPS:
   - compilation fixes

  x86:
   - bugfixes

   - support for the SERIALIZE instruction"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits)
  KVM: MIPS/VZ: Fix build error caused by 'kvm_run' cleanup
  x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled
  MIPS: KVM: Convert a fallthrough comment to fallthrough
  MIPS: VZ: Only include loongson_regs.h for CPU_LOONGSON64
  x86: Expose SERIALIZE for supported cpuid
  KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled
  KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort()
  KVM: arm64: Don't skip cache maintenance for read-only memslots
  KVM: arm64: Handle data and instruction external aborts the same way
  KVM: arm64: Rename kvm_vcpu_dabt_isextabt()
  KVM: arm: Add trace name for ARM_NISV
  KVM: arm64: Ensure that all nVHE hyp code is in .hyp.text
  KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS
  KVM: arm64: Make nVHE ASLR conditional on RANDOMIZE_BASE
  KVM: PPC: Book3S HV: Rework secure mem slot dropping
  KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up
  KVM: PPC: Book3S HV: Migrate hot plugged memory
  KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs
  KVM: PPC: Book3S HV: Track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: Disable page merging in H_SVM_INIT_START
  ...
2020-08-12 12:25:06 -07:00
Linus Torvalds
05a5b5d8a2 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull more clk updates from Stephen Boyd:
 "Here's some more updates that missed the last pull request because I
  happened to tag the tree at an earlier point in the history of
  clk-next. I must have fat fingered it and checked out an older version
  of clk-next on this second computer I'm using.

  This time it actually includes more code for Qualcomm SoCs, the AT91
  major updates, and some Rockchip SoC clk driver updates as well. I've
  corrected this flow so this shouldn't happen again"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (83 commits)
  clk: bcm2835: Do not use prediv with bcm2711's PLLs
  clk: drop unused function __clk_get_flags
  clk: hsdk: Fix bad dependency on IOMEM
  dt-bindings: clock: Fix YAML schemas for LPASS clocks on SC7180
  clk: mmp: avoid missing prototype warning
  clk: sparx5: Add Sparx5 SoC DPLL clock driver
  dt-bindings: clock: sparx5: Add bindings include file
  clk: qoriq: add LS1021A core pll mux options
  clk: clk-atlas6: fix return value check in atlas6_clk_init()
  clk: tegra: pll: Improve PLLM enable-state detection
  clk: X1000: Add support for calculat REFCLK of USB PHY.
  clk: JZ4780: Reformat the code to align it.
  clk: JZ4780: Add functions for enable and disable USB PHY.
  clk: Ingenic: Add RTC related clocks for Ingenic SoCs.
  dt-bindings: clock: Add tabs to align code.
  dt-bindings: clock: Add RTC related clocks for Ingenic SoCs.
  clk: davinci: Use fallthrough pseudo-keyword
  clk: imx: Use fallthrough pseudo-keyword
  clk: qcom: gcc-sdm660: Fix up gcc_mss_mnoc_bimc_axi_clk
  clk: qcom: gcc-sdm660: Add missing modem reset
  ...
2020-08-12 12:19:49 -07:00
Linus Torvalds
4586039427 Merge tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:

 - f71808e_wdt imporvements

 - dw_wdt improvements

 - mlx-wdt: support new watchdog type with longer timeout period

 - fallthrough pseudo-keyword replacements

 - overall small fixes and improvements

* tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog: (35 commits)
  watchdog: rti-wdt: balance pm runtime enable calls
  watchdog: rti-wdt: attach to running watchdog during probe
  watchdog: add support for adjusting last known HW keepalive time
  watchdog: use __watchdog_ping in startup
  watchdog: softdog: Add options 'soft_reboot_cmd' and 'soft_active_on_boot'
  watchdog: pcwd_usb: remove needless check before usb_free_coherent()
  watchdog: Replace HTTP links with HTTPS ones
  dt-bindings: watchdog: renesas,wdt: Document r8a774e1 support
  watchdog: initialize device before misc_register
  watchdog: booke_wdt: Add common nowayout parameter driver
  watchdog: scx200_wdt: Use fallthrough pseudo-keyword
  watchdog: Use fallthrough pseudo-keyword
  watchdog: f71808e_wdt: do stricter parameter validation
  watchdog: f71808e_wdt: clear watchdog timeout occurred flag
  watchdog: f71808e_wdt: remove use of wrong watchdog_info option
  watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options
  docs: watchdog: codify ident.options as superset of possible status flags
  dt-bindings: watchdog: Add compatible for QCS404, SC7180, SDM845, SM8150
  dt-bindings: watchdog: Convert QCOM watchdog timer bindings to YAML
  watchdog: dw_wdt: Add DebugFS files
  ...
2020-08-12 12:13:44 -07:00
Linus Torvalds
9ad57f6dfc Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
   mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
   memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

 - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
   checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
   exec, kdump, rapidio, panic, kcov, kgdb, ipc).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  mm/gup: remove task_struct pointer for all gup code
  mm: clean up the last pieces of page fault accountings
  mm/xtensa: use general page fault accounting
  mm/x86: use general page fault accounting
  mm/sparc64: use general page fault accounting
  mm/sparc32: use general page fault accounting
  mm/sh: use general page fault accounting
  mm/s390: use general page fault accounting
  mm/riscv: use general page fault accounting
  mm/powerpc: use general page fault accounting
  mm/parisc: use general page fault accounting
  mm/openrisc: use general page fault accounting
  mm/nios2: use general page fault accounting
  mm/nds32: use general page fault accounting
  mm/mips: use general page fault accounting
  mm/microblaze: use general page fault accounting
  mm/m68k: use general page fault accounting
  mm/ia64: use general page fault accounting
  mm/hexagon: use general page fault accounting
  mm/csky: use general page fault accounting
  ...
2020-08-12 11:24:12 -07:00
Lepton Wu
f38c85f1ba coredump: add %f for executable filename
The document reads "%e" should be "executable filename" while actually it
could be changed by things like pr_ctl PR_SET_NAME.  People who uses "%e"
in core_pattern get surprised when they find out they get thread name
instead of executable filename.

This is either a bug of document or a bug of code.  Since the behavior of
"%e" is there for long time, it could bring another surprise for users if
we "fix" the code.

So we just "fix" the document.  And more, for users who really need the
"executable filename" in core_pattern, we introduce a new "%f" for the
real executable filename.  We already have "%E" for executable path in
kernel, so just reuse most of its code for the new added "%f" format.

Signed-off-by: Lepton Wu <ytht.net@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200701031432.2978761-1-ytht.net@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:01 -07:00
Anshuman Khandual
1a5bae25e3 mm/vmstat: add events for THP migration without split
Add following new vmstat events which will help in validating THP
migration without split.  Statistics reported through these new VM events
will help in performance debugging.

1. THP_MIGRATION_SUCCESS
2. THP_MIGRATION_FAILURE
3. THP_MIGRATION_SPLIT

In addition, these new events also update normal page migration statistics
appropriately via PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE.  While here,
this updates current trace event 'mm_migrate_pages' to accommodate now
available THP statistics.

[akpm@linux-foundation.org: s/hpage_nr_pages/thp_nr_pages/]
[ziy@nvidia.com: v2]
  Link: http://lkml.kernel.org/r/C5E3C65C-8253-4638-9D3C-71A61858BB8B@nvidia.com
[anshuman.khandual@arm.com: s/thp_nr_pages/hpage_nr_pages/]
  Link: http://lkml.kernel.org/r/1594287583-16568-1-git-send-email-anshuman.khandual@arm.com

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Link: http://lkml.kernel.org/r/1594080415-27924-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:57 -07:00
Michal Hocko
b1aa7c9377 doc, mm: clarify /proc/<pid>/oom_score value range
The exported value includes oom_score_adj so the range is no [0, 1000] as
described in the previous section but rather [0, 2000].  Mention that fact
explicitly.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Link: http://lkml.kernel.org/r/20200709062603.18480-2-mhocko@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:56 -07:00
Michal Hocko
de3f32e142 doc, mm: sync up oom_score_adj documentation
There are at least two notes in the oom section.  The 3% discount for root
processes is gone since d46078b288 ("mm, oom: remove 3% bonus for
CAP_SYS_ADMIN processes").

Likewise children of the selected oom victim are not sacrificed since
bbbe480297 ("mm, oom: remove 'prefer children over parent' heuristic")

Drop both of them.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Link: http://lkml.kernel.org/r/20200709062603.18480-1-mhocko@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:56 -07:00
Nitin Gupta
facdaa917c mm: proactive compaction
For some applications, we need to allocate almost all memory as hugepages.
However, on a running system, higher-order allocations can fail if the
memory is fragmented.  Linux kernel currently does on-demand compaction as
we request more hugepages, but this style of compaction incurs very high
latency.  Experiments with one-time full memory compaction (followed by
hugepage allocations) show that kernel is able to restore a highly
fragmented memory state to a fairly compacted memory state within <1 sec
for a 32G system.  Such data suggests that a more proactive compaction can
help us allocate a large fraction of memory as hugepages keeping
allocation latencies low.

For a more proactive compaction, the approach taken here is to define a
new sysctl called 'vm.compaction_proactiveness' which dictates bounds for
external fragmentation which kcompactd tries to maintain.

The tunable takes a value in range [0, 100], with a default of 20.

Note that a previous version of this patch [1] was found to introduce too
many tunables (per-order extfrag{low, high}), but this one reduces them to
just one sysctl.  Also, the new tunable is an opaque value instead of
asking for specific bounds of "external fragmentation", which would have
been difficult to estimate.  The internal interpretation of this opaque
value allows for future fine-tuning.

Currently, we use a simple translation from this tunable to [low, high]
"fragmentation score" thresholds (low=100-proactiveness, high=low+10%).
The score for a node is defined as weighted mean of per-zone external
fragmentation.  A zone's present_pages determines its weight.

To periodically check per-node score, we reuse per-node kcompactd threads,
which are woken up every 500 milliseconds to check the same.  If a node's
score exceeds its high threshold (as derived from user-provided
proactiveness value), proactive compaction is started until its score
reaches its low threshold value.  By default, proactiveness is set to 20,
which implies threshold values of low=80 and high=90.

This patch is largely based on ideas from Michal Hocko [2].  See also the
LWN article [3].

Performance data
================

System: x64_64, 1T RAM, 80 CPU threads.
Kernel: 5.6.0-rc3 + this patch

echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/defrag

Before starting the driver, the system was fragmented from a userspace
program that allocates all memory and then for each 2M aligned section,
frees 3/4 of base pages using munmap.  The workload is mainly anonymous
userspace pages, which are easy to move around.  I intentionally avoided
unmovable pages in this test to see how much latency we incur when
hugepage allocations hit direct compaction.

1. Kernel hugepage allocation latencies

With the system in such a fragmented state, a kernel driver then allocates
as many hugepages as possible and measures allocation latency:

(all latency values are in microseconds)

- With vanilla 5.6.0-rc3

  percentile latency
  –––––––––– –––––––
	   5    7894
	  10    9496
	  25   12561
	  30   15295
	  40   18244
	  50   21229
	  60   27556
	  75   30147
	  80   31047
	  90   32859
	  95   33799

Total 2M hugepages allocated = 383859 (749G worth of hugepages out of 762G
total free => 98% of free memory could be allocated as hugepages)

- With 5.6.0-rc3 + this patch, with proactiveness=20

sysctl -w vm.compaction_proactiveness=20

  percentile latency
  –––––––––– –––––––
	   5       2
	  10       2
	  25       3
	  30       3
	  40       3
	  50       4
	  60       4
	  75       4
	  80       4
	  90       5
	  95     429

Total 2M hugepages allocated = 384105 (750G worth of hugepages out of 762G
total free => 98% of free memory could be allocated as hugepages)

2. JAVA heap allocation

In this test, we first fragment memory using the same method as for (1).

Then, we start a Java process with a heap size set to 700G and request the
heap to be allocated with THP hugepages.  We also set THP to madvise to
allow hugepage backing of this heap.

/usr/bin/time
 java -Xms700G -Xmx700G -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch

The above command allocates 700G of Java heap using hugepages.

- With vanilla 5.6.0-rc3

17.39user 1666.48system 27:37.89elapsed

- With 5.6.0-rc3 + this patch, with proactiveness=20

8.35user 194.58system 3:19.62elapsed

Elapsed time remains around 3:15, as proactiveness is further increased.

Note that proactive compaction happens throughout the runtime of these
workloads.  The situation of one-time compaction, sufficient to supply
hugepages for following allocation stream, can probably happen for more
extreme proactiveness values, like 80 or 90.

In the above Java workload, proactiveness is set to 20.  The test starts
with a node's score of 80 or higher, depending on the delay between the
fragmentation step and starting the benchmark, which gives more-or-less
time for the initial round of compaction.  As t he benchmark consumes
hugepages, node's score quickly rises above the high threshold (90) and
proactive compaction starts again, which brings down the score to the low
threshold level (80).  Repeat.

bpftrace also confirms proactive compaction running 20+ times during the
runtime of this Java benchmark.  kcompactd threads consume 100% of one of
the CPUs while it tries to bring a node's score within thresholds.

Backoff behavior
================

Above workloads produce a memory state which is easy to compact.  However,
if memory is filled with unmovable pages, proactive compaction should
essentially back off.  To test this aspect:

- Created a kernel driver that allocates almost all memory as hugepages
  followed by freeing first 3/4 of each hugepage.
- Set proactiveness=40
- Note that proactive_compact_node() is deferred maximum number of times
  with HPAGE_FRAG_CHECK_INTERVAL_MSEC of wait between each check
  (=> ~30 seconds between retries).

[1] https://patchwork.kernel.org/patch/11098289/
[2] https://lore.kernel.org/linux-mm/20161230131412.GI13301@dhcp22.suse.cz/
[3] https://lwn.net/Articles/817905/

Signed-off-by: Nitin Gupta <nigupta@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Reviewed-by: Oleksandr Natalenko <oleksandr@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Nitin Gupta <ngupta@nitingupta.dev>
Cc: Oleksandr Natalenko <oleksandr@redhat.com>
Link: http://lkml.kernel.org/r/20200616204527.19185-1-nigupta@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:56 -07:00
Roman Gushchin
772616b031 mm: memcg/percpu: per-memcg percpu memory statistics
Percpu memory can represent a noticeable chunk of the total memory
consumption, especially on big machines with many CPUs.  Let's track
percpu memory usage for each memcg and display it in memory.stat.

A percpu allocation is usually scattered over multiple pages (and nodes),
and can be significantly smaller than a page.  So let's add a byte-sized
counter on the memcg level: MEMCG_PERCPU_B.  Byte-sized vmstat infra
created for slabs can be perfectly reused for percpu case.

[guro@fb.com: v3]
  Link: http://lkml.kernel.org/r/20200623184515.4132564-4-guro@fb.com

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Dennis Zhou <dennis@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Bixuan Cui <cuibixuan@huawei.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200608230819.832349-4-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:55 -07:00
Liam Beguin
985b30dbe2 rtc: pcf2127: add pca2129 device id
The PCA2129 is the automotive grade version of the PCF2129.
add it to the list of compatibles.

Signed-off-by: Liam Beguin <lvb@xiphos.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Bruno Thomsen <bruno.thomsen@gmail.com>
Link: https://lore.kernel.org/r/20200630024211.12782-2-liambeguin@gmail.com
2020-08-12 11:37:37 +02:00