Commit Graph

784947 Commits

Author SHA1 Message Date
David S. Miller
9000a457a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree:

1) Support for matching on ipsec policy already set in the route, from
   Florian Westphal.

2) Split set destruction into deactivate and destroy phase to make it
   fit better into the transaction infrastructure, also from Florian.
   This includes a patch to warn on imbalance when setting the new
   activate and deactivate interfaces.

3) Release transaction list from the workqueue to remove expensive
   synchronize_rcu() from configuration plane path. This speeds up
   configuration plane quite a bit. From Florian Westphal.

4) Add new xfrm/ipsec extension, this new extension allows you to match
   for ipsec tunnel keys such as source and destination address, spi and
   reqid. From Máté Eckl and Florian Westphal.

5) Add secmark support, this includes connsecmark too, patches
   from Christian Gottsche.

6) Allow to specify remaining bytes in xt_quota, from Chenbo Feng.
   One follow up patch to calm a clang warning for this one, from
   Nathan Chancellor.

7) Flush conntrack entries based on layer 3 family, from Kristian Evensen.

8) New revision for cgroups2 to shrink the path field.

9) Get rid of obsolete need_conntrack(), as a result from recent
   demodularization works.

10) Use WARN_ON instead of BUG_ON, from Florian Westphal.

11) Unused exported symbol in nf_nat_ipv4_fn(), from Florian.

12) Remove superfluous check for timeout netlink parser and dump
    functions in layer 4 conntrack helpers.

13) Unnecessary redundant rcu read side locks in NAT redirect,
    from Taehee Yoo.

14) Pass nf_hook_state structure to error handlers, patch from
    Florian Westphal.

15) Remove ->new() interface from layer 4 protocol trackers. Place
    them in the ->packet() interface. From Florian.

16) Place conntrack ->error() handling in the ->packet() interface.
    Patches from Florian Westphal.

17) Remove unused parameter in the pernet initialization path,
    also from Florian.

18) Remove additional parameter to specify layer 3 protocol when
    looking up for protocol tracker. From Florian.

19) Shrink array of layer 4 protocol trackers, from Florian.

20) Check for linear skb only once from the ALG NAT mangling
    codebase, from Taehee Yoo.

21) Use rhashtable_walk_enter() instead of deprecated
    rhashtable_walk_init(), also from Taehee.

22) No need to flush all conntracks when only one single address
    is gone, from Tan Hu.

23) Remove redundant check for NAT flags in flowtable code, from
    Taehee Yoo.

24) Use rhashtable_lookup() instead of rhashtable_lookup_fast()
    from netfilter codebase, since rcu read lock side is already
    assumed in this path.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 21:28:55 -07:00
Nathan Chancellor
ffa0a9a590 netfilter: xt_quota: Don't use aligned attribute in sizeof
Clang warns:

net/netfilter/xt_quota.c:47:44: warning: 'aligned' attribute ignored
when parsing type [-Wignored-attributes]
        BUILD_BUG_ON(sizeof(atomic64_t) != sizeof(__aligned_u64));
                                                  ^~~~~~~~~~~~~

Use 'sizeof(__u64)' instead, as the alignment doesn't affect the size
of the type.

Fixes: e9837e55b0 ("netfilter: xt_quota: fix the behavior of xt_quota module")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-09 00:19:25 +02:00
Ioana Ciocoi Radulescu
68049a5f4d dpaa2-eth: Don't account Tx confirmation frames on NAPI poll
Until now, both Rx and Tx confirmation frames handled during
NAPI poll were counted toward the NAPI budget. However, Tx
confirmations are lighter to process than Rx frames, which can
skew the amount of work actually done inside one NAPI cycle.

Update the code to only count Rx frames toward the NAPI budget
and set a separate threshold on how many Tx conf frames can be
processed in one poll cycle.

The NAPI poll routine stops when either the budget is consumed
by Rx frames or when Tx confirmation frames reach this threshold.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:05:05 -07:00
YueHaibing
9e19dabc05 net: mscc: ocelot: remove set but not used variable 'phy_mode'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/mscc/ocelot_board.c: In function 'mscc_ocelot_probe':
drivers/net/ethernet/mscc/ocelot_board.c:262:17: warning:
 variable 'phy_mode' set but not used [-Wunused-but-set-variable]
   enum phy_mode phy_mode;

It never used since introduction in
commit 71e32a20cf ("net: mscc: ocelot: make use of SerDes PHYs for handling their configuration")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:03:42 -07:00
David S. Miller
ee9615be25 Merge branch 'more-pmtu-selftests'
Sabrina Dubroca says:

====================
selftests: add more PMTU tests

The current selftests for PMTU cover VTI tunnels, but there's nothing
about the generation and handling of PMTU exceptions by intermediate
routers. This series adds and improves existing helpers, then adds
IPv4 and IPv6 selftests with a setup involving an intermediate router.

Joint work with Stefano Brivio.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:24 -07:00
Sabrina Dubroca
e44e428f59 selftests: pmtu: add basic IPv4 and IPv6 PMTU tests
Commit d1f1b9cbf3 ("selftests: net: Introduce first PMTU test") and
follow-ups introduced some PMTU tests, but they all rely on tunneling,
and, particularly, on VTI.

These new tests use simple routing to exercise the generation and
update of PMTU exceptions in IPv4 and IPv6.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:23 -07:00
Sabrina Dubroca
72ebddd7ff selftests: pmtu: extend MTU parsing helper to locked MTU
The mtu_parse helper introduced in commit f2c929feec ("selftests:
pmtu: Factor out MTU parsing helper") can only handle "mtu 1234", but
not "mtu lock 1234". Extend it, so that we can do IPv4 tests with PMTU
smaller than net.ipv4.route.min_pmtu

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:23 -07:00
Stefano Brivio
1e0a720779 selftests: pmtu: Introduce check_pmtu_value()
Introduce and use a function that checks PMTU values against
expected values and logs error messages, to remove some clutter.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:23 -07:00
Gustavo A. R. Silva
062f97a314 isdn/gigaset/isocdata: mark expected switch fall-through
Notice that in this particular case, I replaced the
"--v-- fall through --v--" comment with a proper
"fall through", which is what GCC is expecting to
find.

This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:54:14 -07:00
David S. Miller
cd7f7df6ca Merge branch 'rtnetlink-Add-support-for-rigid-checking-of-data-in-dump-request'
David Ahern says:

====================
rtnetlink: Add support for rigid checking of data in dump request

There are many use cases where a user wants to influence what is
returned in a dump for some rtnetlink command: one is wanting data
for a different namespace than the one the request is received and
another is limiting the amount of data returned in the dump to a
specific set of interest to userspace, reducing the cpu overhead of
both kernel and userspace. Unfortunately, the kernel has historically
not been strict with checking for the proper header or checking the
values passed in the header. This lenient implementation has allowed
iproute2 and other packages to pass any struct or data in the dump
request as long as the family is the first byte. For example, ifinfomsg
struct is used by iproute2 for all generic dump requests - links,
addresses, routes and rules when it is really only valid for link
requests.

There is 1 is example where the kernel deals with the wrong struct: link
dumps after VF support was added. Older iproute2 was sending rtgenmsg as
the header instead of ifinfomsg so a patch was added to try and detect
old userspace vs new:
e5eca6d41f ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")

The latest example is Christian's patch set wanting to return addresses for
a target namespace. It guesses the header struct is an ifaddrmsg and if it
guesses wrong a netlink warning is generated in the kernel log on every
address dump which is unacceptable.

Another example where the kernel is a bit lenient is route dumps: iproute2
can send either a request with either ifinfomsg or a rtmsg as the header
struct, yet the kernel always treats the header as an rtmsg (see
inet_dump_fib and rtm_flags check). The header inconsistency impacts the
ability to add kernel side filters for route dumps - a necessary feature
for scale setups with 100k+ routes.

How to resolve the problem of not breaking old userspace yet be able to
move forward with new features such as kernel side filtering which are
crucial for efficient operation at high scale?

This patch set addresses the problem by adding a new socket flag,
NETLINK_DUMP_STRICT_CHK, that userspace can use with setsockopt to
request strict checking of headers and attributes on dump requests and
hence unlock the ability to use kernel side filters as they are added.

Kernel side, the dump handlers are updated to verify the message contains
at least the expected header struct:
    RTM_GETLINK:       ifinfomsg
    RTM_GETADDR:       ifaddrmsg
    RTM_GETMULTICAST:  ifaddrmsg
    RTM_GETANYCAST:    ifaddrmsg
    RTM_GETADDRLABEL:  ifaddrlblmsg
    RTM_GETROUTE:      rtmsg
    RTM_GETSTATS:      if_stats_msg
    RTM_GETNEIGH:      ndmsg
    RTM_GETNEIGHTBL:   ndtmsg
    RTM_GETNSID:       rtgenmsg
    RTM_GETRULE:       fib_rule_hdr
    RTM_GETNETCONF:    netconfmsg
    RTM_GETMDB:        br_port_msg

And then every field in the header struct should be 0 with the exception
of the family. There are a few exceptions to this rule where the kernel
already influences the data returned by values in the struct. Next the
message should not contain attributes unless the kernel implements
filtering for it. Any unexpected data causes the dump to fail with EINVAL.
If the new flag is honored by the kernel and the dump contents adjusted
by any data passed in the request, the dump handler can set the
NLM_F_DUMP_FILTERED flag in the netlink message header.

For old userspace on new kernel there is no impact as all checks are
wrapped in a check on the new strict flag. For new userspace on old
kernel, the data in the headers and any appended attributes are
silently ignored though the setsockopt failing is the clue to userspace
the feature is not supported. New userspace on new kernel gets the
requested data dump.

iproute2 patches can be found here:
    https://github.com/dsahern/iproute2 dump-enhancements

Major changes since v1
- inner header is supposed to be 4-bytes aligned. So for dumps that
  should not have attributes appended changed the check to use:
        if (nlmsg_attrlen(nlh, sizeof(hdr)))
  Only impacts patches with headers that are not multiples of 4-bytes
  (rtgenmsg, netconfmsg), but applied the change to all patches not
  calling nlmsg_parse for consistency.

- Added nlmsg_parse_strict and nla_parse_strict for tighter control on
  attribute parsing. There should be no unknown attribute types or extra
  bytes.

- Moved validation to a helper in most cases

Changes since rfc-v2
- dropped the NLM_F_DUMP_FILTERED flag from target nsid dumps per
  Jiri's objections
- changed the opt-in uapi from a netlink message flag to a socket
  flag. setsockopt provides an api for userspace to definitively
  know if the kernel supports strict checking on dumps.
- re-ordered patches to peel off the extack on dumps if needed to
  keep this set size within limits
- misc cleanups in patches based on testing
====================

Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:06 -07:00
David Ahern
8c6e137fbc rtnetlink: Update rtnl_fdb_dump for strict data checking
Update rtnl_fdb_dump for strict data checking. If the flag is set,
the dump request is expected to have an ndmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the NDA_IFINDEX and
NDA_MASTER attributes are supported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
8dfbda19a2 rtnetlink: Move input checking for rtnl_fdb_dump to helper
Move the existing input checking for rtnl_fdb_dump into a helper,
valid_fdb_dump_legacy. This function will retain the current
logic that works around the 2 headers that userspace has been
allowed to send up to this point.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
c77b93641e net/bridge: Update br_mdb_dump for strict data checking
Update br_mdb_dump for strict data checking. If the flag is set,
the dump request is expected to have a br_port_msg struct as the
header. All elements of the struct are expected to be 0 and no
attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
addd383f5a net: Update netconf dump handlers for strict data checking
Update inet_netconf_dump_devconf, inet6_netconf_dump_devconf, and
mpls_netconf_dump_devconf for strict data checking. If the flag is set,
the dump request is expected to have an netconfmsg struct as the header.
The struct only has the family member and no attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
f2ae64bb6b net/ipv6: Update ip6addrlbl_dump for strict data checking
Update ip6addrlbl_dump for strict data checking. If the flag is set,
the dump request is expected to have an ifaddrlblmsg struct as the
header. All elements of the struct are expected to be 0 and no
attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
4a73e5e56d net/fib_rules: Update fib_nl_dumprule for strict data checking
Update fib_nl_dumprule for strict data checking. If the flag is set,
the dump request is expected to have fib_rule_hdr struct as the header.
All elements of the struct are expected to be 0 and no attributes can
be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
f80f14c364 net/namespace: Update rtnl_net_dumpid for strict data checking
Update rtnl_net_dumpid for strict data checking. If the flag is set,
the dump request is expected to have an rtgenmsg struct as the header
which has the family as the only element. No data may be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
9632d47f6a net/neighbor: Update neightbl_dump_info for strict data checking
Update neightbl_dump_info for strict data checking. If the flag is set,
the dump request is expected to have an ndtmsg struct as the header.
All elements of the struct are expected to be 0 and no attributes can
be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
51183d233b net/neighbor: Update neigh_dump_info for strict data checking
Update neigh_dump_info for strict data checking. If the flag is set,
the dump request is expected to have an ndmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the NDA_IFINDEX and
NDA_MASTER attributes are supported.

Existing code does not fail the dump if nlmsg_parse fails. That behavior
is kept for non-strict checking.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
e8ba330ac0 rtnetlink: Update fib dumps for strict data checking
Add helper to check netlink message for route dumps. If the strict flag
is set the dump request is expected to have an rtmsg struct as the header.
All elements of the struct are expected to be 0 with the exception of
rtm_flags (which is used by both ipv4 and ipv6 dumps) and no attributes
can be appended. rtm_flags can only have RTM_F_CLONED and RTM_F_PREFIX
set.

Update inet_dump_fib, inet6_dump_fib, mpls_dump_routes, ipmr_rtm_dumproute,
and ip6mr_rtm_dumproute to call this helper if strict data checking is
enabled.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
14fc5bb29f rtnetlink: Update ipmr_rtm_dumplink for strict data checking
Update ipmr_rtm_dumplink for strict data checking. If the flag is set,
the dump request is expected to have an ifinfomsg struct as the header.
All elements of the struct are expected to be 0 and no attributes can
be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
786e0007e2 rtnetlink: Update inet6_dump_ifinfo for strict data checking
Update inet6_dump_ifinfo for strict data checking. If the flag is
set, the dump request is expected to have an ifinfomsg struct as
the header. All elements of the struct are expected to be 0 and no
attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
841891ec0c rtnetlink: Update rtnl_stats_dump for strict data checking
Update rtnl_stats_dump for strict data checking. If the flag is set,
the dump request is expected to have an if_stats_msg struct as the header.
All elements of the struct are expected to be 0 except filter_mask which
must be non-0 (legacy behavior). No attributes are supported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
2d011be8c0 rtnetlink: Update rtnl_bridge_getlink for strict data checking
Update rtnl_bridge_getlink for strict data checking. If the flag is set,
the dump request is expected to have an ifinfomsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFLA_EXT_MASK
attribute is supported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
905cf0abe8 rtnetlink: Update rtnl_dump_ifinfo for strict data checking
Update rtnl_dump_ifinfo for strict data checking. If the flag is set,
the dump request is expected to have an ifinfomsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID,
IFLA_EXT_MASK, IFLA_MASTER, and IFLA_LINKINFO attributes are supported.

Existing code does not fail the dump if nlmsg_parse fails. That behavior
is kept for non-strict checking.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
ed6eff1179 net/ipv6: Update inet6_dump_addr for strict data checking
Update inet6_dump_addr for strict data checking. If the flag is set, the
dump request is expected to have an ifaddrmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values suppored by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID
attribute is supported. Follow on patches can add support for other fields
(e.g., honor ifa_index and only return data for the given device index).

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
c33078e3df net/ipv4: Update inet_dump_ifaddr for strict data checking
Update inet_dump_ifaddr for strict data checking. If the flag is set,
the dump request is expected to have an ifaddrmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID
attribute is supported. Follow on patches can support for other fields
(e.g., honor ifa_index and only return data for the given device index).

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
89d35528d1 netlink: Add new socket option to enable strict checking on dumps
Add a new socket option, NETLINK_DUMP_STRICT_CHK, that userspace
can use via setsockopt to request strict checking of headers and
attributes on dump requests.

To get dump features such as kernel side filtering based on data in
the header or attributes appended to the dump request, userspace
must call setsockopt() for NETLINK_DUMP_STRICT_CHK and a non-zero
value. Since the netlink sock and its flags are private to the
af_netlink code, the strict checking flag is passed to dump handlers
via a flag in the netlink_callback struct.

For old userspace on new kernel there is no impact as all of the data
checks in later patches are wrapped in a check on the new strict flag.

For new userspace on old kernel, the setsockopt will fail and even if
new userspace sets data in the headers and appended attributes the
kernel will silently ignore it. Moving forward when the setsockopt
succeeds, the new userspace on old kernel means the dump request can
pass an attribute the kernel does not understand. The dump will then
fail as the older kernel does not understand it.

New userspace on new kernel setting the socket option gets the benefit
of the improved data dump.

Kernel side the NETLINK_DUMP_STRICT_CHK uapi is converted to a generic
NETLINK_F_STRICT_CHK flag which can potentially be leveraged for tighter
checking on the NEW, DEL, and SET commands.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
6ba1e6e856 net/ipv6: Refactor address dump to push inet6_fill_args to in6_dump_addrs
Pull the inet6_fill_args arg up to in6_dump_addrs and move netnsid
into it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
a5f6cba291 netlink: Add strict version of nlmsg_parse and nla_parse
nla_parse is currently lenient on message parsing, allowing type to be 0
or greater than max expected and only logging a message

    "netlink: %d bytes leftover after parsing attributes in process `%s'."

if the netlink message has unknown data at the end after parsing. What this
could mean is that the header at the front of the attributes is actually
wrong and the parsing is shifted from what is expected.

Add a new strict version that actually fails with EINVAL if there are any
bytes remaining after the parsing loop completes, if the atttrbitue type
is 0 or greater than max expected.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
dac9c9790e net: Add extack to nlmsg_parse
Make sure extack is passed to nlmsg_parse where easy to do so.
Most of these are dump handlers and leveraging the extack in
the netlink_callback.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
3d0d4337d7 netlink: Add extack message to nlmsg_parse for invalid header length
Give a user a reason why EINVAL is returned in nlmsg_parse.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:03 -07:00
David Ahern
4a19edb60d netlink: Pass extack to dump handlers
Declare extack in netlink_dump and pass to dump handlers via
netlink_callback. Add any extack message after the dump_done_errno
allowing error messages to be returned. This will be useful when
strict checking is done on dump requests, returning why the dump
fails EINVAL.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:03 -07:00
Gustavo A. R. Silva
272a66173b isdn/gigaset: mark expected switch fall-throughs
Replace "--v-- fall through --v--" with a proper "fall through"
annotation. Also, change "bad cid: fall through" to
"fall through - bad cid".

This fix is part of the ongoing efforts to enabling -Wimplicit-fallthrough

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:35:53 -07:00
David S. Miller
3f2122bbb2 Merge branch 'net-sched-cls_u32-Various-improvements'
Jamal Hadi Salim says:

====================
net: sched: cls_u32 Various improvements

Various improvements from Al.

Changes from version 1: Add missing commit
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:36 -07:00
Al Viro
a030598690 net: sched: cls_u32: simplify the hell out u32_delete() emptiness check
Now that we have the knode count, we can instantly check if
any hnodes are non-empty.  And that kills the check for extra
references to root hnode - those could happen only if there was
a knode to carry such a link.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
b245d32c99 net: sched: cls_u32: keep track of knodes count in tc_u_common
allows to simplify u32_delete() considerably

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
8a8065f683 net: sched: cls_u32: get rid of tp_c
Both hnode ->tp_c and tp_c argument of u32_set_parms()
the latter is redundant, the former - never read...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
db04ff4863 net: sched: cls_u32: the tp_c argument of u32_set_parms() is always tp->data
It must be tc_u_common associated with that tp (i.e. tp->data).
Proof:
	* both ->ht_up and ->tp_c are assign-once
	* ->tp_c of anything inserted into tp_c->hlist is tp_c
	* hnodes never get reinserted into the lists or moved
between those, so anything found by u32_lookup_ht(tp->data, ...)
will have ->tp_c equal to tp->data.
	* tp->root->tp_c == tp->data.
	* ->ht_up of anything inserted into hnode->ht[...] is
equal to hnode.
	* knodes never get reinserted into hash chains or moved
between those, so anything returned by u32_lookup_key(ht, ...)
will have ->ht_up equal to ht.
	* any knode returned by u32_get(tp, ...) will have ->ht_up->tp_c
point to tp->data

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
18512f5c25 net: sched: cls_u32: pass tc_u_common to u32_set_parms() instead of tc_u_hnode
the only thing we used ht for was ht->tp_c and callers can get that
without going through ->tp_c at all; start with lifting that into
the callers, next commits will massage those, eventually removing
->tp_c altogether.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
4895c42f62 net: sched: cls_u32: clean tc_u_common hashtable
* calculate key *once*, not for each hash chain element
* let tc_u_hash() return the pointer to chain head rather than index -
callers are cleaner that way.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
07743ca5c9 net: sched: cls_u32: get rid of tc_u_common ->rcu
unused

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
ec17caf078 net: sched: cls_u32: get rid of tc_u_knode ->tp
not used anymore

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
dc07c57363 net: sched: cls_u32: get rid of unused argument of u32_destroy_key()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
2f0c982df7 net: sched: cls_u32: make sure that divisor is a power of 2
Tested by modifying iproute2 to allow sending a divisor > 255

Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:35 -07:00
Al Viro
27594ec4b6 net: sched: cls_u32: disallow linking to root hnode
Operation makes no sense.  Nothing will actually break if we do so
(depth limit in u32_classify() will prevent infinite loops), but
according to maintainers it's best prohibited outright.

NOTE: doing so guarantees that u32_destroy() will trigger the call
of u32_destroy_hnode(); we might want to make that unconditional.

Test:
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 100 u32 \
link 800: offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff
should fail with
Error: cls_u32: Not linking to root node

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:34 -07:00
Al Viro
b44ef84542 net: sched: cls_u32: mark root hnode explicitly
... and produce consistent error on attempt to delete such.
Existing check in u32_delete() is inconsistent - after

tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1: u32 \
divisor 1
tc filter add dev eth0 parent ffff: protocol ip prio 200 handle 2: u32 \
divisor 1

both

tc filter delete dev eth0 parent ffff: protocol ip prio 100 handle 801: u32

and

tc filter delete dev eth0 parent ffff: protocol ip prio 100 handle 800: u32

will fail (at least with refcounting fixes), but the former will complain
about an attempt to remove a busy table, while the latter will recognize
it as root and yield "Not allowed to delete root node" instead.

The problem with the existing check is that several tcf_proto instances
might share the same tp->data and handle-to-hnode lookup will be the same
for all of them. So comparing an hnode to be deleted with tp->root won't
catch the case when one tp is used to try deleting the root of another.
Solution is trivial - mark the root hnodes explicitly upon allocation and
check for that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:33:34 -07:00
David S. Miller
67db8c3ebe Merge branch 'net-phy-mscc-add-support-for-VSC8584-and-VSC8574-Microsemi-quad-port-PHYs'
Quentin Schulz says:

====================
net: phy: mscc: add support for VSC8584 and VSC8574 Microsemi quad-port PHYs

RESEND: rebased on top of latest net-next and on top of latest version of
"net: phy: mscc: various improvements to Microsemi PHY driver" patch
series.

Both PHYs are 4-port PHY that are 10/100/1000BASE-T, 100BASE-FX, 1000BASE-X
and triple-speed copper SFP capable, can communicate with the MAC via
SGMII, QSGMII or 1000BASE-X, supports downshifting and can set the blinking
pattern of each of its 4 LEDs, supports SyncE as well as HP Auto-MDIX
detection.

VSC8574 supports WOL and VSC8584 supports hardware offloading of MACsec.

This patch series add support for 10/100/1000BASE-T, SGMII/QSGMII link with
the MAC, downshifting, HP Auto-MDIX detection and blinking pattern for
their 4 LEDs.

They have also an internal Intel 8051 microcontroller whose firmware needs
to be patched when the PHY is reset. If the 8051's firmware has the
expected CRC, its patching can be skipped. The microcontroller can be
accessed from any port of the PHY, though the CRC function can only be done
through the PHY that is the base PHY of the package (internal address 0)
due to a limitation of the firmware.

The GPIO register bank is a set of registers that are common to all PHYs in
the package. So any modification in any register of this bank affects all
PHYs of the package.

If the PHYs haven't been reset before booting the Linux kernel and were
configured to use interrupts for e.g. link status updates, it is required
to clear the interrupts mask register of all PHYs before being able to use
interrupts with any PHY. The first PHY of the package that will be init
will take care of clearing all PHYs interrupts mask registers. Thus, we
need to keep track of the init sequence in the package, if it's already
been done or if it's to be done.

Most of the init sequence of a PHY of the package is common to all PHYs in
the package, thus we use the SMI broadcast feature which enables us to
propagate a write in one register of one PHY to all PHYs in the same
package.

We also introduce a new development board called PCB120 which exists in
variants for VSC8584 and VSC8574 (and that's the only difference to the
best of my knowledge).

I suggest patches 1 to 3 go through net tree and patches 4 and 5 go
through MIPS tree. Patches going through net tree and those going through
MIPS tree do not depend on one another.

This patch series depends on this patch series:
(https://lore.kernel.org/lkml/20181008100728.24959-1-quentin.schulz@bootlin.com/)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:31:28 -07:00
Quentin Schulz
00d70d8e0e net: phy: mscc: add support for VSC8574 PHY
The VSC8574 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX,
1000BASE-X and triple-speed copper SFP capable, can communicate with
the MAC via SGMII, QSGMII or 1000BASE-X, supports WOL, downshifting and
can set the blinking pattern of each of its 4 LEDs, supports SyncE as
well as HP Auto-MDIX detection.

This adds support for 10/100/1000BASE-T, SGMII/QSGMII link with the MAC,
WOL, downshifting, HP Auto-MDIX detection and blinking pattern for its 4
LEDs.

The VSC8574 has also an internal Intel 8051 microcontroller whose
firmware needs to be patched when the PHY is reset. If the 8051's
firmware has the expected CRC, its patching can be skipped. The
microcontroller can be accessed from any port of the PHY, though the CRC
function can only be done through the PHY that is the base PHY of the
package (internal address 0) due to a limitation of the firmware.

The GPIO register bank is a set of registers that are common to all PHYs
in the package. So any modification in any register of this bank affects
all PHYs of the package.

If the PHYs haven't been reset before booting the Linux kernel and were
configured to use interrupts for e.g. link status updates, it is
required to clear the interrupts mask register of all PHYs before being
able to use interrupts with any PHY. The first PHY of the package that
will be init will take care of clearing all PHYs interrupts mask
registers. Thus, we need to keep track of the init sequence in the
package, if it's already been done or if it's to be done.

Most of the init sequence of a PHY of the package is common to all PHYs
in the package, thus we use the SMI broadcast feature which enables us
to propagate a write in one register of one PHY to all PHYs in the same
package.

Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:31:28 -07:00
Quentin Schulz
a5afc16780 net: phy: mscc: add support for VSC8584 PHY
The VSC8584 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX,
1000BASE-X and triple-speed copper SFP capable, can communicate with the
MAC via SGMII, QSGMII or 1000BASE-X, supports downshifting and can set
the blinking pattern of each of its 4 LEDs, supports hardware offloading
of MACsec and supports SyncE as well as HP Auto-MDIX detection.

This adds support for 10/100/1000BASE-T, SGMII/QSGMII link with the MAC,
downshifting, HP Auto-MDIX detection and blinking pattern for its 4
LEDs.

The VSC8584 has also an internal Intel 8051 microcontroller whose
firmware needs to be patched when the PHY is reset. If the 8051's
firmware has the expected CRC, its patching can be skipped. The
microcontroller can be accessed from any port of the PHY, though the CRC
function can only be done through the PHY that is the base PHY of the
package (internal address 0) due to a limitation of the firmware.

The GPIO register bank is a set of registers that are common to all PHYs
in the package. So any modification in any register of this bank affects
all PHYs of the package.

If the PHYs haven't been reset before booting the Linux kernel and were
configured to use interrupts for e.g. link status updates, it is
required to clear the interrupts mask register of all PHYs before being
able to use interrupts with any PHY. The first PHY of the package that
will be init will take care of clearing all PHYs interrupts mask
registers. Thus, we need to keep track of the init sequence in the
package, if it's already been done or if it's to be done.

Most of the init sequence of a PHY of the package is common to all PHYs
in the package, thus we use the SMI broadcast feature which enables us
to propagate a write in one register of one PHY to all PHYs in the same
package.

The revA of the VSC8584 PHY (which is not and will not be publicly
released) should NOT patch the firmware of the microcontroller or it'll
make things worse, the easiest way is just to not support it.

Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:31:28 -07:00