Add support for detecting a hung Tx task by adding the netdev
ndo_tx_timeout function callback. Do not set the watchdog_timeo value
so that the system default timeout (currently 5 seconds) is used.
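As a minimal sketch (assuming the single-argument ndo_tx_timeout form used
by kernels of this era and a hypothetical restart worker), the wiring looks
roughly like this:

static void xgbe_tx_timeout(struct net_device *netdev)
{
	struct xgbe_prv_data *pdata = netdev_priv(netdev);

	netdev_warn(netdev, "tx timeout, device restarting\n");
	schedule_work(&pdata->restart_work);	/* assumed restart worker */
}

static const struct net_device_ops xgbe_netdev_ops = {
	.ndo_tx_timeout	= xgbe_tx_timeout,
	/* ... other callbacks ... */
};

Leaving netdev->watchdog_timeo at zero lets the stack apply its default
timeout before invoking the callback.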
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently a call to configure the Rx mode (promiscuous mode, all
multicast mode, etc.) is made in xgbe_start, separately from the xgbe_init
function. This call to set the Rx mode should be part of the xgbe_init
function so that calls to the init function don't have to be preceded
with calls to configure the Rx mode.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the device must be down in order to update the rx-frames
coalescing setting because the interrupt indicator is set in the
descriptor data during initialization. Allow this setting to be changed
while the device is up by moving the interrupt decision into the
descriptor reset function and basing it on the supplied descriptor
index value.
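A sketch of the approach, with illustrative structure and field names
rather than the driver's exact code:

static void rx_desc_reset(struct ring *ring, unsigned int index)
{
	struct ring_desc *rdesc = &ring->rdesc[index];

	/* ... reset the buffer address and length fields ... */

	/* Request an interrupt only on every rx_frames-th descriptor, so a
	 * new rx-frames setting takes effect without requiring the device
	 * to be brought down and reinitialized. */
	if (!ring->rx_frames ||
	    ((index % ring->rx_frames) == (ring->rx_frames - 1)))
		rdesc->ctrl |= DESC_INT_ON_COMPLETION;
	else
		rdesc->ctrl &= ~DESC_INT_ON_COMPLETION;
}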
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the napi_alloc_skb function to allocate an skb when running within
the softirq context to avoid calls to local_irq_save/restore.
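A sketch of the change in the Rx buffer path; the length variable and the
call being replaced are illustrative:

	/* was: skb = netdev_alloc_skb_ip_align(netdev, len); */
	skb = napi_alloc_skb(napi, len);
	if (!skb)
		return NULL;

napi_alloc_skb() may only be used from the napi poll (softirq) context,
which is why it can skip the local_irq_save/restore that the
netdev_alloc_skb variants need.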
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Rx coalescing value is internally converted from usecs to a value
that the hardware can use. When reporting the Rx coalescing value, this
internal value is converted back to usecs. During the conversion from
and back to usecs, some rounding occurs. So, for example, an Rx
coalescing setting of 30 usecs will be reported as 29. Fix this reporting
issue by keeping the original usec value and using it when reporting.
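A sketch of the fix with hypothetical field and helper names: the converted
hardware value is still programmed, but the user-supplied usecs are stored
and reported as-is.

	/* set_coalesce path */
	pdata->rx_riwt = usec_to_riwt(pdata, ec->rx_coalesce_usecs); /* hardware value */
	pdata->rx_usecs = ec->rx_coalesce_usecs;                     /* kept for reporting */

	/* get_coalesce path */
	ec->rx_coalesce_usecs = pdata->rx_usecs;  /* not riwt_to_usec(pdata->rx_riwt) */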
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Tx coalescing support in the driver was a software implementation
for something lacking in the hardware. Using hrtimers, the idea was to
trigger a timer interrupt after having queued a packet for transmit.
Unfortunately, as the timer value was lowered, the timer expired before
the hardware actually performed the transmit, so the scheme was racy and
resulted in unnecessary interrupts.
Remove the Tx coalescing support and hrtimer and replace with a Tx timer
that is used as a reclaim timer in case of inactivity.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware supplies a value that indicates the DMA range that it
is capable of using. Use this value rather than hard-coding it in
the driver.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new lighter weight memory barriers when working with the device
descriptors.
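A sketch of the write-side usage with illustrative descriptor field names:
dma_wmb() orders the descriptor writes against the ownership hand-off
without the cost of a full wmb().

	rdesc->desc0 = cpu_to_le32(lower_32_bits(dma_addr));
	rdesc->desc1 = cpu_to_le32(upper_32_bits(dma_addr));
	rdesc->desc2 = cpu_to_le32(len);

	/* all descriptor fields must be visible to the device before
	 * ownership is handed over */
	dma_wmb();

	rdesc->desc3 = cpu_to_le32(DESC_OWN);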
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible that the hardware may not have been properly shutdown
before this driver gets control, through use by firmware, for example.
Until the driver is loaded, interrupts associated with the hardware
could go pending. When the IRQs are requested, napi support has not
been initialized yet, but the ISR will get control and schedule napi
processing, resulting in a kernel panic because the poll routine has not
been set.
Adjust the code so that the driver is fully ready to handle and process
interrupts as soon as the IRQs are requested. This involves requesting
and freeing IRQs during start and stop processing and ordering the napi
add and delete calls appropriately.
Also adjust the powerup and powerdown routines to match the start and
stop routines with regard to the ordering of tasks, including napi
related calls.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using per channel DMA interrupts the transmit interrupt (TI) and the
receive interrupt (RI) are masked off so as to not generate an interrupt
to the main ISR. However, if another interrupt that is handled by the main
ISR fires for the DMA channel, the TI/RI bits can still be set. This
will cause the wrong, uninitialized napi structure to be used, causing a
panic. Add a check to be sure per channel DMA interrupts are not enabled
before acting on those bit flags.
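A sketch of the guard in the main ISR; the flag, status accessor, and bit
masks are written generically here rather than with the driver's exact
macros.

	dma_ch_isr = read_dma_channel_status(channel);	/* hypothetical accessor */

	/* With per channel interrupts, TI/RI are serviced by the per channel
	 * ISR and its own napi instance; acting on them here would schedule
	 * a napi structure that was never initialized. */
	if (!pdata->per_channel_irq &&
	    (dma_ch_isr & (DMA_CH_SR_TI | DMA_CH_SR_RI))) {
		if (napi_schedule_prep(&pdata->napi)) {
			/* mask the Tx/Rx interrupts, then poll */
			__napi_schedule(&pdata->napi);
		}
	}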
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/vxlan.c
drivers/vhost/net.c
include/linux/if_vlan.h
net/core/dev.c
The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.
In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.
In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endianness fix for VHOST 1.0 in 'net'.
In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.
Signed-off-by: David S. Miller <davem@davemloft.net>
The RSS support requires enablement based on the features reported by
the hardware. The setting of this flag is missing. Add support to
set the RSS enablement flag based on the reported hardware features.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The number of traffic classes reported by the hardware is zero-based
so increment the value returned to get an actual count.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the Tx ring cleanup can run at the same time that data is being
transmitted, a spin lock was used to protect the ring. This patch
eliminates the need for Tx spinlocks by updating the current ring
position only after all ownership bits for data being transmitted have
been set. This will ensure that ring operations in the Tx cleanup path
do not interfere with the ring operations in the Tx transmit path.
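A sketch of the ordering with illustrative names: every descriptor for the
packet is built and handed to the hardware before the ring position that
the cleanup path reads is advanced.

	/* fill the descriptors for the skb and all of its fragments ... */

	/* descriptor contents must be visible before ownership is given
	 * to the hardware */
	dma_wmb();
	first_rdesc->desc3 |= cpu_to_le32(DESC_OWN);

	/* only now publish the new position; Tx cleanup never looks past
	 * ring->cur, so it cannot see a half-built packet */
	smp_wmb();
	ring->cur = cur_index + 1;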
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the Rx descriptor ring processing similar to the Tx descriptor
ring processing. Remove the realloc_index and realloc_threshold
variables and base everything on the current index counter and the
dirty index counter.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When performing a device restart, like during an MTU change, sometimes
the device queues still have data and get hung up trying to flush,
resulting in the device becoming unresponsive until brought down and
back up. To prevent this, always perform a device reset during a
restart.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This set of patches resolves some checks reported by the checkpatch
tool.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same macros are used for Rx as well, so rename them.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/amd/xgbe/xgbe-desc.c
drivers/net/ethernet/renesas/sh_eth.c
Overlapping changes in both conflict cases.
Signed-off-by: David S. Miller <davem@davemloft.net>
The disable_irq_nosync function, not the disable_irq function, must be
used to disable the DMA channel interrupt from within the interrupt
service routine. Change the disable_irq call to disable_irq_nosync.
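A sketch of the per channel ISR showing why the _nosync variant is
required; it assumes the channel IRQ is re-enabled from the napi poll
routine once processing completes.

static irqreturn_t xgbe_dma_isr(int irq, void *data)
{
	struct xgbe_channel *channel = data;

	if (napi_schedule_prep(&channel->napi)) {
		/* disable_irq() waits for the handler to finish, so calling
		 * it from the handler itself would deadlock; the _nosync
		 * variant only masks the line */
		disable_irq_nosync(channel->dma_irq);
		__napi_schedule(&channel->napi);
	}

	return IRQ_HANDLED;
}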
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When performing Tx cleanup, the dirty index counter is compared to the
current index counter as one of the tests used to determine when to stop
cleanup. The "less than" test will fail when the current index counter
rolls over to zero, causing cleanup to never occur again. Update the test
to a "not equal" to avoid this situation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When requesting an irq, the name passed in must be (part of) allocated
memory. The irq name was a local variable and resulted in random
characters when listing /proc/interrupts. Add a character field to the
xgbe_channel structure to hold the irq name and use that.
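A sketch of the fix; the array size, name format, and surrounding fields
are illustrative:

struct xgbe_channel {
	char dma_irq_name[IFNAMSIZ + 32];	/* storage that outlives request_irq() */
	int dma_irq;
	/* ... */
};

	snprintf(channel->dma_irq_name, sizeof(channel->dma_irq_name),
		 "%s-TxRx-%u", netdev_name(pdata->netdev), i);

	ret = devm_request_irq(pdata->dev, channel->dma_irq, xgbe_dma_isr,
			       0, channel->dma_irq_name, channel);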
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to delay telling the hardware about data that is ready to
be transmitted if the skb->xmit_more flag is set.
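A sketch using the skb->xmit_more flag of this era (later kernels query
netdev_xmit_more() instead); helper and field names are illustrative.

	/* ring the doorbell (write the tail pointer) only when the stack has
	 * no further packets queued for this ring, or the queue was just
	 * stopped */
	if (!skb->xmit_more || netif_xmit_stopped(txq))
		tx_start_xmit(channel, ring);
	else
		ring->tx.xmit_more = 1;	/* remember to kick on a later packet */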
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call the appropriate BQL functions to track the number of bytes queued
during Tx processing and to track the number of packets and bytes
that have been transmitted during Tx complete processing.
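A sketch of where the BQL hooks would sit; the queue lookup and the
byte/packet counters are illustrative.

	struct netdev_queue *txq = netdev_get_tx_queue(netdev, channel->queue_index);

	/* transmit path, once the packet has been queued to the ring */
	netdev_tx_sent_queue(txq, skb->len);

	/* Tx complete path, after descriptors have been reclaimed */
	netdev_tx_completed_queue(txq, tx_packets, tx_bytes);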
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the Tx and Rx related fields within the xgbe_ring_data struct into
their own structs in order to more easily see what fields are used for
each operation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a read memory barrier to the Tx and Rx paths where the ownership
bit is checked to be sure that all descriptor fields are read after
having read the ownership bit for the descriptor.
This has not been an issue to date, but it is a good safeguard to have.
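A sketch of the read-side barrier with illustrative descriptor field names:

	if (le32_to_cpu(rdesc->desc3) & DESC_OWN)
		return 1;	/* descriptor still owned by the hardware */

	/* read the remaining descriptor fields only after the ownership
	 * bit has been checked */
	dma_rmb();

	len = le32_to_cpu(rdesc->desc2) & DESC_LEN_MASK;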
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the skb allocation fails during receive processing, the driver would
continue reading descriptors without first determining if there were
any more descriptors for the current packet. Update the code to check
whether more descriptors are associated with the current packet or
whether to move on to the next descriptor as a new packet.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The channel structure is freed before freeing the per channel
interrupts resulting in a kernel oops. Move the call to free
the channel structure to after the freeing of the per channel
interrupts.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the spelling of the word "descriptor" in a couple
of locations.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides support for receive side scaling (RSS). RSS allows
for spreading incoming network packets across the Rx queues. When used
in conjunction with the per DMA channel interrupt support, this allows
the receive processing to be spread across multiple processors.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides support for interrupts that are generated by the
Tx/Rx DMA channel pairs of the device. This allows for Tx and Rx
processing to run across multiple processors.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide support for splitting IP packets so that the header and
payload can be sent to different DMA addresses. This will allow
the IP header to be put into the linear part of the skb while the
payload can be added as frags.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use page allocations for Rx buffers instead of pre-allocating skbs
of a set size.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pre_xmit function name implies that it performs operations prior
to transmitting the packet when in fact it is responsible for setting
up the descriptors and initiating the transmit. Rename this function
from pre_xmit to dev_xmit, which is consistent with the name used during
receive processing - dev_read.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the channel and ring tracking structures allocation to device
open. This will allow for future support to vary the number of Tx/Rx
queues without unloading the module.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the amd-xgbe driver increments the packets processed counter
each time a descriptor is processed. Since a packet can be represented
by more than one descriptor, incrementing the counter in this way is not
appropriate. Also, since multiple descriptors cause the budget check
to be short-circuited, sometimes the returned value from the poll
function would be larger than the budget value, resulting in a WARN_ONCE
being triggered.
Update the polling logic to properly account for the number of packets
processed and exit when the budget value is reached.
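A sketch of the adjusted accounting with hypothetical helpers: the counter
advances once per completed packet and the loop stops at the budget, so
the return value can never exceed it.

static int xgbe_poll(struct napi_struct *napi, int budget)
{
	struct xgbe_channel *channel =
		container_of(napi, struct xgbe_channel, napi);
	int processed = 0;

	while (processed < budget) {
		/* hypothetical helper: consumes all descriptors belonging
		 * to one packet, returns false when no packet is ready */
		if (!xgbe_rx_one_packet(channel))
			break;
		processed++;		/* count packets, not descriptors */
	}

	if (processed < budget) {
		napi_complete(napi);
		/* re-enable the channel interrupts */
	}

	return processed;
}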
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_set_features callback function was improperly using an unsigned
int to save the current feature value for features such as NETIF_F_RXCSUM.
Since that feature is in the upper 32 bits of a 64 bit variable the
result was always 0 making it not possible to actually turn off the
hardware RX checksum support. Change the unsigned int type to the
netdev_features_t type in order to properly capture the current value
and perform the proper operation.
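A sketch of the type fix with an abbreviated callback body; the
private-data field names are illustrative. In kernels of this era
NETIF_F_RXCSUM sits above bit 31, so masking it into a 32-bit local
always yields zero.

static int xgbe_set_features(struct net_device *netdev,
			     netdev_features_t features)
{
	struct xgbe_prv_data *pdata = netdev_priv(netdev);
	netdev_features_t rxcsum;	/* was: unsigned int rxcsum; */

	rxcsum = pdata->netdev_features & NETIF_F_RXCSUM;

	if ((features & NETIF_F_RXCSUM) && !rxcsum) {
		/* enable hardware Rx checksum offload */
	} else if (!(features & NETIF_F_RXCSUM) && rxcsum) {
		/* disable hardware Rx checksum offload */
	}

	pdata->netdev_features = features;

	return 0;
}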
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains fixes identified by checkpatch when run with the
strict option.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flushing of the Tx hardware queues is only supported at a certain
level of the hardware. Retrieve the current version of the hardware
and use that to determine if flushing is supported.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The header linux/phy.h was included twice, so delete one of them.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A change added to the mdiobus/phy api added a module_get/module_put
during phy connect/disconnect processing. Currently, the driver
performs a phy connect during module probe and a phy disconnect during
module remove. With the addition of the module_get during phy connect
the amd-xgbe module use count is incremented and the module can no
longer be unloaded.
Move the phy connect/disconnect from the driver probe/remove functions
to the net_device_ops ndo_open/ndo_stop functions. This allows the
module use count to be decremented when the device(s) are brought down
and allows the module to be unloaded.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for traffic classes as well as support
for Data Center Bridging interfaces related to traffic classes
and priority flow control.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for Tx and Rx hardware timestamping.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides some general performance enhancements for the
driver:
- Modify the default coalescing settings (reduce usec, increase frames)
- Change the AXI burst length to 256 bytes (the default was 16 bytes,
which is smaller than a cache line)
- Change the AXI cache settings to write-back/write-allocate, which
allocates cache entries for received packets during the DMA since the
packets will be processed soon afterwards
- Combine ioread/iowrite when disabling both the Tx and Rx interrupts
- Change to processing the Tx/Rx channels in pairs
- Only recycle the Rx descriptors when a threshold of dirty descriptors
is reached
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the napi context is added using netif_napi_add each time
the ndo_open operation is called. However, there is not a
corresponding netif_napi_del call during the ndo_stop operation. If
the device ndo_open operation was called more than once, an infinite
loop occurs during module unload. Add a call to netif_napi_del during
the ndo_stop operation.
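A sketch of the pairing with hypothetical wrapper names, using the
four-argument netif_napi_add form of this era, so repeated open/close
cycles do not leave stale entries on the napi list:

/* called from ndo_open processing */
static void xgbe_napi_enable(struct xgbe_prv_data *pdata, unsigned int add)
{
	if (add)
		netif_napi_add(pdata->netdev, &pdata->napi, xgbe_poll,
			       NAPI_POLL_WEIGHT);

	napi_enable(&pdata->napi);
}

/* called from ndo_stop processing */
static void xgbe_napi_disable(struct xgbe_prv_data *pdata, unsigned int del)
{
	napi_disable(&pdata->napi);

	if (del)
		netif_napi_del(&pdata->napi);
}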
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the driver makes use of the additional MAC address
registers in the hardware to provide perfect filtering. The
hardware can also have a set of hash table registers that can
be used for imperfect filtering. By using imperfect filtering,
the additional MAC address registers can be used for layer 2
filtering support. Use the hash table registers if the device
has them.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for (imperfect) filtering of
VLAN tag IDs using a 16-bit filter hash table. When
VLANs are added, a 4-bit hash is calculated with the
result indicating the bit in the hash table to set.
This table is used by the hardware to drop packets with
a VLAN id that does not hash to a set bit in the table.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to avoid conflicts with other include files, add
a prefix to the defines in xgbe.h.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>