linux

Author	SHA1	Message	Date
Omer Shpigelman	36fafe87ed	habanalabs: don't allow hard reset with open processes When the MMU is heavily used by the engines, unmapping might take a lot of time due to a full MMU cache invalidation done as part of the unmap flow. Hence we might not be able to kill all open processes before going to hard reset the device, as it involves unmapping of all user memory. In case of a failure in killing all open processes, we should stop the hard reset flow as it might lead to a kernel crash - one thread (killing of a process) is updating MMU structures that other thread (hard reset) is freeing. Stopping a hard reset flow leaves the device as nonoperational and the user can then initiate a hard reset via sysfs to reinitialize the device. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Oded Gabbay	66446820df	habanalabs: GAUDI does not support soft-reset GAUDI does not support soft-reset as it leaves the NIC ports in an awkward state, where their QMANs were reset but the NIC itself is still working. In addition, there is not much sense in doing soft-reset when training is done on multiple GAUDIs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2020-05-25 08:15:33 +03:00
Omer Shpigelman	d798507988	habanalabs: add print for soft reset due to event Print the event name that caused the soft reset. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Omer Shpigelman	42d0b0b95f	habanalabs: improve MMU cache invalidation code A new sequence is introduced to invalidate the MMU cache in order to avoid timeouts. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Daniel Vetter	ed65bfd9fd	habanalabs: don't set default fence_ops->wait It's the default. Also so much for "we're not going to tell the graphics people how to review their code", dma_fence is a pretty core piece of gpu driver infrastructure. And it's very much uapi relevant, including piles of corresponding userspace protocols and libraries for how to pass these around. Would be great if habanalabs would not use this (from a quick look it's not needed at all), since open source the userspace and playing by the usual rules isn't on the table. If that's not possible (because it's actually using the uapi part of dma_fence to interact with gpu drivers) then we have exactly what everyone promised we'd want to avoid. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-25 08:15:33 +03:00
Bjorn Helgaas	709b41b56a	misc: rtsx: Remove unnecessary rts5249_set_aspm(), rts5260_set_aspm() rts5249_set_aspm() and rts5260_set_aspm() do nothing more than the default rtsx_comm_set_aspm() does, so remove them and use the default. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200521180545.1159896-7-helgaas@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-22 09:38:14 +02:00
Bjorn Helgaas	05ffe36a09	misc: rtsx: Simplify rtsx_comm_set_aspm() Simplify rtsx_comm_set_aspm() and remove the now-unused rtsx_pci_enable_aspm(). rtsx_pci_disable_aspm() is still used by rtsx_pci_init_hw(). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200521180545.1159896-6-helgaas@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-22 09:38:13 +02:00
Bjorn Helgaas	3d1e7aa80d	misc: rtsx: Use pcie_capability_clear_and_set_word() for PCI_EXP_LNKCTL Instead of using the driver-specific rtsx_pci_update_cfg_byte() to update the PCIe Link Control Register, use pcie_capability_clear_and_set_word() like the rest of the kernel does. This makes it easier to maintain ASPM across the PCI core and drivers. Remove the now-unused rtsx_pci_update_cfg_byte() and ASPM_MASK_NEG definitions. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200521180545.1159896-5-helgaas@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-22 09:38:13 +02:00
Bjorn Helgaas	9ae577047e	misc: rtsx: Use ASPM_MASK_NEG instead of hard-coded value Use ASPM_MASK_NEG instead of hard-coded value, as other callers of rtsx_pci_update_cfg_byte() do. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200521180545.1159896-4-helgaas@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-22 09:38:13 +02:00
Bjorn Helgaas	51876e22bf	misc: rtsx: Removed unused dev_aspm_mode The struct rtsx_cr_option.dev_aspm_mode member is never set to anything other than DEV_ASPM_DYNAMIC (0). Remove it and code that tests it. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200521180545.1159896-3-helgaas@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-22 09:38:12 +02:00
Bjorn Helgaas	8786eda9a9	misc: rtsx: Remove unused pcr_ops Remove the following unused function pointers from struct pcr_ops: int (*set_ltr_latency)(struct rtsx_pcr *pcr, u32 latency); int (*set_l1off_sub)(struct rtsx_pcr *pcr, u8 val); void (*full_on)(struct rtsx_pcr *pcr); void (*power_saving)(struct rtsx_pcr *pcr); Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20200521180545.1159896-2-helgaas@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-22 09:38:12 +02:00
Lad Prabhakar	b03025c573	misc: pci_endpoint_test: Add Device ID for RZ/G2E PCIe controller Add Renesas R8A774C0 in pci_device_id table so that pci-epf-test can be used for testing PCIe EP on RZ/G2E. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://lore.kernel.org/r/1589493809-2602-1-git-send-email-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-19 16:46:12 +02:00
John Hubbard	ddae1423bc	genwqe: convert get_user_pages() --> pin_user_pages() This code was using get_user_pages(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Cc: Frank Haverkamp <haver@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Link: https://lore.kernel.org/r/20200518015237.1568940-1-jhubbard@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-19 16:46:12 +02:00
John Hubbard	5459ceedb3	drivers/mic/scif: convert get_user_pages() --> pin_user_pages() This code was using get_user_pages(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. Note that this effectively changes the code's behavior as well: it now ultimately calls set_page_dirty_lock(), instead of SetPageDirty(). This is probably more accurate. As Christoph Hellwig put it, "set_page_dirty() is only safe if we are dealing with a file backed page where we have reference on the inode it hangs off." [3] [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ [3] https://lore.kernel.org/r/20190723153640.GB720@lst.de Signed-off-by: John Hubbard <jhubbard@nvidia.com> Link: https://lore.kernel.org/r/20200518041307.1987328-1-jhubbard@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-19 16:46:12 +02:00
Rachel Stahl	87eaea1cf8	habanalabs: update patched_cb_size for Wreg32 The patch_cb_size is not updated for Wreg32 in its validate function, so updated in goya_validate_cb. Signed-off-by: Rachel Stahl <rstahl@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Ofir Bitton	ebd8d12251	habanalabs: move event handling to common firmware file Instead of writing similar event handling code for each ASIC, move the code to the common firmware file. This code will be used for GAUDI and all future ASICs. In addition, add two new fields to the auto-generated events file: valid and description. This will save the need to manually write the events description in the source code and simplify the code. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	af57cb81a6	habanalabs: enable gaudi code in driver Enable the GAUDI ASIC code in the pci probe callback of the driver so the driver will handle GAUDI ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	79fc7a9fff	habanalabs: add gaudi profiler module Add the GAUDI code to initialize the ASIC's profiler. The profile receives its initialization values from the user, same as in Goya, but the code to initialize is in the driver because the configuration space of the device is not directly exposed to the user. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	3a3a5bf196	habanalabs: add gaudi security module Add the code to initialize the security module of GAUDI. Similar to Goya, we have two dedicated mechanisms for security: Range Registers and Protection bits. Those mechanisms protect sensitive memory and configuration areas inside the device. In addition, in Gaudi we moved to a 3-level security scheme, where the F/W runs with the highest security level (Privileged), the driver runs with a less secured level (Secured) and the user is neither privileged nor secured. The security module in the driver configures the Secured parts so the user won't be able to access them. The Privileged parts are configured by the F/W. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	bcaf415204	habanalabs: add hwmgr module for gaudi The hwmgr module is responsible for messages sent to GAUDI F/W that are not common to all habanalabs ASICs. In GAUDI, we provide the user a simplified mode of controlling the ASIC clock frequency. Instead of three different clocks, we present a single clock property that the user can configure via sysfs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	ac0ae6a96a	habanalabs: add gaudi asic-dependent code Add the ASIC-dependent code for GAUDI. Supply (almost) all of the function callbacks that the driver's common code need to initialize, finalize and submit workloads to the GAUDI ASIC. It also contains the code to initialize the F/W of the GAUDI ASIC and to receive events from the F/W. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	2aad2bf81c	habanalabs: add gaudi asic registers header files Add the relevant GAUDI ASIC registers header files. These files are generated automatically from a tool maintained by the VLSI engineers. There are more files which are not upstreamed because only very few defines from those files are used in the driver. For those files, we copied the relevant defines into gaudi_regs.h and gaudi_masks.h, to reduce the size of this patch. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	fca72fbb66	habanalabs: get card type, location from F/W For Gaudi the driver gets two new additional properties from the F/W: 1. The card's type - PCI or PMC 2. The card's location in the Gaudi's box (relevant only for PMC). The card's location is also passed to the user in the HW IP info structure as it needs this property for establishing communication between Gaudis. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	ca62433f53	habanalabs: support clock gating enable/disable In Gaudi there is a feature of clock gating certain engines. Therefore, add this property to the device structure. In addition, due to a limitation of this feature, the driver needs to dynamically enable or disable this feature during run-time. Therefore, add ASIC interface functions to enable/disable this function from the common code. Moreover, this feature must be turned off when the user wishes to debug the ASIC by reading/writing registers and/or memory through the driver's debugfs. Therefore, add an option to enable/disable clock gating via the debugfs interface. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	803917f960	habanalabs: set PM profile to auto only for goya For Gaudi, the driver doesn't change the PM profile automatically due to device-controlled PM capabilities. Therefore, set the PM profile to auto only for Goya so the driver's code to automatically change the profile won't run on Gaudi. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	e09498b078	habanalabs: add dedicated define for hard reset Gaudi requires longer waiting during reset due to closing of network ports. Add this explanation to the relevant comment in the code and add a dedicated define for this reset timeout period, instead of multiplying another define. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	9e5e49cd5b	habanalabs: check if CoreSight is supported Coresight is not supported on simulator, therefore add a boolean for checking that (currently used by un-upstreamed code). Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	b75f22505a	habanalabs: add signal/wait to CS IOCTL operations Add the following two operations to the CS IOCTL: Signal: The signal operation is basically a command submission, that is created by the driver upon user request. It will be implemented using a dedicated PQE that will increment a specific SOB. There will be a new flag: HL_CS_FLAGS_SIGNAL. When the user set this flag in the CS IOCTL structure, the driver will execute a dedicated code path that will prepare this special PQE and submit it. The user only needs to provide a queue index on which to put the signal. Wait: The wait operation is also a command submission that is created by the driver upon user request. It will be implemented using a dedicated PQE that will contain packets of "ARM a monitor" + FENCE packet. There will be a new flag: HL_CS_FLAGS_WAIT. When the user set this flag in the CS structure, the driver will execute a dedicated code path that will prepare this special PQE and submit it. The user needs to provide the following parameters: 1. queue ID 2. an array of signal_seq numbers and the number of signals to wait on (the length of signal_seq_arr). The IOCTL will return the CS sequence number of the wait it put on the queue ID. Currently, the code supports signal_seq_nr==1. But this API definition will allow us to put a single PQE that waits on multiple signals. To correctly configure the monitor and fence, the driver will need to retrieve the specified signal CS object that contains the relevant SOB and its expected value. In case the signal CS has already been completed, there is no point of adding a wait operation. In this case, the driver will return to the user without putting anything on the PQ. The return code should reflect to the user that the signal was completed, as we won't return a CS sequence number for this wait. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	b0b5d92579	habanalabs: handle the h/w sync object Define a structure representing the h/w sync object (SOB). a SOB can contain up to 2^15 values. Each signal CS will increment the SOB by 1, so after some time we will reach the maximum number the SOB can represent. When that happens, the driver needs to move to a different SOB for the signal operation. A SOB can be in 1 of 4 states: 1. Working state with value < 2^15 2. We reached a value of 2^15, but the signal operations weren't completed yet OR there are pending waits on this signal. For the next submission, the driver will move to another SOB. 3. ALL the signal operations on the SOB have finished AND there are no more pending waits on the SOB AND we reached a value of 2^15 (This basically means the refcnt of the SOB is 0 - see explanation below). When that happens, the driver can clear the SOB by simply doing WREG32 0 to it and set the refcnt back to 1. 4. The SOB is cleared and can be used next time by the driver when it needs to reuse an SOB. Per SOB, the driver will maintain a single refcnt, that will be initialized to 1. When a signal or wait operation on this SOB is submitted to the PQ, the refcnt will be incremented. When a signal or wait operation on this SOB completes, the refcnt will be decremented. After the submission of the signal operation that increments the SOB to a value of 2^15, the refcnt is also decremented. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	ec2f8a306a	habanalabs: define ASIC-dependent interface for signal/wait This feature requires handling h/w resources which are a bit different from one ASIC to the other. Therefore, we need to define a set of interfaces the ASIC code provides to the common code to signal, wait, reset sync object and to reset and init a queue. As this feature is not supported in Goya, provide an empty implementation of those functions. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Omer Shpigelman	f9e5f29518	uapi: habanalabs: add signal/wait operations This is a pre-requisite to upstreaming GAUDI support. Signal/wait operations are done by the user to perform sync between two Primary Queues (PQs). The sync is done using the sync manager and it is usually resolved inside the device, but sometimes it can be resolved in the host, i.e. the user should be able to wait in the host until a signal has been completed. The mechanism to define signal and wait operations is done by the driver because it needs atomicity and serialization, which is already done in the driver when submitting work to the different queues. To implement this feature, the driver "takes" a couple of h/w resources, and this is reflected by the defines added to the uapi file. The signal/wait operations are done via the existing CS IOCTL, and they use the same data structure. There is a difference in the meaning of some of the parameters, and for that we added unions to make the code more readable. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	824b457839	habanalabs: add missing MODULE_DEVICE_TABLE PCI drivers should use this define to declare their PCI ID table. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Dotan Barak	0a62c3926e	habanalabs: print all CB handles as hex numbers Make all the CB handles printed in the same way and not some as decimal and some as hex numbers. Signed-off-by: Dotan Barak <dbarak@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	010a118cfe	habanalabs: update F/W register map Update the mapping to the latest one used by the Firmware. No impact on the driver in this update. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Adam Aharon	aa9dd58bcc	habanalabs: enable trace data compression (profiler) Set the STMTCSR.COMPEN bit to enable leading-zero trace data compression functionality for the extended stimulus ports. Signed-off-by: Adam Aharon <aaharon@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Ofir Bitton	47f6b41cdd	habanalabs: load CPU device boot loader from host Load CPU device boot loader during driver boot time in order to avoid flash write for every boot loader update. To preserve backward-compatibility, skip the device boot load if the device doesn't request it. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	39b425170d	habanalabs: leave space for 2xMSG_PROT in CB The user must leave space for 2xMSG_PROT in the external CB, so adjust the define of max size accordingly. The driver, however, can still create a CB with the maximum size of 2MB. Therefore, we need to add a check specifically for the user requested size. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Christine Gharzuzi	8e708af284	habanalabs: support hwmon_reset_history attribute Support hwmon_temp_reset_history, hwmon_in_reset_history and hwmon_curr_reset attribute which resets the historical highest value. Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Tomer Tayar	79c823c57e	habanalabs: Align protection bits configuration of all TPCs Align the protection bits configuration of all TPC cores to be as of TPC core 0. Fixes: `a513f9a7ec` ("habanalabs: make tpc registers secured") Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Tomer Tayar	eef544f746	habanalabs: Allow access to TPC LFSR register Allow user access to TPC LFSR register, as it might be accessed by TPC kernels. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Tomer Tayar	25e7aeba60	habanalabs: Add INFO IOCTL opcode for time sync information Add a new opcode to the INFO IOCTL that retrieves the device time alongside the host time, to allow a user application that want to measure device time together with host time (such as a profiler) to synchronize these times. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
kbuild test robot	ba7193c952	habanalabs: hl_pci_set_dma_mask() can be static set function to be static as it is not called from outside its file. Signed-off-by: kbuild test robot <lkp@intel.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-19 14:48:41 +03:00
Oded Gabbay	926ba4cce1	habanalabs: handle barriers in DMA QMAN streams When we have DMA QMAN with multiple streams, we need to know whether the command buffer contains at least one DMA packet in order to configure the barriers correctly when adding the 2xMSG_PROT at the end of the JOB. If there is no DMA packet, then there is no need to put engine barrier. This is relevant only for GAUDI as GOYA doesn't have streams so the engine can't be busy by another stream. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	cb056b9fd5	habanalabs: retrieve DMA mask indication from firmware Retrieve from the firmware the DMA mask value we need to set according to the device's PCI controller configuration. This is needed when working on POWER9 machines, as the device's PCI controller is configured in a different way in those machines. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	c8aee597bb	habanalabs: update firmware definitions Add comments for the various errors and states of the firmware during boot. Add a mapping of a new register that will tell the driver whether the firmware executed the request from the driver or if it has encountered an error. Add a new enum for the possible values of this register. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	7a65ee046b	habanalabs: increase timeout during reset When doing training, the DL framework (e.g. tensorflow) performs hundreds of thousands of memory allocations and mappings. In case the driver needs to perform hard-reset during training, the driver kills the application and unmaps all those memory allocations. Unfortunately, because of that large amount of mappings, the driver isn't able to do that in the current timeout (5 seconds). Therefore, increase the timeout significantly to 30 seconds to avoid situation where the driver resets the device with active mappings, which sometime can cause a kernel bug. BTW, it doesn't mean we will spend all the 30 seconds because the reset thread checks every one second if the unmap operation is done. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	49aba0bbab	habanalabs: print warning when reset is requested When the system administrator asks the driver to soft or hard reset the device through sysfs, the driver should display a warning in the kernel log to explain why it suddenly resets the device. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Oded Gabbay	7e1c07dd35	habanalabs: unify and improve device cpu init Move the code of device CPU initialization from being ASIC-Dependent to common code. In addition, add support for the new error reporting feature of the firmware boot code. Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Omer Shpigelman	1fa185c656	habanalabs: re-factor H/W queues initialization We want to remove the following restrictions/assumptions in our driver: 1. The H/W queue index is also the completion queue index. 2. The H/W queue index is also the IRQ number of the completion queue. 3. All queues of the same type have consecutive indexes. Therefore we add the support for H/W queues of the same type with nonconsecutive indexes and completion queue index and IRQ number different than the H/W queue index. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
Omer Shpigelman	76cedc739d	habanalabs: remove stop-on-error flag from DMA Stop-on-error mode in DMA is useful as it stops the transaction immediately upon error e.g. page fault. But it may cause the next command submission to fail as is leaves the DMA in unstable state. Therefore we remove the stop-on-error configuration from the DMA. Stop-on-err is still available for debug. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +03:00
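
The reference-counting rules described in commit b0b5d92579 ("habanalabs: handle the h/w sync object") can be sketched as a small user-space model. This is only an illustration of the lifecycle stated in that commit message, not the driver's code: the struct sob_model type, its exhausted flag, and the sob_* helper names are all hypothetical, and the real driver may track these states differently.

```c
#include <assert.h>
#include <stdbool.h>

/* From the commit message: a SOB can count up to 2^15. */
#define SOB_MAX_VAL (1u << 15)

/* Hypothetical user-space model of one sync object (not the driver's struct). */
struct sob_model {
	unsigned int val;    /* value the submitted signals will bring the SOB to */
	unsigned int refcnt; /* initial reference plus one per in-flight operation */
	bool exhausted;      /* val reached SOB_MAX_VAL; new signals need another SOB */
};

/* State 4: the SOB is cleared and ready for (re)use, refcnt starts at 1. */
static void sob_init(struct sob_model *s)
{
	s->val = 0;
	s->refcnt = 1;
	s->exhausted = false;
}

/* Drop one reference; at zero, clear the SOB (state 3 -> state 4). */
static void sob_put(struct sob_model *s)
{
	assert(s->refcnt > 0);
	if (--s->refcnt == 0)
		sob_init(s); /* models "WREG32 0" plus resetting refcnt to 1 */
}

/* Submit a signal. Returns false when the caller must move to another SOB. */
static bool sob_submit_signal(struct sob_model *s)
{
	if (s->exhausted)
		return false; /* state 2: operations still pending on a full SOB */

	s->refcnt++; /* reference held by the in-flight signal */
	s->val++;
	if (s->val == SOB_MAX_VAL) {
		s->exhausted = true;
		sob_put(s); /* drop the initial reference, as the commit describes */
	}
	return true;
}

/* Submit a wait on this SOB: it also holds a reference until it completes. */
static void sob_submit_wait(struct sob_model *s)
{
	s->refcnt++;
}

/* A signal or wait on this SOB has completed. */
static void sob_complete(struct sob_model *s)
{
	sob_put(s);
}
```

In this model a full SOB is cleared only after its last pending signal or wait completes, because the initial reference is dropped at the submission that reaches 2^15 and every in-flight operation holds one more reference.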


4153 Commits