linux

mirror of https://github.com/Dasharo/linux.git synced 2026-03-06 15:25:10 -08:00

Author	SHA1	Message	Date
Greg Kroah-Hartman	c29930ef83	Merge tag 'fsi-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi into char-misc-next Joel writes: FSI changes for v5.18 * Improvements in SCOM and OCC drivers for error handling and retries * Addition of tracepoints for initialisation path * API for setting long running SBE FIFO operations * tag 'fsi-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi: fsi: Add trace events in initialization path fsi: sbefifo: Implement FSI_SBEFIFO_READ_TIMEOUT_SECONDS ioctl fsi: sbefifo: Use specified value of start of response timeout fsi: occ: Improve response status checking fsi: scom: Remove retries in indirect scoms fsi: scom: Fix error handling	2022-02-21 17:47:42 +01:00
Eddie James	f2af60bb7c	fsi: Add trace events in initialization path Add definitions for trace events to show the scanning flow. Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20220207161640.35605-1-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2022-02-21 19:38:54 +10:30
Amitay Isaacs	a1dc630886	fsi: sbefifo: Implement FSI_SBEFIFO_READ_TIMEOUT_SECONDS ioctl FSI_SBEFIFO_READ_TIMEOUT_SECONDS ioctl sets the read timeout (in seconds) for the response received by sbefifo device from sbe. The timeout affects only the read operation on current sbefifo device fd. Certain SBE operations can take long time to complete and the default timeout of 10 seconds might not be sufficient to start receiving response from SBE. In such cases, allow the timeout to be set to the maximum of 120 seconds. The kernel does not contain the definition of the various SBE operations, so we must expose an interface to userspace to set the timeout for the given operation. Signed-off-by: Amitay Isaacs <amitay@ozlabs.org> Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20220121053816.82253-3-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2022-02-21 19:38:17 +10:30
Amitay Isaacs	b8d536d277	fsi: sbefifo: Use specified value of start of response timeout For some of the chip-ops where sbe needs to collect trace information, sbe can take a long time (>30s) to respond. Currently these chip-ops will timeout as the start of response timeout defaults to 10s. Instead of default value, use specified value. The require timeout value will be set using ioctl. Signed-off-by: Amitay Isaacs <amitay@ozlabs.org> Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20220121053816.82253-2-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2022-02-21 19:38:17 +10:30
Eddie James	3dcf3c84f5	fsi: occ: Improve response status checking If the driver sequence number coincidentally equals the previous command response sequence number, the driver may proceed with fetching the entire buffer before the OCC has processed the current command. To be sure the correct response is obtained, check the command type and also retry if any of the response parameters have changed when the rest of the buffer is fetched. Also initialize the driver with a random sequence number in order to reduce the chances of this happening. Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20220208152235.19686-1-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2022-02-21 19:37:03 +10:30
Christophe JAILLET	83ba7e895d	fsi: Aspeed: Fix a potential double free A struct device can never be devm_alloc()'ed. Here, it is embedded in "struct fsi_master", and "struct fsi_master" is embedded in "struct fsi_master_aspeed". Since "struct device" is embedded, the data structure embedding it must be released with the release function, as is already done here. So use kzalloc() instead of devm_kzalloc() when allocating "aspeed" and update all error handling branches accordingly. This prevent a potential double free(). This also fix another issue if opb_readl() fails. Instead of a direct return, it now jumps in the error handling path. Fixes: `606397d67f` ("fsi: Add ast2600 master driver") Suggested-by: Greg KH <gregkh@linuxfoundation.org> Suggested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/2c123f8b0a40dc1a061fae982169fe030b4f47e6.1641765339.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-04 16:45:39 +01:00
Joel Stanley	ab1b79159a	fsi: scom: Remove retries in indirect scoms In commit `f72ddbe1d7` ("fsi: scom: Remove retries") the retries were removed from get and put scoms. That patch missed the retires in get and put indirect scom. For the same reason, remove them from the scom driver to allow the caller to decide to retry. This removes the following special case which would have caused the retry code to return early: - if ((ind_data & XSCOM_DATA_IND_COMPLETE) \|\| (err != SCOM_PIB_BLOCKED)) - return 0; I believe this case is handled. Fixes: `f72ddbe1d7` ("fsi: scom: Remove retries") Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20211207033811.518981-3-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2022-01-31 15:56:55 +10:30
Joel Stanley	d46fddd52d	fsi: scom: Fix error handling SCOM error handling is made complex by trying to pass around two bits of information: the function return code, and a status parameter that represents the CFAM error status register. The commit `f72ddbe1d7` ("fsi: scom: Remove retries") removed the "hidden" retries in the SCOM driver, in preference of allowing the calling code (userspace or driver) to decide how to handle a failed SCOM. However it introduced a bug by attempting to be smart about the return codes that were "errors" and which were ok to fall through to the status register parsing. We get the following errors: - EINVAL or ENXIO, for indirect scoms where the value is invalid - EINVAL, where the size or address is incorrect - EIO or ETIMEOUT, where FSI write failed (aspeed master) - EAGAIN, where the master detected a crc error (GPIO master only) - EBUSY, where the bus is disabled (GPIO master in external mode) In all of these cases we should fail the SCOM read/write and return the error. Thanks to Dan Carpenter for the detailed bug report. Fixes: `f72ddbe1d7` ("fsi: scom: Remove retries") Link: https://lists.ozlabs.org/pipermail/linux-fsi/2021-November/000235.html Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20211207033811.518981-2-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2022-01-31 15:56:55 +10:30
Eddie James	7cc2f34e1f	fsi: sbefifo: Use interruptible mutex locking Some SBE operations have extremely large responses and can require several minutes to process the response. During this time, the device lock must be held. If another process attempts an operation, it will wait for the mutex for longer than the kernel hung task watchdog allows. Therefore, use the interruptible function to lock the mutex. Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20210803213016.44739-1-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-10-22 09:54:33 +10:30
Eddie James	826280348e	fsi: sbefifo: Add sysfs file indicating a timeout error The SBEFIFO timeout error requires special handling in userspace to do recovery operations. Add a sysfs file to indicate a timeout error, and notify pollers when a timeout occurs. This will be used by the openpower-occ-control application. Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20211019211749.38059-3-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-10-22 09:54:33 +10:30
Eddie James	8ec3cc9fb5	fsi: occ: Store the SBEFIFO FFDC in the user response buffer If the SBEFIFO response indicates an error, store the response in the user buffer and return an error. Previously, the user had no way of obtaining the SBEFIFO FFDC. The user's buffer now contains data in the event of a failure. No change in the event of a successful transfer. Signed-off-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20211019205307.36946-3-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-10-22 09:54:32 +10:30
Eddie James	008d3825a8	fsi: occ: Use a large buffer for responses Allocate a large buffer for each OCC to handle response data. This removes memory allocation during an operation, and also allows for the maximum amount of SBE FFDC. Previously for the putsram and attn commands, only 32 words would have been available, and for getsram, only up to the size of the transfer. SBE FFDC might be up to 8Kb. The SBE interface expects data to be specified in units of words (4 bytes), defined as OCC_MAX_RESP_WORDS. This change allows the full FFDC capture to be implemented, where before it was not available. Signed-off-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20211019205307.36946-2-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-10-22 09:50:55 +10:30
Eddie James	62f79f3d0e	fsi: occ: Force sequence numbering per OCC Set and increment the sequence number during the submit operation. This prevents sequence number conflicts between different users of the interface. A sequence number conflict may result in a user getting an OCC response meant for a different command. Since the sequence number is now modified, the checksum must be calculated and set before submitting the command. Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20210721190231.117185-2-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-10-15 15:09:25 +10:30
Joachim Fenkes	9ab1428dfe	fsi/sbefifo: Fix reset timeout On BMCs with lower timer resolution than 1ms, msleep(1) will take way longer than 1ms, so looping 10k times won't wait for 10s but significantly longer. Fix this by using jiffies like the rest of the code. Fixes: `9f4a8a2d7f` ("fsi/sbefifo: Add driver for the SBE FIFO") Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Link: https://lore.kernel.org/r/20200724071518.430515-3-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 16:06:57 +09:30
Joachim Fenkes	95152433e4	fsi/sbefifo: Clean up correct FIFO when receiving reset request from SBE When the SBE requests a reset via the down FIFO, that is also the FIFO we should go and reset ;) Fixes: `9f4a8a2d7f` ("fsi/sbefifo: Add driver for the SBE FIFO") Signed-off-by: Joachim Fenkes <FENKES@de.ibm.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20200724071518.430515-2-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 15:23:40 +09:30
Zhen Lei	56e05c60f2	fsi: master-ast-cf: Remove redundant error printing in fsi_master_acf_probe() When devm_ioremap_resource() fails, a clear enough error message will be printed by its subfunction __devm_ioremap_resource(). The error information contains the device name, failure cause, and possibly resource information. Therefore, remove the error printing here to simplify code and reduce the binary size. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://patchwork.ozlabs.org/project/linux-fsi/patch/20210511085745.4340-1-thunder.leizhen@huawei.com/ Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 15:02:55 +09:30
Eddie James	1e2233d4f3	fsi: Aspeed: Reduce poll timeout The lengthy timeout previously used sometimes resulted in scheduling problems, detailed below. Therefore reduce the timeout to 500us. This timeout selection is supported by the benchmarks collected below with various clock dividers. This is purely the time spent polling (reported by ktime_get()). div 1: max:150us avg: 2us div 2: max:155us avg: 3us div 4: max:149us avg: 7us div 8: max:153us avg: 13us div 16: max:197us avg: 21us div 32: max:181us avg: 50us div 64: max:262us avg:100us Jan 22 01:27:21 rain27bmc kernel: rcu: INFO: rcu_sched self-detected stall on CPU Jan 22 01:27:21 rain27bmc kernel: rcu: 0-....: (2099 ticks this GP) idle=0ca/1/0x40000002 softirq=349573/349573 fqs=1048 Jan 22 01:27:21 rain27bmc kernel: (t=2100 jiffies g=841149 q=7163) Jan 22 01:27:21 rain27bmc kernel: NMI backtrace for cpu 0 Jan 22 01:27:21 rain27bmc kernel: CPU: 0 PID: 5959 Comm: ibm-read-vpd Not tainted 5.8.17-a9b4ea8 #1 Jan 22 01:27:21 rain27bmc kernel: Hardware name: Generic DT based system Jan 22 01:27:21 rain27bmc kernel: Backtrace: Jan 22 01:27:25 rain27bmc kernel: [<8010d92c>] (dump_backtrace) from [<8010db80>] (show_stack+0x20/0x24) ... Jan 22 01:27:25 rain27bmc kernel: [<8010130c>] (gic_handle_irq) from [<80100b0c>] (__irq_svc+0x6c/0x90) Jan 22 01:27:25 rain27bmc kernel: Exception stack(0xb79159b0 to 0xb79159f8) Jan 22 01:27:25 rain27bmc kernel: 59a0: 9e88e5d5 00000559 00000559 00000018 Jan 22 01:27:25 rain27bmc kernel: 59c0: 00000000 9f217c55 00000003 00000559 a0201c00 bfa4d048 bfa4d000 b7915a44 Jan 22 01:27:25 rain27bmc kernel: 59e0: 40e88f8a b7915a00 3254e553 80734924 80030113 ffffffff Jan 22 01:27:25 rain27bmc kernel: r9:b7914000 r8:a0201c00 r7:b79159e4 r6:ffffffff r5:80030113 r4:80734924 Jan 22 01:27:25 rain27bmc kernel: [<807348b4>] (__opb_read) from [<80734d98>] (aspeed_master_read+0xbc/0xcc) Jan 22 01:27:25 rain27bmc kernel: r10:00000004 r9:00000002 r8:80734cdc r7:bd33fa40 r6:00000004 r5:bd33f840 Jan 22 01:27:25 rain27bmc kernel: r4:00201c00 Jan 22 01:27:25 rain27bmc kernel: [<80734cdc>] (aspeed_master_read) from [<807320f0>] (fsi_master_read+0x6c/0x1bc) ... Signed-off-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20210211194846.35475-1-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 15:00:01 +09:30
Yangtao Li	a3469912f4	fsi: aspeed: convert to devm_platform_ioremap_resource Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Link: https://lore.kernel.org/r/20191228190631.26777-1-tiny.windzz@gmail.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:34:09 +09:30
Eddie James	614f0a50c9	fsi: occ: Log error for checksum failure Log an error if the response checksum doesn't match the calculated checksum. Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20210209171235.20624-3-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:29:57 +09:30
Eddie James	8a4659be08	fsi: occ: Don't accept response from un-initialized OCC If the OCC is not initialized and responds as such, the driver should continue waiting for a valid response until the timeout expires. Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Fixes: `7ed98dddb7` ("fsi: Add On-Chip Controller (OCC) driver") Link: https://lore.kernel.org/r/20210209171235.20624-2-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:29:54 +09:30
Joel Stanley	f72ddbe1d7	fsi: scom: Remove retries On a functioning FSI link there is not need to retry a write when doing a scom in the driver. Allow the higher layers (eg. userspace) to attempt a retry if they want, or to accept that the address they are talking to is not accessible. By removing the retries we can separate the error handling from retry logic. In particular -EBUSY was used to force the get/put scom logic to retry. Signed-off-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20210527070109.225198-1-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:24:02 +09:30
Eddie James	a5c317dac5	fsi: scom: Reset the FSI2PIB engine for any error The error bits in the FSI2PIB status are only cleared by a reset. So the driver needs to perform a reset after seeing any of the FSI2PIB errors, otherwise subsequent operations will also look like failures. Fixes: `6b293258cd` ("fsi: scom: Major overhaul") Signed-off-by: Eddie James <eajames@linux.ibm.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20210329151344.14246-1-eajames@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:19:48 +09:30
Joel Stanley	4134cb9165	fsi: aspeed: Emit fewer barriers in opb operations When setting up a read or write to the OPB memory space, we must perform five or six AHB writes. The ordering of these up until the trigger write does not matter, so use writel_relaxed. The generated code goes from (Debian GCC 10.2.1-6): mov r8, r3 mcr 15, 0, sl, cr7, cr10, {4} str sl, [r6, #20] mcr 15, 0, sl, cr7, cr10, {4} str r3, [r6, #24] mcr 15, 0, sl, cr7, cr10, {4} str r1, [r6, #28] mcr 15, 0, sl, cr7, cr10, {4} str r2, [r6, #32] mcr 15, 0, sl, cr7, cr10, {4} mov r1, #1 str r1, [r6, #64] ; 0x40 mcr 15, 0, sl, cr7, cr10, {4} str r1, [r6, #4] to this: str r3, [r7, #20] str r2, [r7, #24] str r1, [r7, #28] str r3, [r7, #64] mov r8, #0 mcr 15, 0, r8, cr7, cr10, {4} str r3, [r7, #4] Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Jeremy Kerr <jk@ozlabs.org> Reviewed-by: Eddie James <eajames@linux.ibm.com> Tested-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20210223041737.171274-1-joel@jms.id.au Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:18:17 +09:30
Colin Ian King	9108109457	fsi: core: Fix return of error values on failures Currently the cfam_read and cfam_write functions return the provided number of bytes given in the count parameter and not the error return code in variable rc, hence all failures of read/writes are being silently ignored. Fix this by returning the error code in rc. Addresses-Coverity: ("Unused value") Fixes: `d1dcd67825` ("fsi: Add cfam char devices") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Jeremy Kerr <jk@ozlabs.org> Link: https://lore.kernel.org/r/20210603122812.83587-1-colin.king@canonical.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:11:48 +09:30
Zou Wei	19a5217812	fsi: Add missing MODULE_DEVICE_TABLE This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Link: https://lore.kernel.org/r/1620896249-52769-1-git-send-email-zou_wei@huawei.com Signed-off-by: Joel Stanley <joel@jms.id.au>	2021-06-04 14:10:45 +09:30

1 2 3 4 5 ...

155 Commits