Some SBE operations have extremely large responses and can require
several minutes to process the response. During this time, the device
lock must be held. If another process attempts an operation, it will
wait for the mutex for longer than the kernel hung task watchdog
allows. Therefore, use the interruptible function to lock the mutex.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210803213016.44739-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
If the SBEFIFO response indicates an error, store the response in the
user buffer and return an error. Previously, the user had no way of
obtaining the SBEFIFO FFDC.
The user's buffer now contains data in the event of a failure. No change
in the event of a successful transfer.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019205307.36946-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Allocate a large buffer for each OCC to handle response data. This
removes memory allocation during an operation, and also allows for
the maximum amount of SBE FFDC.
Previously for the putsram and attn commands, only 32 words would have
been available, and for getsram, only up to the size of the transfer.
SBE FFDC might be up to 8Kb.
The SBE interface expects data to be specified in units of words (4
bytes), defined as OCC_MAX_RESP_WORDS.
This change allows the full FFDC capture to be implemented, where before
it was not available.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019205307.36946-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Set and increment the sequence number during the submit operation.
This prevents sequence number conflicts between different users of
the interface. A sequence number conflict may result in a user
getting an OCC response meant for a different command. Since the
sequence number is now modified, the checksum must be calculated and
set before submitting the command.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210721190231.117185-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
On BMCs with lower timer resolution than 1ms, msleep(1) will take
way longer than 1ms, so looping 10k times won't wait for 10s but
significantly longer.
Fix this by using jiffies like the rest of the code.
Fixes: 9f4a8a2d7f ("fsi/sbefifo: Add driver for the SBE FIFO")
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Link: https://lore.kernel.org/r/20200724071518.430515-3-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
The lengthy timeout previously used sometimes resulted in
scheduling problems, detailed below. Therefore reduce the timeout
to 500us. This timeout selection is supported by the benchmarks
collected below with various clock dividers. This is purely the time
spent polling (reported by ktime_get()).
div 1: max:150us avg: 2us
div 2: max:155us avg: 3us
div 4: max:149us avg: 7us
div 8: max:153us avg: 13us
div 16: max:197us avg: 21us
div 32: max:181us avg: 50us
div 64: max:262us avg:100us
Jan 22 01:27:21 rain27bmc kernel: rcu: INFO: rcu_sched self-detected stall on CPU
Jan 22 01:27:21 rain27bmc kernel: rcu: 0-....: (2099 ticks this GP) idle=0ca/1/0x40000002 softirq=349573/349573 fqs=1048
Jan 22 01:27:21 rain27bmc kernel: (t=2100 jiffies g=841149 q=7163)
Jan 22 01:27:21 rain27bmc kernel: NMI backtrace for cpu 0
Jan 22 01:27:21 rain27bmc kernel: CPU: 0 PID: 5959 Comm: ibm-read-vpd Not tainted 5.8.17-a9b4ea8 #1
Jan 22 01:27:21 rain27bmc kernel: Hardware name: Generic DT based system
Jan 22 01:27:21 rain27bmc kernel: Backtrace:
Jan 22 01:27:25 rain27bmc kernel: [<8010d92c>] (dump_backtrace) from [<8010db80>] (show_stack+0x20/0x24)
...
Jan 22 01:27:25 rain27bmc kernel: [<8010130c>] (gic_handle_irq) from [<80100b0c>] (__irq_svc+0x6c/0x90)
Jan 22 01:27:25 rain27bmc kernel: Exception stack(0xb79159b0 to 0xb79159f8)
Jan 22 01:27:25 rain27bmc kernel: 59a0: 9e88e5d5 00000559 00000559 00000018
Jan 22 01:27:25 rain27bmc kernel: 59c0: 00000000 9f217c55 00000003 00000559 a0201c00 bfa4d048 bfa4d000 b7915a44
Jan 22 01:27:25 rain27bmc kernel: 59e0: 40e88f8a b7915a00 3254e553 80734924 80030113 ffffffff
Jan 22 01:27:25 rain27bmc kernel: r9:b7914000 r8:a0201c00 r7:b79159e4 r6:ffffffff r5:80030113 r4:80734924
Jan 22 01:27:25 rain27bmc kernel: [<807348b4>] (__opb_read) from [<80734d98>] (aspeed_master_read+0xbc/0xcc)
Jan 22 01:27:25 rain27bmc kernel: r10:00000004 r9:00000002 r8:80734cdc r7:bd33fa40 r6:00000004 r5:bd33f840
Jan 22 01:27:25 rain27bmc kernel: r4:00201c00
Jan 22 01:27:25 rain27bmc kernel: [<80734cdc>] (aspeed_master_read) from [<807320f0>] (fsi_master_read+0x6c/0x1bc)
...
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20210211194846.35475-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
On a functioning FSI link there is not need to retry a write when doing
a scom in the driver.
Allow the higher layers (eg. userspace) to attempt a retry if they want,
or to accept that the address they are talking to is not accessible.
By removing the retries we can separate the error handling from retry
logic. In particular -EBUSY was used to force the get/put scom logic to
retry.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210527070109.225198-1-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
Currently the cfam_read and cfam_write functions return the provided
number of bytes given in the count parameter and not the error return
code in variable rc, hence all failures of read/writes are being
silently ignored. Fix this by returning the error code in rc.
Addresses-Coverity: ("Unused value")
Fixes: d1dcd67825 ("fsi: Add cfam char devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jeremy Kerr <jk@ozlabs.org>
Link: https://lore.kernel.org/r/20210603122812.83587-1-colin.king@canonical.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Pull hwmon updates from Guenter Roeck:
"New drivers:
- SB-TSI sensors
- Lineat Technology LTC2992
- Delta power supplies Q54SJ108A2
- Maxim MAX127
- Corsair PSU
- STMicroelectronics PM6764 Voltage Regulator
New chip support:
- P10 added to fsi/occ driver
- NCT6687D added to nct6883 driver
- Intel-based Xserves added to applesmc driver
- AMD family 19h model 01h added to amd_energy driver
And various minor bug fixes and improvements"
* tag 'hwmon-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (41 commits)
dt-bindings: (hwmon/sbtsi_temp) Add SB-TSI hwmon driver bindings
hwmon: (sbtsi) Add documentation
hwmon: (sbtsi) Add basic support for SB-TSI sensors
hwmon: (iio_hwmon) Drop bogus __refdata annotation
hwmon: (xgene) Drop bogus __refdata annotation
dt-bindings: hwmon: convert AD ADM1275 bindings to dt-schema
hwmon: (occ) Add new temperature sensor type
fsi: occ: Add support for P10
dt-bindings: fsi: Add P10 OCC device documentation
dt-bindings: hwmon: convert TI ADS7828 bindings to dt-schema
dt-bindings: hwmon: convert AD AD741x bindings to dt-schema
dt-bindings: hwmon: convert TI INA2xx bindings to dt-schema
hwmon: (ltc2992) Fix less than zero comparisons with an unsigned integer
hwmon: (pmbus/q54sj108a2) Correct title underline length
dt-bindings: hwmon: Add documentation for ltc2992
hwmon: (ltc2992) Add support for GPIOs.
hwmon: (ltc2992) Add support
hwmon: (pmbus) Driver for Delta power supplies Q54SJ108A2
hwmon: Add driver for STMicroelectronics PM6764 Voltage Regulator
hwmon: (nct6683) Support NCT6687D.
...
Systems have a line for restting the remote CFAM. This is not part of
the FSI master, but is associated with it, so it makes sense to include
it in the master driver.
This exposes a sysfs interface to reset the cfam, abstracting away the
direction and polarity of the GPIO, as well as the timing of the reset
pulse. Userspace will be blocked until the reset pulse is finished.
The reset is hard coded to be in the range of (900, 1000) us. It was
observed with a scope to regularly be just over 1ms.
If the device tree property is not preset the driver will silently
continue.
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Link: https://lore.kernel.org/r/20200728025527.174503-6-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
For testing and hardware debugging a user may wish to override the
divisor at runtime. By setting fsi_master_aspeed.bus_div=N, the divisor
will be set to N, if 0 < N <= 0x3ff.
This is a module parameter and not a device tree option as it will only
need to be set when testing or debugging.
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20200728025527.174503-5-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
Some FSI capable systems have internal FSI signals, and some have
external cabled FSI. Software can detect which machine this is by
reading a jumper GPIO, and also control which pins the signals are
routed to through a mux GPIO.
This attempts to find the GPIOs at probe time. If they are not present
in the device tree the driver will not error and continue as before.
The mux GPIO is owned by the FSI driver to ensure it is not modified at
runtime. The routing jumper obtained as non-exclusive to allow other
software to inspect it's state.
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Link: https://lore.kernel.org/r/20200728025527.174503-3-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
The only usage of scom_ids is to assign its address to the id_table
field in the fsi_driver struct, which is a const pointer, so make it
const to allow the compiler to put it in read-only memory
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>