The sector probe buffer in firehose_try_configure() is a small,
bounded allocation (max 4096 bytes) used only within this function
scope. Replace malloc() with alloca() to:
- avoid the need for a free() path and NULL-check error handling
- simplify control flow, as the buffer is automatically released
on function return
The upper bound is compile-time constant (largest entry in
sector_sizes[]), well within safe stack allocation limits.
Signed-off-by: Igor Opaniuk <igor.opaniuk@oss.qualcomm.com>
This got broken during some refactoring, pass the correct values through.
Fixes: 841a0a8e7f ("firehose: Improve startup time significantly")
Signed-off-by: Casey Connolly <casey.connolly@linaro.org>
Commit 'a10cf7f5da7a ("firehose: Increase image write timeout to 60
seconds")' bumped the timeout for writes in the program operation, per
observed behavior. This was confirmed to solve the problems seen when
programming SPINOR in Nord devices
New reports (and measurements) shows the write taking a negligible
amount of time and the read of the following "program ack" taking the
whole reported ~35 second, resulting in a timeout.
The proprietary tools has a write timeout of 60 seconds in some
operating systems and no timeout for others, and a read timeout of 120
seconds.
Update this timeout for the "program ack" to 120 seconds, to match the
proprietary tools.
Fixes: a10cf7f5da ("firehose: Increase image write timeout to 60 seconds")
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
On RB3 and IQ9, devices may intermittently hang if reboot is issued too
soon after flashing, sometimes requiring a full power cycle to recover.
During testing, shorter post-flash delays were more likely to reproduce
the issue. Add a conservative 10s delay between flash completion and
reboot to provide a stabilization window.
This mirrors long-standing behavior in fh_loader and PCAT.
Results (30 runs each, Rubik Pi/IQ8/IQ9):
- QDL 2.4 baseline: 18/30 flash attempts failed
- QDL 2.4 + this change: 30/30 passes
Signed-off-by: Yvonne Kaire <ykaire@qti.qualcomm.com>
Fix the capitalization of the 'ZlpAwareHost' and 'Verbose'
options, since the mis-capitalized versions seem to have no
effect on programmer behavior.
Signed-off-by: Steve Moskovchenko <stevemo@skydio.com>
[bjorn: Updated VIP digest]
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
The somewhat arbitrary timeout of 30 seconds was chosen by observing a
worst case time of ~15 seconds when flashing Hamoa and Nord. But, as
reported, flashing SAIL on Nord sometimes needs this timeout to be ~35
seconds.
While the relationship between the different factors and the needed
timeout isn't fully understood, bump the value to 60 seconds to make it
work for this case as well.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
The assumption that every other Firehose read transfer is null is
incorrect. A Zero-Length Packet (ZLP) is only sent when the received
transfer size is a multiple of the USB IN endpoint's Max Packet Size.
This is a way for the device to indicated end-of-transfer.
Unconditionally expecting an empty packet was not an issue with USB 2.0,
where the typical 512-byte sector size matches the bulk IN endpoint's
Max Packet Size. However, with USB SuperSpeed (Max Packet Size = 1024),
attempting to read an empty packet after a 512/1024-byte transfer is
unnecessary and fails (timeout), resulting in errors during storage
device sector size discovery:
```
waiting for programmer...
qdl: failed to read: Resource temporarily unavailable
```
Fix and move ZLP handling to bus-specific code (usb.c).
Reported-by: Danilo Leo <d.leo@arduino.cc>
Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
Address more signedness/unsignedness issues, like:
../sahara.c: In function ‘sahara_debug64_one’:
../qdl.h:20:12: warning: comparison of integer expressions of different
signedness: ‘long unsigned int’ and ‘int’ [-Wsign-compare]
20 | _x < _y ? _x : _y; \
| ^
../sahara.c:286:26: note: in expansion of macro ‘MIN’
286 | remain = MIN((uint64_t)(region.length - chunk), DEBUG_BLOCK_SIZE);
| ^~~
../qdl.h:20:24: warning: operand of ‘?:’ changes signedness from ‘int’ to
‘long unsigned int’ due to unsignedness of other operand [-Wsign-compare]
20 | _x < _y ? _x : _y; \
../gpt.c: In function ‘gpt_find_by_name’:
../gpt.c:255:65: warning: comparison of integer expressions of different
signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
255 | if (*phys_partition >= 0 && gpt_part->partition != *phys_partition)
Now tools are built without any warnings when -Wsign-compare is enabled.
Signed-off-by: Igor Opaniuk <igor.opaniuk@oss.qualcomm.com>
For SoCs that support multiple UFS devices, firehose allows the slot property. This adds a flag to set the slot parameter.
Signed-off-by: Jacob Creedon <jcreedon@gmail.com>
Address warnings for comparisons of integer expressions of different
signedness, for example:
../firehose.c:384:31: warning: comparison of integer expressions of
different signedness: ‘int’ and ‘long unsigned int’ [-Wsign-compare]
384 | for (i = 0; i < ARRAY_SIZE(sector_sizes); i++) {
In all places, where signed value is casted to unsigned (size_t for
instance), there is always explicitly handling of possible negative
value beforehand
Signed-off-by: Igor Opaniuk <igor.opaniuk@oss.qualcomm.com>
Usage of __unused macro before the variable name leads
to triggering checkpatch sanity check:
CHECK: spaces preferred around that '*' (ctx:WxV)
Fix this by moving the macro to the end of a function param
declaration, like it's done in the Linux kernel code.
Fixes: a79a572f18 ("Address unused parameter warnings")
Signed-off-by: Igor Opaniuk <igor.opaniuk@oss.qualcomm.com>
Use __unused attribute to suppress compile warnings for callback
implementations, where some parameters aren't used. For example:
warning: unused parameter ‘qdl’ [-Wunused-parameter]
./firehose.c:257:80: warning: unused parameter ‘rawmode’ [-Wunused-parameter]
257 | static int firehose_configure_response_parser(xmlNode *node...
Signed-off-by: Igor Opaniuk <igor.opaniuk@oss.qualcomm.com>
The ZLP for SPINOR writes has been measured to take up to 15 seconds to
complete, resulting in timeouts even with the new 10 second value.
Provide a longer timeout (double the currently measured value)
specifically for SPINOR.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Rather than accepting an arbitrary string for "storage type", pass this
through a variety of strcmp() and then passing it verbatim to
<configure>, define the supported storage types, decode the passed
value, and clean up the various checks.
While doing so, extend the help text to document that we do support both
"nand" and "spinor" (the latter with some trouble).
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
During QRB2210 provisioning, a USB write failure has been
observed early in the Firehose phase (first block write).
This issue has been traced to a timeout during the USB
bulk transfer of the Zero-Length Packet (ZLP).
In some conditions, the ZLP transfer may take longer than
the current timeout, up to approximately 1.7 seconds.
The issue specifically occurs after a prior large eMMC write
operation (e.g., during a previous QDL session). It could then
be related to internal eMMC I/O operations or timing delays
affecting the USB ack.
To resolve this issue, we introduce a timeout parameter to the
qdl_write function, consistent with the existing qdl_read, and
we increase the timeout to 10 seconds for Firehose raw binary
write operations to avoid 'false-positive' timeout.
Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
The firehose_read() at the start of firehose_run() serves the purpose of
waiting for the programmer to show up and to consume any <log/> messages
spat out, so that we're ready to invoke the <configure/>.
On MSM8916 there are no <log/> entries, so this is a 5 second loop
hitting -ETIMEDOUTs in firehose_read(). On other tested targets, it does
drain the <log/> entries and then hits -ETIMEDOUTs for the remainder of
5 seconds.
On the newer targets, we could perhaps determine when the programmer is
up by looking for <log/> entries, but this doesn't help the MSM8916
model and it doesn't fit with the current implementation.
If we instead tweak the timeouts and error handling of
firehose_configure(), we can speculatively attempt to configure the
programmer until success.
This meets both the observed models, and shows savings between 4 and 4.5
seconds in startup time across the tested devices.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
firehose_run() is clearly sequential, except if UFS provisioning is
taken place, then it's a completely different sequence and an early
return - almost like it's two separate functions crammed into one.
Split firehose_provision() from firehose_run() and make the decision in
qdl.c instead.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Somewhere along the road, we broke support for the programmer for
msm8916 (and presumably other similar socs). Attempts to operate
firehose against these boards would exclusively result in USB timeouts
on write.
The reason for this is that, in contrast to all other commonly used
targets, the programmer for MSM8916 generates <log/> messages after the
<response/> message and writes are not services until all the <log/>
entries are consumed.
Rework the read loop, to continue reading messages until a timeout or an
error occurs, and then return the response status, if one was previously
found. This cost us 100ms per attempt to read a response, but I've not
found any other way to determine that we've consumed all responses.
Naively doing this does however break <read/>, as immediately following
the response with "rawmode=true" the device starts sending the data.
Adding support for parsing out "rawmode" and immediately terminating the
loop solves this, and reduces the impact of the 100ms penalty (as it no
longer applies to either <program/> or <read/>.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Rather than letting the user wait in silence for read operations add a
progress bar, in the same way as it's done for writing.
Make it conditional on the "quiet" variable, to avoid spamming the log
when the GPT parser etc kicks in.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Rather than letting the flash process get halfway through before
aborting for missing files, check that files are available while loading
the program xml files, before we start talking to the device.
In addition to a better user experience, this helps cleaning up the
upcoming change where sparse files are parsed at load time as well.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Rather than waiting until program and read time to resolve the include
path, adjust the filename paths while we're parsing the XML tags.
This simplifies the execution step slightly, but more importantly it
allow us to also apply the missing-files check at load time in an
upcoming commit.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
While already powerful, it's quite often one wants to read and write
some specific GPT partition, and manually resolving the sectors and
plugging these into either a XML file or the command line is tedious and
error prone.
Allow partition names in the address specifier of the "read" and "write"
command line actions, and when these are used read the GPTs across all
physical partitions to resolve the physical partition, start sector and
sector count for the operation.
This allow us to do things like:
qdl prog_firehose.elf write abl_a abl2esp.elf write abl_b abl2esp.elf
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
When programatically issuing read and write operations they still need
to have a sector size that matches the selected storage device. But
rather than forcing the user to tell us what the sector size is,
implement a mechanism to probe for one that works.
The detected sector size is stored in the qdl context and is used if the
read or program operations don't specify one themself.
This could be used to improve user friendliness, e.g. by providing
better error messages when the wrong value is used, but this is left as
an open item.
Since the behavior of NAND-based programmers is unknown to me, the
mechanism is left disabled when this storage type is selected.
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
When the sparse image parser finds a RAW chunk, it queries the current
file offset in the sparse image and store this for the programming
phase. But the offset is stored in a 32-bit unsigned int, so when the
sparse image passes 4GB the program entries start to refer to the wrong
data.
Split the fill_value and offset into dedicated fields and give them both
their specific type, to avoid any confusion related to the size of these
data types.
Fixes: 02c008adfd ("add support sparse attribute")
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>