There are issues related to the boot_ec:
1. If acpi_ec_remove() is invoked, boot_ec will also be freed, this is not
expected as the boot_ec could be enumerated via ECDT.
2. Address space handler installation/unstallation lead to unexpected _REG
evaluations.
This patch adds acpi_is_boot_ec() check to be used to fix the above issues.
However, since acpi_ec_remove() actually won't be invoked, this patch
doesn't handle the reference counting of "struct acpi_ec", it only ensures
the correctness of the boot_ec destruction during the boot.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=153511
Reported-and-tested-by: Jonh Henderson <jw.hendy@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In order to support full ECDT (driving the ECDT EC after probing the
namespace EC), we need to change our EC device alloc/free algorithm, ensure
not to free old boot EC before qualifying new boot EC.
This patch achieves this by cleaning up first_ec/boot_ec logic:
1. first_ec: used to perform transactions, so it is assigned in new
acpi_ec_setup() function.
2. boot_ec: used to track early EC device, so it is assigned in new
acpi_config_boot_ec() function which explictly tells the driver to save
the EC device as early EC device.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=115021
Reported-and-tested-by: Luya Tshimbalanga <luya@fedoraproject.org>
Tested-by: Jonh Henderson <jw.hendy@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch enables the event freeze mode, flushing the EC event handling in
.suspend() callback. This feature is experimental, if it is bisected out to
be the cause of the real issues, please report the issues to the kernel
bugzilla for further root causing and improvement.
This mode eliminates useless _Qxx handling during the power saving
operations, thus can help to tune the power saving operations faster. Tests
show that this mode can efficiently block flooding _Qxx during the suspend
process and tune the speed of the suspend faster.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch makes 2 changes:
1. Restore old behavior
Originally, EC driver stops handling both events and transactions in
acpi_ec_block_transactions(), and restarts to handle transactions in
acpi_ec_unblock_transactions_early(), restarts to handle both events and
transactions in acpi_ec_unblock_transactions().
While currently, EC driver still stops handling both events and
transactions in acpi_ec_block_transactions(), but restarts to handle both
events and transactions in acpi_ec_unblock_transactions_early().
This patch tries to restore the old behavior by dropping
__acpi_ec_enable_event() from acpi_unblock_transactions_early().
2. Improve old behavior
However this still cannot fix the real issue as both of the
acpi_ec_unblock_xxx() functions are invoked in the noirq stage. Since the
EC driver actually doesn't implement the event handling in the polling
mode, re-enabling the event handling too early in the noirq stage could
result in the problem that if there is no triggering source causing
advance_transaction() to be invoked, pending SCI_EVT cannot be detected by
the EC driver and _Qxx cannot be triggered.
It actually makes sense to restart the event handling in any point during
resuming after the noirq stage. Just like the boot stage where the event
handling is enabled in .add(), this patch further moves
acpi_ec_enable_event() to .resume(). After doing that, the following 2
functions can be combined:
acpi_ec_unblock_transactions_early()/acpi_ec_unblock_transactions().
The differences of the event handling availability between the old behavior
(this patch isn't applied) and the new behavior (this patch is applied) are
as follows:
!Applied Applied
before suspend Y Y
suspend before EC Y Y
suspend after EC Y Y
suspend_late Y Y
suspend_noirq Y (actually N) Y (actually N)
resume_noirq Y (actually N) Y (actually N)
resume_late Y (actually N) Y (actually N)
resume before EC Y (actually N) Y (actually N)
resume after EC Y (actually N) Y
after resume Y (actually N) Y
Where "actually N" means if there is no triggering source, the EC driver
is actually not able to notice the pending SCI_EVT occurred in the noirq
stage. So we can clearly see that this patch has improved the situation.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After enabling the EC event handling, Linux is still in the noirq stage, if
there is no triggering source (EC transaction, GPE STS status),
advance_transaction() will not be invoked and SCI_EVT cannot be detected.
This patch adds one more triggering source after enabling the EC event
handling to poll the pending SCI_EVT.
Known issues:
1. Still no SCI_EVT triggering source
There could still be no SCI_EVT triggering source after handling the
first SCI_EVT (polled by this patch if any). Because after handling the
first SCI_EVT, Linux could still be in noirq stage and there could still
be no further triggering source in this stage. Then the second SCI_EVT
indicated during this stage still cannot be detected by the EC driver.
With this improvement applied, it is then possible to move
acpi_ec_enable_event() out of the noirq stage to fix this issue (if the
first SCI_EVT is handled out of the noirq stage, the follow-up SCI_EVTs
should be able to trigger IRQs).
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is a hidden logic in the EC driver:
1. During boot, EC_FLAGS_QUERY_PENDING is responsible for blocking event
handling;
2. During suspend, EC_FLAGS_STARTED is responsible for blocking event
handling.
This patch uses a new EC_FLAGS_QUERY_ENABLED flag to make this hidden
logic explicit and have code cleaned up. No functional change.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It is reported that on some platforms, resume speed is not fast. The cause
is: in noirq stage, EC driver is working in polling mode, and each state
machine advancement requires a context switch.
The context switch is not necessary to the EC driver's polling mode. This
patch implements PM hooks to automatically switch the driver to/from the
busy polling mode to eliminate the overhead caused by the context switch.
This finally contributes to the tuning result: acpi_pm_finish() execution
time is improved from 192ms to 6ms.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reported-and-tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A regression is caused by the following commit:
Commit: 02b771b64b
Subject: ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations
In this commit, using system workqueue causes that the maximum parallel
executions of _Qxx can exceed 255. This violates the method reentrancy
limit in ACPICA and generates the following error log:
ACPI Error: Method reached maximum reentrancy limit (255) (20150818/dsmethod-341)
This patch creates a seperate workqueue and limits the number of parallel
_Qxx evaluations down to a configurable value (can be tuned against number
of online CPUs).
Since EC events are handled after driver probe, we can create the workqueue
in acpi_ec_init().
Fixes: 02b771b64b (ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=135691
Cc: 4.3+ <stable@vger.kernel.org> # 4.3+
Reported-and-tested-by: Helen Buus <ubuntu@hbuus.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is an order issue in ec_remove_handlers() that acpi_ec_stop()
is called before removing the operation region handler. That is
incorrect, because the operation region handler removal triggers
_REG(DISCONNECT) which may result in new EC transactions to carry
out.
That existing issue has been triggered by the following commit:
Commit: dcf15cbded
Subject: ACPI / EC: Fix a boot EC regresion by restoring boot EC
which changed the driver to call ec_remove_handlers() after invoking
_REG(CONNECT), so the issue has become visible.
Fixes: dcf15cbded (ACPI / EC: Fix a boot EC regresion by restoring boot EC)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=102421
Reported-and-tested-by: Wolfram Sang <wsa@the-dreams.de>
Reported-by: Nicholas <nkudriavtsev@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Our Windows probe result shows that EC._REG is evaluated after evaluating
all _INI/_STA control methods.
With boot EC always switched in acpi_ec_dsdt_probe(), we can see that as
long as there is no EC opregion accesses in the MLC (module level code, AML
code out of any control methods) and in _INI/_STA, there is no need to make
sure that ECDT must be correct.
Bugs of 9399/12461 were reported against an order issue that BAT0/1._STA
evaluations contain EC accesses while the ECDT setting is wrong.
>From the acpidump output posted on bug 9399, we can see that it is actually
a different issue. In this table, if EC._REG is not executed, EC accesses
will be done in a platform specific manner. As we've already ensured not to
execute EC._REG during the eary stage, we can remove the quirks for bug
9399.
From the acpidump output posted on bug 12461, we can see that it still
needs the quirk. In this table, EC._REG flags a named object whose default
value is One, thus BAT1._STA surely should invoke EC accesses whatever we
invoke EC._REG or not. We have to keep the quirk for it before we can root
cause the issue.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Failure handling of the boot EC code is not tidy. This patch cleans
them up with acpi_ec_alloc().
This patch also changes acpi_ec_dsdt_probe(), always switches the
boot EC from the ECDT one to the DSDT one in this function.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to the Windows probing result, during the table loading, the EC
device described in the ECDT should be used. And the ECDT EC is also
effective during the period the namespace objects are initialized (we can
see a separate process executing _STA/_INI on Windows before executing
other device specific control methods, for example, EC._REG). During the
device enumration, the EC device described in the DSDT should be used. But
there are differences between Linux and Windows around the device probing
order. Thus in Linux, we should enable the DSDT EC as early as possible
before enumerating devices in order not to trigger issues related to the
device enumeration order differences.
This patch thus converts acpi_boot_ec_enable() into acpi_ec_dsdt_probe() to
fix the gap. This also fixes a user reported regression triggered after we
switched the "table loading"/"ECDT support" to be ACPI spec 2.0 compliant.
Fixes: 59f0aa9480 (ACPI 2.0 / ECDT: Remove early namespace reference from EC)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=119261
Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All operation region accesses are allowed by AML interpreter when AML is
executed, so actually BIOSen are responsible to avoid the operation region
accesses in AML before OSPM has prepared an operation region driver. This
is done via _REG control method. So AML code normally sets a global named
object REGC to 1 when _REG(3, 1) is evaluated.
Then what is ECDT? Quoting from ACPI spec 6.0, 5.2.15 Embedded Controller
Boot Resources Table (ECDT):
"The presence of this table allows OSPM to provide Embedded Controller
operation region space access before the namespace has been evaluated."
Spec also suggests a compatible mean to indicate the early EC access
availability:
Device (EC)
{
Name (REGC, Ones)
Method (_REG, 2)
{
If (LEqual (Arg0, 3))
{
Store (Arg1, REGC)
}
}
Method (ECAV)
{
If (LEqual (REGC, Ones))
{
If (LGreaterEqual (_REV, 2))
{
Return (One)
}
Else
{
Return (Zero)
}
}
Else
{
Return (REGC)
}
}
}
In this way, it allows EC accesses to happen before EC._REG(3, 1) is
invoked.
But ECAV is not the only way practical BIOSen using to indicate the early
EC access availibility, the known variations include:
1. Setting REGC to One in \_SB._INI when _REV >= 2. Since \_SB._INI is the
first control method evaluated by OSPM during the enumeration, this
allows EC accesses to happen for the entire enumeration process before
the namespace EC is enumerated.
2. Initialize REGC to One by default, this even allows EC accesses to
happen during the table loading.
Linux is now broken around ECDT support during the long term bug fixing
work because it has merged many wrong ECDT bug fixes (see details below).
Linux currently uses namespace EC's settings instead of ECDT settings when
ECDT is detected. This apparently will result in namespace walk and
_CRS/_GPE/_REG evaluations. Such stuffs could only happen after namespace
is ready, while ECDT is purposely to be used before namespace is ready.
The wrong bug fixing story is:
1. Link 1:
At Linux ACPI early stages, "no _Lxx/_Exx/_Qxx evaluation can happen
before the namespace is ready" are not ensured by ACPICA core and Linux.
This is currently ensured by deferred enabling of GPE and defered
registering of EC query methods (acpi_ec_register_query_methods).
2. Link 2:
Reporters reported buggy ECDTs, expecting quirks for the platform.
Originally, the quirk is simple, only doing things with ECDT.
Bug 9399 and 12461 are platforms (Asus L4R, Asus M6R, MSI MS-171F)
reported to have wrong ECDT IO port addresses, the port addresses are
reversed.
Bug 11880 is a platform (Asus X50GL) reported to have 0 valued port
addresses, we can see that all EC accesses are protected by ECAV on
this platform, so actually no early EC accesses is required by this
platform.
3. Link 3:
But when the bug fixing developer was requested to provide a handy and
non-quirk bug fix, he tried to use correct EC settings from namespace
and broke the spec purpose. We can even see that the developer was
suffered from many regrssions. One interesting one is 14086, where the
actual root cause obviously should be: _REG is evaluated too early. But
unfortunately, the bug is fixed in a totally wrong way.
So everything goes wrong from these commits:
Commit: c6cb0e8784
Subject: ACPI: EC: Don't trust ECDT tables from ASUS
Commit: a5032bfdd9
Subject: ACPI: EC: Always parse EC device
This patch reverts Linux behavior to simple ECDT quirk support in order to
stop early _CRS/_GPE/_REG evaluations.
For Bug 9399, 12461, since it is reported that the platforms require early
EC accesses, this patch restores the simple ECDT quirks for them.
For Bug 11880, since it is not reported that the platform requires early EC
accesses and its ACPI tables contain correct ECAV, we choose an ECDT
enumeration failure for this platform.
Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916http://bugzilla.kernel.org/show_bug.cgi?id=10100https://lkml.org/lkml/2008/2/25/282
Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399https://bugzilla.kernel.org/show_bug.cgi?id=12461https://bugzilla.kernel.org/show_bug.cgi?id=11880
Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884https://bugzilla.kernel.org/show_bug.cgi?id=14081https://bugzilla.kernel.org/show_bug.cgi?id=14086https://bugzilla.kernel.org/show_bug.cgi?id=14446
Link 4: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch splits EC_FLAGS_HANDLERS_INSTALLED so that address space handler
can be installed when it is not possible to install GPE handler during
early stage.
This patch also tunes address space handler installation, making it
happening earlier than GPE handler installation for the same purpose.
Since acpi_ec_start()/acpi_ec_stop() will be entered multiple times after
applying this change, it is also required to protect acpi_enable_gpe()/
acpi_disable_gpe() invocations.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The acpi_ec_delete_query() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In acpi_ec_guard_event(), EC transaction state machine variables should be
checked with the EC spinlock locked.
The bug doesn't trigger any real issue now because this bug can only occur
when the ec_event_clearing=event mode is applied while there is no user
currently using this mode.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
1. acpi_ec_remove_query_handlers()
This patch refines the query handler removal logic implemented in
acpi_ec_remove_query_handler(), making it to invoke new
acpi_ec_remove_query_handlers() API, and ensuring all other removal code
paths to invoke the new API to honor the reference count of the query
handlers.
2. acpi_ec_get_query_handler_by_value()
This patch also refines the query handler search logic originally
implemented in acpi_ec_query(), collecting it into
acpi_ec_get_query_handler_by_value(). And since schedule_work() can ensure
the serilization of acpi_ec_event_handler(), we needn't put the
mutex_lock() around schedule_work().
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When query handler is not found, "result" is actually stil 0, and
"struct acpi_ec_query" is not NULL, so the deletion code of
"struct acpi_ec_query" at the end of the function cannot be invoked.
As a consequence, memory leak can be observed.
The issue is introduced by this commit:
Commit: 02b771b64b
Subject: ACPI / EC: Fix an issue caused by the serialized _Qxx
This patch fixes such memory leakage.
Fixes: 02b771b64b (ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations)
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>