kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Lyude Paul	c358ebf596	drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX While I had thought I had fixed this issue in: commit `342406e4fb` ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") It turns out that while I did fix the error messages I was seeing on my P50 when trying to access i2c busses with the GPU in runtime suspend, I accidentally had missed one important detail that was mentioned on the bug report this commit was supposed to fix: that the CPU would only lock up when trying to access i2c busses _on connected devices_ _while the GPU is not in runtime suspend_. Whoops. That definitely explains why I was not able to get my machine to hang with i2c bus interactions until now, as plugging my P50 into it's dock with an HDMI monitor connected allowed me to finally reproduce this locally. Now that I have managed to reproduce this issue properly, it looks like the problem is much simpler then it looks. It turns out that some connected devices, such as MST laptop docks, will actually ACK i2c reads even if no data was actually read: [ 275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1 [ 275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000 [ 275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001 [ 275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000 [ 275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000 [ 275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000 Because we don't handle the situation of i2c ack without any data, we end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the value of cnt always remains at 0. This finally properly explains how this could result in a CPU hang like the ones observed in the aforementioned commit. So, fix this by retrying transactions if no data is written or received, and give up and fail the transaction if we continue to not write or receive any data after 32 retries. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-08-23 12:42:43 +10:00
Ben Skeggs	4d352dbd58	drm/nouveau/secboot/gp102-: remove WAR for SEC2 RTOS start bug Appears to be fixed by "flcn/gp102-: improve implementation of bind_context() on SEC2/GSP". Tested on GP10[24678] and GV100. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:51 +10:00
Ben Skeggs	5210e967d3	drm/nouveau/flcn/gp102-: improve implementation of bind_context() on SEC2/GSP Fixes various issues encountered while attempting to initialise ACR. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:51 +10:00
Yongxin Liu	09b90e2fe3	drm/nouveau: fix memory leak in nouveau_conn_reset() In nouveau_conn_reset(), if connector->state is true, __drm_atomic_helper_connector_destroy_state() will be called, but the memory pointed by asyc isn't freed. Memory leak happens in the following function __drm_atomic_helper_connector_reset(), where newly allocated asyc->state will be assigned to connector->state. So using nouveau_conn_atomic_destroy_state() instead of __drm_atomic_helper_connector_destroy_state to free the "old" asyc. Here the is the log showing memory leak. unreferenced object 0xffff8c5480483c80 (size 192): comm "kworker/0:2", pid 188, jiffies 4294695279 (age 53.179s) hex dump (first 32 bytes): 00 f0 ba 7b 54 8c ff ff 00 00 00 00 00 00 00 00 ...{T........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000005005c0d0>] kmem_cache_alloc_trace+0x195/0x2c0 [<00000000a122baed>] nouveau_conn_reset+0x25/0xc0 [nouveau] [<000000004fd189a2>] nouveau_connector_create+0x3a7/0x610 [nouveau] [<00000000c73343a8>] nv50_display_create+0x343/0x980 [nouveau] [<000000002e2b03c3>] nouveau_display_create+0x51f/0x660 [nouveau] [<00000000c924699b>] nouveau_drm_device_init+0x182/0x7f0 [nouveau] [<00000000cc029436>] nouveau_drm_probe+0x20c/0x2c0 [nouveau] [<000000007e961c3e>] local_pci_probe+0x47/0xa0 [<00000000da14d569>] work_for_cpu_fn+0x1a/0x30 [<0000000028da4805>] process_one_work+0x27c/0x660 [<000000001d415b04>] worker_thread+0x22b/0x3f0 [<0000000003b69f1f>] kthread+0x12f/0x150 [<00000000c94c29b7>] ret_from_fork+0x3a/0x50 Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:51 +10:00
Ralph Campbell	d304654bd7	drm/nouveau/dmem: missing mutex_lock in error path In nouveau_dmem_pages_alloc(), the drm->dmem->mutex is unlocked before calling nouveau_dmem_chunk_alloc() as shown when CONFIG_PROVE_LOCKING is enabled: [ 1294.871933] ===================================== [ 1294.876656] WARNING: bad unlock balance detected! [ 1294.881375] 5.2.0-rc3+ #5 Not tainted [ 1294.885048] ------------------------------------- [ 1294.889773] test-malloc-vra/6299 is trying to release lock (&drm->dmem->mutex) at: [ 1294.897482] [<ffffffffa01a220f>] nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1294.905782] but there are no more locks to release! [ 1294.910690] [ 1294.910690] other info that might help us debug this: [ 1294.917249] 1 lock held by test-malloc-vra/6299: [ 1294.921881] #0: 0000000016e10454 (&mm->mmap_sem#2){++++}, at: nouveau_svmm_bind+0x142/0x210 [nouveau] [ 1294.931313] [ 1294.931313] stack backtrace: [ 1294.935702] CPU: 4 PID: 6299 Comm: test-malloc-vra Not tainted 5.2.0-rc3+ #5 [ 1294.942786] Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1401 05/21/2018 [ 1294.949590] Call Trace: [ 1294.952059] dump_stack+0x7c/0xc0 [ 1294.955469] ? nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1294.962213] print_unlock_imbalance_bug.cold.52+0xca/0xcf [ 1294.967641] lock_release+0x306/0x380 [ 1294.971383] ? nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1294.978089] ? lock_downgrade+0x2d0/0x2d0 [ 1294.982121] ? find_held_lock+0xac/0xd0 [ 1294.985979] __mutex_unlock_slowpath+0x8f/0x3f0 [ 1294.990540] ? wait_for_completion+0x230/0x230 [ 1294.995002] ? rwlock_bug.part.2+0x60/0x60 [ 1294.999197] nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1295.005751] ? page_mapping+0x98/0x110 [ 1295.009511] migrate_vma+0xa74/0x1090 [ 1295.013186] ? move_to_new_page+0x480/0x480 [ 1295.017400] ? __kmalloc+0x153/0x300 [ 1295.021052] ? nouveau_dmem_migrate_vma+0xd8/0x1e0 [nouveau] [ 1295.026796] nouveau_dmem_migrate_vma+0x157/0x1e0 [nouveau] [ 1295.032466] ? nouveau_dmem_init+0x490/0x490 [nouveau] [ 1295.037612] ? vmacache_find+0xc2/0x110 [ 1295.041537] nouveau_svmm_bind+0x1b4/0x210 [nouveau] [ 1295.046583] ? nouveau_svm_fault+0x13e0/0x13e0 [nouveau] [ 1295.051912] drm_ioctl_kernel+0x14d/0x1a0 [ 1295.055930] ? drm_setversion+0x330/0x330 [ 1295.059971] drm_ioctl+0x308/0x530 [ 1295.063384] ? drm_version+0x150/0x150 [ 1295.067153] ? find_held_lock+0xac/0xd0 [ 1295.070996] ? __pm_runtime_resume+0x3f/0xa0 [ 1295.075285] ? mark_held_locks+0x29/0xa0 [ 1295.079230] ? _raw_spin_unlock_irqrestore+0x3c/0x50 [ 1295.084232] ? lockdep_hardirqs_on+0x17d/0x250 [ 1295.088768] nouveau_drm_ioctl+0x9a/0x100 [nouveau] [ 1295.093661] do_vfs_ioctl+0x137/0x9a0 [ 1295.097341] ? ioctl_preallocate+0x140/0x140 [ 1295.101623] ? match_held_lock+0x1b/0x230 [ 1295.105646] ? match_held_lock+0x1b/0x230 [ 1295.109660] ? find_held_lock+0xac/0xd0 [ 1295.113512] ? __do_page_fault+0x324/0x630 [ 1295.117617] ? lock_downgrade+0x2d0/0x2d0 [ 1295.121648] ? mark_held_locks+0x79/0xa0 [ 1295.125583] ? handle_mm_fault+0x352/0x430 [ 1295.129687] ksys_ioctl+0x60/0x90 [ 1295.133020] ? mark_held_locks+0x29/0xa0 [ 1295.136964] __x64_sys_ioctl+0x3d/0x50 [ 1295.140726] do_syscall_64+0x68/0x250 [ 1295.144400] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1295.149465] RIP: 0033:0x7f1a3495809b [ 1295.153053] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48 [ 1295.171850] RSP: 002b:00007ffef7ed1358 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1295.179451] RAX: ffffffffffffffda RBX: 00007ffef7ed1628 RCX: 00007f1a3495809b [ 1295.186601] RDX: 00007ffef7ed13b0 RSI: 0000000040406449 RDI: 0000000000000004 [ 1295.193759] RBP: 00007ffef7ed13b0 R08: 0000000000000000 R09: 000000000157e770 [ 1295.200917] R10: 000000000151c010 R11: 0000000000000246 R12: 0000000040406449 [ 1295.208083] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000 Reacquire the lock before continuing to the next page. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:51 +10:00
Karol Herbst	68bf8b5779	drm/nouveau/hwmon: return EINVAL if the GPU is powered down for sensors reads fixes bogus values userspace gets from hwmon while the GPU is powered down Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:51 +10:00
Ben Skeggs	b0f84a84ff	drm/nouveau: fix bogus GPL-2 license header The bulk SPDX addition made all these files into GPL-2.0 licensed files. However the remainder of the project is MIT-licensed, these files were simply missing the boiler plate and got caught up in the global update. Fixes: `96ac6d4351` (treewide: Add SPDX license identifier - Kbuild) Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:51 +10:00
Ilia Mirkin	b7019ac550	drm/nouveau: fix bogus GPL-2 license header The bulk SPDX addition made all these files into GPL-2.0 licensed files. However the remainder of the project is MIT-licensed, these files (primarily header files) were simply missing the boiler plate and got caught up in the global update. Fixes: `b24413180f` (License cleanup: add SPDX GPL-2.0 license identifier to files with no license) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Lyude Paul	7cb95eeea6	drm/nouveau/i2c: Enable i2c pads & busses during preinit It turns out that while disabling i2c bus access from software when the GPU is suspended was a step in the right direction with: commit `342406e4fb` ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") We also ended up accidentally breaking the vbios init scripts on some older Tesla GPUs, as apparently said scripts can actually use the i2c bus. Since these scripts are executed before initializing any subdevices, we end up failing to acquire access to the i2c bus which has left a number of cards with their fan controllers uninitialized. Luckily this doesn't break hardware - it just means the fan gets stuck at 100%. This also means that we've always been using our i2c busses before initializing them during the init scripts for older GPUs, we just didn't notice it until we started preventing them from being used until init. It's pretty impressive this never caused us any issues before! So, fix this by initializing our i2c pad and busses during subdev pre-init. We skip initializing aux busses during pre-init, as those are guaranteed to only ever be used by nouveau for DP aux transactions. Signed-off-by: Lyude Paul <lyude@redhat.com> Tested-by: Marc Meledandri <m.meledandri@gmail.com> Fixes: `342406e4fb` ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Ben Skeggs	3485b7b50b	drm/nouveau/disp/tu102-: wire up scdc parameter setter Regs seem valid here still, and tested on TU116. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Ben Skeggs	75dec321cd	drm/nouveau/core: recognise TU116 chipset Modesetting only, still waiting on ACR/GR firmware from NVIDIA for Turing graphics/compute bring-up. Each subsystem was compared with traces, along with various tests to check that things generally work as they should, and appears compatible enough with the current TU117 code to enable support. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Ben Skeggs	d108418478	drm/nouveau/kms: disallow dual-link harder if hdmi connection detected The fallthrough cases (pre-Fermi) would accidentally allow dual-link pixel clocks even where they shouldn't be. This leads to a high resolution HDMI displays, connected via a DVI->HDMI adapter, to fail on the original NV50. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Ilia Mirkin	533f475240	drm/nouveau/disp/nv50-: fix center/aspect-corrected scaling Previously center scaling would get scaling applied to it (when it was only supposed to center the image), and aspect-corrected scaling did not always correctly pick whether to reduce width or height for a particular combination of inputs/outputs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110660 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Ilia Mirkin	f8d6211ac7	drm/nouveau/disp/nv50-: force scaler for any non-default LVDS/eDP modes Higher layers tend to add a lot of modes not actually in the EDID, such as the standard DMT modes. Changing this would be extremely intrusive to everyone, so just force the scaler more often. There are no practical cases we're aware of where a LVDS/eDP panel has multiple resolutions exposed, and i915 already does it this way. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110660 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Timo Wiren	bb2b4074f8	drm/nouveau/mcp89/mmu: Use mcp77_mmu_new instead of g84_mmu_new on MCP89. Fix a crash or broken depth testing in all OpenGL applications that use the depth buffer on MCP89 (GeForce 320M) seen on a MacBook Pro Late 2010. The bug is tracked in https://bugs.freedesktop.org/show_bug.cgi?id=108500 Signed-off-by: Timo Wiren <timo.wiren@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-07-19 16:26:50 +10:00
Dave Airlie	3729fe2bc2	Revert "Merge branch 'vmwgfx-next' of git://people.freedesktop.org/~thomash/linux into drm-next" This reverts commit `031e610a6a`, reversing changes made to `52d2d44eee`. The mm changes in there we premature and not fully ack or reviewed by core mm folks, I dropped the ball by merging them via this tree, so lets take em all back out. Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-07-16 04:07:13 +10:00
Dave Airlie	7e4b4dfc98	Revert "mm: adjust apply_to_pfn_range interface for dropped token." This reverts commit `6dfc43d3a1`. Going to revert the whole vmwwgfx pull. Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-07-16 04:06:29 +10:00
Dave Airlie	6dfc43d3a1	mm: adjust apply_to_pfn_range interface for dropped token. mm/pgtable: drop pgtable_t variable from pte_fn_t functions drops the token came in via the hmm tree, this caused lots of conflicts, but applying this cleanup patch should reduce it to something easier to handle. Just accept the token is unused at this point. Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-07-15 15:16:20 +10:00
Dave Airlie	f27b99a1ce	Merge tag 'imx-drm-next-2019-07-05' of git://git.pengutronix.de/git/pza/linux into drm-next drm/imx: IPUv3 image converter improvements, enable scanout FIFO watermark - Fix a saturation bit position in the colorspace converter configuration memory. - Fully describe colorspace conversions in the API to the imx-media driver. - Add support for limited range and Rec.709 YUV encoding. - Enable colorimetry configuration via the media-controller API. - Enable the double write reduction feature for memory bandwidth savings when the image converter writes YUV 4:2:0 output. - Enable a scanout FIFO watermark feature that can increase priority of scanout read transfers at the memory controller whenever the FIFO runs low. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/1562326831.4291.8.camel@pengutronix.de	2019-07-12 07:30:20 +10:00
Dave Airlie	b784d6bff9	Merge tag 'drm-next-5.3-2019-07-09' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.3-2019-07-09: amdgpu: - GPU reset for navi10 - Powerplay fixes for navi10 - GFX fixes for navi10 - Prepare for hmm_range_register API change - XGMI fixes - clang warning fixes - Fixes for various kconfig scenarios - Misc fixes and cleanups amdkfd: - Add workaround for soft hangs with oversubscribed runlists - Remove duplicated pcie atomics request Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190710035017.3407-1-alexander.deucher@amd.com	2019-07-12 06:57:16 +10:00
Alex Deucher	7f963d9f69	drm/amdgpu/navi10: add uclk activity sensor Query the metrics table for the current uclk activity. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-09 17:43:36 -05:00
Alex Deucher	f54eeab4e7	drm/amdgpu: properly guard the generic discovery code It's only available on navi and newer. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-09 17:43:31 -05:00
Alex Deucher	4056278714	drm/amdgpu: add missing documentation on new module parameters New parameters added for navi lack documentation. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-09 17:43:26 -05:00
Marek Olšák	83145f110e	drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback This RELEASE_MEM use has the Release semantic, which means we should write back but not invalidate. Invalidations only make sense with the Acquire semantic (ACQUIRE_MEM), or when RELEASE_MEM is used to do the combined Acquire-Release semantic, which is a barrier, not a fence. The undesirable side effect of doing invalidations for the Release semantic is that it invalidates caches while shaders are running, because the Release can execute in the middle of the next IB. UMDs should use ACQUIRE_MEM at the beginning of IBs. Doing cache invalidations for a fence (like in this case) doesn't do anything for correctness. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-09 17:43:09 -05:00
Arnd Bergmann	5f65ae344f	drm/amd/display: avoid 64-bit division On 32-bit architectures, dividing a 64-bit integer in the kernel leads to a link error: ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Change the two recently introduced instances to a multiply+shift operation that is also much cheaper on 32-bit architectures. We can do that here, since both of them are really 32-bit numbers that change a few percent. Fixes: `bedbbe6af4` ("drm/amd/display: Move link functions from dc to dc_link") Fixes: `f18bc4e53a` ("drm/amd/display: update calculated bounding box logic for NV") Acked-by: Slava Abramov <slava.abramov@amd.com> Tested-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-08 14:27:23 -05:00

1 2 3 4 5 ...

843041 Commits