commit fa897d17313ce94d8d368d29900455e2ce941106 Author: Greg Kroah-Hartman Date: Tue Aug 6 18:29:42 2019 +0200 Linux 4.9.188 commit d73af79742e708a171c1a224759fb0661ed6fcac Author: Vlastimil Babka Date: Fri Aug 2 18:06:14 2019 +0200 x86, mm, gup: prevent get_page() race with munmap in paravirt guest The x86 version of get_user_pages_fast() relies on disabled interrupts to synchronize gup_pte_range() between gup_get_pte(ptep); and get_page() against a parallel munmap. The munmap side nulls the pte, then flushes TLBs, then releases the page. As TLB flush is done synchronously via IPI disabling interrupts blocks the page release, and get_page(), which assumes existing reference on page, is thus safe. However when TLB flush is done by a hypercall, e.g. in a Xen PV guest, there is no blocking thanks to disabled interrupts, and get_page() can succeed on a page that was already freed or even reused. We have recently seen this happen with our 4.4 and 4.12 based kernels, with userspace (java) that exits a thread, where mm_release() performs a futex_wake() on tsk->clear_child_tid, and another thread in parallel unmaps the page where tsk->clear_child_tid points to. The spurious get_page() succeeds, but futex code immediately releases the page again, while it's already on a freelist. Symptoms include a bad page state warning, general protection faults acessing a poisoned list prev/next pointer in the freelist, or free page pcplists of two cpus joined together in a single list. Oscar has also reproduced this scenario, with a patch inserting delays before the get_page() to make the race window larger. Fix this by removing the dependency on TLB flush interrupts the same way as the generic get_user_pages_fast() code by using page_cache_add_speculative() and revalidating the PTE contents after pinning the page. Mainline is safe since 4.13 where the x86 gup code was removed in favor of the common code. Accessing the page table itself safely also relies on disabled interrupts and TLB flush IPIs that don't happen with hypercalls, which was acknowledged in commit 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)"). That commit with follups should also be backported for full safety, although our reproducer didn't hit a problem without that backport. Reproduced-by: Oscar Salvador Signed-off-by: Vlastimil Babka Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juergen Gross Cc: Kirill A. Shutemov Cc: Vitaly Kuznetsov Cc: Linus Torvalds Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Signed-off-by: Greg Kroah-Hartman commit 410b20dd6f219d4fc3ac2c48705fc0f4025e2078 Author: Josh Poimboeuf Date: Wed Oct 31 21:57:30 2018 -0500 objtool: Support GCC 9 cold subfunction naming scheme commit bcb6fb5da77c2a228adf07cc9cb1a0c2aa2001c6 upstream. Starting with GCC 8, a lot of unlikely code was moved out of line to "cold" subfunctions in .text.unlikely. For example, the unlikely bits of: irq_do_set_affinity() are moved out to the following subfunction: irq_do_set_affinity.cold.49() Starting with GCC 9, the numbered suffix has been removed. So in the above example, the cold subfunction is instead: irq_do_set_affinity.cold() Tweak the objtool subfunction detection logic so that it detects both GCC 8 and GCC 9 naming schemes. Reported-by: Peter Zijlstra (Intel) Signed-off-by: Josh Poimboeuf Signed-off-by: Thomas Gleixner Tested-by: Peter Zijlstra (Intel) Acked-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/015e9544b1f188d36a7f02fa31e9e95629aa5f50.1541040800.git.jpoimboe@redhat.com Signed-off-by: Greg Kroah-Hartman commit 2c34c215c1028f971af7cf8189c4f6237afdb90b Author: Miguel Ojeda Date: Fri Aug 2 12:37:57 2019 +0200 include/linux/module.h: copy __init/__exit attrs to init/cleanup_module commit a6e60d84989fa0e91db7f236eda40453b0e44afa upstream. The upcoming GCC 9 release extends the -Wmissing-attributes warnings (enabled by -Wall) to C and aliases: it warns when particular function attributes are missing in the aliases but not in their target. In particular, it triggers for all the init/cleanup_module aliases in the kernel (defined by the module_init/exit macros), ending up being very noisy. These aliases point to the __init/__exit functions of a module, which are defined as __cold (among other attributes). However, the aliases themselves do not have the __cold attribute. Since the compiler behaves differently when compiling a __cold function as well as when compiling paths leading to calls to __cold functions, the warning is trying to point out the possibly-forgotten attribute in the alias. In order to keep the warning enabled, we decided to silence this case. Ideally, we would mark the aliases directly as __init/__exit. However, there are currently around 132 modules in the kernel which are missing __init/__exit in their init/cleanup functions (either because they are missing, or for other reasons, e.g. the functions being called from somewhere else); and a section mismatch is a hard error. A conservative alternative was to mark the aliases as __cold only. However, since we would like to eventually enforce __init/__exit to be always marked, we chose to use the new __copy function attribute (introduced by GCC 9 as well to deal with this). With it, we copy the attributes used by the target functions into the aliases. This way, functions that were not marked as __init/__exit won't have their aliases marked either, and therefore there won't be a section mismatch. Note that the warning would go away marking either the extern declaration, the definition, or both. However, we only mark the definition of the alias, since we do not want callers (which only see the declaration) to be compiled as if the function was __cold (and therefore the paths leading to those calls would be assumed to be unlikely). Link: https://lore.kernel.org/lkml/259986242.BvXPX32bHu@devpool35/ Cc: Link: https://lore.kernel.org/lkml/20190123173707.GA16603@gmail.com/ Link: https://lore.kernel.org/lkml/20190206175627.GA20399@gmail.com/ Suggested-by: Martin Sebor Acked-by: Jessica Yu Signed-off-by: Miguel Ojeda Signed-off-by: Greg Kroah-Hartman commit fe5844365ec6c4d61838f289926f4d55da94d2fb Author: Miguel Ojeda Date: Fri Aug 2 12:37:56 2019 +0200 Backport minimal compiler_attributes.h to support GCC 9 This adds support for __copy to v4.9.y so that we can use it in init/exit_module to avoid -Werror=missing-attributes errors on GCC 9. Link: https://lore.kernel.org/lkml/259986242.BvXPX32bHu@devpool35/ Cc: Suggested-by: Rolf Eike Beer Signed-off-by: Miguel Ojeda Signed-off-by: Greg Kroah-Hartman commit 0b326994892a2f8ed76cd4867c3aebcec9e6b54d Author: Jean Delvare Date: Sun Jul 28 18:41:38 2019 +0200 eeprom: at24: make spd world-readable again commit 25e5ef302c24a6fead369c0cfe88c073d7b97ca8 upstream. The integration of the at24 driver into the nvmem framework broke the world-readability of spd EEPROMs. Fix it. Signed-off-by: Jean Delvare Cc: stable@vger.kernel.org Fixes: 57d155506dd5 ("eeprom: at24: extend driver to plug into the NVMEM framework") Cc: Andrew Lunn Cc: Srinivas Kandagatla Cc: Greg Kroah-Hartman Cc: Bartosz Golaszewski Cc: Arnd Bergmann Signed-off-by: Bartosz Golaszewski [Bartosz: backported the patch to older branches] Signed-off-by: Bartosz Golaszewski Signed-off-by: Greg Kroah-Hartman commit d4fc64c92713f6962672439739d0f0dc4117cb5b Author: Andrea Arcangeli Date: Thu Jun 13 15:56:11 2019 -0700 coredump: fix race condition between collapse_huge_page() and core dumping commit 59ea6d06cfa9247b586a695c21f94afa7183af74 upstream. When fixing the race conditions between the coredump and the mmap_sem holders outside the context of the process, we focused on mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping"), but those aren't the only cases where the mmap_sem can be taken outside of the context of the process as Michal Hocko noticed while backporting that commit to older -stable kernels. If mmgrab() is called in the context of the process, but then the mm_count reference is transferred outside the context of the process, that can also be a problem if the mmap_sem has to be taken for writing through that mm_count reference. khugepaged registration calls mmgrab() in the context of the process, but the mmap_sem for writing is taken later in the context of the khugepaged kernel thread. collapse_huge_page() after taking the mmap_sem for writing doesn't modify any vma, so it's not obvious that it could cause a problem to the coredump, but it happens to modify the pmd in a way that breaks an invariant that pmd_trans_huge_lock() relies upon. collapse_huge_page() needs the mmap_sem for writing just to block concurrent page faults that call pmd_trans_huge_lock(). Specifically the invariant that "!pmd_trans_huge()" cannot become a "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs. The coredump will call __get_user_pages() without mmap_sem for reading, which eventually can invoke a lockless page fault which will need a functional pmd_trans_huge_lock(). So collapse_huge_page() needs to use mmget_still_valid() to check it's not running concurrently with the coredump... as long as the coredump can invoke page faults without holding the mmap_sem for reading. This has "Fixes: khugepaged" to facilitate backporting, but in my view it's more a bug in the coredump code that will eventually have to be rewritten to stop invoking page faults without the mmap_sem for reading. So the long term plan is still to drop all mmget_still_valid(). Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com Fixes: ba76149f47d8 ("thp: khugepaged") Signed-off-by: Andrea Arcangeli Reported-by: Michal Hocko Acked-by: Michal Hocko Acked-by: Kirill A. Shutemov Cc: Oleg Nesterov Cc: Jann Horn Cc: Hugh Dickins Cc: Mike Rapoport Cc: Mike Kravetz Cc: Peter Xu Cc: Jason Gunthorpe Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [Ajay: Just adjusted to apply on v4.9] Signed-off-by: Ajay Kaher Signed-off-by: Greg Kroah-Hartman commit 91c3a6ce3a0a652921cc3f9feaaf2ab7cecbea96 Author: Ajay Kaher Date: Sun Aug 4 09:29:26 2019 +0530 infiniband: fix race condition between infiniband mlx4, mlx5 driver and core dumping This patch is the extension of following upstream commit to fix the race condition between get_task_mm() and core dumping for IB->mlx4 and IB->mlx5 drivers: commit 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping")' Thanks to Jason for pointing this. Signed-off-by: Ajay Kaher Reviewed-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 16903f1a5ba7707c051edfdfa457620bba45e2c9 Author: Andrea Arcangeli Date: Thu Apr 18 17:50:52 2019 -0700 coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping commit 04f5866e41fb70690e28397487d8bd8eea7d712a upstream. The core dumping code has always run without holding the mmap_sem for writing, despite that is the only way to ensure that the entire vma layout will not change from under it. Only using some signal serialization on the processes belonging to the mm is not nearly enough. This was pointed out earlier. For example in Hugh's post from Jul 2017: https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils "Not strictly relevant here, but a related note: I was very surprised to discover, only quite recently, how handle_mm_fault() may be called without down_read(mmap_sem) - when core dumping. That seems a misguided optimization to me, which would also be nice to correct" In particular because the growsdown and growsup can move the vm_start/vm_end the various loops the core dump does around the vma will not be consistent if page faults can happen concurrently. Pretty much all users calling mmget_not_zero()/get_task_mm() and then taking the mmap_sem had the potential to introduce unexpected side effects in the core dumping code. Adding mmap_sem for writing around the ->core_dump invocation is a viable long term fix, but it requires removing all copy user and page faults and to replace them with get_dump_page() for all binary formats which is not suitable as a short term fix. For the time being this solution manually covers the places that can confuse the core dump either by altering the vma layout or the vma flags while it runs. Once ->core_dump runs under mmap_sem for writing the function mmget_still_valid() can be dropped. Allowing mmap_sem protected sections to run in parallel with the coredump provides some minor parallelism advantage to the swapoff code (which seems to be safe enough by never mangling any vma field and can keep doing swapins in parallel to the core dumping) and to some other corner case. In order to facilitate the backporting I added "Fixes: 86039bd3b4e6" however the side effect of this same race condition in /proc/pid/mem should be reproducible since before 2.6.12-rc2 so I couldn't add any other "Fixes:" because there's no hash beyond the git genesis commit. Because find_extend_vma() is the only location outside of the process context that could modify the "mm" structures under mmap_sem for reading, by adding the mmget_still_valid() check to it, all other cases that take the mmap_sem for reading don't need the new check after mmget_not_zero()/get_task_mm(). The expand_stack() in page fault context also doesn't need the new check, because all tasks under core dumping are frozen. Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Andrea Arcangeli Reported-by: Jann Horn Suggested-by: Oleg Nesterov Acked-by: Peter Xu Reviewed-by: Mike Rapoport Reviewed-by: Oleg Nesterov Reviewed-by: Jann Horn Acked-by: Jason Gunthorpe Acked-by: Michal Hocko Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [akaher@vmware.com: stable 4.9 backport - handle binder_update_page_range - mhocko@suse.com] Signed-off-by: Ajay Kaher Signed-off-by: Greg Kroah-Hartman commit ed56c2497a3ecdf7ff0e0ff33a2bb80f74ebbc56 Author: Yishai Hadas Date: Tue Jul 23 09:57:29 2019 +0300 IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification commit b7165bd0d6cbb93732559be6ea8774653b204480 upstream. The specification for the Toeplitz function doesn't require to set the key explicitly to be symmetric. In case a symmetric functionality is required a symmetric key can be simply used. Wrongly forcing the algorithm to symmetric causes the wrong packet distribution and a performance degradation. Link: https://lore.kernel.org/r/20190723065733.4899-7-leon@kernel.org Cc: # 4.7 Fixes: 28d6137008b2 ("IB/mlx5: Add RSS QP support") Signed-off-by: Yishai Hadas Reviewed-by: Alex Vainman Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 436cd8a992c3c66b3d80d7842f9f65d8f0d2a7f4 Author: Juergen Gross Date: Fri Jun 14 07:46:02 2019 +0200 xen/swiotlb: fix condition for calling xen_destroy_contiguous_region() commit 50f6393f9654c561df4cdcf8e6cfba7260143601 upstream. The condition in xen_swiotlb_free_coherent() for deciding whether to call xen_destroy_contiguous_region() is wrong: in case the region to be freed is not contiguous calling xen_destroy_contiguous_region() is the wrong thing to do: it would result in inconsistent mappings of multiple PFNs to the same MFN. This will lead to various strange crashes or data corruption. Instead of calling xen_destroy_contiguous_region() in that case a warning should be issued as that situation should never occur. Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky Reviewed-by: Jan Beulich Acked-by: Konrad Rzeszutek Wilk Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman commit 9224305cc02b5012496f530e6468a3e06a7f6618 Author: Will Deacon Date: Mon Jul 29 11:43:48 2019 +0100 drivers/perf: arm_pmu: Fix failure path in PM notifier commit 0d7fd70f26039bd4b33444ca47f0e69ce3ae0354 upstream. Handling of the CPU_PM_ENTER_FAILED transition in the Arm PMU PM notifier code incorrectly skips restoration of the counters. Fix the logic so that CPU_PM_ENTER_FAILED follows the same path as CPU_PM_EXIT. Cc: Fixes: da4e4f18afe0f372 ("drivers/perf: arm_pmu: implement CPU_PM notifier") Reported-by: Anders Roxell Acked-by: Lorenzo Pieralisi Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 0a87819292e88db7717a5d52f268a47563c53f19 Author: Stefan Haberland Date: Thu Aug 1 13:06:30 2019 +0200 s390/dasd: fix endless loop after read unit address configuration commit 41995342b40c418a47603e1321256d2c4a2ed0fb upstream. After getting a storage server event that causes the DASD device driver to update its unit address configuration during a device shutdown there is the possibility of an endless loop in the device driver. In the system log there will be ongoing DASD error messages with RC: -19. The reason is that the loop starting the ruac request only terminates when the retry counter is decreased to 0. But in the sleep_on function there are early exit paths that do not decrease the retry counter. Prevent an endless loop by handling those cases separately. Remove the unnecessary do..while loop since the sleep_on function takes care of retries by itself. Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1") Cc: stable@vger.kernel.org # 2.6.25+ Signed-off-by: Stefan Haberland Reviewed-by: Jan Hoeppner Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit ae190f04359d04333d7c4ff24673f1b76e974260 Author: Ondrej Mosnacek Date: Thu Jul 25 12:52:43 2019 +0200 selinux: fix memory leak in policydb_init() commit 45385237f65aeee73641f1ef737d7273905a233f upstream. Since roles_init() adds some entries to the role hash table, we need to destroy also its keys/values on error, otherwise we get a memory leak in the error path. Cc: Reported-by: syzbot+fee3a14d4cdf92646287@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ondrej Mosnacek Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit 7b18585947c305e254429fc85db4d37890c58c97 Author: Michael Wu Date: Mon Jul 8 13:23:08 2019 +0800 gpiolib: fix incorrect IRQ requesting of an active-low lineevent commit 223ecaf140b1dd1c1d2a1a1d96281efc5c906984 upstream. When a pin is active-low, logical trigger edge should be inverted to match the same interrupt opportunity. For example, a button pushed triggers falling edge in ACTIVE_HIGH case; in ACTIVE_LOW case, the button pushed triggers rising edge. For user space the IRQ requesting doesn't need to do any modification except to configuring GPIOHANDLE_REQUEST_ACTIVE_LOW. For example, we want to catch the event when the button is pushed. The button on the original board drives level to be low when it is pushed, and drives level to be high when it is released. In user space we can do: req.handleflags = GPIOHANDLE_REQUEST_INPUT; req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE; while (1) { read(fd, &dat, sizeof(dat)); if (dat.id == GPIOEVENT_EVENT_FALLING_EDGE) printf("button pushed\n"); } Run the same logic on another board which the polarity of the button is inverted; it drives level to be high when pushed, and level to be low when released. For this inversion we add flag GPIOHANDLE_REQUEST_ACTIVE_LOW: req.handleflags = GPIOHANDLE_REQUEST_INPUT | GPIOHANDLE_REQUEST_ACTIVE_LOW; req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE; At the result, there are no any events caught when the button is pushed. By the way, button releasing will emit a "falling" event. The timing of "falling" catching is not expected. Cc: stable@vger.kernel.org Signed-off-by: Michael Wu Tested-by: Bartosz Golaszewski Signed-off-by: Bartosz Golaszewski Signed-off-by: Greg Kroah-Hartman commit 9d06bfcb68c09ba87d15c439a673b4c26ea0f0db Author: Douglas Anderson Date: Mon Jul 8 12:56:13 2019 -0700 mmc: dw_mmc: Fix occasional hang after tuning on eMMC commit ba2d139b02ba684c6c101de42fed782d6cd2b997 upstream. In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.") we fixed a tuning-induced hang that I saw when stress testing tuning on certain SD cards. I won't re-hash that whole commit, but the summary is that as a normal part of tuning you need to deal with transfer errors and there were cases where these transfer errors was putting my system into a bad state causing all future transfers to fail. That commit fixed handling of the transfer errors for me. In downstream Chrome OS my fix landed and had the same behavior for all SD/MMC commands. However, it looks like when the commit landed upstream we limited it to only SD tuning commands. Presumably this was to try to get around problems that Alim Akhtar reported on exynos [1]. Unfortunately while stress testing reboots (and suspend/resume) on some rk3288-based Chromebooks I found the same problem on the eMMC on some of my Chromebooks (the ones with Hynix eMMC). Since the eMMC tuning command is different (MMC_SEND_TUNING_BLOCK_HS200 vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the same situation. I'm hoping that whatever problems exynos was having in the past are somehow magically fixed now and we can make the behavior the same for all commands. [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.") Signed-off-by: Douglas Anderson Cc: Marek Szyprowski Cc: Alim Akhtar Cc: Enric Balletbo i Serra Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 09c63dcc30402c03f10c128ecd6ccf776ca174b1 Author: Filipe Manana Date: Wed Jul 17 13:23:39 2019 +0100 Btrfs: fix incremental send failure after deduplication commit b4f9a1a87a48c255bb90d8a6c3d555a1abb88130 upstream. When doing an incremental send operation we can fail if we previously did deduplication operations against a file that exists in both snapshots. In that case we will fail the send operation with -EIO and print a message to dmesg/syslog like the following: BTRFS error (device sdc): Send: inconsistent snapshot, found updated \ extent for inode 257 without updated inode item, send root is 258, \ parent root is 257 This requires that we deduplicate to the same file in both snapshots for the same amount of times on each snapshot. The issue happens because a deduplication only updates the iversion of an inode and does not update any other field of the inode, therefore if we deduplicate the file on each snapshot for the same amount of time, the inode will have the same iversion value (stored as the "sequence" field on the inode item) on both snapshots, therefore it will be seen as unchanged between in the send snapshot while there are new/updated/deleted extent items when comparing to the parent snapshot. This makes the send operation return -EIO and print an error message. Example reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt # Create our first file. The first half of the file has several 64Kb # extents while the second half as a single 512Kb extent. $ xfs_io -f -s -c "pwrite -S 0xb8 -b 64K 0 512K" /mnt/foo $ xfs_io -c "pwrite -S 0xb8 512K 512K" /mnt/foo # Create the base snapshot and the parent send stream from it. $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1 $ btrfs send -f /tmp/1.snap /mnt/mysnap1 # Create our second file, that has exactly the same data as the first # file. $ xfs_io -f -c "pwrite -S 0xb8 0 1M" /mnt/bar # Create the second snapshot, used for the incremental send, before # doing the file deduplication. $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2 # Now before creating the incremental send stream: # # 1) Deduplicate into a subrange of file foo in snapshot mysnap1. This # will drop several extent items and add a new one, also updating # the inode's iversion (sequence field in inode item) by 1, but not # any other field of the inode; # # 2) Deduplicate into a different subrange of file foo in snapshot # mysnap2. This will replace an extent item with a new one, also # updating the inode's iversion by 1 but not any other field of the # inode. # # After these two deduplication operations, the inode items, for file # foo, are identical in both snapshots, but we have different extent # items for this inode in both snapshots. We want to check this doesn't # cause send to fail with an error or produce an incorrect stream. $ xfs_io -r -c "dedupe /mnt/bar 0 0 512K" /mnt/mysnap1/foo $ xfs_io -r -c "dedupe /mnt/bar 512K 512K 512K" /mnt/mysnap2/foo # Create the incremental send stream. $ btrfs send -p /mnt/mysnap1 -f /tmp/2.snap /mnt/mysnap2 ERROR: send ioctl failed with -5: Input/output error This issue started happening back in 2015 when deduplication was updated to not update the inode's ctime and mtime and update only the iversion. Back then we would hit a BUG_ON() in send, but later in 2016 send was updated to return -EIO and print the error message instead of doing the BUG_ON(). A test case for fstests follows soon. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203933 Fixes: 1c919a5e13702c ("btrfs: don't update mtime/ctime on deduped inodes") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 757ee02703cda68a0f1d56746e5c0292e975c319 Author: Masahiro Yamada Date: Mon Jul 29 18:15:17 2019 +0900 kbuild: initialize CLANG_FLAGS correctly in the top Makefile commit 5241ab4cf42d3a93b933b55d3d53f43049081fa1 upstream. CLANG_FLAGS is initialized by the following line: CLANG_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%)) ..., which is run only when CROSS_COMPILE is set. Some build targets (bindeb-pkg etc.) recurse to the top Makefile. When you build the kernel with Clang but without CROSS_COMPILE, the same compiler flags such as -no-integrated-as are accumulated into CLANG_FLAGS. If you run 'make CC=clang' and then 'make CC=clang bindeb-pkg', Kbuild will recompile everything needlessly due to the build command change. Fix this by correctly initializing CLANG_FLAGS. Fixes: 238bcbc4e07f ("kbuild: consolidate Clang compiler flags") Cc: # v5.0+ Signed-off-by: Masahiro Yamada Reviewed-by: Nathan Chancellor Acked-by: Nick Desaulniers Signed-off-by: Greg Kroah-Hartman commit b2ca435ce65e5f1016d5ae698cb71c9c436ea3fb Author: Zhenzhong Duan Date: Tue Jul 16 21:18:12 2019 +0800 x86, boot: Remove multiple copy of static function sanitize_boot_params() [ Upstream commit 8c5477e8046ca139bac250386c08453da37ec1ae ] Kernel build warns: 'sanitize_boot_params' defined but not used [-Wunused-function] at below files: arch/x86/boot/compressed/cmdline.c arch/x86/boot/compressed/error.c arch/x86/boot/compressed/early_serial_console.c arch/x86/boot/compressed/acpi.c That's becausethey each include misc.h which includes a definition of sanitize_boot_params() via bootparam_utils.h. Remove the inclusion from misc.h and have the c file including bootparam_utils.h directly. Signed-off-by: Zhenzhong Duan Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/1563283092-1189-1-git-send-email-zhenzhong.duan@oracle.com Signed-off-by: Sasha Levin commit 4d2bf5798ebf009fa1c82aeb87986da7fc10efe0 Author: Josh Poimboeuf Date: Wed Jul 17 20:36:39 2019 -0500 x86/kvm: Don't call kvm_spurious_fault() from .fixup [ Upstream commit 3901336ed9887b075531bffaeef7742ba614058b ] After making a change to improve objtool's sibling call detection, it started showing the following warning: arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame The problem is the ____kvm_handle_fault_on_reboot() macro. It does a fake call by pushing a fake RIP and doing a jump. That tricks the unwinder into printing the function which triggered the exception, rather than the .fixup code. Instead of the hack to make it look like the original function made the call, just change the macro so that the original function actually does make the call. This allows removal of the hack, and also makes objtool happy. I triggered a vmx instruction exception and verified that the stack trace is still sane: kernel BUG at arch/x86/kvm/x86.c:358! invalid opcode: 0000 [#1] SMP PTI CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16 Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017 RIP: 0010:kvm_spurious_fault+0x5/0x10 Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246 RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000 RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000 R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0 R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000 FS: 00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: loaded_vmcs_init+0x4f/0xe0 alloc_loaded_vmcs+0x38/0xd0 vmx_create_vcpu+0xf7/0x600 kvm_vm_ioctl+0x5e9/0x980 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? free_one_page+0x13f/0x4e0 do_vfs_ioctl+0xa4/0x630 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x1c0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fa349b1ee5b Signed-off-by: Josh Poimboeuf Signed-off-by: Thomas Gleixner Acked-by: Paolo Bonzini Acked-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.1563413318.git.jpoimboe@redhat.com Signed-off-by: Sasha Levin commit 62369d5c014f611490bad4bad6b74ce2181858d6 Author: Kees Cook Date: Tue Jul 16 16:30:21 2019 -0700 ipc/mqueue.c: only perform resource calculation if user valid [ Upstream commit a318f12ed8843cfac53198390c74a565c632f417 ] Andreas Christoforou reported: UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow: 9 * 2305843009213693951 cannot be represented in type 'long int' ... Call Trace: mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414 evict+0x472/0x8c0 fs/inode.c:558 iput_final fs/inode.c:1547 [inline] iput+0x51d/0x8c0 fs/inode.c:1573 mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320 mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459 vfs_mkobj+0x39e/0x580 fs/namei.c:2892 prepare_open ipc/mqueue.c:731 [inline] do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771 Which could be triggered by: struct mq_attr attr = { .mq_flags = 0, .mq_maxmsg = 9, .mq_msgsize = 0x1fffffffffffffff, .mq_curmsgs = 0, }; if (mq_open("/testing", 0x40, 3, &attr) == (mqd_t) -1) perror("mq_open"); mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and preparing to return -EINVAL. During the cleanup, it calls mqueue_evict_inode() which performed resource usage tracking math for updating "user", before checking if there was a valid "user" at all (which would indicate that the calculations would be sane). Instead, delay this check to after seeing a valid "user". The overflow was real, but the results went unused, so while the flaw is harmless, it's noisy for kernel fuzzers, so just fix it by moving the calculation under the non-NULL "user" where it actually gets used. Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescook Signed-off-by: Kees Cook Reported-by: Andreas Christoforou Acked-by: "Eric W. Biederman" Cc: Al Viro Cc: Arnd Bergmann Cc: Davidlohr Bueso Cc: Manfred Spraul Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit bcebb4419ec723f3ffc29f8560f0b4c0d0910ff3 Author: Dan Carpenter Date: Tue Jul 16 16:30:03 2019 -0700 drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings [ Upstream commit 156e0b1a8112b76e351684ac948c59757037ac36 ] The dev_info.name[] array has space for RIO_MAX_DEVNAME_SZ + 1 characters. But the problem here is that we don't ensure that the user put a NUL terminator on the end of the string. It could lead to an out of bounds read. Link: http://lkml.kernel.org/r/20190529110601.GB19119@mwanda Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: Dan Carpenter Acked-by: Alexandre Bounine Cc: Ira Weiny Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 317fc4dd5212208734d99ae87bdbb84aa0a7bcec Author: Mikko Rapeli Date: Tue Jul 16 16:28:10 2019 -0700 uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers [ Upstream commit f90fb3c7e2c13ae829db2274b88b845a75038b8a ] Only users of upc_req in kernel side fs/coda/psdev.c and fs/coda/upcall.c already include linux/coda_psdev.h. Suggested by Jan Harkes in https://lore.kernel.org/lkml/20150531111913.GA23377@cs.cmu.edu/ Fixes these include/uapi/linux/coda_psdev.h compilation errors in userspace: linux/coda_psdev.h:12:19: error: field `uc_chain' has incomplete type struct list_head uc_chain; ^ linux/coda_psdev.h:13:2: error: unknown type name `caddr_t' caddr_t uc_data; ^ linux/coda_psdev.h:14:2: error: unknown type name `u_short' u_short uc_flags; ^ linux/coda_psdev.h:15:2: error: unknown type name `u_short' u_short uc_inSize; /* Size is at most 5000 bytes */ ^ linux/coda_psdev.h:16:2: error: unknown type name `u_short' u_short uc_outSize; ^ linux/coda_psdev.h:17:2: error: unknown type name `u_short' u_short uc_opcode; /* copied from data to save lookup */ ^ linux/coda_psdev.h:19:2: error: unknown type name `wait_queue_head_t' wait_queue_head_t uc_sleep; /* process' wait queue */ ^ Link: http://lkml.kernel.org/r/9f99f5ce6a0563d5266e6cf7aa9585aac2cae971.1558117389.git.jaharkes@cs.cmu.edu Signed-off-by: Mikko Rapeli Signed-off-by: Jan Harkes Cc: Arnd Bergmann Cc: Colin Ian King Cc: Dan Carpenter Cc: David Howells Cc: Fabian Frederick Cc: Sam Protsenko Cc: Yann Droneaud Cc: Zhouyang Jia Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 90320a506cbe57a339c59fce30eea003f28f0354 Author: Sam Protsenko Date: Tue Jul 16 16:28:20 2019 -0700 coda: fix build using bare-metal toolchain [ Upstream commit b2a57e334086602be56b74958d9f29b955cd157f ] The kernel is self-contained project and can be built with bare-metal toolchain. But bare-metal toolchain doesn't define __linux__. Because of this u_quad_t type is not defined when using bare-metal toolchain and codafs build fails. This patch fixes it by defining u_quad_t type unconditionally. Link: http://lkml.kernel.org/r/3cbb40b0a57b6f9923a9d67b53473c0b691a3eaa.1558117389.git.jaharkes@cs.cmu.edu Signed-off-by: Sam Protsenko Signed-off-by: Jan Harkes Cc: Arnd Bergmann Cc: Colin Ian King Cc: Dan Carpenter Cc: David Howells Cc: Fabian Frederick Cc: Mikko Rapeli Cc: Yann Droneaud Cc: Zhouyang Jia Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit ca62806b39552cfcf62db15bdc7247d464fe0023 Author: Zhouyang Jia Date: Tue Jul 16 16:28:13 2019 -0700 coda: add error handling for fget [ Upstream commit 02551c23bcd85f0c68a8259c7b953d49d44f86af ] When fget fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling fget. Link: http://lkml.kernel.org/r/2514ec03df9c33b86e56748513267a80dd8004d9.1558117389.git.jaharkes@cs.cmu.edu Signed-off-by: Zhouyang Jia Signed-off-by: Jan Harkes Cc: Arnd Bergmann Cc: Colin Ian King Cc: Dan Carpenter Cc: David Howells Cc: Fabian Frederick Cc: Mikko Rapeli Cc: Sam Protsenko Cc: Yann Droneaud Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit ad2041109001a00175ad19fa0b1f8f9b4aebaeb4 Author: Doug Berger Date: Tue Jul 16 16:26:24 2019 -0700 mm/cma.c: fail if fixed declaration can't be honored [ Upstream commit c633324e311243586675e732249339685e5d6faa ] The description of cma_declare_contiguous() indicates that if the 'fixed' argument is true the reserved contiguous area must be exactly at the address of the 'base' argument. However, the function currently allows the 'base', 'size', and 'limit' arguments to be silently adjusted to meet alignment constraints. This commit enforces the documented behavior through explicit checks that return an error if the region does not fit within a specified region. Link: http://lkml.kernel.org/r/1561422051-16142-1-git-send-email-opendmb@gmail.com Fixes: 5ea3b1b2f8ad ("cma: add placement specifier for "cma=" kernel parameter") Signed-off-by: Doug Berger Acked-by: Michal Nazarewicz Cc: Yue Hu Cc: Mike Rapoport Cc: Laura Abbott Cc: Peng Fan Cc: Thomas Gleixner Cc: Marek Szyprowski Cc: Andrey Konovalov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit bf4e8f2a81d128077e4cfa6253d2189778f11923 Author: Arnd Bergmann Date: Fri Jul 12 11:08:05 2019 +0200 x86: math-emu: Hide clang warnings for 16-bit overflow [ Upstream commit 29e7e9664aec17b94a9c8c5a75f8d216a206aa3a ] clang warns about a few parts of the math-emu implementation where a 16-bit integer becomes negative during assignment: arch/x86/math-emu/poly_tan.c:88:35: error: implicit conversion from 'int' to 'short' changes value from 49216 to -16320 [-Werror,-Wconstant-conversion] (0x41 + EXTENDED_Ebias) | SIGN_Negative); ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~ arch/x86/math-emu/fpu_emu.h:180:58: note: expanded from macro 'setexponent16' #define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); } ~ ^ arch/x86/math-emu/reg_constant.c:37:32: error: implicit conversion from 'int' to 'short' changes value from 49085 to -16451 [-Werror,-Wconstant-conversion] FPU_REG const CONST_PI2extra = MAKE_REG(NEG, -66, ^~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:48:28: error: implicit conversion from 'int' to 'short' changes value from 65535 to -1 [-Werror,-Wconstant-conversion] FPU_REG const CONST_QNaN = MAKE_REG(NEG, EXP_OVER, 0x00000000, 0xC0000000); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~ The code is correct as is, so add a typecast to shut up the warnings. Signed-off-by: Arnd Bergmann Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/20190712090816.350668-1-arnd@arndb.de Signed-off-by: Sasha Levin commit d4ce30c91b9ddc446439187a24ddd47dcee04377 Author: Qian Cai Date: Mon Jul 8 17:36:45 2019 -0400 x86/apic: Silence -Wtype-limits compiler warnings [ Upstream commit ec6335586953b0df32f83ef696002063090c7aef ] There are many compiler warnings like this, In file included from ./arch/x86/include/asm/smp.h:13, from ./arch/x86/include/asm/mmzone_64.h:11, from ./arch/x86/include/asm/mmzone.h:5, from ./include/linux/mmzone.h:969, from ./include/linux/gfp.h:6, from ./include/linux/mm.h:10, from arch/x86/kernel/apic/io_apic.c:34: arch/x86/kernel/apic/io_apic.c: In function 'check_timer': ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X " ^~~~~~~~~~~ ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: " ^~~~~~~~~~~ APIC_QUIET is 0, so silence them by making apic_verbosity type int. Signed-off-by: Qian Cai Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pw Signed-off-by: Sasha Levin commit 0b4f4a4b18dd104b9ddb13e02e90679c04825fb9 Author: Benjamin Poirier Date: Tue Jul 16 17:16:55 2019 +0900 be2net: Signal that the device cannot transmit during reconfiguration [ Upstream commit 7429c6c0d9cb086d8e79f0d2a48ae14851d2115e ] While changing the number of interrupt channels, be2net stops adapter operation (including netif_tx_disable()) but it doesn't signal that it cannot transmit. This may lead dev_watchdog() to falsely trigger during that time. Add the missing call to netif_carrier_off(), following the pattern used in many other drivers. netif_carrier_on() is already taken care of in be_open(). Signed-off-by: Benjamin Poirier Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0153dbcbd4c842c38955b7d80a4708bb012052a7 Author: Arnd Bergmann Date: Fri Jul 12 11:01:21 2019 +0200 ACPI: fix false-positive -Wuninitialized warning [ Upstream commit dfd6f9ad36368b8dbd5f5a2b2f0a4705ae69a323 ] clang gets confused by an uninitialized variable in what looks to it like a never executed code path: arch/x86/kernel/acpi/boot.c:618:13: error: variable 'polarity' is uninitialized when used here [-Werror,-Wuninitialized] polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH; ^~~~~~~~ arch/x86/kernel/acpi/boot.c:606:32: note: initialize the variable 'polarity' to silence this warning int rc, irq, trigger, polarity; ^ = 0 arch/x86/kernel/acpi/boot.c:617:12: error: variable 'trigger' is uninitialized when used here [-Werror,-Wuninitialized] trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE; ^~~~~~~ arch/x86/kernel/acpi/boot.c:606:22: note: initialize the variable 'trigger' to silence this warning int rc, irq, trigger, polarity; ^ = 0 This is unfortunately a design decision in clang and won't be fixed. Changing the acpi_get_override_irq() macro to an inline function reliably avoids the issue. Signed-off-by: Arnd Bergmann Reviewed-by: Andy Shevchenko Reviewed-by: Nathan Chancellor Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit a0e456f0427cfb47760f6098dc8d748aa16aa10a Author: Benjamin Block Date: Tue Jul 2 23:02:02 2019 +0200 scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized [ Upstream commit 484647088826f2f651acbda6bcf9536b8a466703 ] GCC v9 emits this warning: CC drivers/s390/scsi/zfcp_erp.o drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_action_enqueue': drivers/s390/scsi/zfcp_erp.c:217:26: warning: 'erp_action' may be used uninitialized in this function [-Wmaybe-uninitialized] 217 | struct zfcp_erp_action *erp_action; | ^~~~~~~~~~ This is a possible false positive case, as also documented in the GCC documentations: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uninitialized The actual code-sequence is like this: Various callers can invoke the function below with the argument "want" being one of: ZFCP_ERP_ACTION_REOPEN_ADAPTER, ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, ZFCP_ERP_ACTION_REOPEN_PORT, or ZFCP_ERP_ACTION_REOPEN_LUN. zfcp_erp_action_enqueue(want, ...) ... need = zfcp_erp_required_act(want, ...) need = want ... maybe: need = ZFCP_ERP_ACTION_REOPEN_PORT maybe: need = ZFCP_ERP_ACTION_REOPEN_ADAPTER ... return need ... zfcp_erp_setup_act(need, ...) struct zfcp_erp_action *erp_action; // <== line 217 ... switch(need) { case ZFCP_ERP_ACTION_REOPEN_LUN: ... erp_action = &zfcp_sdev->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_PORT: case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: ... erp_action = &port->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_ADAPTER: ... erp_action = &adapter->erp_action; WARN_ON_ONCE(erp_action->port != NULL); // <== access ... break; } ... WARN_ON_ONCE(erp_action->adapter != adapter); // <== access When zfcp_erp_setup_act() is called, 'need' will never be anything else than one of the 4 possible enumeration-names that are used in the switch-case, and 'erp_action' is initialized for every one of them, before it is used. Thus the warning is a false positive, as documented. We introduce the extra if{} in the beginning to create an extra code-flow, so the compiler can be convinced that the switch-case will never see any other value. BUG_ON()/BUG() is intentionally not used to not crash anything, should this ever happen anyway - right now it's impossible, as argued above; and it doesn't introduce a 'default:' switch-case to retain warnings should 'enum zfcp_erp_act_type' ever be extended and no explicit case be introduced. See also v5.0 commit 399b6c8bc9f7 ("scsi: zfcp: drop old default switch case which might paper over missing case"). Signed-off-by: Benjamin Block Reviewed-by: Jens Remus Reviewed-by: Steffen Maier Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 679ff6a3e4ccb02efabf6f8564d11a3681a8c150 Author: Jeff Layton Date: Thu Jun 13 15:17:00 2019 -0400 ceph: return -ERANGE if virtual xattr value didn't fit in buffer [ Upstream commit 3b421018f48c482bdc9650f894aa1747cf90e51d ] The getxattr manpage states that we should return ERANGE if the destination buffer size is too small to hold the value. ceph_vxattrcb_layout does this internally, but we should be doing this for all vxattrs. Fix the only caller of getxattr_cb to check the returned size against the buffer length and return -ERANGE if it doesn't fit. Drop the same check in ceph_vxattrcb_layout and just rely on the caller to handle it. Signed-off-by: Jeff Layton Reviewed-by: "Yan, Zheng" Acked-by: Ilya Dryomov Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin commit 4e064062d391df79d62eb8a0309c94636e303ccd Author: Andrea Parri Date: Mon May 20 19:23:58 2019 +0200 ceph: fix improper use of smp_mb__before_atomic() [ Upstream commit 749607731e26dfb2558118038c40e9c0c80d23b5 ] This barrier only applies to the read-modify-write operations; in particular, it does not apply to the atomic64_set() primitive. Replace the barrier with an smp_mb(). Fixes: fdd4e15838e59 ("ceph: rework dcache readdir") Reported-by: "Paul E. McKenney" Reported-by: Peter Zijlstra Signed-off-by: Andrea Parri Reviewed-by: "Yan, Zheng" Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin commit 44f7521a23f58e5d970112c3040fffc4a4dd964f Author: David Sterba Date: Fri May 17 11:43:13 2019 +0200 btrfs: fix minimum number of chunk errors for DUP [ Upstream commit 0ee5f8ae082e1f675a2fb6db601c31ac9958a134 ] The list of profiles in btrfs_chunk_max_errors lists DUP as a profile DUP able to tolerate 1 device missing. Though this profile is special with 2 copies, it still needs the device, unlike the others. Looking at the history of changes, thre's no clear reason why DUP is there, functions were refactored and blocks of code merged to one helper. d20983b40e828 Btrfs: fix writing data into the seed filesystem - factor code to a helper de11cc12df173 Btrfs: don't pre-allocate btrfs bio - unrelated change, DUP still in the list with max errors 1 a236aed14ccb0 Btrfs: Deal with failed writes in mirrored configurations - introduced the max errors, leaves DUP and RAID1 in the same group Reviewed-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 820402d2fc3f6cf888809783e8e735786b130bb1 Author: Russell King Date: Tue Jun 4 14:50:14 2019 +0100 fs/adfs: super: fix use-after-free bug [ Upstream commit 5808b14a1f52554de612fee85ef517199855e310 ] Fix a use-after-free bug during filesystem initialisation, where we access the disc record (which is stored in a buffer) after we have released the buffer. Signed-off-by: Russell King Signed-off-by: Al Viro Signed-off-by: Sasha Levin commit 1235f5e013d6d3267204cff4a350184c13039c94 Author: Geert Uytterhoeven Date: Mon Jun 24 14:38:18 2019 +0200 dmaengine: rcar-dmac: Reject zero-length slave DMA requests [ Upstream commit 78efb76ab4dfb8f74f290ae743f34162cd627f19 ] While the .device_prep_slave_sg() callback rejects empty scatterlists, it still accepts single-entry scatterlists with a zero-length segment. These may happen if a driver calls dmaengine_prep_slave_single() with a zero len parameter. The corresponding DMA request will never complete, leading to messages like: rcar-dmac e7300000.dma-controller: Channel Address Error happen and DMA timeouts. Although requesting a zero-length DMA request is a driver bug, rejecting it early eases debugging. Note that the .device_prep_dma_memcpy() callback already rejects requests to copy zero bytes. Reported-by: Eugeniu Rosca Analyzed-by: Yoshihiro Shimoda Signed-off-by: Geert Uytterhoeven Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit f1741424feee85d5583948ab100e045bf5155d67 Author: Petr Cvek Date: Thu Jun 20 23:39:37 2019 +0200 MIPS: lantiq: Fix bitfield masking [ Upstream commit ba1bc0fcdeaf3bf583c1517bd2e3e29cf223c969 ] The modification of EXIN register doesn't clean the bitfield before the writing of a new value. After a few modifications the bitfield would accumulate only '1's. Signed-off-by: Petr Cvek Signed-off-by: Paul Burton Cc: hauke@hauke-m.de Cc: john@phrozen.org Cc: linux-mips@vger.kernel.org Cc: openwrt-devel@lists.openwrt.org Cc: pakahmar@hotmail.com Signed-off-by: Sasha Levin commit 104307692801868ce85751f0393a0072745406d6 Author: Prarit Bhargava Date: Wed May 29 07:26:25 2019 -0400 kernel/module.c: Only return -EEXIST for modules that have finished loading [ Upstream commit 6e6de3dee51a439f76eb73c22ae2ffd2c9384712 ] Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and linux guests boot with repeated errors: amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2) amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2) The warnings occur because the module code erroneously returns -EEXIST for modules that have failed to load and are in the process of being removed from the module list. module amd64_edac_mod has a dependency on module edac_mce_amd. Using modules.dep, systemd will load edac_mce_amd for every request of amd64_edac_mod. When the edac_mce_amd module loads, the module has state MODULE_STATE_UNFORMED and once the module load fails and the state becomes MODULE_STATE_GOING. Another request for edac_mce_amd module executes and add_unformed_module() will erroneously return -EEXIST even though the previous instance of edac_mce_amd has MODULE_STATE_GOING. Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which fails because of unknown symbols from edac_mce_amd. add_unformed_module() must wait to return for any case other than MODULE_STATE_LIVE to prevent a race between multiple loads of dependent modules. Signed-off-by: Prarit Bhargava Signed-off-by: Barret Rhoden Cc: David Arcari Cc: Jessica Yu Cc: Heiko Carstens Signed-off-by: Jessica Yu Signed-off-by: Sasha Levin commit 81c09bab09d3b3dd1940280b618c83f9db25c1ff Author: Cheng Jian Date: Sat May 4 19:39:39 2019 +0800 ftrace: Enable trampoline when rec count returns back to one [ Upstream commit a124692b698b00026a58d89831ceda2331b2e1d0 ] Custom trampolines can only be enabled if there is only a single ops attached to it. If there's only a single callback registered to a function, and the ops has a trampoline registered for it, then we can call the trampoline directly. This is very useful for improving the performance of ftrace and livepatch. If more than one callback is registered to a function, the general trampoline is used, and the custom trampoline is not restored back to the direct call even if all the other callbacks were unregistered and we are back to one callback for the function. To fix this, set FTRACE_FL_TRAMP flag if rec count is decremented to one, and the ops that left has a trampoline. Testing After this patch : insmod livepatch_unshare_files.ko cat /sys/kernel/debug/tracing/enabled_functions unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0 echo unshare_files > /sys/kernel/debug/tracing/set_ftrace_filter echo function > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions unshare_files (2) R I ->ftrace_ops_list_func+0x0/0x150 echo nop > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0 Link: http://lkml.kernel.org/r/1556969979-111047-1-git-send-email-cj.chengjian@huawei.com Signed-off-by: Cheng Jian Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit 614e14d68edf596fddbd935c3848dc1ebc46448b Author: Douglas Anderson Date: Tue May 21 16:49:33 2019 -0700 ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend [ Upstream commit 8ef1ba39a9fa53d2205e633bc9b21840a275908e ] This is similar to commit e6186820a745 ("arm64: dts: rockchip: Arch counter doesn't tick in system suspend"). Specifically on the rk3288 it can be seen that the timer stops ticking in suspend if we end up running through the "osc_disable" path in rk3288_slp_mode_set(). In that path the 24 MHz clock will turn off and the timer stops. To test this, I ran this on a Chrome OS filesystem: before=$(date); \ suspend_stress_test -c1 --suspend_min=30 --suspend_max=31; \ echo ${before}; date ...and I found that unless I plug in a device that requests USB wakeup to be active that the two calls to "date" would show that fewer than 30 seconds passed. NOTE: deep suspend (where the 24 MHz clock gets disabled) isn't supported yet on upstream Linux so this was tested on a downstream kernel. Signed-off-by: Douglas Anderson Signed-off-by: Heiko Stuebner Signed-off-by: Sasha Levin commit 2b0a7453ea0e59e45791a975df768d3928ad1649 Author: Douglas Anderson Date: Fri May 3 16:45:37 2019 -0700 ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again [ Upstream commit 99fa066710f75f18f4d9a5bc5f6a711968a581d5 ] When I try to boot rk3288-veyron-mickey I totally fail to make the eMMC work. Specifically my logs (on Chrome OS 4.19): mmc_host mmc1: card is non-removable. mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0) mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0) mmc1: switch to bus width 8 failed mmc1: switch to bus width 4 failed mmc1: new high speed MMC card at address 0001 mmcblk1: mmc1:0001 HAG2e 14.7 GiB mmcblk1boot0: mmc1:0001 HAG2e partition 1 4.00 MiB mmcblk1boot1: mmc1:0001 HAG2e partition 2 4.00 MiB mmcblk1rpmb: mmc1:0001 HAG2e partition 3 4.00 MiB, chardev (243:0) mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0) mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0) mmc1: switch to bus width 8 failed mmc1: switch to bus width 4 failed mmc1: tried to HW reset card, got error -110 mmcblk1: error -110 requesting status mmcblk1: recovery failed! print_req_error: I/O error, dev mmcblk1, sector 0 ... When I remove the '/delete-property/mmc-hs200-1_8v' then everything is hunky dory. That line comes from the original submission of the mickey dts upstream, so presumably at the time the HS200 was failing and just enumerating things as a high speed device was fine. ...or maybe it's just that some mickey devices work when enumerating at "high speed", just not mine? In any case, hs200 seems good now. Let's turn it on. Signed-off-by: Douglas Anderson Signed-off-by: Heiko Stuebner Signed-off-by: Sasha Levin commit 1078e302b947cdabcf2027c7e8fc30e6c77cb9a2 Author: Douglas Anderson Date: Fri May 3 16:41:42 2019 -0700 ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200 [ Upstream commit 1c0479023412ab7834f2e98b796eb0d8c627cd62 ] As some point hs200 was failing on rk3288-veyron-minnie. See commit 984926781122 ("ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie"). Although I didn't track down exactly when it started working, it seems to work OK now, so let's turn it back on. To test this, I booted from SD card and then used this script to stress the enumeration process after fixing a memory leak [1]: cd /sys/bus/platform/drivers/dwmmc_rockchip for i in $(seq 1 3000); do echo "========================" $i echo ff0f0000.dwmmc > unbind sleep .5 echo ff0f0000.dwmmc > bind while true; do if [ -e /dev/mmcblk2 ]; then break; fi sleep .1 done done It worked fine. [1] https://lkml.kernel.org/r/20190503233526.226272-1-dianders@chromium.org Signed-off-by: Douglas Anderson Signed-off-by: Heiko Stuebner Signed-off-by: Sasha Levin commit 790fc2d3f1258c99b9c2bbfee9eab21074d23d30 Author: Russell King Date: Thu May 2 17:19:18 2019 +0100 ARM: riscpc: fix DMA [ Upstream commit ffd9a1ba9fdb7f2bd1d1ad9b9243d34e96756ba2 ] DMA got broken a while back in two different ways: 1) a change in the behaviour of disable_irq() to wait for the interrupt to finish executing causes us to deadlock at the end of DMA. 2) a change to avoid modifying the scatterlist left the first transfer uninitialised. DMA is only used with expansion cards, so has gone unnoticed. Fixes: fa4e99899932 ("[ARM] dma: RiscPC: don't modify DMA SG entries") Signed-off-by: Russell King Signed-off-by: Sasha Levin