Merge tag 'libnvdimm-for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This cycle was not something I ever want to repeat as there were
several late changes that have only now settled.
Half of the branch, up to commit d2c997c0f1 ("fs, dax: use
page->mapping to warn..."), has been in -next for several releases.
The of_pmem driver and the address range scrub rework were late
arrivals, and the dax work was scaled back at the last moment.
The of_pmem driver missed a previous merge window due to an oversight.
A sense of obligation to rectify that miss is why it is included for
4.17. It has acks from PowerPC folks. Stephen reported a build failure
that only occurs when merging it with your latest tree, for now I have
fixed that up by disabling modular builds of of_pmem. A test merge
with your tree has received a build success report from the 0day robot
over 156 configs.
An initial version of the ARS rework was submitted before the merge
window. It is self-contained to libnvdimm, is a net code reduction,
and passes all unit tests.
The filesystem-dax changes are based on the wait_var_event()
functionality from tip/sched/core. However, late review feedback
showed that those changes regressed truncate performance to a large
degree. The branch was rewound to drop the truncate behavior change
and now only includes preparation patches and cleanups (with full acks
and reviews). The finalization of this dax-dma-vs-truncate work will
need to wait for 4.18.
Summary:
- A rework of the filesystem-dax implementation provides for detection
of unmap operations (truncate / hole punch) colliding with
in-progress device-DMA. A fix for these collisions remains a
work-in-progress pending resolution of truncate latency and
starvation regressions.
- The of_pmem driver expands the users of libnvdimm outside of x86
and ACPI to describe an implementation of persistent memory on
PowerPC with Open Firmware / Device tree.
- Address Range Scrub (ARS) handling is completely rewritten to
account for the fact that ARS may run for 100s of seconds and there
is no platform defined way to cancel it. ARS will now no longer
block namespace initialization.
- The NVDIMM Namespace Label implementation is updated to handle
label areas as small as 1K, down from 128K.
- Miscellaneous cleanups and updates to unit test infrastructure"
* tag 'libnvdimm-for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (39 commits)
libnvdimm, of_pmem: workaround OF_NUMA=n build error
nfit, address-range-scrub: add module option to skip initial ars
nfit, address-range-scrub: rework and simplify ARS state machine
nfit, address-range-scrub: determine one platform max_ars value
powerpc/powernv: Create platform devs for nvdimm buses
doc/devicetree: Persistent memory region bindings
libnvdimm: Add device-tree based driver
libnvdimm: Add of_node to region and bus descriptors
libnvdimm, region: quiet region probe
libnvdimm, namespace: use a safe lookup for dimm device name
libnvdimm, dimm: fix dpa reservation vs uninitialized label area
libnvdimm, testing: update the default smart ctrl_temperature
libnvdimm, testing: Add emulation for smart injection commands
nfit, address-range-scrub: introduce nfit_spa->ars_state
libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device'
nfit, address-range-scrub: fix scrub in-progress reporting
dax, dm: allow device-mapper to operate without dax support
dax: introduce CONFIG_DAX_DRIVER
fs, dax: use page->mapping to warn if truncate collides with a busy page
ext2, dax: introduce ext2_dax_aops
...
@@ -0,0 +1,65 @@
Device-tree bindings for persistent memory regions
-----------------------------------------------------

Persistent memory refers to a class of memory devices that are:

	a) Usable as main system memory (i.e. cacheable), and
	b) Retain their contents across power failure.

Given b) it is best to think of persistent memory as a kind of memory mapped
storage device. To ensure data integrity the operating system needs to manage
persistent regions separately to the normal memory pool. To aid with that this
binding provides a standardised interface for discovering where persistent
memory regions exist inside the physical address space.

Bindings for the region nodes:
-----------------------------

Required properties:
	- compatible = "pmem-region"

	- reg = <base, size>;
		The reg property should specify an address range that is
		translatable to a system physical address range. This address
		range should be mappable as normal system memory would be
		(i.e. cacheable).

		If the reg property contains multiple address ranges
		each address range will be treated as though it was specified
		in a separate device node. Having multiple address ranges in a
		node implies no special relationship between the two ranges.

Optional properties:
	- Any relevant NUMA associativity properties for the target platform.

	- volatile; This property indicates that this region is actually
	  backed by non-persistent memory. This lets the OS know that it
	  may skip the cache flushes required to ensure data is made
	  persistent after a write.

	  If this property is absent then the OS must assume that the region
	  is backed by non-volatile memory.

Examples:
--------------------

	/*
	 * This node specifies one 4KB region spanning from
	 * 0x5000 to 0x5fff that is backed by non-volatile memory.
	 */
	pmem@5000 {
		compatible = "pmem-region";
		reg = <0x00005000 0x00001000>;
	};

	/*
	 * This node specifies two 4KB regions that are backed by
	 * volatile (normal) memory.
	 */
	pmem@6000 {
		compatible = "pmem-region";
		reg = < 0x00006000 0x00001000
			0x00008000 0x00001000 >;
		volatile;
	};
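For illustration, the two-cell (base, size) pairs in the reg property above can be decoded with plain C. This is a hedged sketch — `dt_read_cell` and `dt_read_reg_pair` are names invented here, not kernel or libfdt API — assuming `#address-cells = <1>` and `#size-cells = <1>`:

```c
#include <stdint.h>
#include <stddef.h>

/*
 * A device-tree "reg" property is stored as a flat array of
 * big-endian 32-bit cells. With one address cell and one size
 * cell, each (base, size) pair occupies two cells (8 bytes).
 */
static uint32_t dt_read_cell(const uint8_t *p)
{
	/* device-tree cells are always big-endian */
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void dt_read_reg_pair(const uint8_t *prop, size_t pair,
			     uint32_t *base, uint32_t *size)
{
	const uint8_t *p = prop + pair * 8; /* two 4-byte cells per pair */

	*base = dt_read_cell(p);
	*size = dt_read_cell(p + 4);
}
```

Decoding the pmem@6000 example this way yields the two (0x6000, 0x1000) and (0x8000, 0x1000) ranges, each of which the binding says is treated as an independent region.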
@@ -8048,6 +8048,14 @@
 Q:	https://patchwork.kernel.org/project/linux-nvdimm/list/
 S:	Supported
 F:	drivers/nvdimm/pmem*
 
+LIBNVDIMM: DEVICETREE BINDINGS
+M:	Oliver O'Halloran <oohall@gmail.com>
+L:	linux-nvdimm@lists.01.org
+Q:	https://patchwork.kernel.org/project/linux-nvdimm/list/
+S:	Supported
+F:	drivers/nvdimm/of_pmem.c
+F:	Documentation/devicetree/bindings/pmem/pmem-region.txt
+
 LIBNVDIMM: NON-VOLATILE MEMORY DEVICE SUBSYSTEM
 M:	Dan Williams <dan.j.williams@intel.com>
 L:	linux-nvdimm@lists.01.org
@@ -824,6 +824,9 @@ static int __init opal_init(void)
 	/* Create i2c platform devices */
 	opal_pdev_init("ibm,opal-i2c");
 
+	/* Handle non-volatile memory devices */
+	opal_pdev_init("pmem-region");
+
 	/* Setup a heartbeat thread if requested by OPAL */
 	opal_init_heartbeat();
+333 -366 (file diff suppressed because it is too large)
@@ -51,9 +51,8 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
 		if ((spa->address + spa->length - 1) < mce->addr)
 			continue;
 		found_match = 1;
-		dev_dbg(dev, "%s: addr in SPA %d (0x%llx, 0x%llx)\n",
-				__func__, spa->range_index, spa->address,
-				spa->length);
+		dev_dbg(dev, "addr in SPA %d (0x%llx, 0x%llx)\n",
+				spa->range_index, spa->address, spa->length);
 
 		/*
 		 * We can break at the first match because we're going
 		 * to rescan all the SPA ranges. There shouldn't be any
@@ -117,10 +117,17 @@ enum nfit_dimm_notifiers {
 	NFIT_NOTIFY_DIMM_HEALTH = 0x81,
 };
 
+enum nfit_ars_state {
+	ARS_REQ,
+	ARS_DONE,
+	ARS_SHORT,
+	ARS_FAILED,
+};
+
 struct nfit_spa {
 	struct list_head list;
 	struct nd_region *nd_region;
-	unsigned int ars_required:1;
+	unsigned long ars_state;
 	u32 clear_err_unit;
 	u32 max_ars;
 	struct acpi_nfit_system_address spa[0];
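The hunk above replaces the single `ars_required:1` bit with an `unsigned long ars_state` that the rework treats as a bitmap indexed by the enum values (manipulated in the kernel with atomic bitops). A rough userspace sketch of that flag discipline — `set_state`/`clear_state`/`test_state` are illustrative stand-ins, not the kernel's atomic `set_bit()`/`test_bit()` helpers:

```c
enum nfit_ars_state {
	ARS_REQ,
	ARS_DONE,
	ARS_SHORT,
	ARS_FAILED,
};

/* Non-atomic stand-ins for the kernel's bitop helpers. */
static void set_state(unsigned long *state, int flag)
{
	*state |= 1UL << flag;
}

static void clear_state(unsigned long *state, int flag)
{
	*state &= ~(1UL << flag);
}

static int test_state(unsigned long state, int flag)
{
	return !!(state & (1UL << flag));
}
```

Keeping several independent flags in one word is what lets the new state machine record, say, a pending short-ARS request while a long scrub is still in flight, without blocking namespace initialization.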
@@ -171,9 +178,8 @@ struct nfit_mem {
 	struct resource *flush_wpq;
 	unsigned long dsm_mask;
 	int family;
-	u32 has_lsi:1;
-	u32 has_lsr:1;
-	u32 has_lsw:1;
+	bool has_lsr;
+	bool has_lsw;
 };
 
 struct acpi_nfit_desc {
@@ -191,18 +197,18 @@ struct acpi_nfit_desc {
 	struct device *dev;
-	u8 ars_start_flags;
 	struct nd_cmd_ars_status *ars_status;
-	size_t ars_status_size;
-	struct work_struct work;
+	struct delayed_work dwork;
 	struct list_head list;
 	struct kernfs_node *scrub_count_state;
+	unsigned int max_ars;
 	unsigned int scrub_count;
 	unsigned int scrub_mode;
 	unsigned int cancel:1;
-	unsigned int init_complete:1;
 	unsigned long dimm_cmd_force_en;
 	unsigned long bus_cmd_force_en;
 	unsigned long bus_nfit_cmd_force_en;
 	unsigned int platform_cap;
+	unsigned int scrub_tmo;
 	int (*blk_do_io)(struct nd_blk_region *ndbr, resource_size_t dpa,
 			void *iobuf, u64 len, int rw);
 };
@@ -244,7 +250,7 @@ struct nfit_blk {
 
 extern struct list_head acpi_descs;
 extern struct mutex acpi_desc_lock;
-int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc, u8 flags);
+int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc, unsigned long flags);
 
 #ifdef CONFIG_X86_MCE
 void nfit_mce_register(void);
+4 -1
@@ -1,3 +1,7 @@
+config DAX_DRIVER
+	select DAX
+	bool
+
 menuconfig DAX
 	tristate "DAX: direct access to differentiated memory"
 	select SRCU
@@ -16,7 +20,6 @@ config DEV_DAX
 	  baseline memory pool. Mappings of a /dev/daxX.Y device impose
 	  restrictions that make the mapping behavior deterministic.
 
-
 config DEV_DAX_PMEM
 	tristate "PMEM DAX: direct access to persistent memory"
 	depends on LIBNVDIMM && NVDIMM_DAX && DEV_DAX
+17 -21
@@ -257,8 +257,8 @@ static int __dev_dax_pte_fault(struct dev_dax *dev_dax, struct vm_fault *vmf)
 
 	dax_region = dev_dax->region;
 	if (dax_region->align > PAGE_SIZE) {
-		dev_dbg(dev, "%s: alignment (%#x) > fault size (%#x)\n",
-				__func__, dax_region->align, fault_size);
+		dev_dbg(dev, "alignment (%#x) > fault size (%#x)\n",
+				dax_region->align, fault_size);
 		return VM_FAULT_SIGBUS;
 	}
 
@@ -267,8 +267,7 @@ static int __dev_dax_pte_fault(struct dev_dax *dev_dax, struct vm_fault *vmf)
 
 	phys = dax_pgoff_to_phys(dev_dax, vmf->pgoff, PAGE_SIZE);
 	if (phys == -1) {
-		dev_dbg(dev, "%s: pgoff_to_phys(%#lx) failed\n", __func__,
-				vmf->pgoff);
+		dev_dbg(dev, "pgoff_to_phys(%#lx) failed\n", vmf->pgoff);
 		return VM_FAULT_SIGBUS;
 	}
 
@@ -299,14 +298,14 @@ static int __dev_dax_pmd_fault(struct dev_dax *dev_dax, struct vm_fault *vmf)
 
 	dax_region = dev_dax->region;
 	if (dax_region->align > PMD_SIZE) {
-		dev_dbg(dev, "%s: alignment (%#x) > fault size (%#x)\n",
-				__func__, dax_region->align, fault_size);
+		dev_dbg(dev, "alignment (%#x) > fault size (%#x)\n",
+				dax_region->align, fault_size);
 		return VM_FAULT_SIGBUS;
 	}
 
 	/* dax pmd mappings require pfn_t_devmap() */
 	if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) != (PFN_DEV|PFN_MAP)) {
-		dev_dbg(dev, "%s: region lacks devmap flags\n", __func__);
+		dev_dbg(dev, "region lacks devmap flags\n");
 		return VM_FAULT_SIGBUS;
 	}
 
@@ -323,8 +322,7 @@ static int __dev_dax_pmd_fault(struct dev_dax *dev_dax, struct vm_fault *vmf)
 	pgoff = linear_page_index(vmf->vma, pmd_addr);
 	phys = dax_pgoff_to_phys(dev_dax, pgoff, PMD_SIZE);
 	if (phys == -1) {
-		dev_dbg(dev, "%s: pgoff_to_phys(%#lx) failed\n", __func__,
-				pgoff);
+		dev_dbg(dev, "pgoff_to_phys(%#lx) failed\n", pgoff);
 		return VM_FAULT_SIGBUS;
 	}
 
@@ -351,14 +349,14 @@ static int __dev_dax_pud_fault(struct dev_dax *dev_dax, struct vm_fault *vmf)
 
 	dax_region = dev_dax->region;
 	if (dax_region->align > PUD_SIZE) {
-		dev_dbg(dev, "%s: alignment (%#x) > fault size (%#x)\n",
-				__func__, dax_region->align, fault_size);
+		dev_dbg(dev, "alignment (%#x) > fault size (%#x)\n",
+				dax_region->align, fault_size);
 		return VM_FAULT_SIGBUS;
 	}
 
 	/* dax pud mappings require pfn_t_devmap() */
 	if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) != (PFN_DEV|PFN_MAP)) {
-		dev_dbg(dev, "%s: region lacks devmap flags\n", __func__);
+		dev_dbg(dev, "region lacks devmap flags\n");
 		return VM_FAULT_SIGBUS;
 	}
 
@@ -375,8 +373,7 @@ static int __dev_dax_pud_fault(struct dev_dax *dev_dax, struct vm_fault *vmf)
 	pgoff = linear_page_index(vmf->vma, pud_addr);
 	phys = dax_pgoff_to_phys(dev_dax, pgoff, PUD_SIZE);
 	if (phys == -1) {
-		dev_dbg(dev, "%s: pgoff_to_phys(%#lx) failed\n", __func__,
-				pgoff);
+		dev_dbg(dev, "pgoff_to_phys(%#lx) failed\n", pgoff);
 		return VM_FAULT_SIGBUS;
 	}
 
@@ -399,9 +396,8 @@ static int dev_dax_huge_fault(struct vm_fault *vmf,
 	struct file *filp = vmf->vma->vm_file;
 	struct dev_dax *dev_dax = filp->private_data;
 
-	dev_dbg(&dev_dax->dev, "%s: %s: %s (%#lx - %#lx) size = %d\n", __func__,
-			current->comm, (vmf->flags & FAULT_FLAG_WRITE)
-			? "write" : "read",
+	dev_dbg(&dev_dax->dev, "%s: %s (%#lx - %#lx) size = %d\n", current->comm,
+			(vmf->flags & FAULT_FLAG_WRITE) ? "write" : "read",
 			vmf->vma->vm_start, vmf->vma->vm_end, pe_size);
 
 	id = dax_read_lock();
@@ -460,7 +456,7 @@ static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct dev_dax *dev_dax = filp->private_data;
 	int rc, id;
 
-	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	dev_dbg(&dev_dax->dev, "trace\n");
 
 	/*
 	 * We lock to check dax_dev liveness and will re-check at
@@ -518,7 +514,7 @@ static int dax_open(struct inode *inode, struct file *filp)
 	struct inode *__dax_inode = dax_inode(dax_dev);
 	struct dev_dax *dev_dax = dax_get_private(dax_dev);
 
-	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	dev_dbg(&dev_dax->dev, "trace\n");
 	inode->i_mapping = __dax_inode->i_mapping;
 	inode->i_mapping->host = __dax_inode;
 	filp->f_mapping = inode->i_mapping;
@@ -533,7 +529,7 @@ static int dax_release(struct inode *inode, struct file *filp)
 {
 	struct dev_dax *dev_dax = filp->private_data;
 
-	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	dev_dbg(&dev_dax->dev, "trace\n");
 	return 0;
 }
 
@@ -575,7 +571,7 @@ static void unregister_dev_dax(void *dev)
 	struct inode *inode = dax_inode(dax_dev);
 	struct cdev *cdev = inode->i_cdev;
 
-	dev_dbg(dev, "%s\n", __func__);
+	dev_dbg(dev, "trace\n");
 
 	kill_dev_dax(dev_dax);
 	cdev_device_del(cdev, dev);
+4 -14
@@ -34,7 +34,7 @@ static void dax_pmem_percpu_release(struct percpu_ref *ref)
 {
 	struct dax_pmem *dax_pmem = to_dax_pmem(ref);
 
-	dev_dbg(dax_pmem->dev, "%s\n", __func__);
+	dev_dbg(dax_pmem->dev, "trace\n");
 	complete(&dax_pmem->cmp);
 }
 
@@ -43,7 +43,7 @@ static void dax_pmem_percpu_exit(void *data)
 	struct percpu_ref *ref = data;
 	struct dax_pmem *dax_pmem = to_dax_pmem(ref);
 
-	dev_dbg(dax_pmem->dev, "%s\n", __func__);
+	dev_dbg(dax_pmem->dev, "trace\n");
 	wait_for_completion(&dax_pmem->cmp);
 	percpu_ref_exit(ref);
 }
@@ -53,7 +53,7 @@ static void dax_pmem_percpu_kill(void *data)
 	struct percpu_ref *ref = data;
 	struct dax_pmem *dax_pmem = to_dax_pmem(ref);
 
-	dev_dbg(dax_pmem->dev, "%s\n", __func__);
+	dev_dbg(dax_pmem->dev, "trace\n");
 	percpu_ref_kill(ref);
 }
 
@@ -150,17 +150,7 @@ static struct nd_device_driver dax_pmem_driver = {
 	.type = ND_DRIVER_DAX_PMEM,
 };
 
-static int __init dax_pmem_init(void)
-{
-	return nd_driver_register(&dax_pmem_driver);
-}
-module_init(dax_pmem_init);
-
-static void __exit dax_pmem_exit(void)
-{
-	driver_unregister(&dax_pmem_driver.drv);
-}
-module_exit(dax_pmem_exit);
+module_nd_driver(dax_pmem_driver);
 
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Intel Corporation");
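The conversion above swaps a hand-written init/exit pair for the `module_nd_driver()` helper. As a sketch of how such `module_*_driver()` macros cut boilerplate, here is a simplified userspace analogue; `fake_driver`, `fake_register`, `fake_unregister` and `module_fake_driver` are all invented for illustration and are not the kernel's definitions:

```c
/*
 * A toy driver object and register/unregister pair, standing in for
 * nd_driver_register()/driver_unregister().
 */
struct fake_driver {
	const char *name;
	int registered;
};

static int fake_register(struct fake_driver *drv)
{
	drv->registered = 1;
	return 0;
}

static void fake_unregister(struct fake_driver *drv)
{
	drv->registered = 0;
}

/*
 * The macro pastes together the init/exit functions that the driver
 * previously spelled out by hand, so each driver only declares its
 * driver struct and one macro invocation.
 */
#define module_fake_driver(drv)				\
	static int drv##_init(void)			\
	{						\
		return fake_register(&(drv));		\
	}						\
	static void drv##_exit(void)			\
	{						\
		fake_unregister(&(drv));		\
	}

static struct fake_driver dax_demo_driver = { .name = "dax_demo" };
module_fake_driver(dax_demo_driver)
```

In the kernel the generated functions are additionally tagged `__init`/`__exit` and wired up via `module_init()`/`module_exit()`, which is exactly the text the dax_pmem hunk deletes.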
+12 -3
@@ -124,10 +124,19 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
 		return len < 0 ? len : -EIO;
 	}
 
-	if ((IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn))
-			|| pfn_t_devmap(pfn))
+	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
+		/*
+		 * An arch that has enabled the pmem api should also
+		 * have its drivers support pfn_t_devmap()
+		 *
+		 * This is a developer warning and should not trigger in
+		 * production. dax_flush() will crash since it depends
+		 * on being able to do (page_address(pfn_to_page())).
+		 */
+		WARN_ON(IS_ENABLED(CONFIG_ARCH_HAS_PMEM_API));
+	} else if (pfn_t_devmap(pfn)) {
 		/* pass */;
-	else {
+	} else {
 		pr_debug("VFS (%s): error: dax support not enabled\n",
 				sb->s_id);
 		return -EOPNOTSUPP;
+1 -1
@@ -201,7 +201,7 @@ config BLK_DEV_DM_BUILTIN
 config BLK_DEV_DM
 	tristate "Device mapper support"
 	select BLK_DEV_DM_BUILTIN
-	select DAX
+	depends on DAX || DAX=n
 	---help---
 	  Device-mapper is a low level volume manager.  It works by allowing
 	  people to specify mappings for ranges of logical sectors. Various
@@ -154,6 +154,7 @@ static int linear_iterate_devices(struct dm_target *ti,
 	return fn(ti, lc->dev, lc->start, ti->len, data);
 }
 
+#if IS_ENABLED(CONFIG_DAX_DRIVER)
 static long linear_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
 		long nr_pages, void **kaddr, pfn_t *pfn)
 {
@@ -184,6 +185,11 @@ static size_t linear_dax_copy_from_iter(struct dm_target *ti, pgoff_t pgoff,
 	return dax_copy_from_iter(dax_dev, pgoff, addr, bytes, i);
 }
 
+#else
+#define linear_dax_direct_access NULL
+#define linear_dax_copy_from_iter NULL
+#endif
+
 static struct target_type linear_target = {
 	.name   = "linear",
 	.version = {1, 4, 0},
+50 -45
@@ -611,51 +611,6 @@ static int log_mark(struct log_writes_c *lc, char *data)
 	return 0;
 }
 
-static int log_dax(struct log_writes_c *lc, sector_t sector, size_t bytes,
-		   struct iov_iter *i)
-{
-	struct pending_block *block;
-
-	if (!bytes)
-		return 0;
-
-	block = kzalloc(sizeof(struct pending_block), GFP_KERNEL);
-	if (!block) {
-		DMERR("Error allocating dax pending block");
-		return -ENOMEM;
-	}
-
-	block->data = kzalloc(bytes, GFP_KERNEL);
-	if (!block->data) {
-		DMERR("Error allocating dax data space");
-		kfree(block);
-		return -ENOMEM;
-	}
-
-	/* write data provided via the iterator */
-	if (!copy_from_iter(block->data, bytes, i)) {
-		DMERR("Error copying dax data");
-		kfree(block->data);
-		kfree(block);
-		return -EIO;
-	}
-
-	/* rewind the iterator so that the block driver can use it */
-	iov_iter_revert(i, bytes);
-
-	block->datalen = bytes;
-	block->sector = bio_to_dev_sectors(lc, sector);
-	block->nr_sectors = ALIGN(bytes, lc->sectorsize) >> lc->sectorshift;
-
-	atomic_inc(&lc->pending_blocks);
-	spin_lock_irq(&lc->blocks_lock);
-	list_add_tail(&block->list, &lc->unflushed_blocks);
-	spin_unlock_irq(&lc->blocks_lock);
-	wake_up_process(lc->log_kthread);
-
-	return 0;
-}
-
 static void log_writes_dtr(struct dm_target *ti)
 {
 	struct log_writes_c *lc = ti->private;
@@ -925,6 +880,52 @@ static void log_writes_io_hints(struct dm_target *ti, struct queue_limits *limit
 	limits->io_min = limits->physical_block_size;
 }
 
+#if IS_ENABLED(CONFIG_DAX_DRIVER)
+static int log_dax(struct log_writes_c *lc, sector_t sector, size_t bytes,
+		   struct iov_iter *i)
+{
+	struct pending_block *block;
+
+	if (!bytes)
+		return 0;
+
+	block = kzalloc(sizeof(struct pending_block), GFP_KERNEL);
+	if (!block) {
+		DMERR("Error allocating dax pending block");
+		return -ENOMEM;
+	}
+
+	block->data = kzalloc(bytes, GFP_KERNEL);
+	if (!block->data) {
+		DMERR("Error allocating dax data space");
+		kfree(block);
+		return -ENOMEM;
+	}
+
+	/* write data provided via the iterator */
+	if (!copy_from_iter(block->data, bytes, i)) {
+		DMERR("Error copying dax data");
+		kfree(block->data);
+		kfree(block);
+		return -EIO;
+	}
+
+	/* rewind the iterator so that the block driver can use it */
+	iov_iter_revert(i, bytes);
+
+	block->datalen = bytes;
+	block->sector = bio_to_dev_sectors(lc, sector);
+	block->nr_sectors = ALIGN(bytes, lc->sectorsize) >> lc->sectorshift;
+
+	atomic_inc(&lc->pending_blocks);
+	spin_lock_irq(&lc->blocks_lock);
+	list_add_tail(&block->list, &lc->unflushed_blocks);
+	spin_unlock_irq(&lc->blocks_lock);
+	wake_up_process(lc->log_kthread);
+
+	return 0;
+}
+
 static long log_writes_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
 		long nr_pages, void **kaddr, pfn_t *pfn)
 {
@@ -961,6 +962,10 @@ static size_t log_writes_dax_copy_from_iter(struct dm_target *ti,
 dax_copy:
 	return dax_copy_from_iter(lc->dev->dax_dev, pgoff, addr, bytes, i);
 }
+#else
+#define log_writes_dax_direct_access NULL
+#define log_writes_dax_copy_from_iter NULL
+#endif
 
 static struct target_type log_writes_target = {
 	.name   = "log-writes",
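`log_dax()` computes `nr_sectors` as `ALIGN(bytes, lc->sectorsize) >> lc->sectorshift`, i.e. the byte count rounded up to a whole number of device sectors, then converted from bytes to sectors. A standalone check of that arithmetic, assuming a power-of-two sector size as the kernel's `ALIGN()` macro does (`dax_nr_sectors` is a name invented here):

```c
#include <stddef.h>

/* Kernel-style ALIGN: round x up to the power-of-two boundary a. */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/*
 * Mirror of the log_dax() length math: round the byte count up to a
 * whole number of device sectors, then shift bytes down to sectors
 * (sectorshift == log2(sectorsize)).
 */
static size_t dax_nr_sectors(size_t bytes, size_t sectorsize,
			     unsigned int sectorshift)
{
	return ALIGN(bytes, sectorsize) >> sectorshift;
}
```

So a 513-byte write to a 512-byte-sector device occupies two sectors, which is why the pending block rounds up rather than truncating.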
@@ -313,6 +313,7 @@ static int stripe_map(struct dm_target *ti, struct bio *bio)
 	return DM_MAPIO_REMAPPED;
 }
 
+#if IS_ENABLED(CONFIG_DAX_DRIVER)
 static long stripe_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
 		long nr_pages, void **kaddr, pfn_t *pfn)
 {
@@ -353,6 +354,11 @@ static size_t stripe_dax_copy_from_iter(struct dm_target *ti, pgoff_t pgoff,
 	return dax_copy_from_iter(dax_dev, pgoff, addr, bytes, i);
 }
 
+#else
+#define stripe_dax_direct_access NULL
+#define stripe_dax_copy_from_iter NULL
+#endif
+
 /*
  * Stripe status:
  *
+6 -4
@@ -1826,7 +1826,7 @@ static void cleanup_mapped_device(struct mapped_device *md)
 static struct mapped_device *alloc_dev(int minor)
 {
 	int r, numa_node_id = dm_get_numa_node();
-	struct dax_device *dax_dev;
+	struct dax_device *dax_dev = NULL;
 	struct mapped_device *md;
 	void *old_md;
 
@@ -1892,9 +1892,11 @@ static struct mapped_device *alloc_dev(int minor)
 	md->disk->private_data = md;
 	sprintf(md->disk->disk_name, "dm-%d", minor);
 
-	dax_dev = alloc_dax(md, md->disk->disk_name, &dm_dax_ops);
-	if (!dax_dev)
-		goto bad;
+	if (IS_ENABLED(CONFIG_DAX_DRIVER)) {
+		dax_dev = alloc_dax(md, md->disk->disk_name, &dm_dax_ops);
+		if (!dax_dev)
+			goto bad;
+	}
 	md->dax_dev = dax_dev;
 
 	add_disk_no_queue_reg(md->disk);
+12 -1
@@ -20,7 +20,7 @@ if LIBNVDIMM
 config BLK_DEV_PMEM
 	tristate "PMEM: Persistent memory block device support"
 	default LIBNVDIMM
-	select DAX
+	select DAX_DRIVER
 	select ND_BTT if BTT
 	select ND_PFN if NVDIMM_PFN
 	help
@@ -102,4 +102,15 @@ config NVDIMM_DAX
 
 	  Select Y if unsure
 
+config OF_PMEM
+	# FIXME: make tristate once OF_NUMA dependency removed
+	bool "Device-tree support for persistent memory regions"
+	depends on OF
+	default LIBNVDIMM
+	help
+	  Allows regions of persistent memory to be described in the
+	  device-tree.
+
+	  Select Y if unsure.
+
 endif
@@ -4,6 +4,7 @@ obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
 obj-$(CONFIG_ND_BTT) += nd_btt.o
 obj-$(CONFIG_ND_BLK) += nd_blk.o
 obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
+obj-$(CONFIG_OF_PMEM) += of_pmem.o
 
 nd_pmem-y := pmem.o
+10 -11
@@ -26,7 +26,7 @@ static void nd_btt_release(struct device *dev)
 	struct nd_region *nd_region = to_nd_region(dev->parent);
 	struct nd_btt *nd_btt = to_nd_btt(dev);
 
-	dev_dbg(dev, "%s\n", __func__);
+	dev_dbg(dev, "trace\n");
 	nd_detach_ndns(&nd_btt->dev, &nd_btt->ndns);
 	ida_simple_remove(&nd_region->btt_ida, nd_btt->id);
 	kfree(nd_btt->uuid);
@@ -74,8 +74,8 @@ static ssize_t sector_size_store(struct device *dev,
 	nvdimm_bus_lock(dev);
 	rc = nd_size_select_store(dev, buf, &nd_btt->lbasize,
 			btt_lbasize_supported);
-	dev_dbg(dev, "%s: result: %zd wrote: %s%s", __func__,
-			rc, buf, buf[len - 1] == '\n' ? "" : "\n");
+	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
+			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
 	device_unlock(dev);
 
@@ -101,8 +101,8 @@ static ssize_t uuid_store(struct device *dev,
 
 	device_lock(dev);
 	rc = nd_uuid_store(dev, &nd_btt->uuid, buf, len);
-	dev_dbg(dev, "%s: result: %zd wrote: %s%s", __func__,
-			rc, buf, buf[len - 1] == '\n' ? "" : "\n");
+	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
+			buf[len - 1] == '\n' ? "" : "\n");
 	device_unlock(dev);
 
 	return rc ? rc : len;
@@ -131,8 +131,8 @@ static ssize_t namespace_store(struct device *dev,
 	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	rc = nd_namespace_store(dev, &nd_btt->ndns, buf, len);
-	dev_dbg(dev, "%s: result: %zd wrote: %s%s", __func__,
-			rc, buf, buf[len - 1] == '\n' ? "" : "\n");
+	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
+			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
 	device_unlock(dev);
 
@@ -206,8 +206,8 @@ static struct device *__nd_btt_create(struct nd_region *nd_region,
 	dev->groups = nd_btt_attribute_groups;
 	device_initialize(&nd_btt->dev);
 	if (ndns && !__nd_attach_ndns(&nd_btt->dev, ndns, &nd_btt->ndns)) {
-		dev_dbg(&ndns->dev, "%s failed, already claimed by %s\n",
-				__func__, dev_name(ndns->claim));
+		dev_dbg(&ndns->dev, "failed, already claimed by %s\n",
+				dev_name(ndns->claim));
 		put_device(dev);
 		return NULL;
 	}
@@ -346,8 +346,7 @@ int nd_btt_probe(struct device *dev, struct nd_namespace_common *ndns)
 		return -ENOMEM;
 	btt_sb = devm_kzalloc(dev, sizeof(*btt_sb), GFP_KERNEL);
 	rc = __nd_btt_probe(to_nd_btt(btt_dev), ndns, btt_sb);
-	dev_dbg(dev, "%s: btt: %s\n", __func__,
-			rc == 0 ? dev_name(btt_dev) : "<none>");
+	dev_dbg(dev, "btt: %s\n", rc == 0 ? dev_name(btt_dev) : "<none>");
 	if (rc < 0) {
 		struct nd_btt *nd_btt = to_nd_btt(btt_dev);
@@ -358,6 +358,7 @@ struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
 	nvdimm_bus->dev.release = nvdimm_bus_release;
 	nvdimm_bus->dev.groups = nd_desc->attr_groups;
 	nvdimm_bus->dev.bus = &nvdimm_bus_type;
+	nvdimm_bus->dev.of_node = nd_desc->of_node;
 	dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
 	rc = device_register(&nvdimm_bus->dev);
 	if (rc) {
@@ -984,8 +985,8 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
 
 	if (cmd == ND_CMD_CALL) {
 		func = pkg.nd_command;
-		dev_dbg(dev, "%s:%s, idx: %llu, in: %u, out: %u, len %llu\n",
-				__func__, dimm_name, pkg.nd_command,
+		dev_dbg(dev, "%s, idx: %llu, in: %u, out: %u, len %llu\n",
+				dimm_name, pkg.nd_command,
 				in_len, out_len, buf_len);
 	}
 
@@ -996,8 +997,8 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
 		u32 copy;
 
 		if (out_size == UINT_MAX) {
-			dev_dbg(dev, "%s:%s unknown output size cmd: %s field: %d\n",
-					__func__, dimm_name, cmd_name, i);
+			dev_dbg(dev, "%s unknown output size cmd: %s field: %d\n",
+					dimm_name, cmd_name, i);
 			return -EFAULT;
 		}
 		if (out_len < sizeof(out_env))
@@ -1012,9 +1013,8 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
 
 	buf_len = (u64) out_len + (u64) in_len;
 	if (buf_len > ND_IOCTL_MAX_BUFLEN) {
-		dev_dbg(dev, "%s:%s cmd: %s buf_len: %llu > %d\n", __func__,
-				dimm_name, cmd_name, buf_len,
-				ND_IOCTL_MAX_BUFLEN);
+		dev_dbg(dev, "%s cmd: %s buf_len: %llu > %d\n", dimm_name,
+				cmd_name, buf_len, ND_IOCTL_MAX_BUFLEN);
 		return -EINVAL;
 	}
@@ -148,7 +148,7 @@ ssize_t nd_namespace_store(struct device *dev,
 	char *name;
 
 	if (dev->driver) {
-		dev_dbg(dev, "%s: -EBUSY\n", __func__);
+		dev_dbg(dev, "namespace already active\n");
 		return -EBUSY;
 	}
Some files were not shown because too many files have changed in this diff.