target: Fix task count > 1 handling breakage and use max_sector page alignment

This patch addresses recent breakage with multiple se_task (task_count > 1)
operation following backend dev->se_sub_dev->se_dev_attrib.max_sectors in new
transport_allocate_data_tasks() code.  The initial bug here was a bogus
task->task_sg_nents assignment in transport_allocate_data_tasks() based on
the passed parameter, which now uses DIV_ROUND_UP(task_size, PAGE_SIZE) to
determine the proper number of per task SGL entries for the (task_count > 1)
case.

This also means we now need to enforce a PAGE_SIZE aligned max_sector count
value for this to work as expected without bringing back the pre v3.1
transport_map_mem_to_sg() logic to handle SGL offsets across multiple tasks.
So this patch adds se_dev_align_max_sectors() to round down max_sectors as
necessary to ensure this alignment via se_dev_set_default_attribs() and
se_dev_align_max_sectors() and keeps it simple for (task_count > 1)
operation.

So far this bugfix has been tested with (task_count > 1) operation
using iscsi-target and iblock backends.

Reported-by: Chris Boot <bootc@bootc.net>
Cc: Kiran Patil <kiran.patil@intel.com>
Cc: Andy Grover <agrover@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index ec3fbcd..4b5237f 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -839,6 +839,24 @@
 	return ret;
 }
 
+u32 se_dev_align_max_sectors(u32 max_sectors, u32 block_size)
+{
+	u32 tmp, aligned_max_sectors;
+	/*
+	 * Limit max_sectors to a PAGE_SIZE aligned value for modern
+	 * transport_allocate_data_tasks() operation.
+	 */
+	tmp = rounddown((max_sectors * block_size), PAGE_SIZE);
+	aligned_max_sectors = (tmp / block_size);
+	if (max_sectors != aligned_max_sectors) {
+		printk(KERN_INFO "Rounding down aligned max_sectors from %u"
+				" to %u\n", max_sectors, aligned_max_sectors);
+		return aligned_max_sectors;
+	}
+
+	return max_sectors;
+}
+
 void se_dev_set_default_attribs(
 	struct se_device *dev,
 	struct se_dev_limits *dev_limits)
@@ -878,6 +896,11 @@
 	 * max_sectors is based on subsystem plugin dependent requirements.
 	 */
 	dev->se_sub_dev->se_dev_attrib.hw_max_sectors = limits->max_hw_sectors;
+	/*
+	 * Align max_sectors down to PAGE_SIZE to follow transport_allocate_data_tasks()
+	 */
+	limits->max_sectors = se_dev_align_max_sectors(limits->max_sectors,
+						limits->logical_block_size);
 	dev->se_sub_dev->se_dev_attrib.max_sectors = limits->max_sectors;
 	/*
 	 * Set optimal_sectors from max_sectors, which can be lowered via
@@ -1242,6 +1265,11 @@
 			return -EINVAL;
 		}
 	}
+	/*
+	 * Align max_sectors down to PAGE_SIZE to follow transport_allocate_data_tasks()
+	 */
+	max_sectors = se_dev_align_max_sectors(max_sectors,
+				dev->se_sub_dev->se_dev_attrib.block_size);
 
 	dev->se_sub_dev->se_dev_attrib.max_sectors = max_sectors;
 	pr_debug("dev[%p]: SE Device max_sectors changed to %u\n",