HADOOP-19604_branch-3.4. ABFS: BlockId generation based on blockCount along with full blob md5 computation change #7819

anmolanmol1234 · 2025-07-23T07:50:55Z

Jira: https://issues.apache.org/jira/browse/HADOOP-19604

Compute BlockId from blockCount instead of offset, so that IDs are consistent across clients for PutBlock and PutBlockList.
Block IDs were previously derived from the data offset, which could lead to inconsistency across different clients. The change now uses blockCount (i.e., the index of the block) to compute the Block ID, ensuring deterministic and consistent ID generation for both PutBlock and PutBlockList operations across clients.
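As an illustration of index-based ID generation, here is a minimal sketch. The class `BlockIdSketch`, the method `blockIdFor`, and the zero-padded six-digit layout are hypothetical and not taken from the patch. The sketch only relies on the Blob service requirement that block IDs are Base64-encoded and have the same length for every block in a blob.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

// Hypothetical sketch; names and ID layout are assumptions, not the ABFS implementation.
final class BlockIdSketch {

  // Derives the ID from the block's index alone, so any client that uploads
  // "block number N" computes the same ID regardless of data offsets.
  static String blockIdFor(long blockIndex) {
    // Zero-pad so every raw ID has the same length before encoding.
    String raw = String.format(Locale.ROOT, "%06d", blockIndex);
    return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    for (long i = 0; i < 3; i++) {
      System.out.println(i + " -> " + blockIdFor(i));
    }
  }
}
```

Because the ID depends only on the index, the ID sent with PutBlock for the Nth block and the ID listed for it in PutBlockList can be recomputed independently and still match.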

Restrict URL encoding of certain JSON metadata during setXAttr calls.
When setting extended attributes (xAttrs), the JSON metadata (hdi_permission) was previously URL-encoded, which could cause unnecessary escaping or compatibility issues. This change restricts URL encoding to the metadata that requires it.
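A selective-encoding step could look roughly like the sketch below. The class `XAttrEncodingSketch`, the `RAW_KEYS` set, and the choice to pass `hdi_permission` through untouched are assumptions made for illustration. The actual rule used by the patch may differ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the ABFS client code.
final class XAttrEncodingSketch {

  // Assumption: keys whose JSON values are sent as-is, without URL encoding.
  static final Set<String> RAW_KEYS = Collections.singleton("hdi_permission");

  // Returns a copy of the attributes with every value URL-encoded,
  // except the values of keys listed in RAW_KEYS.
  static Map<String, String> encodeForRequest(Map<String, String> xattrs) {
    Map<String, String> out = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : xattrs.entrySet()) {
      String value = e.getValue();
      out.put(e.getKey(), RAW_KEYS.contains(e.getKey()) ? value : urlEncode(value));
    }
    return out;
  }

  private static String urlEncode(String value) {
    try {
      // Java 8 compatible overload; UTF-8 is always supported.
      return URLEncoder.encode(value, StandardCharsets.UTF_8.name());
    } catch (UnsupportedEncodingException ex) {
      throw new IllegalStateException(ex);
    }
  }
}
```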

Maintain the MD5 hash of the whole block to validate data integrity during flush.
During flush operations, the MD5 hash of the entire block's data is computed and stored. This hash is later used to validate that the block was persisted correctly, ensuring data integrity and helping detect corruption or transmission errors.
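The integrity check can be pictured with the standard `java.security.MessageDigest` API. The sketch below is an assumption about the general shape only. `Md5Sketch` and its methods are hypothetical names, and how the real client feeds data into the digest and where it sends the result are not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical sketch; not the ABFS output stream code.
final class Md5Sketch {

  static MessageDigest newMd5() {
    try {
      return MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      // MD5 is a required algorithm on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  // One-shot digest of a complete buffer, Base64-encoded.
  static String md5Of(byte[] data) {
    return Base64.getEncoder().encodeToString(newMd5().digest(data));
  }

  // Running digest: feed each written chunk, then finish at flush time.
  static String md5OfChunks(byte[][] chunks) {
    MessageDigest digest = newMd5();
    for (byte[] chunk : chunks) {
      digest.update(chunk, 0, chunk.length);
    }
    return Base64.getEncoder().encodeToString(digest.digest());
  }

  public static void main(String[] args) {
    byte[][] chunks = {
        "hello ".getBytes(StandardCharsets.UTF_8),
        "world".getBytes(StandardCharsets.UTF_8)
    };
    String expected = md5Of("hello world".getBytes(StandardCharsets.UTF_8));
    // The running digest over the chunks matches the digest of the whole payload.
    System.out.println(expected.equals(md5OfChunks(chunks)));
  }
}
```

Keeping a running digest avoids re-reading the data at flush time: the hash is ready as soon as the last chunk has been written and can be compared with the value the service reports.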

… full blob md5 computation change (apache#7777) Contributed by Anmol Asrani

hadoop-yetus · 2025-07-23T10:09:43Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	18m 30s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 1s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+0 🆗	detsecrets	0m 1s		detect-secrets was not available.
+0 🆗	xmllint	0m 1s		xmllint was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 12 new or modified test files.
			_ branch-3.4 Compile Tests _
+1 💚	mvninstall	35m 56s		branch-3.4 passed
+1 💚	compile	0m 40s		branch-3.4 passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	0m 38s		branch-3.4 passed with JDK Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
+1 💚	checkstyle	0m 34s		branch-3.4 passed
+1 💚	mvnsite	0m 43s		branch-3.4 passed
+1 💚	javadoc	0m 42s		branch-3.4 passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	0m 36s		branch-3.4 passed with JDK Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
+1 💚	spotbugs	1m 8s		branch-3.4 passed
+1 💚	shadedclient	32m 4s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	0m 31s		the patch passed
+1 💚	compile	0m 33s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	0m 33s		the patch passed
+1 💚	compile	0m 29s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
+1 💚	javac	0m 29s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 21s		hadoop-tools/hadoop-azure: The patch generated 0 new + 14 unchanged - 1 fixed = 14 total (was 15)
+1 💚	mvnsite	0m 33s		the patch passed
+1 💚	javadoc	0m 27s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	0m 26s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
+1 💚	spotbugs	1m 7s		the patch passed
+1 💚	shadedclient	36m 37s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 32s		hadoop-azure in the patch passed.
+1 💚	asflicense	0m 38s		The patch does not generate ASF License warnings.
		137m 31s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7819/1/artifact/out/Dockerfile
GITHUB PR	#7819
Optional Tests	dupname asflicense codespell detsecrets xmllint compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle
uname	Linux 6a669777139a 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	branch-3.4 / `d086393`
Default Java	Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7819/1/testReport/
Max. process+thread count	553 (vs. ulimit of 5500)
modules	C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7819/1/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

HADOOP-19604. ABFS: BlockId generation based on blockCount along with…

d086393

… full blob md5 computation change (apache#7777) Contributed by Anmol Asrani

anmolanmol1234 changed the title ~~HADOOP-19604. ABFS: BlockId generation based on blockCount along with full blob md5 computation change~~ HADOOP-19604_branch-3.4. ABFS: BlockId generation based on blockCount along with full blob md5 computation change Jul 23, 2025
