Skip to content

arm64/dma-mapping: Fix arch_sync_dma_for_device to respect dir parameter #6995

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 13, 2025

Conversation

jc-kynesim
Copy link
Contributor

All other architectures do different cache operations depending on the dir parameter. Fix arm64 to do the same.

This fixes udmabuf operations when syncing for read e.g. when the CPU reads back a V4L2 decoded frame buffer.

I suspect it also fixes less common dma_heap/system issues too. It can be argued that these problems should be fixed in udmabuf & dma_heap but fixing here also works around any other hidden bugs in other drivers. This fix does not preclude fixing the other drivers as well.

All other architectures do different cache operations depending on the
dir parameter. Fix arm64 to do the same.

This fixes udmabuf operations when syncing for read e.g. when the CPU
reads back a V4L2 decoded frame buffer.

Signed-off-by: John Cox <jc@kynesim.co.uk>
Copy link
Contributor

@pelwell pelwell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a comfortable patch, because it make me wonder how we got this far without it, but I think it is correct.

@6by9
Copy link
Contributor

6by9 commented Aug 12, 2025

Have you pinged Maxime with context as I'd suggested? Otherwise it definitely deserves a query to mainline.

@jc-kynesim
Copy link
Contributor Author

No, so far I've come back from holiday & discussed it with you. Where/how do you suggest that I ping Maxime?

@pelwell
Copy link
Contributor

pelwell commented Aug 12, 2025

We could try just giving Maxime (@mripard) a mention here...

@6by9
Copy link
Contributor

6by9 commented Aug 12, 2025

Pinging here is fine, but need to provide the context of the use case that goes wrong.

(Otherwise Libreelec slack if no reply).

@jc-kynesim
Copy link
Contributor Author

jc-kynesim commented Aug 12, 2025

Longer form context of what I was doing to find this issue:

[Userland]
Alloc memory with memfd_create + ftruncate
Derive dmabuf from memfd with udmabuf
Close memfd
Queue dmabuf into V4L2 CAPTURE with QBUF
<decode a video frame>
Extract dmabuf from V4L2 CAPTURE with DQBUF
Map dmabuf for read with mmap
Sync for read with DMA_BUF_IOCTL_SYNC with (DMA_BUF_SYNC_START |
DMA_BUF_SYNC_READ)
Read buffer
Sync end
Unmap

I get old (zero) data out of the "Read buffer" stage in some cache lines.
It doesn't matter which way round the mmap & sync are

What happens in the Sync is:

udmabuf finds that none of the pages are currently mapped and maps them with dma_map_sgtable passing the direction flags to the mapping process. The mapping process picks some pages and after a few more calls in the stack get to dma_direct_map_page which flushes them with arch_sync_dma_for_device with the direction flags. This ignores the direction and does a clear, which in arm-speak appears to be writeback without invalidate - other drivers (including arm32) respect the dir flags and do one of invalidate, writeback, writeback + invalidate depending on them.

udmabuf then assumes that this is enough and returns. It appears that the pages picked can still have cached contents which are not invalidated.

In subsequent calls when the memory is mapped udmabuf does a dma_sync_sg_for_cpu which does work. (Doing the sync twice from userspace goes down this path the second time and works.)

Now it is arguable that udmabuf shouldn't rely on dma_map_sgtable to give it clear pages or that dma_direct_map_page is picking the wrong flush. However it is almost certainly a good idea for the arm64 arch_sync_dma_for_device to do the same as all the other drivers as there may be other similar fails elsewhere.

@jc-kynesim
Copy link
Contributor Author

Looks like this fixes my, noticeably rarer, dma_heap allocated dmabuf issues too so this isn't just an issue with udmabuf. If I can find 2 things that this fixes I'm not going to bet that there aren't others.

@pelwell
Copy link
Contributor

pelwell commented Aug 13, 2025

I'm going to merge this on the basis that it seems reasonable to all of us, and because it should get some in-the-field testing.

@pelwell pelwell merged commit dc3cfb9 into raspberrypi:rpi-6.12.y Aug 13, 2025
12 checks passed
@jc-kynesim
Copy link
Contributor Author

Did we want to leave this open for further comment, possibly by Maxime? or is merged, but still open not an available option? Not sure what the correct thing to do is?

@pelwell
Copy link
Contributor

pelwell commented Aug 13, 2025

Until we lock the conversation everyone is free to comment here.

popcornmix added a commit to raspberrypi/firmware that referenced this pull request Aug 13, 2025
See: raspberrypi/linux#6974

kernel: pisp_be: Stride alignment
See: raspberrypi/linux#6985

kernel: drm/panel: ilitek-ili9881c: Restore missing lanes configuration for nwe080 panel
See: raspberrypi/linux#6987

kernel: drivers: media: pisp_be: Fix alignment for V4L2_PIX_FMT_SRGGB8
See: raspberrypi/linux#6988

kernel: overlays: Fix sc16is752-spi1 emulation
See: raspberrypi/linux#6996

kernel: arm64/dma-mapping: Fix arch_sync_dma_for_device to respect dir parameter
See: raspberrypi/linux#6995
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this pull request Aug 13, 2025
See: raspberrypi/linux#6974

kernel: pisp_be: Stride alignment
See: raspberrypi/linux#6985

kernel: drm/panel: ilitek-ili9881c: Restore missing lanes configuration for nwe080 panel
See: raspberrypi/linux#6987

kernel: drivers: media: pisp_be: Fix alignment for V4L2_PIX_FMT_SRGGB8
See: raspberrypi/linux#6988

kernel: overlays: Fix sc16is752-spi1 emulation
See: raspberrypi/linux#6996

kernel: arm64/dma-mapping: Fix arch_sync_dma_for_device to respect dir parameter
See: raspberrypi/linux#6995
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants