Skip to content

Use alignment builtins for ZEND_MM_ALIGNED macros#21913

Open
tobil4sk wants to merge 3 commits intophp:masterfrom
tobil4sk:fix/zend-mm-aligned-builtins
Open

Use alignment builtins for ZEND_MM_ALIGNED macros#21913
tobil4sk wants to merge 3 commits intophp:masterfrom
tobil4sk:fix/zend-mm-aligned-builtins

Conversation

@tobil4sk
Copy link
Copy Markdown
Contributor

@tobil4sk tobil4sk commented Apr 30, 2026

Where available, these provide a more descriptive way of aligning pointers than direct bitwise manipulation. The additional abstraction they provide also makes them more compatible with platforms like CHERI.

Also, the non-builtin ZEND_MM_ALIGNED_BASE implementation now casts to uintptr_t instead of size_t, since the returned value tends to be used as a pointer, not a size. This makes intent clearer and also improves compatibility.

These patches are from a port of PHP to pure-capability CHERI (where pointers are wider than size_t and narrowing invalidates the pointer). Using these builtins improves compatibility generally and they seem preferable due to the abstraction. I thought I would send this patch upstream in case it is of interest.

See: https://clang.llvm.org/docs/LanguageExtensions.html#alignment-builtins

On some architectures, e.g. CHERI, pointers are wider than size_t so
converting to size_t causes part of the pointer to be lost, which means
it cannot be cast back into a pointer. It can be preserved by casting to
uintptr_t instead.
Where available, these provide a more descriptive way of aligning
pointers than direct bitwise manipulation. They are also more compatible
with platforms like CHERI.
Copy link
Copy Markdown
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'm in favor of using compiler intrinsics where possible, so this looks like a reasonable addition.

Comment thread Zend/zend_alloc.c Outdated
Comment thread Zend/zend_alloc.c Outdated
#define ZEND_MM_ALIGNED_OFFSET(size, alignment) \
#ifdef PHP_HAVE_BUILTIN_ALIGN_DOWN
# define ZEND_MM_ALIGNED_OFFSET(size, alignment) \
(((size_t)(size)) - (size_t)__builtin_align_down((size), (alignment)))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you can use ZEND_MM_ALIGNED_BASE for the latter half of this expression.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! I have adjusted the argument name and factored the offset macro for the builtin implementation. I suppose the non-builtin one could be implemented like this too, but the macro would generate more code in that case. What would you prefer?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok to keep as-is.

Since it is clear that it is a pointer now, we may use
ZEND_MM_ALIGNED_BASE for the builtin implementation.
Copy link
Copy Markdown
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of interest, how much does PHP profit from CHERI with its own allocator? E.g. can CHERI detect overreads of individual PHP allocations or is the safety mostly limited to allocations coming from the standard malloc implementation?

Comment thread Zend/zend_alloc.c
#define ZEND_MM_SIZE_TO_NUM(size, alignment) \
#ifdef PHP_HAVE_BUILTIN_ALIGN_DOWN
# define ZEND_MM_ALIGNED_BASE(ptr, alignment) \
__builtin_align_down((ptr), (alignment))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this a bit more, I'm not sure we want the result of ZEND_MM_ALIGNED_BASE to be of the type of ptr, as the base has a different type in every case. I guess we could cast to void* before passing to __builtin_align_down? Then the result is void*, which would force coercion to an appropriate type. WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there would be benefit in ZEND_MM_ALIGNED_BASE mapping directly to the builtin.

That being said, currently the two implementations return different types (uintptr_t vs T*). I guess it might be more consistent to ensure the result for both is void*.

Comment thread Zend/zend_alloc.c

#ifdef PHP_HAVE_BUILTIN_ALIGN_UP
# define ZEND_MM_SIZE_TO_NUM(size, alignment) \
(__builtin_align_up((size), (alignment)) / (alignment))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this usage of __builtin_align_up() actually correct when the input is size_t rather than some pointer type?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this one actually only works with size_t (or another integer type) because of the / operator which does not accept a pointer operand. I think this is OK because the macro name explicitly states that it operates on a size, not a pointer.

Comment thread Zend/zend_alloc.c Outdated
#define ZEND_MM_ALIGNED_OFFSET(size, alignment) \
#ifdef PHP_HAVE_BUILTIN_ALIGN_DOWN
# define ZEND_MM_ALIGNED_OFFSET(size, alignment) \
(((size_t)(size)) - (size_t)__builtin_align_down((size), (alignment)))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok to keep as-is.

@tobil4sk
Copy link
Copy Markdown
Contributor Author

tobil4sk commented May 2, 2026

Out of interest, how much does PHP profit from CHERI with its own allocator? E.g. can CHERI detect overreads of individual PHP allocations or is the safety mostly limited to allocations coming from the standard malloc implementation?

For the PHP allocator to benefit from spatial safety (i.e. disallow overreads), it requires CHERI-specific code to be added to the allocator to restrict object bounds: tobil4sk@239c0cf. As part of my work on the CheriBSD port, I tested it against the sample in GHSA-h96m-rvf9-jgm2 and some artificially reintroduced past vulnerabilities.

Further changes would be required for temporal safety (i.e. use-after-free protection). It is currently achievable with USE_ZEND_ALLOC=0 to use malloc instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants