
Move all cuda.core.system enums into cuda.core.system.typing #2022

Merged
mdboom merged 13 commits into NVIDIA:main from mdboom:re-expose-enums3 on May 5, 2026

Conversation

@mdboom (Contributor) commented May 5, 2026

This is a follow-on to #2014, and based on a comment in #2016 suggesting that all of these new enums should go in a separate typing module dedicated to enums and type annotations.

For cuda.core.system, we decided to put the enums in cuda.core.system.typing rather than cuda.core.typing because cuda.core.system is deliberately designed to be a little bit independent of CUDA. (It could become its own package someday, or be under a different namespace etc.)

This also addresses a few small bugs in the test_enum_coverage.py tests that were discovered while working on #2016. Otherwise, this PR is exclusively moving content and updating imports and doc references accordingly.

@mdboom mdboom self-assigned this May 5, 2026
@github-actions bot added the cuda.core (Everything related to the cuda.core module) label May 5, 2026
@mdboom mdboom added the P0 (High priority - Must do!) and breaking (Breaking changes are introduced) labels May 5, 2026
@mdboom mdboom added this to the cuda.core v1.0.0 milestone May 5, 2026
@github-actions

This comment has been minimized.

@mdboom mdboom force-pushed the re-expose-enums3 branch from f1a36d6 to b9797a8 on May 5, 2026 17:32
@rwgk (Contributor) commented May 5, 2026

I have a naming concern: When I see typing, I expect Python type hints, protocols, or type aliases. This module is mostly runtime enums (plus related public types like FieldId / DeviceArch), so typing feels a little misleading. If the goal is "public enums for the system API", enum_types.py seems clearer. If the goal is the broader "public named types used by cuda.core.system", then types.py feels like a better umbrella than typing.py.

@mdboom (Contributor, Author) commented May 5, 2026

I have a naming concern: When I see typing, I expect Python type hints, protocols, or type aliases. This module is mostly runtime enums (plus related public types like FieldId / DeviceArch), so typing feels a little misleading. If the goal is "public enums for the system API", enum_types.py seems clearer. If the goal is the broader "public named types used by cuda.core.system", then types.py feels like a better umbrella than typing.py.

@leofang: What do you think? I know you sort of see enums as a type-checking feature (though they are a bit more than that). I'm on the fence. If we change here we should also change cuda.core.typing.

@rwgk (Contributor) commented May 5, 2026

This is a pure agent review. I'm posting it here before drilling down myself, for visibility because of our approaching deadlines.


Cursor GPT-5.4 Extra High Fast

Findings

  • High cuda_core/cuda/core/system/__init__.py:21 now eagerly imports typing before the existing compatibility gate in cuda_core/cuda/core/system/_system.pyx:13. But cuda_core/cuda/core/system/typing.py:329 only suppresses ImportError, while it still dereferences version-specific NVML symbols like cuda_core/cuda/core/system/typing.py:352. So an older or otherwise incompatible cuda.bindings install can still fail at import cuda.core.system with AttributeError instead of cleanly falling back to CUDA_BINDINGS_NVML_IS_COMPATIBLE = False. The fresh commits look like they addressed a CI/import cleanup issue, but I don’t think they closed this compatibility hole.

  • Medium cuda_core/cuda/core/system/_device.pyx:1121 still removes Utilization from _device.__all__, which means cuda.core.system.Utilization disappears from the public package surface. cuda_core/tests/system/test_system_device.py:783 was updated to assert against _device.Utilization instead, so the tests now normalize the API drop rather than catch it. Unless the intent is to demote that type to private, this looks like unrelated public API churn.

  • Medium cuda_core/tests/system/test_system_device.py:503 now asserts isinstance(near_device, typing.Device), but cuda_core/cuda/core/system/typing.py does not define Device. That is a real test bug on the current tip, even if CI may not hit it today because the test is hardware/topology dependent.

  • Low cuda_core/tests/test_enum_coverage.py:25 and cuda_core/tests/test_enum_coverage.py:256 narrow test_all_str_enums_in_cases() from package-wide discovery to a hand-maintained _MODULES list. That fixes the immediate missing-test problem, but it weakens the original guardrail: future StrEnum additions elsewhere in cuda.core will no longer be discovered automatically.

Open Questions

  • Was the new from cuda.core.system import typing in cuda_core/cuda/core/system/__init__.py:21 actually required? If the goal is just to support from cuda.core.system import typing, Python already resolves submodules there, so dropping the eager import may avoid the compatibility regression entirely.
  • Is demoting Utilization from cuda.core.system.Utilization to _device.Utilization intentional? The PR description still reads like a pure enum move plus test fixes, so that API change stands out.

Change Summary

  • I re-reviewed the current PR head 82c5f98. Since the earlier pass, two follow-up commits landed: Fix missing StrEnum test and Fix imports.
  • The CI picture is better now: static checks are green, several build legs have completed successfully, and the rest of the matrix is still in progress. So the branch is in a healthier state than before.
  • My updated read is that the earlier low-level CI issue likely was addressed, but the main compatibility concern still looks live, and there is now also a concrete test typo in test_get_nearest_gpus().

@mdboom (Contributor, Author) commented May 5, 2026

High cuda_core/cuda/core/system/__init__.py:21 now eagerly imports typing before the existing compatibility gate in cuda_core/cuda/core/system/_system.pyx:13

This is totally fine. The enum in question (DeviceArch) has been in cuda.bindings.nvml from the beginning.
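For reference, the mechanism behind the original finding can be sketched in isolation (the class and attribute names below are hypothetical stand-ins for an older cuda.bindings install, not real symbols): `contextlib.suppress(ImportError)` lets an `AttributeError` escape, so a gate written that way only protects against missing modules, not missing symbols.

```python
import contextlib

class _OldNvml:
    """Stand-in for an older bindings module that lacks newer symbols."""

compatible = True
try:
    with contextlib.suppress(ImportError):
        _OldNvml().some_newer_symbol  # raises AttributeError, not ImportError
except AttributeError:
    # suppress(ImportError) did not catch it; a gate that also needs to
    # tolerate missing symbols would have to suppress
    # (ImportError, AttributeError) to fall back cleanly.
    compatible = False

assert compatible is False
```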

Medium cuda_core/cuda/core/system/_device.pyx:1121 still removes Utilization from _device.__all__, which means cuda.core.system.Utilization disappears from the public package surface.

This was on purpose. It was unintentionally public before. It is now private like all other "helper" classes in cuda.core.system.

Medium cuda_core/tests/system/test_system_device.py:503 now asserts isinstance(near_device, typing.Device), but cuda_core/cuda/core/system/typing.py does not define Device

Good catch. Fixed.

Low cuda_core/tests/test_enum_coverage.py:25 and cuda_core/tests/test_enum_coverage.py:256 narrow test_all_str_enums_in_cases() from package-wide discovery to a hand-maintained _MODULES list. That fixes the immediate missing-test problem, but it weakens the original guardrail: future StrEnum additions elsewhere in cuda.core will no longer be discovered automatically.

This is fine. The sort of import hooks this once used are pretty broken by our "megapackage" approach. I think it's good enough to just declare the places where we might find public enums.
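A minimal sketch of the hand-maintained approach described here (the _MODULES entries and function name are illustrative, not the actual contents of test_enum_coverage.py):

```python
import enum
import importlib

# Illustrative stand-in; the real test would list the cuda.core modules
# where public enums may appear.
_MODULES = ["enum"]

def discover_str_enums(module_names):
    """Collect str-based Enum classes exposed by the listed modules."""
    found = []
    for name in module_names:
        mod = importlib.import_module(name)
        for attr in vars(mod).values():
            if (isinstance(attr, type)
                    and issubclass(attr, str)
                    and issubclass(attr, enum.Enum)):
                found.append(attr)
    return found
```

The trade-off, as noted, is that str-enum definitions in modules outside the list go untested until someone updates the list.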

Was the new from cuda.core.system import typing in cuda_core/cuda/core/system/__init__.py:21 actually required? If the goal is just to support from cuda.core.system import typing, Python already resolves submodules there, so dropping the eager import may avoid the compatibility regression entirely.

Yes, this is required to get around a cyclical import issue.

Is demoting Utilization from cuda.core.system.Utilization to _device.Utilization intentional? The PR description still reads like a pure enum move plus test fixes, so that API change stands out.

Yes.

@mdboom mdboom requested a review from rwgk May 5, 2026 19:02
@rwgk (Contributor) commented May 5, 2026

Logging a somewhat unusual observation:

https://github.com/NVIDIA/cuda-python/actions/runs/25394895050/job/74482106980?pr=2022

failed without logging any error message:

2026-05-05T18:55:12.3160422Z Downloading single artifact
2026-05-05T18:55:12.5147791Z Post job cleanup.

Agent take:

...
  So the best read is:

  • this was a failure inside actions/download-artifact,
  • very early in the download flow,
  • likely a transient GitHub artifact service / runner networking / action hiccup,
  • not a deterministic failure caused by your code changes.

  Why it looks like it “just stops”:
  • In the failed job, the log reaches Downloading single artifact and then immediately jumps to Post job cleanup.
  • In a successful sibling job, the next lines are:
    • Preparing to download the following artifacts:
    • artifact ID / expected digest
    • redirect to blob URL
    • Artifact download completed successfully.
  • Your failed job never gets that far, so it likely died before or during artifact metadata resolution, and GitHub didn’t preserve a
    useful Error: line or annotation for it.
...
  So my conclusion is: most likely transient infrastructure/action failure, not related to the enum PR. Rerunning just that failed job
  would be the normal next move.

@mdboom mdboom enabled auto-merge (squash) May 5, 2026 20:49
@mdboom mdboom merged commit 38c032f into NVIDIA:main May 5, 2026
180 of 182 checks passed
@github-actions

This comment has been minimized.

@github-actions bot commented May 6, 2026

Doc Preview CI
Preview removed because the pull request was closed or merged.
