fix(version_schemes): support arbitrary semver pre-release labels #1548

Open
bearomorphism wants to merge 1 commit into commitizen-tools:master from bearomorphism:bear-fix-semver-not-fully-covered

Conversation

@bearomorphism
Collaborator

@bearomorphism bearomorphism commented Jun 12, 2025

Description

Fixes #950

This PR fixes a bug where commitizen 3.x raises InvalidVersion (or InvalidVersionPart) when encountering git tags with arbitrary semver pre-release identifiers (e.g., v0.7.1-release, v0.0.1-SNAPSHOT). These are valid semver identifiers per SemVer §9, but the previous regex pattern didn't accept them.

Changes

1. Extended _VERSION_PATTERN regex (version_schemes.py)

The original regex only allowed alpha|beta|preview|rc|a|b|c as pre-release labels. The new pattern adds a branch, [a-zA-Z-]+, that matches any label made of ASCII letters and hyphens, so alphabetic SemVer pre-release identifiers are accepted.

Before: (alpha|beta|preview|rc|a|b|c) — only PEP 440 labels accepted
After: (alpha|beta|preview|rc|a|b|c|[a-zA-Z-]+) — also accepts arbitrary labels
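
To make the difference between the two alternations concrete, here is a minimal sketch using only the standard library. It is not the actual _VERSION_PATTERN: `compile_pre`, `OLD_LABEL`, and `NEW_LABEL` are hypothetical names, and the surrounding pattern is reduced to a release number plus an optional pre-release number.

```python
import re

# Simplified stand-ins for the pre-release label branch only. The real
# _VERSION_PATTERN also handles epoch, post, dev and local segments.
OLD_LABEL = r"(?P<pre_l>alpha|beta|preview|rc|a|b|c)"
NEW_LABEL = r"(?P<pre_l>alpha|beta|preview|rc|a|b|c|[a-zA-Z-]+)"


def compile_pre(label: str) -> re.Pattern:
    """Build a toy pattern: release, optional separator, label, optional number."""
    return re.compile(
        rf"v?\d+(?:\.\d+)*[-_.]?{label}(?P<pre_n>\d+)?",
        re.IGNORECASE,
    )


old = compile_pre(OLD_LABEL)
new = compile_pre(NEW_LABEL)

for tag in ("v0.7.1-release", "v0.0.1-SNAPSHOT", "1.0.0-rc1"):
    old_ok = old.fullmatch(tag) is not None
    new_match = new.fullmatch(tag)
    print(tag, "old:", old_ok, "new label:", new_match["pre_l"] if new_match else None)
```

In this sketch the old alternation rejects `v0.7.1-release`, while the new one captures `release` as the label. Known labels such as `rc` still match through the earlier branches, because the alternation is tried left to right.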

2. Widened prerelease parameter type from Prerelease | None to str | None

The Prerelease type alias is Literal["alpha", "beta", "rc"], which is too narrow when the system needs to handle arbitrary labels parsed from existing tags. The CLI still restricts user input to the three known labels; this type widening only affects the internal API.
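
The split between a restricted CLI and a wider internal type could look roughly like the sketch below. Only the `Prerelease` alias comes from the description above; `parse_cli_prerelease` and `describe_bump` are hypothetical helpers invented for illustration, not commitizen functions.

```python
from typing import Literal, Optional, get_args

# Alias quoted in the PR description.
Prerelease = Literal["alpha", "beta", "rc"]


def parse_cli_prerelease(value: str) -> str:
    """Hypothetical CLI-side check: only the three known labels are accepted."""
    if value not in get_args(Prerelease):
        raise ValueError(f"unsupported prerelease: {value!r}")
    return value


def describe_bump(prerelease: Optional[str]) -> str:
    """Hypothetical internal helper: any label is allowed here.

    Optional[str] is the same type as `str | None`.
    """
    if prerelease is None:
        return "final release"
    return f"pre-release '{prerelease.lower()}'"
```

With this shape, user input is still validated against the `Literal` members, while labels parsed from existing tags can flow through the internal API without a type error.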

3. Fixed generate_prerelease() comparison logic

The old code used startswith() to match the incoming prerelease label against the current pre-release phase. This was subtly buggy: "alphabeta".startswith("a") is True, so a label like "alphabeta" would incorrectly be treated as continuing an "alpha" series.

The new logic:

  • Normalizes the incoming label using the same mapping packaging uses internally ("alpha"→"a", "beta"→"b", "rc"→"rc", others→lowercase)
  • For known PEP 440 labels (a, b, rc): uses max() ordering to prevent down-bumping phases (e.g., won't go from b1 back to a2)
  • For arbitrary labels: uses strict equality comparison — no ordering assumptions since arbitrary labels have no defined precedence

This also fixes a potential case-sensitivity issue: packaging normalizes "SNAPSHOT" in a tag to "snapshot", so arbitrary incoming labels are now lowercased before the comparison.
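
The rules above can be sketched as a small standalone function. This is an illustration of the described behavior, not the code in the PR: `normalize_label` and `next_prerelease` are hypothetical names, the mapping follows the description above, and starting a new series at 0 is an assumption.

```python
from typing import Optional, Tuple

# Mapping as described in the PR text; every other label is just lowercased.
_CANONICAL = {"alpha": "a", "beta": "b", "rc": "rc"}
_PEP440_ORDER = ("a", "b", "rc")


def normalize_label(label: str) -> str:
    """Lowercase the label and map long PEP 440 names to their short form."""
    lowered = label.lower()
    return _CANONICAL.get(lowered, lowered)


def next_prerelease(current: Optional[Tuple[str, int]], requested: str) -> Tuple[str, int]:
    """Return the (label, number) of the next pre-release.

    `current` is the already-normalized pre-release of the existing
    version, or None for a final release.
    """
    label = normalize_label(requested)
    if current is None:
        return (label, 0)  # assumption: a new series starts at 0

    cur_label, cur_n = current
    if label in _PEP440_ORDER and cur_label in _PEP440_ORDER:
        # Known phases are ordered, so never move back to an earlier one.
        label = max(label, cur_label, key=_PEP440_ORDER.index)

    if label == cur_label:  # strict equality, no startswith()
        return (label, cur_n + 1)
    return (label, 0)


print(next_prerelease(("b", 1), "alpha"))            # stays in the beta phase
print(next_prerelease(("snapshot", 0), "SNAPSHOT"))  # case-insensitive match
print(next_prerelease(("alphabeta", 3), "alpha"))    # not treated as the same series
```

Because arbitrary labels are compared by equality only, `alphabeta` and `alpha` are distinct series here, which is the case the old `startswith()` check got wrong.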

Root Cause Analysis

When commitizen discovers existing tags (e.g., during cz bump), it calls:

Version(tag_string)  →  packaging.version.Version.__init__()
                     →  self._regex.fullmatch(version)

The _regex on BaseVersion was inherited from packaging.version.Version, which only accepts PEP 440 pre-release labels. A tag like v0.7.1-release would fail the regex match and raise InvalidVersion.

By overriding _regex with a pattern that also accepts arbitrary identifiers, the version can be parsed successfully. The rest of the version logic (epoch, release, post, dev, local) remains unchanged.
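
The override mechanism can be shown with a toy model that needs only the standard library. `StrictVersion` and `LenientVersion` are hypothetical stand-ins, not packaging or commitizen classes; the patterns are heavily simplified, and `ValueError` is used in place of `InvalidVersion`.

```python
import re


class StrictVersion:
    """Toy stand-in for a parser with a class-level regex used in __init__."""

    _regex = re.compile(
        r"v?(?P<release>[0-9]+(?:\.[0-9]+)*)"
        r"(?:[-_.]?(?P<pre_l>alpha|beta|preview|rc|a|b|c)(?P<pre_n>[0-9]+)?)?",
        re.IGNORECASE,
    )

    def __init__(self, version: str) -> None:
        # self._regex resolves to the subclass attribute when one is defined.
        match = self._regex.fullmatch(version)
        if match is None:
            raise ValueError(f"Invalid version: {version!r}")
        self.release = tuple(int(part) for part in match["release"].split("."))
        self.pre_label = match["pre_l"].lower() if match["pre_l"] else None


class LenientVersion(StrictVersion):
    """Same parsing code; only the class attribute is replaced."""

    _regex = re.compile(
        r"v?(?P<release>[0-9]+(?:\.[0-9]+)*)"
        r"(?:[-_.]?(?P<pre_l>alpha|beta|preview|rc|a|b|c|[a-zA-Z-]+)(?P<pre_n>[0-9]+)?)?",
        re.IGNORECASE,
    )


version = LenientVersion("v0.7.1-release")
print(version.release, version.pre_label)
```

The constructor is inherited unchanged, which mirrors the point made above: replacing the pattern is enough for the tag to parse, while the rest of the version logic stays as it was.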

Test Cases Added

| Input Version | Increment | Prerelease | Expected Output |
| --- | --- | --- | --- |
| v1.0.0-reallyweird | PATCH | reallyweird | 1.0.0-reallyweird1 |
| v0.7.1-release | PATCH | release | 0.7.1-release1 |
| v0.0.1-SNAPSHOT | PATCH | SNAPSHOT | 0.0.1-snapshot1 |

AI Disclosure

This PR was revived and updated with AI assistance (GitHub Copilot). The original fix concept came from the community contributor.

Comment thread tests/test_version_scheme_semver.py Outdated
@bearomorphism bearomorphism force-pushed the bear-fix-semver-not-fully-covered branch from 877fcd6 to da46193 June 13, 2025 13:09
@codecov

codecov Bot commented Jun 13, 2025

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.

@bearomorphism bearomorphism changed the title from "WIP fix semver not fully covered" to "fix semver not fully covered" Aug 13, 2025
@bearomorphism bearomorphism marked this pull request as ready for review August 13, 2025 01:30
@Lee-W Lee-W added this to the 4.9.0 milestone Aug 17, 2025
Comment thread tests/test_version_scheme_semver.py Outdated
Comment thread commitizen/version_schemes.py Outdated
@bearomorphism
Collaborator Author

I will update this PR this week when I have bandwidth.

@bearomorphism bearomorphism force-pushed the bear-fix-semver-not-fully-covered branch from da46193 to 3081fff September 8, 2025 08:34
@Lee-W Lee-W modified the milestones: 4.9.0, 4.9.1 Sep 9, 2025
@Lee-W Lee-W deleted the branch commitizen-tools:master September 9, 2025 06:09
@Lee-W Lee-W closed this Sep 9, 2025
@bearomorphism
Collaborator Author

Maybe we can adjust the workflow. Several PRs got closed just because the target branch was deleted.

@Lee-W Lee-W reopened this Sep 9, 2025
@Lee-W Lee-W changed the base branch from v4-9-0-test to master September 9, 2025 14:22
@Lee-W
Member

Lee-W commented Sep 9, 2025

Yep, this was a temporary workflow. We usually don't receive this many PRs or review them this fast. It may be worth rethinking it and cutting an rc, to avoid what we encountered yesterday as well.

@Lee-W Lee-W removed this from the 4.9.1 milestone Sep 9, 2025
@Lee-W Lee-W added this to the 4.9.2 milestone Sep 9, 2025
@bearomorphism bearomorphism force-pushed the bear-fix-semver-not-fully-covered branch 2 times, most recently from 8ee1dbe to bb48144 September 13, 2025 09:45
Comment thread commitizen/version_schemes.py Outdated
@bearomorphism bearomorphism marked this pull request as draft September 27, 2025 02:13
@bearomorphism bearomorphism force-pushed the bear-fix-semver-not-fully-covered branch 2 times, most recently from 565f319 to 72fbe95 May 3, 2026 06:48
@bearomorphism bearomorphism changed the title from "fix semver not fully covered" to "fix(version_schemes): support arbitrary semver pre-release labels" May 3, 2026
@bearomorphism bearomorphism force-pushed the bear-fix-semver-not-fully-covered branch 2 times, most recently from 9198eb4 to 9235842 May 3, 2026 11:25
@bearomorphism bearomorphism marked this pull request as ready for review May 3, 2026 11:26
@bearomorphism
Collaborator Author

Updated PR with new description

Contributor

Copilot AI left a comment

Pull request overview

Extends Commitizen’s version parsing to avoid InvalidVersion failures when encountering git tags that use non-PEP440, SemVer-style prerelease labels (e.g., v0.7.1-release, v0.0.1-SNAPSHOT), addressing issue #950.

Changes:

  • Widen VersionProtocol.bump() / BaseVersion.bump() prerelease typing from a restricted Literal[...] to str | None.
  • Override packaging.version.Version’s parsing regex via BaseVersion._regex to accept additional prerelease labels.
  • Add SemVer/SemVer2 test cases to validate bumping with non-standard prerelease labels.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
commitizen/version_schemes.py Broadens prerelease typing and overrides the underlying version regex used for parsing.
tests/utils.py Updates test argument typing to match widened prerelease type (`str
tests/test_version_scheme_semver.py Adds regression tests for SemVer bumping with arbitrary prerelease labels and v-prefixed tags.
tests/test_version_scheme_semver2.py Adds a SemVer2 regression test for bumping with an arbitrary prerelease label.

Comment on lines +243 to +247
# arbitrary semver pre-release labels (issue #950)
(
    VersionSchemeTestArgs(
        current_version="1.0.0-reallyweird",
        increment="PATCH",
Collaborator Author

Addressed. Added equivalent SemVer2 test cases for 0.7.1-release and 0.0.1-SNAPSHOT alongside the existing 1.0.0-reallyweird case.

Comment thread commitizen/version_schemes.py Outdated
Comment on lines +166 to +170
(post|rev|r|dev)
[-_\.]?
([0-9]+)?
$)
[a-z]+? # match any letters (semver support)
Collaborator Author

Good catch. Widened the pattern from [a-z]+? to [a-z]+(?:-[a-z]+)* which now accepts hyphenated labels like pre-release. For labels containing digits (e.g., foo1bar), the regex structure separates the numeric suffix via the pre_n group (foo + 1), which is consistent with how packaging handles it. Purely numeric identifiers like 1.0.0-1 are already handled by the post_n1 group (numeric-only post release). I've updated the PR description to clarify this scoping.
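
A small sketch of the widened label branch in isolation may help. `PRE` and `split_pre` are hypothetical names, and the pattern below covers only the label and its numeric suffix, not the full version regex.

```python
import re
from typing import Optional, Tuple

# Label branch from the reply above, followed by an optional numeric suffix
# that plays the role of the pre_n group.
PRE = re.compile(r"(?P<pre_l>[a-z]+(?:-[a-z]+)*)(?P<pre_n>[0-9]+)?", re.IGNORECASE)


def split_pre(identifier: str) -> Optional[Tuple[str, Optional[str]]]:
    """Split a pre-release identifier into (label, number), or return None."""
    match = PRE.fullmatch(identifier)
    if match is None:
        return None
    return (match["pre_l"], match["pre_n"])


for identifier in ("pre-release", "release1", "SNAPSHOT", "pre-"):
    print(identifier, "->", split_pre(identifier))
```

Unlike a plain character class such as `[a-zA-Z-]+`, this shape requires letters on both sides of each hyphen, so an identifier with a trailing hyphen does not match in this sketch.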

Comment thread commitizen/version_schemes.py Outdated
(post|rev|r|dev)
[-_\.]?
([0-9]+)?
$)
Collaborator Author

Fixed. The negative lookahead now uses (\+|$) instead of just $, so reserved labels like post and dev are correctly excluded even when followed by a +local segment. Added a test case 1.0.0-release+local123 to verify this.
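
The effect of ending the lookahead with `(\+|$)` rather than `$` can be reproduced in a reduced form. The guards below are reconstructed from the fragment quoted in this thread and are an approximation; `label_pattern`, `RESERVED_OLD`, and `RESERVED_NEW` are hypothetical names, and the local segment is simplified.

```python
import re

# Negative lookaheads that keep reserved post/dev labels out of the
# arbitrary-label branch. Only the terminator differs between the two.
RESERVED_OLD = r"(?!(?:post|rev|r|dev)[-_.]?[0-9]*$)"
RESERVED_NEW = r"(?!(?:post|rev|r|dev)[-_.]?[0-9]*(?:\+|$))"


def label_pattern(guard: str) -> re.Pattern:
    """Toy pattern: guarded label, optional number, optional +local segment."""
    return re.compile(
        rf"{guard}(?P<pre_l>[a-z]+(?:-[a-z]+)*)(?P<pre_n>[0-9]+)?"
        r"(?:\+(?P<local>[a-z0-9.]+))?",
        re.IGNORECASE,
    )


old = label_pattern(RESERVED_OLD)
new = label_pattern(RESERVED_NEW)

for text in ("post1+local", "dev", "release+local123"):
    print(text, "old:", bool(old.fullmatch(text)), "new:", bool(new.fullmatch(text)))
```

With the `$`-only guard, `post1+local` slips through as an arbitrary label because the lookahead never reaches the end of the string. Allowing `+` as a terminator closes that gap while still letting `release+local123` match.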

Comment thread commitizen/version_schemes.py Outdated
Comment on lines +197 to +201
"""
A base class implementing the `VersionProtocol` for PEP440-like versions.
"""

_regex: re.Pattern = re.compile(_VERSION_PATTERN, re.VERBOSE | re.IGNORECASE)
Collaborator Author

Addressed. Moved the _regex override from BaseVersion to SemVer (which SemVer2 inherits). Pep440 now retains the strict PEP 440 regex from packaging.version.Version. Added an explicit test test_pep440_rejects_arbitrary_prerelease_labels() to lock this in.

@bearomorphism bearomorphism force-pushed the bear-fix-semver-not-fully-covered branch from 9fff07f to 4a72d21 May 3, 2026 11:38
Extend BaseVersion with a custom _VERSION_PATTERN regex that accepts
arbitrary pre-release identifiers (e.g., -release, -SNAPSHOT,
-reallyweird) instead of only PEP 440's alpha/beta/rc.

This fixes InvalidVersion errors when using tags like v0.7.1-release
or v0.0.1-SNAPSHOT with commitizen's changelog and bump commands.

Closes commitizen-tools#950

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bearomorphism bearomorphism force-pushed the bear-fix-semver-not-fully-covered branch from 4a72d21 to ac1bc5f May 3, 2026 11:41

Development

Successfully merging this pull request may close these issues.

Help wanted: commitizen 3.x not supoort generate CHANGELOG with tag format like v1.0.0-release

3 participants