feat(ci): add contributor reputation check workflow #1520
imran-siddique wants to merge 2 commits into github:main
Add automated screening for coordinated inauthentic contributions on PR/issue open events using AGT contributor reputation tools. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This PR targets main, but PRs should target staged.
The main branch is auto-published from staged and should not receive direct PRs.
Please close this PR and re-open it against the staged branch.
You can change the base branch using the Edit button at the top of this PR,
or run: gh pr edit 1520 --base staged
Pull request overview
Adds a new GitHub Actions workflow to automatically screen new PR/issue authors using the Agent Governance Toolkit (AGT) and to surface elevated-risk contributors via comments and labels.
Changes:
- Introduces `.github/workflows/contributor-check.yml`, triggered on `pull_request_target` and `issues` opened events.
- Runs two AGT Python checks (profile + credential audit) and computes an overall risk level.
- Posts a PR/issue comment and applies a `needs-review:<RISK>` label for MEDIUM/HIGH outcomes.
| - name: Checkout AGT scripts | ||
| uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 | ||
| with: | ||
| repository: microsoft/agent-governance-toolkit |
This workflow checks out and executes Python from an external repo (microsoft/agent-governance-toolkit) in a pull_request_target context with write permissions. To reduce supply-chain risk, pin the checkout to a specific immutable ref (commit SHA) and consider setting persist-credentials: false so the token isn’t written into the checked-out repo’s git config.
Suggested change:
| - repository: microsoft/agent-governance-toolkit | |
| + repository: microsoft/agent-governance-toolkit | |
| + ref: 0123456789abcdef0123456789abcdef01234567 # replace with a trusted AGT commit SHA | |
| + persist-credentials: false | |
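A quick way to tell a pinned ref from a mutable one is to check its shape. The sketch below is a format check only, assuming 40-character SHA-1 commit IDs; `is_full_sha` is a hypothetical helper name, and it does not verify that the commit exists or is trustworthy.

```shell
#!/usr/bin/env bash
# is_full_sha REF: succeed only if REF looks like a full 40-character hex commit SHA.
# (Hypothetical helper; checks the format, not whether the commit exists.)
is_full_sha() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

# Example refs: the checkout action SHA from the diff, a tag, and a branch name.
for ref in 11bd71901bbe5b1630ceea73d27597364c9af683 v4.2.2 main; do
  if is_full_sha "$ref"; then
    echo "$ref: pinned to a commit"
  else
    echo "$ref: mutable ref (tag or branch)"
  fi
done
```

A check like this could run in a lint step over the `ref:` values in workflow files, so an unpinned external checkout is caught before it reaches a `pull_request_target` job.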
| case "$1" in HIGH) echo 3 ;; MEDIUM) echo 2 ;; LOW) echo 1 ;; *) echo 0 ;; esac | ||
| } | ||
| p=$(risk_to_num "${{ steps.profile.outputs.risk }}") | ||
| c=$(risk_to_num "${{ steps.credential.outputs.risk }}") | ||
| max=$p; [ "$c" -gt "$max" ] && max=$c | ||
| case "$max" in 3) r="HIGH" ;; 2) r="MEDIUM" ;; *) r="LOW" ;; esac |
The profile / credential steps convert any script failure or malformed JSON into risk=UNKNOWN, but risk_to_num maps unknown values to 0, which then makes the overall risk LOW. This means the workflow silently fails open (no label/comment) when the checks break; consider treating UNKNOWN as at least MEDIUM (or failing the job) so the workflow doesn’t miss suspicious contributions due to transient errors.
Suggested change:
| - case "$1" in HIGH) echo 3 ;; MEDIUM) echo 2 ;; LOW) echo 1 ;; *) echo 0 ;; esac | |
| - } | |
| - p=$(risk_to_num "${{ steps.profile.outputs.risk }}") | |
| - c=$(risk_to_num "${{ steps.credential.outputs.risk }}") | |
| - max=$p; [ "$c" -gt "$max" ] && max=$c | |
| - case "$max" in 3) r="HIGH" ;; 2) r="MEDIUM" ;; *) r="LOW" ;; esac | |
| + case "$1" in | |
| + HIGH) echo 3 ;; | |
| + MEDIUM|UNKNOWN) echo 2 ;; | |
| + LOW) echo 1 ;; | |
| + *) echo 2 ;; | |
| + esac | |
| + } | |
| + p=$(risk_to_num "${{ steps.profile.outputs.risk }}") | |
| + c=$(risk_to_num "${{ steps.credential.outputs.risk }}") | |
| + max=$p; [ "$c" -gt "$max" ] && max=$c | |
| + case "$max" in 3) r="HIGH" ;; 2) r="MEDIUM" ;; 1) r="LOW" ;; *) r="MEDIUM" ;; esac | |
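To try the fail-closed behaviour outside the workflow, here is a self-contained sketch of the same idea. The `overall_risk` wrapper is a hypothetical name added for illustration; in the workflow the two inputs would come from the `profile` and `credential` step outputs.

```shell
#!/usr/bin/env bash
# Map a risk label to a number. Anything unrecognised (UNKNOWN, empty, typos)
# is treated as MEDIUM so a broken check cannot lower the overall result.
risk_to_num() {
  case "$1" in
    HIGH) echo 3 ;;
    MEDIUM) echo 2 ;;
    LOW) echo 1 ;;
    *) echo 2 ;;
  esac
}

# overall_risk PROFILE CREDENTIAL: print the higher of the two labels.
# (Hypothetical wrapper, not part of the PR.)
overall_risk() {
  local p c max
  p=$(risk_to_num "$1")
  c=$(risk_to_num "$2")
  max=$p
  if [ "$c" -gt "$max" ]; then max=$c; fi
  case "$max" in
    3) echo HIGH ;;
    2) echo MEDIUM ;;
    *) echo LOW ;;
  esac
}

overall_risk LOW UNKNOWN   # a failed check raises the result to MEDIUM
overall_risk LOW LOW       # both checks succeeded and agree on LOW
```

With this mapping, LOW is only reported when both checks actually returned LOW.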
| body="<!-- agt-contributor-check --> | ||
| $icon **Contributor Reputation Check: $risk risk** | ||
| | ||
| | Check | Risk | | ||
| |-------|------| | ||
| | Profile | $profile | | ||
| | Credential audit | $cred | | ||
| | ||
| Maintainers: please review this contributor before merging. | ||
| *Automated check powered by [AGT](https://github.com/microsoft/agent-governance-toolkit).*" |
The multi-line body string is indented inside the bash script, so the generated Markdown comment will include leading spaces on most lines. In GitHub-flavored Markdown, 4+ leading spaces turns content into a code block, which will likely prevent the table/bold formatting from rendering. Consider building the body with a heredoc (no indentation) or otherwise stripping leading whitespace.
Suggested change:
| - body="<!-- agt-contributor-check --> | |
| - $icon **Contributor Reputation Check: $risk risk** | |
| - | Check | Risk | | |
| - |-------|------| | |
| - | Profile | $profile | | |
| - | Credential audit | $cred | | |
| - Maintainers: please review this contributor before merging. | |
| - *Automated check powered by [AGT](https://github.com/microsoft/agent-governance-toolkit).*" | |
| + body=$(cat <<EOF | |
| + <!-- agt-contributor-check --> | |
| + $icon **Contributor Reputation Check: $risk risk** | |
| + | Check | Risk | | |
| + |-------|------| | |
| + | Profile | $profile | | |
| + | Credential audit | $cred | | |
| + Maintainers: please review this contributor before merging. | |
| + *Automated check powered by [AGT](https://github.com/microsoft/agent-governance-toolkit).* | |
| + EOF | |
| + ) | |
| env: | ||
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
| run: | | ||
| username="${{ steps.author.outputs.username }}" |
username is assigned but never used in this step; removing it would reduce noise and avoid implying it influences the comment/label behavior.
Suggested change:
| - username="${{ steps.author.outputs.username }}" | |
Please don't contribute new workflows and policies without discussing them with the maintainers first.
| pull_request_target: | ||
| types: [opened] | ||
| issues: | ||
| types: [opened] |
Have you experimented with running this on discussions too? Would that be useful to do?
| - name: Checkout AGT scripts | ||
| uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 | ||
| with: | ||
| repository: microsoft/agent-governance-toolkit | ||
| sparse-checkout: scripts | ||
| path: agt |
Do we have to clone the repo? Is it not possible to install the CLI with pip?
If we have to clone the repo, it might be a good idea to pin a tag or release, so that if there is drift in the script it doesn't result in this action failing.
| profile="${{ steps.profile.outputs.risk }}" | ||
| cred="${{ steps.credential.outputs.risk }}" | ||
| | ||
| if [ "$risk" = "HIGH" ]; then icon="🔴"; else icon="🟡"; fi |
Should low risk not have a separate icon?
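For illustration, a per-level mapping could look like the sketch below. The 🟢 and ⚪ icons and the `risk_icon` name are assumptions, not part of the PR. Since the overview says comments are only posted for MEDIUM/HIGH outcomes, the LOW branch would only matter if the workflow also commented on LOW results.

```shell
#!/usr/bin/env bash
# risk_icon LEVEL: print an icon for the given risk level.
# 🔴 and 🟡 are the icons already used in the PR; 🟢 and ⚪ are assumed here.
risk_icon() {
  case "$1" in
    HIGH) echo "🔴" ;;
    MEDIUM) echo "🟡" ;;
    LOW) echo "🟢" ;;
    *) echo "⚪" ;;  # UNKNOWN or any unexpected value
  esac
}

for level in HIGH MEDIUM LOW UNKNOWN; do
  echo "$(risk_icon "$level") $level"
done
```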
| body="<!-- agt-contributor-check --> | ||
| $icon **Contributor Reputation Check: $risk risk** | ||
| | ||
| | Check | Risk | |
Ah, my comment on the discussion is probably irrelevant - I wrote that before reviewing the PR and I was assuming the PR was going to be using the action and Python script the action runs from the repo, not a custom script here.
| --repo "${{ github.repository }}" \ | ||
| --json > /tmp/cred.json 2>/tmp/cred.log | ||
| set -e | ||
| risk=$(python -c "import json; print(json.load(open('/tmp/cred.json'))['risk'])" 2>/dev/null || echo "UNKNOWN") |
This only extracts the risk property from the JSON. Is there anything else in the JSON that might be useful to extract and display?
| --repo "${{ github.repository }}" \ | ||
| --json > /tmp/cred.json 2>/tmp/cred.log | ||
| set -e | ||
| risk=$(python -c "import json; print(json.load(open('/tmp/cred.json'))['risk'])" 2>/dev/null || echo "UNKNOWN") |
Suggested change:
| - risk=$(python -c "import json; print(json.load(open('/tmp/cred.json'))['risk'])" 2>/dev/null || echo "UNKNOWN") | |
| + risk=$(cat '/tmp/cred.json' | jq '.risk // "UNKNOWN"') | |
Can simplify with jq rather than loading a Python environment for that.
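One caveat with the jq form: without `-r`, jq prints the JSON string with its quotes (`"HIGH"` rather than `HIGH`), and the `//` fallback does not apply when the file is missing or malformed, because jq then produces no output at all. A minimal sketch that covers both cases, assuming jq is available on the runner; `read_risk` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# read_risk FILE: print the "risk" field of FILE, or UNKNOWN on any failure.
# (Hypothetical helper; assumes the report is a JSON object with a "risk" key.)
read_risk() {
  local file="$1" value
  # -r prints the raw string (HIGH, not "HIGH").
  # -e makes jq exit non-zero when the result is null or false, e.g. a missing key.
  if value=$(jq -er '.risk' "$file" 2>/dev/null) && [ -n "$value" ]; then
    printf '%s\n' "$value"
  else
    printf 'UNKNOWN\n'
  fi
}

tmp=$(mktemp)
printf '{"risk": "HIGH"}\n' > "$tmp"
read_risk "$tmp"           # valid report: prints HIGH when jq is installed
read_risk "$tmp.missing"   # file does not exist: prints UNKNOWN
rm -f "$tmp"
```

The same helper would work for both `/tmp/profile.json` and `/tmp/cred.json`, which keeps the two steps consistent.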
| --username "${{ steps.author.outputs.username }}" \ | ||
| --json > /tmp/profile.json 2>/tmp/profile.log | ||
| set -e | ||
| risk=$(python -c "import json; print(json.load(open('/tmp/profile.json'))['risk'])" 2>/dev/null || echo "UNKNOWN") |
Suggested change:
| - risk=$(python -c "import json; print(json.load(open('/tmp/profile.json'))['risk'])" 2>/dev/null || echo "UNKNOWN") | |
| + risk=$(cat '/tmp/profile.json' | jq '.risk // "UNKNOWN"') | |
Can simplify with jq rather than loading a Python environment for that.
| --username "${{ steps.author.outputs.username }}" \ | ||
| --json > /tmp/profile.json 2>/tmp/profile.log | ||
| set -e | ||
| risk=$(python -c "import json; print(json.load(open('/tmp/profile.json'))['risk'])" 2>/dev/null || echo "UNKNOWN") |
This only extracts the risk property from the JSON. Is there anything else in the JSON that might be useful to extract and display?
Summary
Add automated contributor reputation screening on PR/issue open events to detect coordinated inauthentic contribution patterns (e.g., credential-laundering campaigns, spray-and-pray governance issues).
How it works
Why this matters
Multiple AI agent framework repos have been targeted by coordinated campaigns that:
This workflow helps maintainers catch these patterns early.
Dependencies
pull_request_target and issues events only
GITHUB_TOKEN
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>