Skip to content

Add optional severity filter for SARIF alerts#17

Merged
adrienpessu merged 8 commits intomainfrom
copilot/add-severity-filter-option
Mar 11, 2026
Merged

Add optional severity filter for SARIF alerts#17
adrienpessu merged 8 commits intomainfrom
copilot/add-severity-filter-option

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 2, 2026

Adds a severity input that filters SARIF alerts by severity level before pattern-based filtering is applied. Accepts a comma-separated list of values.

Supported severity values

  • SARIF result level: error, warning, note, none — matched against result.level
  • Security severity category: critical, high, medium, low — derived from rule.properties.security-severity using the standard GitHub mapping (critical ≥9.0, high ≥7.0, medium ≥4.0, low >0.0)

Both kinds can be combined: severity: error,high,critical

Changes

  • filter_sarif.py: Three new functions (compute_security_severity_category, collect_rule_severities, result_matches_severity) and a --severity CLI arg. Severity check runs before location-level pattern matching.
  • action.yml: New optional severity input, passed via env var to avoid shell injection.
  • README.md: New Severity section documenting supported values and usage.

Usage

- name: filter-sarif
  uses: advanced-security/filter-sarif@v1
  with:
    patterns: |
      +**/*.java
      -**/*Test*.java
    severity: high,critical
    input: sarif-results/java.sarif
    output: sarif-results/java.sarif

Omitting severity preserves all alerts (backward compatible).

Copilot AI and others added 3 commits March 2, 2026 13:30
Support filtering by SARIF result level (error, warning, note, none)
and security-severity categories (critical, high, medium, low) derived
from rule metadata. The filter is applied before pattern-based filtering.

Co-authored-by: adrienpessu <7055334+adrienpessu@users.noreply.github.com>
Co-authored-by: adrienpessu <7055334+adrienpessu@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an optional severity-based pre-filter for SARIF results (by SARIF result.level and/or rule security-severity category), wire it into the GitHub Action input, and document usage.

Changes:

  • Add --severity CLI arg and severity-matching helpers in filter_sarif.py, applied before pattern/location filtering.
  • Add severity input to action.yml and pass it safely via an env var.
  • Document supported severity values and examples in README.md.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.

File Description
filter_sarif.py Implements severity parsing and filtering logic for SARIF results.
action.yml Adds a new optional severity input and plumbs it to the script.
README.md Documents the new severity filter and provides examples.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread filter_sarif.py Outdated
Comment on lines +96 to +97
# Check standard SARIF result.level (error, warning, note, none)
res_level = result.get('level', '')
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result.level is optional in SARIF, with a default of "warning" when omitted. Using result.get('level', '') will cause results without an explicit level to never match warning and get filtered out unexpectedly when a severity filter is applied. Consider defaulting missing/empty levels to "warning" before comparing.

Suggested change
# Check standard SARIF result.level (error, warning, note, none)
res_level = result.get('level', '')
# Check standard SARIF result.level (error, warning, note, none).
# Per SARIF spec, missing/empty level defaults to "warning".
res_level = result.get('level')
if not res_level:
res_level = 'warning'

Copilot uses AI. Check for mistakes.
Comment thread filter_sarif.py Outdated
Comment on lines +120 to +127
severity_filter = None
if args.severity:
severity_filter = set()
for tok in args.severity.split(','):
stripped = tok.strip().lower()
if stripped:
severity_filter.add(stripped)
print('Severity filter: keeping results with severity in {}'.format(severity_filter))
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unknown severity tokens are currently accepted silently; if a user misspells a value (e.g. hgh), the filter will exclude all results with no clear error. Consider validating tokens against the supported set (SARIF levels + supported security categories) and failing fast (or at least warning) when invalid values are provided.

Copilot uses AI. Check for mistakes.
Comment thread filter_sarif.py Outdated
Comment on lines +75 to +77
if numeric > 0.0:
return 'low'
return 'none'
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code introduces a security-severity category value 'none' (for scores <= 0.0), but the PR description/README only list critical/high/medium/low for security categories. This mismatch can confuse users, especially since 'none' is also a SARIF result.level value. Either document 'none' as a supported security category (and define its meaning), or return None here so that non-positive scores don’t participate in severity matching.

Copilot uses AI. Check for mistakes.
Comment thread filter_sarif.py Outdated
stripped = tok.strip().lower()
if stripped:
severity_filter.add(stripped)
print('Severity filter: keeping results with severity in {}'.format(severity_filter))
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Printing a raw set makes the log output order nondeterministic, which can be noisy and harder to read/compare. Consider printing a stable ordering (e.g., sorted list or a comma-joined string) for consistent logs.

Suggested change
print('Severity filter: keeping results with severity in {}'.format(severity_filter))
print('Severity filter: keeping results with severity in {}'.format(', '.join(sorted(severity_filter))))

Copilot uses AI. Check for mistakes.
@adrienpessu
Copy link
Copy Markdown
Contributor

@copilot the filtering doesn't work. When I set the severity paramterer, all alerts are filtered. Can you fix it ?

Copilot AI and others added 2 commits March 11, 2026 12:44
Co-authored-by: adrienpessu <7055334+adrienpessu@users.noreply.github.com>
…l is missing

Co-authored-by: adrienpessu <7055334+adrienpessu@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 11, 2026

@copilot the filtering doesn't work. When I set the severity paramterer, all alerts are filtered. Can you fix it ?

Fixed in da97499. The root cause was that result.get('level', '') returned an empty string for results without an explicit level field (very common in SARIF), which never matched any severity value — causing all results to be filtered out.

The fix:

  • Missing result.level now falls back to rule.defaultConfiguration.level, then defaults to 'warning' per the SARIF spec
  • Invalid severity tokens (e.g. hgh) now fail immediately with a clear error listing valid values
  • compute_security_severity_category no longer returns 'none' for score ≤ 0 (returns None instead, avoiding ambiguity with SARIF's none level)

@adrienpessu adrienpessu marked this pull request as ready for review March 11, 2026 15:47
Copilot AI review requested due to automatic review settings March 11, 2026 15:47
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (2)

filter_sarif.py:134

  • This function tries to fall back to ruleIndex (r_idx) to find a default level, but collect_rule_severities() currently only stores entries keyed by rule id. As a result, results without ruleId (or where only ruleIndex is set) will incorrectly default to "warning" and may be filtered wrong. Populate rule_default_level_lookup (and rule_sev_lookup) with ruleIndex keys as well (careful to scope indices per tool component), or remove the ruleIndex fallback.
    if res_level is None:
        rid = result.get('ruleId', '')
        r_idx = result.get('ruleIndex')
        res_level = rule_default_level_lookup.get(rid)
        if res_level is None:
            res_level = rule_default_level_lookup.get(r_idx)
        if res_level is None:
            res_level = 'warning'

filter_sarif.py:200

  • Per-result debug logging here will produce very large logs for SARIF files with many results and may expose rule identifiers/severity data. This should be removed or behind an explicit debug/verbose option.
                    sev_cat = rule_sev_lookup.get(rid)
                    def_level = rule_default_level_lookup.get(rid)
                    matches = result_matches_severity(r, severity_filter, rule_sev_lookup, rule_default_level_lookup)
                    print(f"[DEBUG] Result ruleId={rid}, level={r_level}, sev_cat={sev_cat}, def_level={def_level}, matches={matches}")
                    if not matches:

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread filter_sarif.py
Comment thread filter_sarif.py
Comment on lines +90 to +92
def _process_rules(rules, source_name):
for idx, rule_def in enumerate(rules):
rid = rule_def.get('id', '')
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source_name is never used in _process_rules, which makes the helper harder to understand and suggests missing logic. Remove the unused parameter or use it (e.g., to namespace ruleIndex lookups so indices from different tool components don’t collide).

Copilot uses AI. Check for mistakes.
Comment thread filter_sarif.py
Comment on lines +158 to +162
if args.severity:
severity_filter = set()
for tok in args.severity.split(','):
stripped = tok.strip().lower()
if not stripped:
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If --severity is provided but contains only commas/whitespace, severity_filter becomes an empty set and the code silently behaves as if no filter was requested (but still prints a "Severity filter" line). Consider failing fast when the parsed set is empty, or treat it as unset by resetting severity_filter to None after parsing.

Copilot uses AI. Check for mistakes.
Comment thread filter_sarif.py
Comment on lines +101 to +118
# Process driver rules
driver_rules = run.get('tool', {}).get('driver', {}).get('rules', [])
print(f"[DEBUG] Driver rules count: {len(driver_rules)}")
_process_rules(driver_rules, 'driver')

# Process extension rules (e.g. CodeQL query packs)
extensions = run.get('tool', {}).get('extensions', [])
print(f"[DEBUG] Extensions count: {len(extensions)}")
for ext in extensions:
ext_name = ext.get('name', '?')
ext_rules = ext.get('rules', [])
if ext_rules:
print(f"[DEBUG] Extension {ext_name} has {len(ext_rules)} rules")
_process_rules(ext_rules, ext_name)

print(f"[DEBUG] rule_sev_lookup has {len(sec_sev_lookup)} entries")
print(f"[DEBUG] rule_default_level_lookup has {len(default_level_lookup)} entries")
return sec_sev_lookup, default_level_lookup
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new [DEBUG] prints in production code will spam action logs and can leak SARIF metadata (rule ids, levels, derived categories). Please remove these prints or gate them behind an explicit --debug/--verbose flag (or an environment variable) so normal runs stay quiet.

This issue also appears on line 196 of the same file.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed all [DEBUG] print statements in b82026b. The per-result debug logging block in filter_sarif() was also cleaned up, restoring the simpler single-line severity check.

Comment thread action.yml Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: adrienpessu <7055334+adrienpessu@users.noreply.github.com>
@adrienpessu adrienpessu merged commit 2da736f into main Mar 11, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants