https://kazuhira-r.hatenablog.com/entry/2025/11/03/192004

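What is this post about?

In the previous entry I covered Semgrep Community Edition.

Trying out the SAST tool Semgrep Community Edition - CLOVER🍀

I originally meant to include this material there, but it was getting long, so I split it into its own entry.

This post is about changing the output format of Semgrep scan results.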

Environment

The environment used this time is as follows.

$ semgrep --version
1.142.0

Changing the output format in Semgrep

I prepared the following script to check the behavior.

app.py

import subprocess
import sys

# Vulnerable
user_input = "foo && cat /etc/passwd" # value supplied by user
subprocess.call("grep -R {} .".format(user_input), shell=True)

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
subprocess.run(["bash", "-c", user_input], shell=True)

# Not vulnerable
user_input = "cat /etc/passwd" # value supplied by user
subprocess.Popen(['ls', '-l', user_input])

# Not vulnerable
subprocess.check_output('ls -l dir/')

It is taken from the following page.

Command Injection in Python | Semgrep

Run the scan.

$ semgrep scan --config p/default

┌──── ○○○ ────┐
│ Semgrep CLI │
└─────────────┘


Scanning 1 file (only git-tracked) with 1062 Code rules:

  CODE RULES

  Language      Rules   Files          Origin      Rules
 ─────────────────────────────        ───────────────────
  python          243       1          Community    1062
  <multilang>      48       1


  SUPPLY CHAIN RULES

  💎 Sign in with `semgrep login` and run
     `semgrep ci` to find dependency vulnerabilities and
     advanced cross-file findings.


  PROGRESS

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00


┌─────────────────┐
│ 2 Code Findings │
└─────────────────┘

    app.py
   ❯❯❱ python.lang.security.audit.subprocess-shell-true.subprocess-shell-true
          Found 'subprocess' function 'call' with 'shell=True'. This is dangerous because this call will spawn
          the command using a shell process. Doing so propagates current shell settings and variables, which
          makes it much easier for a malicious actor to execute commands. Use 'shell=False' instead.
          Details: https://sg.run/J92w

           ▶▶┆ Autofix ▶ False
            6┆ subprocess.call("grep -R {} .".format(user_input), shell=True)
            ⋮┆----------------------------------------
   ❯❯❱ python.lang.security.audit.subprocess-shell-true.subprocess-shell-true
          Found 'subprocess' function 'run' with 'shell=True'. This is dangerous because this call will spawn
          the command using a shell process. Doing so propagates current shell settings and variables, which
          makes it much easier for a malicious actor to execute commands. Use 'shell=False' instead.
          Details: https://sg.run/J92w

           ▶▶┆ Autofix ▶ False
           10┆ subprocess.run(["bash", "-c", user_input], shell=True)



┌──────────────┐
│ Scan Summary │
└──────────────┘
✅ Scan completed successfully.
 • Findings: 2 (2 blocking)
 • Rules run: 291
 • Targets scanned: 1
 • Parsed lines: ~100.0%
 • No ignore information available
Ran 291 rules on 1 file: 2 findings.
💎 Missed out on 1390 pro rules since you aren't logged in!
⚡ Supercharge Semgrep OSS when you create a free account at https://sg.run/rules.

The format of this result can be changed with options.

Customize scans | Semgrep

For example, adding the --sarif option prints the results to standard output in SARIF format.

$ semgrep scan --config p/default --sarif

The output is far too long to paste here, but what you get is JSON in SARIF format.

Specifying a file name with --sarif-output writes the results to that file.

$ semgrep scan --config p/default --sarif-output=semgrep.sarif

## Alternatively
$ semgrep scan --config p/default --sarif --sarif-output=semgrep.sarif

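With --sarif-output alone, the usual text results are still displayed on the terminal.
Note: if the --sarif option is also given explicitly, the SARIF output is written to standard output as well.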

┌──── ○○○ ────┐
│ Semgrep CLI │
└─────────────┘


Scanning 1 file (only git-tracked) with 1062 Code rules:

  CODE RULES

  Language      Rules   Files          Origin      Rules
 ─────────────────────────────        ───────────────────
  python          243       1          Community    1062
  <multilang>      48       1


  SUPPLY CHAIN RULES

  💎 Sign in with `semgrep login` and run
     `semgrep ci` to find dependency vulnerabilities and
     advanced cross-file findings.


  PROGRESS

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00


┌─────────────────┐
│ 2 Code Findings │
└─────────────────┘

    app.py
   ❯❯❱ python.lang.security.audit.subprocess-shell-true.subprocess-shell-true
          Found 'subprocess' function 'call' with 'shell=True'. This is dangerous because this call will spawn
          the command using a shell process. Doing so propagates current shell settings and variables, which
          makes it much easier for a malicious actor to execute commands. Use 'shell=False' instead.
          Details: https://sg.run/J92w

           ▶▶┆ Autofix ▶ False
            6┆ subprocess.call("grep -R {} .".format(user_input), shell=True)
            ⋮┆----------------------------------------
   ❯❯❱ python.lang.security.audit.subprocess-shell-true.subprocess-shell-true
          Found 'subprocess' function 'run' with 'shell=True'. This is dangerous because this call will spawn
          the command using a shell process. Doing so propagates current shell settings and variables, which
          makes it much easier for a malicious actor to execute commands. Use 'shell=False' instead.
          Details: https://sg.run/J92w

           ▶▶┆ Autofix ▶ False
           10┆ subprocess.run(["bash", "-c", user_input], shell=True)



┌──────────────┐
│ Scan Summary │
└──────────────┘
✅ Scan completed successfully.
 • Findings: 2 (2 blocking)
 • Rules run: 291
 • Targets scanned: 1
 • Parsed lines: ~100.0%
 • No ignore information available
Ran 291 rules on 1 file: 2 findings.
💎 Missed out on 1390 pro rules since you aren't logged in!
⚡ Supercharge Semgrep OSS when you create a free account at https://sg.run/rules.

Here is the beginning of the resulting file.

$ cat semgrep.sarif | jq | head -n 20
{
  "version": "2.1.0",
  "runs": [
    {
      "invocations": [
        {
          "executionSuccessful": true,
          "toolExecutionNotifications": []
        }
      ],
      "results": [
        {
          "fingerprints": {
            "matchBasedId/v1": "requires login"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "app.py",

Judging from the help, output is available in Emacs, GitLab SAST, GitLab Secrets, JSON, JUnit XML, SARIF, text, and Vim formats.

$ semgrep scan --help | grep output
           affects text and SARIF output).
       --emacs-output=VAL
           Write a copy of the emacs output to a file or post to URL.
           Always  include  ANSI color in the output, even if not writing to a
       --gitlab-sast-output=VAL
           Write a copy of the GitLab SAST output to a file or post to URL.
       --gitlab-secrets-output=VAL
           Write a copy of the GitLab Secrets output to a file or post to URL.
       --incremental-output
       --json-output=VAL
           Write a copy of the json output to a file or post to URL.
       --junit-xml-output=VAL
           Write a copy of the JUnit XML output to a file or post to URL.
           Add debugging information in the JSON output to trace how different
       -o VAL, --output=VAL
           Only output findings.
       --sarif-output=VAL
           Write a copy of the SARIF output to a file or post to URL.
       --text-output=VAL
           Write a copy of the text output to a file or post to URL.
           Include  a  timing  summary  with  the results. If output format is
       --vim-output=VAL
           Write a copy of the vim output to a file or post to URL.
           or  language-specific  filtering.  Then  exit.  The  default output
           why  they were skipped, using an unspecified output format. Implies
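Each format in the help has a matching `--<format>-output` flag, so the command line can be assembled mechanically. The following is a rough sketch of that idea. The helper `build_scan_command` and the `SUPPORTED_FORMATS` tuple are hypothetical names of mine, and passing several `--*-output` flags in a single run is an assumption based on the "Write a copy of ..." wording in the help; I did not try that combination in this entry.

```python
import shlex

# Format names taken from the --*-output flags in the help output above.
SUPPORTED_FORMATS = (
    "emacs",
    "gitlab-sast",
    "gitlab-secrets",
    "json",
    "junit-xml",
    "sarif",
    "text",
    "vim",
)


def build_scan_command(config: str, outputs: dict[str, str]) -> list[str]:
    """Build a `semgrep scan` argv that writes each requested format to its own file."""
    command = ["semgrep", "scan", "--config", config]
    for fmt, path in outputs.items():
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported output format: {fmt}")
        command.append(f"--{fmt}-output={path}")
    return command


if __name__ == "__main__":
    argv = build_scan_command(
        "p/default",
        {"sarif": "semgrep.sarif", "json": "semgrep.json"},
    )
    # Print the command instead of running it, so the sketch works without Semgrep installed.
    print(shlex.join(argv))
```

The returned list can be handed to `subprocess.run(argv)` as is, without `shell=True`.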

Conclusion

I tried changing the output format in Semgrep.

It supports quite a few formats, so it looks like the results could be checked by building Semgrep into a CI job, even without using Semgrep AppSec Platform.
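As one way a CI job might consume the file written by --sarif-output, here is a minimal sketch of a gate that turns the number of findings into an exit code. The names `count_active_findings` and `gate` and the `max_findings` threshold are hypothetical. Skipping results that carry a `suppressions` entry is my assumption from the SARIF 2.1.0 result schema; I have not checked how Semgrep populates that field.

```python
def count_active_findings(sarif: dict) -> int:
    """Count SARIF results that carry no suppression entries."""
    return sum(
        1
        for run in sarif.get("runs", [])
        for result in run.get("results", [])
        if not result.get("suppressions")  # assumption: suppressed results are ignored
    )


def gate(sarif: dict, max_findings: int = 0) -> int:
    """Return a process exit code: 0 if within the threshold, 1 otherwise."""
    return 0 if count_active_findings(sarif) <= max_findings else 1


if __name__ == "__main__":
    # Inline sample standing in for a real semgrep.sarif with two findings.
    sample = {
        "version": "2.1.0",
        "runs": [{"results": [{"ruleId": "rule-a"}, {"ruleId": "rule-b"}]}],
    }
    print(gate(sample))
```

In an actual job, the sample dictionary would be replaced by the parsed contents of semgrep.sarif, and the return value of `gate` would be passed to `sys.exit` so that the step fails when findings exceed the threshold.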