Detecting malware in repositories with Shady Project Scanner

2026-03-30

You find a repository on GitHub that claims to be a PoC exploit, a useful tool or a lib that solves exactly your problem. You clone it, run npm install and… congratulations, you just installed a stealer that’s exfiltrating your credentials to a Discord webhook.

This scenario is not fiction. Malicious repositories are published daily disguised as legitimate tools, security exploits and open source utilities. The problem is that, most of the time, the malicious code is hidden behind obfuscation, install hooks and base64-encoded payloads.

To address this, I created Shady Project Scanner: a pattern-based scanner suite that audits entire repositories looking for malware, backdoors and suspicious behavior. No external dependencies, built 100% in bash.


What is Shady Project Scanner

It’s a set of 6 detection scripts that scan repositories for known malicious patterns. Four target a specific language ecosystem, one runs cross-language checks and one hunts a single malware family:

Scanner               Target
scan-repo.sh          General checks (cross-language)
scan-node.sh          Node.js / npm
scan-php.sh           PHP (webshells, backdoors)
scan-python.sh        Python / PyPI
scan-go.sh            Go modules
scan-polinrider.sh    PolinRider malware (specific signature)

Features:

  • Zero external dependencies (bash + grep + find + awk)
  • Works on Linux and macOS
  • Color-coded severity: [!] red (high risk), [~] yellow (suspicious), [i] cyan (informational)
  • Exit codes: 0 = clean, 1 = findings, 2 = error
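
Those exit codes make the scanners easy to gate on from another script or a CI step. A minimal sketch, assuming only the three codes listed above; describe_scan_exit is a hypothetical helper name, not something the project ships:

```shell
#!/usr/bin/env bash
# Hypothetical helper: turns a scanner exit code into a readable label.
# Only the three documented codes (0, 1, 2) are assumed to exist.
describe_scan_exit() {
  case "$1" in
    0) echo "clean" ;;
    1) echo "findings" ;;
    2) echo "error" ;;
    *) echo "unknown" ;;
  esac
}

# Usage (the path is a placeholder):
#   ./scanners/scan-repo.sh /path/to/repository
#   describe_scan_exit "$?"
```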

Installation

git clone https://github.com/lauralesteves/shady-project-scanner.git
cd shady-project-scanner

That’s it. No installation required.


How to use

The simplest way is via Makefile. To run all 6 scanners at once:

make scan TARGET=/path/to/suspicious/repository

To run a specific scanner:

make scan-node TARGET=/path/to/repository
make scan-python TARGET=/path/to/repository
make scan-repo TARGET=/path/to/repository

Or directly:

./scanners/scan-repo.sh /path/to/repository
./scanners/scan-node.sh /path/to/repository

Real example: scanning a suspicious repository

Let’s use the repository 0xgrama/StockX_PoC_1.07 as our target, which presents itself as a PoC exploit for StockX. Repositories like this are the classic scenario: they promise an offensive tool and the real target is whoever downloads it.

Warning: Do not clone this repository on your personal machine. If you want to test it, use an isolated environment like a VM or a disposable EC2 instance.

After cloning the suspicious repository in a safe environment, just point the scanner at it:

make scan TARGET=/path/to/StockX_PoC_1.07

Shady Project Scanner will run all 6 detection scripts sequentially against the repository. Let’s look at what each one does.


The 6 scanners in detail

1. scan-repo.sh: general checks (cross-language)

This is the most comprehensive scanner. It runs 14 checks applicable to any language:

Malicious Git Hooks – Checks whether hooks in .git/hooks/ contain commands like curl, wget, bash, eval or exfiltration URLs (Discord webhooks, Slack, ngrok). A post-checkout hook that downloads and executes a remote script is one of the most common attack vectors.
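
To give an idea of what a pattern-based hook check can look like, here is a rough sketch. It is not the code scan-repo.sh actually runs: the function name and the exact regex are my own assumptions, and it covers only a few of the commands listed above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. find_suspicious_hooks is a hypothetical name.
find_suspicious_hooks() {
  local repo="$1"
  [ -d "$repo/.git/hooks" ] || return 0
  # Skip the *.sample hooks git creates by default, then list every hook
  # that mentions a download/eval command or a common exfiltration host.
  find "$repo/.git/hooks" -type f ! -name '*.sample' -exec \
    grep -lE '(^|[^[:alnum:]_])(curl|wget|eval)([^[:alnum:]_]|$)|discord(app)?\.com/api/webhooks|hooks\.slack\.com|ngrok' {} + || true
}

# Usage: find_suspicious_hooks /path/to/repository
```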

Crypto miners – Looks for references to xmrig, coinhive, mining pools (supportxmr, nanopool, f2pool) and Monero/Bitcoin wallet addresses.

Secrets and credentials – Detects AWS keys (AKIA/ABIA/ACCA prefix), private keys, GitHub/Slack/JWT tokens, database connection strings containing passwords, and sensitive files like .env, .npmrc, .pypirc, .pem, id_rsa.

CI/CD tampering – Checks for GitHub Actions pinned to branches instead of a commit SHA, pull_request_target workflows with access to secrets, curl | bash in workflows and workflows that request write permissions.

Docker abuse – Detects privileged containers, mounting of sensitive paths (/etc/shadow, ~/.ssh, ~/.aws, /var/run/docker.sock) and remote script downloads in Dockerfiles.

Unicode/Homoglyph attacks – Looks for Trojan Source characters (CVE-2021-42574), such as bidirectional overrides (U+202A-E) and zero-width chars (U+200B-D), which can visually hide malicious code.
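
These characters are invisible in most editors, but they are ordinary byte sequences on disk, so a fixed-string grep can find them. A minimal sketch, assuming UTF-8 files; find_hidden_unicode is a hypothetical name and the real scanner may do this differently:

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Lists files containing the UTF-8 encoding of
# U+202A..U+202E (bidirectional controls) or U+200B..U+200D (zero-width).
find_hidden_unicode() {
  local dir="$1"
  # Each printf emits the three UTF-8 bytes of one code point (octal escapes).
  LC_ALL=C grep -rlF --exclude-dir=.git \
    -e "$(printf '\342\200\252')" -e "$(printf '\342\200\253')" \
    -e "$(printf '\342\200\254')" -e "$(printf '\342\200\255')" \
    -e "$(printf '\342\200\256')" \
    -e "$(printf '\342\200\213')" -e "$(printf '\342\200\214')" \
    -e "$(printf '\342\200\215')" \
    "$dir" || true
}

# Usage: find_hidden_unicode /path/to/repository
```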

Suspicious binaries – Flags .exe, .dll, .so, .dylib, .bat, .ps1 and other executables. PowerShell and batch scripts with dangerous patterns receive special attention.

Build scripts – Analyzes Makefiles and shell scripts looking for curl | bash, base64 --decode | exec, reverse shells (/dev/tcp, mkfifo, nc) and system file modifications.

Large encoded blobs – Detects lines longer than 2000 characters in source files, indicating embedded minified or obfuscated code (excludes *.min.js and legitimate bundles).
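
The long-line heuristic is simple enough to sketch with find and awk. This is an approximation under my own assumptions (the extension list and the find_long_lines name are hypothetical); only the 2000-character threshold and the *.min.js exclusion come from the description above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Prints "file:line" for the first line in each
# source file that is longer than the limit (default 2000 characters).
find_long_lines() {
  local dir="$1" limit="${2:-2000}"
  find "$dir" -type f \
    \( -name '*.js' -o -name '*.py' -o -name '*.php' -o -name '*.go' -o -name '*.sh' \) \
    ! -name '*.min.js' ! -path '*/.git/*' ! -path '*/node_modules/*' \
    -exec awk -v limit="$limit" '
      FNR == 1 { reported = 0 }   # reset the flag at the start of each file
      !reported && length($0) > limit { print FILENAME ":" FNR; reported = 1 }
    ' {} +
}

# Usage: find_long_lines /path/to/repository
```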

Cron/Scheduled tasks – Checks cron files for network or script commands (curl, wget, python, nc).

IDE configuration – Detects VS Code tasks with shell execution and extension recommendations that could be malicious.

Git config abuse – Looks for custom filter drivers in .gitattributes (can silently execute code), local .gitconfig overrides and FUNDING.yml with suspicious wallets.

Reverse shells – Detects reverse shell patterns in bash (/dev/tcp), Python (socket), PHP (fsockopen) and Perl (IO::Socket).

Hardcoded endpoints – Finds public IP addresses with a port in build and config scripts, skipping loopback and private ranges (127.0.0.1, 192.168.x.x, 10.x.x.x).
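
As a rough illustration of the endpoint check, the sketch below extracts every IP:port pair from a file and drops the three ranges named above. The function name is hypothetical and the filter is deliberately naive: it does not validate octets and does not cover other private ranges such as 172.16.0.0/12.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Prints "line:ip:port" for each IPv4:port pair
# that is not in 127.x, 10.x or 192.168.x.
find_public_endpoints() {
  local file="$1"
  grep -noE '([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5}' "$file" |
    grep -vE '^[0-9]+:(127\.|10\.|192\.168\.)' || true
}

# Usage: find_public_endpoints /path/to/repository/build.sh
```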


2. scan-node.sh: Node.js / npm

Focused on JavaScript ecosystem-specific threats. 8 checks:

Install hooks in package.json – Checks if preinstall, postinstall or preuninstall contain curl, wget, eval, base64 or node -e. Also detects dependencies whose names are tied to known typosquatting or supply chain incidents (1odash imitating lodash, event-stream, flatmap-stream, ua-parser-js) and suspicious bundledDependencies.
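
Here is what a grep-only version of the install hook check could look like. It is a sketch, not the scanner's implementation: check_install_hooks is a hypothetical name, and the regex assumes each script sits on a single line with no escaped quotes inside it.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Prints "line:content" for lifecycle scripts
# whose command mentions a download, eval or inline-node pattern.
check_install_hooks() {
  local pkg="$1"
  grep -nE '"(preinstall|postinstall|preuninstall)"[[:space:]]*:[[:space:]]*"[^"]*(curl|wget|eval|base64|node -e)' "$pkg" || true
}

# Usage: check_install_hooks /path/to/repository/package.json
```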

eval() / Function() abuse – Detects eval() with dynamic arguments, new Function(), eval(atob(...)), eval(Buffer.from(...)) and the most dangerous pattern: eval(response.data) or eval(res.body), which is fetch-then-eval (remote code execution).

child_process / shell execution – Finds execSync, exec, spawnSync with suspicious commands, obfuscated require('child_process') and dynamic require with string concatenation.

Data exfiltration – Detects reading process.env + HTTP requests, access to sensitive files (.ssh, .aws, .npmrc), exfiltration URLs (Discord, Slack, ngrok, webhook.site, pipedream, requestbin) and CI/CD token theft (GITHUB_TOKEN, NPM_TOKEN, AWS_SECRET_ACCESS_KEY).

Obfuscated payloads – Long Base64 strings (>100 chars), hex-encoded patterns (\x in bulk), Unicode escapes, Buffer.from + base64 + eval/exec, javascript-obfuscator patterns (_0x variables) and JSFuck-style obfuscation.

npm/yarn config – Custom registries pointing outside official npm and authentication tokens in .npmrc/.yarnrc.

Lockfile integrity – Checks if URLs in package-lock.json point outside official registries (npmjs.org, yarnpkg.com) or if tarballs come from non-standard locations.
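
The lockfile check boils down to listing every "resolved" URL and discarding the ones that point at the official registries. A minimal sketch, assuming the registry hosts registry.npmjs.org and registry.yarnpkg.com; check_lockfile_urls is a hypothetical name:

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Prints "line:content" for every "resolved"
# entry that does not point at the npm or Yarn registry.
check_lockfile_urls() {
  local lock="$1"
  grep -nE '"resolved"[[:space:]]*:[[:space:]]*"' "$lock" |
    grep -vE '"resolved"[[:space:]]*:[[:space:]]*"https://registry\.(npmjs\.org|yarnpkg\.com)/' || true
}

# Usage: check_lockfile_urls /path/to/repository/package-lock.json
```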

Native addons – Detects binding.gyp files and .node binaries outside node_modules.


3. scan-php.sh: PHP webshells and backdoors

Scanner specialized in webshells, backdoors and obfuscated PHP code. 9 checks:

Webshell signatures – Detects known shells (c99shell, r57shell, b374k, WSO, FilesMan, Ani-Shell, ALFA Shell, Weevely, p0wny, phpspy). Also detects the most common webshell pattern: $_GET/$_POST/$_REQUEST/$_COOKIE passed directly to dangerous functions (eval, system, exec, passthru, shell_exec, popen, proc_open). Identifies file_get_contents + eval (fetch-then-eval), php://input + code execution and hardcoded MD5 hashes (backdoor passwords).

Obfuscation chains – eval(base64_decode(...)), eval(gzinflate(...)), eval(str_rot13(...)), nested decode chains, assert() with encoded input, preg_replace with /e modifier and create_function().

Dangerous functions – system/passthru/shell_exec with variables, backtick operator with interpolation, proc_open() and dl() (dynamic extension loading).

File write abuse – file_put_contents with user-controlled path, fwrite with decoded content and move_uploaded_file without MIME validation.

PHP hidden in non-PHP files – Detects <?php code inside .jpg, .png, .gif, .ico, .css, .txt, .html files. Also flags double extensions (.php.jpg) and suspicious extensions (.phtml, .php5, .phar).
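
Finding PHP inside an image or stylesheet needs nothing more than find plus a fixed-string grep. The sketch below covers only the first part of this check (the <?php tag in the listed extensions); find_php_in_assets is a hypothetical name.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Lists non-PHP files that contain a PHP open tag.
find_php_in_assets() {
  local dir="$1"
  find "$dir" -type f \
    \( -name '*.jpg' -o -name '*.png' -o -name '*.gif' -o -name '*.ico' \
       -o -name '*.css' -o -name '*.txt' -o -name '*.html' \) \
    ! -path '*/.git/*' -exec grep -lF '<?php' {} + || true
}

# Usage: find_php_in_assets /path/to/repository
```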

Encoded payloads – Very long Base64 strings (>200 chars), long hex sequences (\x in bulk), chr() obfuscation (10+ chained characters), variable variables and string concatenation building function names.

.htaccess abuse – Injection via auto_prepend_file/auto_append_file, AddHandler/AddType making images executable as PHP and SetHandler directives.

PHP config – php.ini/.user.ini with auto_prepend_file and disable_functions directives.

Composer – Suspicious scripts in install hooks (post-install-cmd, post-update-cmd) containing curl, wget, bash, eval or base64.


4. scan-python.sh: Python / PyPI

Focused on Python ecosystem threats. 8 checks:

Malicious setup.py – Detects imports of suspicious modules during setup (subprocess, os.system, socket, urllib, requests), calls to exec/eval/compile, base64 decoding, custom install command classes with code execution and URL fetching during pip install.

exec() / eval() abuse – exec/eval with encoded payloads (base64, codecs, zlib, marshal, bytes.fromhex), exec(compile(...)) and the most critical pattern: requests.get + exec or urllib response + eval (fetch-then-exec). Also detects multi-layer chains: base64 -> zlib -> marshal -> exec.

Obfuscated payloads – Long Base64 strings (>200 chars), marshal.loads() (bytecode deserialization), dynamic __import__, importlib.import_module with dynamic strings, bytes.fromhex + exec, bulk chr(), lambda wrapping exec(), getattr(__builtins__) and browser credential theft patterns (Chrome, Firefox + sqlite3).

Data exfiltration – Reading sensitive files (.ssh, .aws, .env, /etc/passwd, /etc/shadow) + network calls, exfiltration webhooks, os.environ collection + HTTP POST, DNS resolution + system info gathering and token theft.

subprocess / os command execution – subprocess with shell=True, os.system() with variable argument, os.popen() and reverse shell patterns (socket.connect + exec, pty.spawn).

Deserialization risks – pickle.load(s) with untrusted data (HTTP, file open, stdin), yaml.load() without SafeLoader and custom __reduce__ with command execution.

Requirements files – Dependencies from git URLs (git+https://), non-PyPI URLs and custom indexes (--index-url, --extra-index-url) pointing outside known registries.
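
A requirements file is line-oriented, which makes this check a natural fit for grep. A minimal sketch, assuming https://pypi.org/simple is the only index treated as trusted; check_requirements is a hypothetical name:

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Prints "line:content" for git+ dependencies
# and for index options that do not point at pypi.org.
check_requirements() {
  local req="$1"
  grep -nE 'git\+https?://|^[[:space:]]*(-i|--index-url|--extra-index-url)[[:space:]=]' "$req" |
    grep -vE '(--index-url|-i)[[:space:]=]+https://pypi\.org/simple' || true
}

# Usage: check_requirements /path/to/repository/requirements.txt
```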

Suspicious .pth files – .pth files with import statements (auto-execute on Python startup) and .egg-info with entry_points.
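
Python's site module executes any .pth line that starts with import followed by a space or tab, so those lines are exactly what is worth grepping for. A small sketch of that idea; find_pth_imports is a hypothetical name and the real scanner may match more than this:

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Prints "file:line:content" for executable
# import lines inside .pth files.
find_pth_imports() {
  local dir="$1"
  find "$dir" -type f -name '*.pth' -exec grep -HnE '^import[[:space:]]' {} + || true
}

# Usage: find_pth_imports /path/to/repository
```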


5. scan-go.sh: Go modules

Scanner for supply chain attacks in Go projects. 7 checks:

replace directives in go.mod – Detects replace pointing to local paths (../), non-standard repositories (outside github.com, golang.org, google, gopkg.in), retract directives and dependencies from unusual domains.
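
In go.mod the => arrow only appears in replace directives, so the local-path case can be sketched with a single grep. This covers only relative paths (./ and ../), not the non-standard repository or retract checks; check_gomod_replace is a hypothetical name.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Prints "line:content" for replace directives
# whose target is a relative filesystem path.
check_gomod_replace() {
  local gomod="$1"
  grep -nE '=>[[:space:]]*\.\.?/' "$gomod" || true
}

# Usage: check_gomod_replace /path/to/repository/go.mod
```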

Suspicious init() functions – init() functions in files that also import net/http, net.Dial, os/exec, access environment variables + make network calls, or access sensitive paths (.ssh, .aws, /etc/passwd).

os/exec abuse – exec.Command with shell interpreters (bash, sh, powershell) + suspicious commands (curl, wget, nc, base64, whoami), fmt.Sprintf as exec argument (obfuscation) and exec.Command with os.Getenv (environment variable as command).

Data exfiltration – Webhook/exfiltration URLs, system info collection (os.Hostname, runtime.GOOS) + HTTP POST, reading sensitive files, raw TCP/UDP connections + exec (possible C2) and hardcoded IPs.

CGo abuse – import "C" with dangerous C functions (system(), popen(), execl/execv/execvp(), fork(), dlopen()).

go:generate directives – Suspicious generate commands (curl, wget, bash, powershell, rm, nc, python) and remote URL fetching.
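
A go:generate check has to match command names as whole words, otherwise a harmless flag like -type=Sync would trip on nc. The sketch below shows one way to do that with plain ERE; the function name and the regex are my own assumptions, not the scanner's.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Lists .go files with a go:generate directive
# that invokes one of the listed commands as a standalone word.
check_go_generate() {
  local dir="$1"
  grep -rlE --include='*.go' \
    '^//go:generate[[:space:]]+(.*[^[:alnum:]_])?(curl|wget|bash|powershell|rm|nc|python)([^[:alnum:]_]|$)' \
    "$dir" || true
}

# Usage: check_go_generate /path/to/repository
```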

Build constraints and embed tricks – go:embed with executable/hidden files (.sh, .bat, .exe, .dll, .so), //go:build ignore with exec/network code, go:linkname and syscall.Exec/ForkExec/StartProcess calls.


6. scan-polinrider.sh: PolinRider malware

Scanner with a specific signature for the PolinRider malware, which injects obfuscated JS payloads into project configuration files and force-pushes via batch scripts. 4 checks:

Primary signature – Looks for the obfuscated payload ("rmcej%otb%",2857687) in files like postcss.config.mjs, tailwind.config.js, eslint.config.mjs, next.config.mjs and babel.config.js.
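
Because the primary signature is a fixed string, a sketch of this check is short. It searches only the config file names listed above for the rmcej%otb% marker; find_polinrider_signature is a hypothetical name and the real scanner may look at more files or more of the payload.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Lists known config files that contain the
# PolinRider marker string.
find_polinrider_signature() {
  local dir="$1"
  find "$dir" -type f \
    \( -name 'postcss.config.mjs' -o -name 'tailwind.config.js' \
       -o -name 'eslint.config.mjs' -o -name 'next.config.mjs' \
       -o -name 'babel.config.js' \) \
    ! -path '*/node_modules/*' -exec grep -lF 'rmcej%otb%' {} + || true
}

# Usage: find_polinrider_signature /path/to/repository
```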

Secondary signature – Detects the pattern global['!']='8-270-2';var _$_1e42=. With the --js-all flag, scans all .js files beyond known configs.

Propagation scripts – Looks for temp_auto_push.bat (force-push script) and config.bat (orchestrator).

Git artifacts – Detects config.bat entries in .gitignore and amended commits in git reflog, behavior consistent with PolinRider propagation.

Accepts extra flags: --verbose for detailed output and --js-all to scan all .js files.


Best practices when analyzing suspicious repositories

  1. Never clone directly on your machine. Use an isolated environment: VM, container or disposable EC2 instance.

  2. Never run npm install, pip install or go build before scanning. Install hooks are attack vector #1. Clone and scan before anything else.

  3. Combine tools. Shady Project Scanner is pattern-based and complements (not replaces) tools like npm audit, pip-audit, Snyk or Semgrep.

  4. Be suspicious of repos that ask you to disable security. “Disable your antivirus before running” is the oldest red flag there is.

  5. Check the author’s account. New account, no history, no contributions to other projects? The repo is probably not what it claims to be.


Summary

Shady Project Scanner is a lightweight, dependency-free tool that runs 50 checks across 6 scanners, covering everything from PHP webshells to supply chain attacks in Go modules. It’s not a silver bullet, but it’s a fast and efficient first line of defense before running any third-party code.

If you work in security, do CTFs or simply want to stop blindly trusting npm install, check out the project and contribute.

git clone https://github.com/lauralesteves/shady-project-scanner.git
cd shady-project-scanner
make scan TARGET=/path/to/suspicious/repo