CVE-2021-42574 PoC: Unicode Code Injection Vulnerability

Associated Vulnerability
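Title: Unicode Code Injection Vulnerability (CVE-2021-42574)
Description: Unicode (the Universal Character Set) is a universal character encoding standard maintained by the Unicode Consortium. It assigns a code to every character and symbol of every language in the world. The Unicode Specification through version 14.0 contains a code injection vulnerability, which stems from possible bidirectional text spoofing when certain characters are displayed.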
Description
Checks your files for the existence of Unicode BIDI characters, which can be misused for supply chain attacks. See CVE-2021-42574.
Readme
# BIDI Character Detector
This tool checks your files for Unicode BIDI characters, which can be misused for supply chain attacks, and so helps mitigate [CVE-2021-42574](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-42574).
It is written in Rust and distributed as a small (< 3MB) Docker-compatible container for fast and easy use.

For an explanation of the attack, have a look at [GitHub's blog entry](https://github.blog/changelog/2021-10-31-warning-about-bidirectional-unicode-text/) or the [original paper where the attack was published](https://trojansource.codes/).
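
As a rough illustration of what checking for BIDI characters involves, the sketch below matches the nine Unicode bidirectional control characters (embeddings, overrides, isolates and their terminators) and counts them in a string. The function names `bidi_name` and `count_bidi` are hypothetical, and this is not necessarily how the tool itself is implemented or which list it uses.

```rust
/// Returns a label for `c` if it is one of the Unicode bidirectional
/// control characters, or `None` otherwise.
/// Assumption: this list is illustrative; the tool's own list may differ.
fn bidi_name(c: char) -> Option<&'static str> {
    match c {
        '\u{202A}' => Some("LRE (Left-to-Right Embedding)"),
        '\u{202B}' => Some("RLE (Right-to-Left Embedding)"),
        '\u{202C}' => Some("PDF (Pop Directional Formatting)"),
        '\u{202D}' => Some("LRO (Left-to-Right Override)"),
        '\u{202E}' => Some("RLO (Right-to-Left Override)"),
        '\u{2066}' => Some("LRI (Left-to-Right Isolate)"),
        '\u{2067}' => Some("RLI (Right-to-Left Isolate)"),
        '\u{2068}' => Some("FSI (First Strong Isolate)"),
        '\u{2069}' => Some("PDI (Pop Directional Isolate)"),
        _ => None,
    }
}

/// Counts the bidirectional control characters contained in `text`.
fn count_bidi(text: &str) -> usize {
    text.chars().filter(|c| bidi_name(*c).is_some()).count()
}
```

Calling `count_bidi` on ordinary source code should return `0`; any non-zero count means the text contains characters that can change the order in which it is displayed.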

## Installation
This package is mostly intended to be used via its Docker container, but local installation is also possible.
Compilation requires at least Rust 1.56.0 because it uses the Rust 2021 Edition.
Clone this repository and run `cargo install --path .` to install the binary for your current user. After that, you can invoke the `bidi_detector` command.


## Usage
Running the tool via its official Docker container is probably the easiest way to get started.
To run it via docker, the following command should work to scan all files inside your current working directory:
```bash
docker run --rm -it -v $(pwd):/data ghcr.io/maweil/bidi_char_detector:latest
```

Depending on your system you may have to adapt this command slightly. If you use e.g. podman and have SELinux enabled, try the following command instead:

```bash
podman run --rm -it -v $(pwd):/data:Z ghcr.io/maweil/bidi_char_detector:latest
```

### Configuration
By default, all files will be checked. If you have binary files inside the current directory, the command will fail because it can't decode a non-UTF8 encoded file.
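
To see why binary files are a problem, here is a minimal sketch of UTF-8 decoding using the Rust standard library's `String::from_utf8`. The `Decoded` type and `decode` function are hypothetical names for illustration and are not taken from the tool's source.

```rust
/// Outcome of trying to interpret a file's bytes as UTF-8 text.
#[derive(Debug, PartialEq)]
enum Decoded {
    /// The bytes were valid UTF-8.
    Text(String),
    /// The bytes were not valid UTF-8; `valid_up_to` is the byte offset
    /// of the first invalid sequence.
    Invalid { valid_up_to: usize },
}

/// Tries to decode `bytes` as UTF-8 without panicking on binary data.
fn decode(bytes: Vec<u8>) -> Decoded {
    match String::from_utf8(bytes) {
        Ok(text) => Decoded::Text(text),
        Err(err) => Decoded::Invalid {
            valid_up_to: err.utf8_error().valid_up_to(),
        },
    }
}
```

A scanner built this way can decide per file whether an `Invalid` result should abort the run or merely be reported and skipped.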
To adapt the command to your needs, place a file called `bidi_config.toml` inside the root of your project.
An example file is included in this repository and reproduced below. The options are described in more detail after the example:

```toml
[general]
includes = [ 
    "src/**/*",
    "**/*.patch",
    "**/*.json",
    "**/Dockerfile",
    "test/*.js"
]
excludes = [
    ".git/*",
    "target/*"
]

[display]
show_details = true
```

#### General Settings
This section includes two arrays (`includes` and `excludes`) where you can specify patterns of files to be scanned (or to be excluded from the scan).
Please make sure your patterns actually match the files inside the directory, not the directory name itself, otherwise your files will not be scanned.
If you want to scan all files and only exclude e.g. your `.git` directory, the following configuration would do the trick:

```toml
[general]
includes = [ 
    "**/*"
]
excludes = [
    ".git/*",
]

[display]
show_details = true
```

If you instead want to explicitly define which folder contains your source files, the following configuration scans all files in the `src` directory (without ignoring anything):
```toml
[general]
includes = [ 
    "src/**/*"
]
excludes = [
]

[display]
show_details = true
```

#### Display Settings
The following settings are available in this section:

| Setting               | Description                                                                                                 | Optional | Default | Introduced in |
| --------------------- | ----------------------------------------------------------------------------------------------------------- | -------- | ------- | ------------- |
| `show_details`        | Decides whether to print out line/pos and type of the detected BIDI character if found (see examples below) | No       | -       | v0.1.1        |
| `ignore_invalid_data` | Whether to print an error message when binary/non-UTF8 files are detected                                   | Yes      | true    | v0.1.2        |
| `verbose`             | Whether to print the filename even if no suspicious characters were found                                   | Yes      | true    | v0.1.2        |

**Example:** `show_details = true`, `verbose = true`

```txt
src/lib.rs - 0 BIDI characters
src/main.rs - 0 BIDI characters
test/example-commenting-out.js - 6 BIDI characters
Found character RLO (Right-to-Left Override), test/example-commenting-out.js:4:3
Found character LRI (Left-to-Right Isolate), test/example-commenting-out.js:4:7
Found character PDI (Pop Directional Isolate), test/example-commenting-out.js:4:20
Found character LRI (Left-to-Right Isolate), test/example-commenting-out.js:4:22
Found character RLO (Right-to-Left Override), test/example-commenting-out.js:6:20
Found character LRI (Left-to-Right Isolate), test/example-commenting-out.js:6:24
Found 6 potentially dangerous Unicode BIDI characters!
```

**Example:** `show_details = false`, `verbose = true`

```txt
src/lib.rs - 0 BIDI characters
src/main.rs - 0 BIDI characters
test/example-commenting-out.js - 6 BIDI characters
Found 6 potentially dangerous Unicode BIDI characters!
```

**Example:** `show_details = false`, `verbose = false`
```txt
test/example-commenting-out.js - 6 BIDI characters
Found 6 potentially dangerous Unicode BIDI characters!
```
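
The `file:line:pos` entries in the detailed output can be produced by walking the text line by line. The sketch below assumes 1-based line numbers and 1-based character (not byte) columns; that convention, the `Finding` type, and the report format (which prints the code point instead of the character name) are assumptions for illustration, not the tool's confirmed behavior.

```rust
/// One detected character: 1-based line, 1-based character column,
/// and the character itself. (Hypothetical type for this sketch.)
#[derive(Debug, PartialEq)]
struct Finding {
    line: usize,
    col: usize,
    ch: char,
}

/// True for the Unicode bidirectional control characters
/// U+202A..=U+202E and U+2066..=U+2069.
fn is_bidi_control(c: char) -> bool {
    matches!(c, '\u{202A}'..='\u{202E}' | '\u{2066}'..='\u{2069}')
}

/// Collects the position of every bidirectional control character in `text`.
fn find_bidi(text: &str) -> Vec<Finding> {
    let mut findings = Vec::new();
    for (line_idx, line) in text.lines().enumerate() {
        // Columns count characters, so multi-byte characters advance by one.
        for (col_idx, ch) in line.chars().enumerate() {
            if is_bidi_control(ch) {
                findings.push(Finding { line: line_idx + 1, col: col_idx + 1, ch });
            }
        }
    }
    findings
}

/// Formats one report line per finding, e.g. "Found character U+202E, a.js:4:3".
fn report(path: &str, text: &str) -> Vec<String> {
    find_bidi(text)
        .iter()
        .map(|f| format!("Found character U+{:04X}, {}:{}:{}", f.ch as u32, path, f.line, f.col))
        .collect()
}
```

Counting characters instead of bytes keeps the reported column stable when a line contains other multi-byte characters before the finding.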

## Credits
All credit for discovering the attack, including the list of relevant BIDI characters, goes to the original authors of the corresponding paper.
Please cite their original paper when building on their work.

The file `test/example-commenting-out.js` in this repository is a copy of [commenting-out.js](https://github.com/nickboucher/trojan-source/blob/main/JavaScript/commenting-out.js) in their original repository. Its licensing follows the [original repository](https://github.com/nickboucher/trojan-source) (MIT License).
It is used here for test purposes only.

```bibtex
@article{boucher_trojansource_2021,
    title = {Trojan {Source}: {Invisible} {Vulnerabilities}},
    author = {Nicholas Boucher and Ross Anderson},
    year = {2021},
    journal = {Preprint},
    eprint = {2111.00169},
    archivePrefix = {arXiv},
    primaryClass = {cs.CR},
    url = {https://arxiv.org/abs/2111.00169}
}
```
File Snapshot

[4.0K] /data/pocs/a7e4f3380a5ab07dfe400db42b76b69c45d57ba8
├── [ 231] bidi_config.toml
├── [1.9K] Cargo.lock
├── [ 310] Cargo.toml
├── [ 557] Dockerfile
├── [1.0K] LICENSE
├── [5.9K] README.md
├── [4.0K] src
│   ├── [4.9K] lib.rs
│   └── [6.4K] main.rs
└── [4.0K] test
    └── [ 163] example-commenting-out.js

2 directories, 9 files