### Key Information

#### Vulnerability Type

- **SSRF (Server-Side Request Forgery) Vulnerability**

#### Vulnerability Description

- **Issue**: The `_split_url` function in `mindsdb/utilities/security.py` uses `urlparse().netloc` to compare URL origins. However, `netloc` includes the `userinfo` component (e.g., `user:pass@host`). An attacker can craft a URL like:

  ```
  http://attacker@127.0.0.1:4444/
  ```

  This results in `netloc = "attacker@127.0.0.1:4444"`, which fails to match the blacklisted origin `127.0.0.1:4444`, thereby bypassing SSRF protection.

#### Remediation Measures

1. **Changes Made**:
   - Replace `parsed_url.netloc` with a host string built from `parsed_url.hostname` and `parsed_url.port`. This change:
     - Removes `userinfo`: `hostname` never includes `user@` or `user:pass@`.
     - Preserves port-aware matching: origins with explicit ports are still compared correctly.
     - Maintains existing validation checks: the existing check for a missing `netloc` is retained to reject malformed URLs.

2. **Fixed Code Example**:

   ```python
   # hostname excludes any userinfo; it is None when the URL has no host
   hostname = parsed_url.hostname or ""
   port = parsed_url.port
   # Append the port only when the URL specifies one
   host = f"{hostname}:{port}" if port else hostname
   return parsed_url.scheme.lower(), host.lower()
   ```

#### Reproduction Steps

1. Import `urlparse` from `urllib.parse`.
2. Construct a malicious URL and retrieve the `netloc` value.

   ```python
   from urllib.parse import urlparse

   url = "http://attacker@127.0.0.1:4444/"
   parsed = urlparse(url)
   print(parsed.netloc)  # "attacker@127.0.0.1:4444"
   ```

#### Testing Plan

- Verify that `_split_url` correctly handles various URL scenarios after the fix:
  - URLs with different schemes and host combinations.
  - URLs with and without explicit ports.
  - Invalid inputs should raise `ValueError`.
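
The checks in the testing plan could be sketched roughly as follows. `split_url_sketch` is a hypothetical stand-in for `_split_url`, reconstructed from the fixed snippet above. Its name, signature, and validation (rejecting a missing scheme or `netloc` with `ValueError`) are assumptions for illustration, not the actual MindsDB implementation.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_url_sketch(url: str) -> Tuple[str, str]:
    """Hypothetical stand-in for _split_url, based on the fixed snippet."""
    parsed_url = urlparse(url)
    # Assumed validation: reject URLs without a scheme or network location.
    if not parsed_url.scheme or not parsed_url.netloc:
        raise ValueError(f"Invalid URL: {url!r}")
    hostname = parsed_url.hostname or ""
    # .port itself raises ValueError for a non-numeric or out-of-range port.
    port = parsed_url.port
    host = f"{hostname}:{port}" if port else hostname
    return parsed_url.scheme.lower(), host.lower()


# userinfo no longer leaks into the origin used for comparison
assert split_url_sketch("http://attacker@127.0.0.1:4444/") == ("http", "127.0.0.1:4444")
assert split_url_sketch("http://user:pass@localhost:8080") == ("http", "localhost:8080")

# URLs with and without explicit ports, with mixed-case scheme and host
assert split_url_sketch("HTTPS://Example.COM/path") == ("https", "example.com")
assert split_url_sketch("http://127.0.0.1:4444/") == ("http", "127.0.0.1:4444")

# Invalid input is rejected
try:
    split_url_sketch("not a url")
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for a malformed URL")
```

With this approach, the crafted URL and the plain `http://127.0.0.1:4444/` normalize to the same `(scheme, host)` pair, so a comparison against the blacklisted origin would match in both cases.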