So I have this website.
After inspecting the network traffic of the download button, I got the following curl POST request:
curl "https://flood-map-for-planning.service.gov.uk/pdf" -X POST -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:104.0) Gecko/20100101 Firefox/104.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8" -H "Accept-Language: en-US,en;q=0.5" -H "Accept-Encoding: gzip, deflate, br" -H "Content-Type: application/x-www-form-urlencoded" -H "Origin: https://flood-map-for-planning.service.gov.uk" -H "Connection: keep-alive" -H "Referer: https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR" -H "Upgrade-Insecure-Requests: 1" -H "Sec-Fetch-Dest: document" -H "Sec-Fetch-Mode: navigate" -H "Sec-Fetch-Site: same-origin" -H "Sec-Fetch-User: ?1" -H "TE: trailers" --data-raw "id=1660136366038&polygon=&center="%"5B429240"%"2C431613"%"5D&reference=&scale=2500"
I then used this website to convert the curl command to C#.
This is what I got:
using (var httpClient = new HttpClient())
{
    using (var request = new HttpRequestMessage(new HttpMethod("POST"), "https://flood-map-for-planning.service.gov.uk/pdf"))
    {
        request.Headers.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:104.0) Gecko/20100101 Firefox/104.0");
        request.Headers.TryAddWithoutValidation("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8");
        request.Headers.TryAddWithoutValidation("Accept-Language", "en-US,en;q=0.5");
        request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate, br");
        request.Headers.TryAddWithoutValidation("Origin", "https://flood-map-for-planning.service.gov.uk");
        request.Headers.TryAddWithoutValidation("Connection", "keep-alive");
        request.Headers.TryAddWithoutValidation("Referer", "https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR");
        request.Headers.TryAddWithoutValidation("Upgrade-Insecure-Requests", "1");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-Dest", "document");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-Mode", "navigate");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-Site", "same-origin");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-User", "?1");
        request.Headers.TryAddWithoutValidation("TE", "trailers");
        request.Content = new StringContent("id=1660136366038&polygon=&center=");
        request.Content.Headers.ContentType = MediaTypeHeaderValue.Parse("application/x-www-form-urlencoded");
        var response = await httpClient.SendAsync(request);
    }
}
I changed it to:
var httpClient = new HttpClient();
var request =
    new HttpRequestMessage(new HttpMethod("POST"), "https://flood-map-for-planning.service.gov.uk/pdf");
request.Headers.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:104.0) Gecko/20100101 Firefox/104.0");
request.Headers.TryAddWithoutValidation("Accept", "application/pdf");
request.Headers.TryAddWithoutValidation("Accept-Language", "en-US,en;q=0.5");
request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate, br");
request.Headers.TryAddWithoutValidation("Origin", "https://flood-map-for-planning.service.gov.uk");
request.Headers.TryAddWithoutValidation("Connection", "keep-alive");
request.Headers.TryAddWithoutValidation("Referer", "https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR");
request.Headers.TryAddWithoutValidation("Upgrade-Insecure-Requests", "1");
request.Headers.TryAddWithoutValidation("Sec-Fetch-Dest", "document");
request.Headers.TryAddWithoutValidation("Sec-Fetch-Mode", "navigate");
request.Headers.TryAddWithoutValidation("Sec-Fetch-Site", "same-origin");
request.Headers.TryAddWithoutValidation("Sec-Fetch-User", "?1");
request.Headers.TryAddWithoutValidation("TE", "trailers");
request.Content = new StringContent("center=&scale=2500");
var response = httpClient.Send(request);
response.Content.Headers.Add("Content-Disposition", "inline;filename=\"Testpdf.pdf\"");
response.Content.Headers.Add("Content-Name", "Testpdf.PDF");
response.Content.Headers.Add("Content-Type", "application/pdf;charset=UTF-8");
if (response.IsSuccessStatusCode)
{
    using (FileStream fs = new FileStream("somepdf.pdf", FileMode.CreateNew))
    {
        using (StreamWriter writer = new StreamWriter(fs))
        {
            var contentStream = response.Content.ReadAsStream(); // get the actual content stream
            writer.Write(contentStream);
        }
    }
}
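And here is the problem.
My goal is to download the PDF locally.
I usually end up with a 1 KB or 6 KB file.
The curl command with an output parameter works fine. I'm just not sure what is missing from the C# HTTP POST request above.
As you can see, I have added a FileStream and a StreamWriter.
I also tried manipulating the response to turn it into an application/pdf response.
Any idea what I am doing wrong?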
=================================================
EDIT
Thanks to @thehennyy,
this is a working solution:
var unixTimestamp = (long)DateTime.UtcNow.Subtract(DateTime.UnixEpoch).TotalSeconds;
HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
};
using (var httpClient = new HttpClient(handler))
{
    using (var request =
           new HttpRequestMessage(new HttpMethod("POST"), "https://flood-map-for-planning.service.gov.uk/pdf"))
    {
        request.Headers.TryAddWithoutValidation("Referer",
            "https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR");
        request.Content =
            new StringContent($"id={unixTimestamp}&polygon=&center=[429240,431613]&reference=&scale=2500");
        request.Content.Headers.ContentType = MediaTypeHeaderValue.Parse("application/x-www-form-urlencoded");
        var response = await httpClient.SendAsync(request);
        if (response.IsSuccessStatusCode)
        {
            using (FileStream fs = new FileStream("somepdf.pdf", FileMode.Create))
            {
                var contentStream = await response.Content.ReadAsStreamAsync();
                await contentStream.CopyToAsync(fs);
            }
        }
    }
}
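There are several things to consider here:
The curl-to-HttpClient converter seems to have a problem converting the POST content: the body it generated is cut off after "center=". The following works for me: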
request.Content = new StringContent("id=1&polygon=&center=[429240,431613]&reference=&scale=2500");
request.Content.Headers.ContentType = MediaTypeHeaderValue.Parse("application/x-www-form-urlencoded");
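The id parameter must be provided, otherwise the request fails. The website uses the current Unix timestamp as the value of the id parameter.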
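The two points above (a complete form body and an id value) can be sketched with FormUrlEncodedContent, which percent-encodes each value and sets the Content-Type header itself. This is only a sketch under assumptions: the class name FloodMapForm and the method Build are hypothetical, the field names are taken from the curl command in the question, and passing the id in milliseconds is a guess based on the 13-digit sample value (the answer shows that id=1 is accepted too).

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

// Hypothetical helper, not part of the original answer.
public static class FloodMapForm
{
    // Builds the form body for the /pdf endpoint.
    // FormUrlEncodedContent percent-encodes the values, so the center value
    // "[429240,431613]" is sent as "%5B429240%2C431613%5D", matching the
    // encoding used by the curl command in the question.
    public static FormUrlEncodedContent Build(int easting, int northing, long id)
    {
        var fields = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("id", id.ToString()),
            new KeyValuePair<string, string>("polygon", ""),
            new KeyValuePair<string, string>("center", $"[{easting},{northing}]"),
            new KeyValuePair<string, string>("reference", ""),
            new KeyValuePair<string, string>("scale", "2500"),
        };
        return new FormUrlEncodedContent(fields);
    }
}
```

It could be used as `request.Content = FloodMapForm.Build(429240, 431613, DateTimeOffset.UtcNow.ToUnixTimeMilliseconds());`, in which case the separate ContentType assignment is no longer needed.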
Adding headers to the response with response.Content.Headers.Add([...]) makes no sense; just remove those lines.
Writing the content to disk can be simpler:
using (FileStream fs = new FileStream("somepdf.pdf", FileMode.Create))
{
    var contentStream = await response.Content.ReadAsStreamAsync();
    await contentStream.CopyToAsync(fs);
}
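A likely reason the StreamWriter version in the question produces a tiny file is that StreamWriter.Write, when given a Stream, resolves to the Write(object) overload and writes the result of ToString(), that is the type name, rather than the bytes. The sketch below illustrates this with in-memory streams; StreamCopyDemo and its methods are hypothetical names used only for this demonstration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class, not part of the original answer.
public static class StreamCopyDemo
{
    // Mirrors the question's approach: Write(object) writes source.ToString(),
    // so only the type name of the stream ends up in the target.
    public static byte[] WriteWithStreamWriter(Stream source)
    {
        var target = new MemoryStream();
        using (var writer = new StreamWriter(target))
        {
            writer.Write(source);
        }
        // ToArray still works after the writer has closed the MemoryStream.
        return target.ToArray();
    }

    // Mirrors the answer's approach: CopyTo transfers the raw bytes.
    public static byte[] WriteWithCopyTo(Stream source)
    {
        var target = new MemoryStream();
        source.CopyTo(target);
        return target.ToArray();
    }
}
```

With a MemoryStream holding `%PDF-1.7` as the source, the first method yields the text `System.IO.MemoryStream`, while the second yields the original bytes.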
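While testing, have a look at the "wrong" files: they are usually just HTML responses, sometimes containing an error message. View them as HTML. If they look like gibberish, you have to turn on automatic decompression: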
HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
};
var httpClient = new HttpClient(handler);
The automatic decompression values should match the following header value:
request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate");
Current versions of .NET also support "br" (DecompressionMethods.Brotli). Using automatic decompression helps in almost every case.
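To catch the "wrong" files mentioned above before they are written to disk, one option is to check that the body starts with the PDF signature `%PDF-`. The following is a minimal sketch, not the answer's method: PdfDownload, LooksLikePdf, SaveIfPdfAsync and CreateClient are hypothetical names, and DecompressionMethods.All assumes .NET Core 3.0 or later.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper class, not part of the original answer.
public static class PdfDownload
{
    // A PDF file starts with the ASCII bytes "%PDF-".
    public static bool LooksLikePdf(byte[] data)
    {
        byte[] magic = Encoding.ASCII.GetBytes("%PDF-");
        if (data == null || data.Length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
        {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }

    // Enables every decompression method the runtime supports (gzip, deflate, Brotli).
    public static HttpClient CreateClient()
    {
        var handler = new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.All
        };
        return new HttpClient(handler);
    }

    // Sends the request and saves the body only if it looks like a PDF.
    public static async Task<bool> SaveIfPdfAsync(HttpClient client, HttpRequestMessage request, string path)
    {
        using var response = await client.SendAsync(request);
        byte[] body = await response.Content.ReadAsByteArrayAsync();
        if (!response.IsSuccessStatusCode || !LooksLikePdf(body))
        {
            // Probably an HTML error page: print the first part for diagnosis.
            Console.Error.WriteLine(Encoding.UTF8.GetString(body, 0, Math.Min(body.Length, 500)));
            return false;
        }
        await File.WriteAllBytesAsync(path, body);
        return true;
    }
}
```

Reading the whole body into memory keeps the check simple; for very large downloads, streaming with CopyToAsync as in the answer would be the better fit.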