curl vs URLConnection differences for HTTP requests

I am curious what the difference is between curl and a URLConnection socket. I can connect to an RSS feed with URLConnection just fine; however, when using curl, either through the terminal or with curlMBS, the request times out.

For example, this works perfectly fine with URLConnection. I get the RSS feed content in the ContentReceived event.

URLConnection1.Send("GET", "https://www.fs.usda.gov/wps/PA_WIDConsumption/rssgetfile?xFSENavChannel00=110512&desc=alerts")

However, the curlMBS code below produces an error

// URL of the RSS feed
Dim rssFeedUrl As String = "https://www.fs.usda.gov/wps/PA_WIDConsumption/rssgetfile?xFSENavChannel00=110510&desc=events"

// Initialize the CURL object
Dim curl As New CURLSMBS

// Set the URL
curl.OptionURL = rssFeedUrl

// Set the options to mimic a web browser
curl.OptionUserAgent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
'curl.OptionHTTPHeader = Array("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9", "Accept-Language: en-US,en;q=0.9", "Connection: keep-alive")
curl.OptionFollowLocation = True
curl.OptionTimeout = 60

// Collect the response body so it can be read from curl.OutputData
curl.CollectOutputData = True

// Perform the request
Dim result As Integer = curl.Perform

// Check for errors
If result = 0 Then
  // Save the response to a file
  Dim file As FolderItem = SpecialFolder.Desktop.Child("events.rss")
  If file <> Nil Then
    Dim outputStream As BinaryStream = BinaryStream.Create(file, True)
    outputStream.Write(curl.OutputData)
    outputStream.Close
    MsgBox("RSS feed saved to events.rss successfully.")
  Else
    MsgBox("Failed to save the RSS feed to a file.")
  End If
Else
  MsgBox("cURL Error: " + curl.DebugData)
End If

// Clean up
curl = Nil

This is the debug output from curlMBS. It connects and completes the TLS handshake but never receives the actual content. The same issue happens with the curl command in the terminal.

Host www.fs.usda.gov:443 was resolved.
IPv6: (none)
IPv4: 184.24.175.87
Trying 184.24.175.87:443...
Connected to www.fs.usda.gov (184.24.175.87) port 443
ALPN: curl offers http/1.1
TLSv1.3 (OUT), TLS handshake, Client hello (1):
TLSv1.3 (IN), TLS handshake, Server hello (2):
TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
TLSv1.3 (IN), TLS handshake, Certificate (11):
TLSv1.3 (IN), TLS handshake, CERT verify (15):
TLSv1.3 (IN), TLS handshake, Finished (20):
TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
TLSv1.3 (OUT), TLS handshake, Finished (20):
SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384 / X25519 / id-ecPublicKey
ALPN: server accepted http/1.1
Server certificate:
subject: C=US; ST=District of Columbia; L=Washington; O=U.S. Department of Agriculture; CN=www.fs.usda.gov
start date: Dec 18 00:00:00 2023 GMT
expire date: Dec 18 23:59:59 2024 GMT
issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
Certificate level 0: Public key type EC/prime256v1 (256/128 Bits/secBits), signed using sha256WithRSAEncryption
Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
using HTTP/1.x
GET /wps/PA_WIDConsumption/rssgetfile?xFSENavChannel00=110510&desc=events HTTP/1.1
Host: www.fs.usda.gov
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
Accept: */*
TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
old SSL session ID is stale, removing
Operation timed out after 60001 milliseconds with 0 bytes received
Closing connection
TLSv1.3 (OUT), TLS alert, close notify (256):
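
For anyone reproducing this from the terminal: the "Operation timed out" line in the log corresponds to curl error 28, which is also the exit status the command-line tool reports for a timeout. Below is a small sketch that maps a few of curl's documented exit codes to readable messages. The function name describe_curl_exit is made up for illustration, and only a handful of codes are covered.

```shell
# Hypothetical helper: translate a few curl exit codes into plain English.
# The numbers are curl's documented exit codes; the wording is ours.
describe_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "failed to connect" ;;
    28) echo "operation timed out" ;;
    60) echo "server certificate could not be verified" ;;
    *)  echo "other curl error ($1)" ;;
  esac
}

# Usage sketch (hits the network, so it is left commented out):
#   curl --silent --max-time 60 --output events.rss "$URL"
#   describe_curl_exit "$?"
```

Checking the exit status this way separates "the server never answered" (28) from DNS, connection, or certificate failures, which would point at a different cause.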

Do you realize that you are using different parameters in the URL?

?xFSENavChannel00=110512&desc=alerts

?xFSENavChannel00=110510&desc=events

Yeah, I was testing all the different RSS Feeds. Results are the same no matter which parameters are used.

Thanks!

I can confirm curl chokes on this site. I'm not sure what the issue is: even if you drop to regular HTTP it still times out, and even just loading the base URL “https://www.fs.usda.gov/” times out. This server does not play well with curl. I have tried adding all the pertinent request headers I can think of, and it still does not work.

Have you configured the cacerts file?
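
On the cacerts question: the log line "unable to get local issuer certificate (20), continuing anyway" suggests the certificate chain was not verified against a local CA bundle. The same log shows the handshake finishing and the GET being sent, so this may well be unrelated to the timeout. Still, here is a sketch of pointing command-line curl at an explicit bundle with --cacert. The bundle path and the function name fetch_with_ca are assumptions, not something from this thread.

```shell
CURL=${CURL:-curl}   # set CURL=echo for a dry run that only prints the arguments

# fetch_with_ca BUNDLE URL
# Refuses to run if the CA bundle file cannot be read, then fetches URL
# while verifying the server certificate against that bundle.
fetch_with_ca() {
  if [ ! -r "$1" ]; then
    echo "CA bundle not readable: $1" >&2
    return 1
  fi
  "$CURL" --silent --show-error --max-time 60 --cacert "$1" "$2"
}

# Example (the path is a placeholder, use wherever your bundle lives):
#   fetch_with_ca "$HOME/cacert.pem" 'https://www.fs.usda.gov/'
```

If the request still times out with a valid bundle, the certificate store can be ruled out as the cause.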

The following curl command gets me a correctly terminated response.

curl 'https://www.fs.usda.gov/' --compressed \
 -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:128.0) Gecko/20100101 Firefox/128.0' \
 -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/png,image/svg+xml,*/*;q=0.8' \
 -H 'Accept-Language: en-GB,en;q=0.5' -H 'Accept-Encoding: gzip, deflate, br, zstd' \
 -H 'Connection: keep-alive' \
 -H 'Upgrade-Insecure-Requests: 1' \
 -H 'Sec-Fetch-Dest: document' \
 -H 'Sec-Fetch-Mode: navigate' \
 -H 'Sec-Fetch-Site: cross-site' \
 -H 'Sec-Fetch-User: ?1' \
 -H 'Priority: u=0, i'

You may be able to trim that down, but it looks like the server is sensitive to the user-agent, which is the first thing I tried.
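
One way to do that trimming is to drop one header at a time and see whether the server still answers. The sketch below does this for a few of the headers from the command above; the function name request_without and the 20-second timeout are assumptions for illustration, and which headers the server really needs has not been established here.

```shell
CURL=${CURL:-curl}   # set CURL=echo for a dry run that only prints the arguments
URL='https://www.fs.usda.gov/'

# A subset of the known-good headers from the command above, one per line.
HEADERS='User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:128.0) Gecko/20100101 Firefox/128.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-GB,en;q=0.5
Connection: keep-alive'

# request_without N
# Sends the request with header number N (1-based) left out and prints the
# HTTP status code. curl prints 000 when no HTTP response was received.
request_without() {
  skip=$1
  set --                      # rebuild the argument list from scratch
  n=0
  while IFS= read -r header; do
    n=$((n + 1))
    [ "$n" -eq "$skip" ] && continue
    set -- "$@" -H "$header"
  done <<EOF
$HEADERS
EOF
  "$CURL" --silent --compressed --max-time 20 --output /dev/null \
          --write-out '%{http_code}' "$@" "$URL"
}

# Usage sketch (hits the network, so it is left commented out):
#   for i in 1 2 3 4; do
#     printf 'without header %s: ' "$i"; request_without "$i"; echo
#   done
```

A header whose removal turns a 200 into 000 (a timeout) is one the server appears to require; headers that can be removed without changing the result are candidates for trimming.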

You are right. Many sites adjust their response to the user-agent, and they don’t answer at all when the user-agent is missing or is not one they accept.

This appears to work. Thank you Matthew!