Skip to content

Conversation

@julianz-
Copy link
Contributor

@julianz- julianz- commented Nov 23, 2025

Added an early probe to identify when a client attempts to speak unencrypted HTTP on an HTTPS port.

When a plaintext connection is detected, the server now attempts to use the raw TCP socket (if wrapped) to send a "400 Bad Request" error response before any TLS failure. This unifies the behavior of the 'builtin' and 'pyopenssl' adapters for this specific error condition.

@codecov
Copy link

codecov bot commented Nov 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.01%. Comparing base (c6da545) to head (720ce41).
⚠️ Report is 4 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #800      +/-   ##
==========================================
+ Coverage   77.70%   78.01%   +0.31%     
==========================================
  Files          41       41              
  Lines        4691     4749      +58     
  Branches      542      542              
==========================================
+ Hits         3645     3705      +60     
- Misses        904      905       +1     
+ Partials      142      139       -3     

@julianz-

This comment was marked as outdated.

@julianz- julianz- force-pushed the http_https_handling branch from 33454e6 to 8f8ad17 Compare November 23, 2025 18:37
@julianz- julianz- force-pushed the http_https_handling branch 3 times, most recently from 12f7ee9 to 8fa9124 Compare November 23, 2025 19:03
@julianz-

This comment was marked as resolved.

@julianz- julianz- force-pushed the http_https_handling branch 2 times, most recently from e02b4c9 to 38e751b Compare November 24, 2025 00:19
@julianz- julianz- force-pushed the http_https_handling branch from 38e751b to 00bfcf8 Compare November 24, 2025 01:04
raise
return None
except errors.NoSSLError:
return self._send_bad_request_plain_http_error(s, addr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to pass the raw socket smuggled through NoSSLError here. Although, I think that maybe we could just preserve the underlying socket before line 300 and pass that. Both these options will make the s._socket shennanigans redundant.

Copy link
Contributor Author

@julianz- julianz- Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For pyopenSSL, the socket has already been wrapped by the time it arrives at _from_server_socket() so you can't smuggle the raw socket through here unless you unwrap it. For the builtin adapter this is not the case so everything works fine. If you are against unwrapping it for sending the 400 bad request message then I suggest we drop the http check in pyopenSSL's wrap() function, and just apply this change for the builtin adapter. pyopenSSL's still successfully detects the http attempt later during the handshake phase and is able to send a 400 message. So this PR would not unify things but would enable the builtin adapter to handle this error condition better.

Copy link
Member

@webknjaz webknjaz Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For pyopenSSL, the socket has already been wrapped by the time it arrives at _from_server_socket()

I don't see it — could you explain more?

My understanding is that s, addr = server_socket.accept() (line 288) works with a server socket, initialized somewhere here

cheroot/cheroot/server.py

Lines 1798 to 1871 in a475500

def prepare(self): # noqa: C901 # FIXME
"""Prepare server to serving requests.
It binds a socket's port, setups the socket to ``listen()`` and does
other preparing things.
"""
self._interrupt = None
if self.software is None:
self.software = '%s Server' % self.version
# Select the appropriate socket
self.socket = None
msg = 'No socket could be created'
if os.getenv('LISTEN_PID', None):
# systemd socket activation
self.socket = socket.fromfd(3, socket.AF_INET, socket.SOCK_STREAM)
elif isinstance(self.bind_addr, (str, bytes)):
# AF_UNIX socket
try:
self.bind_unix_socket(self.bind_addr)
except socket.error as serr:
msg = '%s -- (%s: %s)' % (msg, self.bind_addr, serr)
raise socket.error(msg) from serr
else:
# AF_INET or AF_INET6 socket
# Get the correct address family for our host (allows IPv6
# addresses)
host, port = self.bind_addr
try:
info = socket.getaddrinfo(
host,
port,
socket.AF_UNSPEC,
socket.SOCK_STREAM,
0,
socket.AI_PASSIVE,
)
except socket.gaierror:
sock_type = socket.AF_INET
bind_addr = self.bind_addr
if ':' in host:
sock_type = socket.AF_INET6
bind_addr = bind_addr + (0, 0)
info = [(sock_type, socket.SOCK_STREAM, 0, '', bind_addr)]
for res in info:
af, socktype, proto, _canonname, sa = res
try:
self.bind(af, socktype, proto)
break
except socket.error as serr:
msg = '%s -- (%s: %s)' % (msg, sa, serr)
if self.socket:
self.socket.close()
self.socket = None
if not self.socket:
raise socket.error(msg)
# Timeout so KeyboardInterrupt can be caught on Win32
self.socket.settimeout(1)
self.socket.listen(self.request_queue_size)
# must not be accessed once stop() has been called
self._connections = connections.ConnectionManager(self)
# Create worker threads
self.requests.start()
self.ready = True
self._start_time = time.time()
.

It's a raw socket and s would be a raw socket too. Naming is unfortunate, though. Later on, s, ssl_env = self.server.ssl_adapter.wrap(s) (line 300) re-assigns a wrapped socket to the same variable so we can't access it anymore.

But what if we did raw_client_socket = s right after line 288, preserving that socket pointer? (or something more elegant along the lines)

Then, we'd be able to use it right here, without any smuggling.

Am I wrong?

Copy link
Contributor Author

@julianz- julianz- Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prepare() creates the socket, calls HTTPServer.bind(), which in turn calls prepare_socket(), which calls sock = ssl_adapter.bind(sock). Because pyopenSSL currently wraps the socket with its bind method, the result is a wrapped socket of type pyopenSSL.SSLConnection at the end of prepare(). So when we arrive at _from_server_socket() in connections.py, the server_socket parameter is already a wrapped socket, not a raw socket. Do you know why the wrapping was put in the bind method in pyopenSSL? Seems that was a mistake.

I have tried moving the wrapping from bind() to wrap() and after applying a few other fixes elsewhere in pyopenSSL got everything to work. I am inclined to do a new PR with that change first before we complete this PR. That way, we can fix this architectural flaw first. What do you think?

Copy link
Contributor Author

@julianz- julianz- Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally, pyOpenSSL currently relies on this code in server.py to send the 400 Bad Request message for flagging the http over https error. It relies on unwrapping the socket.

def _handle_no_ssl(self, req):
        if not req or req.sent_headers:
            return
        # Unwrap wfile
        try:
            resp_sock = self.socket._sock
        except AttributeError:
            # self.socket is of OpenSSL.SSL.Connection type
            resp_sock = self.socket._socket
        self.wfile = StreamWriter(resp_sock, 'wb', self.wbufsize)
        msg = (
            'The client sent a plain HTTP request, but '
            'this server only speaks HTTPS on this port.'
        )
        req.simple_response('400 Bad Request', msg)
        self.linger = True

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I think #802 might be worth doing first?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure — some changes over there seem misplaced. Plus I'd feel more confident reviewing it after this refactoring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay sure.

Copy link
Contributor Author

@julianz- julianz- Dec 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured out why the old code didn't need the line conn.set_accept_state() that we now have in wrap(). In the ConnectionsManager in connections.py we have:

def _from_server_socket(self, server_socket):  # noqa: C901  # FIXME
   try:
       s, addr = server_socket.accept() 

For the old code, server_socket was an SSL.Connection. Calling accept() on an SSL.Connection listening socket returns a new SSL.Connection that is set to server mode. In the new code, server_socket is a raw socket that's listening for clients. Calling accept() spawns a socket that has accepted a connection so it becomes a connected TCP socket. However, the raw socket doesn't know it's going to be a server socket. TCP sockets are bidirectional - a connected TCP socket looks the same whether it initiated the connection or accepted it. SSL sockets are not though so when the raw socket becomes wrapped (further down this function) the SSL.Connection doesn't know what mode it's supposed to be in. That's why we need to call set_accept_state() when the socket is wrapped.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for documenting this!

@julianz- julianz- force-pushed the http_https_handling branch from cb21cb4 to d3529f1 Compare December 1, 2025 23:17
@webknjaz
Copy link
Member

webknjaz commented Dec 2, 2025

So there's two things left to address:

@julianz-
Copy link
Contributor Author

julianz- commented Dec 2, 2025

I fixed the doc fragment and played around with your spy code but wasn't sure it achieved the purpose you had in mind. See my note above about this.

@julianz- julianz- force-pushed the http_https_handling branch 5 times, most recently from c94c2e5 to 18ec74c Compare December 3, 2025 04:58
@julianz- julianz- force-pushed the http_https_handling branch 2 times, most recently from ee5ade4 to 43147ac Compare December 4, 2025 19:29
Added logic including a new method (_ensure_peer_speaks_https)
in the Adapter base class to identify when a client
attempts to speak unencrypted HTTP on an HTTPS port.

When a plaintext connection is detected, the server
now attempts to access the raw underlying TCP socket (if wrapped)
to send a "400 Bad Request" error response before any TLS failure.
This unifies the behavior of the 'builtin' and 'pyopenssl'
adapters for this specific error condition.
@julianz- julianz- force-pushed the http_https_handling branch from 43147ac to 720ce41 Compare December 5, 2025 04:38
Copy link
Member

@webknjaz webknjaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job!

@webknjaz webknjaz enabled auto-merge December 5, 2025 17:33
@webknjaz webknjaz merged commit 4b4bb7f into cherrypy:main Dec 5, 2025
76 checks passed
@github-project-automation github-project-automation bot moved this from 🧐 @webknjaz's review queue 📋 to 🌈 Done 🦄 in 📅 Procrastinating in public Dec 5, 2025
@julianz- julianz- deleted the http_https_handling branch December 6, 2025 03:30
@julianz-
Copy link
Contributor Author

julianz- commented Dec 6, 2025

Appreciate your help and guidance as always!

webknjaz added a commit to webknjaz/cpython that referenced this pull request Dec 10, 2025
It was discovered by @julianz- [[1]] that this works differently under
*NIX and on Windows. So this patch adds a test case to demonstrate the
inconsistency.

[1]: cherrypy/cheroot#800 (comment)
webknjaz added a commit to webknjaz/cpython that referenced this pull request Dec 10, 2025
It was discovered by @julianz- [[1]] that this works differently under
*NIX and on Windows. So this patch adds a test case to demonstrate the
inconsistency.

[1]: cherrypy/cheroot#800 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants