Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests: chunked request body #1307

Merged
merged 1 commit into from
Jun 24, 2024

Conversation

andrey-zelenkov
Copy link
Contributor

@andrey-zelenkov andrey-zelenkov commented Jun 7, 2024

Tests for the #1262.

Copy link
Contributor

@hongzhidao hongzhidao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Andrei,
I set the default value of the chunked_transform false.
And feel free to remove these lines.
@pytest.mark.skip('not yet')

@andrey-zelenkov andrey-zelenkov marked this pull request as ready for review June 13, 2024 15:32
@andrey-zelenkov
Copy link
Contributor Author

Hi Zhidao!

I set the default value of the chunked_transform false.

Why do we need chunked_transform option at all?

ASGI spec explicitly states how to handle chunked requests https://asgi.readthedocs.io/en/latest/specs/www.html:

Servers are responsible for handling inbound and outbound chunked transfer encodings. A request with a chunked encoded body should be automatically de-chunked by the server and presented to the application as plain body bytes; a response that is given to the server with no Content-Length may be chunked as the server sees fit.

and

Note that if the request is being sent using Transfer-Encoding: chunked, the server is responsible for handling this encoding. The http.request messages should contain just the decoded contents of each chunk.

Same for WSGI https://peps.python.org/pep-3333/#other-http-features:

WSGI servers must handle any supported inbound “hop-by-hop” headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable.

@ac000
Copy link
Member

ac000 commented Jun 13, 2024

Hi Zhidao!

I set the default value of the chunked_transform false.

Why do we need chunked_transform option at all?

It's an experimental feature and we wanted to gate it.

ASGI spec explicitly states how to handle chunked requests https://asgi.readthedocs.io/en/latest/specs/www.html:

Servers are responsible for handling inbound and outbound chunked transfer encodings. A request with a chunked encoded body should be automatically de-chunked by the server and presented to the application as plain body bytes; a response that is given to the server with no Content-Length may be chunked as the server sees fit.

and

Note that if the request is being sent using Transfer-Encoding: chunked, the server is responsible for handling this encoding. The http.request messages should contain just the decoded contents of each chunk.

Same for WSGI https://peps.python.org/pep-3333/#other-http-features:

WSGI servers must handle any supported inbound “hop-by-hop” headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable.

As it stands language modules won't see chunked data, Unit will buffer them up and convert the request to Content-Length

@andrey-zelenkov
Copy link
Contributor Author

As it stands language modules won't see chunked data, Unit will buffer them up and convert the request to Content-Length

Even when "chunked_transform": False?

@ac000
Copy link
Member

ac000 commented Jun 13, 2024

As it stands language modules won't see chunked data, Unit will buffer them up and convert the request to Content-Length

Even when "chunked_transform": False?

AIUI

If chunked_transform is false then Unit behaves as it does now and rejects chunked requests.

If it's true, Unit will buffer all the chunked data up and convert it to a Content-Length request.

In either case language modules will only see Content-Length requests as they do now.

@andrey-zelenkov
Copy link
Contributor Author

andrey-zelenkov commented Jun 13, 2024

If chunked_transform is false then Unit behaves as it does now and rejects chunked requests.

It doesn't work that way. In the current implementation of #1262, chunked requests are always accepted and served.

In either case language modules will only see Content-Length requests as they do now.

No, in case of "chunked_transform": False app will receive a request and will not see any Content-Length. At least in current version of patch.

After asking @hongzhidao about it, he explained that the main reason for having this option is the problem with the current implementation: the decision about de-chunking is made before routing and knowing where the request will be sent (proxy or application). Disabling de-chunking makes the proxy work correctly, while enabling de-chunking makes applications work correctly. So, by setting chunked_transform option, you are choosing what will be broken and what will work correctly. This explanation makes some sense to me now.

My current thought is to set chunked_transform to True by default. The only drawback I can see is de-chunking proxied requests and I would expect that most of the servers should be able to live with it (or even it will be possible to chunk it back later by implementing body modification using njs) and Unit does extra work which might be useless, whereas setting it to False leads to broken applications on the application server.

@ac000
Copy link
Member

ac000 commented Jun 14, 2024

If chunked_transform is false then Unit behaves as it does now and rejects chunked requests.

It doesn't work that way. In the current implementation of #1262, chunked requests are always accepted and served.

In either case language modules will only see Content-Length requests as they do now.

No, in case of "chunked_transform": False app will receive a request and will not see any Content-Length. At least in current version of patch.

@callahad @hongzhidao

Well I'm confused because we just had a meeting where it was agreed that language modules (for now anyway) will not see chunked requests.

Sending chunked requests to language modules will be completely pointless as they won't know what to do with it, for all I know they may even just crash...

@ac000
Copy link
Member

ac000 commented Jun 14, 2024

commit 93d8b3d1424387f2a5ca264b9b9653c9da21a9a7
Author: Zhidao HONG <[email protected]>
Date:   Tue Jun 11 17:59:00 2024 +0800

    http: Support chunked request bodies
    
    This is a temporary support for chunked request bodies by converting
    to Content-Length. This allows for processing of such requests until
    a more permanent solution is developed.
    
    A new configuration option "chunked_transform" has been added to enable
    this feature. The option can be set as follows:
      {
        "settings": {
          "chunked_transform": true
        }
      }
    By default, this option is set to false.
    
    Please note that this is an experimental implementation.

To me that means setting chunked_transform to true will allow chunked requests to be accepted, albeit by being converted to Content-Length requests.

The only sensible behaviour for defaulting it to false can be to retain the current behaviour, which I thought was whole point of having this setting...

@hongzhidao
Copy link
Contributor

The only sensible behaviour for defaulting it to false can be to retain the current behaviour, which I thought was whole point of having this setting...

Hmm, the false setting is also not clear to me.

If chunked_transform is false then Unit behaves as it does now and rejects chunked requests.
If it's true, Unit will buffer all the chunked data up and convert it to a Content-Length request.
In either case language modules will only see Content-Length requests as they do now.]

It doesn't work that way. In the current implementation of #1262, chunked requests are always accepted and served.

I changed the logic and will wait for a final decision from the team.
If it's false, it will be rejected as before, which returns 411 (NXT_HTTP_LENGTH_REQUIRED = 411).

@callahad
Copy link
Collaborator

For now, let's merge to master with chunked_transform defaulting to false.

That keeps the new behavior opt-in until we work out all of the kinks with proxy vs application.

@ac000
Copy link
Member

ac000 commented Jun 14, 2024

@callahad @hongzhidao

OK, so I think things are as I expected. But just to be clear.

When chunked_transform = false, Unit will behave as it does now and will reject chunked requests with a 411 Length Required status code?

@hongzhidao

I think that could be more clearly stated in the commit message for ("http: Support chunked request bodies")

Perhaps by extending the second last line to something like

By default, this option is set to false, which retains the current
behaviour of rejecting chunked requests with a '411 Length Required'
status code.

@andrey-zelenkov
Copy link
Contributor Author

Rebased, added tests for the chunked_transform option and last chunk.

% git range-diff e5dbfcf7...440837dc
 -:  -------- >  1:  e77a0c16 Tests: explicitly specify 'f' prefix to format string before printing
 -:  -------- >  2:  c9dced37 Tests: print unit.log on unsuccessful unmount
 -:  -------- >  3:  98983f3f Add a GitHub discussions badge to the README
 -:  -------- >  4:  a7e3686a Tools: improved error handling for unitc
 -:  -------- >  5:  d7ec30c4 ci: Limit when to run checks on pull-requests
 -:  -------- >  6:  04a24f61 http: fix use-of-uninitialized-value bug
 -:  -------- >  7:  965fc94e fuzzing: add fuzzing infrastructure in build system
 -:  -------- >  8:  a93d878e fuzzing: add fuzzing targets
 -:  -------- >  9:  665353dc fuzzing: add a fuzzing seed corpus and dictionary
 -:  -------- > 10:  5b65134c fuzzing: add a basic README
 -:  -------- > 11:  35a572c2 fuzzing: Add a .gitattributes file
 1:  e5dbfcf7 ! 12:  440837dc Tests: chunked request body
    @@ test/test_chunked.py (new)
     +client = ApplicationPython()
     +
     +
    -+def test_chunked():
    [email protected](autouse=True)
    ++def setup_method_fixture():
     +    client.load('mirror')
     +
    ++    assert 'success' in client.conf(
    ++        {"http": {"chunked_transform": True}}, 'settings'
    ++    )
    ++
    ++
    ++def test_chunked():
     +    def chunks(chunks=[]):
     +        body = ''
     +
    @@ test/test_chunked.py (new)
     +
     +
     +def test_chunked_pipeline():
    -+    client.load('mirror')
    -+
     +    sock = client.get(
     +        no_recv=True,
     +        headers={
    @@ test/test_chunked.py (new)
     +    assert len(re.findall('%', resp)) == 1
     +
     +
    [email protected]('not yet')
     +def test_chunked_max_body_size():
    -+    client.load('mirror')
    -+
     +    assert 'success' in client.conf(
    -+        {'http': {'max_body_size': 1024}}, 'settings'
    ++        {'max_body_size': 1024, 'chunked_transform': True}, 'settings/http'
     +    )
     +
     +    body = f'{2048:x}\r\n{"x" * 2048}\r\n0\r\n\r\n'
    @@ test/test_chunked.py (new)
     +    )
     +
     +
    [email protected]('not yet')
    -+def test_chunked_invalid():
    -+    client.load('mirror')
    ++def test_chunked_after_last():
    ++    resp = client.get(
    ++        headers={
    ++            'Host': 'localhost',
    ++            'Connection': 'close',
    ++            'Transfer-Encoding': 'chunked',
    ++        },
    ++        body='1\r\na\r\n0\r\n\r\n1\r\nb\r\n0\r\n\r\n',
    ++    )
     +
    ++    assert resp['status'] == 200
    ++    assert resp['headers']['Content-Length'] == '1'
    ++    assert resp['body'] == 'a'
    ++
    ++
    ++def test_chunked_transform():
    ++    assert 'success' in client.conf(
    ++        {"http": {"chunked_transform": False}}, 'settings'
    ++    )
    ++
    ++    assert (
    ++        client.get(
    ++            headers={
    ++                'Host': 'localhost',
    ++                'Connection': 'close',
    ++                'Transfer-Encoding': 'chunked',
    ++            },
    ++            body='0\r\n\r\n',
    ++        )['status']
    ++        == 411
    ++    )
    ++
    ++
    ++def test_chunked_invalid():
     +    # invalid chunkes
     +
     +    def check_body(body):

Copy link
Contributor

@hongzhidao hongzhidao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hongzhidao hongzhidao assigned hongzhidao and unassigned hongzhidao Jun 21, 2024
@andrey-zelenkov andrey-zelenkov merged commit c7e921c into nginx:master Jun 24, 2024
19 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants