[BUG] rename_keys processor: json pointers with escaped syntax fail to validate #5121

Open
joelmarty opened this issue Oct 28, 2024 · 3 comments
Labels
bug Something isn't working follow up question Further information is requested

Comments

@joelmarty
Contributor

joelmarty commented Oct 28, 2024

Describe the bug
The escaped syntax for JSON pointers defines how to build JSON pointers for fields whose names include special characters.

However, the isValidKey() method in JacksonEventKey only checks the basic character set, so keys defined with the escaped syntax are rejected.

To Reproduce
Steps to reproduce the behavior:

  1. Create a pipeline with a rename_keys processor whose to_key uses the escaped syntax:
my-file-pipeline:
  source:
    file:
      path: run/data/events.jsonl
      record_type: event
      format: json
  sink:
    - file:
        path: "run/data/result.jsonl"
  processor:
    - rename_keys:
        entries:
          - from_key: host
            to_key: '"cs(host)"'
  2. Run data-prepper
  3. data-prepper fails to start with the error:

2024-10-28T16:35:39,367 [main] ERROR org.opensearch.dataprepper.core.validation.LoggingPluginErrorsHandler - 1. rp-pipeline-file.processor.rename_keys: caused by: Parameter "entries.null.to_key" for plugin "rename_keys" is invalid: key "cs(host)" must contain only alphanumeric chars with .-_@/ and must follow JsonPointer (ie. 'field/to/key')
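
The rejection can be illustrated outside Data Prepper with a minimal sketch. This is not the actual JacksonEventKey source: the class name KeyCharsetCheckSketch is hypothetical, and the allowed character set is only inferred from the error message above (alphanumerics plus .-_@/).

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the check described in this issue.
// NOT the real JacksonEventKey code; the character set is an assumption
// taken from the error message text.
final class KeyCharsetCheckSketch {
    // Alphanumerics plus . - _ @ / and nothing else.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9.\\-_@/]+");

    // Returns true only when every character of the key is in the allowed set.
    static boolean isValidKey(final String key) {
        return key != null && ALLOWED.matcher(key).matches();
    }
}
```

Under these assumptions, isValidKey("field/to/key") returns true, while both isValidKey("cs(host)") and the quoted form isValidKey("\"cs(host)\"") return false, because neither the parentheses nor the double quotes are in the allowed set.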

Expected behavior
The to_key argument "cs(host)" should be accepted as it conforms to the documented syntax.

Screenshots
N/A

Environment (please complete the following information):

  • OS: macOS
  • Version 14.5

Additional context
N/A

@dlvenable
Member

@joelmarty, do you want to produce key names with parentheses in them?

@dlvenable added the question and follow up labels and removed the untriaged label on Oct 29, 2024
@dlvenable
Member

Perhaps we can make use of escape sequences to allow parentheses. Right now, our validation only looks at the characters themselves and does not allow them to be escaped.

We have some related work in #5111.
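
One possible shape for such a check, as a sketch only: accept either a plain key or a double-quoted key like the one in the reproduction config, and allow a backslash to escape any character inside the quotes. The class EscapedKeySketch, the backslash convention, and the plain character set are all assumptions made for illustration; none of them is taken from Data Prepper's code or from #5111.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of escape-aware key validation; not Data Prepper code.
final class EscapedKeySketch {
    // Plain form: character set assumed from the error message in this issue.
    private static final Pattern PLAIN = Pattern.compile("[A-Za-z0-9.\\-_@/]+");
    // Quoted form (assumed): "..." containing any characters except an
    // unescaped double quote; a backslash escapes the character after it.
    private static final Pattern QUOTED = Pattern.compile("\"(?:[^\"\\\\]|\\\\.)+\"");

    static boolean isValidKey(final String key) {
        if (key == null) {
            return false;
        }
        return PLAIN.matcher(key).matches() || QUOTED.matcher(key).matches();
    }

    // Strips the surrounding quotes and resolves backslash escapes.
    // Keys that are not in the quoted form are returned unchanged.
    static String unquote(final String key) {
        if (key == null || !QUOTED.matcher(key).matches()) {
            return key;
        }
        final String inner = key.substring(1, key.length() - 1);
        final StringBuilder out = new StringBuilder(inner.length());
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            if (c == '\\') {
                // The QUOTED pattern guarantees a character follows each backslash.
                i++;
                c = inner.charAt(i);
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

With this sketch, the quoted key "cs(host)" passes validation and unquote yields cs(host) as the field name, while the unquoted cs(host) is still rejected, so existing plain keys keep their current behavior.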

@joelmarty
Contributor Author

@dlvenable yes, I am trying to produce field names compatible with the W3C Extended Log File Format, which uses the form prefix(header) to designate headers sent in the request or the response. For instance, cs(user-agent) is the field for the user-agent header sent in the request.
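
To make the prefix(header) naming concrete, here is a small sketch that builds and parses such field names. The class ElfHeaderFieldSketch and its pattern are hypothetical; in particular, the assumption that a prefix is a run of lowercase letters and that a header contains no parentheses is a simplification for illustration, not a statement of the W3C grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for prefix(header) field names such as cs(user-agent).
final class ElfHeaderFieldSketch {
    // Assumed shape: lowercase prefix, then a header name in parentheses.
    private static final Pattern FIELD = Pattern.compile("([a-z]+)\\(([^()]+)\\)");

    // Builds a field name, e.g. format("cs", "user-agent") gives "cs(user-agent)".
    static String format(final String prefix, final String header) {
        return prefix + "(" + header + ")";
    }

    // Returns {prefix, header}, or null when the input is not in prefix(header) form.
    static String[] parse(final String field) {
        final Matcher m = FIELD.matcher(field);
        return m.matches() ? new String[] {m.group(1), m.group(2)} : null;
    }
}
```

Every name produced by format contains parentheses, which is why each of these fields currently fails the to_key validation.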
