[Enhancement] Enhance ML inference processor for input processing #3193

Open · ylwu-amzn opened this issue Oct 30, 2024 · 6 comments
Labels: enhancement (New feature or request)

ylwu-amzn (Collaborator) commented Oct 30, 2024

What's the problem?

Currently, when using ML inference processors, a user chooses a model and then sets the input_map, loading the input map’s keys from the model interface. This design works for simple model interfaces where the documents already match the exact format the model interface expects.

However, if an input parameter of the model interface cannot be resolved to a document field name via a JSON path, users need to change the model interface to match the document fields. Constructing a prompt is one of the more complex use cases: a prompt usually mixes static instructions with content from the documents. Configuring a different model interface for each prompt engineering use case is not reusable.

Sample Use Case - Bedrock Claude V3:

0. Index

Using a music index for the following use case. Every document has two fields, “persona” and “query”:


PUT /music/_doc/1
{
  "persona": "financial analyst",
  "query": "who is Taylor Swift"
}

PUT /music/_doc/2
{
  "persona": "local farmer",
  "query": "who is Taylor Swift"
}

PUT /music/_doc/3
{
  "persona": "financial analyst",
  "query": "Justin Bieber"
}

1. Model:

Using the Bedrock Claude V3 model as an example. This is the model API; a messages field is required:

POST /model/us.anthropic.claude-3-5-sonnet-20240620-v1:0/invoke HTTP/1.1
{
  "anthropic_version": "bedrock-2023-05-31",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Hello world"
        }
      ]
    }
  ]
}

The system prompt is optional; the system field can be omitted:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=2048,
    system="You are a seasoned data scientist at a Fortune 500 company.", # <-- role prompt
    messages=[
        {"role": "user", "content": "Analyze this dataset for anomalies: <dataset>{{DATASET}}</dataset>"}
    ]
)

print(response.content)

2. Blueprint:

In OpenSearch, we have the connector blueprint below (currently it’s missing the system field); the ideal blueprint would be:

POST /_plugins/_ml/connectors/_create
{
  "name": "Claude V3",
  "description": "Connector for Claude V3",
  "version": 1,
  "protocol": "aws_sigv4",
  "parameters": {
        "region": "us-west-2",
        "service_name": "bedrock",
        "auth": "Sig_V4",
        "response_filter": "$.content[0].text",
        "max_tokens_to_sample": "8000",
        "anthropic_version": "bedrock-2023-05-31",
        "model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "system":""
  },
  "credential": {
    "access_key": "",
    "secret_key": "",
    "session_token": ""
  },
  "actions": [
    {
     "action_type": "PREDICT",
      "method": "POST",
      "url": "https://bedrock-runtime.us-west-2.amazonaws.com/model/anthropic.claude-instant-v1/invoke",
      "headers": {
        "x-amz-content-sha256": "required",
        "content-type": "application/json"
      },
      "request_body": "{\"messages\":[{\"role\":\"${parameters.users}\",\"content\":[{\"type\":\"text\",\"text\":\"${parameters.inputs}\"}]}],\"anthropic_version\":\"${parameters.anthropic_version}\",\"max_tokens\":${parameters.max_tokens_to_sample},\"system\":\"${parameters.system:-null}\"}"
    }
  ]
}

POST /_plugins/_ml/models/_register
{
  "name": "Claude V3 model",
  "version": "1.0.1",
  "function_name": "remote",
  "description": "Claude V3",
  "connector_id": "Pali-ZIBAs32TwoKZ1th"
} 
POST /_plugins/_ml/models/Qqli-ZIBAs32TwoKh1tO/_deploy
POST /_plugins/_ml/models/Qqli-ZIBAs32TwoKh1tO/_predict  
{
  "parameters": {
    "inputs": "How many moons does Jupiter have?",
    "system": "You are an ${parameters.role}, tell me about ${parameters.inputs}, Ensure that you can generate a short answer less than 10 words.",
    "role": "assistant"
  }
}

Sample predict with a prompt requesting an answer within 60 words:

POST /_plugins/_ml/models/Qqli-ZIBAs32TwoKh1tO/_predict
{
  "parameters": {
    "inputs": "How many moons does Jupiter have?",
    "role": "assistant",
    "system": "You are an ${parameters.role}, tell me about ${parameters.inputs}, Ensure that you can generate a short answer less than 90 words."
  }
}

## Response

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": "Jupiter has 79 known moons. The four largest moons of Jupiter that were discovered by Galileo Galilei in 1610 are Io, Europa, Ganymede, and Callisto. Io is the innermost of the four and volcanically active due to tidal heating from gravitational tug-of-war with Jupiter and the other large moons. Europa's icy surface likely hides an ocean of liquid water beneath. Ganymede is the largest moon in the Solar System. Callisto is also thought to harbor a subsurface ocean. Many of Jupiter's other moons are much smaller and more irregularly shaped. Several were discovered during the past few decades using ground- and space-based telescopes."
          }
        }
      ],
      "status_code": 200
    }
  ]
}

Sample predict with a prompt requesting an answer within 10 words:

POST /_plugins/_ml/models/cqkn-ZIBAs32TwoK11ql/_predict
{
  "parameters": {
    "inputs": "How many moons does Jupiter have?",
    "system": "You are an ${parameters.role}, tell me about ${parameters.inputs}, Ensure that you can generate a short answer less than 10 words.",
    "role": "assistant"
  }
}

## Response

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": "79 moons."
          }
        }
      ],
      "status_code": 200
    }
  ]
}

3. Model Interface for Bedrock:

We have a predefined model interface for Bedrock that requires a parameters.inputs field:

GET /_plugins/_ml/models/Qqli-ZIBAs32TwoKh1tO


{
  "name": "Claude V3 model",
  "model_group_id": "b6mh-JIBAs32TwoKZViT",
  "algorithm": "REMOTE",
  "model_version": "8",
  "description": "Claude V3",
  "model_state": "DEPLOYED",
  "created_time": 1730760836942,
  "last_updated_time": 1730760846556,
  "last_deployed_time": 1730760846556,
  "auto_redeploy_retry_times": 0,
  "planning_worker_node_count": 4,
  "current_worker_node_count": 4,
  "planning_worker_nodes": [
    "cBnpgDCdSd-qvzs6U8LT0g",
    "fqaeDeUxQwuE67FM-LpPNQ",
    "YTE9jdiwTfaMhPPUvt5D5A",
    "I1c8plZITK6EVQ5Ah3iekQ"
  ],
  "deploy_to_all_nodes": true,
  "is_hidden": false,
  "connector_id": "Pali-ZIBAs32TwoKZ1th",
  "interface": {
    "output": """{
    "type": "object",
    "properties": {
        "inference_results": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "output": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {
                                    "type": "string"
                                },
                                "dataAsMap": {
                                    "type": "object",
                                    "properties": {
                                        "response": {
                                            "type": "string"
                                        }
                                    },
                                    "required": [
                                        "response"
                                    ]
                                }
                            },
                            "required": [
                                "name",
                                "dataAsMap"
                            ]
                        }
                    },
                    "status_code": {
                        "type": "integer"
                    }
                },
                "required": [
                    "output",
                    "status_code"
                ]
            }
        }
    },
    "required": [
        "inference_results"
    ]
}""",
    "input": """{
    "type": "object",
    "properties": {
        "parameters": {
            "type": "object",
            "properties": {
                "inputs": {
                    "type": "string"
                }
            },
            "required": [
                "inputs"
            ]
        }
    },
    "required": [
        "parameters"
    ]
}"""
  }
}

4. Prompt

Now, when configuring the prompt, we put it in the model_config field:

{"model_config":{"system": "You are an ${parameters.role}, tell me about ${parameters.inputs}, Ensure tha you can generate a short answer less than 10 words."}}

But when we load the model interface, it only requires the inputs field; the role field and the system prompt field are optional model input fields.

5. Input_map

With the current design, we can load the model interface keys as input_map keys, but we also need to map the role field, which is not in the model interface:

"input_map": [
          {
            "role": "persona",
            "inputs": "query"
          }
        ]
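
For illustration, with this mapping the processor would effectively issue a predict call like the following for document 1 (a sketch based on the sample index and the model_config prompt from section 4; not captured output):

POST /_plugins/_ml/models/Qqli-ZIBAs32TwoKh1tO/_predict
{
  "parameters": {
    "inputs": "who is Taylor Swift",
    "role": "financial analyst",
    "system": "You are an ${parameters.role}, tell me about ${parameters.inputs}, Ensure that you can generate a short answer less than 10 words."
  }
}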

Here is the full config of the search pipeline:

PUT /music/_doc/1
{
  "persona": "financial analyst",
  "query": "who is Taylor Swift"
}

PUT /music/_doc/2
{
  "persona": "local farmer",
  "query": "who is Taylor Swift"
}

PUT /music/_doc/3
{
  "persona": "financial analyst",
  "query": "Justin Bieber"
}

PUT /_search/pipeline/my_pipeline_claude3
{
  "response_processors": [
    {
      "ml_inference": {
        "tag": "ml_inference",
        "description": "This processor is going to run claude3",
        "model_id": "Qqli-ZIBAs32TwoKh1tO",
        "function_name": "REMOTE",
        "input_map": [
          {
            "role": "persona",
            "inputs": "query"
          }
        ],
        "output_map": [
          {
            "claude_response": "response"
          }
        ],
        "model_config": {"system":"You are an ${parameters.role}, tell me about ${parameters.inputs}, Ensure tha you can generate a short answer less than 50 words."
        },
        "one_to_one":true,
        "ignore_missing": false,
        "ignore_failure": false
      }
    }
  ]
}
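
For reference, a search request that exercises this pipeline could look like the following (a sketch; with one_to_one set to true, the processor appends a claude_response field to each returned document):

GET /music/_search?search_pipeline=my_pipeline_claude3
{
  "query": {
    "match_all": {}
  }
}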

Pain point:

The pain point is formatting document fields to match the model interface.

If we want the model input fields to match the document fields, users need to modify the model interface.

Proposals:

Proposal 1: Allow string substitutions in the input_map field
Remove the prompt setting from model_config in the ML inference processor, and configure the prompt (system) field in the input_map, similar to the following:

"input_map": [
          {
            "system": "You are an ${persona}, tell me about ${query}, 
                       Ensure tha you can generate a short answer less than 10 words.",
            "inputs": "query"            
          }
        ]
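
For illustration, a rough sketch of the full search pipeline config under this proposal (no prompt in model_config; the substitution syntax is illustrative, not a confirmed design):

PUT /_search/pipeline/my_pipeline_claude3_proposal1
{
  "response_processors": [
    {
      "ml_inference": {
        "tag": "ml_inference",
        "description": "This processor is going to run claude3",
        "model_id": "Qqli-ZIBAs32TwoKh1tO",
        "function_name": "REMOTE",
        "input_map": [
          {
            "system": "You are an ${persona}, tell me about ${query}, Ensure that you can generate a short answer less than 10 words.",
            "inputs": "query"
          }
        ],
        "output_map": [
          {
            "claude_response": "response"
          }
        ],
        "one_to_one": true,
        "ignore_missing": false,
        "ignore_failure": false
      }
    }
  ]
}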

For comparison, this is the predict API command:

POST /_plugins/_ml/models/Qqli-ZIBAs32TwoKh1tO/_predict
{
  "parameters": {
    "inputs": "How many moons does Jupiter have?",
    "system": "You are an ${parameters.role}, tell me about ${parameters.inputs}, Ensure that you can generate a short answer less than 10 words.",
    "role": "assistant"
  }
}

Pro:

  1. Directly maps document fields into the system field and formats the prompt with document field values.

Cons:

  1. Optional fields do not show up in the model interface:
    the model interface only requires an inputs field, and the system field is optional, so we can’t preload the system field as an input map key.

  2. Regex pattern limitation:
    when we apply string substitution in the input_map, the question text must not happen to contain the same pattern. For example, suppose I ask "can you explain this code to me: String templateString = "The ${animal} jumped over the ${target}."; animal = "quick brown fox"; target = "lazy dog"", but the document also has a field target="giant lion".

    In this case the string in the question is interpreted by the input map as
    "The quick brown fox jumped over the giant lion",
    but in the question the string was meant as
    "The quick brown fox jumped over the lazy dog".

    Any ${} pattern in the string will always be substituted whenever a document field happens to have the same name.

Proposal 2: Transform the model input format by applying a setting in the model_input field

In the ML inference processor there is a model_input config parameter that can help format the model inputs. There is a rerank example in the documentation that uses model_input. model_input can bypass the request_body format in the connector setting.

In this example, the model_input field can be constructed like this:

{\"messages\":
[{\"role\":\"${parameters.persona}\",
\"content\":[{\"type\":\"text\",\"text\":\"${parameters.inputs}\"}]}],
\"system\":\"You are an ${parameters.persona}, 
            tell me about ${parameters.inputs}, 
            Ensure that you can generate a short answer less than 50 words.\"}

The input map then maps document fields to the placeholders:

"input_map": [
          {
            "persona": "persona",
            "inputs": "query"
          }
        ]
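
Putting the pieces together, a rough sketch of the full processor config under this proposal (the model_input value is the escaped form of the template above; the placeholder syntax follows that example and is not a confirmed final design):

PUT /_search/pipeline/my_pipeline_claude3_proposal2
{
  "response_processors": [
    {
      "ml_inference": {
        "model_id": "Qqli-ZIBAs32TwoKh1tO",
        "function_name": "REMOTE",
        "input_map": [
          {
            "persona": "persona",
            "inputs": "query"
          }
        ],
        "model_input": "{\"messages\":[{\"role\":\"${parameters.persona}\",\"content\":[{\"type\":\"text\",\"text\":\"${parameters.inputs}\"}]}],\"system\":\"You are an ${parameters.persona}, tell me about ${parameters.inputs}, Ensure that you can generate a short answer less than 50 words.\"}",
        "output_map": [
          {
            "claude_response": "response"
          }
        ],
        "one_to_one": true
      }
    }
  ]
}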

Pros:

  1. The model_input field is a powerful way to format model input, not just for prompt engineering cases.
  2. We keep input_map, model_input, and model_config separate, so we don’t need a regex pattern that mixes string substitution into the mapping. The mapping only happens in input_map and output_map.
  3. The model_input field is already used to format local model inputs; it was added so local models can be formatted with fewer pre-processing functions.

Cons:

  TBD
ylwu-amzn added the enhancement and untriaged labels on Oct 30, 2024
dylan-tong-aws commented:

The highest-level requirement is that a user should never be forced/constrained into creating a custom model interface to support a specific search pipeline or use case. The user should be able to use the natural interface for a model (e.g., Bedrock Claude V3's natural interface is the messaging or converse API) for all possible pipelines and use cases. This allows the user to share the model across any use case or pipeline. If you don't do this, then there is a scalability issue from an operational and usability perspective (and likely performance as well), because for each model there will be a need to manage N models with unique interfaces for every N unique pipelines. A proper design should only require a single model with a common (natural) interface to support N unique pipelines. The current design does not satisfy these requirements. It's evident in RAG use cases. This is due to problems with the coupling of functionality and the order of operations in how fields are mapped and processed into an LLM prompt.

A simple design that decouples preprocessing and post-processing logic from the core processor functionality should suffice. The processing flow can simply be: (i) optional preprocess: perform a data transform on the input data; (ii) map the transformed output to the model and execute inference; (iii) optional post-process: perform a data transform on the inference output data. This simple design will guarantee the requirements are satisfied.

ylwu-amzn (Collaborator, Author) commented Oct 31, 2024

Thanks @dylan-tong-aws. I had an offline discussion with Tyler yesterday. He has another proposal which can also address the concern that "a user should never be forced/constrained to having to create a custom model interface to support a specific search pipeline or use case":

Tyler prefers something like this in the input map:

"prompt": "Human: you are a helpful assistant. You have such context $.field1.content, please summarize and answer my question $.query.question"

Rather than having something like this in the input map:

"content": "$.field1.content"

And configuring the prompt in model config:

"prompt": "Human: you are a helpful assistant. You have such context ${parameters.content}, please summarize and answer my question $.query.question"

Correct me if wrong, @ohltyler

ohltyler (Member) commented Oct 31, 2024

Regarding Option 3, enhancing the current ML inference processor input map parsing logic:

> The highest-level requirement is that a user should never be forced/constrained to having to create a custom model interface to support a specific search pipeline or use case.

This implies a flexible and consistent model interface, which implies flexible and consistent keys in the input map / output map configurations, regardless of the use case. This is because the keys are the model interface inputs/outputs. See the ML processor documentation.

LLM inputs typically include a freeform text input as part of the API (see the Anthropic messages API's content field). For prompt building, the prompt would be passed via this freeform text input. Therefore, the ideal model interface includes a text input (note the current preset connector/model is set up this way as well, with an inputs field). The key of the input map should be this freeform text input, and the value should be the freeform input itself, so for prompt-building use cases the user should be able to build out the freeform prompt as the value in this input map.
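
For example, a rough sketch of such an input_map against the sample music index (the JSON-path/placeholder syntax is illustrative, not a confirmed design):

"input_map": [
  {
    "inputs": "You are a $.persona. Tell me about $.query. Ensure that you can generate a short answer less than 10 words."
  }
]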

@ylwu-amzn I agree with Option 3 as you laid it out, as one option. Options 1/2 also seem reasonable. I am indifferent on a solution and suggest going with the one that makes the most sense from an implementation perspective, given the limitations of the ML processors, etc. If it is not reasonable to expand the ML processor functionality, then it is simply not reasonable, and one of the other solutions should be pursued.

mingshl (Collaborator) commented Oct 31, 2024

> (Quoting ylwu-amzn's comment above describing Tyler's preferred input_map approach.)

@ohltyler, check out the model_input field in the ML inference processor configs:

model_input (String; optional for externally hosted models, required for local models): A template that defines the input field format expected by the model. Each local model type might use a different set of inputs. For externally hosted models, the default is { "parameters": ${ml_inference.parameters} }.

We can load the prompt field in model_input and skip input mapping if you want; that serves the same purpose.

dylan-tong-aws commented:

@mingshl, @ylwu-amzn, I'm fine with whatever approach or syntax that @ohltyler proposes. Tyler represents an end user, so if he approves of the usability, I'm good with it.

I only expect the solution to address the aforementioned requirement.

mingshl removed the untriaged label on Nov 5, 2024
mingshl moved this to In Progress in ml-commons projects on Nov 5, 2024
mingshl self-assigned this on Nov 5, 2024
mingshl (Collaborator) commented Nov 5, 2024

Updated the issue description with sample use cases to explain the workflow of using the ML inference search response processor, and added the two proposals.

Projects: ml-commons projects · Status: In Progress
Development: No branches or pull requests
4 participants