-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] feat: executor to gcp custom container #6136
base: master
Are you sure you want to change the base?
Conversation
raise HTTPException(status_code=499, detail=status.description) | ||
else: | ||
return {"predictions": resp.docs} | ||
return output_model(predictions=resp.docs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the first return is not good
client.containers.prune() | ||
|
||
|
||
class chdir(AbstractContextManager): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this? I do not like chdir
behavior in tests
|
||
|
||
def test_provider_gcp_pod_inference(): | ||
with chdir(os.path.join(os.path.dirname(__file__), 'SampleExecutor')): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not use chdir
, pass the proper path
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #6136 +/- ##
===========================================
- Coverage 75.06% 47.90% -27.17%
===========================================
Files 152 151 -1
Lines 14061 14133 +72
===========================================
- Hits 10555 6770 -3785
- Misses 3506 7363 +3857
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
from jina.serve.runtimes.gateway.models import _to_camel_case | ||
|
||
if not docarray_v2: | ||
logger.warning('Only docarray v2 is supported with GCP. ') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this raise an Exception at some point? it should
allow_methods=['*'], | ||
allow_headers=['*'], | ||
) | ||
logger.warning('CORS is enabled. This service is accessible from any website!') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why warning?
return await process(input_model(data=[])) | ||
else: | ||
raise HTTPException( | ||
status_code=400, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use constants and not magic numbers.
if status.code == jina_pb2.StatusProto.ERROR: | ||
raise HTTPException(status_code=499, detail=status.description) | ||
else: | ||
return VertexAIResponse(predictions=req.data.docs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why I got this error when running pytest -v -s -x test_gcp.py -k test_provider_gcp_deployment_inference
, looks like it's failing on some docarray validation
File "pydantic/main.py", line 449, in pydantic.main.BaseModel.dict
File "pydantic/main.py", line 868, in _iter
File "pydantic/main.py", line 794, in pydantic.main.BaseModel._get_value
File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 135, in __init__
super().__init__(docs)
File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 163, in _validate_docs
yield self._validate_one_doc(doc)
File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 170, in _validate_one_doc
raise ValueError(f'{doc} is not a {self.doc_type}')
ValueError: {'id': '5d814334ba65cd66b356f363262dd2ad', 'text': 'hello world', 'embeddings': NdArray([[0.72778929, 0.23900942, 0.03286786, 0.95298338, 0.77614308,
0.03805365, 0.52810364, 0.70904448, 0.03221386, 0.40657516,
0.05665328, 0.73416605, 0.04400287, 0.79719694, 0.42839442,
0.43580189, 0.07334091, 0.60751732, 0.10342848, 0.51660762,
0.06586024, 0.88129511, 0.42874586, 0.88633873, 0.31858046,
0.71453457, 0.31093969, 0.92729799, 0.25257687, 0.92740828,
0.61681301, 0.51788409, 0.00997884, 0.21210052, 0.91713272,
0.37283237, 0.53357638, 0.77158797, 0.96699236, 0.13919022,
0.92879329, 0.78211408, 0.65285461, 0.89399441, 0.42854324,
0.94649221, 0.1966403 , 0.39643899, 0.42736609, 0.73116549,
0.88923411, 0.9894739 , 0.9124282 , 0.83614148, 0.27662465,
0.69740887, 0.34939623, 0.07094385, 0.38615401, 0.04148779,
0.0133029 , 0.68877819, 0.60318039, 0.64945002]])} is not a <class 'executor.EmbeddingResponseModel'>
Goals
Similar to #6046, we need to use
jina
to serve as a fastAPI for GCP.Bunch of setup codes are copied, as Sagemaker and GCP requirements are mostly the same, except that GCP requires API request and response in certain format:
Request:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#request_requirements
Response:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#response_requirements
TODO