Insert event pages more efficiently #54

Open
danielballan opened this issue May 30, 2024 · 0 comments

Currently, if suitcase.mongo_normalized.Serializer receives an event_page, it "unpacks" it into N event documents and inserts them separately.

    def event_page(self, doc):
        # Unpack an EventPage into Events and do the actual insert inside
        # the `event` method. (This is the opposite of what DocumentRouter
        # does by default.)
        event_method = self.event  # Avoid attribute lookup in hot loop.
        filled_events = []
        for event_doc in event_model.unpack_event_page(doc):
            filled_events.append(event_method(event_doc))

This can incur a large amount of latency with Mongo, because every row becomes its own insert. On the floor we've seen that one event takes Tiled ~60 ms to process, but one event_page of about a dozen rows takes almost 1000 ms.

It should be possible to do the update as a single MongoDB command that adds N new documents to the event collection. This might be as simple as a bulk insert operation. That is: still "unpack" in Python but insert the resulting event documents in bulk.
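A minimal sketch of the "unpack in Python, insert in bulk" idea, not the Serializer's actual code. The function names and the simplified unpacking are hypothetical stand-ins (the real code would use event_model.unpack_event_page and keep whatever per-event processing `self.event` does). The only thing assumed of `event_collection` is an `insert_many` method, as on a pymongo Collection.

```python
def unpack_event_page(event_page):
    """Yield one event dict per row of an event_page.

    Simplified, hypothetical stand-in for event_model.unpack_event_page.
    """
    for i in range(len(event_page["seq_num"])):
        yield {
            "descriptor": event_page["descriptor"],
            "uid": event_page["uid"][i],
            "time": event_page["time"][i],
            "seq_num": event_page["seq_num"][i],
            "data": {k: v[i] for k, v in event_page["data"].items()},
            "timestamps": {
                k: v[i] for k, v in event_page["timestamps"].items()
            },
        }


def insert_event_page(event_collection, event_page):
    """Unpack in Python, then send all rows in one insert_many call."""
    events = list(unpack_event_page(event_page))
    if events:  # Guard: pymongo's insert_many rejects an empty list.
        event_collection.insert_many(events)
    return events
```

With this shape an event_page of N rows costs one call to the collection rather than N, which is the quantity the benchmark would need to confirm matters.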

It might be possible to get even fancier and do the unpacking server-side through some kind of aggregation, but I would start by benchmarking the simple thing. My guess is that MongoDB latency >> Python runtime cost of unpacking.
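One way the benchmark of "the simple thing" might look, as a rough sketch. `time_insert_strategies` is a hypothetical helper; `collection` is assumed to expose `insert_one` and `insert_many` like a pymongo Collection, and should be a scratch collection rather than production data. Each pass gets fresh copies with new uids, on the assumption that the collection may enforce uniqueness on `uid` and because pymongo adds an `_id` key to the dicts it inserts.

```python
import time
import uuid


def fresh_copies(events):
    """Copy each event with a new uid so repeated passes do not collide."""
    return [dict(event, uid=str(uuid.uuid4())) for event in events]


def time_insert_strategies(collection, events):
    """Return (seconds_one_by_one, seconds_bulk) for inserting `events`."""
    one_by_one = fresh_copies(events)
    bulk = fresh_copies(events)

    # Pass 1: one insert per event, as the unpacking loop does today.
    start = time.perf_counter()
    for event in one_by_one:
        collection.insert_one(event)
    seconds_one_by_one = time.perf_counter() - start

    # Pass 2: the same number of events in a single bulk insert.
    start = time.perf_counter()
    collection.insert_many(bulk)
    seconds_bulk = time.perf_counter() - start

    return seconds_one_by_one, seconds_bulk
```

Running this several times against the real deployment, and comparing both numbers with the Python-only cost of unpacking, would show whether the server-side approach is worth pursuing.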
