Document deletion/update flow automation #54

Open
Foco22 opened this issue Dec 11, 2023 · 3 comments

@Foco22

Foco22 commented Dec 11, 2023

Hi,

Please add an automation for when a file is deleted or modified. It should delete the records for that file from both indexers, as well as the file's chunks in the storage account, so that the solution doesn't give answers based on deleted or outdated data.

Index and chunks:
[Screenshots: "index", "MicrosoftTeams-image (8)"]

Best Regards

@framigni

I second this request.
It's an essential requirement that the documentation is kept dynamically updated and that the chat answers are not "polluted" by obsolete information.
Destroying and re-creating the Index and Indexer on every update or deletion is not practical.
My understanding is that this is a limitation of the Azure Search service, so it is probably a bigger issue.
Best Regards

@gbecerra1982
Collaborator

We are already working on this feature; it will be available soon.

@framigni

@gbecerra1982 @Foco22
I think there is actually a solution already, and it works for me:
First, enable soft delete for blobs (as described in https://learn.microsoft.com/en-us/azure/storage/blobs/soft-delete-blob-enable?tabs=azure-portal) on the Storage Account that holds your document container.
Then, configure native soft delete (as described in https://learn.microsoft.com/en-us/azure/search/search-howto-index-changed-deleted-blobs?tabs=portal) in the Search service, under the Data Source settings.
After that, once a document has been deleted from the document container, all knowledge of that document is removed as well at the next Indexer run (and document_chunking execution).
I guess the only thing left now would be to embed these 2 settings into the deployment package/script.
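
For anyone wanting to script the second setting, here is a minimal sketch of the data source definition with native blob soft delete turned on. It only builds the JSON body; the data source name, container name, and connection string below are hypothetical placeholders, and the helper `build_blob_data_source` is illustrative, not something from this repo. How the body is sent (REST call, SDK, Bicep) and which API version supports the policy should be checked against the linked Search documentation.

```python
import json

# Deletion detection policy type used by Azure AI Search for native blob soft delete.
NATIVE_SOFT_DELETE_POLICY = (
    "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"
)


def build_blob_data_source(name: str, container: str, connection_string: str) -> dict:
    """Build a blob data source definition with native soft delete detection.

    All argument values are supplied by the caller; nothing here talks to Azure.
    """
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
        # Tells the indexer to treat soft-deleted blobs as deletions.
        "dataDeletionDetectionPolicy": {"@odata.type": NATIVE_SOFT_DELETE_POLICY},
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    body = build_blob_data_source(
        name="documents-datasource",
        container="documents",
        connection_string="<storage-connection-string>",
    )
    print(json.dumps(body, indent=2))
```

This covers only the Search side; blob soft delete still has to be enabled on the Storage Account itself, as in the first step.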

@placerda placerda transferred this issue from Azure/GPT-RAG Feb 19, 2024
This was referenced Mar 18, 2024
@placerda placerda self-assigned this Mar 18, 2024