Job138 - Omitted rows in ingested sheet #1350

Open
bpasquer opened this issue Apr 19, 2024 · 2 comments

Comments

@bpasquer
Contributor

bpasquer commented Apr 19, 2024

Opening an issue to further investigate the problem raised by Lizzie (email 2024-04-10) regarding Job 138 (2022 RLS MPA TEMPLATE WITH INVERT SIZE AND ALGAE_SA_Amended JB.xlsx):

Lizzie: For Job 138, 20 rows were omitted from the ingest. Many were for the species Austrolabrus maculatus, but we are unsure why. Some of the “omitted” rows are actually true duplicates that are supposed to be summed on ingest. This still needs to be checked, though, as it looks like the totals of these duplicates are not being summed on ingest. The user should receive a warning for duplicate observations with the same site/date/depth/method/block/species combination, but if the user chooses to ignore this warning, the data are to be summed. The original staged data and the ingested data are attached; the rows highlighted in yellow need investigating (green highlights are expected omissions).

Bene: The Job 138 issue requires further investigation, as we haven't yet identified a clear pattern that explains why the highlighted rows weren't ingested.
Regarding your assessment of duplicate processing, I can confirm that the current software version does:
- flag duplicate rows
- sum "true" duplicates in the endpoints
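
For reference, the duplicate rule described above (warn on repeated site/date/depth/method/block/species combinations, then sum them if the warning is ignored) can be sketched as below. This is only an illustration and not the project's code: the column names, the `total` field, and the function name are assumptions.

```python
from collections import defaultdict

# Hypothetical column names; the real template's headers may differ.
KEY_FIELDS = ("site", "date", "depth", "method", "block", "species")


def flag_and_sum_duplicates(rows):
    """Return (summed_rows, duplicate_keys) for a list of row dicts."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        key = tuple(row[field] for field in KEY_FIELDS)
        totals[key] += row["total"]
        counts[key] += 1
    # Keys seen more than once are the ones a user would be warned about.
    duplicate_keys = [key for key, n in counts.items() if n > 1]
    # One output row per key, carrying the summed total.
    summed_rows = [dict(zip(KEY_FIELDS, key), total=total)
                   for key, total in totals.items()]
    return summed_rows, duplicate_keys
```

With this shape, a sheet containing two rows for the same combination yields one output row with the combined total, plus one entry in `duplicate_keys` for the warning.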

2022 RLS MPA TEMPLATE WITH INVERT SIZE AND ALGAE_SA_Amended JB_Missing highlighted.xlsx

@bpasquer
Contributor Author

Toni tested the ingest of 2022 RLS MPA TEMPLATE WITH INVERT SIZE AND ALGAE_SA and reported (email 2024-05-23) that the current software version does not have the issue.
Because the original error cannot be replicated, its cause cannot be investigated.
Nonetheless, I think it would be worthwhile to verify whether other sheets ingested with the same version of the software were also affected.
I propose we discuss the method for conducting this check (manual or automated; values/fields to check) once the list of those sheets is established.

@bpasquer
Contributor Author

bpasquer commented Jun 6, 2024

A manual check was done on a couple of sheets ingested during the same period by Toni, to see whether data had been omitted from the ingest.
During the catch-up on 2024-06-06, we agreed that automating the check would be complex: it would involve comparing rows in the original datasheet with records in the database and reporting the differences. Given that we can assume the issue has been resolved, it is not considered worthwhile to invest further effort in this.
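
For the record, the core of such a comparison could look roughly like the sketch below. It is not an implementation proposal: it assumes both the datasheet and the database records have already been loaded as lists of dicts with matching field names, which is where most of the real complexity would sit. The function and parameter names are hypothetical.

```python
def rows_missing_from_ingest(staged_rows, ingested_rows, key_fields):
    """Return the staged rows whose key has no matching ingested record."""
    def key_of(row):
        return tuple(row[field] for field in key_fields)

    # Match on the key fields only, so that true duplicates summed on
    # ingest are not reported as missing.
    ingested_keys = {key_of(row) for row in ingested_rows}
    return [row for row in staged_rows if key_of(row) not in ingested_keys]
```

Matching on key fields rather than whole rows is a deliberate choice here: a summed duplicate has a different total from either of its source rows, so a whole-row comparison would flag it incorrectly.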

An enhancement to the ingested sheet summary report was proposed: to include the number of rows that were not ingested (i.e., the difference between staged and ingested rows).
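
As a rough illustration of the proposed figure, assuming the report already has the two row counts available (the function and key names below are hypothetical):

```python
def ingest_summary(staged_count, ingested_count):
    """Build summary figures for an ingested sheet from its two row counts."""
    return {
        "staged": staged_count,
        "ingested": ingested_count,
        # Difference between staged and ingested rows, as proposed above.
        "not_ingested": staged_count - ingested_count,
    }
```

Note that this plain difference would also count true duplicates that were merged by summing, so a non-zero value would not by itself indicate lost data.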

bpasquer added this to the Maintenance - Software milestone Oct 9, 2024