Duplicate entries in qualitative measurements table #70
This reverts changes introduced in ImagingDataCommons#64 and following PRs. Following the investigation of the issues identified in ImagingDataCommons#70, I decided not to hold the release of v13 any longer. These revisions will require more work and time.
TL;DR: After reviewing this, thinking about possible solutions, and discussing with @vkt1414, I decided to revert the query updates made in #64 and proceed with v13. The original queries that correspond to
For all of those SRs, the following assumptions were valid:
With those assumptions, the result of "flattening" was the following table schema (for the qualitative measurements):
Now, the new dataset has SRs that:
It is not clear to me what the expected behavior should be when flattening those measurement groups into the table schema we established, or whether this would make sense at all. We could, arguably, put "Finding site" into the
Alternatively, we could have a completely separate query that handles evaluations that are not derived from segmentations. I think this would be easier for the user to understand. I think for v13 we should do just that, and use that query in the notebooks and other materials accompanying the new
The issues below were reported by @deepakri201 via Discord and need to be investigated.
Using this query: #69. I think similar issues are occurring as before, when a slice has more than one body part region assigned to it, or more than one landmark assigned to it. For example:
For the case of multiple regions per slice -- If we take PatientID="LUNG1-002", and check where trackingIdentifier="Annotations group 14", we should only get 2 rows corresponding to Abdomen and Chest regions, but we get 4 rows.
For the case of multiple landmarks per slice -- If we take PatientID="LUNG1-001", and check where trackingIdentifier="Annotations group landmarks 1", we should only get 2 rows corresponding to Kidney + Bottom, and L2 vertebra + Center, but we get 4 rows.
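The check described above can be sketched as a small script that counts how often each row occurs in a query result. This is only an illustration, not the query from this issue: the column names `finding` and `value`, the helper `find_duplicates`, and the sample rows are all hypothetical, made up to mirror the LUNG1-001 landmark case (2 expected rows, 4 returned).

```python
from collections import Counter

# Hypothetical column names; the real table schema may differ.
KEY_FIELDS = ("PatientID", "trackingIdentifier", "finding", "value")


def find_duplicates(rows, key_fields=KEY_FIELDS):
    """Return {key tuple: count} for every key that occurs more than once."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample shaped like the reported case: each expected row appears twice.
GROUP = "Annotations group landmarks 1"
rows = [
    {"PatientID": "LUNG1-001", "trackingIdentifier": GROUP,
     "finding": "Kidney", "value": "Bottom"},
    {"PatientID": "LUNG1-001", "trackingIdentifier": GROUP,
     "finding": "L2 vertebra", "value": "Center"},
    {"PatientID": "LUNG1-001", "trackingIdentifier": GROUP,
     "finding": "Kidney", "value": "Bottom"},
    {"PatientID": "LUNG1-001", "trackingIdentifier": GROUP,
     "finding": "L2 vertebra", "value": "Center"},
]

duplicates = find_duplicates(rows)
for key, n in duplicates.items():
    print(f"{n} copies of {key}")
```

Running the same count over the output of both query versions would show whether the extra rows are exact repeats or distinct combinations of findings and values, which may help narrow down where the duplication is introduced.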
However, using the query here: https://github.com/vkt1414/etl_flow/blob/dde527d1e3ad85fcabe3571a66468f69c387a033/bq/derived_table_creation/BQ_Table_Building/derived_data_views/sql/qualitative_measurements.sql, the regions and landmarks are correct. Andrey, I think you may have worked from a slightly older version of Vamsi's query, from before he fixed these problems.