R03: need for data format code indication #56

delphinedobler · 2023-02-10T09:17:51Z

In the former Excel spreadsheet, the format code (the "data type" column, with the values float, double, NC_SHORT (16-bit signed integer) and NC_DOUBLE) was indicated and is used by the file checker. This information should also be reflected on the NVS side.

Is there a difference (subtlety) between double and NC_DOUBLE? It seems mainly to be a question of the associated fill_value:

double fill_value is 99999 while NC_DOUBLE is 9.9692099683868690e+36
As for float, which is not listed as NC_FLOAT:
float fill_value is 99999.f while NC_FLOAT is 9.9692099683868690e+36
We need to clarify why we use both conventions (the netCDF NC_* names and plain float/double) and whether this is related to the fill_values used.

To tackle this, we could create an additional table (I don't see any existing table that would fit, but I may have missed one) listing the netCDF types, plus float and double if they prove relevant given the question above:
https://docs.unidata.ucar.edu/nug/current/md_types.html

Then the R03 entries would be mapped to the corresponding format code.
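
To make the proposal concrete, here is a minimal sketch of what such a format-code table and the R03 mapping could look like. It is an illustration only, not an agreed design: the `FormatCode` class, the `R03_FORMAT` dictionary and the parameter names `PARAM_A` / `PARAM_B` are hypothetical, and the sketch assumes that the plain float/double labels map to the same netCDF storage types as their NC_* counterparts (as the CDL keywords do). The fill values for float, double, NC_FLOAT and NC_DOUBLE are the ones quoted above; the NC_SHORT value is the netCDF library default and is not taken from this thread.

```python
from dataclasses import dataclass

# netCDF default fill value for NC_FLOAT and NC_DOUBLE, as quoted in this thread.
NC_FILL = 9.9692099683868690e+36


@dataclass(frozen=True)
class FormatCode:
    code: str          # label as it appeared in the spreadsheet
    netcdf_type: str   # assumed underlying netCDF storage type
    fill_value: float  # fill value associated with the code


FORMAT_CODES = {
    fc.code: fc
    for fc in (
        FormatCode("float", "NC_FLOAT", 99999.0),
        FormatCode("double", "NC_DOUBLE", 99999.0),
        FormatCode("NC_FLOAT", "NC_FLOAT", NC_FILL),
        FormatCode("NC_DOUBLE", "NC_DOUBLE", NC_FILL),
        # -32767 is the netCDF library default for NC_SHORT (not stated in this thread).
        FormatCode("NC_SHORT", "NC_SHORT", -32767),
    )
}

# Hypothetical R03 entries mapped to a format code (illustrative names only).
R03_FORMAT = {"PARAM_A": "float", "PARAM_B": "NC_DOUBLE"}


def fill_value_for(parameter: str) -> float:
    """Look up the fill value implied by a parameter's format code."""
    return FORMAT_CODES[R03_FORMAT[parameter]].fill_value
```

With this layout, float and NC_FLOAT share a storage type and differ only in the fill value, which is the distinction questioned above.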

vpaba · 2023-02-10T14:31:32Z

Thanks for opening this ticket @delphinedobler. I've raised it with the wider BODC Vocab Team, as some CF vocabularies already exist on the NVS (though not for Data Types yet I believe): http://vocab.nerc.ac.uk/search_nvs/cvl/?searchstr=CF&options=identifier,preflabel,altlabel,governance

I will update on what I find on the BODC side!

In the meantime, as you say, it would be good to understand whether the FillValue difference you've spotted is important, or whether we (Argo) would be happy with NC_FLOAT, NC_SHORT and NC_DOUBLE (and let go of 'double' and 'float').

apswong · 2023-02-10T20:36:28Z

@delphinedobler @vpaba
The assignment of data types in the Argo netcdf files, such as double or nc_double, float or nc_float, is not due to the difference in FillValue, but is simply a matter of legacy. In the early years of Argo, we used the primitive data types that were generally used in programming (float, double, etc.) and assigned them an arbitrary but certain to be out-of-range number (99999.) as FillValues. More recently we started using the NetCDF data types (nc_float, nc_double, etc), which have their own defined FillValues. The result is that there is now a mix of data types (and FillValues) in the parameters xlsx. These differences are not important in terms of the data that we make public. However, we cannot rewrite the data types that are already assigned, because that will mean rewriting all the Argo netcdf files.

I don't think it's necessary to create an additional table for data types. But the assigned data type related to each parameter should be added in R03, similar to the min/max issue, since they are used by the File Checker.
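
As a rough illustration of why a checker needs the assigned data type per parameter, the sketch below compares a variable's declared type and FillValue against the expected ones. This is not the actual Argo File Checker code: the `check_variable` function and the `PARAM_A` name are hypothetical, and the expected values would in practice come from the R03 entry.

```python
import math


def check_variable(name, expected_type, expected_fill, actual_type, actual_fill):
    """Return a list of human-readable problems (empty if the variable conforms)."""
    problems = []
    if actual_type != expected_type:
        problems.append(f"{name}: type {actual_type}, expected {expected_type}")
    # A relative tolerance absorbs float32/float64 rounding of the fill value.
    if not math.isclose(actual_fill, expected_fill, rel_tol=1e-6):
        problems.append(f"{name}: FillValue {actual_fill}, expected {expected_fill}")
    return problems


# A legacy 'float' parameter written with the legacy fill value conforms.
legacy = check_variable("PARAM_A", "float", 99999.0, "float", 99999.0)

# The same parameter written with the netCDF default fill value is flagged.
mixed = check_variable("PARAM_A", "float", 99999.0, "float", 9.9692099683868690e+36)
```

Here `legacy` is an empty list, while `mixed` contains one FillValue message, which is why the legacy and netCDF type labels cannot simply be merged without rewriting existing files.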

vpaba · 2024-04-24T14:39:35Z

@apswong thanks for the explanation above. Is the data type something relevant to R03 alone, or to other collections as well?

apswong · 2024-04-24T16:38:10Z

@vpaba Please be careful with using the term "data type". In the context of R03, the column labelled "data type" refers to the parameter attributes. There is also an official variable called "DATA_TYPE", which I believe is in R01.

In terms of the parameter attributes, I'm not sure but I think they are only relevant in R03. Perhaps @tcarval can confirm?

vpaba added the avtt Argo Vocabulary Task Team label Feb 10, 2023

gmaze mentioned this issue Feb 14, 2023

R03: add min/max values to each parameter description #57

Open

vpaba added the R03 label Apr 24, 2024

github-project-automation bot added this to AVTT issues management Aug 29, 2024

github-project-automation bot moved this to To do in AVTT issues management Aug 29, 2024

cgourcuf added this to AVTT issues management Oct 9, 2024

cgourcuf moved this to Todo in AVTT issues management Oct 9, 2024

danibodc self-assigned this Oct 11, 2024
