R03: need for data format code indication #56

Open
delphinedobler opened this issue Feb 10, 2023 · 4 comments
Labels: avtt Argo Vocabulary Task Team, R03

Comments

@delphinedobler

In the former Excel spreadsheet, the format code (the "data type" column, with the values float, double, NC_Short (16-bit signed integer) and NC_DOUBLE) was indicated, and it is used by the file checker. This information should also be reflected on the NVS side.

Is there a difference (a subtlety) between double and NC_double? It seems to be mainly a question of the associated fill_value:

  • double fill_value is 99999 while NC_double is 9.9692099683868690e+36

Likewise for float, for which no NC_float counterpart is listed:

  • float fill_value is 99999.f while NC_float is 9.9692099683868690e+36

We need to clarify why we use both conventions (the netCDF NC_* names and plain float/double) and whether this is related to the fill_values used.

To tackle this, we could create an additional table (I don't see any existing table that would fit, but I may have missed one) listing the netCDF types, plus float and double if they prove relevant as questioned above:
https://docs.unidata.ucar.edu/nug/current/md_types.html

Then the R03 entries would be mapped to the corresponding format code.
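
To make the proposal concrete, here is a minimal Python sketch of what such a format-code lookup could look like. It is only an illustration: the names `FormatCode`, `FORMAT_CODES` and `lookup_format_code` are hypothetical, the 99999 fill values are the ones quoted above, and the assumption that NC_SHORT entries use the netCDF default fill value (-32767) is mine and would need to be checked against the spreadsheet.

```python
from dataclasses import dataclass

# Default fill values as defined by the netCDF library (netcdf.h).
NC_FILL_SHORT = -32767
NC_FILL_DOUBLE = 9.9692099683868690e+36


@dataclass(frozen=True)
class FormatCode:
    """One row of the hypothetical format-code table."""
    netcdf_type: str   # type actually stored in the netCDF file
    fill_value: float  # fill value associated with this format code


# Hypothetical table, keyed by the lower-cased spreadsheet label.
# "double" and "NC_DOUBLE" share a storage type and differ only in fill value.
FORMAT_CODES = {
    "float": FormatCode("NC_FLOAT", 99999.0),
    "double": FormatCode("NC_DOUBLE", 99999.0),
    # Assumption: NC_SHORT entries use the netCDF default fill value.
    "nc_short": FormatCode("NC_SHORT", NC_FILL_SHORT),
    "nc_double": FormatCode("NC_DOUBLE", NC_FILL_DOUBLE),
}


def lookup_format_code(label: str) -> FormatCode:
    """Return the table row for a spreadsheet label, ignoring case."""
    return FORMAT_CODES[label.strip().lower()]


for label in ("float", "double", "NC_Short", "NC_DOUBLE"):
    code = lookup_format_code(label)
    print(label, code.netcdf_type, code.fill_value)
```

Each R03 entry would then carry one of these labels, and a checker could read both the storage type and the expected fill value from the same row.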

vpaba added the avtt Argo Vocabulary Task Team label Feb 10, 2023
@vpaba
Contributor

vpaba commented Feb 10, 2023

Thanks for opening this ticket @delphinedobler. I've raised it with the wider BODC Vocab Team, as some CF vocabularies already exist on the NVS (though not for Data Types yet I believe): http://vocab.nerc.ac.uk/search_nvs/cvl/?searchstr=CF&options=identifier,preflabel,altlabel,governance

I will update on what I find on the BODC side!

In the meantime, as you say, it would be good to understand whether the FillValue difference you've spotted is important, or whether we (Argo) would be happy with NC_FLOAT, NC_SHORT and NC_DOUBLE (and let go of 'double' and 'float').

@apswong

apswong commented Feb 10, 2023

@delphinedobler @vpaba
The assignment of data types in the Argo netcdf files, such as double or nc_double, float or nc_float, is not due to the difference in FillValue, but is simply a matter of legacy. In the early years of Argo, we used the primitive data types that were generally used in programming (float, double, etc.) and assigned them an arbitrary number that was certain to be out of range (99999.) as FillValue. More recently we started using the NetCDF data types (nc_float, nc_double, etc.), which have their own defined FillValues. The result is that there is now a mix of data types (and FillValues) in the parameters xlsx. These differences are not important in terms of the data that we make public. However, we cannot change the data types that are already assigned, because that would mean rewriting all the Argo netcdf files.

I don't think it's necessary to create an additional table for data types. But the data type assigned to each parameter should be added to R03, similar to the min/max issue, since it is used by the File Checker.

@vpaba
Contributor

vpaba commented Apr 24, 2024

@apswong thanks for the explanation above. Is the data type something relevant to R03 alone, or to other collections as well?

vpaba added the R03 label Apr 24, 2024
@apswong

apswong commented Apr 24, 2024

@vpaba Please be careful with using the term "data type". In the context of R03, the column labelled "data type" refers to the parameter attributes. There is also an official variable called "DATA_TYPE", which I believe is in R01.

In terms of the parameter attributes, I'm not sure but I think they are only relevant in R03. Perhaps @tcarval can confirm?
