Add package 2024_Barquera_ChichenItza #211

RodrigoBarquera · 2024-09-05T14:01:01Z

The package does not already exist in the community archive, not even under a different name.
The package title in the POSEIDON.yml conforms to the general title structure suggested here: <Year>_<Last name of first author>_<Region, time period or special feature of the paper>, e.g. 2021_Zegarac_SoutheasternEurope, 2021_SeguinOrlando_BellBeaker or 2021_Kivisild_MedievalEstonia.
The package is stored in a directory that is named like the package title.

The Publication column in the .janno file is filled and the respective .bib file has complete entries for all listed keys.
The .janno file does not include any empty columns or columns only filled with n/a.
The order of columns in the .janno file adheres to the standard order as defined in the Poseidon schema here.
The .janno and the .ssf files are not fully quoted, so they only use single- or double quotes ("...", '...') to enclose text fields where it is strictly necessary (i.e. their entry includes a TAB).
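As a rough illustration of the empty-column check above, here is a minimal Python sketch that flags columns in tab-separated .janno content whose values are all empty or n/a. The function name `find_empty_columns` and the sample rows are hypothetical, and this is not a substitute for Poseidon's own validation tooling.

```python
import csv
import io


def find_empty_columns(janno_text):
    """Return the names of columns whose every value is empty or 'n/a'."""
    reader = csv.DictReader(io.StringIO(janno_text), delimiter="\t")
    rows = list(reader)
    empty = []
    for name in reader.fieldnames or []:
        # A column counts as empty if no row carries a real value in it.
        if all((row.get(name) or "").strip() in ("", "n/a") for row in rows):
            empty.append(name)
    return empty


# Hypothetical sample content, for illustration only.
sample = (
    "Poseidon_ID\tGenetic_Sex\tGroup_Name\tCollection_ID\n"
    "ind1\tM\tGroupA\tn/a\n"
    "ind2\tF\tGroupA\tn/a\n"
)

print(find_empty_columns(sample))
```

Any column the function reports would need to be either filled or removed before submission.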

Large genotype data files are properly tracked with Git LFS and not directly pushed to the repository. For instructions on how to set up Git LFS, please look here. If you accidentally pushed the files the wrong way, you can fix it with `git lfs migrate import --no-rewrite path/to/file.bed` (see here).

stschiff · 2024-10-09T07:35:03Z

Hi @RodrigoBarquera, this is great. Super that you even entered the relationship columns, which I know is a lot of work.

Sorry for taking so long to give feedback, but I have some points:

We actually would like the Collection_ID column to reflect the ID from the actual collection. I see that you've used the column Alternative_IDs for that. I suggest that you simply rename the Alternative_IDs column to Collection_ID and remove the empty Collection_ID column.
You have only given date information for the few samples that you've C14-dated. But I'm sure you can also give dates for all samples that have no C14 date, right? We have the value contextual in Date_Type for that, and it would be good to fill it in. We generally aspire to have at least contextual dates for every single sample, to facilitate meta-analyses through space and time. Note that with contextual dates, you should only fill the columns Date_BC_AD_Start, Date_BC_AD_Median and Date_BC_AD_End, where the median can just be the midpoint of the interval.
I see that you've left the columns Endogenous, Nr_SNPs, Coverage_on_Target_SNPs, Damage, Contamination, Contamination_Err, Contamination_Meas and Contamination_Note empty. I'm sure this information is available in your paper, right? Do you need help with these? We have three student assistants now who can help with this. Let us know! I would be willing to leave these empty for now, but if it's just about needing help, let us help.
The Genetic_Source_Accession_IDs should be filled. They can all have the exact same Project Accession ID entry from the ENA.

Again, let us know if you need help with this and we can ask someone from our team.
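The contextual-date suggestion above could be sketched roughly as follows in Python. The helper name `contextual_date` and the example interval are hypothetical. The sketch assumes years are integers with negative values meaning BC, and it takes the median as the floored midpoint of the interval.

```python
def contextual_date(start, end):
    """Build the date fields for a sample dated only by archaeological context.

    Assumes integer years, negative for BC. The median is the floored
    midpoint of the interval, as suggested in the review comment.
    """
    if start > end:
        raise ValueError("start must not be later than end")
    return {
        "Date_Type": "contextual",
        "Date_BC_AD_Start": start,
        "Date_BC_AD_Median": (start + end) // 2,
        "Date_BC_AD_End": end,
    }


# Hypothetical interval, for illustration only (not taken from the paper).
print(contextual_date(500, 900))
```

The returned values would go into the corresponding .janno columns for each sample without a C14 date.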

added new package named 2024_Barquera_ChichenItza

dd80eab

nevrome changed the title ~~added new package named 2024_Barquera_ChichenItza~~ Add package 2024_Barquera_ChichenItza Sep 6, 2024

stschiff self-assigned this Sep 9, 2024
