
ESGF_Node|PMIP3example

Stephen Pascoe edited this page Apr 9, 2014 · 9 revisions
Wiki Reorganisation
This page has been classified for reorganisation. It has been given the category MOVE.
The content of this page will be revised and moved to one or more other pages in the new wiki structure.

Goal

We want to use the existing ESGF tools and infrastructure to distribute data used in the PMIP3 project (Paleoclimate Modelling Intercomparison Project Phase 3).

  • Some of the PMIP3 data is already available in the CMIP5 database: standard paleo experiments (_midHolocene_, _lgm_ and _past1000_) and other _paleo-related_ experiments (_piControl_ and _1pctCO2_) performed by institutes that are officially part of CMIP5.

  • This tutorial specifies how to deal with the rest of the PMIP3 data (aka _non-CMIP5 data_ or _PMIP3-specific_ data):

    • CMIP5 paleo (or _paleo-related_) experiments performed by non-CMIP5 participants

    • non-CMIP5 paleo experiments (_plioMIP_, _last Interglacial_, _Holocene_ experiments)

Data Access and Terms of Use

Users must be able to access the CMIP5 and the PMIP3 data with the same OpenID, and the PMIP3 data is distributed under the same Terms of Use as the CMIP5 data. Data use can be:

  • limited to non-commercial research and educational purposes
  • unrestricted

For example, a user who created a CMIP5 OpenID and enrolled in the _commercial_ group will only be able to access PMIP3 (and CMIP5) data listed as _unrestricted_.

When a user goes to a Gateway referencing both CMIP5 and PMIP3-specific data and selects a CMIP5 experiment, the Gateway shows the models that have performed this experiment in both the _CMIP5_ and the _PMIP3_ databases/projects (see example below).

attachment:BADC_PMIP3_1pctCO2_enhanced.png

HOWTO add another new model...

If everything is already set up (following the instructions provided below), and _another_ new model must be processed:

  1. Add the new model to PMIP3_models_drslib.csv

  2. Add the new model to esgcet_models_table.txt

  3. Add the new model to esg.pmip3.ini

  4. Update the Data Node database

  5. Go through the rest of the steps: move the data with drs_tool, generate the map files, generate the catalogs, publish the datasets

Preparing the data

Creating the data files

The PMIP3 data files supplied by the PMIP3 participants are as similar as possible to the files prepared for CMIP5:

  • NetCDF files following the CF convention, 1 variable per file

  • the files sent by the PMIP3 participants should be in a directory hierarchy following the CMIP5 Data Reference Syntax (DRS)

  • in order to comply with the preceding prerequisites, the files are generated with the CMOR2 library using PMIP3 tables (PMIP3 tables are very similar to CMIP5 tables...).

Preparing the data files for publication

Introduction

We assume that we have received PMIP3 data files from the P3INS institute, generated with the P3-mod model in an incoming directory (we have of course used checksums to make sure that the data was not corrupted during the transfer :-) ) and we have to move them to a DRS directory hierarchy accessible from the Data Node.
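The checksum verification mentioned above could be done along the following lines. This is only a minimal sketch, assuming that the sender supplied MD5 digests (the checksum type used later on this page for the map files); the function names `md5_of_file` and `is_intact` are hypothetical helpers, not part of drslib or esgcet.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so that large NetCDF files are never
        # loaded into memory in one go.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_md5):
    """Compare the computed digest with the one supplied by the sender."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A file whose digest does not match should be transferred again before it is moved anywhere.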

Notes:

  • The user moving the files from the _PMIP3 incoming_ to the _PMIP3 outgoing_ hierarchy must have write access to both hierarchies, and access to a Python installation where drslib/drs_tool is installed. The _PMIP3 outgoing_ hierarchy has to be accessible from the Data Node.

  • a correctly configured drslib can handle non-CMIP5 data (as of Mon, 19 Mar 2012). Just in case, it is probably useful to install it in develop mode and to check the drslib history

  • all the PMIP3 data will go to the output _product_ directory, so there is no need to perform any _product detection_.

  • using drs_tool to move the files to the output directory will ensure that the files are in the correct DRS structure

    Incoming data

    /fs/PMIP3/esg/incoming/P3INS/P3-mod/

    Location of the CMIP5 data, if the data node distributes PMIP3 and CMIP5 data

    /fs/esg/CMIP5/output[12]

    Location of the PMIP3 data distributed/published from the Data Node

    /fs/esg/PMIP3/output/P3INS/P3-mod/

Configuring drslib

Note: we put all the configuration files in a PMIP3-specific configuration directory. This allows the same user to use the same python+drslib installation to handle data for different projects (we just have to set up the PMIP3_metaconfig.conf file and make sure it is found)

$ ls /fs/PMIP3/esg/conf
pmip3-cmor-tables/  PMIP3_metaconfig.conf
PMIP3_logging.conf  PMIP3_models.csv
PMIP3 CMOR tables

Install the PMIP3 CMOR tables in /fs/PMIP3/esg/conf/pmip3-cmor-tables

$ cd /fs/PMIP3/esg/conf/
$ git clone http://uv-cdat.llnl.gov/git/pmip3-cmor-tables.git
PMIP3_logging.conf

The configuration file below is mostly copied from drslib logging. More information about logging configuration can be found in Logging facility for Python and [Logging configuration](http://docs.python.org/library/logging.config.html#logging.config.fileConfig).

The configuration is set up so that the activity of drslib (and drs_tool) is logged to _/fs/PMIP3/esg/logs/drs_tool.log_. The error messages go to this file! It is therefore important to know where it is located and to study its content if nothing seems to be working.

#
# Basic logging configuration for drs_tool
#
# This configuration prints product detection decisions to STDERR and logs
# drslib messages (INFO and above) to /fs/PMIP3/esg/logs/drs_tool.log
#

[loggers]
keys=root,drslib,p_cmip5

[handlers]
keys=drslib_h,p_cmip5_h

[formatters]
keys=f1,f2

#---------------------------------------------------------------------------
# Loggers

# No catch-all logging
[logger_root]
handlers=
level=NOTSET

[logger_drslib]
qualname=drslib
handlers=drslib_h

[logger_p_cmip5]
qualname=drslib.p_cmip5
handlers=p_cmip5_h
propagate=0

#---------------------------------------------------------------------------
# Handlers & Formatters

[handler_drslib_h]
class=FileHandler
args=('/fs/PMIP3/esg/logs/drs_tool.log', )
formatter=f1
level=INFO

[handler_p_cmip5_h]
class=StreamHandler
args=(sys.stderr, )
formatter=f2
level=INFO

[formatter_f1]
format=%(asctime)s [%(levelname)s] %(name)s: %(message)s
datefmt=

[formatter_f2]
format=[%(levelname)s] %(name)s: %(message)s
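
A typo in one of the `keys=` lists is easy to make when editing this file by hand. The sketch below is one possible consistency check, assuming only the section naming scheme visible above (`logger_<key>`, `handler_<key>`, `formatter_<key>`); the function name `missing_sections` is hypothetical and the check does not replace loading the file with the Python logging module.

```python
import configparser


def missing_sections(conf_path):
    """List sections that a fileConfig-style logging file declares but lacks."""
    # Interpolation is disabled because format strings contain %(name)s tokens.
    parser = configparser.ConfigParser(interpolation=None)
    with open(conf_path) as handle:
        parser.read_file(handle)

    missing = []
    for group, prefix in (("loggers", "logger_"),
                          ("handlers", "handler_"),
                          ("formatters", "formatter_")):
        keys = parser.get(group, "keys", fallback="")
        for key in (k.strip() for k in keys.split(",")):
            # Every declared key needs its own [prefix + key] section.
            if key and not parser.has_section(prefix + key):
                missing.append(prefix + key)
    return missing
```

For the configuration shown above, an empty list would be the expected result of `missing_sections("/fs/PMIP3/esg/conf/PMIP3_logging.conf")`.
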
PMIP3_models_drslib.csv

Add a line describing the current PMIP3 model and institute at the end of the list used by drslib

Get the current PMIP3_models_drslib.csv from PMIP3_models_drslib.csv (possibly out of date) or from JeanyvesPeterschmitt

# Copy the existing list of CMIP5 models to a list used for PMIP3 models
$ cp esgf-drslib/drslib/data/CMIP5_models.csv /fs/PMIP3/esg/conf/PMIP3_models_drslib.csv

# ... and add to PMIP3_models_drslib.csv the appropriate line for the new institute/model
# "full institute name","institute acronym","country","PMIP3 official model_id","possibly modified model_id"
# WARNING! NO blank lines at the end of this csv file!
"PMIP3 famous institute","P3INS","France","P3-mod","P3-mod"
PMIP3_metaconfig.conf

The main metaconfig drslib configuration file specifies where all the drslib configuration files (described above) are located. Its location is specified by the METACONFIG_CONF environment variable

$ export METACONFIG_CONF=/fs/PMIP3/esg/conf/PMIP3_metaconfig.conf



# PMIP3_metaconfig.conf

# The content of this file is read by
#    drslib-0.3.0a4/drslib/config.py
# or    esgf-drslib/drslib/config.py

# Note: metaconfig will use the value of the METACONFIG_CONF
# environment variable to locate this file

[metaconfig]
configs = drslib
# Logging...
# Example logging.conf
#   http://esgf.org/esgf-drslib-site/intro.html#logging
logging = /fs/PMIP3/esg/conf/PMIP3_logging.conf

[drslib:tables]
# "path" below can also be configured with
# env variable MIP_TABLE_PATH
path = /fs/PMIP3/esg/conf/pmip3-cmor-tables/Tables

# You can override the default location of CSV versions of the tables
# ${path}/../Tables_csv
# with the "path_csv" variable
# IMPORTANT! In the current version of drslib, the directory
#            ${path}/../Tables_csv MUST exist, but its content is not used!
#            drslib gets the information it needs directly from the
#            actual CMOR tables in ${path}
#path_csv = 

# Prefix of the CMOR tables
# Default is "CMIP5_" ...
prefix = PMIP3_

# CSV file with some information about models and institutes
model_table = /fs/PMIP3/esg/conf/PMIP3_models_drslib.csv

[drslib:drs]

# Default root of the DRS tree
# Can be overridden with '-R' or '--root='
root = /fs/esg/PMIP3

# Default activity
# Can be overridden with '-a' or '--activity='
activity = pmip3
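
Before running drs_tool, it can help to verify that the paths listed in the metaconfig file really exist, including the `Tables_csv` directory mentioned in the comments above. The following is a minimal sketch, assuming only the section and option names shown in PMIP3_metaconfig.conf; `check_metaconfig` is a hypothetical helper and does not reproduce how drslib itself reads the file.

```python
import configparser
import os


def check_metaconfig(conf_path):
    """Return a list of problems found in a drslib metaconfig file."""
    # Interpolation is disabled so that values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    with open(conf_path) as handle:
        parser.read_file(handle)

    problems = []
    tables = parser.get("drslib:tables", "path", fallback=None)
    if not tables or not os.path.isdir(tables):
        problems.append("CMOR tables directory not found: %s" % tables)
    else:
        # ${path}/../Tables_csv must exist (unless path_csv overrides it),
        # even though its content is not used.
        default_csv = os.path.join(
            os.path.dirname(tables.rstrip("/")), "Tables_csv")
        csv_dir = parser.get("drslib:tables", "path_csv", fallback=default_csv)
        if not os.path.isdir(csv_dir):
            problems.append("Tables_csv directory not found: %s" % csv_dir)

    model_table = parser.get("drslib:tables", "model_table", fallback=None)
    if not model_table or not os.path.isfile(model_table):
        problems.append("Model table not found: %s" % model_table)

    logging_conf = parser.get("metaconfig", "logging", fallback=None)
    if logging_conf and not os.path.isfile(logging_conf):
        problems.append("Logging configuration not found: %s" % logging_conf)
    return problems
```

An empty list returned for the file pointed to by METACONFIG_CONF suggests that at least the paths are in place.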

Using drs_tool to move the files

Once everything is correctly set up for using drslib, the PMIP3 data files can be moved to the DRS hierarchy from where they will be published by using drs_tool. The steps are exactly the same as would need to be done for CMIP5 data :-)

Finding data sets

The first step is to check that drs_tool can find some PMIP3 datasets

# Command line with all the arguments
$ drs_tool list -R /fs/esg/PMIP3 -I /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -p output -e piControl -i P3INS -a pmip3

# The values for the '-R' and '-a' options are specified
# in PMIP3_metaconfig.conf, so we don't really need them
$ drs_tool list -I /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -p output -e piControl -i P3INS

# It's also possible to scan all the experiments and datasets for a given institute with a single command
$ drs_tool list -I /fs/PMIP3/esg/incoming/P3INS/P3-mod -p output -i P3INS

# The following command can be used to keep the output in a log file
$ drs_tool list -I /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -p output -e piControl -i P3INS 2>&1 | tee /fs/PMIP3/esg/logs/P3INS_piControl_list.`date +"%Y%m%d_%H%M"`.log

If things work correctly, we get something like

# Output of drs_tool list on stdout
==============================================================================
DRS Tree at /fs/esg/PMIP3
------------------------------------------------------------------------------
pmip3.output.P3INS.P3-mod.piControl.mon.ocean.Omon.r1i1p1        0:0 36:49242847488
------------------------------------------------------------------------------
1 datasets awaiting upgrade
------------------------------------------------------------------------------

# Output of drs_tool list in drs_tool.log
2012-03-22 11:48:45,957 [INFO] drslib.mip_table: Adding table Amon from /fs/PMIP3/esg/conf/pmip3-cmor-tables/Tables/PMIP3_Amon to table store
2012-03-22 11:48:46,003 [INFO] drslib.mip_table: Adding table LImon from /fs/PMIP3/esg/conf/pmip3-cmor-tables/Tables/PMIP3_LImon to table store
[...]
2012-03-22 11:52:03,201 [INFO] drslib.publisher_tree: Deduced 36 incoming DRS files for PublisherTree <DRS pmip3.output.P3INS.P3-mod.piControl.mon.ocean.Omon.r1i1p1.%.tos.130001-134912>
2012-03-22 11:52:03,201 [INFO] drslib.publisher_tree: Deduced 36 incoming DRS files for PublisherTree <DRS pmip3.output.P3INS.P3-mod.piControl.mon.ocean.Omon.r1i1p1.%.tos.130001-134912>

If no dataset is found, the content of the drs_tool.log will give the reasons why the data files in the incoming directory were rejected...

# We requested 'pmip3' data (in metaconfig or with '-a pmip3'),
# but the data files were recognized as 'cmip5'. This probably
# comes from an old drslib that can't handle non-CMIP5 data
2012-03-16 17:45:25,591 [WARNING] drslib.drs_tree: FILTERED OUT: <DRS cmip5.output.P3INS.P3-mod.piControl.mon.ocean.Omon.r1i1p1.%.zos.155001-159912>.  'cmip5' != 'pmip3'

# We requested past1000 data ('-e past1000') but the data files
# in the specified incoming directory are for midHolocene
... [WARNING] drslib.drs_tree: FILTERED OUT: ...  'midHolocene' != 'past1000'
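
When many files are rejected, the log can contain a large number of such warnings. The sketch below summarises them, assuming that every warning ends with the `'found' != 'requested'` pattern shown in the two examples above; the regular expression and the function name `filtered_out_reasons` are illustrative and not part of drslib.

```python
import re

# Matches the tail of a "FILTERED OUT" warning: 'found' != 'requested'
MISMATCH = re.compile(r"FILTERED OUT: .*?'([^']*)' != '([^']*)'\s*$")


def filtered_out_reasons(lines):
    """Count the (found, requested) pairs reported in FILTERED OUT warnings."""
    counts = {}
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            pair = (match.group(1), match.group(2))
            counts[pair] = counts.get(pair, 0) + 1
    return counts
```

It could be called as `filtered_out_reasons(open("/fs/PMIP3/esg/logs/drs_tool.log"))` to see at a glance which value was found and which one was requested.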

Note about the numbers displayed by 'drs_tool list' (36:49242847488): they are the number of NetCDF files retained for a given dataset and their total size in bytes.

# Number of files
$ find /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -name '*.nc' | wc -l
36
# Total size
$ find /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -name '*.nc' -ls | awk 'BEGIN {tot=0} {tot += $7} END {print tot}'
49242847488
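
The same two numbers can also be computed without find and awk. This is a small sketch using only the Python standard library; the function name `count_and_size` is hypothetical.

```python
import os


def count_and_size(root, suffix=".nc"):
    """Return (number of files, total size in bytes) found below root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                count += 1
                total += os.path.getsize(os.path.join(dirpath, name))
    return count, total
```

Calling it on the incoming piControl directory should give the pair of values that 'drs_tool list' displays for the dataset.
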
Moving the data

It's possible to list the operations that will be performed by drs_tool without executing them

$ drs_tool todo -I /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -p output -e piControl -i P3INS 2>&1 | tee /fs/PMIP3/esg/logs/P3INS_piControl_todo.`date +"%Y%m%d_%H%M"`.log

# Some output of the 'drs_tool todo' command
Publisher Tree pmip3.output.P3INS.P3-mod.piControl.mon.ocean.Omon.r1i1p1 todo for version 20120322

mkdir -p /fs/esg/PMIP3/output/P3INS/P3-mod/piControl/mon/ocean/Omon/r1i1p1/files/tos_20120322
[...]
mv /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl/mon/ocean/tos/r1i1p1/tos_Omon_P3-mod_piControl_r1i1p1_130001-134912.nc /fs/esg/PMIP3/output/P3INS/P3-mod/piControl/mon/ocean/Omon/r1i1p1/files/tos_20120322/tos_Omon_P3-mod_piControl_r1i1p1_130001-134912.nc
[...]
ln -s ../../files/tos_20120322/tos_Omon_P3-mod_piControl_r1i1p1_130001-134912.nc /fs/esg/PMIP3/output/P3INS/P3-mod/piControl/mon/ocean/Omon/r1i1p1/v20120322/tos/tos_Omon_P3-mod_piControl_r1i1p1_130001-134912.nc
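
The layout visible in this output (real files under `files/<variable>_<version>`, symbolic links under `v<version>/<variable>`) can be sketched as follows. This only mirrors the paths printed above and is not drslib code; the function name `drs_layout` is hypothetical, and the sketch assumes that no component of the dataset id contains a dot.

```python
import posixpath


def drs_layout(root, dataset_id, version, variable, filename):
    """Return (real file path, link path, link target) for one data file."""
    # pmip3.output.P3INS... -> output/P3INS/...; the leading activity is
    # dropped because the DRS root (/fs/esg/PMIP3) already stands for it.
    dataset_dir = posixpath.join(root, *dataset_id.split(".")[1:])
    files_dir = "%s_%s" % (variable, version)
    real_path = posixpath.join(dataset_dir, "files", files_dir, filename)
    link_path = posixpath.join(dataset_dir, "v%s" % version, variable, filename)
    # The link is relative, so the whole tree can be relocated.
    link_target = posixpath.join("..", "..", "files", files_dir, filename)
    return real_path, link_path, link_target
```

With the values of the example (root `/fs/esg/PMIP3`, version `20120322`, variable `tos`), the three returned strings correspond to the destination of the `mv` command and to the two arguments of the `ln -s` command.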

The upgrade command will move the files to the appropriate location. Note that we do not specify a version number, because the default (_upgrade date as a version number_) is fine

$ drs_tool upgrade -I /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -p output -e piControl -i P3INS 2>&1 | tee /fs/PMIP3/esg/logs/P3INS_piControl_upgrade.`date +"%Y%m%d_%H%M"`.log

# Command output
DRS Tree at /fs/esg/PMIP3
Upgrading pmip3.output.P3INS.P3-mod.piControl.mon.ocean.Omon.r1i1p1 to version 20120322 ... done 36

# If we run 'drs_tool list' again, we see that there is now a dataset
# with the version, and the number of files that were just moved
$ drs_tool list -I /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -p output -e piControl -i P3INS
DRS Tree at /fs/esg/PMIP3
pmip3.output.P3INS.P3-mod.piControl.mon.ocean.Omon.r1i1p1.v20120322  36:49242847488

We can optionally check by hand that the data files have been moved from the incoming directory

# Space used in the incoming directory before moving the files. And number of files
$ du -sh /fs/PMIP3/esg/incoming/P3INS
81G     /fs/PMIP3/esg/incoming/P3INS
$ find /fs/PMIP3/esg/incoming/P3INS -name '*.nc' | wc -l
66
$ find /fs/PMIP3/esg/incoming/P3INS/P3-mod/piControl -name '*.nc' | wc -l
36
# After the piControl files have been moved, we check that we indeed
# have 36 files less in the incoming directory
$ du -sh /fs/PMIP3/esg/incoming/P3INS
35G     /fs/PMIP3/esg/incoming/P3INS
$ find /fs/PMIP3/esg/incoming/P3INS -name '*.nc' | wc -l
30

Preparing a Gateway for receiving PMIP3 data

_... ask Stephen for details_

Creating the 'PMIP3' project

How? Where? (including a project description)

Creating a 'PMIP3' top level collection

... parent_id=pmip3_<data_node> ...

Configuring who can publish PMIP3 data to the Gateway

The OpenID that will be used for publishing the data from the _<data_node>_ Data Node to the Gateway has to be given a _Data Publisher_ role on the Gateway.

How?

When this is done, the _publishing_ user can check the membership by logging in to the Gateway, going to the Account tab and selecting the _List Current Memberships_ option. The displayed result should look like

Group | Description | Status | Special Roles
_<data_node>_-publisher | Group with permissions to publish to the CMIP5_<data_node> collection | Enrolled | Data Publisher

Setting up the Data Node for publishing PMIP3 data

Note: this part is largely based on the LUCID example

Thredds and catalog configuration

  • Create the directory that will hold the PMIP3 project catalogs at /esg/content/thredds/pmip3 and make sure the publishing user has write access to it

    # Need to be root for the following commands
    $ mkdir /esg/content/thredds/pmip3
    $ chown publishing-user:publishing-user /esg/content/thredds/pmip3

    # Later, if the ownership of this directory and its files goes back
    # to tomcat:tomcat, use the following to give it back to the publishing user
    $ chown -R publishing-user:publishing-user /esg/content/thredds/pmip3

  • Add a catalog reference to /esg/content/thredds/catalog.xml that points to the main location of the PMIP3 catalog. Check that _catalog.xml_ belongs to the _tomcat_ user and the _tomcat_ group

    # PMIP3 reference in /esg/content/thredds/catalog.xml
    <catalog ... > ...

    # Set the correct access rights (if needed)
    $ chown tomcat:tomcat /esg/content/thredds/catalog.xml

Model list configuration

Add a line about the current PMIP3 model (and the other models performing paleo experiments) to the models' table specified in esg.pmip3.ini file

Get the current PMIP3 models' list from PMIP3_models_table.txt (possibly out of date) or from JeanyvesPeterschmitt

# Location of the models' table (specified in esg.pmip3.ini)
initial_models_table = /esg/config/esgcet/esgcet_models_table.txt

# Line(s) added to the models' table
# ... as many lines as there are models involved in PMIP3, distributed from this data node
# ...... also add the models involved in CMIP5 in case they run PMIP3-specific experiments
pmip3 | P3-mod | URL_or_empty| PMIP3 famous institute

Creation of the PMIP3 handler

  • Unless the PMIP3 handler is added to the standard esgcet _egg_, make sure you back up the PMIP3 handler before updating a data node, and put it back in place after the update! Otherwise you will get an error message at the _esgscan_directory_ stage

  • You can get the current PMIP3 handler and __init__.py files from pmip3_handler.py and __init__.py (possibly out of date) or from JeanyvesPeterschmitt

    # Create a PMIP3 handler based on the existing CMIP5 handler
    $ cd /usr/local/cdat/lib/python2.6/site-packages/esgcet-2.10.1-py2.6.egg/esgcet/config
    $ cp ipcc5_handler.py pmip3_handler.py

    # Edit pmip3_handler.py to set it up for PMIP3
    #  * replace ipcc5 with pmip3 in most places, but beware: there is a cmip5_product that must remain unchanged!
    #  * replace 'monclim': '1 month' with 'monClim': '1 month'
    #  * add the references to the PMIP3 CMOR tables used for generating monClim data
    #    ('Aclim', 'LIclim', 'Lclim', 'OIclim' and 'Oclim')

    $ diff pmip3_handler.py ipcc5_handler.py
    1c1
    < "Handle PMIP3/IPCC5 data file metadata"
    ---
    > "Handle IPCC5 data file metadata"
    20c20
    <     'monClim': '1 month',
    ---
    >     'monclim': '1 month',
    40,41c40
    < cmorTables = ['3hr', '6hrLev', '6hrPlev', 'Amon', 'LImon', 'Lmon', 'OImon', 'Oclim', 'Omon', 'Oyr', 'aero', 'cf3hr', 'cfDay', 'cfMon', 'cfOff', 'cfSites', 'day', 'fx', 'grids', 'noTable',
    <               'Aclim', 'LIclim', 'Lclim', 'OIclim', 'Oclim']
    ---
    > cmorTables = ['3hr', '6hrLev', '6hrPlev', 'Amon', 'LImon', 'Lmon', 'OImon', 'Oclim', 'Omon', 'Oyr', 'aero', 'cf3hr', 'cfDay', 'cfMon', 'cfOff', 'cfSites', 'day', 'fx', 'grids', 'noTable']
    99c98
    < class PMIP3Handler(BasicHandler):
    ---
    > class IPCC5Handler(BasicHandler):
    142,143c141,142
    <         result =  (project_id[:5]=="PMIP3")
    <         message = "project_id should be 'PMIP3'"
    ---
    >         result =  (project_id[:5]=="CMIP5")
    >         message = "project_id should be 'CMIP5'"
    262c261
    <         drsid = 'pmip3.%s.%s.%s.%s.%s.%s.%s.%s'%(result['product'], result['institute'], result['model'], result['experiment'], result['time_frequency'], result['realm'], result['cmor_table'], result['ensemble'])
    ---
    >         drsid = 'cmip5.%s.%s.%s.%s.%s.%s.%s.%s'%(result['product'], result['institute'], result['model'], result['experiment'], result['time_frequency'], result['realm'], result['cmor_table'], result['ensemble'])
    

    # Add to __init__.py a reference to the PMIP3 handler so that it can be
    # correctly imported and used
    $ diff __init__.py __init__.py.bak
    15d14
    < from pmip3_handler import PMIP3Handler
    22d20
    <     'pmip3_builtin' : PMIP3Handler,

esg.ini configuration (esg.pmip3.ini)

  • Copy an existing valid esg.ini file (already set up for publishing to an existing gateway) and set it up for PMIP3. This PMIP3-specific esg.pmip3.ini file will later be used explicitly where needed, with the "_-i esg.pmip3.ini_" option.

    # Use an existing esg.ini as a template and make sure it is readable
    # by the user doing the publication and the tomcat group
    $ cd /esg/config/esgcet
    $ cp esg.ini esg.pmip3.ini

  • Notes about the [DEFAULT] section of _esg.pmip3.ini_

Don't remove anything from thredds_dataset_roots, just add what you need. If you do remove entries, the publisher will erase the catalogs of all the missing entries when the TDS is restarted.

[DEFAULT]
# Generate the checksums by default when creating
# the map files with esgpublish
checksum = md5sum | MD5

# All the projects listed in project_options  must be defined in
# a [section] of esg.pmip3.ini, otherwise esginitialize will generate an error
# If esg.pmip3.ini is used only for PMIP3... list only PMIP3 :-)
project_options =
        pmip3 |PMIP3| 1

thredds_dataset_roots =
     ... #don't delete anything you have had here (in other esg*.ini files)!
         # if you do those catalogs will get erased too!
        test_dataroot | /esg/data/test
         # We do not explicitly specify PMIP3 below
         # distributed PMIP3 data will be in /fs/esg/PMIP3, CMIP5 data in /fs/esg/CMIP5
        esg_dataroot | /fs/esg

# The creation of the thredds_root directory has been described above
thredds_root = /esg/content/thredds/pmip3
thredds_root_catalog_name = PMIP3 catalog
thredds_url = http://esgf-node.ipsl.fr/thredds/pmip3
  • Notes about the [project:pmip3] section of _esg.pmip3.ini_. This section must be updated each time we add new experiments (_experiment_options_) or new institutes (_institute_map_ and _institute_options_)

    [project:pmip3]

    # Define the categories to be used for this project:
    # name | category_type | is_mandatory | is_thredds_property | display_order
    categories =
        project | enum | true | true | 0
        experiment | enum | true | true | 1
        product | enum | true | true | 2
        model | enum | true | true | 3
        time_frequency | enum | true | true | 4
        realm | enum | true | true | 5
        cmor_table | enum | true | true | 6
        ensemble | string | true | true | 7
        institute | enum | true | true | 8
        forcing | string | false | true | 9
        title | string | false | true | 10
        creator | enum | false | false | 11
        publisher | enum | false | false | 12
        creation_time | string | false | true | 13
        format | fixed | false | true | 14
        source | text | false | false | 15
        drs_id | string | false | true | 16
        description | text | false | false | 99
    category_defaults =
        product | requested
    cmor_table_options = 6hrLev, 6hrPlev, Amon, LImon, Lmon, OImon, Omon, day, fx, Aclim, LIclim, Lclim, OIclim, Oclim
    dataset_id = pmip3.%(product)s.%(institute)s.%(model)s.%(experiment)s.%(time_frequency)s.%(realm)s.%(cmor_table)s.%(ensemble)s
    dataset_name_format = project=%(project_description)s, model=%(model_description)s, experiment=%(experiment_description)s, time_frequency=%(time_frequency)s, cmor_table=%(cmor_table)s, modeling realm=%(realm)s, ensemble=%(ensemble)s, version=%(version)s
    directory_format = /fs/esg/PMIP3/%(product)s/%(institute)s/%(model)s/%(experiment)s/%(time_frequency)s/%(realm)s/%(cmor_table)s/%(ensemble)s/%(version)s/%(variable)s
    experiment_options =
        pmip3 | 1pctCO2 | 1 percent per year CO2
        pmip3 | historical | historical
        pmip3 | lgm | last glacial maximum
        pmip3 | midHolocene | mid-Holocene
        pmip3 | past1000 | last millennium
        pmip3 | piControl | pre-industrial control
    handler = esgcet.config.pmip3_handler:PMIP3Handler
    institute_map = map(model : institute)
        P3-mod | P3INS
    institute_options = P3INS
    las_configure = true
    las_time_delta_map = map(time_frequency : las_time_delta)
        yr | 1 year
        mon | 1 month
        day | 1 day
        6hr | 6 hours
        monClim | 1 month
        fx | fixed
    maps = institute_map, las_time_delta_map

    # id used for publishing the PMIP3 data to the receiving gateway
    parent_id = pmip3_ipsl

    # All PMIP3 data goes to product 'output'
    product_options = output
    realm_options = atmos, ocean, land, landIce, seaIce, aerosol, atmosChem, ocnBgchem
    thredds_exclude_variables = a, a_bnds, alev1, alevel, alevhalf, alt40, b, b_bnds, basin, bnds, bounds_lat, bounds_lon, dbze, depth, depth0m, depth100m, depth_bnds, geo_region, height, height10m, height2m, lat, lat_bnds, latitude, latitude_bnds, layer, lev, lev_bnds, location, lon, lon_bnds, longitude, longitude_bnds, olayer100m, olevel, oline, p0, p220, p500, p560, p700, p840, plev, plev3, plev7, plev8, plev_bnds, plevs, pressure1, region, rho, scatratio, sdepth, sdepth1, sza5, tau, tau_bnds, time, time1, time2, time_bnds, vegtype
    time_frequency_options = yr, mon, day, 6hr, monClim, fx
    variable_per_file = true
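
The `dataset_id` and `directory_format` options use Python-style `%(name)s` placeholders. The sketch below only illustrates how such templates expand with Python's `%` operator; the field values are taken from the P3-mod example used earlier on this page, and how the publisher itself fills in the fields is not shown here.

```python
# Templates copied from the [project:pmip3] section
DATASET_ID = ("pmip3.%(product)s.%(institute)s.%(model)s.%(experiment)s."
              "%(time_frequency)s.%(realm)s.%(cmor_table)s.%(ensemble)s")
DIRECTORY_FORMAT = ("/fs/esg/PMIP3/%(product)s/%(institute)s/%(model)s/"
                    "%(experiment)s/%(time_frequency)s/%(realm)s/"
                    "%(cmor_table)s/%(ensemble)s/%(version)s/%(variable)s")

# Example field values (illustrative, based on the piControl example)
fields = {
    "product": "output",
    "institute": "P3INS",
    "model": "P3-mod",
    "experiment": "piControl",
    "time_frequency": "mon",
    "realm": "ocean",
    "cmor_table": "Omon",
    "ensemble": "r1i1p1",
    "version": "v20120322",
    "variable": "tos",
}

# Keys that a template does not reference are simply ignored
dataset_id = DATASET_ID % fields
directory = DIRECTORY_FORMAT % fields
print(dataset_id)
print(directory)
```

A directory that does not match `directory_format` cannot be mapped back to these fields, which is one reason for letting drs_tool build the hierarchy.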

Update the PMIP3 part of the node database

$ esginitialize -c -i /esg/config/esgcet/esg.pmip3.ini

INFO       2012-04-17 14:41:57,245 Upgrading schema to latest version.
INFO       2012-04-17 14:41:57,268 Initializing standard names ...
INFO       2012-04-17 14:42:00,781 Initializing projects, models, and experiments ...
...
INFO       2012-04-17 14:42:00,804 Adding model P3-mod for project pmip3
...

Specifying the access rights to the data

See ESGF_Access_Control

Publishing the data

At this stage, we consider that we have successfully used drs_tool to move the data we want to publish from its incoming location to its final resting place in /fs/esg/PMIP3

Generating the map files

We use esgscan_directory with the appropriate options to generate the map files.

esgscan_directory -i /esg/config/esgcet/esg.pmip3.ini --project pmip3 /prodigfs/esg/PMIP3/output/IPSL/IPSL-CM5A-LR/piControl/monClim/atmos/Aclim/r1i1p1/v20120417 > /home/esg-user/pmip3_publish/mapfiles/IPSL-CM5A-LR/piControl/pmip3.output.IPSL.IPSL-CM5A-LR.piControl.monClim.atmos.Aclim.r1i1p1.v20120417

INFO 2012-04-17 16:09:28,845 Running: md5sum /prodigfs/esg/PMIP3/output/IPSL/IPSL-CM5A-LR/piControl/monClim/atmos/Aclim/r1i1p1/v20120417/tas/tas_Aclim_IPSL-CM5A-LR_piControl_r1i1p1_180001-279912-clim.nc

pmip3.output.IPSL.IPSL-CM5A-LR.piControl.monClim.atmos.Aclim.r1i1p1.v20120417

pmip3.output.IPSL.IPSL-CM5A-LR.piControl.monClim.atmos.Aclim.r1i1p1 | /prodigfs/esg/PMIP3/output/IPSL/IPSL-CM5A-LR/piControl/monClim/atmos/Aclim/r1i1p1/v20120417/tas/tas_Aclim_IPSL-CM5A-LR_piControl_r1i1p1_180001-279912-clim.nc | 898444 | mod_time=1334407334.000000 | checksum=04d82758c12808a7ac5c57530b231e7c | checksum_type=MD5
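
A map file line like the one above can be split into its fields for a quick inspection. This is a minimal sketch based only on the layout of that line (dataset id, path, size, then `key=value` pairs separated by `|`); the function name `parse_mapfile_line` is hypothetical and this is not the parser used by esgpublish.

```python
def parse_mapfile_line(line):
    """Split one map file line into a dictionary of its fields."""
    parts = [part.strip() for part in line.split("|")]
    record = {
        "dataset_id": parts[0],
        "path": parts[1],
        "size": int(parts[2]),  # file size in bytes
    }
    # The remaining fields are key=value pairs
    # (mod_time, checksum and checksum_type in the example above).
    for part in parts[3:]:
        key, _sep, value = part.partition("=")
        record[key] = value
    return record
```

The `checksum` value can then be compared with the output of md5sum for the file given in `path`.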

Updating the Thredds catalog

esgpublish -i /esg/config/esgcet/esg.pmip3.ini --service fileservice --project pmip3 --new-version 20120417 --thredds --map ALL.MAPFILES.pmip3.IPSL.IPSL-CM5A-LR.piControl.v20120417 2>&1 | tee /home/esg-user/pmip3_publish/scripts/mapgen/OutScript/IPSL-CM5A-LR_piControl.v20120417.publish.`date +"%Y%m%d_%H%M"`.log

Publishing the data

$ mkdir /home/publishing-user/.globus
$ myproxy-logon -T -b -s openid_provider -l publish_user_login -o /home/publishing-user/.globus/certificate-file
Enter MyProxy pass phrase:
A credential has been received for user publish_user_login in /home/publishing-user/.globus/certificate-file.
Trust roots have been installed in /home/publishing-user/.globus/certificates/.

$ esgpublish -i /esg/config/esgcet/esg.pmip3.ini --project pmip3 --publish --noscan --map ALL.MAPFILES.pmip3.IPSL.IPSL-CM5A-LR.piControl.v20120417 2>&1 | tee /home/esg-user/pmip3_publish/scripts/mapgen/OutScript/IPSL-CM5A-LR_piControl.v20120417.publish_step2.`date +"%Y%m%d_%H%M"`.log
INFO 2012-04-17 17:32:44,648 Publishing: pmip3.output.IPSL.IPSL-CM5A-LR.piControl.monClim.atmos.Aclim.r1i1p1, parent = pmip3_ipsl
INFO 2012-04-17 17:32:48,079 Result: PROCESSING
INFO 2012-04-17 17:32:51,222 Result: PROCESSING
INFO 2012-04-17 17:32:54,366 Result: SUCCESSFUL

Publishing the data locally

The following can be run by the publishing (non-root) user

esgf-crawl -- http://esgf-node.ipsl.fr/thredds/pmip3/catalog.xml

Testing

Replicating the information to other Gateways

More?

  • PMIP3 data usage statistics?