-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
233 lines (171 loc) · 9.48 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
IVOA Registry-in-a-Box (RiaBox)
This package provides a simple implementation of a IVOA Publishing
Registry that is compliant with the harvesting portion IVOA Registry
Interfaces standard, v1.0
(http://www.ivoa.net/Documents/latest/RegistryInterfaces, section 3).
It has been adapted from the "OAI-PMH2 XMLFile data provider" Perl
library by Hussein Suleman, developed at Virginia Tech
(www.dlib.vt.edu) and, thus, is fully compliant with the OAI-PMH v2.0
standard (http://www.openarchive.org/).
-----------------------------------------------------------------
Summary of how it works
-----------------------------------------------------------------
It assumes that the records are stored as flat XML files in a
directory, organized into sub-directories according to OAI sets.
These records can be served directly in their native format or
transformed on the fly into another format using XSLT. It can be
configured completely with configuration files.
The typical practice is to store files natively in the VOResource
metadata v1.0 standard (www.ivoa.net/documents/latest/VOResource).
(Note, however, that some users of this library have hacked the code to
retrieve records directly from a local database.) The Dublin Core
format is required by the OAI standard; thus, a stylesheet to do the
conversion is found in the conf directory.
-----------------------------------------------------------------
Installation
-----------------------------------------------------------------
Summary of Installation:
1. Download xsltproc and install the xsltproc tool
2. Unpack the RofR distribution.
3. Install the cgi-bin directory into your web server
4. Configure and Run
Here are the details...
1. Download xsltproc and install the xsltproc tool
xlstproc is used to transform resource description records into other
formats, namely OAI-Dublin Core.
This package contains in the bin directory a precompiled version of
the xsltproc tool built for a RedHat Linux system, but most likely you
will have to rebuild this your system. To check, just execute
bin/xsltproc without arguments; if it spits out a help message, you
are probably good to go.
If you need to rebuild, download it from
http://xmlsoft.org/XSLT/xsltproc2.html and follow the instructions
found there.
It can be installed anywhere, but if you do not put it in the bin
directory, you will have to update the configuration file
accordingly (see step 3.).
2. Unpack the RofR distribution.
The RofR directory can be placed anywhere. Consider this location the
"installation directory".
3. Install the cgi-bin directory into your web server
The easiest way to do this is to place a link in your server's cgi-bin
directory that points to RiaBox's cgi-bin directory.
For example, if you installed RiaBox as /opt/apps/RiaBox and your web
server is installed in /opt/apps/httpd, then you might type:
cd /opt/apps/httpd/cgi-bin
ln -s /opt/apps/RiaBox/cgi-bin riabox
Then the OAI interface is accessible via something like:
http://yourserver.org/cgi-bin/riabox/oai.pl
4. Configure and Run
If you xsltproc is installed in the RiaBox/bin directory and your web
server has a link to RiaBox/cgi-bin (as described in step 3.), then you
are ready to run. Otherwise, you will need to consult the next
section on configuring the package.
-----------------------------------------------------------------
Configuration
-----------------------------------------------------------------
---++ Installing the CGI-BIN directory
The first thing the OAI script, cgi-bin/oai.pl, does is to change into
the directory where RiaBox is installed; this allows it to easily find
all of its various files. If you do not use the link trick, then you
may need to edit the oai.pl script.
In this case, you can update oai.pl:
o edit the "chdir" line: remove the "$FindBin::Bin/.." and enter
the path to the RiaBox directory into double quotes.
o if the conf directory is not found in the directory the script
changes into, edit the "my $OAI = ..." line by changing
"conf/config.xml" to the path to the configuration file.
If you have to edit this path, then it is likely that you will
have to similarly edit the paths found within this file.
---++ Editing the config.xml file
Tips:
o order of the parameters does not matter (but <metadata>
sub-parameters must be kept together).
o avoid extraneous spaces inside XML tags.
Here are the meanings of the various parameters:
<repositoryName> The title of the repository, exposed via the OAI
Identify verb.
<adminEmail> The administrator's email address; exposed via the OAI
Identify verb.
<recordLimit> The maximum number of records that will be
exported in one OAI call; if more are records are
available a resumption token is issued to the
user.
<datadir> the directory containing the VOResource
descriptions to expose.
<deletedRecord> The policy for handling deleted records; in the
IVOA context, this should be kept set to
persistant. See the OAI standard
(www.openarchives.org) for more details.
<filematch> A perl regular expression that XML files
containing VOResource descriptions must match in
order to be exported.
<confdir> The directory containing the configuration files,
setnames.xml and identity.xml.
<metadata> parameters for supporting different export formats
<prefix> the format identifier as accepted by the OAI
metadataPrefix parameter.
<namespace> the XML namespace for the format.
<schema> the location of the XML schema for the format.
This is needed by xsltproc.
<transform> the command for transforming VOResource records
into other formats. This command must accept a
VOResource record via standard input and send
the output format to standard output.
---++ Changing the Identity description
The OAI Identity operation will expose the VOResource record for this
Registry. This record is stored as a VOResource XML file called
identity.xml in the configuration directory (see <confdir>). Because
this description is exported with all the other resource descriptions
by other OAI operations, this file by default is a link that points to
the proper record in the data directory (see <datadir>). To change
this record just edit the file pointed to by this link.
You can added addition descriptions in any XML format; just call them
identity*.xml (e.g. identity2.xml) and place them in the conf
directory.
---++ Changing Set Descriptions and Membership
The OAI standard allows records to be organized into "sets" that allow
harvesters to retrieve only the portion of the records that falls into
specified set. RiaBox, for example, supports a special set mandated
by the IVOA Registry Interfaces standard called ivo_managed (see
conf/setnames-sample.xml for details). This package allows you to
configure the description and membership of sets; this is done via the
setnames.xml file and/or various _set_ files. The description is
subsequently exposed via the OAI ListSets operation.
First note that in order to define a set, there must be a directory in
the data directory (see <datadir>) whose name matches the set
identifier (i.e. that a harvester passes to an OAI operation via the
set parameter). If the directory doesn't exist, the set will not be
supported. That directory, though, can be empty. Subdirectories
define sub-sets, too. If no configuration file describing a set
exists, the OAI service will provide a trivial description. When a
client requests a particular set, the OAI service, by default, will
return all of the records found in the directory of the same name (and
those in its subdirectories, recursively).
A set can be more fully described either with a setnames.xml file in
the configuration directory (see <confdir>) or with a special file
called _set_ which resides in the set directory itself. Both use
essentially the same XML format for its data. If both exist, the
description in the setnames.xml file takes precedence.
The setnames.xml file contains a root element <setnames> which
contains inside it a <set> element for each set to be described. The
_set_ file contains just a single <set> element. Here are the
meanings of the elements it contains (in any order):
<spec> the identifier for the set (used by harvesters via the
set= parameter). _set_ files should *not* include
this element.
<name> a longer title for the set indicating the contents
<description> a longer description of the set and why it might be
used.
<includedIn> This adds the members of this set to the set named in
this element.
Notes:
o A <set> description in setnames.xml overrides the one found in
the set directory's _set_ file.
o Do include a <spec> element in a _set_ file. If it does include
one, it must match the set name for the directory it lives in or
it will otherwise be ignored.
o Start a _set_ file with a singe <set> element. It's actually
okay if it starts with a wrapping <setnames> element with a
single <set> child element; however, if multiple <set> elements
appear, you will probably get unexpected results.