<!DOCTYPE html>
<html>
<head>
<title>DWBP Glossary</title>
<meta charset='utf-8'>
<meta content="width=device-width,initial-scale=1" name="viewport">
<script src='http://www.w3.org/Tools/respec/respec-w3c-common'
async class='remove'></script>
<script class='remove'>
var respecConfig = {
specStatus: "WG-NOTE",
shortName: "dwbp-glossary",
editors: [
{ name: "Yaso",
url: "mailto:[email protected]?subject=dwbp-glossary",
company: "Nic.br",
companyURL: "http://w3c.br/" }
],
noRecTrack: true,
previousMaturity: null,
previousPublishDate: null,
wg: "Data on the Web Best Practices Working Group",
wgURI: "https://www.w3.org/2013/dwbp",
wgPublicList: "public-dwbp-wg",
wgPatentURI: "http://www.w3.org/2004/01/pp-impl/68239/status",
};
</script>
</head>
<body>
<section id='abstract'>
<p>
This document is the reference glossary used by the Data on the Web Best Practices Working Group.
</p>
</section>
<section class='informative'>
<h2>Introduction</h2>
<p>
The deliverables of the Data on the Web Best Practices Working Group include documents that aim to facilitate the work between data consumers and data publishers. To fulfill this mission, the WG decided to build a glossary that establishes a common set of terms shared by data consumers and data publishers.
</p>
<p>A mental model is listed to ensure that the scope of the glossary is delimited.</p>
</section>
<section id='sotd'>
</section>
<section>
<h3>Dataset</h3>
<p>
A <dfn id="dataset"><abbr title="Dataset">dataset</abbr></dfn> is defined as a collection of data, published or curated by a single agent, and available for access or download in one or more formats. A dataset does not have to be available as a downloadable file.
</p>
</section>
<section>
<h3>Citation</h3>
<p>
A <dfn id="Citation"><abbr title="Citation">Citation</abbr></dfn> may be either direct and explicit (as in the reference list of a journal article), indirect (e.g. a citation to a more recent paper by the same research group on the same topic), or implicit (e.g. as in artistic quotations or parodies, or in cases of plagiarism).
</p>
<p>
From: <a href="http://www.essepuntato.it/lode/http://purl.org/spar/cito">CiTO</a>
</p>
</section>
<section>
<h3>Data Consumer</h3>
<p>
For the purposes of this WG, a <dfn id="data_consumer"><abbr title="Data Consumer">Data Consumer</abbr></dfn> is a person or group accessing, using, and potentially performing post-processing steps on data.
</p>
<p>From: Strong, Diane M., Yang W. Lee, and Richard Y. Wang. "Data quality in context." Communications of the ACM 40.5 (1997): 103-110. </p>
</section>
<section>
<h3>Data format</h3>
<p>
<dfn id="data-format"><abbr title="Data Format">Data Format</abbr></dfn> is defined as a specific convention for data representation, i.e. the way that information is encoded and stored for use in a computer system, possibly constrained by a formal data type or set of standards.</p>
<p>From <a href="http://guide.dhcuration.org/representation/">DH Curation Guide</a>
</p>
</section>
<section>
<h3>Data producer</h3>
<p>
<dfn id="data-producer"><abbr title="Data Producer">Data Producer</abbr></dfn> is a person or group responsible for generating and maintaining data.
</p>
<p>From: Strong, Diane M., Yang W. Lee, and Richard Y. Wang. "Data quality in context." Communications of the ACM 40.5 (1997): 103-110. </p>
</section>
<section>
<h3>Data representation</h3>
<p>
<dfn id="data-representation"><abbr title="Data representation">Data representation</abbr></dfn> is any convention for the arrangement of symbols in such a way as to enable information to be encoded by a data producer and later decoded by data consumers.
</p>
<p>
From <a href="http://guide.dhcuration.org/representation/">DH Curation Guide</a>
</p>
</section>
<section>
<h3>Feedback</h3>
<p>
<dfn id="feedback"><abbr title="feedback">Feedback</abbr></dfn> is a forum used to collect messages posted by consumers about a particular topic. Messages can include replies to other consumers. Datetime stamps are associated with each message, and the messages can be associated with a person or submitted anonymously.
</p>
<p>
From: (1) <a href="http://rdfs.org/sioc/spec/#sec-modules-types">SIOC</a>, (2) <a href="http://www.w3.org/TR/annotation-model/#motivations">Annotation#Motivation</a>
</p>
<p>
To better understand why an annotation (see Annotation) was created, <a href="http://www.w3.org/TR/skos-reference/">SKOS</a> is used to show inter-related annotations between communities with more meaningful distinctions than a simple class/subclass tree.
</p>
</section>
<section>
<h3>Data Preservation</h3>
<p><dfn id="data-preservation"><abbr title="Data Preservation">Data Preservation</abbr></dfn> is defined by <a href="http://www.alliancepermanentaccess.org/index.php/consultancy/dpglossary/#Preservation">APA</a> as "The processes and operations in ensuring the technical and intellectual survival of objects through time". This is part of a data management plan <a href="http://guide.dhcuration.org/preservation/">focusing on preservation planning and metadata</a>. Whether it is worthwhile to put effort into preservation depends on the (future) value of the data, the resources available, and the opinion of the stakeholders (the designated community).</p>
</section>
<section>
<h3>Data Archiving</h3>
<p>
<dfn id="data-archiving"><abbr title="Data Archiving">Data Archiving</abbr></dfn> is the set of practices around the storage and monitoring of the state of digital material over the years. </p>
<p>
These tasks are the responsibility of a Trusted Digital Repository (TDR), sometimes also referred to as a <a href="http://tools.ietf.org/html/rfc4810">Long-Term Archive Service (LTA)</a>. Such services often follow the <a href="http://en.wikipedia.org/wiki/Open_Archival_Information_System">Open Archival Information System</a>, which defines the archival process in terms of ingest, monitoring and re-use of data.
</p>
</section>
<section>
<h3>File Format</h3>
<p>
<dfn id="file-format"><abbr title="File Format">File Format</abbr></dfn> is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free and may be either unpublished or open.
</p>
<p>Examples of file formats: <a href="https://en.wikipedia.org/wiki/Text_file#.TXT">txt</a>, <a href="https://en.wikipedia.org/wiki/Portable_Document_Format">pdf</a>, <a href="https://en.wikipedia.org/wiki/Postscript">ps</a>, <a href="https://en.wikipedia.org/wiki/Audio_Video_Interleave">avi</a>, <a href="https://en.wikipedia.org/wiki/GIF">gif</a> or <a href="https://en.wikipedia.org/wiki/JPEG">jpg</a>.</p>
<p>From <a href="http://en.wikipedia.org/wiki/File_format">Wikipedia</a></p>
</section>
<section>
<h3>Machine Readable Data</h3>
<p>
<dfn id="machine-readable"><abbr title="Machine Readable Data">Machine Readable Data</abbr></dfn> is data in a format that may be readily parsed by computer programs without access to proprietary libraries. For example, <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV</a> and the <a href="http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/#section-graph-syntax">RDF Turtle family for graphs</a> are machine readable, but <a href="http://www.data.gov/developers/blog/primer-machine-readability-online-documents-and-data">PDF</a> and <a href="https://en.wikipedia.org/wiki/JPEG">JPEG</a> are not.
</p>
<p> From <a href="http://www.w3.org/TR/ld-glossary/#vocabulary">Linked Data Glossary</a> </p>
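<p>
As an informal illustration of "readily parsed without proprietary libraries", the following sketch reads a small CSV string using only built-in JavaScript string operations. The sample data and the function name <code>parseSimpleCsv</code> are hypothetical, and the parser assumes there are no quoted fields or embedded commas; a real CSV reader would need to handle those cases.
</p>

```javascript
// Hypothetical sample data; the column names are illustrative only.
const csvText = "id,label\n1,alpha\n2,beta";

// Minimal sketch of a CSV reader.
// Assumption: no quoted fields, no embedded commas or newlines.
function parseSimpleCsv(text) {
  const [headerLine, ...rowLines] = text.trim().split("\n");
  const headers = headerLine.split(",");
  // Turn each row into an object keyed by the header names.
  return rowLines.map((line) => {
    const cells = line.split(",");
    return Object.fromEntries(headers.map((name, i) => [name, cells[i]]));
  });
}

const records = parseSimpleCsv(csvText);
console.log(records);
```

<p>
Every value is returned as a string; converting cells to numbers or dates is left to the consumer.
</p>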
</section>
<section>
<h3>Vocabulary</h3>
<p>
<dfn id="vocabulary"><abbr title="Vocabulary">Vocabulary</abbr></dfn> is a collection of "terms" for a particular purpose. Vocabularies can range from simple ones, such as the widely used <a href="">RDF Schema</a>, <a href="http://xmlns.com/foaf/spec/">FOAF</a> and <a href="https://en.wikipedia.org/wiki/Dublin_Core#Dublin_Core_Metadata_Element_Set_Version_1.1">Dublin Core Metadata Element Set</a>, to complex vocabularies with thousands of terms, such as those used in healthcare to describe symptoms, diseases and treatments. Vocabularies play a very important role in Linked Data, specifically to help with data integration. The use of this term overlaps with Ontology.</p>
<p>
From: <a href="http://www.w3.org/TR/ld-glossary/#vocabulary">Linked Data Glossary</a>
</p>
</section>
<section>
<h3>Structured data</h3>
<p>
<dfn id="structured-data"><abbr title="Structured Data">Structured Data</abbr></dfn> refers to data that conforms to a fixed schema. Relational databases and spreadsheets are examples of structured data.</p>
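<p>
One way to picture "conforms to a fixed schema" is a check that every record has exactly the expected fields with the expected types. The sketch below is only an illustration: the <code>schema</code> object, its field names, and the function <code>conformsToSchema</code> are hypothetical and are not part of any vocabulary referenced in this glossary.
</p>

```javascript
// Hypothetical fixed schema: field name mapped to the expected JavaScript type.
const schema = { id: "number", label: "string" };

// Returns true only if the record has exactly the schema's fields
// and each value has the declared type.
function conformsToSchema(record, expected) {
  const recordKeys = Object.keys(record).sort();
  const schemaKeys = Object.keys(expected).sort();
  if (recordKeys.join(",") !== schemaKeys.join(",")) {
    return false;
  }
  return schemaKeys.every((key) => typeof record[key] === expected[key]);
}

console.log(conformsToSchema({ id: 1, label: "alpha" }, schema)); // true
console.log(conformsToSchema({ id: "1", note: "x" }, schema)); // false
```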
</section>
</body>
</html>