The structure of a file containing ICRA labels
ICRA labels are held in a special file, usually called labels.rdf. This file is effectively broken down into sections to provide filters and other clients with the information they need. The best way to explain this is by examining the (fictitious) example below.
Section 1
|
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:label="http://www.w3.org/2004/12/q/contentlabel#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:icra="https://icra.org/rdfs/vocabularyv03#">
|
Section 2
|
<rdf:Description rdf:about="">
<dc:creator rdf:resource="https://icra.org" />
<label:authorityFor>https://icra.org/rdfs/vocabularyv03#
</label:authorityFor>
</rdf:Description>
|
Section 3
|
<label:Ruleset>
<label:hasHostRestrictions>
<label:Hosts>
<label:hostRestriction>example.org</label:hostRestriction>
<label:hostRestriction>example.com</label:hostRestriction>
</label:Hosts>
</label:hasHostRestrictions>
<label:hasDefaultLabel rdf:resource="#label_1" />
|
Section 4
|
<label:rules rdf:parseType="Collection">
<rdf:Description>
<label:hasURI>photography</label:hasURI>
<label:hasLabel rdf:resource="#label_2"/>
</rdf:Description>
<label:UnionOf>
<label:hasURI>guestbook</label:hasURI>
<label:hasURI>messages</label:hasURI>
<label:hasLabel rdf:resource="#label_3" />
</label:UnionOf>
</label:rules>
</label:Ruleset>
|
Section 5
|
<label:ContentLabel rdf:ID="label_1">
<rdfs:comment>Label for all/most of website</rdfs:comment>
<rdfs:label>No nudity, no sexual content, no violence, no
potentially offensive language, no potentially harmful
activities, no user-generated content</rdfs:label>
<icra:nz>1</icra:nz>
<icra:sz>1</icra:sz>
<icra:vz>1</icra:vz>
<icra:lz>1</icra:lz>
<icra:oz>1</icra:oz>
<icra:cz>1</icra:cz>
</label:ContentLabel>
<label:ContentLabel rdf:ID="label_2">
<rdfs:comment>Label for photography section</rdfs:comment>
<rdfs:label>Exposed breasts, Bare buttocks, No sexual
content, no violence, no potentially offensive language,
no potentially harmful activities, no user-generated
content, This material appears in an artistic
context</rdfs:label>
<icra:na>1</icra:na>
<icra:nb>1</icra:nb>
<icra:sz>1</icra:sz>
<icra:vz>1</icra:vz>
<icra:lz>1</icra:lz>
<icra:oz>1</icra:oz>
<icra:cz>1</icra:cz>
<label:hasModifier><icra:xa /></label:hasModifier>
</label:ContentLabel>
<label:ContentLabel rdf:ID="label_3">
<rdfs:comment>Label for guestbook and message board</rdfs:comment>
<rdfs:label>No nudity, no sexual content, no violence, no
potentially offensive language, no potentially harmful
activities, user-generated content
(moderated)</rdfs:label>
<icra:nz>1</icra:nz>
<icra:sz>1</icra:sz>
<icra:vz>1</icra:vz>
<icra:lz>1</icra:lz>
<icra:oz>1</icra:oz>
<icra:ca>1</icra:ca>
</label:ContentLabel>
</rdf:RDF>
|
Section 1
The first section declares information about how the data is encoded. The last item (xmlns:icra="https://icra.org/rdfs/vocabularyv03#"), for instance, declares that there are ICRA labels present. The other declarations refer to web standards and methods that may be used by any labelling scheme.
Tech note: the first two XML namespaces used are the standard declarations for RDF and RDF Schema. The "label" namespace is a schema for using RDF for content labelling. Although hosted on w3.org, this is not currently part of any W3C Recommendation.
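As a rough illustration of how a client might check these declarations, the sketch below lists the namespaces declared in the opening of the example file and tests whether the ICRA vocabulary is among them. It uses Python's standard xml.etree.ElementTree module; the function name declared_namespaces and the constant names are hypothetical, and a real filter may well work differently.

```python
import io
import xml.etree.ElementTree as ET

# Opening of the example file, closed immediately so that it parses on its own.
HEADER = """<?xml version="1.0"?>
<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
 xmlns:label="http://www.w3.org/2004/12/q/contentlabel#"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:icra="https://icra.org/rdfs/vocabularyv03#">
</rdf:RDF>"""

ICRA_NS = "https://icra.org/rdfs/vocabularyv03#"


def declared_namespaces(xml_text):
    """Return a {prefix: namespace URI} dict for the declarations in xml_text."""
    stream = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events carry a (prefix, uri) pair for each xmlns declaration.
    return {
        prefix: uri
        for _event, (prefix, uri) in ET.iterparse(stream, events=("start-ns",))
    }


namespaces = declared_namespaces(HEADER)
# True when the file declares the ICRA vocabulary under any prefix.
has_icra_vocabulary = ICRA_NS in namespaces.values()
```

Checking the namespace URI rather than the prefix is deliberate: the prefix "icra" is only a local abbreviation, whereas the URI identifies the vocabulary.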
Section 2
This short section declares that the labels were created by ICRA and that further information is available at icra.org.
Section 3
This section declares the websites for which the data is valid. In this instance, we have declared that the labels can be applied to both example.org and example.com. It also declares that the default content label for material on those hosts is label 1 (see Section 5).
Tech note: we actually specify a host rather than a domain, since this is generally what is required. Any and all subdomains of the declared host are within scope and may be matched by the rules that follow.
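The scoping rule in the tech note can be sketched as follows. This is a minimal illustration that assumes "subdomain" means the host name ends with a dot followed by the declared host; the function name host_in_scope is hypothetical and not part of any ICRA tool.

```python
def host_in_scope(host, restrictions):
    """Return True if host equals a declared host or is a subdomain of one."""
    host = host.lower().rstrip(".")
    for restriction in restrictions:
        restriction = restriction.lower()
        # Exact match, or a subdomain such as www.example.org.
        # The leading dot stops "badexample.org" matching "example.org".
        if host == restriction or host.endswith("." + restriction):
            return True
    return False


# Hosts taken from the hostRestriction elements in the example.
HOSTS = ["example.org", "example.com"]
```

With these hosts, host_in_scope("www.example.org", HOSTS) is true, while a lookalike such as "badexample.org" is out of scope.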
Section 4
We now declare the rules that determine where the default label should be overridden by another label. In this example, everything in the photography section of both example.com and example.org will be associated with label 2, and everything with either the word guestbook or the word messages in the URL will be associated with label 3. Otherwise, the default applies.
If a website doesn't have its own domain name but is part of a package provided by an ISP (something like www.isp.com/~username), then label 1 would only be associated with the user's own area, not the whole of the ISP's domain. This is why the first question asked in the label generator is "please enter the address of your homepage": the label generator works out what it needs from this to make sure the label only covers what is intended.
Tech note: matching is done using Perl 5 regular expressions. A rule that should apply to "all URLs ending in .jpg", for example, would appear as \.jpg$. If it is necessary to restrict the labels to a path on the given hosts, that path is given separately in a hasURI property of the rule set.
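The rule matching described in this section can be sketched as below. Two assumptions are made that the text does not confirm: rules are tried in document order with the first match winning, and Python's re module stands in for Perl 5 regular expressions (the two agree for simple patterns such as these). The names RULES, DEFAULT_LABEL and label_for are hypothetical.

```python
import re

# Default label and rules transcribed from Sections 3 and 4 of the example.
DEFAULT_LABEL = "label_1"
RULES = [
    (["photography"], "label_2"),            # single hasURI pattern
    (["guestbook", "messages"], "label_3"),  # UnionOf: either pattern matches
]


def label_for(url, rules=RULES, default=DEFAULT_LABEL):
    """Return the label of the first rule with a pattern found in url."""
    for patterns, label in rules:
        # re.search finds the pattern anywhere in the URL unless it is anchored.
        if any(re.search(pattern, url) for pattern in patterns):
            return label
    # No rule matched, so the default label applies.
    return default
```

For instance, label_for("http://example.com/guestbook/") gives "label_3", and a rule set containing only the pattern \.jpg$ would match "http://example.org/a.jpg" but not "http://example.org/a.jpg.html".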
Section 5
Finally we declare the labels themselves. In the example, label 2 declares that there are exposed breasts, bare buttocks, and that the material appears in an artistic context. Label 3 declares that there is moderated user-generated content and label 1 states "none of the above" in all categories of the ICRA vocabulary.
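To show how a client might read these declarations, the sketch below collects the ICRA descriptor element names found in each ContentLabel, including any inside hasModifier. It parses a shortened copy of labels 2 and 3 from the example with Python's standard xml.etree.ElementTree module; the function name read_labels is hypothetical, and the sketch does not interpret what each descriptor means.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
LABEL = "http://www.w3.org/2004/12/q/contentlabel#"
ICRA = "https://icra.org/rdfs/vocabularyv03#"

# Shortened copy of labels 2 and 3 from the example file.
SAMPLE = """<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:label="http://www.w3.org/2004/12/q/contentlabel#"
 xmlns:icra="https://icra.org/rdfs/vocabularyv03#">
<label:ContentLabel rdf:ID="label_2">
<icra:na>1</icra:na>
<icra:nb>1</icra:nb>
<icra:sz>1</icra:sz>
<label:hasModifier><icra:xa /></label:hasModifier>
</label:ContentLabel>
<label:ContentLabel rdf:ID="label_3">
<icra:nz>1</icra:nz>
<icra:ca>1</icra:ca>
</label:ContentLabel>
</rdf:RDF>"""


def read_labels(xml_text):
    """Map each ContentLabel's rdf:ID to its ICRA descriptor names, in order."""
    root = ET.fromstring(xml_text)
    labels = {}
    for node in root.findall(f"{{{LABEL}}}ContentLabel"):
        label_id = node.get(f"{{{RDF}}}ID")
        # iter() walks nested elements too, so modifiers are included.
        labels[label_id] = [
            element.tag.split("}", 1)[1]
            for element in node.iter()
            if element.tag.startswith(f"{{{ICRA}}}")
        ]
    return labels


labels = read_labels(SAMPLE)
```

The result is a plain dictionary keyed by label ID, which a filter could then compare against a user's settings.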