This document provides an overview of two RDF vocabularies which together enable the use of RDF for content labelling with the ICRA scheme. The two vocabularies are the generic Content Labelling Vocabulary which provides a mechanism for describing a content labelling system; and the ICRA content labelling vocabulary which describes the ICRA content labelling scheme.
The first two sections of this document describe how the generic vocabulary for defining labelling schemes is constructed and how to apply that vocabulary when defining a new labelling scheme. If you are only interested in applying the ICRA labelling scheme in RDF, you can skip the first part of this document and go straight to the section called “Using the ICRA labelling vocabulary”.
The Content Labelling Vocabulary (namespace https://icra.org/labellingv01/rdfs/#) provides a simple vocabulary for the description of a labelling scheme. A labelling scheme consists of one or more categories which group together related content descriptors and zero or more modifiers which provide further context for a label. Together, these are referred to in this document as the components of a labelling scheme.
In terms of the ICRA vocabulary, "Violence" is a category, "Deliberate injury to human beings" is a descriptor, and "Material appears in a sports context" is a modifier.
The Content Labelling Vocabulary defines a small set of classes and properties that are the basis for defining labelling schemes. A labelling scheme such as the ICRA scheme is created by defining instances of these classes and using the properties to define the relationships between those instances.
Full URI. https://icra.org/labellingv01/rdfs/#contentLabel
Description. An instance of this class is a single descriptive label for content which may be applied to one or more web resources.
Properties. The following properties may be specified for a contentLabel instance:
hasModifier specifies the modifiers for the content label.
Any subproperty of the descriptor property.
Full URI. https://icra.org/labellingv01/rdfs/#category
Description. A category is a grouping of related content descriptors. In ICRA, these groupings are thematic, but this is not a constraint on category instances in general.
Properties.
hasDescriptor specifies the descriptors which make up this category.
Full URI. https://icra.org/labellingv01/rdfs/#descriptor
Description. A descriptor defines a single form of content which may or may not be present in a resource. When labelling web resources, a descriptor is used as a property of the content label that it applies to. This means that a descriptor has a range of allowed values. The Content Labelling Vocabulary does not restrict the allowed range of values.
Full URI. https://icra.org/labellingv01/rdfs/#hasDescriptor
Description. This property connects a category to the descriptors that make up that category. It can be used by applications to quickly list what all the possible descriptors for a category are.
Full URI. https://icra.org/labellingv01/rdfs/#modifier
Description. A modifier provides context for a content label as a whole. Each content labelling scheme may define its own set of modifiers.
Full URI. https://icra.org/labellingv01/rdfs/#hasModifier
Description. This property connects an instance of the modifier class to the contentLabel that it modifies.
Full URI. https://icra.org/labellingv01/rdfs/#applicationRule
Description. An applicationRule defines a processing rule that should be used to determine which URL(s) a content label applies to. Such rules are required in situations where the web resources do not carry their own content labels. This class is used as a base class for a number of more specific types of rule. New labelling systems may choose to introduce new types of rules by subclassing from applicationRule. An applicationRule may contain one or more oneOf, allOf and not properties. If more than one such property is present, the applicationRule must be processed as the disjunction of the results of processing each property.
Properties.
oneOf adds a logical OR of the applicationRules that are the object of the property to the applicationRule that is the subject of the property.
allOf adds a logical AND of the applicationRules that are the object of the property to the applicationRule that is the subject of the property.
not adds a logical NOT of the applicationRule that is the object of the property to the applicationRule that is the subject of the property.
Full URI. https://icra.org/labellingv01/rdfs/#oneOf
Description. A property of an applicationRule that indicates that the rule applies when any one of the applicationRules that are the object of this property apply. This produces a disjunction (logical OR) of the object applicationRules.
Full URI. https://icra.org/labellingv01/rdfs/#allOf
Description. A property of an applicationRule that indicates that the rule applies only if all of the applicationRules that are the object of this statement apply. This produces a conjunction (logical AND) of the object applicationRules.
Full URI. https://icra.org/labellingv01/rdfs/#not
Description. A property of an applicationRule that indicates that the rule DOES NOT apply if the applicationRule that is the object of this statement applies.
Full URI. https://icra.org/labellingv01/rdfs/#beginsWith
Description. A subclass of applicationRule which matches against the start of a web resource's URI.
Properties.
value the substring to match at the start of the resource URI.
Full URI. https://icra.org/labellingv01/rdfs/#endsWith
Description. A subclass of applicationRule which matches against the end of a web resource's URI.
Properties.
value the substring to match at the end of the resource URI.
Full URI. https://icra.org/labellingv01/rdfs/#contains
Description. A subclass of applicationRule which matches against a substring of a web resource's URI.
Properties.
value the substring to match anywhere within the resource URI.
Full URI. https://icra.org/labellingv01/rdfs/#matches
Description. A subclass of applicationRule which matches against a web resource's URL using a Perl-style regular expression.
Properties.
value the regular expression to match against the resource URI.
Full URI. https://icra.org/labellingv01/rdfs/#value
Description. This is a property of the beginsWith, endsWith, contains, and matches application rules. It contains the substring or regular expression string that is used by the rule.
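The matching classes and the combining properties described above can be read as predicates over a resource URI. The following is a minimal Python sketch of that reading, not part of the vocabulary itself. The function names are hypothetical, Python's re module stands in for Perl-style regular expressions (the two dialects differ in places), matches is assumed to be unanchored, and the example.com host names are made up for illustration.

```python
import re
from typing import Callable

# A rule is modelled as a function from a resource URI to True/False.
Rule = Callable[[str], bool]

def begins_with(value: str) -> Rule:
    """beginsWith: the URI starts with the given substring."""
    return lambda uri: uri.startswith(value)

def ends_with(value: str) -> Rule:
    """endsWith: the URI ends with the given substring."""
    return lambda uri: uri.endswith(value)

def contains(value: str) -> Rule:
    """contains: the substring occurs anywhere in the URI."""
    return lambda uri: value in uri

def matches(pattern: str) -> Rule:
    """matches: the regular expression is found in the URI (assumed unanchored)."""
    compiled = re.compile(pattern)
    return lambda uri: compiled.search(uri) is not None

def one_of(*rules: Rule) -> Rule:
    """oneOf: logical OR of the object rules."""
    return lambda uri: any(rule(uri) for rule in rules)

def all_of(*rules: Rule) -> Rule:
    """allOf: logical AND of the object rules."""
    return lambda uri: all(rule(uri) for rule in rules)

def not_(rule: Rule) -> Rule:
    """not: the rule does not apply when the object rule applies."""
    return lambda uri: not rule(uri)

# Hypothetical rule: anything served from two hosts, except style sheets.
advert = all_of(
    one_of(begins_with("http://a.example.com"), begins_with("http://m.example.com")),
    not_(ends_with(".css")),
)

print(advert("http://a.example.com/banner.gif"))
print(advert("http://a.example.com/style.css"))
```

Modelling each rule as a closure keeps the combinators trivially composable; a real client would build these predicates from the parsed RDF rather than by hand.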
Full URI. https://icra.org/labellingv01/rdfs/#hasContentLabel
Description. This is a property that links an applicationRule to the contentLabel that labels resources that match the rule.
This section describes how to apply the Content Labelling Vocabulary to create a specific labelling scheme.
The Content Labelling Vocabulary makes use of basic RDF functionality for identifying, naming and describing the components that make up a labelling scheme.
Each component of the scheme is assigned an ID. This ID, when combined with the base URL of the RDF resource that describes the scheme, gives a unique URI identifier for the component. For example, the ICRA labelling scheme is defined by the resource with the base URI https://icra.org/ratingsv03/rdfs/# and the descriptor for "Unmoderated user-generated content" is currently defined in that resource with the ID "cb", so the full identifier for the descriptor is https://icra.org/ratingsv03/rdfs/#cb.
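As a small illustration of how an ID and a base URI combine, the sketch below joins the two in Python. The helper name component_uri is hypothetical, and the sketch assumes the fragment separator "#" is the only thing that may need to be added.

```python
def component_uri(base_uri: str, component_id: str) -> str:
    """Join a scheme's base URI and a component ID into a full identifier."""
    # Assumption: the ID becomes the fragment part of the base URI.
    base = base_uri if base_uri.endswith("#") else base_uri + "#"
    return base + component_id

ICRA_BASE = "https://icra.org/ratingsv03/rdfs/#"
print(component_uri(ICRA_BASE, "cb"))  # https://icra.org/ratingsv03/rdfs/#cb
```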
Each component should always be assigned a short name. This should be a name suitable for display in a user interface and should be consumer-oriented in nature. A good short name would be "Violence" or "Injury to animals"; a bad short name would be "vb" or "vz". RDF provides a mechanism for these short names in the rdfs:label property. A component can have any number of rdfs:label property values, although it is STRONGLY recommended that they be distinguished from each other using an xml:lang attribute and that there be only one label per language.
Example 1. An example of a short name
    <label:category rdf:ID="nx">
      <rdfs:label xml:lang="en">Nudity</rdfs:label>
      ...
    </label:category>
A component may also be assigned a longer description that might be displayed to a user as pop-up help text. For this description, use the RDF-defined rdfs:comment property. Again, multiple rdfs:comment labels may be provided, but should be distinguished by language using the xml:lang attribute.
Example 2. An example of a short description
    <label:category rdf:ID="nx">
      <rdfs:label xml:lang="en">Nudity</rdfs:label>
      <rdfs:comment xml:lang="en">
        Erections or female genitals in detail, Male genitals, Female genitals, Female breasts, Bare buttocks
      </rdfs:comment>
    </label:category>
Finally, a component may also contain a link to another web resource that provides a much more detailed description. For this link, use the RDF-defined rdfs:seeAlso property. The value of this property MUST be an RDF resource URI.
Example 3. An example of a reference to a longer description
    <label:category rdf:ID="nx">
      <rdfs:label xml:lang="en">Nudity</rdfs:label>
      <rdfs:comment xml:lang="en">
        Erections or female genitals in detail, Male genitals, Female genitals, Female breasts, Bare buttocks
      </rdfs:comment>
      <rdfs:seeAlso rdf:resource="https://icra.org/vocabulary/#hn"/>
    </label:category>
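To show how a client might use these naming properties, here is a hedged Python sketch that picks a short name by language and reads the rdfs:seeAlso link. It treats the file as plain XML using the standard library rather than as a full RDF graph. The helper short_name, the fallback-to-English behaviour, and the German label are assumptions made for illustration and do not come from the ICRA scheme.

```python
import xml.etree.ElementTree as ET
from typing import Optional

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "label": "https://icra.org/labellingv01/rdfs/#",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# The "de" label is hypothetical; it only demonstrates one label per language.
SNIPPET = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:label="https://icra.org/labellingv01/rdfs/#">
  <label:category rdf:ID="nx">
    <rdfs:label xml:lang="en">Nudity</rdfs:label>
    <rdfs:label xml:lang="de">Nacktheit</rdfs:label>
    <rdfs:seeAlso rdf:resource="https://icra.org/vocabulary/#hn"/>
  </label:category>
</rdf:RDF>
"""

def short_name(component: ET.Element, lang: str, fallback: str = "en") -> Optional[str]:
    """Return the rdfs:label for lang, or the fallback language if absent."""
    labels = {
        el.get(XML_LANG): (el.text or "").strip()
        for el in component.findall("rdfs:label", NS)
    }
    return labels.get(lang, labels.get(fallback))

root = ET.fromstring(SNIPPET)
category = root.find("label:category", NS)
see_also = category.find("rdfs:seeAlso", NS).get(RDF_RESOURCE)

print(short_name(category, "de"))  # Nacktheit
print(short_name(category, "it"))  # no Italian label, falls back to: Nudity
print(see_also)                    # https://icra.org/vocabulary/#hn
```

Keying the labels by language makes the "one label per language" recommendation visible: a second label in the same language would silently overwrite the first.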
Define Categories
Each category in a labelling scheme has the identifier, name and descriptions described above, and a list of the descriptors that are part of that category. The descriptors are linked to the category using the label:hasDescriptor property. As there is a list of descriptors, and we want the list to be closed (i.e. no more can be added to the list without modifying our vocabulary file), we specify the hasDescriptor property value as a collection.
Each descriptor must be defined as being a subPropertyOf the descriptor property from the Content Labelling Vocabulary.
Example 4. Example of a Category Definition
    <!-- Nudity category -->
    <label:category rdf:ID="nx">
      <rdfs:label xml:lang="en">Nudity</rdfs:label>
      <rdfs:comment xml:lang="en">
        Erections or female genitals in detail, Male genitals, Female genitals, Female breasts, Bare buttocks
      </rdfs:comment>
      <rdfs:seeAlso xml:lang="en" rdf:resource="https://icra.org/vocabulary/#hn"/>
      <label:hasDescriptor rdf:parseType="Collection">
        <rdf:Property rdf:ID="na">
          <rdfs:label>Exposed breasts</rdfs:label>
          <rdfs:comment>...</rdfs:comment>
          <rdfs:seeAlso rdf:resource="https://icra.org/ratingsv03/descriptions/#na"/>
          <rdfs:subPropertyOf rdf:resource="&label;descriptor"/>
        </rdf:Property>
        <rdf:Property rdf:ID="nb">
          <rdfs:label>Bare buttocks</rdfs:label>
          <rdfs:comment>...</rdfs:comment>
          <rdfs:seeAlso rdf:resource="https://icra.org/ratingsv03/descriptions/#na"/>
          <rdfs:subPropertyOf rdf:resource="&label;descriptor"/>
        </rdf:Property>
        ...
        <rdf:Property rdf:ID="nz">
          <rdfs:label>No nudity</rdfs:label>
          <rdfs:comment>...</rdfs:comment>
          <rdfs:seeAlso rdf:resource="https://icra.org/ratingsv03/descriptions/#na"/>
          <rdfs:subPropertyOf rdf:resource="&label;descriptor"/>
        </rdf:Property>
      </label:hasDescriptor>
    </label:category>
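The hasDescriptor property is described earlier as a way for applications to list the descriptors of a category quickly. The Python sketch below shows one way to do so, under stated assumptions: it walks the RDF/XML as ordinary XML with the standard library instead of using an RDF parser (which would expose the collection as an rdf:first/rdf:rest list), it spells out the descriptor URI because the &label; entity would need a DTD declaration, and the helper name descriptors_of is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "label": "https://icra.org/labellingv01/rdfs/#",
}
RDF_ID = "{%s}ID" % NS["rdf"]

# A cut-down version of the category definition, with two descriptors only.
SNIPPET = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:label="https://icra.org/labellingv01/rdfs/#">
  <label:category rdf:ID="nx">
    <rdfs:label xml:lang="en">Nudity</rdfs:label>
    <label:hasDescriptor rdf:parseType="Collection">
      <rdf:Property rdf:ID="na">
        <rdfs:label>Exposed breasts</rdfs:label>
        <rdfs:subPropertyOf rdf:resource="https://icra.org/labellingv01/rdfs/#descriptor"/>
      </rdf:Property>
      <rdf:Property rdf:ID="nb">
        <rdfs:label>Bare buttocks</rdfs:label>
        <rdfs:subPropertyOf rdf:resource="https://icra.org/labellingv01/rdfs/#descriptor"/>
      </rdf:Property>
    </label:hasDescriptor>
  </label:category>
</rdf:RDF>
"""

def descriptors_of(category: ET.Element) -> list:
    """Return (ID, short name) pairs for a category, in document order."""
    result = []
    for prop in category.findall("label:hasDescriptor/rdf:Property", NS):
        name = prop.findtext("rdfs:label", default="", namespaces=NS).strip()
        result.append((prop.get(RDF_ID), name))
    return result

root = ET.fromstring(SNIPPET)
category = root.find("label:category", NS)
for descriptor_id, name in descriptors_of(category):
    print(descriptor_id, name)
```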
Define Modifiers
Each modifier is simply defined as an instance of the label:modifier class. Modifiers should be defined with names and descriptions as described above, but there is no need to define any other properties for a modifier.
Define any new Application Rule types
A new type of application rule is simply defined by creating a new subclass of the applicationRule class. Each new rule introduced SHOULD be well-documented to enable its implementation in label processing clients.
Labelling scheme designers should consider carefully the need to introduce new application rules. Each new rule introduced can only be successfully used if all labelling-processing client applications implement the rule.
This section covers the creation of content labels using the ICRA-defined labelling scheme.
A content label consists of two principal components:
a list of descriptor properties, and
a list of modifiers.
Under the ICRA labelling scheme, each descriptor property MUST have a value that is a valid boolean as defined by W3C XML Schema Part 2: Datatypes (this allows the values '0', '1', 'false' and 'true') and there SHOULD be at least one descriptor from each ICRA-defined category. Descriptors are listed as RDF properties of the contentLabel resource.
Modifiers are simply present or not present in a content label and no value is associated with them. If a modifier is present in a label, then the modifier applies. Modifiers are added to a content label using the hasModifier property.
Example 6. A simple label with descriptors and modifiers
    <label:contentLabel rdf:ID="siteLabel">
      <i:cz>1</i:cz>
      <i:lz>1</i:lz>
      <i:nz>1</i:nz>
      <i:oz>1</i:oz>
      <i:sz>1</i:sz>
      <i:vz>0</i:vz>
      <label:hasModifier><i:s/></label:hasModifier>
    </label:contentLabel>
The label:contentLabel tag tells the processor that this is an RDF resource of type contentLabel (from the namespace https://icra.org/labellingv01/rdfs/#). The rdf:ID attribute on the label:contentLabel element enables this label to be selected from an XML file containing multiple labels; its value must be unique within the XML file.
The elements i:cz to i:vz specify the values of descriptors defined by the ICRA scheme. The namespace prefix i should be bound to the URI https://icra.org/ratingsv03/rdfs/#, so these XML tags actually represent the descriptors https://icra.org/ratingsv03/rdfs/#cz (chat), https://icra.org/ratingsv03/rdfs/#lz (language), etc.
The label:hasModifier element contains a single modifier represented by the i:s tag. This indicates that the modifier identified by the URI https://icra.org/ratingsv03/rdfs/#s (sports context) applies to this label.
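The reading of a label described above can be sketched in Python as follows. This is an illustrative, standard-library-only sketch rather than a reference implementation: it treats the label as plain XML, the helper names tag_to_uri and read_label are hypothetical, and it assumes that every element in the ICRA namespace directly under the label is a descriptor whose value must be one of the four xsd:boolean lexical forms.

```python
import xml.etree.ElementTree as ET

LABEL_NS = "https://icra.org/labellingv01/rdfs/#"
ICRA_NS = "https://icra.org/ratingsv03/rdfs/#"

# The four lexical forms allowed for xsd:boolean.
XSD_BOOLEAN = {"0": False, "false": False, "1": True, "true": True}

SNIPPET = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:label="https://icra.org/labellingv01/rdfs/#"
         xmlns:i="https://icra.org/ratingsv03/rdfs/#">
  <label:contentLabel rdf:ID="siteLabel">
    <i:cz>1</i:cz>
    <i:lz>1</i:lz>
    <i:nz>1</i:nz>
    <i:oz>1</i:oz>
    <i:sz>1</i:sz>
    <i:vz>0</i:vz>
    <label:hasModifier><i:s/></label:hasModifier>
  </label:contentLabel>
</rdf:RDF>
"""

def tag_to_uri(tag: str) -> str:
    """Turn ElementTree's '{namespace}local' form into a full URI."""
    namespace, _, local = tag[1:].partition("}")
    return namespace + local

def read_label(label_el: ET.Element):
    """Return (descriptor URI -> bool, list of modifier URIs) for one label."""
    descriptors = {}
    modifiers = []
    for child in label_el:
        if child.tag == "{%s}hasModifier" % LABEL_NS:
            # A modifier carries no value; its presence means it applies.
            modifiers.extend(tag_to_uri(m.tag) for m in child)
        elif child.tag.startswith("{%s}" % ICRA_NS):
            text = (child.text or "").strip()
            if text not in XSD_BOOLEAN:
                raise ValueError("not an xsd:boolean: %r" % text)
            descriptors[tag_to_uri(child.tag)] = XSD_BOOLEAN[text]
    return descriptors, modifiers

root = ET.fromstring(SNIPPET)
label_el = root.find("{%s}contentLabel" % LABEL_NS)
descriptors, modifiers = read_label(label_el)
print(descriptors[ICRA_NS + "vz"])  # False
print(modifiers)                    # ['https://icra.org/ratingsv03/rdfs/#s']
```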
There are a number of ways in which a label may be applied to a resource. In many cases it is simplest to include a reference to the label either within the resource itself or within the HTTP response header generated by the server when it provides the resource. However, in some cases it is either necessary or more efficient to define a catalog of labels and the rules that a client should use to apply those labels to resources. This can be achieved using the applicationRule construct and its subclasses in conjunction with the hasContentLabel property.
The following example shows how an application rule can contain the label to be applied to any resource that matches that rule.
Example 7. Example of a content label with application rules
    <label:beginsWith>
      <label:value>https://icra.org</label:value>
      <label:hasContentLabel>
        <label:contentLabel rdf:ID="siteLabel">
          <i:cz>1</i:cz>
          <i:lz>1</i:lz>
          <i:nz>1</i:nz>
          <i:oz>1</i:oz>
          <i:sz>1</i:sz>
          <i:vz>1</i:vz>
          <label:hasModifier><i:s/></label:hasModifier>
        </label:contentLabel>
      </label:hasContentLabel>
    </label:beginsWith>
It is also possible to separate the labels from the lists of resources to which the labels apply. This can be useful for two reasons. Firstly, it allows different people to be responsible for defining the labels and applying them to resources and allows the job of labelling a large site to be split amongst many people while still using a single consistent set of labels. Secondly, it is envisaged that label-processing clients will process a set of application rules in a top-to-bottom manner looking for the first rule that matches their situation. In such a case, separating the labels from the rules means that it should never be necessary to repeat a label just to ensure that rules are applied in the right order.
It is also possible to use the rule-combining properties oneOf, allOf, and not to specify logical combinations of resource matching rules that the processor will use to determine if a label applies to a particular resource. In the following example, the oneOf property is used to specify a list of matches to receive the "advert" label.
Example 8. Application rules separated from labels
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:label="https://icra.org/labellingv01/rdfs/#"
             xmlns:i="https://icra.org/ratingsv03/rdfs/#">
      <label:contentLabel rdf:ID="defaultContentPage">
        <i:cz>1</i:cz>
        <i:lz>1</i:lz>
        <i:nz>1</i:nz>
        <i:oz>1</i:oz>
        <i:sz>1</i:sz>
        <i:vz>1</i:vz>
      </label:contentLabel>
      <label:contentLabel rdf:ID="advert">
        <i:cz>1</i:cz>
        <i:lz>1</i:lz>
        <i:nz>1</i:nz>
        <i:oz>0</i:oz>
        <i:sz>1</i:sz>
        <i:vz>1</i:vz>
      </label:contentLabel>
      <label:applicationRule>
        <label:oneOf>
          <label:beginsWith>
            <label:value>http://a.tribalfusion.com</label:value>
          </label:beginsWith>
          <label:beginsWith>
            <label:value>http://m.tribalfusion.com</label:value>
          </label:beginsWith>
        </label:oneOf>
        <label:hasContentLabel rdf:resource="#advert"/>
      </label:applicationRule>
      <label:matches>
        <label:value>.*</label:value>
        <label:hasContentLabel rdf:resource="#defaultContentPage"/>
      </label:matches>
    </rdf:RDF>
If the labels were defined in another file (e.g. at http://www.example.com/labels.rdf), then the application rules file would appear as follows:
Example 9. Application rules in a separate file
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:label="https://icra.org/labellingv01/rdfs/#"
             xmlns:i="https://icra.org/ratingsv03/rdfs/#">
      <!-- These rules are processed first -->
      <label:applicationRule>
        <label:oneOf>
          <label:beginsWith>
            <label:value>http://a.tribalfusion.com</label:value>
          </label:beginsWith>
          <label:beginsWith>
            <label:value>http://m.tribalfusion.com</label:value>
          </label:beginsWith>
        </label:oneOf>
        <label:hasContentLabel rdf:resource="http://www.example.com/labels.rdf#advert"/>
      </label:applicationRule>
      <!-- Then everything else gets a default label -->
      <label:matches>
        <label:value>.*</label:value>
        <label:hasContentLabel rdf:resource="http://www.example.com/labels.rdf#defaultContentPage"/>
      </label:matches>
    </rdf:RDF>
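The top-to-bottom, first-match processing envisaged earlier in this section could look roughly like the Python sketch below. It is only an illustration under assumptions: the rules are written by hand as (predicate, label URI) pairs in file order rather than parsed from RDF, the function name label_for is hypothetical, and the catch-all rule is modelled as a predicate that always returns True.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each entry pairs a URI predicate with the URI of the label it assigns.
Rule = Tuple[Callable[[str], bool], str]

LABELS = "http://www.example.com/labels.rdf"
AD_HOSTS = ("http://a.tribalfusion.com", "http://m.tribalfusion.com")

RULES: Sequence[Rule] = [
    # Processed first: resources from either advertising host.
    (lambda uri: uri.startswith(AD_HOSTS), LABELS + "#advert"),
    # Then everything else gets the default label.
    (lambda uri: True, LABELS + "#defaultContentPage"),
]

def label_for(uri: str, rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the label URI of the first rule that applies, or None."""
    for applies, label_uri in rules:
        if applies(uri):
            return label_uri
    return None

print(label_for("http://a.tribalfusion.com/banner.gif"))
print(label_for("http://www.example.com/index.html"))
```

Because evaluation stops at the first match, the catch-all rule must come last; placing it first would give every resource the default label.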