
Ontology design patterns?

I've been playing around with Semantic Web stuff a lot lately, and while trying to build ontologies I came across problems that seemed like they could be solved with general design patterns. (Note: it gets a bit technical from here on, so you need to know the background.)

Range and domain

The first thing that bothered me when reading the RDF Schema specification was this: if you define a range for a property (that is: which classes do the resources occurring as objects of the property belong to?) or a domain for a property (that is: which classes do the resources occurring as subjects of the property belong to?), then every time you use that property the respective resources are of the given classes. This is fine and exactly what you want when you define a single class as range or domain. However, when you define several classes, the resources the property points to (or from) are of all of those classes at once. Let's take a look at an example:

ex:property	rdfs:range	ex:Class1 .
ex:property	rdfs:range	ex:Class2 .
ex:Resource1	ex:property	ex:Resource2 .

Now ex:Resource2 is of type ex:Class1 and of type ex:Class2 (it is perfectly valid in RDF for a resource to be an instance of several types/classes). Note: it is of those types by definition now, as if we had added the appropriate triples to state those types. But what we probably wanted to express was that ex:property is only allowed for instances of ex:Class1 or ex:Class2. Well, the first thing we have to say good-bye to is "not allowed to". There is nothing like that in RDF; no-one can prevent you from describing a resource the way you want to and publishing that description. The "and vs. or" thing, however, is what I want to solve with my first design pattern.
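To spell out what "by definition" means here: as far as I understand the RDFS entailment rules (the range rule, rdfs3 in the RDF Semantics document), a reasoner would add the two triples below on its own. The ex: namespace URI is just an assumption for illustration.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <http://example.org/> .   # assumed namespace, not from the original example

# Entailed by the two rdfs:range statements plus the single use of ex:property:
ex:Resource2	rdf:type	ex:Class1 .
ex:Resource2	rdf:type	ex:Class2 .
```

So the two range statements behave like an "and", not like the "or" we were after.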

One solution that comes to mind is using rdf:Alt since alternatives are a natural "or". That would look like this:

ex:property	rdfs:range	_:a .
_:a		rdf:type	rdf:Alt .
_:a		rdf:_1		ex:Class1 .
_:a		rdf:_2		ex:Class2 .

Two problems with this. First, rdfs:range isn't defined to be used like this in the RDF Schema specification, so software interpreting those triples would infer that _:a is of type rdfs:Class and that every resource ex:property points to is of type _:a. Second, the first member of an rdf:Alt is treated as the default choice, which isn't what we want: neither class should be preferred.

Instead the solution I'm proposing is a rather simple convention. Even though we don't want ex:Class1 and ex:Class2 to have anything in common, we will define them as sub-classes of a conventional "or" class, solely for the purpose of using it for the range (or domain).

ex:Class1	rdfs:subClassOf	ex:Class1OrClass2 .
ex:Class2	rdfs:subClassOf	ex:Class1OrClass2 .
ex:property	rdfs:range	ex:Class1OrClass2 .

That's it. As long as no-one adds other sub-classes to ex:Class1OrClass2 or uses it directly as a type for resources (you can't prevent people from publishing dodgy additional descriptions about your ontology anyway), all resources ex:property points to will be of type ex:Class1 or of type ex:Class2. Keep in mind that this is a convention: an RDFS reasoner will only infer the type ex:Class1OrClass2, not which of the two sub-classes applies. The cool thing is that our classes can still be defined as sub-classes of other classes in our model which make more sense (RDF Schema explicitly permits multiple inheritance). And we can still declare our two classes to be disjoint. The only downside I can see is that we are adding an artificial relation where there should be none (but in the end, everything is a resource anyway).
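Here is a small sketch of how the pattern sits next to the "real" class hierarchy. ex:Vehicle and ex:Building are hypothetical super-classes I made up for illustration, and the namespace URI is an assumption as well.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/> .   # assumed namespace

# The classes keep their meaningful super-classes (hypothetical names) ...
ex:Class1	rdfs:subClassOf	ex:Vehicle .
ex:Class2	rdfs:subClassOf	ex:Building .

# ... can still be declared disjoint ...
ex:Class1	owl:disjointWith	ex:Class2 .

# ... and additionally hang below the conventional "or" class used as range.
ex:Class1	rdfs:subClassOf	ex:Class1OrClass2 .
ex:Class2	rdfs:subClassOf	ex:Class1OrClass2 .
ex:property	rdfs:range	ex:Class1OrClass2 .

# For "ex:Resource1 ex:property ex:Resource2 ." a reasoner now only infers:
#   ex:Resource2 rdf:type ex:Class1OrClass2 .
```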

Separate interoperability file

The second thing that bothered me was that no-one seemed to be trying to stay within OWL-DL with their ontology. I wanted to, but I also wanted to import some other ontologies so that I could connect and map my concepts to them. I figured that the best way to do so was to define all my terms and concepts in one file and put all the interoperability stuff with other ontologies into another file. The xml:base for both files is the same, which basically means they define parts of the same ontology.

The main file contains the needed descriptions inside of owl:Ontology but no imports (we don't need to import the RDF (Schema) or OWL definitions). Everyone who wants to do reasoning over the core terms in my ontology but doesn't care about relations to terms of other ontologies (rare case, eh?) can use that file.

The interoperability file has all the imports inside of owl:Ontology and nothing else. It provides relations to terms from the imported ontologies as well as restrictions on them when they are used with my terms. It imports the main file, so everyone who needs the interoperability can use just this file. The definitions in the interoperability file itself also stay within OWL-DL, so in case OWL-DL-valid versions of the imported ontologies get released, I just have to adjust the URLs and everything becomes OWL-DL-valid.
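A rough sketch of the two-file layout, written in Turtle with @base standing in for xml:base. All URLs, the class <#Person> and the imported "other" ontology are hypothetical placeholders, not my actual files; the two parts below are meant to live in two separate files.

```turtle
# --- File 1: main file (core terms only, no imports) ---
@base         <http://example.org/myontology> .   # hypothetical shared base
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

<>	a	owl:Ontology .
<#Person>	a	owl:Class .

# --- File 2: interoperability file (same base, carries all the imports) ---
@base         <http://example.org/myontology> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix other: <http://example.com/other-ontology#> .   # hypothetical external ontology

<>	a	owl:Ontology ;
	owl:imports	<http://example.org/myontology-core.rdf> ,   # the main file (hypothetical URL)
			<http://example.com/other-ontology> .

# Mapping of one of my terms to a term of the imported ontology.
<#Person>	rdfs:subClassOf	other:Agent .
```

Because both files share the same base, <#Person> names the same resource in both, which is what makes them two parts of one ontology.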

Update (2007-10-17): On the IRC channel of the Semantic Web Interest Group someone pointed out the obvious way to realise an "or" for range and domain: use owl:unionOf. Doh! Well, I'm still learning this stuff. :-) And while reading the specs is one thing, you also need to work with them to get a clear picture of everything. But that's just normal, right?
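For completeness, this is roughly what the owl:unionOf version of the range example looks like. It is a minimal sketch; the ex: namespace URI is again an assumption, and to stay in OWL-DL you would also have to declare ex:property as an owl:ObjectProperty and the two classes as owl:Class.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/> .   # assumed namespace

# The range is an anonymous class defined as the union of the two classes.
ex:property	rdfs:range	[
	rdf:type	owl:Class ;
	owl:unionOf	( ex:Class1 ex:Class2 )
] .
```

Unlike the sub-class convention, this needs no helper class with an artificial name, and an OWL reasoner knows that every object of ex:property is an instance of at least one of the two classes.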



