WordNet links
Wednesday, October 8th, 2008 5:29 pm
Using the Legacy 2.0 - 3.0 sense maps provided by WordNet, I was able to obtain the corresponding version 2.0 synsetid for each 3.0 synset. These IDs map directly to the RDF representation of WordNet 2.0, which is available as a download from the W3C.
- 41 synsets in 3.0 combine several synsets from 2.0. Each of these is represented with an oguid:identical statement to every 2.0 URI it absorbed.
- 138 concepts in 3.0 were split from muddled 2.0 synsets. These were a problem because two Open GUIDs ended up with relations to the same 2.0 URI. Because of the transitive nature of oguid:identical, this declared the Open GUIDs to be identical. For most of these, I unlinked the Open GUID that least matched the 2.0 gloss. Eight of them were merged back together, because they only differed by being a slang or local term for the same concept.
- 2896 concepts were brand new. Since there is no RDF representation of 3.0, these are maintained with a fictional W3C URL, approximately what it would be if the W3C were to publish a new version.
- 37 links had a low map quality score and did not refer to the same concept. These were unlinked.
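The split case is the easiest one to get wrong, so here is a minimal sketch of how those collisions could be detected. It is not the code used for this mapping: the identifiers (`guid-A`, `wn20-100`, and so on) and the function names are made up for illustration, and it assumes the links are available as simple (Open GUID, 2.0 URI) pairs. It treats oguid:identical as a transitive, symmetric relation, groups everything into identity classes with a union-find, and reports any class containing more than one Open GUID.

```python
from collections import defaultdict

# Hypothetical (Open GUID, WordNet 2.0 URI) links; every value is invented.
links = [
    ("guid-A", "wn20-100"),
    ("guid-A", "wn20-101"),  # guid-A merges two 2.0 synsets (fine)
    ("guid-B", "wn20-200"),
    ("guid-C", "wn20-200"),  # guid-B and guid-C were split from one 2.0 synset
    ("guid-D", "wn20-300"),
]


def find(parent, node):
    """Return the representative of node's identity class (with path halving)."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def identity_classes(links):
    """Group all nodes that become identical once the links are applied transitively."""
    parent = {}
    for guid, uri in links:
        root_guid, root_uri = find(parent, guid), find(parent, uri)
        if root_guid != root_uri:
            parent[root_guid] = root_uri
    classes = defaultdict(set)
    for node in list(parent):
        classes[find(parent, node)].add(node)
    return list(classes.values())


def collisions(links):
    """Return the groups of Open GUIDs that the links accidentally declare identical."""
    guids = {guid for guid, _ in links}
    groups = (cls & guids for cls in identity_classes(links))
    return sorted(sorted(group) for group in groups if len(group) > 1)


print(collisions(links))
```

With the sample data, only `guid-B` and `guid-C` are reported. A merge such as `guid-A` is harmless because only one Open GUID ends up in its class. Removing one of the two colliding links, which corresponds to the unlinking described above, makes the report come back empty.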
A good example of the result of this mapping is a station wagon. It is the combination of two WordNet 2.0 synsets, and additionally has a merged relation. The term 'shooting brake' is its own synset in 3.0, but differs only as a regional word usage. The 3.0 mapping is maintained for completeness, but not hyperlinked because an RDF representation does not exist.