Friday, June 16, 2006

Anatomy of a Reference 

I have been musing about some of the unsolved issues with the semantic web: One is how best to use URIs to reference objects that are not accessible on the web, such as people, or cars. Another is how to make such URI references self-defining. A method is needed to publish URI references in a decentralized way without reliance on context. This would allow the semantic web to grow rapidly. But it turns out that these are hard problems. These are some of my musings as I consider what is involved.

Suppose I look at a blue glass on my kitchen counter. I perceive a pattern of light and recognize it. The pattern of impulses coming from my retina recalls the cause, a blue glass. I want it. You are standing near it. I search my internal lexicon for words associated with a blue glass. I find some words. I utter a request for you to hand me "that blue glass". You hear the sounds. They enter your brain and you associate patterns with them. The sound of those words brings to mind certain invariant forms, blue and glass. Then you scan the environment. You search your current perceptual input patterns. You select from all the patterns entering your brain those patterns which you associate with the invariant forms blue and glass. You see the blue glass on the counter. In the current context, you notice nothing else which is blue and a glass. You then correctly construe my words as referring to that blue glass.

Now something in the environment is the focus of both my attention and of your attention as well. Our brains are in some sort of synchrony. The universe is more ordered than it was. All due to the phase transition that occurred as a result of our shared semantics of my utterance of those sounds, "that blue glass", in a certain context, at a certain time.

What has happened here? First, I recalled words that I associate with an object that I perceive in my environment. Next, I spoke (or published) those words in a way that you could perceive them. I call this the "perception to word problem".

Then you, attending to my words, interpreted them as naming a class of things (drinking glasses) and an attribute of things (blue). Using these concepts, you recalled perceptions you have had before of that class of things (glasses) with that attribute (blue). You then tried to match those recalled perceptions with your current perceptions of the scene, looking for a match. I call this the "word to perception problem".

Combined with certain inferences about the scene, you then construed my intention to have you focus your attention on that blue glass. And by this, you interpreted my reference. You identified the thing, in the current context, that I was referring to. Actually, it can be far more complex than this, but let's leave it at that for now. Can we get software to do this? For some things this is easy; for others, seemingly impossible.
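To make the "word to perception problem" concrete, here is a minimal sketch in Python. It assumes a toy scene in which each percept has already been labeled with a class and a color, which is exactly the hard part that real perception would have to supply. The names `Percept` and `resolve` and the contents of the scene are my own illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Percept:
    """One object currently perceived in the scene (hypothetical toy model)."""
    kind: str      # class of thing, e.g. "glass"
    color: str     # attribute of the thing, e.g. "blue"
    location: str  # where it is in the scene


def resolve(kind, color, scene):
    """Interpret a reference such as "that blue glass" in a given scene.

    Returns the single matching percept, or None when the words match
    nothing or match more than one thing (the reference is ambiguous).
    """
    matches = [p for p in scene if p.kind == kind and p.color == color]
    return matches[0] if len(matches) == 1 else None


# Assumed scene: only one thing here is both blue and a glass.
scene = [
    Percept("glass", "blue", "counter"),
    Percept("bottle", "blue", "counter"),
    Percept("glass", "clear", "counter"),
]

referent = resolve("glass", "blue", scene)
```

With this scene the words pick out exactly one object. Add a second blue glass to the scene and `resolve` returns `None`: the same words no longer identify anything on their own, and more context is needed.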

Now I am interested in the semantic web and so I want to create a URI to refer to that blue glass, instead of using the English language phrase, "that blue glass". And I want it to do so in any context, at any time, so that it can be used from anywhere on the web. And I want the reference to be machine interpretable. Further, I want the URI to identify that blue glass, in addition to referring to it. How can I do that?

Clearly the URI of the photograph is not enough. For starters, you would need to know the full context. That includes the address of the house in which the image was taken, and the time as well. Even then, the image resource there contains many objects: a blue bottle, a clear glass, etc. I also need to describe, or point to, the specific object, that blue glass, that I want you to focus on.

Not only that, but I have several blue glasses just like that one. It is not easy for me to identify any one of them; they are all basically interchangeable to me. And the manufacturer may have made millions of them, all basically the same to the naked eye. How can I identify that one on the counter uniquely?

Upon closer examination, it appears that each glass is distinguished by small defects on the bottom. Of course, I have only looked at two or three instances out of possibly millions of copies. But the seemingly random nature of these defects makes it look like they will be unique to each glass. These then are the glasses' 'fingerprints', a unique characteristic that can be used to identify each glass. Now if each URI returns an image of its fingerprint, we can use that to identify each glass. This idea has many problems. For example, it would require expertise, probably human, and training to learn to read the marks and associate the proper URI with them. Its main advantage is that you don't have to alter the object itself in any way.

Perhaps it would be better to name them all, with a serial number, for example. But how about using the URI instead? I can attach the URI, to act as a name, or a serial number, to the bottom of each glass. Each of my glasses could have a different URI attached to it: thatBlueGlass-1, thatBlueGlass-2. Then each URI would identify one glass. We don't even need web access to use the URI for identification in this way. And it would solve the perception to word problem, since a human could pick up the glass and find the proper URI to use to discuss it. On the other end, given the URI, a human could search each object to find one that had that URI on the bottom, solving the word to perception problem. Finally, if the URI were encoded as a bar code, even a machine could do it. This is essentially what is done with package shipping labels, except here we are using a URI.
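The two directions of lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base address `http://example.org/glasses/` is made up, and the `Glass` class simply stands in for a physical glass with a label stuck to its bottom.

```python
BASE = "http://example.org/glasses/"  # hypothetical namespace, not a real site


def mint_uri(n):
    """Make a serial-number style URI such as .../thatBlueGlass-2."""
    return f"{BASE}thatBlueGlass-{n}"


class Glass:
    """Stand-in for a physical glass with a URI label on its bottom."""

    def __init__(self, label_uri):
        self.label_uri = label_uri


def uri_for(glass):
    """Perception to word: pick up the glass and read its label."""
    return glass.label_uri


def find_glass(uri, cupboard):
    """Word to perception: check each glass until the label matches."""
    for glass in cupboard:
        if glass.label_uri == uri:
            return glass
    return None  # no glass in this cupboard carries that URI


cupboard = [Glass(mint_uri(n)) for n in (1, 2, 3)]
```

Note that nothing here touches the network. The URI works purely as a name, and the search is just a scan of labels, which is what a bar code reader would do.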

Having a fingerprint, or a serial number, or a proper name, is still not enough. A fingerprint provides uniqueness, but it doesn't tell us what thing is unique. For example, I could use these URIs to identify the little manufacturer's mark that can be seen on the bottom of each glass, rather than to identify the whole drinking glass as a drinking glass. So it seems I would need to make explicit the class of things each identified instance belongs to. In words, "A blue drinking glass identified by URI:".
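One way to picture "making the class explicit" is as subject, predicate, object statements in the style of RDF. The sketch below uses plain Python tuples rather than any RDF library. The `rdf:type` property is the real RDF vocabulary term, but the `http://example.org/kitchen#` namespace and every name under it are hypothetical.

```python
# Real RDF vocabulary term for "is an instance of".
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

EX = "http://example.org/kitchen#"  # hypothetical namespace for this example

# Each statement is (subject, predicate, object).
triples = {
    (EX + "thatBlueGlass-1", RDF_TYPE, EX + "DrinkingGlass"),
    (EX + "thatBlueGlass-1", EX + "color", "blue"),
    (EX + "makersMark-1", RDF_TYPE, EX + "ManufacturersMark"),
    (EX + "makersMark-1", EX + "partOf", EX + "thatBlueGlass-1"),
}


def types_of(subject, graph):
    """Return the classes that a URI is explicitly declared to belong to."""
    return {o for s, p, o in graph if s == subject and p == RDF_TYPE}


glass_types = types_of(EX + "thatBlueGlass-1", triples)
mark_types = types_of(EX + "makersMark-1", triples)
```

Here the glass and the mark on its bottom get separate URIs, and the type statement is what tells a reader which of the two a given URI names.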

How about 'tags'? For example, here is a search of photos tagged with "blue" and "glass". There are 13,235 photos retrieved with this search. The URI of this collection is "".
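A tag search of this kind amounts to a set intersection, as the following rough sketch shows. The photo identifiers and tags are invented for illustration and do not come from any real photo site.

```python
# Hypothetical tag index: photo id -> set of tags applied to it.
photos = {
    "photo-a": {"blue", "glass", "kitchen"},
    "photo-b": {"blue", "sky"},
    "photo-c": {"glass", "blue", "bottle"},
    "photo-d": {"glass", "window"},
}


def search(tags, index):
    """Return the ids of all photos that carry every one of the given tags."""
    wanted = set(tags)
    return sorted(pid for pid, have in index.items() if wanted <= have)


results = search(["blue", "glass"], photos)
```

The search returns a collection, not an individual. Tags behave like the class and attribute words in "that blue glass": they narrow the field, but by themselves they do not single out one glass.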


Prior Art


"We are in the habit, I take it, of positing a single idea or form in the case of the various multiplicities to which we give the same name. Do you not understand?" "I do." "In the present case, then, let us take any multiplicity you please; for example, there are many couches and tables." "Of course." "But these utensils imply, I suppose, only two ideas or forms, one of a couch and one of a table." "Yes." "And are we not also in the habit of saying that the craftsman who produces either of them fixes his eyes on the idea or form, and so makes in the one case the couches and in the other the tables that we use, and similarly of other things? For surely no craftsman makes the idea itself. How could he?" "By no means."
Plato, Republic X, page 596a

David Hume

"This convention is not of the nature of a promise: For even promises themselves, as we shall see afterwards, arise from human conventions. It is only a general sense of common interest; which sense all the members of the society express to one another, and which induces them to regulate their conduct by certain rules. I observe, that it will be for my interest to leave another in the possession of his goods, provided he will act in the same manner with regard to me. He is sensible of a like interest in the regulation of his conduct. When this common sense of interest is mutually expressed, and is known to both, it produces a suitable resolution and behaviour. And this may properly enough be called a convention or agreement betwixt us, though without the interposition of a promise; since the actions of each of us have a reference to those of the other, and are performed upon the supposition, that something is to be performed on the other part. Two men, who pull the oars of a boat, do it by an agreement or convention, though they have never given promises to each other. Nor is the rule concerning the stability of possession the less derived from human conventions, that it arises gradually, and acquires force by a slow progression, and by our repeated experience of the inconveniences of transgressing it. On the contrary, this experience assures us still more, that the sense of interest has become common to all our fellows, and gives us a confidence of the future regularity of their conduct: And it is only on the expectation of this, that our moderation and abstinence are founded. In like manner are languages gradually established by human conventions without any promise. ..." - A Treatise of Human Nature, Chapter 74 by David Hume

John Locke

"...Semeiotike, or the doctrine of signs; the most usual whereof being words, it is aptly enough termed also Logike, logic: the business whereof is to consider the nature of signs, the mind makes use of for the understanding of things, or conveying its knowledge to others. For, since the things the mind contemplates are none of them, besides itself, present to the understanding, it is necessary that something else, as a sign or representation of the thing it considers, should be present to it: and these are ideas. And because the scene of ideas that makes one man's thoughts cannot be laid open to the immediate view of another, nor laid up anywhere but in the memory, a no very sure repository: therefore to communicate our thoughts to one another, as well as record them for our own use, signs of our ideas are also necessary: those which men have found most convenient, and therefore generally make use of, are articulate sounds. The consideration, then, of ideas and words as the great instruments of knowledge, makes no despicable part of their contemplation who would take a view of human knowledge in the whole extent of it. And perhaps if they were distinctly weighed, and duly considered, they would afford us another sort of logic and critic, than what we have been hitherto acquainted with." - AN ESSAY CONCERNING HUMAN UNDERSTANDING by John Locke 1690