Friday, January 27, 2006

Terminology services 

OCLC Research - Terminology services: "The goal of OCLC’s terminology services project is to make the concepts in knowledge organization schemes and the relationships within and between schemes more accessible to people and computer applications. For example, if a hypothetical Web service provided access to the equivalent and related terms for concepts in LC Subject Heading records, it would be possible for software developers to create tools to improve Web searching. To test this hypothesis, go to your favorite search engine and search for the word ‘vog.’ Then modify your search to include the words ‘vog volcanic smog volcanic gases.’ The latter search, which includes variant and related terms from LCSH, will likely produce higher quality search results for materials about volcanic smog.

Before a Web service can be developed for a given knowledge organization scheme, it’s often necessary to preprocess the concept data. For some schemes, it’s necessary to convert the data from word processing documents or HTML pages to structured data formats, such as the MARC 21 formats for authority or classification data, or SKOS Core, an RDF schema for thesauri and related knowledge organization schemes. Once a scheme is in a structured format, it can be enhanced in several ways. Typical enhancements include mappings to other schemes, the addition of persistent identifiers, and the addition of coding to track the origin of records and the sources of changes. The end products of these processes are XML files that can be used as the basis for terminology Web services.

Terminology services are Web services that involve various types of knowledge organization resources, including authority files, subject heading systems, thesauri, Web taxonomies and classification schemes. OCLC researchers have prototyped several experimental terminology services. One Web service that uses the Dewey Decimal Classification (DDC) provides access to the DDC summaries. The service returns captions, in four languages, for DDC numbers at the top three levels of the classification."
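As a rough illustration of the 'vog' example above, here is a minimal Python sketch of that kind of query expansion. The lookup table is a stand-in for the hypothetical terminology web service; the variant and related terms are the ones quoted from LCSH in the post.

```python
# Minimal sketch of LCSH-style query expansion. The lookup table stands in
# for a real terminology web service; terms come from the 'vog' example above.

VARIANT_TERMS = {
    "vog": ["volcanic smog", "volcanic gases"],  # equivalent/related LCSH terms
}

def expand_query(query: str) -> str:
    """Append known variant and related terms to each query word."""
    parts = []
    for word in query.split():
        parts.append(word)
        parts.extend(VARIANT_TERMS.get(word.lower(), []))
    return " ".join(parts)

print(expand_query("vog"))  # -> "vog volcanic smog volcanic gases"
```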
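The preprocessing step described above ends in structured SKOS data. A small sketch of what one such record might look like, built with the rdflib library, follows; the URIs and labels are hypothetical illustrations, not actual LCSH identifiers.

```python
# Hedged sketch of a SKOS concept record of the kind the preprocessing step
# might produce. URIs and labels are invented for illustration.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import RDF, SKOS

g = Graph()
concept = URIRef("http://example.org/scheme/vog")  # hypothetical persistent identifier

g.add((concept, RDF.type, SKOS.Concept))
g.add((concept, SKOS.prefLabel, Literal("Vog", lang="en")))
g.add((concept, SKOS.altLabel, Literal("Volcanic smog", lang="en")))
g.add((concept, SKOS.related, URIRef("http://example.org/scheme/volcanic-gases")))
# A mapping to a concept in another scheme, one of the typical enhancements:
g.add((concept, SKOS.closeMatch, URIRef("http://example.org/other-scheme/123")))

print(g.serialize(format="xml"))  # the XML end product described above
```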
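A client for the DDC summaries service might look something like the following sketch. The endpoint URL, response shape, and caption are assumptions for illustration; the actual OCLC prototype's interface is not documented in the post.

```python
# Hedged sketch of a client for a DDC-summaries service like the one described
# above. The endpoint URL and JSON response shape are assumptions.
import json
import urllib.request

def ddc_caption(number: str, lang: str = "en") -> str:
    """Fetch the caption for a top-three-level DDC number (hypothetical API)."""
    url = f"http://example.org/ddc/summaries/{number}?lang={lang}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["caption"]

# e.g. ddc_caption("500") might return a caption like "Natural sciences and mathematics"
```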

Thursday, January 26, 2006

Review of Ontological Semantics 

Review of Ontological Semantics: "Chapter 5 is a survey of formal ontology, an ancient subject that has become the latest hope for conferring interoperability on incompatible systems. Most of that work, however, has not been adapted to natural language processing. Work on lexical resources, such as WordNet, is only loosely connected to the work on formal ontology. Section 5.3 discusses 'the difficult and underexplored part of formal ontology, namely, the relations between ontology and natural language.' The most difficult problem, which the proponents of formal ontology fail to address, is the nature of ambiguities in natural languages. A good parser can enumerate syntactic ambiguities, and selectional constraints are usually sufficient to resolve most of them. The most serious ambiguities are the subtle variations in word senses (sometimes called microsenses), which change over time with variations in word usage or in the subject matter to which the words are applied. Such variations inevitably occur among independently developed systems and web sites, and attempts to legislate a single definition will not stop the growth and shift of meaning. From their long experience with NL processing, Nirenburg and Raskin probably have a deeper understanding of the nature of ambiguity than the proponents of the Semantic Web. Section 5.4 is a wish list of features from formal ontology that NL processors would need. Providing them is still a major research problem."
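To see the sense-variation problem the review raises, it helps to look at a lexical resource like WordNet (mentioned in the review) directly. A small sketch using NLTK, which ships a WordNet interface:

```python
# Small illustration of word-sense ambiguity using WordNet via NLTK.
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

# Even a common word carries many enumerable senses; the "microsenses" the
# review describes are finer-grained still and shift with usage over time.
for synset in wn.synsets("bank"):
    print(synset.name(), "-", synset.definition())
```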

OBSERVER: An Approach for Query Processing in Global Information Systems based on Interoperation across Pre-existing Ontologies 

OBSERVER: An Approach for Query Processing in Global Information Systems based on Interoperation across Pre-existing Ontologies (PDF)
"Abstract
There has been an explosion in the types, availability and volume of data accessible in an information system, thanks to the World Wide Web (the Web) and related inter-networking technologies. In this environment, there is a critical need to replace or complement earlier database integration approaches and current browsing and keyword-based techniques with concept-based approaches. Ontologies are increasingly becoming accepted as an important part of any concept or semantics based solution, and there is increasing realization that any viable solution will need to support multiple ontologies that may be independently developed and managed. In particular, we consider the use of concepts from pre-existing real world domain ontologies for describing the content of the underlying data repositories. The most challenging issue in this approach is that of vocabulary sharing, which involves dealing with the use of different terms or concepts to describe similar information. In this paper, we describe the architecture, design and implementation of the OBSERVER system. Brokering across the domain ontologies is enabled by representing and utilizing interontology relationships such as (but not limited to) synonyms, hyponyms and hypernyms across terms in different ontologies. User queries are rewritten by using these relationships to obtain translations across ontologies. Well-established metrics like precision and recall based on the extensions underlying the concepts are used to estimate the loss of information, if any."
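A toy Python sketch of the query-rewriting idea in the abstract: terms from the user's ontology are translated into another ontology's vocabulary via declared interontology relationships, and the relationships used hint at any loss of information. The mapping table here is invented for illustration; OBSERVER's actual relationships and loss estimation are far richer.

```python
# Hedged sketch of OBSERVER-style query rewriting across ontologies.
# The interontology relationships below are hypothetical examples.

# term in the user's ontology -> (relationship, term in the target ontology)
INTERONTOLOGY = {
    "automobile": ("synonym", "car"),
    "sedan": ("hypernym", "car"),  # translate via a broader concept
}

def rewrite_query(terms: list[str]) -> tuple[list[str], list[str]]:
    """Translate query terms into the target ontology's vocabulary.

    Returns the rewritten terms and the relationships used, so a caller
    could estimate loss of information (e.g. a hypernym substitution
    broadens the query, trading precision for recall).
    """
    rewritten, relations = [], []
    for term in terms:
        rel, target = INTERONTOLOGY.get(term, ("exact", term))
        rewritten.append(target)
        relations.append(rel)
    return rewritten, relations

print(rewrite_query(["automobile", "sedan"]))
```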

The Chatty Web: Emergent Semantics Through Gossiping 

The Chatty Web: Emergent Semantics Through Gossiping: "Abstract

This paper describes a novel approach for obtaining semantic interoperability among data sources in a bottom-up, semi-automatic manner without relying on pre-existing, global semantic models. We assume that large amounts of data exist that have been organized and annotated according to local schemas. Seeing semantics as a form of agreement, our approach enables the participating data sources to incrementally develop global agreement in an evolutionary and completely decentralized process that solely relies on pair-wise, local interactions: Participants provide translations between schemas they are interested in and can learn about other translations by routing queries (gossiping). To support the participants in assessing the semantic quality of the achieved agreements we develop a formal framework that takes into account both syntactic and semantic criteria. The assessment process is incremental and the quality ratings are adjusted along with the operation of the system. Ultimately, this process results in global agreement, i.e., the semantics that all participants understand. We discuss strategies to efficiently find translations and provide results from a case study to justify our claims. Our approach applies to any system which provides a communication infrastructure (existing websites or databases, decentralized systems, P2P systems) and offers the opportunity to study semantic interoperability as a global phenomenon in a network of information sharing parties."
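The mechanics of gossiping can be sketched in a few lines of Python: pair-wise translations between schemas are composed along a query route, and a simple syntactic quality score measures how much of the source schema survives the composed mapping. The schemas and mappings below are invented; the paper's formal framework also weighs semantic criteria and adjusts ratings as the system runs.

```python
# Hedged sketch of composing pair-wise schema translations learned by
# gossiping, with a syntactic quality score. Schemas are invented examples.

# Each translation maps attribute names in one peer's schema to another's;
# attributes with no counterpart are dropped (a lossy translation).
A_TO_B = {"author": "creator", "title": "name"}
B_TO_C = {"creator": "writer"}  # "name" has no counterpart in C

def compose(t1: dict[str, str], t2: dict[str, str]) -> dict[str, str]:
    """Compose two translations along a gossiping route."""
    return {src: t2[mid] for src, mid in t1.items() if mid in t2}

def syntactic_quality(t: dict[str, str], schema_size: int) -> float:
    """Fraction of the source schema preserved by a translation."""
    return len(t) / schema_size

a_to_c = compose(A_TO_B, B_TO_C)
print(a_to_c)                        # {'author': 'writer'}
print(syntactic_quality(a_to_c, 2))  # 0.5: half the attributes survived
```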
