Saturday, September 11, 2004

A survey of XML standards: Part 4 

A survey of XML standards: Part 4"The world of XML is vast and growing, with a huge variety of standards and technologies that interact in complex ways. It can be difficult for beginners to navigate the most important aspects of XML, and for users to keep track of new entries and changes in the space. XML is a basic syntax upon which you develop local and global vocabularies. Uche Ogbuji has presented the most important standards relating to XML in three in-depth articles. In this fourth article, he provides a detailed cross-reference of all the covered standards."

Swoogle 

Swoogle"Swoogle provides data service that discovers, digests, analyzes and indexes semantic web documents as well as portal service for the semantic web researchers. It is carried out by the ebiquity research group at UMBC"

Webware for Python 

Webware for Python

Friday, September 10, 2004

Gmane -- Mail To News And Back Again 

Gmane -- Mail To News And Back Again

"Free software is mainly developed on mailing lists. Mailing lists have many advantages over other forms of communication, but they have two weaknesses: It's difficult to follow discussions in a sensible way, and mailing list archives (when they exist) have a tendency to disappear over time.

Several mailing list archives exist, but these are all hidden under a web interface. Reading mail that way is not convenient. Reading mail as if it were news is convenient.

This is what Gmane offers. Mailing lists are funneled into news groups. This isn't a new idea; several mail-to-news gateways exist. What's new with Gmane is that no messages are ever expired from the server, and the gateway is bidirectional. You can post to some of these mailing lists without being subscribed to them yourself."
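Since Gmane serves its groups over plain NNTP, any newsreader works, and so will a few lines of Python's standard-library nntplib. A minimal sketch, assuming Gmane's public server name (news.gmane.org) and an illustrative group:

from nntplib import NNTP

server = NNTP('news.gmane.org')  # Gmane's public NNTP gateway
resp, count, first, last, name = server.group('gmane.discuss')
print('%s carries %s articles (%s-%s)' % (name, count, first, last))

# Fetch the headers of the most recent article.
resp, info = server.head(last)
for line in info.lines:
    print(line.decode('utf-8', 'replace'))
server.quit()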



Named Graphs API for Jena 

"The Named Graphs API for Jena (NG4J) is an extension to the Jena Semantic Web framework for parsing, manipulating and serializing sets of Named Graphs.

This is just a preview. NG4J V0.1 will be released 09/15/04.

The features of NG4J V0.1 are:
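NG4J itself is a Java library, but the data structure at its heart is easy to picture: a named-graph store is simply a mapping from graph URIs to sets of triples, queried as quads. A conceptual sketch in Python (not the NG4J API; all URIs illustrative):

named_graphs = {}  # graph name (URI) -> set of (subject, predicate, object)

def add(graph_name, s, p, o):
    # Add a triple to the graph identified by graph_name.
    named_graphs.setdefault(graph_name, set()).add((s, p, o))

def quads():
    # Iterate over the whole store as (graph, subject, predicate, object).
    for name, triples in named_graphs.items():
        for s, p, o in triples:
            yield (name, s, p, o)

add('http://example.org/graph1',
    'http://example.org/alice', 'http://xmlns.com/foaf/0.1/name', 'Alice')
for quad in quads():
    print(quad)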

Thursday, September 09, 2004

SourceForge.net: Project Info - Open Workbench 

SourceForge.net: Project Info - Open Workbench

HelpOnAccessControlLists - MoinMaster 

HelpOnAccessControlLists - MoinMaster is a page on my Wiki that will let me control access to pages on my public web site. I can limit access to only those people I provide with a username and password.
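In MoinMoin, an ACL is just a processing instruction on the first line of a page; for example (the group name and rights are illustrative):

#acl TrustedGroup:read,write All:

would let members of TrustedGroup read and edit the page while denying everyone else any access at all.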

Open Lexicon Interchange Format (OLIF) 

Open Lexicon Interchange Format (OLIF)
What is OLIF?
OLIF, the Open Lexicon Interchange Format, is a user-friendly vehicle for exchanging terminological and lexical data.

Why choose OLIF?
Designed for users of language technology, OLIF is an open, XML-compliant standard that can streamline the exchange of terminological and lexical data. With its flexible design and representative array of terminological and linguistic features, OLIF can help the user address language data management needs ranging from basic terminology exchange to managing lexicons for natural language processing (NLP) systems, such as machine translation.

How was OLIF designed? Who supports OLIF?
OLIF was designed by members of the OLIF Consortium, an organization of major NLP technology suppliers, corporate users of NLP, and research institutions. Headed by SAP, the OLIF Consortium designed and released the official version 2 of OLIF and provides support for its implementation by users worldwide.


Experiments in Automatic Word Class and Word Sense Identification for Information Retrieval 

Experiments in Automatic Word Class and Word Sense Identification for Information Retrieval
"Abstract

Automatic identification of related words and automatic detection of word senses are two long-standing goals of researchers in natural language processing. Word class information and word sense identification may enhance the performance of information retrieval systems. Large online corpora and increased computational capabilities make new techniques based on corpus linguistics feasible. Corpus-based analysis is especially needed for corpora from specialized fields for which no electronic dictionaries or thesauri exist. The methods described here use a combination of mutual information and word context to establish word similarities. Then, unsupervised classification is done using clustering in the word space, identifying word classes without pretagging. We also describe an extension of the method to handle the difficult problems of disambiguation and of determining part-of-speech and semantic information for low-frequency words. The method is powerful enough to produce high-quality results on a small corpus of 200,000 words from abstracts in a field of molecular biology."
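The mutual-information step is simple to sketch: count how often two words co-occur within a small window and compare that with what chance predicts. A toy version in Python (corpus and window size are illustrative, not the paper's actual setup):

import math
from collections import Counter

corpus = 'the cell membrane binds the protein and the protein folds'.split()
WINDOW = 2  # words to the right that count as co-occurrence

word_freq = Counter(corpus)
pair_freq = Counter()
for i, w in enumerate(corpus):
    for v in corpus[i + 1: i + 1 + WINDOW]:
        pair_freq[tuple(sorted((w, v)))] += 1

total = sum(word_freq.values())
pair_total = sum(pair_freq.values())

def pmi(a, b):
    # log2 of p(a,b) / (p(a) * p(b)), estimated from raw counts
    p_ab = pair_freq[tuple(sorted((a, b)))] / pair_total
    return math.log(p_ab / ((word_freq[a] / total) * (word_freq[b] / total)), 2)

print(pmi('protein', 'the'))

Words whose PMI profiles look alike across the vocabulary are candidates for the same class, which is where the clustering step takes over.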

Using Third-Party Drivers with WebLogic Server 

Using Third-Party Drivers with WebLogic Server: http://otn.oracle.com/software/content.html

Stephen's Web Administration ~ Search Results 

Stephen's Web Administration ~ Search Results has some interesting ideas about Google.

Theoretical Perspective for Link Counting 

Theoretical Perspective for Link Counting
"The web site of the book, "Link Analysis: An Information Science Approach" by Mike Thelwall, to be published by Academic Press in 2005. The web site will be finished by the time the book is in print. Here are some of the web site contents: Updated URLs, Instructions for link analysis and related techniques (mostly in Parts IV and V), Links to relevant resources and additional information, An expanded glossary"

The nature of meaning in the age of Google. Google, Indexing, Web, Meaning 

The nature of meaning in the age of Google. Google, Indexing, Web, Meaning
"Abstract

The culture of lay indexing has been created by the aggregation strategy employed by Web search engines such as Google. Meaning is constructed in this culture by harvesting semantic content from Web pages and using hyperlinks as a plebiscite for the most important Web pages. The characteristic tension of the culture of lay indexing is between genuine information and spam. Google's success requires maintaining the secrecy of its parsing algorithm despite the efforts of Web authors to gain advantage over the Googlebot. Legacy methods of asserting meaning such as the META keywords tag and Dublin Core are inappropriate in the lawless meaning space of the open Web. A writing guide is urged as a necessary aid for Web authors who must balance enhancing expression versus the use of technologies that limit the aggregation of their work."
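The "plebiscite" is link analysis of the PageRank family: a page is important when important pages link to it. A toy power-iteration sketch in Python over a hand-made three-page web (0.85 is the commonly cited damping factor; Google's production algorithm is, as the abstract notes, secret):

links = {'a': ['b', 'c'], 'b': ['c'], 'c': ['a']}  # page -> pages it links to
damping = 0.85
rank = {p: 1.0 / len(links) for p in links}

for _ in range(50):  # power iteration to a fixed point
    new = {p: (1 - damping) / len(links) for p in links}
    for p, outs in links.items():
        for q in outs:
            new[q] += damping * rank[p] / len(outs)
    rank = new

for p in sorted(rank, key=rank.get, reverse=True):
    print(p, round(rank[p], 4))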

But especially important, consider this quote, "Google's continued success depends on its ability to collect unaffected Web content, which means that it must avoid the single individual's assertion of meaning. This strategy implies that any metadata scheme for the Web that promotes the meaning assertion of a single Web author (i.e., My Web page means this) will be avoided by aggregators. The strategy of aggregation, the enlistment of Web authors as lay indexers, and the temptation of bad faith points to the importance of maintaining the ignorance of lay indexers."

This makes Google the worst enemy of the Semantic Web. It implies that Google's success depends on ignoring the metadata added by web authors to their own pages.

Wednesday, September 08, 2004

The Stony Brook Algorithm Repository 

The Stony Brook Algorithm Repository

Enterprise 

Enterprise
"OVERVIEW

Budgeted at over £2.6 million, the Enterprise project is the UK government's major initiative to promote the use of knowledge-based systems in enterprise modelling, aiming to support organisations effectively in the Management of Change. The project focused on management innovation and the strategic use of IT to help manage change. It supports the use of enterprise modelling methods which capture various aspects of how a business works and how it is organised. The aim of enterprise modelling is to obtain an enterprise-wide view of an organisation which can then be used as a basis for taking decisions. During the Enterprise project, the Enterprise Toolset was developed. The Toolset uses executable process models to help users to perform their tasks. It is implemented using an agent-based architecture to integrate off-the-shelf tools in a plug-and-play style. The approach of the Enterprise project addresses the key problems of communication, process consistency, impacts of change, IT systems, and responsiveness."

BEA Systems - Downloads - Product Selection Page 

BEA Systems - Downloads - Product Selection Page

SKOS-Core 1.0 Guide 

SKOS-Core 1.0 Guide

Ontology Definition MetaModel 

Ontology Definition MetaModel

Macromedia 

Macromedia

ItcWiki - Home Page 

ItcWiki - Home Page

LSI - Latent Semantic Indexing Web Site 

LSI - Latent Semantic Indexing Web Site

SenseClusters Package 

SenseClusters Package
"Synopsis

SenseClusters is a complete Word Sense Discrimination system that takes users from preprocessing of raw text to actual discrimination that involves selection of most discriminating features, context representations, clustering, followed by extensive analysis and performance evaluation."
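SenseClusters itself is a Perl package; the core move, though, fits in a few lines: turn each context of an ambiguous word into a bag-of-words vector and group the vectors by similarity. A minimal sketch in Python (contexts and seeds are illustrative, and a real system would learn the clusters rather than seed them):

import math
from collections import Counter

contexts = [
    'deposit money at the bank',
    'the bank lends money',
    'fishing on the river bank',
    'sat on the bank of the river',
]
vectors = [Counter(c.split()) for c in contexts]

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm

seeds = [vectors[0], vectors[2]]  # one seed context per putative sense
clusters = [[], []]
for c, v in zip(contexts, vectors):
    best = max((0, 1), key=lambda k: cosine(v, seeds[k]))
    clusters[best].append(c)
print(clusters)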

The Semantics of Ontology Alignment 

The Semantics of Ontology Alignment (PDF)
"Abstract

Ontology alignment is a foundational problem area for semantic interoperability. We discuss the complexity faced by automated alignment solutions and describe an ontology-based approach for describing and evaluating alignments."

The Ontology Alignment Source 

The Ontology Alignment Source
"Purpose of this Website

Ontology alignment is the automated resolution of semantic correspondences between the representational elements of heterogeneous systems. Ontology alignment (including ontology/schema matching/mapping) is a critical technical challenge for the dynamic semantic integration of information resources as well as for ontology-mediated cognitive agent learning.

The Ontology Alignment Source supports the ontology alignment research community by providing a forum that promotes the sharing and comparison of research data. This site provides access to tools, test data, and metrics for ontology alignment algorithm development and evaluation. We also publish ontology alignment experiment data contributed by members of the research community. This will allow for comparison of various ontology alignment approaches, aided by visualization tools."
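For a feel of why this is hard, the crudest possible aligner just compares concept names as strings; everything beyond that (structure, instances, background knowledge) exists because name matching fails so often. A naive baseline sketch in Python (ontologies and threshold are illustrative):

from difflib import SequenceMatcher

onto_a = ['Person', 'Organization', 'Publication']
onto_b = ['People', 'Organisation', 'Paper']

def similarity(x, y):
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()

for a in onto_a:
    best = max(onto_b, key=lambda b: similarity(a, b))
    if similarity(a, best) >= 0.5:
        print('%s <-> %s (%.2f)' % (a, best, similarity(a, best)))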

Tuesday, September 07, 2004

MathPlayer: Download and Installation 

MathPlayer: Download and Installation

Patterns in Unstructured Data 

Patterns in Unstructured Data

RDF Data 

RDF Data

Open Source Applications Foundation 

Open Source Applications Foundation

Re: [VM,ALL] Revised VM Task Force description from Jeremy Carroll on 2004-06-24 (public-swbp-wg@w3.org from June 2004) 

Re: [VM,ALL] Revised VM Task Force description from Jeremy Carroll on 2004-06-24 (public-swbp-wg@w3.org from June 2004)

[VM,ALL] Revised scope statement from Thomas Baker on 2004-06-13 (public-swbp-wg@w3.org from June 2004) 

[VM,ALL] Revised scope statement from Thomas Baker on 2004-06-13 (public-swbp-wg@w3.org from June 2004)

Recommendations for Documentation of Published Subjects - ISSUES 

Recommendations for Documentation of Published Subjects - ISSUES

mediagods RGB to HEX color picker 

mediagods RGB to HEX color picker

Taking the RDF Model Theory Out For a Spin 

Taking the RDF Model Theory Out For a Spin
"Abstract. Entailment, as defined by RDF's model-theoretical seman-
tics, is a basic requirement for processing RDF, and represents the kind
of semantic interoperability" that RDF-based systems have been antic-
ipated to have to realize the vision of the Semantic Web". In this paper
we give some results in our investigation of a practical implementation
of the entailment rules, based on the graph-walking query mechanism of
the Wilbur RDF toolkit."
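Wilbur is Ora Lassila's Common Lisp toolkit; for a flavor of what "implementing the entailment rules" means, here is one RDFS rule (rdfs11, the transitivity of rdfs:subClassOf) applied by brute-force forward chaining in Python, run to a fixed point (triples illustrative):

SUBCLASS = 'rdfs:subClassOf'

triples = {
    ('ex:Dog', SUBCLASS, 'ex:Mammal'),
    ('ex:Mammal', SUBCLASS, 'ex:Animal'),
}

changed = True
while changed:  # keep applying the rule until nothing new is derived
    changed = False
    for (a, p1, b) in list(triples):
        for (c, p2, d) in list(triples):
            if p1 == p2 == SUBCLASS and b == c and (a, SUBCLASS, d) not in triples:
                triples.add((a, SUBCLASS, d))
                changed = True

for t in sorted(triples):
    print(t)  # includes the derived ('ex:Dog', SUBCLASS, 'ex:Animal')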

Ontological Closure 

Re: Proposed issue: What does using an URI require of me and my software? from Tim Berners-Lee on 2003-09-26 (public-sw-meaning@w3.org from September 2003)
"cwm will operate in a number of modes with respect to looking
stuff up as a function of the URIs loaded into the knowledge base.

cwm --closure=flags

where flags are a combination of
p - look up predicates
s - look up subjects
o - look up objects
t - look up the object only where the predicate is rdf:type

An interesting mode is cwm --closure=pt.
Lets call this "ontological closure".
It is a reasonable thing to do, as it adds to the KB the
machine-readable
information which is assumed shared by the writer and reader of
a document. If people do this a lot, then it useful to write
documents whose
ontological closure is of a manageable size. This is the case with any
real RDF files I've tried. It is what you might expect - people define
ontologies using ontologies but only to a limited level.

Contrast with --closure=spo which pulls in the whole contiguous
semantic web starting at the given document. This is not practical.
This is interesting, as it highlights a difference between p and s and
o
not only in the spec but in the topology of the web."
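The --closure=pt policy is easy to restate in Python: load the starting document, then recursively load every distinct predicate URI, plus the object of every rdf:type triple. A sketch, where parse(uri) stands in for a real fetch-and-parse step (assumed, not shown) that yields (s, p, o) triples:

RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'

def ontological_closure(start_uri, parse):
    kb, loaded, queue = set(), set(), [start_uri]
    while queue:
        uri = queue.pop()
        if uri in loaded:
            continue  # each document is fetched at most once
        loaded.add(uri)
        for (s, p, o) in parse(uri):
            kb.add((s, p, o))
            queue.append(p)       # the 'p' flag: look up predicates
            if p == RDF_TYPE:
                queue.append(o)   # the 't' flag: look up rdf:type objects
    return kb

Enqueueing s, p, and o unconditionally instead gives the impractical --closure=spo behavior Berners-Lee contrasts.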

Porting Wordnets to the Semantic Web 

Porting Wordnets to the Semantic Web
"Abstract

Wordnets are valuable resources both as lexical repositories and as sources of ontological distinctions. This document presents a framework and workplan for porting wordnets to Semantic Web languages, like RDFS and OWL. Some phases are distinguished, and preliminary resources are referenced."

Monday, September 06, 2004

CollabNet - Collaborative Software Development On Demand 

CollabNet - Collaborative Software Development On Demand

Semantic Web Technologies 

Semantic Web Technologies
