Wed 2 Nov 2005
During lunch break, Steve Diggs asked me “Why are ontologies important?” to which I aptly responded: “They aren’t”.
I explained the theory of a folksonomy, an emerging vocabulary set resulting from a bottom-up process in which members of a community freely choose keywords to their liking. A folksonomy is self-evolving, and provides an accurate model of the dynamic world we are trying to describe. This makes more sense to me than an ontology, which attempts to break everything into distinct categories from a top-down perspective.
Some sites that are based on folksonomies are (surprise!) Delicious and Flickr. In fact, even Google’s PageRank algorithm is based on a folksonomy. Instead of Yahoo!’s old approach of categorizing the web, Google ranks pages by popularity. But how do they know which sites are popular? They get that data straight from us, through the links we all create between pages! All Google does is aggregate existing data and run algorithms to determine a site’s popularity and, thus, its rank order in search results.
The same logic applies to tagging for Delicious and Flickr. The more times one tag is used for the same object, the more meaningful that tag becomes. Statistical analysis can then determine which tags are used most frequently and relate similar tags to one another.
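As a rough illustration of that kind of analysis, here is a minimal Python sketch that counts how often each tag is used and how often two tags appear together. The sample data and the function names `tag_counts` and `cooccurrence` are made up for this example; this is not how Delicious or Flickr actually compute anything.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each set holds the tags one user applied to the same object.
taggings = [
    {"ocean", "whale", "audio"},
    {"ocean", "whale"},
    {"whale", "biology"},
    {"ocean", "satellite"},
]

def tag_counts(taggings):
    """Count how many users applied each tag."""
    return Counter(tag for tags in taggings for tag in tags)

def cooccurrence(taggings):
    """Count how often each unordered pair of tags was applied together."""
    pairs = Counter()
    for tags in taggings:
        # Sort so that ("ocean", "whale") and ("whale", "ocean") share one key.
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs

counts = tag_counts(taggings)
pairs = cooccurrence(taggings)

print(counts["whale"])             # 3 users chose "whale"
print(pairs[("ocean", "whale")])   # 2 users chose both tags
```

Tags with high counts are the ones the community has settled on, and pairs with high co-occurrence counts are candidates for "related tags".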
An ontology serves a purpose only in a controlled environment. Building one makes sense when all of the factors are known and accounted for. In that setting, software agents built on ontologies will run faster and more efficiently.
However, the world is not controlled. Scientific data is not controlled. Building an ontology here just doesn’t seem to make sense.
3 Responses to “Why are ontologies important?”
November 2nd, 2005 at 5:56 pm
hmmmm, as a former(?) official ontology resistance movement member, i completely understand the need and use for local solutions to large-scope issues. i understand that the complexity of *science* seems to lend itself towards a free-form homogeneous ooze of relationships and that top-down imposed order can create resentment for those of us on the lower levels of the implementation chain. ontologies also narrow the scope of the community allowed to participate, and since science is a part of all of our lives, is this necessary and proper?
i see a danger of using a folksonomy (as i understand it from the sites you listed as well as a quick perusal of wikipedia and google hits) to pinpoint and describe scientific relationships in that there is no weighting of opinions. if popular preference and democratic tagging would (in an extreme example) weigh the opinions of a third grader and a domain expert on equal footing, what purpose would the folksonomy be filling? an artist would likely see, categorize, describe and outwardly relate a satellite image or a whale song much differently than a physical oceanographer or biologist. and while all of those opinions have importance in their different realms of understanding, we must consider the purpose served by each of the classifications. statistical analysis can help filter, but if there are more artists or third graders in the world than physical oceanographers, what will the statistics show?
secondly, while science is indeed largely uncontrolled and undefined, i believe there IS structure and even hierarchy and data control (like QAQC, etc) on a local level. scientific domains are broken into categories to narrow our research scope to something we can handle cognitively, i.e. earth science vs. astronomy, geology vs. oceanography, physics vs. biology. it is true that as humans learn more and broaden our horizons to interdisciplinary work some of these categories blend or become related, but in my opinion this does not mean the structure we already have in place breaks down, it just means that there are more relationships to consider and more ways to compare data collected. i think science on a small scale is even tightly controlled; the issue is that a lack of communication and community agreement inhibits the spread of that environment.
fwiw, i am (slowly and against my own desire to stay in my comfort zone) coming to the realization that ontologies are a logical next step for research science. however, i believe it IS possible to create a functioning ontology from the bottom-up as i think we have already begun with our dictionaries and database mappings. i think units are an excellent place to start, and combining/including attribute qualifiers in preparation for creating a controlled vocabulary for attributes will bring us that much closer to ontologies. once we have working vocabularies we can start mapping them with other communities and work up from there (possibly meeting a top-down approach somewhere in the middle, who knows?). in this way, the people creating and controlling the input have some knowledge of what they are talking about and a vested interest in making sure relationships are thought out, flexible and appropriate. there IS qualitative reasoning and logic that should have a say in what is related and how for the purposes of research and exploration, and i think an ontology (as daunting a prospect as it is) inherently and correctly encompasses that logic. allowing a larger community to map out our world irrespective of their education, understanding, or experience may well have some purpose and use in terms of finding a website, etc, i just cannot see the application in terms of scientific understanding or data use.
for the record, my answer to the initial question is: ontologies are important because they provide a means to bridge gaps in our historically specialized and limited data collections. since earth systems rarely (if ever) exist without influence from other systems, we can blend and merge previously pigeonholed data in order to better understand, model and predict the behavior of those systems.
did i hurt your eyes with my spelling errors?
November 20th, 2005 at 11:59 am
The participants of the discussion Ontology vs Folksonomy should ponder Postel’s Prescription:
“Be liberal in what you accept, and conservative in what you send.”
Jonathan Postel was the author of some of the protocols that made TCP/IP the clear winner over a plethora of other protocols.
I’d translate his prescription more like: accept input in any format, and have one fixed format for storage and for interaction with other processes.
In fact, this seems a good forum for us to look at my proposal for points in time at http://humu.ucsd.edu/~garrod/y10k/
1132516759
Sun Nov 20 11:59:19 PST 2005
20051120.195919
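The three values above appear to be the same instant written three ways: Unix epoch seconds, a local-time string, and a yyyymmdd.hhmmss number in UTC. A minimal Python sketch of the "accept any format, store one format" idea might look like the following. The function names `to_utc` and `canonical` are invented for this example, the set of accepted inputs is deliberately small, and the output layout is only assumed to match the proposal linked above.

```python
from datetime import datetime, timezone

def to_utc(value):
    """Accept liberally: epoch seconds, or an ISO 8601 string with a UTC offset."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        # Without an offset the instant is ambiguous, so refuse to guess.
        raise ValueError("timestamp string needs a UTC offset")
    return dt.astimezone(timezone.utc)

def canonical(value):
    """Send conservatively: always yyyymmdd.hhmmss in UTC (assumed layout)."""
    return to_utc(value).strftime("%Y%m%d.%H%M%S")

print(canonical(1132516759))                   # 20051120.195919
print(canonical("2005-11-20T11:59:19-08:00"))  # 20051120.195919
```

Both inputs normalize to the same stored value, which is the point of having a single fixed format on the output side.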
November 21st, 2005 at 12:12 pm
Hi Chris,
Thanks for registering and posting your thoughts. (If you’d ever like to post a story, let me know).
Regarding the ontology vs. folksonomy debate… My mindset keeps changing on the topic, but the main point I have taken away thus far is:
Ontologies are useful for providing structure.
Folksonomies are useful for providing semantics.
Not that ontologies don’t also provide some semantics. It’s true that ontologies provide a “base” semantic rule for structuring and organizing objects. Think of the Dewey Decimal System. That’s a strict set of rules for defining how to categorize a book, and subsequently, where it is stored in the library. However, this system provides no means of showing relations among books with similar topics, especially if they are of different genres (fiction vs. non-fiction, etc.).
This is where a folksonomy helps. It allows users to apply relations to objects in such a way that the relations are ever-changing and evolving. Folksonomies do a better job of modelling the community’s current trends and interests, and give a user the flexibility to search for an object by a user-defined topic rather than from a set of pre-defined categories.
As for the quote “Be liberal in what you accept, and conservative in what you send”, I feel this accurately portrays the power of both ontologies and folksonomies.
To be liberal in what it accepts, a system must accept any type of search parameter and be able to return meaningful results. However, for a system to return results that are meaningful, it must be a context-aware system that learns from the users. Hence a bottom-up folksonomy.
To be conservative in what it sends, a system must have a pre-defined notion of how its objects are organized. After accepting a liberal search parameter, the system must return to the user a conservative method (e.g. Dewey Decimal) for finding the objects that meaningfully match. Hence a top-down ontology.
***
Chris, I read your interesting proposal about storing dates as numbers. It’s an intriguing idea, but I honestly can’t find any reason to implement it.
The format of …yyyyymmdd.hhmmsssss… still neglects other meaningful information such as time zone, daylight saving time, day of week, etc. Additionally, the numeric format is deceptive for arithmetic in that it is not a pure decimal number. Not all digit places take values from 0-9 (the tens digit of the minutes and seconds, for example, never exceeds 5), so ordinary addition and subtraction on these values can give results that are not valid times.
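To make the arithmetic problem concrete, here is a small Python sketch. It assumes a yyyymmdd.hhmmss layout with whole seconds, which may be simpler than what the proposal actually specifies, and the sample timestamp is made up.

```python
from datetime import datetime, timedelta
from decimal import Decimal

FMT = "%Y%m%d.%H%M%S"  # assumed layout: yyyymmdd.hhmmss

start = "20051120.195959"

# Naive decimal arithmetic: treat the value as a number and add "one second".
naive = Decimal(start) + Decimal("0.000001")
print(naive)               # 20051120.195960 (a seconds field of 60)

# Calendar-aware arithmetic: parse the fields, add one second, format again.
real = datetime.strptime(start, FMT) + timedelta(seconds=1)
print(real.strftime(FMT))  # 20051120.200000
```

The two answers disagree because the minutes and seconds fields roll over at 60, not at 100, so the value only looks like a decimal number.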
What is interesting about your model is that it allows both dates and times to extend without bounds. Years can go on forever, and times (seconds) can be defined with as much precision as needed. This is a conceptually great idea to remember, and certainly needed for any reasonable date/time solution. I’m just not sure pseudo-decimal “numbers” are the best way to do it.