Archive for the 'General' Category

Irish Blog Awards voting open

You can now vote for your desired winners in the Irish Blog Awards to be held next month… Congratulations to our own Ina O’Murchu who made the shortlist for Best Technology Blog!

Adding Structure to Blog Posts

(Originally posted by John Breslin on the IIA Blog.)

As you probably know (since you’re reading this!), blogs are [usually open access] websites which contain periodic time-stamped posts (in reverse chronological order) about a particular genre or touching on a number of topics of interest. They range from individuals’ online diaries or journals to promotional tools used by companies or political campaigns, and many allow public commenting on their posts. They are also starting to cross the generation gap - your kids might have a blog on Bebo, you may blog yourself and your parents could be reading or commenting on your posts.

The growth and take-up of blogs over the past four years has been dramatic, with a doubling in the size of the blogosphere every six or so months (according to statistics from Technorati). Over 100,000 blogs are created every day, working out at about one a second. Nearly 1.5 million blog posts are being made each day, with over half of bloggers contributing to their sites three months after the blog’s creation.
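As a quick sanity check on those figures, here is a back-of-the-envelope calculation in Python. The only inputs are the numbers quoted above; treating the growth as a perfectly regular doubling every six months for four full years is a simplifying assumption, not a Technorati figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

blogs_per_day = 100_000
blogs_per_second = blogs_per_day / SECONDS_PER_DAY

# Assumption: a clean doubling every six months, sustained for four years.
doublings = (4 * 12) // 6
growth_factor = 2 ** doublings

print(f"{blogs_per_second:.2f} new blogs per second")
print(f"{doublings} doublings, i.e. a {growth_factor}-fold increase")
```

So “about one a second” holds up, and eight regular doublings would mean a blogosphere roughly 256 times its size of four years ago.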

Just as you might accidentally wander onto message boards and web-enabled mailing lists when you’re searching for something on the Web, you may often happen across a relevant entry on someone’s blog. RSS feeds are also a useful way of accessing information from your favourite blogs, but they are usually limited to the last 15 entries, and don’t provide much information on exactly who wrote or commented on a particular post, or what the post is talking about. Some approaches like SIOC aim to enhance the semantic metadata provided about blogs, forums and posts, but there is also a need for more information about what exactly a person is writing about. If you’re searching for particular information in or across blogs, it’s often not that easy to get it because of “splogs” (spam blogs) and the fact that the virtue of blogs so far has been their simplicity - apart from the subject field, everything and anything is stored in one big text field for content. Keyword searches may give some relevant results, but useful questions such as “find me all the restaurants that bloggers reviewed in Dublin with a rating of at least 5 out of 10” cannot be posed, and you cannot easily drag-and-drop events or people or anything (apart from URLs) mentioned in blog posts into your own applications.

I’m going to talk about two approaches to tackle this issue of adding more information to posts, so that queries can be made and the things that people talk about can be reused in other posts or applications, because not everyone is being served well by the lowest common denominator that we currently have in blogs. The first is called structured blogging and the second semantic blogging. (I’ll cover semantic blogging in my next installment…)

“Structured blogging” is an open source community effort that has created tools to provide microcontent (including microformats like hReview) from popular blogging platforms such as WordPress and Movable Type. In structured blogging, packages of structured data are becoming post components. Sometimes (not all of the time) you will have a need for more structure in your posts - if you know a subject deeply, or if your observations or analyses recur in a similar manner throughout your blog - then you may best be served by filling in a form (which has its own metadata and model) during the post creation process. For example, you may be writing a review of a film you went to see, or a report on a sports game you attended, or a guide to tourist attractions you saw on your travels. Not only do people get to express themselves more clearly, but blogs can start to interoperate with enterprise applications through the microcontent that is being created in the background.
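To make “microcontent created in the background” a little more concrete, here is a minimal Python sketch that turns a few form fields into an HTML fragment using hReview class names (hreview, item, fn, rating, description, reviewer, dtreviewed). The function name, its parameters and the sample review are all made up for illustration; this is hand-written markup, not the output of the Structured Blogging plugins.

```python
from html import escape


def render_hreview(item_name, rating, description, reviewer, date_reviewed):
    """Return an HTML fragment marked up with hReview class names.

    Hypothetical helper: a real blogging tool would fill these fields
    from its review form rather than from function arguments.
    """
    return (
        '<div class="hreview">\n'
        f'  <span class="item"><span class="fn">{escape(item_name)}</span></span>\n'
        f'  Rating: <span class="rating">{rating}</span>\n'
        f'  <span class="description">{escape(description)}</span>\n'
        f'  <span class="reviewer vcard"><span class="fn">{escape(reviewer)}</span></span>\n'
        f'  <abbr class="dtreviewed" title="{escape(date_reviewed)}">{escape(date_reviewed)}</abbr>\n'
        '</div>'
    )


# Sample data is invented purely for the example.
fragment = render_hreview(
    item_name="A <Hypothetical> Film",
    rating=4,
    description="Good fun, if a little long.",
    reviewer="Example Blogger",
    date_reviewed="2007-02-09",
)
print(fragment)
```

The post still reads as ordinary HTML to a human visitor, but a parser that knows the class names can pull out the item, the rating and the reviewer without guessing.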

Let’s say that someone (or a group of people) is reviewing some soccer games that they watched. Their after-game soccer reports will typically include information on which teams played, where the game was held and when, who were the officials, what were the significant game events (who scored, when and how, or who received penalties and why, etc.) - it’d be great if these blog posters could use a tool that would understand this structure, presenting an editing form with the relevant fields and creating both HTML and RSS with this structure embedded in it. Then other people reading these posts could say, “hey, I want to reuse this structure in my own posts” and their blog reader / creator could make this structure available when the blogger is ready to write. As well as this, reader applications could begin to answer questions based on the form fields available - “show me all the matches from Germany with more than two goals scored”, etc.
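Here is a rough Python sketch of what such a reader application could do once match reports carry structure. The `MatchReport` and `Goal` classes, their field names and the sample matches are all assumptions invented for this example; no existing structured blogging tool is being described.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    scorer: str
    minute: int
    team: str


@dataclass
class MatchReport:
    """Hypothetical structure for an after-game report."""
    home_team: str
    away_team: str
    country: str
    venue: str
    date: str  # ISO 8601 date, e.g. "2007-02-03"
    goals: list = field(default_factory=list)


def matches_with_more_goals_than(reports, country, min_goals):
    """Return reports from `country` with strictly more than `min_goals` goals."""
    return [r for r in reports if r.country == country and len(r.goals) > min_goals]


# Invented sample data standing in for structured posts from several blogs.
reports = [
    MatchReport("Team A", "Team B", "Germany", "Hypothetical Arena", "2007-02-03",
                [Goal("Player 1", 12, "Team A"),
                 Goal("Player 2", 48, "Team B"),
                 Goal("Player 3", 77, "Team A")]),
    MatchReport("Team C", "Team D", "Germany", "Example Park", "2007-02-04",
                [Goal("Player 4", 30, "Team C")]),
    MatchReport("Team E", "Team F", "Ireland", "Sample Grounds", "2007-02-04",
                [Goal("Player 5", 5, "Team E"),
                 Goal("Player 6", 44, "Team F"),
                 Goal("Player 7", 89, "Team E")]),
]

# "Show me all the matches from Germany with more than two goals scored."
result = matches_with_more_goals_than(reports, "Germany", 2)
for report in result:
    print(f"{report.home_team} v {report.away_team}: {len(report.goals)} goals")
```

The query is a one-line filter only because the goals and the country are separate fields; with everything in one big text field, the same question would need text mining.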

At the moment, the structured blogging tools do provide a fixed set of forms that bloggers can fill in (see the WordPress restaurant review form on the right) - for things like reviews, events, audio, video and people - but there is no reason that people couldn’t create custom structures, and news aggregators or readers could auto-discover an unknown structure, notify a user that a new structure is available, and learn the structure for reuse in the user’s future posts.

There have been some other past efforts with similar aims to the structured blogging community, including Qlogger, the Lafayette project, and JemBlog. And in the future, Semantic Web technologies could be used to ontologise any available post structures for more linkage and reuse… This neatly brings me on to semantic blogging, which I’ll discuss in the next post!

The Semantic Web: Web 3.0?

(Originally posted by John Breslin on the IIA Blog.)

A key feature of Web 2.0 sites is community-contributed content that may be tagged and can be commented on by others. That content can be virtually anything: blog entries, board posts, videos, audio, images, wiki pages, user profiles, bookmarks, events, etc. I fully expect to see a site with live multiplayer video games appearing in little browser-embedded windows just as we already have YouTube for videos, with running commentaries going on about the games in parallel. Tagging is common to many Web 2.0 sites - a tag is a keyword that acts like a subject or category for the associated content. Then we have folksonomies: collaboratively generated, open-ended labeling systems that enable Web 2.0 users to categorise content using the tags system, and to thereby visualise popular tag usages via “tag clouds” (visual depictions of the tags used on a particular website, like a weighted list in visual design).
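A tag cloud is easy to sketch in code: count how often each tag is used and scale a font size between a minimum and a maximum. The Python below is one simple way to do it, using linear scaling; the function name, the point sizes and the sample tags are assumptions for illustration, and real sites often use other weightings (logarithmic, for instance).

```python
from collections import Counter


def tag_cloud(tagged_items, min_size=10, max_size=30):
    """Map each tag to a font size (in points) that grows linearly with usage."""
    counts = Counter(tag for tags in tagged_items for tag in tags)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    spread = (highest - lowest) or 1  # avoid dividing by zero if all counts match
    return {
        tag: round(min_size + (count - lowest) * (max_size - min_size) / spread)
        for tag, count in counts.items()
    }


# Each inner list holds the tags on one (invented) post.
posts = [
    ["semanticweb", "blogging"],
    ["blogging", "web2.0"],
    ["blogging", "semanticweb"],
    ["sioc"],
]

sizes = tag_cloud(posts)
for tag, size in sorted(sizes.items()):
    print(f"{tag}: {size}pt")
```

Here “blogging” (three uses) gets the largest size, “semanticweb” (two uses) lands in the middle, and the tags used once get the smallest.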

Folksonomies are one step in the same direction as what some have termed Web 3.0, or the Semantic Web. (The Semantic Web often uses top-down controlled vocabularies to describe various domains, but can also utilise folksonomies and therefore develop more quickly since folksonomies are a great big distributed classification system with low entry costs.) As Tim Berners-Lee et al. said in Scientific American in 2001, the Semantic Web is “an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation”. You probably know that the word “semantic” stands for “the meaning of”, and therefore the Semantic Web is one that is able to describe things in a way that computers can better understand (yes, computers are just like Ginger in the Far Side). Some of the more popular Semantic Web vocabularies include FOAF (Friend-of-a-Friend, for social networks) and Geo (for geographic locations).

The Semantic Web consists of metadata that is associated with web resources, and then there are associated vocabularies or “ontologies” that describe what this metadata is and how it is all related to each other. SEO experts have known that adding metadata to their websites can often improve the percentage of relevant document hits in search engine result lists, but it is hard to persuade web authors to add metadata to their pages in a consistent, reliable manner (either due to perceived high entry costs or because it is too time consuming). For example, few web authors make use of the simple Dublin Core metadata system, even though the use of DC meta tags can increase their pages’ prominence in search results.

The main power of the Semantic Web lies in interoperability, and combinations of vocabulary terms: interoperability and increased connectivity are possible through a commonality of expression; vocabularies can be combined and used together, e.g. a description of a book using Dublin Core metadata can be augmented with specifics about the book author using the FOAF vocabulary. Vocabularies can also be easily extended (modules, etc.). Through this, true intelligent search with more granularity and relevance is possible: e.g. a search can be personalised to an individual by making use of their identity profile and relationship information.
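The book example can be sketched with nothing more than Python tuples standing in for subject-predicate-object statements. The two namespace URIs are the published Dublin Core (elements 1.1) and FOAF namespaces; the book, the author, their example.org URIs and the helper functions are invented for illustration. Pointing dc:creator at a URI rather than a plain name is a modelling choice made here so that the two descriptions can link up.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical resources, used only for this example.
book = "http://example.org/books/hypothetical-book"
author = "http://example.org/people/jane-example"

# One source describes the book with Dublin Core terms...
dc_description = {
    (book, DC + "title", "A Hypothetical Book"),
    (book, DC + "creator", author),
}

# ...and another describes the author with FOAF terms.
foaf_description = {
    (author, FOAF + "name", "Jane Example"),
    (author, FOAF + "homepage", "http://example.org/~jane"),
}

# Because both use URIs for the same things, combining them is a set union.
graph = dc_description | foaf_description


def objects(graph, subject, predicate):
    """Return every object of (subject, predicate, ?) in sorted order."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def author_names(graph, book_uri):
    """Follow dc:creator from the book, then foaf:name from each creator."""
    names = []
    for creator in objects(graph, book_uri, DC + "creator"):
        names.extend(objects(graph, creator, FOAF + "name"))
    return names


print(author_names(graph, book))
```

Neither description on its own can answer “what is the name of this book’s author?”; the merged graph can, which is the interoperability point in miniature. A real application would more likely use an RDF library than raw tuples.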

The challenge for the Semantic Web is related to the chicken-and-egg problem: it is difficult to produce data without interesting applications, and vice versa. The Semantic Web can’t work all by itself, because if it did it would be called the “Magic Web”. For example, it is not very likely that you will be able to sell your car just by putting a Semantic Web file on the Web. Society-scale applications are required, i.e. consumers and processors of Semantic Web data, Semantic Web agents or services, and more advanced collaborative applications that make real use of shared data and annotations.

The Semantic Web effort is mainly towards producing standards and recommendations that will interlink applications, and the primary Web 2.0 meme as already discussed is about providing user applications. These are not mutually exclusive: with a little effort, many Web 2.0 applications can and do use Semantic Web technologies to great benefit, and this picture from Nova Spivack shows some evolving areas where these two streams have come, and will come, together: semantic blogging, semantic wikis, semantic social networks and the Semantic Desktop all fall in the realm of what he terms the Metaweb, or “social semantic information spaces”. Semantic MediaWiki, for example, has already been commercially adopted by Centiare.

[Image: 20070201d.png]

There are also great opportunities for mashing together of both Web 2.0 data or applications and Semantic Web technologies - just use your imagination! Dermod Moore wrote of one such Web 2.0 application mashing for a hobby: a Scuttle + Gregarius + Feedburner + Grazr hybrid that allows one to aggregate one’s favourite blogs or other content on a particular topic and then to annotate bookmarks to the most interesting content found. Bringing this a step further, we could have a “semantic social collaborative resource aggregator”. Okay, it needs a better name, like “scraggy” or something :). In this hypothetical system:

  • Social network members specify their favourite content sources
  • You and your friends specify any topics of interest
  • You specify friends whose topic lists you value
  • Metadata aggregator collects content from sites you and friends like (which may be human tagged, or could be auto-tagged)
  • Highlights content that may be of interest to you or your friends
  • If nothing of interest is currently available, content sources may have semantically-related sources in other communities for secondary content acquisition and highlighting
  • You bookmark and tag the interesting content, and share!
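To show how the first few steps might hang together, here is a very rough Python sketch. Everything in it is hypothetical: the `Member` and `Item` classes, the example sources and tags, and the matching rule (an item is highlighted if it comes from a source you or a trusted friend likes and carries a tag on your combined topic list). The secondary content acquisition and the bookmarking steps are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    sources: set = field(default_factory=set)   # favourite content sources
    topics: set = field(default_factory=set)    # topics of interest
    trusted_friends: list = field(default_factory=list)  # friends whose lists you value


@dataclass(frozen=True)
class Item:
    title: str
    source: str
    tags: frozenset  # human-assigned or auto-assigned tags


def combined(member, attribute):
    """Union of one attribute (sources or topics) over a member and trusted friends."""
    values = set(getattr(member, attribute))
    for friend in member.trusted_friends:
        values |= getattr(friend, attribute)
    return values


def highlight(member, items):
    """Return items from liked sources that match at least one topic of interest."""
    sources = combined(member, "sources")
    topics = combined(member, "topics")
    return [item for item in items if item.source in sources and item.tags & topics]


# Invented members and content, for illustration only.
alice = Member("alice", sources={"blog-a.example"}, topics={"semanticweb"})
bob = Member("bob", sources={"blog-b.example"}, topics={"drupal"},
             trusted_friends=[alice])

items = [
    Item("SIOC update", "blog-a.example", frozenset({"semanticweb", "sioc"})),
    Item("Views module tips", "blog-b.example", frozenset({"drupal"})),
    Item("Holiday photos", "blog-b.example", frozenset({"travel"})),
    Item("Drupal news", "blog-c.example", frozenset({"drupal"})),
]

highlighted = highlight(bob, items)
for item in highlighted:
    print(item.title)
```

Bob sees the SIOC post only because he values Alice’s lists, and misses “Drupal news” because nobody in his circle follows that source; that last gap is what the secondary content acquisition step would be for.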

That’s all for now; next time I’ll be talking about the evolution from blogging to structured and semantic blogging.

From Web 1.0 to 2.0…

(Originally posted by John Breslin on the IIA Blog.)

Hello and welcome to the first of my guest posts for the Irish Internet Association’s blog. For the next two weeks, I’ll be talking about matters Web 2.0 related - hopefully with enough material to pique the interest of those who are new to, or already involved in, this and related areas.

About me: I’m a researcher at the Digital Enterprise Research Institute at NUI Galway, and co-founder of boards.ie. Some more information about myself can be found on my personal and work pages. In parallel to this guest blogging session, I’m teaching a new module in “Emerging Web Media” to Masters in Digital Media students at the Huston Film School, and some of the topics being covered in that will overlap with these entries.

First off, I will mention Web 1.0. The structural / syntactic web put in place in the early 90s is still much the same as what we use today: resources (web pages, files, etc.) connected by untyped hyperlinks. By untyped, I mean that there is no easy way for a computer to figure out what a link between two pages means - for example, on the IIA website, there are hundreds of links to the various organisations that are registered members of the association, but there is nothing explicitly saying that the link is to an organisation that is a “member of” the IIA or what type of organisation is represented by the link. On my work page, I link to many papers I’ve written, but I haven’t said that I am the author of those papers or that I wrote such-and-such when I was working at NUI Galway.

In fact, the Web was envisaged to be much more, as you’ll see from the image on the right which is taken from Tim Berners-Lee’s original outline for the Web in 1989, entitled “Information Management: A Proposal”. In this, all the resources are connected by links describing the type of relationships, e.g. “wrote”, “describe”, “refers to”, etc. This is a precursor to the Semantic Web which I’ll come back to…
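The difference between untyped and typed links can be shown in a few lines of Python. In this sketch a link is either a (source, target) pair or a (source, relationship, target) triple; the example.org URLs, the relationship labels and the `sources_related_to` helper are all invented for illustration.

```python
# Untyped hyperlinks: a computer knows only that one page points at another.
untyped_links = [
    ("http://example.org/iia", "http://example.org/some-company"),
    ("http://example.org/john", "http://example.org/papers/paper-1"),
]

# Typed links: each link also says what the relationship is.
typed_links = [
    ("http://example.org/some-company", "member of", "http://example.org/iia"),
    ("http://example.org/john", "wrote", "http://example.org/papers/paper-1"),
    ("http://example.org/papers/paper-1", "refers to", "http://example.org/papers/paper-2"),
]


def sources_related_to(links, relationship, target):
    """Return every source joined to `target` by the given relationship."""
    return [s for s, rel, t in links if rel == relationship and t == target]


members = sources_related_to(typed_links, "member of", "http://example.org/iia")
authors = sources_related_to(typed_links, "wrote", "http://example.org/papers/paper-1")

print("Members of the IIA:", members)
print("Authors of paper 1:", authors)
```

With the untyped list, “which organisations are members of the IIA?” cannot be answered at all: the pairs carry no information to filter on.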

Now to Web 2.0, a term made popular by Tim O’Reilly and explained here. But what exactly is it? I’m sure if you ask 10 different people you’ll come up with at least five answers. (Here are a Web 2.0 meme cloud, meme map and an elements picture. Any clearer as to what it is?!). The global brain, or as it likes to call itself, “Wikipedia”, says in one place that “Web 2.0 … has … come to refer to what some people describe as a second phase of architecture and application development for the World Wide Web.” I like to think of it as a web where “ordinary” users can meet, collaborate, and share [content] using social software applications on the Web - via tagged items, social bookmarking, AJAX functionality, etc. And there are many popular examples that work along this collaboration and sharing meme: Bebo, del.icio.us, digg, Flickr, UseAMap.com, Technorati, orkut, 43 Things, Wikipedia, and so on.

Over the last 13 years, there’s been a shift from just ‘existing’ on the Web to participating on the Web. Web 2.0 is a platform for social and collaborative exchange with reusable community contributions, where anyone can mass-publish using web-based social software and others can subscribe to desired information, news, data flows, or other services. It is “social software” that is being used for this communication and collaboration, software that “lets people rendezvous, connect or collaborate by use of a computer network. It results in the creation of shared, interactive spaces…” Examples include instant messaging, IRC, forums, blogs, wikis, SNS (social network services), social bookmarking, podcasts, and MMOGs / MMORPGs.

O’Reilly wrote a long article on the seven features or principles of Web 2.0, to which some have added an eighth: the long tail phenomenon. But in short, Web 2.0 is all about being more open, more social, and through user-created content, cheaper!

Tomorrow I’ll talk about the move from Web 2.0 towards what has been termed Web 3.0, or the “Semantic Web”.

Selling a bunch of interesting Irish / general domains…

It’s spring cleaning time, so I’m selling the following domains at Sedo:

Ads

thisadvert.com

Anime, J-Pop

animeandmanga.com
jpopforums.com
mangaandanime.com

Blogs, Social

planetoftheblogs.com
social365.com
thecritic.org

General

compla.in
greatbrita.in
pumpk.in
www.gen.nz

Ireland

inniu.com
irishpubs.net

Irish Cities

belfastcity.biz
belfastcity.info
corkcity.net
corkcity.org
dublincity.org
irishcity.com

News

irishnews.org
irishnewspapers.com

TV

smallville.biz

Digital Media awards approach

The Digital Media awards are being held on 1st February 2007 in the Burlington Hotel. Another blog linked to DERI was nominated in the same category as mine: that of Galway author Sandra Bunting, who is writing a novel online via her blog “sandwriter” and began blogging following a tutorial by our community and education outreach officer, Brendan Smith.

I also noticed that the Strawberry Alarm Clock from FM104 has been dropped from our category, no doubt due to their recent transfer to 2FM…

At the ExpertFinder Workshop in Berlin

I’m at the first ExpertFinder Workshop and co-located Knowledge Web General Assembly in Berlin. I gave a short presentation on SIOC for expert finding scenarios this morning. There have been some very interesting presentations on finding experts using Semantic Web technology; see the #foaf IRC log from today and accepted paper PDFs for more.

Blogging from the Drupal Ireland Meetup

We’re having some interesting talks at the first Drupal Ireland Meetup in DERI, NUI Galway today. Alan Burke has talked about the views module in Drupal, and Vincent Jordan is now talking about Drupal 5 multi-site installations (I also gave a SIOC module and taxonomies presentation earlier). Looking forward to the rest of the talks (aggregators by Aidan Finn, webforms and jQuery by Stéphane Corlosquet, and CCK by Alan Burke). Also in attendance are Haklae Kim and Sukhyung Hwang from DERI, and Gerry Shanahan from boards.ie.

Back in the fold…

…and off the antibiotics - yay! Some nasty bugs going around this Christmas, and really virulent too.

I’ve been nominated in the Digital Media awards “Content / Blogging Award” category. Thanks to my nominator (Brendan) and I’m looking forward to the event!

In other news, I gave my first lectures in CT108 Next Generation Technologies I (Semantic Web) and DM110 Emerging Web Media yesterday - I’ll be uploading a PDF of the slides after each week’s lecture. We’ve provisionally booked a guest lecturer for DM110, none other than Conn Ó Muineachain from Edgecast…

Belated happy new year…

…but I’m suffering from a viral chest infection, grrr!