Monthly Archive for February, 2007

Irish Blog Awards voting open

You can now vote for your desired winners in the Irish Blog Awards to be held next month… Congratulations to our own Ina O’Murchu who made the shortlist for Best Technology Blog!

IIA Blog: Adding Structure to Blog Posts

My latest IIA guest blogger post is now available. It’s a fairly light but lengthy piece about adding structure to blog posts. I’ll talk about semantic blogging next time…

Adding Structure to Blog Posts

(Originally posted by John Breslin on the IIA Blog.)

As you probably know (since you’re reading this!), blogs are [usually open access] websites which contain periodic time-stamped posts (in reverse chronological order) about a particular genre or touching on a number of topics of interest. They range from individuals’ online diaries or journals to promotional tools used by companies or political campaigns, and many allow public commenting on their posts. They are also starting to cross the generation gap - your kids might have a blog on Bebo, you may blog yourself and your parents could be reading or commenting on your posts.

The growth and takeup of blogs over the past four years has been dramatic, with a doubling in the size of the blogosphere every six or so months (according to statistics from Technorati). Over 100,000 blogs are created every day, working out at about one a second. Nearly 1.5 million blog posts are being made each day, with over half of bloggers contributing to their sites three months after the blog’s creation.

Similar to accidentally wandering onto message boards and web-enabled mailing lists, when you’re searching for something on the Web, you may often happen across a relevant entry on someone’s blog. RSS feeds are also a useful way of accessing information from your favourite blogs, but they are usually limited to the last 15 entries, and don’t provide much information on exactly who wrote or commented on a particular post, or what the post is talking about. Some approaches like SIOC aim to enhance the semantic metadata provided about blogs, forums and posts, but there is also a need for more information about what exactly a person is writing about. If you’re searching for particular information in or across blogs, it’s often not that easy to get it because of “splogs” (spam blogs) and the fact that the virtue of blogs so far has been their simplicity - apart from the subject field, everything and anything is stored in one big text field for content. Keyword searches may give some relevant results, but useful questions such as “find me all the restaurants that bloggers reviewed in Dublin with a rating of at least 5 out of 10” cannot be posed, and you cannot easily drag-and-drop events or people or anything (apart from URLs) mentioned in blog posts into your own applications.

I’m going to talk about two approaches to tackle this issue of adding more information to posts, so that queries can be made and the things that people talk about can be reused in other posts or applications, because not everyone is being served well by the lowest common denominator that we currently have in blogs. The first is called structured blogging and the second semantic blogging. (I’ll cover semantic blogging in my next installment…)

“Structured blogging” is an open source community effort that has created tools to provide microcontent (including microformats like hReview) from popular blogging platforms such as WordPress and Movable Type. In structured blogging, packages of structured data are becoming post components. Sometimes (not all of the time) you will have a need for more structure in your posts - if you know a subject deeply, or if your observations or analyses recur in a similar manner throughout your blog - then you may best be served by filling in a form (which has its own metadata and model) during the post creation process. For example, you may be writing a review of a film you went to see, or a report on a sports game you attended, or a guide to tourist attractions you saw on your travels. Not only do people get to express themselves more clearly, but blogs can start to interoperate with enterprise applications through the microcontent that is being created in the background.

Let’s say that someone (or a group of people) is reviewing some soccer games that they watched. Their after-game soccer reports will typically include information on which teams played, where the game was held and when, who were the officials, what were the significant game events (who scored, when and how, or who received penalties and why, etc.) - it’d be great if these blog posters could use a tool that would understand this structure, presenting an editing form with the relevant fields and creating both HTML and RSS with this structure embedded in it. Then other people reading these posts could say, “hey, I want to reuse this structure in my own posts” and their blog reader / creator could make this structure available when the blogger is ready to write. As well as this, reader applications could begin to answer questions based on the form fields available - “show me all the matches from Germany with more than two goals scored”, etc.
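To make the idea a little more concrete, here is a minimal Python sketch of what a reader application could do once match reports carry structured fields. The class names, field names and sample data are all made up for illustration; they are not taken from any actual structured blogging tool.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    scorer: str
    minute: int
    team: str


@dataclass
class MatchReport:
    # Hypothetical fields, loosely following the report contents described above.
    home_team: str
    away_team: str
    country: str
    venue: str
    date: str  # ISO 8601 date string, e.g. "2007-02-03"
    goals: list = field(default_factory=list)


def matches_with_more_goals_than(reports, country, threshold):
    """Return the reports from `country` with more than `threshold` goals."""
    return [
        r for r in reports
        if r.country == country and len(r.goals) > threshold
    ]


# Invented sample data standing in for structured posts from several blogs.
reports = [
    MatchReport("Team A", "Team B", "Germany", "Stadium One", "2007-02-03",
                [Goal("Player 1", 12, "Team A"),
                 Goal("Player 2", 48, "Team B"),
                 Goal("Player 1", 77, "Team A")]),
    MatchReport("Team C", "Team D", "Germany", "Stadium Two", "2007-02-04",
                [Goal("Player 3", 30, "Team C")]),
    MatchReport("Team E", "Team F", "Ireland", "Stadium Three", "2007-02-04",
                [Goal("Player 4", 5, "Team E"),
                 Goal("Player 5", 15, "Team F"),
                 Goal("Player 4", 60, "Team E")]),
]

# "Show me all the matches from Germany with more than two goals scored"
for report in matches_with_more_goals_than(reports, "Germany", 2):
    print(report.home_team, "v", report.away_team, "-", len(report.goals), "goals")
```

The point is only that the question becomes a simple filter over fields once the structure exists; with a single free-text content field, the same question would need guesswork over keywords.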

20070209a.png

At the moment, the structured blogging tools do provide a fixed set of forms that bloggers can fill in (see the WordPress restaurant review form on the right) - for things like reviews, events, audio, video and people - but there is no reason that people couldn’t create custom structures, and news aggregators or readers could auto-discover an unknown structure, notify a user that a new structure is available, and learn the structure for reuse in the user’s future posts.

There have been some other past efforts with similar aims to the structured blogging community, including Qlogger, the Lafayette project, and JemBlog. And in the future, Semantic Web technologies could be used to ontologise any available post structures for more linkage and reuse… This neatly brings me on to semantic blogging, which I’ll discuss in the next post!

DM110 - Week 5 - Audio Podcasting

This week’s lecture slides from my DM110 module (Emerging Web Media) are now available at Slideshare. The topic this week was audio podcasting.

3 becomes 4 - BarCamp Galway moves to September

Thanks to balanced mediation from Tom Raftery, we’ve postponed BarCamp Galway until after the BarCamp Ireland 3 / BarCamp Dublin event which will be held in April or May.

BarCamp Ireland 4 will now take place in Galway in September, with number 5 in Cork in either December or January next.

20070207a.png

We also have a wiki page at barcamp.org/BarCampIreland4 and a blog at barcampgalway.wordpress.com.

BarCamp Ireland 3

I think it’s time to start organising BarCamp Ireland 3, as I was supposed to organise it last time and of course was beaten to it!

[Edit. Image snipped.]

You can visit the wiki page here.

IIA Blog: The Semantic Web: Web 3.0?

The second of my IIA guest blog posts has now been published: “The Semantic Web: Web 3.0?”.

IIA Blog: From Web 1.0 to 2.0…

The first of my guest blog posts for the IIA was published on their blog on Friday, entitled “From Web 1.0 to 2.0…”.

The Semantic Web: Web 3.0?

(Originally posted by John Breslin on the IIA Blog.)

A key feature of Web 2.0 sites is community-contributed content that may be tagged and can be commented on by others. That content can be virtually anything: blog entries, board posts, videos, audio, images, wiki pages, user profiles, bookmarks, events, etc. I fully expect to see a site with live multiplayer video games appearing in little browser-embedded windows just as we already have YouTube for videos, with running commentaries going on about the games in parallel. Tagging is common to many Web 2.0 sites - a tag is a keyword that acts like a subject or category for the associated content. Then we have folksonomies: collaboratively generated, open-ended labeling systems that enable Web 2.0 users to categorise content using the tags system, and to thereby visualise popular tag usages via “tag clouds” (visual depictions of the tags used on a particular website, like a weighted list in visual design).
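For the curious, a tag cloud really is just a weighted list. The Python sketch below counts how often each tag is used and scales that count linearly to a font size. The function name, the size range and the sample tags are my own assumptions for illustration, and real sites often use other scalings (logarithmic ones, for instance).

```python
from collections import Counter


def tag_cloud(tagged_items, min_size=10, max_size=30):
    """Map each tag to a font size (in points), scaled linearly by usage count."""
    counts = Counter(tag for tags in tagged_items for tag in tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # When every tag is equally popular, fall back to the smallest size.
        weight = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_size + weight * (max_size - min_size))
    return sizes


# Invented sample: the tags attached to three posts.
posts = [
    ["semanticweb", "blogging"],
    ["blogging", "podcasting"],
    ["blogging", "semanticweb", "barcamp"],
]

for tag, size in sorted(tag_cloud(posts).items()):
    print(f"{tag}: {size}pt")
```

Here the most used tag gets the largest size and the tags used only once get the smallest, which is all a tag cloud needs before it is handed over to the page styling.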

Folksonomies are one step in the same direction as what some have termed Web 3.0, or the Semantic Web. (The Semantic Web often uses top-down controlled vocabularies to describe various domains, but can also utilise folksonomies and therefore develop more quickly since folksonomies are a great big distributed classification system with low entry costs.) As Tim Berners-Lee et al. said in Scientific American in 2001, the Semantic Web is “an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation”. You probably know that the word “semantic” stands for “the meaning of”, and therefore the Semantic Web is one that is able to describe things in a way that computers can better understand (yes, computers are just like Ginger in the Far Side). Some of the more popular Semantic Web vocabularies include FOAF (Friend-of-a-Friend, for social networks) and Geo (for geographic locations).

The Semantic Web consists of metadata that is associated with web resources, and then there are associated vocabularies or “ontologies” that describe what this metadata is and how it is all related. SEO experts have known that adding metadata to their websites can often improve the percentage of relevant document hits in search engine result lists, but it is hard to persuade web authors to add metadata to their pages in a consistent, reliable manner (either due to perceived high entry costs or because it is too time consuming). For example, few web authors make use of the simple Dublin Core metadata system, even though the use of DC meta tags can increase their pages’ prominence in search results.

The main power of the Semantic Web lies in interoperability and in combinations of vocabulary terms: interoperability and increased connectivity are possible through a commonality of expression, and vocabularies can be combined and used together, e.g. a description of a book using Dublin Core metadata can be augmented with specifics about the book author using the FOAF vocabulary. Vocabularies can also be easily extended (modules, etc.). Through this, true intelligent search with more granularity and relevance is possible: e.g. a search can be personalised to an individual by making use of their identity profile and relationship information.
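The book example can be sketched with nothing more than (subject, predicate, object) triples. In the Python sketch below, the two namespace URIs are the real ones for the Dublin Core element set and for FOAF, but the book, the author, the example.org addresses and the helper functions are all invented for illustration. A real application would more likely use an RDF library than plain tuples.

```python
# Namespace URIs for the two vocabularies being combined.
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical identifiers for a book and its author (example.org is a placeholder).
book = "http://example.org/books/1"
author = "http://example.org/people/1"

# Each statement is a (subject, predicate, object) triple.
triples = [
    # Dublin Core terms describe the book...
    (book, DC + "title", "An Example Book"),
    (book, DC + "date", "2007"),
    (book, DC + "creator", author),
    # ...and FOAF terms add detail about the person who wrote it.
    (author, FOAF + "name", "A. N. Author"),
    (author, FOAF + "homepage", "http://example.org/~author/"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def author_names(triples, book):
    """Follow dc:creator from the book, then foaf:name from each creator."""
    names = []
    for creator in objects(triples, book, DC + "creator"):
        names.extend(objects(triples, creator, FOAF + "name"))
    return names


print(author_names(triples, book))
```

The query crosses from one vocabulary into the other without either of them needing to know about the other; the shared identifier for the author is what joins them.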

The challenge for the Semantic Web is related to the chicken-and-egg problem: it is difficult to produce data without interesting applications, and vice versa. The Semantic Web can’t work all by itself, because if it did it would be called the “Magic Web”. For example, it is not very likely that you will be able to sell your car just by putting a Semantic Web file on the Web. Society-scale applications are required, i.e. consumers and processors of Semantic Web data, Semantic Web agents or services, and more advanced collaborative applications that make real use of shared data and annotations.

The Semantic Web effort is mainly towards producing standards and recommendations that will interlink applications, and the primary Web 2.0 meme as already discussed is about providing user applications. These are not mutually exclusive: with a little effort, many Web 2.0 applications can and do use Semantic Web technologies to great benefit, and this picture from Nova Spivack shows some evolving areas where these two streams have come and will come together: semantic blogging, semantic wikis, semantic social networks and the Semantic Desktop all fall in the realm of what he terms the Metaweb, or “social semantic information spaces”. Semantic MediaWiki, for example, has already been commercially adopted by Centiare.

20070201d.png

There are also great opportunities for mashing together of both Web 2.0 data or applications and Semantic Web technologies - just use your imagination! Dermod Moore wrote of one such Web 2.0 application mashing for a hobby: a Scuttle + Gregarius + Feedburner + Grazr hybrid that allows one to aggregate one’s favourite blogs or other content on a particular topic and then to annotate bookmarks to the most interesting content found. Bringing this a step further, we could have a “semantic social collaborative resource aggregator”. Okay, it needs a better name, like “scraggy” or something :). In this hypothetical system:

  • Social network members specify their favourite content sources
  • You and your friends specify any topics of interest
  • You specify friends whose topic lists you value
  • Metadata aggregator collects content from sites you and friends like (which may be human tagged, or could be auto-tagged)
  • Highlights content that may be of interest to you or your friends
  • If nothing of interest is currently available, content sources may have semantically-related sources in other communities for secondary content acquisition and highlighting
  • You bookmark and tag the interesting content, and share!
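The highlighting step in the list above could be sketched roughly as follows in Python. Everything here is hypothetical, since the system itself is: the class names, the fields and the sample data are invented, and the sketch covers only the pooling of topics and sources and the matching against tags, not the secondary content acquisition or the bookmarking.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    topics: set = field(default_factory=set)    # topics of interest
    sources: set = field(default_factory=set)   # favourite content sources
    trusted_friends: list = field(default_factory=list)  # friends whose topic lists are valued


@dataclass
class Item:
    title: str
    source: str
    tags: set  # human-assigned or auto-assigned tags


def highlight(member, items):
    """Return items from liked sources whose tags match the pooled topics."""
    topics = set(member.topics)
    sources = set(member.sources)
    for friend in member.trusted_friends:
        topics |= friend.topics
        sources |= friend.sources
    return [
        item for item in items
        if item.source in sources and item.tags & topics
    ]


# Invented sample members and aggregated content.
ann = Member("Ann", topics={"podcasting"}, sources={"blog-a.example"})
bob = Member("Bob", topics={"semanticweb"}, sources={"blog-b.example"},
             trusted_friends=[ann])

items = [
    Item("Podcast gear roundup", "blog-a.example", {"podcasting", "audio"}),
    Item("FOAF basics", "blog-b.example", {"semanticweb", "foaf"}),
    Item("Holiday photos", "blog-b.example", {"travel"}),
    Item("RDF tips", "blog-c.example", {"semanticweb"}),
]

for item in highlight(bob, items):
    print(item.title)
```

Bob sees the podcasting item only because he values Ann's topic list, and misses the item from a source that neither of them follows, which is where the semantically-related secondary sources would come in.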

That’s all for now; next time I’ll be talking about the evolution from blogging to structured and semantic blogging.

The Digital Media Awards 2007

Had a great night at the DMA 2007 event last night. I didn’t win the blogging category, but I didn’t expect anyone except Blogorrah to win it - very well deserved. Like me, they have a love for the Webdings font for their icons :)

But the story of the night was Conn Ó Muineachain’s win for not only independent podcaster but the DMA overall prize as well. This award dispels some uneasiness I had about pay-to-enter awards events; if personal hard work is rewarded over wads of cash being dropped on media projects by large companies, it says something positive about this particular event. Also met a bunch of people, including the Fallon brothers from Daft, the Silicon Republic crew, and Sandra, Dan, John, Conor, Brian and Damien.