It is very interesting to see that Amazon have now made available over 1TB of public data. It’s great that all of this data is now available in one place, ready shredded into queryable structures that allow developers to get to grips with it and start to do something really interesting. But wait a minute: if I want the DBPedia dump I have to go here…, if I want the Wikipedia English articles I need to go over there. If I want the US census data from 2000 it’s this place; if it’s the census data from 1990, it’s somewhere else. Oh, and don’t even get me started on having to choose between Windows and Linux. These data sets are essentially separate database snapshots that you can load into your own EC2 instance in the Amazon cloud and then start processing.
…and that’s kind of disappointing. Having lots of open data is a great start, but it is only a start. And here is the challenge: there are no consistent semantics across these data sets, so a great deal of wet-ware time needs to be invested in working out the linkages between them and in getting hold of some consistent notions of identity that could assist in merging. The easy way out is to pick and choose and make a “mash-up”, but there is nothing reusable in a mash-up, and a million mash-ups do not make a viable platform on which to build the really cool apps of the next decade. Topic maps, on the other hand, have a model for reflecting a consistent notion of identity, and for reconciling different identity notions and different entity schemas.
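To make the identity point a little more concrete, here is a minimal Python sketch of identity-based merging, loosely modelled on the topic maps idea that two topics sharing a subject identifier represent the same subject. It is a simplification for illustration, not an implementation of the topic maps standard, and the `example.org` identifiers and the `merge_by_identity` helper are hypothetical names invented for this example.

```python
def merge_by_identity(topics):
    """Merge topics that share at least one subject identifier.

    Each topic is a dict with two sets: "identifiers" and "names".
    This is an illustrative simplification, not the full merging
    rules of the topic maps standard.
    """
    merged = []
    for topic in topics:
        ids = set(topic["identifiers"])
        names = set(topic["names"])
        remaining = []
        for other in merged:
            if other["identifiers"] & ids:
                # Same subject: fold the existing topic into the new one.
                ids |= other["identifiers"]
                names |= other["names"]
            else:
                remaining.append(other)
        remaining.append({"identifiers": ids, "names": names})
        merged = remaining
    return merged


if __name__ == "__main__":
    # Hypothetical records, as if taken from two separate data sets.
    topics = [
        {"identifiers": {"http://example.org/id/london"}, "names": {"London"}},
        {"identifiers": {"http://example.org/id/london",
                         "http://example.org/census/place/london"},
         "names": {"London, England"}},
        {"identifiers": {"http://example.org/id/paris"}, "names": {"Paris"}},
    ]
    for topic in merge_by_identity(topics):
        print(sorted(topic["names"]), sorted(topic["identifiers"]))
```

The two London records collapse into one topic that carries both names and both identifiers, while the Paris record stays separate. The hard part in practice is agreeing on the identifiers in the first place.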
There’s the challenge: can we integrate all of this data using topic maps? Can we make use of the tools provided by the Amazon platform to build something even more cool, a cloud-based index of the entities and relations in these data sets? Because I believe that when we can do that we will really have 1TB (or, given the expansion of the topic maps model, probably 1PB) of useful knowledge, rather than 1TB of bits and bytes that you can hack with a mash-up.
There’s the glove. Who’s going to pick it up?
Definitely an experience to file under the “it just works” category. Upgrading from my old old old installation of Movable Type to WordPress was almost totally painless. There were only a couple of small “gotchas” that I haven’t quite got around to fixing yet.
Firstly, my collaborator on the Pepys topic map stuff, Stuart Brown, has lost his credit on his posts. For some reason the importer decided that all posts had been written by Stu, and I had a choice of either going through and editing each post by hand or just assigning them all to me. Always one to take the easy way out, I opted for the latter. Sorry Stu! Still, if he is that upset, I can always let him come on over and do the manual editing, eh?
Secondly, the RSS feed will move, so those of you who have a subscription will need to update your readers to look at the feed for posts or for comments (or both!). I’ll see if I can force some sort of redirect through the magic of .htaccess, but hey, what did I just say about always taking the easy way?
The theme is “Love The Orange” for which my thanks go to the creators at Web Design Creatives.
I really like the editing interface in WP: spell checking and a nice WYSIWYG HTML editor will mean I have fewer excuses for producing poorly written garbage. The whole admin interface is neat and intuitive, and possibly my only gripe so far is that I can’t find a quick way to bulk edit more than the 15 posts you can display on a single page. And WP supports more ways to get a post up than my venerable MT installation did. Speaking of which, if anyone has any tips on good tools for posting to WP (from desktop and/or from a Symbian-based phone) please let me know in the comments section!
In the four years since I last did anything with this blog, things have moved on a bit in terms of blogging systems, so I am considering replacing this MT installation with something a bit more hip, with it and down with the kids… well, OK, with WordPress anyway. As you can see from the page layout, my design skills are somewhat poor, but as a dedicated geek I hold fast to the belief that there is nothing that can’t be fixed by upgrading.
So, all things being equal, this weekend I’m going to try and get WordPress on to this site, and transfer all of my old posts over to it. My host has PHP and MySQL, so I have the basic WP requirements and hopefully it should be easy enough. If anyone has any tips or knows of any gotchas in moving from MT 2.6 to the latest WordPress, please let me know.
Yes, it’s been a very long time since I last added anything to this blog. Nearly 4 years in fact! I’ve not been on a round-the-world tour or used the last four years to perfect my violin technique or been living in a closed community that uses only 19th century technology (aka Windows 3.1). Most of my time has been spent working on stuff at my company, NetworkedPlanet. Running a startup and keeping up with topic maps and .NET (amongst other things) have basically taken over most of my time.
But now it’s time for a restart of my blogging career.
Roll on the next post (hopefully without the 4 year gap this time).
The Pepys-Map is a topic map of the people, places and events described in the famous 17th Century diaries of Samuel Pepys.
This round-up topic map covers all of the entries from June 1661 through to March 1662. You can either download the full XTM representation of Pepys-Map or browse the HTML rendition of Pepys-Map online.
Two more entries posted today.
In the entry for 1st April 1662, there is a sequence of events worth noting for the way in which the temporal relationships are modelled. The sequence starts when Pepys, Paulina Montagu and others attend a performance of “The Maid in the Mill”. During the performance, Paulina Montagu is taken ill, so Pepys takes her to the Grange (a nearby tavern) where she “did what she had a mind to”, and then the pair return to the play. This gives us the following events:
1. A performance of “The Maid in the Mill”
2. Paulina Montagu’s illness
3. Samuel and Paulina Montagu travel from the Opera to the Grange
4. Samuel and Paulina Montagu return from the Grange to the Opera
Event (1) is the wrapper event during which events (2) to (4) all occur, and (4) can be said to occur after the end of (2), so we have the following event relationships:
- (2) occurs during (1)
- (3) occurs during (1) and is caused by (2)
- (4) occurs during (1) and occurs after (2)
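For anyone who prefers to see this as data, the relationships above can be sketched as plain (event, relation, event) triples in Python. This is only an illustration of the modelling idea, not the XTM representation used in the Pepys-Map itself; the `Relation` class, the relation names and the `related` helper are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    subject: int  # event number, as referenced in the text above
    kind: str     # "during", "caused_by" or "after" (illustrative names)
    target: int


EVENTS = {
    1: "A performance of 'The Maid in the Mill'",
    2: "Paulina Montagu's illness",
    3: "Samuel and Paulina Montagu travel from the Opera to the Grange",
    4: "Samuel and Paulina Montagu return from the Grange to the Opera",
}

RELATIONS = [
    Relation(2, "during", 1),
    Relation(3, "during", 1),
    Relation(3, "caused_by", 2),
    Relation(4, "during", 1),
    Relation(4, "after", 2),
]


def related(kind, target):
    """Return the numbers of events that stand in `kind` relation to `target`."""
    return sorted(r.subject for r in RELATIONS
                  if r.kind == kind and r.target == target)


if __name__ == "__main__":
    # List everything that happens during the performance.
    for number in related("during", 1):
        print(f"({number}) {EVENTS[number]}")
```

Keeping each relationship as a separate triple means a new relation type can be added later without touching the events themselves.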
Lots more entries uploaded today which takes us to the end of March and finally gets me caught up again. Yay!
Watch this space for the end-of-month roundup…
5 more entries uploaded today, with no major ontology changes.
Another big batch update today. The only new addition to the ontology is a new event type “Arrest” with participation roles “Accused” and “Accuser”, used for the capture or arrest of individuals. The “Accuser” role is played by the one bringing the charge.
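As a rough sketch of what an event type with participation roles looks like, here is a small Python model. It is not how the topic map stores things; the `EventType` and `Event` classes are hypothetical, and the people in the usage example are placeholders rather than anyone from the diary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EventType:
    name: str
    roles: tuple  # the participation roles this event type allows


@dataclass
class Event:
    event_type: EventType
    players: dict = field(default_factory=dict)  # role name -> list of people

    def add_player(self, role, person):
        """Record that `person` plays `role`, rejecting roles the type lacks."""
        if role not in self.event_type.roles:
            raise ValueError(
                f"{role!r} is not a role of {self.event_type.name!r}"
            )
        self.players.setdefault(role, []).append(person)


# The new event type described above.
ARREST = EventType("Arrest", ("Accused", "Accuser"))

if __name__ == "__main__":
    arrest = Event(ARREST)
    arrest.add_player("Accused", "Person A")  # placeholder name
    arrest.add_player("Accuser", "Person B")  # placeholder: brings the charge
    print(arrest.players)
```

Checking the role against the event type is the one piece of validation here; anything outside “Accused” and “Accuser” is rejected with a `ValueError`.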
Five new entries posted today.
The only new addition to the ontology is a new event type, “Vote of Parliament”, which is used to record a vote in the House of Parliament.