I have thousands of Mac OS X URL clipping files on my Mac. The Finder reports these little files as “Web Internet Location” files. They’re sometimes known as “weblocs” (after their file extension).
The reason I have so many of these things is that I switch browsers relatively frequently (bouncing between Safari and OmniWeb and occasionally Firefox) and I find weblocs are more permanent and just more convenient than browser bookmarks.
Mostly they’re “organised” into a single folder and I use Spotlight to find the ones I am interested in at that moment. However, Spotlight doesn’t have a default webloc metadata importer, and this is where Toxic Software steps up to fill the gap. My Spotlight Importer Collection package includes three importers, one of which, as luck would have it, is a webloc importer (in fact it also imports a plethora of other internet clipping files: afploc, fileloc, ftploc, inetloc, mailloc, urlloc and webloc).
After installing Toxic URL Importer and waiting for Spotlight to index each webloc on your volumes (if you’re in a rush, use mdimport to force Spotlight to index files immediately), you’ll be able to view a webloc’s URL directly in the Finder.
From Terminal.app you can use mdls to inspect the extra metadata added to the file by Toxic URL Importer:
[schwa@cobweb] Desktop$ mdimport The\ XML\ Bookmark\ Exchange\ Language\ Resource\ Page.webloc
[schwa@cobweb] Desktop$ mdls The\ XML\ Bookmark\ Exchange\ Language\ Resource\ Page.webloc
The XML Bookmark Exchange Language Resource Page.webloc -------------
kMDItemContentType = "com.toxicsoftware.webloc"
kMDItemContentTypeTree = (
"com.toxicsoftware.webloc",
"com.toxicsoftware.urlloc",
"public.url",
"public.data",
"public.item"
)
kMDItemDisplayName = "The XML Bookmark Exchange Language Resource Page.webloc"
kMDItemID = 17168962
kMDItemKind = "Web Internet Location"
kMDItemURL = "http://pyxml.sourceforge.net/topics/xbel/"
org_spotlightdev_digest_sha1 = "794043e7673e525d654e7b8e4115636408182528"
org_spotlightdev_metadata_mdimporters = ("com.toxicsoftware.url-importer")
org_spotlightdev_metadata_url_host = "pyxml.sourceforge.net"
org_spotlightdev_metadata_url_scheme = "http"
org_spotlightdev_metadata_urls = ("http://pyxml.sourceforge.net/topics/xbel/")
You can then search for weblocs by host, by scheme, or by any text in the URL. Very handy.
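The same host and scheme searches can be run from Terminal.app with mdfind. This is only a sketch: the attribute names are taken from the mdls listing above, while the webloc_query helper is a made-up name for illustration, and the search is skipped on machines that don't have mdfind.

```shell
#!/bin/sh
# Build a Spotlight query string for one of the importer's custom attributes.
# The attribute names come from the mdls output; the helper name is hypothetical.
webloc_query() {
    # $1 = attribute suffix ("host" or "scheme"), $2 = value to match
    printf "org_spotlightdev_metadata_url_%s == '%s'" "$1" "$2"
}

# mdfind only exists on Mac OS X, so skip the searches everywhere else.
if command -v mdfind >/dev/null 2>&1; then
    mdfind "$(webloc_query host pyxml.sourceforge.net)"
    mdfind "$(webloc_query scheme http)"
fi
```

Each mdfind call prints the paths of the matching files, one per line, so the results can be piped into other tools.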
On the whole, using weblocs and Spotlight together is a great solution. But I think I’ve found a better one. Luis de la Rosa’s WebNoteHappy is a fantastic application for managing bookmarks. It allows me to manage my bookmarks in a single window (like my folder of weblocs) and integrates really well with del.icio.us. As fast as Spotlight searching is, WebNoteHappy beats it hands down with lightning-fast searches.
Unfortunately WebNoteHappy doesn’t yet import or export webloc files (Luis is a very responsive developer and I am sure he will make up for this minor deficiency in a later release). Fortunately, WebNoteHappy does import XBEL files, and XBEL, as I’ve already written in my post about OmniWeb and XBEL, is a very handy intermediate file format.
As it turns out, I have almost all the tools necessary to get my thousands of webloc files into WebNoteHappy. I have mdfind2, which like its little cousin mdfind can perform Spotlight searches from the terminal. However, mdfind2 can also export an XML file describing the metadata of the found items. All I needed to do was use mdfind2 to find all my webloc files and then transform the XML into XBEL via a custom-written XSLT.
Here is the XSLT file “mdfind2_to_xbel.xsl”:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"
      doctype-public="+//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML"
      doctype-system="http://www.python.org/topics/xml/dtds/xbel-1.0.dtd"/>
  <xsl:strip-space elements="*"/>
  <!-- Wrap the whole result in a single XBEL root element. -->
  <xsl:template match="/">
    <xbel version="1.0">
      <xsl:apply-templates/>
    </xbel>
  </xsl:template>
  <!-- Turn each item found by mdfind2 into one XBEL bookmark. -->
  <xsl:template match="item">
    <bookmark>
      <xsl:attribute name="href"><xsl:value-of select="attributes/attribute[@key='kMDItemURL']"/></xsl:attribute>
      <!-- The underscore prefix keeps the numeric kMDItemID a valid XML ID. -->
      <xsl:attribute name="id">_<xsl:value-of select="attributes/attribute[@key='kMDItemID']"/></xsl:attribute>
      <xsl:attribute name="added"><xsl:value-of select="attributes/attribute[@key='kMDItemContentCreationDate']"/></xsl:attribute>
      <xsl:attribute name="modified"><xsl:value-of select="attributes/attribute[@key='kMDItemContentModificationDate']"/></xsl:attribute>
      <xsl:attribute name="visited"><xsl:value-of select="attributes/attribute[@key='kMDItemLastUsedDate']"/></xsl:attribute>
      <title><xsl:value-of select="name"/></title>
    </bookmark>
  </xsl:template>
  <!-- Suppress any text that is not explicitly selected above. -->
  <xsl:template match="text()"></xsl:template>
</xsl:stylesheet>
And here is the tiny snippet of shell script needed to tie it all together:
mdfind2 --xml "kMDItemContentTypeTree == 'com.toxicsoftware.urlloc'" | \
xsltproc mdfind2_to_xbel.xsl -
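The pipeline above prints the XBEL to standard output, so to get a file that WebNoteHappy can import it needs to be redirected somewhere. A minimal sketch of a wrapper follows; the export_weblocs function name and the default weblocs.xbel filename are my own illustrative choices, and the function assumes mdfind2_to_xbel.xsl is in the current directory.

```shell
#!/bin/sh
# Hypothetical wrapper around the mdfind2 | xsltproc pipeline.
# Usage: export_weblocs [output-file]   (defaults to weblocs.xbel)
export_weblocs() {
    out=${1:-weblocs.xbel}
    # Fail early with a readable message if either tool is missing.
    for tool in mdfind2 xsltproc; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "export_weblocs: $tool not found" >&2
            return 1
        }
    done
    # Same query and stylesheet as the original snippet, saved to a file.
    mdfind2 --xml "kMDItemContentTypeTree == 'com.toxicsoftware.urlloc'" |
        xsltproc mdfind2_to_xbel.xsl - > "$out"
}
```

Calling export_weblocs with no argument writes weblocs.xbel into the current directory; pass a path to write it elsewhere.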
And here is the resulting XBEL file (limited to just a single entry), ready to be imported into WebNoteHappy:
<?xml version="1.0"?>
<!DOCTYPE xbel PUBLIC "+//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML" "http://www.python.org/topics/xml/dtds/xbel-1.0.dtd">
<xbel version="1.0">
<bookmark href="http://www.txrollergirls.com/teams.htm" id="_9799996" added="2005-02-17 13:54:05 -0500" modified="2005-02-17 13:54:05 -0500" visited="2005-02-17 13:54:05 -0500">
<title>texas rollergirls TEAMS</title>
</bookmark>
</xbel>
After running the script I now have all my weblocs safely imported into WebNoteHappy, with duplicates removed (in fact they were never imported in the first place). I can tag them and add them to del.icio.us with just a click, and I can now import my current Safari bookmarks into the list and manage all my bookmarks in one place.
Using a little bit of custom coding (my URL Importer, mdfind2 and the XSLT) I was able to use the extensibility of Spotlight and WebNoteHappy to my advantage. A sure sign of a useful piece of software is how easily its end users can extend it.
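WebNoteHappy skipped the duplicates for me, but if you would rather see how many there are before importing, a quick check of the XBEL file from the shell is enough. This is a rough sketch, not part of my original script: the two function names are hypothetical, and both assume each bookmark start tag sits on a single line, as in the sample output above.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking an XBEL file before import.

# Print the number of lines containing a <bookmark> start tag in file $1.
count_bookmarks() {
    grep -c '<bookmark ' "$1"
}

# Print every href attribute that occurs more than once in file $1.
duplicate_hrefs() {
    grep -o 'href="[^"]*"' "$1" | sort | uniq -d
}
```

Run count_bookmarks weblocs.xbel to see how many entries were exported, and duplicate_hrefs weblocs.xbel to list the URLs that appear more than once.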
Subversion Repository: http://toxic-public.googlecode.com/svn/tags/BlogTag_20070927_729/Projects/Misc/mdfind2_to_xbel