The Next ORM Killer App: A Duplicate Content Filter for Feed Readers

Many organizations (from corporations to trade groups and news media) are syndicating information about themselves via social media.  On the one hand, this is great because it makes so much more information available.  On the other hand, it’s problematic because it creates a lot of duplication and makes the process of separating the wheat from the chaff more laborious.  There’s a whole lot of chaff involved in Online Reputation Management (ORM).

For example:

The Grand Rapids Press published a story on some changes to GRCC’s policy on tobacco use on campus.

That one story generated dozens of mostly irrelevant duplicate mentions (62, counting the duplication across the several social media search engines I use to follow discussion about the college, such as Technorati, Google Blogsearch and Social Mention). The story was picked up by GRCC and mentioned in its employee newsletter, picked up by industry websites and at least one tobacco industry site, mentioned on a message board for another school, and scraped by splogs and other websites that siphon traffic from search engines by appearing to be relevant to a particular search query.  Sometimes Mlive (the parent website that hosts the GR Press) will even syndicate the same story under one or more of its other papers, creating more references to sort through.

The tweets alone are mind-boggling: the story was recycled by GRCC, another community college, the Grand Rapids Press and the Press’ parent company, as well as a company that helps students apply for college scholarships, a local private college, a youth health counselor in South Dakota, and a marketing professor at another college in Michigan. It was also re-tweeted by a Grand Rapids resident following the local private college’s feed.

And those are only the results to date: this blog post will likely also be indexed and mentioned (then possibly scraped or re-referenced by splogs or other sites) – producing yet another entry adding to the feedback loop.

If the staff who develop my favorite feed reader, Google Reader, or any other developers looking to create the next killer app for feed readers/aggregators are reading, they would do the world a service by creating a feature that helps to filter out some of this content.  Search engines like Google already incorporate duplicate content filters, so it’s not that big a stretch to apply the principle elsewhere.
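To make the principle concrete, here is a minimal sketch of one common way to detect near-duplicate text: break each entry into overlapping word sequences ("shingles") and compare the sets with Jaccard similarity. This is not a description of how Google or any feed reader actually does it; the function names, the three-word shingle size, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0  # Assumption: two empty texts are not treated as duplicates.
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.5):
    """True when the two texts share at least `threshold` of their shingles."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold
```

A feed reader could run a check like this on entry titles or summaries and collapse anything above the threshold into a single item. The threshold would need tuning against real feeds, since a re-tweet with a long comment attached shares far fewer shingles with the original than a straight copy does.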

Some features of such an app that would be helpful might include:

  • An algorithm for automatically determining the most relevant, original entry containing the mention (perhaps by comparing posting dates/times, or looking for keyphrases like “RT @”) and prioritizing it above others (perhaps with some sort of drill-down menu to see related entries).
  • Some way to manually specify domains to be excluded (like one’s own, for example).
  • A way to tap into the input of the user community (say Google Reader users) to supplement the algorithms with an organic component (similar to spam email filters).
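The first two features could be sketched roughly as follows, assuming the mentions have already been grouped as duplicates of one story. The `Mention` record, the `pick_original` function, and the ranking rule (non-retweets first, then earliest posting time) are hypothetical and meant only to illustrate the idea, not to describe any existing feed reader.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlparse


@dataclass
class Mention:
    """Hypothetical record for one feed entry that mentions the story."""
    url: str
    text: str
    published: datetime


def is_retweet(mention):
    """Treat entries that start with the keyphrase "RT @" as re-tweets."""
    return mention.text.lstrip().upper().startswith("RT @")


def domain(mention):
    """Host name of the entry's URL, without a leading "www."."""
    host = urlparse(mention.url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def pick_original(mentions, excluded_domains=()):
    """Return (likely original, related entries) for a group of duplicates.

    Entries from excluded domains are dropped. The remainder is ranked so
    that non-retweets come first and, within each group, earlier posts win.
    """
    excluded = {d.lower() for d in excluded_domains}
    candidates = [m for m in mentions if domain(m) not in excluded]
    if not candidates:
        return None, []
    ranked = sorted(candidates, key=lambda m: (is_retweet(m), m.published))
    return ranked[0], ranked[1:]
```

The second value returned is what a drill-down menu would show under the prioritized entry. Posting time is only a rough signal, since scrapers can backdate entries, which is one reason the community input described in the third bullet would be a useful supplement.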

It’s important to note that not all of these mentions are lacking in value, quite the contrary; it can be immensely valuable to the practice of ORM (and the brand management of any organization) to see how information about an organization is perceived by the public.  These insights can help with future planning not only at the macro level, but on the micro level in developing individual communication campaigns.

2 thoughts on “The Next ORM Killer App: A Duplicate Content Filter for Feed Readers”

  1. Jason Johnson says:

    Thanks; really interesting topic. I’ve often wondered about the feedback loops or chain reactions that are possible with the right (or wrong) combination of online tools and services. Say, for example, you use a service to tweet, update your LinkedIn status message and update your Facebook status, and your blog has a plugin or widget that posts any tweets about certain strings and from certain sources in a page widget. You have to be careful what you’re hooking up/into with services and tools; otherwise you’ll end up with a massive echo of the same info posted and reposted all over the blog and pumped into RSS subscriptions (posts/comments/pingbacks/trackbacks…), giving readers a giant tangled mess of the same stuff to sort through.


    1. derekdevries says:

      Indeed; it’s a variation on the same problem one frequently runs into with syncing any database (like mp3 libraries, or integrating one’s mobile device(s) with one’s scheduling software). That’s what makes it interesting that it appears no one has yet stepped up to the plate to produce a solution (as there is already a market for software that cleans up music collections and reconciles contacts/appointments).

      It’s understandable though; there’s a rather high level of complexity involved and it may be that there’s not yet a critical mass of people who use aggregators.

