WordPress Hacks: Techmeme River of News Clone, Part 2

[Advanced users] To review, a river of news is a mashup of RSS feeds, arranged in (reverse) chronological order and presented on a web page. Normally, an RSS reader might suffice to monitor a blog niche, but my intent is to produce a Techmeme clone. A step in that direction is a Techmeme River clone, which is what this post is for. Either tool can be immensely useful if you want to keep on top of

Please review Part 1 of this mini-series, which goes over the basics of building a clone of Techmeme River. In that article, we produced an HTML badge that shows a block of headlines collated from multiple source feeds. The intent in this article is to present the headlines block in the content area of a blog instead of in the navigation column.

As with the previous article, I am not supplying 100% of the code, just the basics. This article, unlike the last, is intended for advanced WP users as a stepping stone for producing a Techmeme River clone that lives on its home page. If you’re serious about building your own version, I’m assuming you will likely pursue your own development, based on what I discuss here. (If you simply need such a tool, I’m available for consulting.)

I’ve built a tool, RideSpottr, for monitoring automotive niche blogs, and I’ll use it as an example below.

What You’ll Need

Here’s what you’ll need for this WordPress hack:

  1. Theme: You can use any WordPress theme you like, but I chose the Popurls clone theme package from Ericulous. It includes a modified version of the Simplr WP theme, as well as instructions for specific WP Plugins.
  2. Plugins: I only used the Exec-PHP plugin, which the Popurls clone theme also uses. The other plugins are not necessary. This plugin, when activated, allows your blog to execute WP/PHP code that is contained in a blog post.
  3. Feeds: Mashup your selection of feeds to create a superfeed. As previously mentioned, I prefer using Yahoo Pipes because of the ease with which I can manipulate RSS feeds. You can use whatever web service you prefer. Notes:
    • Be sure that the superfeed you use for monitoring a blog niche is
      actually in the RSS format and not the Atom or RDF feed formats.
    • Make sure that you sort the superfeed’s items reverse chronologically.
  4. Code block: The engine for your river of news is a block of PHP code that will only execute properly on a WordPress blog. The code block starts off identical to the code shown in Thord Hendgren’s article Mashing Up Feeds Using Yahoo Pipes, which I referred to in the previous article. After that, I apply a series of tweaks, which are only partially discussed below.
  5. WordPress: Of course, you’ll need an installation of WordPress to test this. You might be eager to use the brand spanking new WP 2.5, but I highly recommend that you DO NOT use anything in the WP 2.3.x or higher series, or you might run into technical difficulties. WP 2.3.x (or even slightly earlier) has significant database changes that normally wouldn’t be introduced in an intermediate version. (That is, such changes should have been reserved for, say, 3.0.) So you could run into problems like I did.
  6. Blueprint CSS Grid Framework code: This is only necessary if you want to pretty up the river of news with some structured layout framework. It is not a Google product but can be downloaded from the Blueprint CSS pages at Google Code. Follow the install instructions discussed there. Do not use the “compressed” version, which typically gets placed in a subdirectory of the WP theme you’re using. I had trouble getting this to work. The default approach allows you to change themes if you like. (The default of Blueprint is 950 px wide pages. So your theme should support this, unless you feel like doing a lot of tweaking. That’s why I used Ericulous’ modified Simplr theme, as mentioned above.)

A Few Cautions

There are a couple of factors that you should be aware of before starting.

  1. The WP/PHP code block uses a WP function, fetch_rss(). This function is very sensitive, and if there are any problems with the input superfeed, nothing will show up. I was constantly in a hair-tearing frame of mind because some of the source feeds had a different format for the “pubDate” field (publishing date/ time stamp). Yahoo Pipes would sometimes work and then immediately after would not. There’s absolutely no way a production web service should rely on Yahoo Pipes. Use it as a prototyper only. If you use Pipes in this hack, like I have, expect that your test site’s home page will occasionally show no headline items.
  2. More Pipes issues: use only RSS formats, not RDF or Atom. Also, if you plan to display the date/ time stamp in your river of news, make sure that the source feeds use the same format. Otherwise Pipes may not sort properly.
  3. The WP/PHP code block I use (presented in full somewhere below) is not robust. It does not have full error handling and is simply meant as a starting point for you to work with.
  4. I did manage to get my test sites working on WP 2.3.3, but not without problems. If you apply this approach, you’d better have patience and enjoy a challenge. I have a long way to go before I have a robust web service, despite the hours I’ve already poured in.
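
One way to soften the first caution is to check what fetch_rss() hands back before looping over it. This is only a minimal sketch: it assumes the result is either false or an object with an `items` array (the shape the code block later in this article relies on), and `river_usable_items()` is a hypothetical helper name, not a WordPress function.

```php
<?php
// Hypothetical helper (not part of WordPress): return at most $limit items
// from a fetch_rss()-style result, or an empty array if the result is unusable.
function river_usable_items($rss, $limit = 50) {
    // Assumption: a failed fetch yields false or an object without an items array.
    if (!is_object($rss) || !isset($rss->items) || !is_array($rss->items)) {
        return array();
    }
    $usable = array();
    foreach ($rss->items as $item) {
        // Skip entries that lack the two fields every headline needs.
        if (empty($item['title']) || empty($item['link'])) {
            continue;
        }
        $usable[] = $item;
        if (count($usable) >= $limit) {
            break;
        }
    }
    return $usable;
}
```

In a post, you would call something like `$items = river_usable_items(@fetch_rss($url));` and show a fallback message when `$items` comes back empty. It won’t fix a broken superfeed, but it keeps a bad fetch from producing PHP warnings on your home page.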

Just don’t give up on this hack – it’s very useful as a niche monitor. If you have a more reliable way than Yahoo Pipes to “normalize” all the input feeds for the superfeed, then try that. If you have any questions about this WordPress hack, drop a comment and I’ll do my best to answer.

Example: Automotive Blog Niche

I’ve built two test sites in slightly different ways, to test different theories. The one mentioned below is RideSpottr, which, depending on what I’m trying in the code, may or may not show headline items (because of the problems in Yahoo Pipes). RideSpottr’s layout, at the time of this writing, is no work of beauty and is by no means robust, though I intend to apply the principle of kaizen and improve it by slow degrees over time, in hopes that it might actually be something other people will use.

Stage 1: Raw List

The instructions below assume that you have already installed WordPress, Ericulous’ modified Simplr theme (for his Popurls clone), and the Exec-PHP plugin. (See above for links.)

  1. In your WP admin panel, change your “Reading” options to allow only one blog post to display on the home page.
  2. Follow Thord’s instructions (linked above) for building a “network feed”. He offers the HTML/ WP/ PHP code necessary to produce an HTML badge.
  3. Take the exact code in Thord’s article and paste it into a new blog post in WP. Make sure you have turned off “WYSIWYG” mode in the WP editor.
  4. “Publish” the blog post, then view your home page.
  5. If you can see headline items, then you can now use your own feed.

Suggestions: Don’t monkey around with Yahoo Pipes to start with. I suggest that you use a single source feed (not a superfeed) while you’re testing, and when you have headline items displaying the way you want, you can use a superfeed.

The result of using Thord’s code will look something like the snapshot below.

Stage 2: Prettying Things Up

From here, you can apply a series of improvements using WP/PHP code tweaks. (My experience over two years is that most Performancing members don’t care to see the code step by step. What I’ll do here instead is, as with the last article, describe the steps. There’s also a section further below where you can obtain the final code block and use it for your river of news.)

  1. Add a timestamp.
  2. Change Feedburner URLs to native URLs. Many sites use Feedburner to “burn” their news feeds. But Feedburner feeds retain the original source URLs, so you can publish these in your river, if you prefer. I did this in Yahoo Pipes.
  3. Add floating item excerpt. In an earlier version of my river tool, I did not display item excerpts. Instead, I set up the river so that if a visitor moused over a headline, an item excerpt would appear in a floating box. In later versions, I dropped this. If you want to do this, it is already set up in Ericulous’ modified Simplr theme. You’ll need to study his sample code to understand how this works, as I’m not discussing it here.
  4. Insert author name of news item. Some RSS feeds publish this as an “author” field, others as “dc:creator”. Since I’ve used Yahoo Pipes for building the superfeed, I’ve converted all dc:creator values to “author”. But some feeds do not contain any author info, so be warned of this. This does not cause an error in the river of news, though. The author will just show blank.
  5. Apply a grid layout. After I got the headlines and related info displaying in my river of news, I applied the Blueprint CSS grid framework to control overall presentation.
  6. Add feed sources column. To spiffy things up, an earlier version of my river had a column showing the URLs of blogs/sites used in the superfeed. I later changed the text links to 240×90 banners, though I could also have used screen snapshots of home pages. (If you build a serious, robust tool, you might be able to sell that banner space as advertising.)
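
I handled steps 2 and 4 inside Yahoo Pipes, but the same two tweaks could also be done in PHP. The sketch below is an alternative under assumptions, not what RideSpottr runs: it assumes MagpieRSS-style items, where namespaced elements such as dc:creator and Feedburner’s origLink show up as nested, lower-cased array keys. Both function names are hypothetical.

```php
<?php
// Hypothetical helpers; the array keys assume MagpieRSS-style items, where
// namespaced elements (dc:creator, feedburner:origLink) become nested arrays.

// Return the item's author, falling back to dc:creator, then to $default.
function river_item_author($item, $default = '') {
    if (!empty($item['author'])) {
        return $item['author'];
    }
    if (!empty($item['dc']['creator'])) {
        return $item['dc']['creator'];
    }
    return $default;
}

// Return the original post URL when Feedburner supplies one, else the plain link.
function river_item_link($item) {
    if (!empty($item['feedburner']['origlink'])) {
        return $item['feedburner']['origlink'];
    }
    return isset($item['link']) ? $item['link'] : '';
}
```

Passing a default such as the site name to `river_item_author()` is one way to avoid the blank author mentioned in step 4. Check the keys against a dump of your own feed’s items before relying on them.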

Additional Improvements to Consider

Before I get into the actual code used, here are some additional improvements that you might consider trying.

  1. Normalize all the date formats. I found that many of the feeds that I wanted to include in the superfeed had different pubDate formats. Some didn’t even use “pubDate”. Without having to do a great deal of hacking in Yahoo Pipes, I simply could not include many feeds. It would be nice to add date format handling to “normalize” item publishing date for any RSS feed.
  2. Format the date. The current date/time stamp format isn’t pretty. A nicer format might be “Sun Mar 30, 2008; 10:05:00 am,” or some such.
  3. Limit the number of items per source feed. Currently, if one blog has lots of recent items and others have few, the former will dominate in the river of news. Limiting the number of headline items per source feed is one way to resolve this. If you use 50 display slots like I have, then you’ll want to trim each input feed to a maximum of X items. X should decrease as the number of input feeds, N, increases.
  4. Filter for authority sites. Don’t randomly add blogs from a given niche. Pick those that are authority sites.
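
Items 1 through 3 above could be roughed out in plain PHP along these lines. Treat this as a sketch under assumptions rather than finished code: both function names are hypothetical, the date helper only copes with formats that PHP’s strtotime() can parse, and the per-source cap assumes each item’s link points at the source blog’s own domain (not a Feedburner address).

```php
<?php
// Hypothetical helpers for the date and per-source improvements listed above.

// Re-format any date string strtotime() can parse (RFC 822, ISO 8601, ...)
// as e.g. "Sun Mar 30, 2008; 10:05:00 am" in UTC.
// Unparseable input is returned unchanged.
function river_format_date($raw) {
    $ts = strtotime($raw);
    if ($ts === false) {
        return $raw;
    }
    return gmdate('D M j, Y; g:i:s a', $ts);
}

// Keep at most $max items per source, preserving the incoming order.
// Assumption: the source is identified by the host name of each item's link.
function river_cap_per_source($items, $max) {
    $seen = array();
    $kept = array();
    foreach ($items as $item) {
        $host = isset($item['link']) ? parse_url($item['link'], PHP_URL_HOST) : null;
        $key = $host ? strtolower($host) : '(unknown)';
        $seen[$key] = isset($seen[$key]) ? $seen[$key] + 1 : 1;
        if ($seen[$key] <= $max) {
            $kept[] = $item;
        }
    }
    return $kept;
}
```

You would run the cap on the full item list before slicing it down to your display slots. As a purely illustrative figure, with 50 slots and 10 source feeds, a cap of 5 would let every feed fill an equal share.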

Just the Code, Please

Okay, if you just want the short summary and code, here it is:

  1. Pick an unused domain or subdomain.
  2. Install WP. I recommend something before 2.3.
  3. Install and activate Ericulous’ modified Simplr theme.
  4. Install and activate the Exec-PHP plugin.
  5. Install the Blueprint grid framework code. Follow their installation instructions, as there are a couple of lines to add to your header.php file.
  6. Set WP to only display one post maximum on the home page.
  7. Take the code shown below and paste into a new blog post. Publish the post and refresh your home page.

The HTML/ WP/ PHP Code Block

The code shown below gets pasted into a blog post. You could try pasting it into your index.php file, but I haven’t gone there yet. You’re on your own.


    

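    <h2>RideSpottr - automotive news</h2>
    <?php // Markup reconstructed; the feed URL, tags, and span widths are placeholders.
    require_once(ABSPATH . WPINC . '/rss.php');
    $rss = @fetch_rss('YOUR_SUPERFEED_URL');
    if ( isset($rss->items) && 0 != count($rss->items) ) { ?>
    <div class="column span-16 first">
    <?php $rss->items = array_slice($rss->items, 0, 50);
    foreach ($rss->items as $item ) {
        $pubdate = wp_specialchars($item['pubdate']);
        $desc = wp_specialchars(substr(strip_tags($item['description']), 0, 200));
        $auth = wp_specialchars($item['author']); ?>
    <div class="river_item">
    <a href="<?php echo clean_url($item['link']); ?>"><?php echo wp_specialchars($item['title']); ?></a>
    <?php echo $pubdate; ?> by <?php echo $auth; ?><br />
    <?php echo $desc; ?>
    </div>
    <?php } ?>
    </div>
    <div class="column span-8 last river_nav"><!-- site banners --></div>
    <?php } else { echo "<div class=\"river_item\">no rss items to display</div>\n"; } ?>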

Note: all the “column”, “span-X”, “first”, “last”, etc. class values in the div tags are Blueprint CSS grid framework values. The “river_nav”, “river_item” types of classes are arbitrary.

When you view your home page now, you should see something like the snapshot below:

Now you’ll need to change the code as follows:

  1. Replace the heading text in a way relevant to the niche you want to monitor.
  2. Replace the feed URL used in fetch_rss();
  3. Change the HTML code where the site banners are displayed.
  4. Spiffy up the CSS to make the river of news more attractive than I’ve got it.

Summary

To keep this article relatively readable, I’ve not gone into great detail about the coding. Experiment on your own, and ask questions in the comments, if you like. Good luck.

13 thoughts on “WordPress Hacks: Techmeme River of News Clone, Part 2”

  3. Thanks for the heads up on this. I do agree that the date needs to be prettier but I am not sure how to go about it. Anyone can provide some advice? Also, how can I put the name of the website that published the article? I tried using the ‘author’ tag but there are sites that do not use it and there are those that do use the author’s name instead of the site’s name.

    Any recommendations?

  4. ah

    1. Yes, you’re absolutely right. In fact, I alluded to it in the post (or maybe the last one). I just have a tendency to complicate my writing with too many options. So this time, I focused on just the home page option.

    2. There IS a cron job plugin, but it runs fake cron jobs. I think it’s also hosting-dependent. But maybe I can do an advanced followup article in the future.

    3. Ah, got it: a backup/failsafe. Well, I’m crossing my fingers that (a) Pipes will leave beta mode some day; and (b) that the caching issue of the fetch_rss() function can be overridden. This would eliminate the annoying problem of nothing showing in the River from time to time.

    Right, right. Well, if I ever get my version completed, I might release it as open source, or maybe donationware. Depends on how much coding time I have to put in. Most of the past year has been spent in research and contemplation – no coding other than this series for the River. I’ll release as much as I can.

    But excellent ideas. Thanks for the contemplation fodder.

  5. Ahh let me explain a bit. I might have got ahead of myself.

    1. index.php is a great choice, but in case a person wanted to place the river on a separate page, the index would still be good for other purposes. Plus I’m just partial to creating page-templates and pushing WP’s envelope a bit.

    2. This might be true, but I just spotted a cron job plugin. (Sorry not sure where)

    3. It would still be the same niche, it’s just for backup purposes so the page won’t show “no rss items to display”
    Yahoo Pipes could just be the primary aggregator. When it goes down use an autoblog plugin or something else that pulls from the same sources.

    The email idea is essentially another variation of the above only for use when Yahoo Pipes goes on a coffee break. Either could be used instead of getting “no rss items to display” on the front page. When the pipe fails visitors still get content instead of clicking away never to return.

    You can also change the code to display to the admin only, something like “Yahoo Pipes is down, backup in use.” instead of “no rss items to display”.

    You are absolutely right, custom code is ideal for a production system. I am suggesting ideas for newbies, non-coders and people without money to outsource could use to make something right now. Plus, just by bringing positive input a lightbulb might go off that may help you or other visitors improve the idea in a direction not thought of before.

  6. Hmmm… Very interesting.

    (1) Not sure why you’d do this. Index.php suffices for my purposes. Am I missing something? You can of course put the river of news pretty much anywhere on your site that you prefer.

    (2) Cron services are not available on a lot of budget hosting accounts, so I didn’t start exploring this angle. But for a more advanced version, it’s a great consideration.

    (3) Very cool, but if I’m following one niche, why would I partition its feed set into two or more groups? That defeats the purpose of a niche monitor if you’re only seeing some of the feeds you’re following at any given time.

    I’m not sure I understand the email idea. For my purposes, I need a river that shows headlines, links and excerpts, so that I can quickly scan what people are writing in a niche. Later, I want to transform my river of news into a tool exactly like Techmeme/Megite, but for any niche I pick.

    Maybe it’s Saturday and my brain’s not working, but it seems the email step is unnecessary. My dependency on Yahoo Pipes in this example is purely for prototype, and so that I can discuss the tool in this article. In a production system, I’d use cron jobs and replace Yahoo Pipes with custom Perl or PHP code.

    How can I credit you when you’re anonymous?

  7. Okay playboy, it’s me again. This is good stuff. Now let me give you some seasoning to throw in this good stew you’re cooking.

    How about adding this twist

    1. Instead of using index.php create a template page “page-newsclone.php” dump the code in there.
    2. Set up a cron job to check for new feeds and update “autopost” accordingly.

    3. Instead of leaving the pipe dry when yahoo bonks out kick back out to another group feed. You could even have it on another file, pulling manually

    ooh the lights done came on! Have all the feeds in the clone going to email. Then post by email and set up your template to only show title and link to the source. Booya! No blank pages.

    If you code it up before I hack my way to a functioning version, I would kindly appreciate some source code love.

Comments are closed.