Advanced Feed Mashup: Yahoo Pipes By Example, Pt 2

In the previous post of this series, Mashing Up Multiple Feeds, Ryan asked in the comments section for a customization of the Yahoo Pipe I presented. The basic Pipe took two hard-coded RSS feeds, mashed them together, sorted reverse chronologically, and truncated the resulting mashed feed at 20 items.

Ryan asked for either of two customizations:

  1. Last 10 articles in “temporal” order, from the set of all feeds in a list X.
  2. Max of one story per feed in “temporal” order, from the set of all feeds in a list X.

List X is an external list that contains a set of RSS feed URLs that can be changed at will. I’m assuming “temporal” order means oldest first, which is the opposite of “reverse chronological.” (If I’m wrong, it’s easy to fix.)

Process and Modules For Option #1

The Pipe preparation and building process for option #1 is as follows:

  1. Create a list of feed URLs to be processed. I suggest .CSV format for the simplest setup. (Basically, one field per line, where a field is the full URL of an RSS feed.) You can use OPML, I believe, but I haven't explored that option, and it isn't the simplest one either.
  2. The Pipe will prompt the end user for the URL of the feed list.
  3. In a loop, fetch the items of each feed and mash them all into one feed stream.
  4. Sort the stream temporally.
  5. Truncate the stream to 10 items. (Or we can add a user input field and let the user decide.)
  6. Output the stream results.

The modules to use for this Pipe are as follows:

  1. Fetch CSV.
  2. Loop.
  3. Fetch Feed. (Or if the external list contains the URL of the blog instead of the feed, then use the Fetch Site Feed module.)
  4. Sort.
  5. Truncate.
  6. Output.
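
The six steps above can be sketched outside Pipes as well. Here is a minimal Python sketch of the same pipeline using only the standard library. The feed fetch is stubbed with canned data (an assumption, so the merge/sort/truncate logic runs offline), and the names `fetch_feed_items` and `mash_feeds` are my own shorthand, not Pipes terminology.

```python
import csv
import io

# Stand-in for the Fetch Feed module: a real script would download and
# parse each RSS feed; here canned (timestamp, title) items are returned
# per feed URL so the pipeline can be demonstrated offline.
SAMPLE_ITEMS = {
    "http://example.com/a.rss": [(1200, "A: old post"), (1300, "A: new post")],
    "http://example.com/b.rss": [(1250, "B: only post")],
}

def fetch_feed_items(url):
    return SAMPLE_ITEMS.get(url, [])

def mash_feeds(csv_text, max_items=10):
    """Fetch CSV -> Loop/Fetch Feed -> Sort (oldest first) -> Truncate."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)                      # first line names the field, as in Fetch CSV
    stream = []
    for row in reader:                # the Loop module
        stream.extend(fetch_feed_items(row[0]))
    stream.sort(key=lambda item: item[0])   # "temporal" order: oldest first
    return stream[:max_items]         # the Truncate module

# The feed list itself, in the one-URL-per-line .CSV format suggested above.
feed_list = "feedurl\nhttp://example.com/a.rss\nhttp://example.com/b.rss\n"
print(mash_feeds(feed_list, max_items=10))
```

Note that the `feed_list` string doubles as an example of the .CSV file format from step 1: a field-name header line, then one full feed URL per line.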

Process and Modules For Option #2

The Pipe preparation and building process for option #2 is as follows:

  1. Create a list of feed URLs to be processed. (As above, I suggest .CSV format.)
  2. In a loop, fetch the items of each feed but only grab the first item (i.e., truncate to 1 item), and mash everything together.
  3. Sort the mashed stream temporally.
  4. Output the stream results.

There are actually a few ways to build this Pipe. I'm presenting just one combination of modules, which involves building two Pipes.

Slave: Pipe 2a: This Pipe grabs the most recent post of a single RSS feed, whose URL is supplied from the loop in Pipe 2b.

  1. URL Input.
  2. Fetch Feed.
  3. Sort.
  4. Truncate.
  5. Output.

Master: Pipe 2b: This Pipe grabs the feed list, sends each URL to Pipe 2a, and mashes up the returned items.

  1. Fetch CSV.
  2. Loop.
  3. Pipe 2a.
  4. Sort.
  5. Output.

Pipe 2b is the master Pipe; it loops through the list of feeds, calling upon Pipe 2a to do some of the work. (This embedding of Pipes as modules gives Yahoo Pipes tremendous power for feed mashups.)

We do have to use Sort in both Pipes to ensure that the mashed feed is sorted properly.
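
To make the two-Pipe division of labor concrete, here is a minimal Python sketch of the same master/slave composition. As before, the feed fetch is stubbed with canned sample data (an assumption), and the function names `pipe_2a` and `pipe_2b` are my own shorthand for the Pipes described above.

```python
import csv
import io

# Canned stand-in for fetching and parsing an RSS feed; a real Fetch Feed
# step would go over the network.
SAMPLE_ITEMS = {
    "http://example.com/a.rss": [(1300, "A: new post"), (1200, "A: old post")],
    "http://example.com/b.rss": [(1250, "B: only post")],
}

def pipe_2a(feed_url, per_feed=1):
    """Slave Pipe: one feed's items, sorted newest first, then truncated."""
    items = sorted(SAMPLE_ITEMS.get(feed_url, []),
                   key=lambda i: i[0], reverse=True)
    return items[:per_feed]

def pipe_2b(csv_text, per_feed=1):
    """Master Pipe: loop over the feed list, call pipe_2a, mash, re-sort."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)                      # skip the field-name header line
    mashed = []
    for row in reader:                # the Loop module, one pipe_2a call each
        mashed.extend(pipe_2a(row[0], per_feed))
    # Re-sort the mashed stream: without this second Sort, the items would
    # stay grouped in feed-list order rather than interleaved by date.
    mashed.sort(key=lambda i: i[0], reverse=True)
    return mashed

feed_list = "feedurl\nhttp://example.com/a.rss\nhttp://example.com/b.rss\n"
print(pipe_2b(feed_list))
```

The final `sort` in `pipe_2b` illustrates why Sort appears in both Pipes: the slave sorts within a feed so Truncate keeps the newest item, and the master sorts across feeds so the mashed stream comes out in date order.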

Sample Pipes

As in the last post, I am not going to provide in-depth commentary on each Pipe or module because this series is “by example”. Instead, you can access both sample Pipes, clone and explore them, and ask me questions here.

Where are the sample Pipes? At the time of this writing, I’m running late. I’ll add links in the comments once they’re completed.

12 thoughts on “Advanced Feed Mashup: Yahoo Pipes By Example, Pt 2”

  1. I have been messing with your created pipes but haven’t yet achieved the result I want. Would it be possible to modify the pipe to allow another RSS feed to ‘feed’ the pipe? The scenario I have in mind is a feed of bookmarks that I have tagged RSS. Delicious will easily allow you to subscribe to an RSS feed of tagged items, and then your pipe would parse that RSS feed for RSS feeds, fetch the items, weed out the dups, sort them, and possibly append a Source tag in the header. Nice and simple, right? I can’t seem to get it done but maybe someone else can.

    This would of course allow me to just use delicious to bookmark a feed and tag it RSS and have it appear automagically in my Mondo Feed.


    Thanks for a great article by the way.

  2. I’ve made a slight tweak for end-user convenience… Pipe 2b Ver 4 asks for the URL of the CSV feed list. That makes it more robust; you don’t have to edit the Pipe’s Fetch CSV module each time you want to use a different feed.

  3. Ryan: Here is Pipe 2b Ver 3. It takes an external list of feeds (as per the description above) and produces just one linked headline per feed.

    Note 1: One item per feed is the default. You can change that to, say, 3. Then run the Pipe and copy the output’s RSS URL if you want to use it elsewhere.

    Note 2: Some feeds will not work for some reason. For example, David’s native feed just doesn’t want to work in Pipes. It might not be RSS 2.0. If you have that problem with anything, what you have to do is run it through Feedburner’s Smart Feed feature then use the resulting feed. (You can add someone else’s feed into your Feedburner list of burned feeds.)

  4. Yeah, I unfortunately have to agree. It’s the first thing that I thought when Pipes was released in Feb 2007. In fact, it supports what cyberpunk author Bruce Sterling (I believe) said in a mid-1990s novel: that in the coming years, copyright would be very difficult to enforce/maintain. Though consider that the Internet was originally designed to be a vehicle of sharing, not commerce. (This is all a slice of the reason that I am moving towards offline work again, though even my intended career change is drastically affected by the Internet and copyright violation. I’m simply not relying long-term on making a living online.)

  5. I had a chat with a friend the other day about Yahoo Pipes and how easy it is to aggregate data and create an RSS feed. He thought that whilst Pipes had some good uses, e.g. the mapping mashup I mentioned here in an earlier post about Pipes, his worry was that it would make it much, much easier for everyone and anyone to steal content and publish it.

    What’s your opinion on this?

  6. Okay, so you want “reverse chronological”, which is what I did.

    Re Pipe 3b, sure. So you want to see just the headlines, with a link to the story. Very easy. I’ll try to get it done late tonight or tomorrow. (Hopefully going to the Toronto car show and catching Jumper as well. Unless it’s an emergency.)

  7. Raj, is there any way you could do a Pipe Version 3b that does exactly what Pipe Version 2b does except only output linked headlines?

  8. Well, temporal order could mean either oldest first or newest first. I was actually looking for newest first.

  9. After a bit of experimentation, I found that adding an input field to control the number of items per feed is actually easy. So here is Pipe 2b version 2.

    The default is 1, for the number of items to use per feed.

  10. If you have any questions about either Pipe, please ask – even if it’s a request for a customization.

  11. Pipe #2 is actually Pipe 2b as described above. It calls Pipe 2a, which does function as a standalone Pipe.

    This pairing of Pipes reads the same .csv file as Pipe #1, pulls the feed items from each feed in the list, and allows at most one item from each feed (the most recent item). The result is a reverse chronological list of the most recent post from each blog.

    This pipe can be modified to allow the end user to specify the number of items per feed (i.e., 3 instead of 1). However, there are some complications in how parameters are passed between two pipes, and my previous experiments had mixed success. It’s far simpler to just clone Pipe 2b and tweak the Truncate module.

  12. Pipe #1 is relatively simple, and basically a step forward from the Pipe in the last article of this series.

    The link for the CSV file that I’ve used is:

    The first line is used to specify a field name for use in the Pipes Fetch CSV module.

    I’ve sorted the items reverse chronologically because “chronologically” seems to make no sense.

    Watch the comments for Pipe #2.

Comments are closed.