Yahoo Pipes By Example, Part 1: Mashing Up Multiple Feeds

One of the most common uses of Yahoo Pipes is to mash up several RSS feeds and perform a few operations on them, such as sorting by date or filtering for keywords. This is a relatively easy Pipe to build, and you have many options for filtering and/or sorting, depending on your needs. Before you continue reading this tutorial, please read up on basic Yahoo Pipes functionality.

Sample Yahoo Pipe Details

In the sample Pipe for this tutorial, we’ll mash together the RSS feeds of two blogs, Performancing and Blog Herald, sort reverse chronologically by date, then truncate the resulting RSS “stream” at 20 items.

Basic Yahoo Pipe Building Process

Let’s go over a basic “feed mashup” example, then discuss variations. The most basic feed mashup is to simply supply two or more RSS feed URLs in a Yahoo Pipe. (For simplicity, let’s start with a static list of feeds.) But that’s not useful since you don’t know how items in the resulting “stream” are ordered. So let’s sort the mashed up feed in reverse chronological order (newest to oldest). The process is as follows:

  1. Supply two or more RSS feed URLs.
  2. Fetch the feeds.
  3. Combine the feeds together.
  4. Sort them in reverse chronological order.
  5. Truncate the mashed feed to 20 items.
  6. Output the resulting stream as an RSS feed.
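
For readers who think in code, the six steps above can be sketched outside of Pipes in a few lines of Python. This is only an illustration of the logic, not how Yahoo Pipes works internally. The two inline feeds, the function names `parse_items` and `mash_up`, and the item titles are all made up for the example. A real version would download each feed from its URL instead of using inline strings.

```python
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET

# Two tiny stand-in feeds (hypothetical content) so the sketch runs offline.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>A1</title><pubDate>Mon, 02 Jun 2008 10:00:00 GMT</pubDate></item>
<item><title>A2</title><pubDate>Wed, 04 Jun 2008 09:30:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>B1</title><pubDate>Tue, 03 Jun 2008 08:00:00 GMT</pubDate></item>
<item><title>B2</title><pubDate>Thu, 05 Jun 2008 12:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_items(rss_text):
    """Steps 1-2: turn one RSS 2.0 document into a list of item dicts."""
    root = ET.fromstring(rss_text)
    items = []
    for node in root.iter("item"):
        items.append({
            "title": node.findtext("title", default=""),
            # pubDate uses the RFC 822 date format, which email.utils can parse.
            "published": parsedate_to_datetime(node.findtext("pubDate")),
        })
    return items


def mash_up(feeds, limit=20):
    """Steps 3-5: combine all items, sort newest first, keep `limit` items."""
    combined = [item for feed in feeds for item in parse_items(feed)]
    combined.sort(key=lambda item: item["published"], reverse=True)
    return combined[:limit]


# Step 6: "output" the stream (here we simply print the titles).
for item in mash_up([FEED_A, FEED_B]):
    print(item["published"].date(), item["title"])
```

Note that the sort happens after the feeds are combined. Sorting each feed separately and then concatenating them would not give a correctly ordered stream.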

Yahoo Pipes Modules to Use

That is the most basic process, and to produce an actual Yahoo Pipe, we need to use only 4 modules, in the following order:

  1. Fetch Feed. Specify one or more feeds, grab their items, and combine them into a single feed.

    Notes:

    • This Pipes module is found under the Sources sub-menu.
    • Assume resulting item ordering is undefined.
    • In our example, shown below, we’ll use the Performancing and Blog Herald RSS feeds.
  2. Sort. Specify a sorting field and criteria. In this case, sort by Publication Date (item.pubDate) in descending order.

    Notes:

    • This module is found under the Operators sub-menu.
    • In the past, Pipes’ Sort module has not always worked consistently, but we are at its mercy.
  3. Truncate. Take the mashed stream and truncate it after X items.

    Notes:

    • This module is found under the Operators sub-menu.
    • We’re limiting the result to 20 items.
  4. Output. Output the mashed stream. This will be available from the standard Yahoo Pipes “run” interface. From there, you can retrieve the dynamic RSS feed URL for the output stream.

    Note:

    • This module is automatically displayed in a new Pipe, once you drag and drop any other module.

A screenshot of a sample Pipe is displayed below. Provided that you have a Yahoo Mail account, you can access the sample Pipe, clone a copy into your account, and tweak it to your heart’s content.

Basic Pipe Variations

You can apply a number of relatively simple variations to the sample Pipe above.

  1. Sort in chronological order (oldest article first).
  2. Filter by date. E.g., articles newer than a certain day.
  3. User-supplied date filter. Pick the maximum
  4. User-supplied truncation limit. So instead of hard-coding “20” as the number of items in the mashed feed, let the end user of the Pipe supply the value.
  5. User-supplied URLs. You are limited to a fixed number of URLs that can be specified.
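
To make the first few variations concrete, here is a rough Python sketch of the same logic with the hard-coded choices turned into parameters. It is an analogy for what user inputs do in a Pipe, not Pipes code. The function name `mash_up`, its parameter names, and the sample items are all assumptions made for this example.

```python
from datetime import datetime, timezone

# Hypothetical items, as if already fetched and combined from several feeds.
ITEMS = [
    {"title": "Newer post", "published": datetime(2008, 6, 3, tzinfo=timezone.utc)},
    {"title": "Old post", "published": datetime(2008, 6, 1, tzinfo=timezone.utc)},
    {"title": "Newest post", "published": datetime(2008, 6, 5, tzinfo=timezone.utc)},
]


def mash_up(items, newest_first=True, newer_than=None, limit=20):
    """Sort, optionally filter by date, and truncate a combined item list.

    newest_first=False gives chronological order (variation 1).
    newer_than keeps only items published after that moment (variation 2).
    limit replaces the hard-coded 20 with a caller-supplied value (variation 4).
    """
    if newer_than is not None:
        items = [item for item in items if item["published"] > newer_than]
    ordered = sorted(items, key=lambda item: item["published"],
                     reverse=newest_first)
    return ordered[:limit]


# Oldest article first.
print([item["title"] for item in mash_up(ITEMS, newest_first=False)])

# Only articles newer than 2 June 2008, newest first.
cutoff = datetime(2008, 6, 2, tzinfo=timezone.utc)
print([item["title"] for item in mash_up(ITEMS, newer_than=cutoff)])
```

Filtering before sorting keeps the sort working on as few items as possible, although for feeds of this size the order makes no practical difference.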

Advanced Pipe Variations

Here are some more advanced variations for our feed mashup Pipe.

  1. Remove extraneous data fields in input streams. If you look at the results of the example Pipe, you’ll see more than just the item description and title. One of the input feeds has extra links thanks to its Feedburner settings. If you don’t want them appearing in the result, you have to filter these fields out before the two feeds are mashed together.
  2. Truncate the number of items used on a per feed basis rather than on the entire mashed feed. E.g., the 5 most recent items for each feed.
  3. Use a dynamic external list of feed URLs. The best way to do this is to build Pipe #1 to process a single feed: sort reverse chronologically and truncate to X items (hard-coded or user-supplied). Then build Pipe #2 to read an external list of feed URLs and loop through each, supplying each stream to Pipe #1, then mashing up the results of all streams.
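
The two-Pipe design in the last variation maps onto two small functions. The Python sketch below shows that structure under some loud assumptions: the feed URLs, the field names in `KEEP_FIELDS`, the `feedburner_links` field, and the function names `process_feed` and `mash_feeds` are all hypothetical, and the feeds are plain in-memory lists rather than anything fetched from an external list.

```python
from datetime import datetime, timezone

# Fields we want to survive; anything else (e.g. extra links) is dropped.
KEEP_FIELDS = ("title", "description", "published")


def day(d):
    """Helper: a UTC datetime in June 2008 (sample data only)."""
    return datetime(2008, 6, d, tzinfo=timezone.utc)


# Hypothetical external list: feed URL -> items already fetched from it.
FEEDS = {
    "http://example.com/feed-a": [
        {"title": "A1", "published": day(1), "feedburner_links": "..."},
        {"title": "A2", "published": day(4), "feedburner_links": "..."},
        {"title": "A3", "published": day(6), "feedburner_links": "..."},
    ],
    "http://example.com/feed-b": [
        {"title": "B1", "published": day(2)},
        {"title": "B2", "published": day(5)},
    ],
}


def process_feed(items, per_feed_limit=5):
    """Stand-in for Pipe #1: clean, sort, and truncate a single feed."""
    cleaned = [{key: item[key] for key in KEEP_FIELDS if key in item}
               for item in items]
    cleaned.sort(key=lambda item: item["published"], reverse=True)
    return cleaned[:per_feed_limit]


def mash_feeds(feeds, per_feed_limit=5):
    """Stand-in for Pipe #2: loop over every feed, then merge the results."""
    merged = []
    for items in feeds.values():
        merged.extend(process_feed(items, per_feed_limit))
    merged.sort(key=lambda item: item["published"], reverse=True)
    return merged


for item in mash_feeds(FEEDS, per_feed_limit=2):
    print(item["published"].date(), item["title"])
```

With this structure the output holds at most (number of feeds × per-feed limit) items, and the unwanted fields are removed before the feeds are merged, as the first variation requires.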

Requests for Custom Pipe

Obviously, I cannot provide example Pipes for each variation listed above, but what I will do is two things:

  1. Take suggestions for an advanced Pipe as described above, create it, then either share it in the comments here or cover it in the next Yahoo Pipes By Example post. So I’ll build a free custom Pipe, provided it’s not too complex, can be done in a few hours, and is generic enough that it’ll be useful to someone other than yourself. (I can only work on it on weekends.)
  2. Progressively include other Pipes modules in harder examples.

So if there’s a Pipe you need and something like it is described in the Advanced section above, feel free to ask. Be specific about the details. I’ll be doing the next Yahoo Pipes By Example post next weekend (one per week).

15 thoughts on “Yahoo Pipes By Example, Part 1: Mashing Up Multiple Feeds”

  1. Paulms: You probably want to either parse the RSS feed onto your page via fetch_rss() (if using WordPress) or use a means to “HTML badge” it. For example, if you run your Pipes output, grab the dynamic RSS URL that’s generated, then “burn that URL” in Feedburner.com, you can then use Feedburner’s HTML badge to embed the content on your page. Feedburner uses a snippet of JavaScript, which gets embedded into your page. Now, when you browse the source of your page, you’ll only see the JavaScript snippet. But I believe when a search engine bot sees it, it’ll get rendered into HTML (not 100% sure about that, but previous experience leads me to believe that’s true). Thus if you have the right text content in the feed, your page will get SEO benefits.

  2. Just found this great blog after pulling my hair out all day.

    I’m trying to figure out the best way to include Yahoo Pipes output for SEO purposes. I’m not technical enough to quite comprehend the JavaScript/JSON method, but most search engines won’t recognize the content that way, will they?

    I’m trying to add Pipes output to other content on pages and am trying to understand the best way to include it on the page for SEO purposes.

  3. Hmm. Interesting. Well that’s good news then, but the Babelfish module is listed in their “Deprecated” menu. Not sure why it’s still accessible in new Pipes.

  4. Strange, Raj, because I used it on a new Pipe at the weekend. I did notice the translation module, but I have not played around with that one. I chose Babelfish as I am familiar with it.

  5. Ryan: Very easy. I’ll do that and share it here. (Both versions.)

    Darren: The Babelfish module has unfortunately been “deprecated”. You can continue to use it in older Pipes, but you cannot access it for new Pipes. They have a replacement, but I haven’t studied it enough.

  6. Within Pipes there’s a module for ‘babel fish’ where I managed to translate my RSS feed into Spanish. I can offer this out as a non-English RSS feed, which I think is great, but I am going to struggle to communicate with those that want to leave comments in Spanish.

    What I’d like is for the reader to be able to write comments in Spanish and then have those comments translated into English. This would make your blog multi-national and target the millions who speak Spanish and other languages.

    Anyone else think this is a good idea?

    Another thought about pipes, and at this moment I am not sure if this would be possible, but I’d like to be able to add in advertising into my RSS feed much easier than I can now. So if I did have a Spanish version of my site, I could easily add in advertising which suited the Spanish readers.

  7. No, I don’t need “top blogs” – I just want one of the following 2 things:

    1. The last 10 articles posted, in temporal order, from the set of all feeds in list X

    or…

    2. Max of one story per feed, temporally ordered, most recent stories from the set of all feeds in list X

    Option 2 is more difficult because it puts a limit of 1 on the total number of stories that can be extracted from any given feed.

  8. Ryan: It’s quite incredible what you can produce, powered by Pipes.

    I think I know what you wanted to do. In the spirit of this series, I will do it here for free, in the next post. But how did you want the feeds filtered/ manipulated? I.e., how many items per feed, any special treatment per feed, etc.?

    I’ve already built a Pipe that takes the URL of a web page, yanks out all the href links, then produces the list of corresponding feeds per link. That’s one way of producing the external list. Another way is to supply an RSS URL list in .CSV form (single data field per row), read it into the Pipe, truncate each feed to the X most recent items, then mash them all together. So if the external list has 10 feeds, we sort each feed reverse chron, truncate to, say, 5 items, and output 10×5 = 50 items maximum.

    If you need for this Pipe to pick out the top blogs in a niche, that’s more complex, and I hope to do that a few weeks down the line – depending on the state of Technorati’s API.

  9. Raj, I think you already know my need, re: a dynamic external list of feed URLs. I’d pay you to create the tool I’ve referenced in the past.

  10. Wow Raj. This is truly the first time I’ve actually gotten my mind around the power of Pipes. Thanks man!

  11. Darren: That’s great to know. Thanks for sharing. I never thought it through, so I wasn’t sure if Google Maps could be integrated. (I have a dormant travel blog and a city blog-in-waiting.) Also agree with you: if I start working on a Pipe, I often find I spend many hours playing with variations or checking out other people’s Pipes.

  12. Yep, I’ve not got a great deal done this weekend because of Yahoo Pipes – thanks! 😐

    I used Pipes with the ‘union’ operator to combine results from Yahoo Local for ‘hotels’ and ‘restaurants’ within 5 miles of New York. With this data I was able to export a KML file, which I uploaded to my server. I then went to Google Maps, entered the URL, and used the street view functionality along with the Yahoo Local results.

    Hey presto!

    This is the result.

    I’ve had a great idea how I can implement pipes into my travel blog..

  13. Thanks for this Raj! Although I dabbled with Pipes to create Travel in Papers, it was a while ago, but this is an excellent reference article for me to come back to. Kudos for your generosity mate 😉
