
Simple hack to merge RSS feeds


Problem

Have I made any obvious mistakes?

Is this code clean?

Is there a better way to do this in Python?

# -*- coding: utf-8 -*-
import feedparser
from feedgen.feed import FeedGenerator
from random import shuffle
from feedformatter import Feed
import time
from datetime import datetime

tstart = datetime.now()
# Set the feed/channel level properties
# ----------------------------------- #
chanTitle = 'Feed Merger'
chanLink = 'http://server.com/feed'
chanAuthor = 'Bob Dylan'
chanDescription = 'Brain Food'
# ----------------------------------- #
# Apply feed/channel level properties
# ----------------------------------- #
feed = Feed()
feed.feed["title"] = chanTitle
feed.feed["link"] = chanLink
feed.feed["author"] = chanAuthor
feed.feed["description"] = chanDescription
# ----------------------------------- #
urls = list(set(open('urls.txt', 'r').readlines()))
shuffle(urls)
extract_entries = lambda url: feedparser.parse(url).entries
addEntries = lambda entries: [feed.items.append(entry) for entry in entries]
merg = lambda urls: [addEntries(extract_entries(url)) for url in urls]
shuffle(feed.entries)
save = lambda outfile: feed.format_rss2_file(outfile)
merg(urls)
save('feed.xml')
tend = datetime.now()
runtime = tend - tstart
print "Runtime > %s" % (runtime)
print "Merged  > %d items" % (len(feed.entries))

Solution

These lines appear to be out of order:

extract_entries = lambda url: feedparser.parse(url).entries
addEntries = lambda entries: [feed.items.append(entry) for entry in entries]
merg = lambda urls: [addEntries(extract_entries(url)) for url in urls]
shuffle(feed.entries)
save = lambda outfile: feed.format_rss2_file(outfile)
merg(urls)


In particular, shuffle(feed.entries) runs before merg(urls) is called, so the feed has no entries yet and there is nothing to shuffle. You probably meant this order:

extract_entries = lambda url: feedparser.parse(url).entries
addEntries = lambda entries: [feed.items.append(entry) for entry in entries]
merg = lambda urls: [addEntries(extract_entries(url)) for url in urls]
merg(urls)
shuffle(feed.entries)
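
The ordering problem can be reproduced without any feed library. The following is a minimal sketch using a plain list; the names items and add_all are illustrative stand-ins, not part of the original script. It shows that defining a lambda does not run its body, so a shuffle placed before the call has nothing to work on.

```python
from random import shuffle

items = []

# Defining a lambda only creates a function; its body does not run yet.
add_all = lambda entries: [items.append(entry) for entry in entries]

shuffle(items)            # items is still empty here, so this does nothing
add_all(['a', 'b', 'c'])  # entries are appended only when the lambda is called
print(items)              # insertion order is preserved: ['a', 'b', 'c']
```

Moving the shuffle call after add_all is the only change needed to randomize the order.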


More importantly, the lambdas add nothing here: each one is assigned to a name and used in only one place, and two of them build list comprehensions purely for their side effects. The flow of the logic is a lot easier to read with plain loops:

urls = list(set(open('urls.txt', 'r').readlines()))
shuffle(urls)
for url in urls:
    for entry in feedparser.parse(url).entries:
        feed.items.append(entry)
shuffle(feed.entries)
feed.format_rss2_file('feed.xml')
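
Going one step further, the loop version could be split into two small functions. This is only a sketch under stated assumptions: read_urls, merge_entries, and the fetch_entries parameter are hypothetical names introduced here, and fetch_entries stands for any callable that returns the entries for a URL (for example, a small function returning feedparser.parse(url).entries). The sketch also closes the file explicitly and strips trailing newlines, which the readlines() call above leaves in place.

```python
import random


def read_urls(path):
    """Return the unique, non-empty lines of a text file, without newlines."""
    with open(path) as handle:
        return list({line.strip() for line in handle if line.strip()})


def merge_entries(urls, fetch_entries):
    """Collect the entries of every URL into one list and shuffle it.

    fetch_entries is a hypothetical callable: it takes a URL and returns
    a list of entries for that URL.
    """
    merged = []
    for url in urls:
        merged.extend(fetch_entries(url))
    random.shuffle(merged)
    return merged
```

The merged list can then be appended to feed.items in one more loop before calling feed.format_rss2_file('feed.xml'). Passing the fetching function in as a parameter keeps the merging logic testable without network access.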

Context

StackExchange Code Review Q#47598, answer score: 2
