Simple hack to merge RSS feeds
Problem
Have I made any obvious mistakes?
Is this code clean?
Is there a better way to do this in Python?
# -*- coding: utf-8 -*-
import feedparser
from feedgen.feed import FeedGenerator
from random import shuffle
from feedformatter import Feed
import time
from datetime import datetime
tstart = datetime.now()
# Set the feed/channel level properties
# ----------------------------------- #
chanTitle = 'Feed Merger'
chanLink = 'http://server.com/feed'
chanAuthor = 'Bob Dylan'
chanDescription = 'Brain Food'
# ----------------------------------- #
# Apply feed/channel level properties
# ----------------------------------- #
feed = Feed()
feed.feed["title"] = chanTitle
feed.feed["link"] = chanLink
feed.feed["author"] = chanAuthor
feed.feed["description"] = chanDescription
# ----------------------------------- #
urls = list(set(open('urls.txt', 'r').readlines()))
shuffle(urls)
extract_entries = lambda url: feedparser.parse(url).entries
addEntries = lambda entries: [feed.items.append(entry) for entry in entries]
merg = lambda urls: [addEntries(extract_entries(url)) for url in urls]
shuffle(feed.entries)
save = lambda outfile: feed.format_rss2_file(outfile)
merg(urls)
save('feed.xml')
tend = datetime.now()
runtime = tend - tstart
print "Runtime > %s" % (runtime)
print "Merged > %d items" % (len(feed.entries))
Solution
These lines appear to be out of order:
extract_entries = lambda url: feedparser.parse(url).entries
addEntries = lambda entries: [feed.items.append(entry) for entry in entries]
merg = lambda urls: [addEntries(extract_entries(url)) for url in urls]
shuffle(feed.entries)
save = lambda outfile: feed.format_rss2_file(outfile)
merg(urls)
In particular, when that shuffle runs, the feed has no entries yet, so there's nothing to shuffle. I think you meant this:
extract_entries = lambda url: feedparser.parse(url).entries
addEntries = lambda entries: [feed.items.append(entry) for entry in entries]
merg = lambda urls: [addEntries(extract_entries(url)) for url in urls]
merg(urls)
shuffle(feed.entries)
More importantly, I think you're overusing lambdas for no benefit at all. The flow of the logic would be a lot easier to read if you used loops instead:
urls = list(set(open('urls.txt', 'r').readlines()))
shuffle(urls)
for url in urls:
    for entry in feedparser.parse(url).entries:
        feed.items.append(entry)
shuffle(feed.entries)
feed.format_rss2_file('feed.xml')
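If you want to go one step further, the loop version could be split into two small functions. This is only a sketch under some assumptions: `read_urls`, `merge_entries`, `fetch_entries` and `rng` are names I made up for illustration, and the fetching step is passed in as a callable so the merge logic can be tried without touching the network.

```python
import random


def read_urls(lines):
    """Return the unique, non-blank URLs from an iterable of text lines.

    Hypothetical helper: strips the trailing newlines that readlines()
    leaves in place and sorts the result so the output is deterministic.
    """
    return sorted({line.strip() for line in lines if line.strip()})


def merge_entries(urls, fetch_entries, rng=random):
    """Collect the entries of every URL, then shuffle the combined list.

    fetch_entries is any callable that maps a URL to a list of entries.
    rng is anything with a shuffle() method (the random module by default).
    """
    merged = []
    for url in urls:
        merged.extend(fetch_entries(url))
    # Shuffle only after all entries have been collected.
    rng.shuffle(merged)
    return merged


# Possible wiring with the code from the question (assumes feedparser
# is installed and `feed` is the Feed object created earlier):
#
#   with open('urls.txt') as f:
#       urls = read_urls(f)
#   feed.items.extend(
#       merge_entries(urls, lambda url: feedparser.parse(url).entries))
#   feed.format_rss2_file('feed.xml')
```

Since the merged list is shuffled at the end anyway, the order in which the URLs are visited does not affect the result, so the separate `shuffle(urls)` is not needed in this variant.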
Context
StackExchange Code Review Q#47598, answer score: 2