The next step is to write the crawler. This method is similar to the importFeeds method, except that it will call fetchDictionaryFromURL and insertResources repeatedly, once for each available feed. It will also assign tags to the items in the feed based on the feed's tags.
    public static void crawlFeeds() {
        EOEditingContext ec = new EOEditingContext();
        ec.lock();
        NSArray feeds = fetchFeedsToCrawl(ec);
        NSLog.debug.appendln("Found "+feeds.count()+" to crawl...");
        RSSFeed feed;
        NSMutableDictionary rss;
        NSMutableArray snapshots;
        NSDictionary statusDict;
        NSArray inserted;

        // iterate through feeds and fetch items from each one
        for (int i=0, count=feeds.count(); i<count; i++) {
            feed = (RSSFeed)feeds.objectAtIndex(i);
            NSLog.debug.appendln("Crawling "+feed.name()+": "+feed.link());
            try {
                // import the dictionary from the feed's URL
                rss = WHImporter.fetchDictionaryFromURL(
                    feed.link(), "Contents/Resources/rss20MappingModel.xml");

                // extract and clean up the dictionaries
                snapshots = cleanSnapshots(rss.valueForKeyPath("channel.items"));

                // insert the resources into the database
                // insertResources returns a dictionary of inserted, updated, deleted items
                statusDict = WHImporter.insertResources(ec, snapshots, "RSSItem",
                    "Content/", null, WHImporter.IgnoreAndTag,
                    true, true, true, true, false);

                // get inserted items from the returned dictionary
                inserted = (NSArray)statusDict.objectForKey(WHImporter.InsertedKey);

                // add tags to the inserted items based on the feed's tags
                tagItemsForFeed(ec, inserted, feed);

                // don't fetch for another hour
                feed.setLastFetchDate(new NSTimestamp());
                ec.saveChanges();
            } catch (Exception e) {
                NSLog.debug.appendln("Exception importing "+feed.link()+" - "+e);
                feed.setLastFetchWasInvalid(true);
            }
        }
        ec.unlock();
        ec.dispose();
    }
The insertResources method returns a status dictionary containing arrays of the updated, inserted, removed, and ignored objects. The importFeeds method ignored this return value, but here the list of inserted items is extracted from the dictionary so they can be tagged.
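The idea of reading the inserted items back out of a status dictionary can be sketched in plain Java, without WebObjects or WireHose on the classpath. Everything in this sketch is a stand-in chosen for illustration: the class name, the key strings, and the simplified insertResources and tagInserted signatures are assumptions and are not the WHImporter API. Items are represented by their titles only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StatusDictionarySketch {
    // Hypothetical key names. The real keys are constants on WHImporter;
    // only WHImporter.InsertedKey appears in this guide.
    static final String INSERTED_KEY = "inserted";
    static final String UPDATED_KEY = "updated";

    // Stand-in for insertResources (assumed, simplified signature):
    // adds unseen titles to the store and reports what happened to each one.
    static Map<String, List<String>> insertResources(List<String> store,
                                                     List<String> incoming) {
        Map<String, List<String>> status = new HashMap<>();
        status.put(INSERTED_KEY, new ArrayList<>());
        status.put(UPDATED_KEY, new ArrayList<>());
        for (String title : incoming) {
            if (store.contains(title)) {
                status.get(UPDATED_KEY).add(title);
            } else {
                store.add(title);
                status.get(INSERTED_KEY).add(title);
            }
        }
        return status;
    }

    // Stand-in for tagItemsForFeed: copies the feed's tags onto
    // the newly inserted items only, leaving updated items alone.
    static Map<String, List<String>> tagInserted(Map<String, List<String>> status,
                                                 List<String> feedTags) {
        Map<String, List<String>> tagsByItem = new HashMap<>();
        for (String title : status.get(INSERTED_KEY)) {
            tagsByItem.put(title, new ArrayList<>(feedTags));
        }
        return tagsByItem;
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>(Arrays.asList("Old story"));
        Map<String, List<String>> status =
            insertResources(store, Arrays.asList("Old story", "New story"));
        Map<String, List<String>> tags =
            tagInserted(status, Arrays.asList("news", "tech"));
        System.out.println("Inserted: " + status.get(INSERTED_KEY));
        System.out.println("Tagged:   " + tags.keySet());
    }
}
```

Tagging only the inserted array means items that were already in the database keep whatever tags they had, which appears to be why crawlFeeds reads the status dictionary instead of tagging every snapshot.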
Copyright ©2000-2003 Gary Teter. All rights reserved. WireHose is a trademark of Gary Teter.