Subscribe: How to make a web service that consume about 500 RSS and save new items in Database ? - Stack Overflow
http://stackoverflow.com/feeds/question/4114929

How to make a web service that consume about 500 RSS and save new items in Database? - Stack Overflow



most recent 30 from stackoverflow.com



Updated: 2018-04-24T21:42:52Z

 



How to make a web service that consume about 500 RSS and save new items in Database?

2014-02-18T18:43:03Z

I have a project where I need to build a service to which we will add about 500 RSS feeds from different sites. The service should collect new items from these sources and save each item's title and URL in my SQL Server database. Could you help me find the best architecture, along with any code that would help?




Answer by Julien Genestoux for How to make a web service that consume about 500 RSS and save new items in Database?

2010-11-06T20:24:09Z

These suggestions are not specific to your stack (C#, ASP.NET), but I would definitely not recommend doing any of this inside the request-response cycle of your web app. The fetching must be done asynchronously; the web app can then serve results from the database that you populate with the feed entries.

  1. You will most likely need an architecture that polls each feed every X minutes. Whether you use a cron job or a daemon that runs continuously, you'll poll the feeds one after the other (or with some concurrency; the design is the same). Use HTTP conditional requests (ETag with If-None-Match, Last-Modified with If-Modified-Since) to avoid downloading feeds that haven't changed.

  2. Then you will need to parse the feeds themselves. You will very likely have to support different flavors of RSS and Atom, but most parsers support both.

  3. Finally, you'll have to store the entries and, more importantly, make sure before inserting them that you haven't already added them. Use the entries' id or guid where available, but you will probably need your own fallback too (links, a hash...) because many feeds do not provide these.
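
As a rough illustration of the three steps above, here is a minimal Python sketch (Python only for brevity; the same structure applies in C#). Everything in it is an assumption made for the example: the function names, the items table, and the use of SQLite as a stand-in for SQL Server. It handles only plain RSS 2.0 and Atom 1.0, takes the first link of an Atom entry, and leaves out the actual HTTP fetch.

```python
import hashlib
import sqlite3
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def conditional_headers(etag=None, last_modified=None):
    """Step 1: request headers that let the server reply 304 Not Modified."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def parse_entries(xml_text):
    """Step 2: yield (guid, title, link) tuples from RSS 2.0 or Atom 1.0."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):  # RSS 2.0 items carry no namespace
        yield (item.findtext("guid"), item.findtext("title"), item.findtext("link"))
    for entry in root.iter(ATOM + "entry"):  # Atom 1.0 entries
        link = entry.find(ATOM + "link")  # simplification: first link only
        href = link.get("href") if link is not None else None
        yield (entry.findtext(ATOM + "id"), entry.findtext(ATOM + "title"), href)


def entry_key(feed_url, guid, title, link):
    """Step 3: stable key; falls back to link, then title, if guid is missing."""
    identity = guid or link or title or ""
    return hashlib.sha256(f"{feed_url}\n{identity}".encode("utf-8")).hexdigest()


def open_store(path=":memory:"):
    """Hypothetical schema; the primary key is what enforces uniqueness."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(entry_key TEXT PRIMARY KEY, title TEXT, url TEXT)"
    )
    return conn


def save_new(conn, feed_url, xml_text):
    """Insert entries not seen before; return how many rows were added."""
    before = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    for guid, title, link in parse_entries(xml_text):
        conn.execute(
            "INSERT OR IGNORE INTO items (entry_key, title, url) VALUES (?, ?, ?)",
            (entry_key(feed_url, guid, title, link), title, link),
        )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0] - before
```

Letting the database's unique key reject duplicates keeps the check correct even if two pollers process the same feed at once. A real poller would also store each feed's ETag and Last-Modified values between runs and skip parsing on a 304 response.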

If you want to reduce the amount of polling that you'll have to do, while still keeping timely results, you'll have to implement PubSubHubbub for the feeds which support it.
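
To make this more concrete: a feed that supports PubSubHubbub advertises its hub with a link element whose rel is "hub", and a subscriber POSTs a form-encoded request to that hub. The Python sketch below shows both pieces under simplifying assumptions: the function names are made up for the example, only hub links inside the feed body are detected, and the request carries only the core parameters (some hub versions expect more, for example hub.verify in older drafts).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_LINK = "{http://www.w3.org/2005/Atom}link"


def find_hub(xml_text):
    """Return the href of the first Atom link with rel="hub", or None."""
    root = ET.fromstring(xml_text)
    for link in root.iter(ATOM_LINK):
        if link.get("rel") == "hub":
            return link.get("href")
    return None


def subscription_body(topic_url, callback_url):
    """Form-encoded body of a subscribe request to be POSTed to the hub."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed URL to be notified about
        "hub.callback": callback_url,  # public URL on your side that the hub calls
    })
```

Feeds for which find_hub returns None stay on the regular polling schedule; the others can be polled far less often once the subscription is confirmed.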

If you don't want to deal with any of the issues described above (polling in a timely manner, parsing content, diffing to keep entries unique...), I would recommend using Superfeedr, as it deals with all of these pain points.




Answer by Fredrik Mörk for How to make a web service that consume about 500 RSS and save new items in Database?

2010-11-06T20:18:51Z

I am not going to go into details about implementation or detailed architecture here (mostly from lack of time at this particular moment), but I will say this:

  • It's not the web service that should consume the RSS feeds; it should merely be responsible for spawning the work to do so asynchronously.
  • You should not use threads from the ThreadPool to do this, for two reasons. One is that the work can be assumed to be fairly time-consuming (the ThreadPool is recommended primarily for short-running tasks). The other, perhaps more important, is that ThreadPool threads are used to serve incoming web requests, and you don't want to compete with that.
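
The idea in these two points can be sketched language-neutrally as a dedicated, long-lived worker thread fed by a queue, so the request handler only enqueues work and returns. The Python sketch below is an analogy, not .NET code: FeedWorker and its handler callback are hypothetical names, and Python's threading model differs from the .NET ThreadPool that the answer refers to.

```python
import queue
import threading


class FeedWorker:
    """A dedicated long-lived thread that processes feed URLs from a queue."""

    def __init__(self, handler):
        self._handler = handler  # hypothetical callable: handler(url)
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, url):
        """Called from the request handler; returns immediately."""
        self._jobs.put(url)

    def _run(self):
        while True:
            url = self._jobs.get()
            try:
                if url is None:  # sentinel: stop the worker
                    return
                self._handler(url)
            except Exception as exc:  # one bad feed must not kill the worker
                print(f"failed to process {url}: {exc}")
            finally:
                self._jobs.task_done()

    def close(self):
        """Process everything already queued, then stop the thread."""
        self._jobs.put(None)
        self._thread.join()
```

For 500 feeds, several such workers could share one queue; the point is only that the slow fetching runs on threads reserved for it rather than on the threads that serve web requests.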