---
date: "2004-10-25T02:29:48Z"
title: New Raggle Engine in CVS
---
What will probably become the new Raggle engine is now in CVS, under the module name squaggle. Here's what I've got so far:
- [...] ETag and Last-Modified)
- [...] http_proxy env variable or the config hash; there's a stub for win32 proxy support at the moment)
- [...] Squaggle#feeds and Squaggle#feed_items methods
- [...] feed_attrs table, for elements I haven't implemented yet, and for Atom support as well. I have more to say about this one below
I spent a bunch of time in the last month reading through as many RSS specs as I could get my hands on. I read through the Atom spec as well.
The three biggest problems users have had with Raggle are speed, memory use, and supported feeds. I'm attempting to address the speed issue in a couple of ways: by deferring as much of the internal searching and sorting as possible to SQLite (aside: this also has the side benefit of dramatically simplifying the code, since all the funky array indexing, time conversions, ID hashing, etc. go away and become SQL queries :D). The memory use has also been addressed, with a caveat (see my note above about the end-user interfaces and memory requirements). Paradoxically, the Ncurses interface may end up using more memory than the web interface, because the Ncurses interface has more speed and caching requirements than the web interface. As for proper feed support, that one is a little bit trickier.
Supporting RSS properly is actually kind of a bitch, because there is no official standard (although there are plenty of specifications). Even worse, a lot of feeds play fast and loose with requirements, so strict RSS parsers (like the undocumented one included with Ruby 1.8, or Chad Fowler's Ruby/RSS module) are nice pieces of code, but useless for writing an RSS aggregator, in the same way that strict [...]
The way I dealt with this problem in previous versions of Raggle was to simply ignore the specs that were out there and look for specific elements in feeds. This has worked so well I'm going to keep doing it, with a twist. My goal with Squaggle is to keep Raggle aware of as much of the RSS spectrum as I can, but have the engine (Squaggle) only pay attention to what it absolutely has to. For example, if a feed has mixed RSS 0.92/1.0 elements, Raggle will parse it blindly and save what it can.
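
To make the "look for specific elements and save what you can" idea concrete, here is a rough sketch. It is not the actual Raggle or Squaggle code: the `lenient_items` helper, the `WANTED` field list, and the sample feed are all made up for illustration, and it assumes the REXML library that ships with Ruby.

```ruby
require 'rexml/document'

# Hypothetical list of element names worth keeping; anything else is skipped.
WANTED = %w[title link description date pubDate].freeze

# Hypothetical helper (not part of Raggle or Squaggle): collect whatever known
# fields each <item> has, without checking which RSS version the feed claims.
def lenient_items(xml)
  doc = REXML::Document.new(xml)
  items = []
  # Walk every element in the document, so it doesn't matter whether the
  # items sit inside <channel> (RSS 0.9x) or beside it (RSS 1.0).
  doc.each_recursive do |el|
    next unless el.name == 'item'
    attrs = {}
    el.each_element do |child|
      # child.name is the local name, so <dc:date> is stored under "date".
      next unless WANTED.include?(child.name)
      text = child.text.to_s.strip
      attrs[child.name] = text unless text.empty?
    end
    items << attrs unless attrs.empty?
  end
  items
end

# Made-up feed mixing plain RSS 0.92 elements with a Dublin Core date.
MIXED_FEED = <<~XML
  <rss version="0.92" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel>
      <title>Example</title>
      <item>
        <title>First post</title>
        <link>http://example.com/1</link>
        <dc:date>2004-10-25T02:29:48Z</dc:date>
        <unknownThing>ignored</unknownThing>
      </item>
      <item>
        <description>No title here</description>
      </item>
    </channel>
  </rss>
XML

lenient_items(MIXED_FEED).each { |item| p item }
```

The second item has no title or link at all, and a strict parser might reject the whole feed for that; the sketch just keeps the description and moves on.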
What I've got so far is available in CVS under the module squaggle. Play around with it and let me know what you think.