---
date: "2003-12-06T09:13:55Z"
title: Retreat!
---

<p>
I give up.  I'm going to disable the parallel feed grabbing in <a
href='http://www.raggle.org/'>Raggle</a> so we can put out a new
version.  <a href='http://www.pekdon.net/'>Claes (pekdon)</a> suggested
I try rewriting it, but the implementation is already pretty simple.
Here's a high-level view of how the old non-parallel and new parallel
feed grabbing code works:
</p>

<p>
<b>Old Code</b>
<pre>
$config['feeds'].each { |feed|
  # download feed
}
</pre>
</p>


<p>
<b>New Code</b>
<pre>
threads = { }
$config['feeds'].each { |the_feed|
  threads[the_feed['url']] = Thread::new(the_feed) { |feed|
    # download feed
  }

  thread = threads[the_feed['url']]
  if thread &amp;&amp; thread.status == 'run' &amp;&amp;
     !$config['grab_in_parallel']
    # thread.join
  end
  until Thread::list.size &lt; ($config['max_threads'] || 10)
    $log.puts 'DEBUG: waiting for threads'
    sleep 5
  end
}
</pre>
</p>

<p>
Of course, looking at this code as I'm pasting it, it just occurred to me
that if you have two feeds with the same <acronym title='Uniform
Resource Locator'>URL</acronym>, you could have two threads trying to
muck with the feed at the same time.  Wonder if that's what's causing <a
href='http://www.ruby-lang.org/'>Ruby</a> to freak out.  By the way,
this is why I <em>really</em> dislike threads.  Not because I'm an
ignoramus, but because they encourage subtle bugs like this.  Anyway,
let's see if that fixes our random crash woes.
</p>
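
<p>
One way to rule out the duplicate-URL case is to collapse the feed list
by URL before any threads are started.  What follows is only a minimal
sketch, not Raggle's actual code: the <code>feeds</code> array is a
made-up stand-in for <code>$config['feeds']</code>, the helper names
<code>unique_feeds</code> and <code>grab_all</code> are hypothetical,
and the block passed to <code>grab_all</code> stands in for the
"download feed" step.
</p>

```ruby
# Hypothetical stand-in for $config['feeds']; note the duplicate URL.
feeds = [
  { 'url' => 'http://example.com/a.rss', 'title' => 'A' },
  { 'url' => 'http://example.com/b.rss', 'title' => 'B' },
  { 'url' => 'http://example.com/a.rss', 'title' => 'A again' },
]

# Keep only the first feed seen for each URL (hypothetical helper).
def unique_feeds(feeds)
  feeds.uniq { |feed| feed['url'] }
end

# Start one thread per unique URL, then wait for all of them.
# The block stands in for the "download feed" step.
def grab_all(feeds, &download)
  threads = unique_feeds(feeds).map do |the_feed|
    Thread.new(the_feed) { |feed| download.call(feed) }
  end
  # Thread#value joins the thread and returns the block's result.
  threads.map(&:value)
end

grabbed = grab_all(feeds) { |feed| feed['url'] }
```

<p>
With the duplicate dropped up front, no two threads are ever handed the
same URL, so if the crashes persist after a change like this, the cause
is probably somewhere else.
</p>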

<p>
Oh, and before anyone asks, yes, I realize that's not the best
way to implement the thread capping stuff.  And yes, I realize thread
pooling would be more efficient.  Right now I'm just trying to get it to
work reliably, then I'll focus on optimization.
</p>
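
<p>
For reference, the thread pooling mentioned above could look roughly
like the sketch below.  It assumes a reasonably recent Ruby (one where
<code>Queue#close</code> exists, 2.3 or later), and
<code>grab_with_pool</code> is a hypothetical name, not anything in
Raggle.  A fixed number of workers pull feeds from a shared queue, so
there is no polling of <code>Thread::list</code> and no sleeping.
</p>

```ruby
# Hypothetical helper: run the block once per feed using a fixed
# number of worker threads. The block stands in for "download feed".
def grab_with_pool(feeds, max_threads = 10, &download)
  queue = Queue.new
  feeds.each { |feed| queue << feed }
  queue.close   # once drained, a closed queue makes pop return nil

  results = Queue.new   # Queue is thread-safe, so workers can share it
  workers = Array.new(max_threads) do
    Thread.new do
      # Each worker keeps taking feeds until the queue is empty.
      while (feed = queue.pop)
        results << download.call(feed)
      end
    end
  end

  workers.each(&:join)
  # Results arrive in completion order, not in feed order.
  Array.new(results.size) { results.pop }
end

# Usage with made-up feeds and a cap of 4 threads.
sample_feeds = (1..25).map { |i| { 'url' => "http://example.com/#{i}.rss" } }
fetched = grab_with_pool(sample_feeds, 4) { |feed| feed['url'] }
```

<p>
The cap falls out of the design here: at most <code>max_threads</code>
downloads are ever in flight, because that is how many workers exist.
</p>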