Ivan Kozik
d2f06d22a1
Add another IMDb ignore pattern
2015-07-29 07:33:45 +00:00
Ivan Kozik
9e5b2ef8fa
Add IMDb ignore patterns
/board/nest/ has everything, no need to grab other /board/ formats
2015-07-29 07:33:45 +00:00
Ivan Kozik
5f662492ab
Add more forum ignore patterns
2015-07-29 07:33:45 +00:00
Ivan Kozik
d99a7d48be
Add more phpBB ignore patterns
2015-07-29 07:33:45 +00:00
Ivan Kozik
4ba9cc3a78
Add phpBB patterns; make patterns stricter
2015-07-29 07:33:45 +00:00
Ivan Kozik
0c256a272a
Add twitter.com/intent/tweet; add blogspot TLDs
2015-07-29 07:33:45 +00:00
David Yip
054722c334
Ignore patterns: Lua pattern syntax -> regex syntax.
2015-07-29 07:33:45 +00:00
Ivan Kozik
cb5aa1f2ca
blogs ignore set: ignore http://www.tumblr.com/impixu
2015-07-29 07:33:45 +00:00
David Yip
fa54c01f56
Fix syntax error in forums ignore set.
2015-07-29 07:33:45 +00:00
David Yip
81a4e2b4b6
Also ignore registration, RSS, and some odd cronjob runner.
2015-07-29 07:33:45 +00:00
David Yip
b3d97ebb67
Start a forums ignore set.
These ignore patterns are derived from vBulletin; more work is needed to
derive a good set for e.g. phpBB, IPS, and IBB. I don't think we'll run
into a situation where one set of URLs is valid for one forum software
but not another, but if we do, you can expect this set to be split out
by software name.
2015-07-29 07:33:45 +00:00
David Yip
eed67549f1
Ignore another "open with reply form" LJ URL.
2015-07-29 07:33:45 +00:00
David Yip
e14dbf5261
Add patterns useful for archiving LiveJournal sites.
The rundown:
  livejournal%.com/ljcounter%?: LJ's hit counter thing
  %?replyto=%d+: reply-to links that just generate a reply box; useless
    for anonymous archival
  xiti%.com/hit%.xiti%?: another hit counter thing
2015-07-29 07:33:45 +00:00
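For illustration, the three LiveJournal patterns in the rundown above could be written in regex syntax (the form a later commit in this log switches to) roughly as follows. This is only a sketch: the names `LJ_IGNORE_PATTERNS` and `should_ignore` are hypothetical, the sample URLs are made up, and matching anywhere in the URL with `re.search` is an assumption about how the patterns are applied, not something this log confirms.

```python
import re

# Regex equivalents of the three Lua patterns from the commit message.
# Lua escapes with "%"; regex escapes with a backslash, and %d becomes \d.
LJ_IGNORE_PATTERNS = [
    r"livejournal\.com/ljcounter\?",  # LJ's hit counter
    r"\?replyto=\d+",                 # reply-to links that only open a reply box
    r"xiti\.com/hit\.xiti\?",         # another hit counter
]

_COMPILED = [re.compile(p) for p in LJ_IGNORE_PATTERNS]


def should_ignore(url: str) -> bool:
    """Return True if any ignore pattern matches anywhere in the URL.

    Assumption: patterns are unanchored substring matches.
    """
    return any(rx.search(url) for rx in _COMPILED)


if __name__ == "__main__":
    # Hypothetical URLs, for demonstration only.
    for url in (
        "http://www.livejournal.com/ljcounter?id=1",
        "http://example.livejournal.com/1234.html?replyto=5678",
        "http://example.livejournal.com/1234.html",
    ):
        print(url, should_ignore(url))
```

Escaping the literal `.` and `?` matters in both syntaxes; left unescaped, `.` matches any character and `?` makes the preceding character optional, so the pattern would ignore more URLs than intended.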
David Yip
1c9d0af35c
Remove _id from blogs ignore pattern. #40.
_id is now automatically calculated.
2015-07-29 07:33:45 +00:00
Ivan Kozik
694885b733
Also ignore http://r-login.wordpress.com/remote-login.php
2015-07-29 07:33:45 +00:00
Ivan Kozik
d981f64e3d
Ignore all ?share=
2015-07-29 07:33:45 +00:00
Ivan Kozik
580120eee7
Add showComment=; add /search/label/; fix . -> %.
2015-07-29 07:33:44 +00:00
David Yip
17a13ea5f5
Fix mistakenly escaped . in blogs ignore set.
2015-07-29 07:33:44 +00:00
David Yip
c1a52c3321
Fix unescaped ( in blogs ignore set.
2015-07-29 07:33:44 +00:00
David Yip
7e5ecf25ce
Add the blogs ignore set in #21.
2015-07-29 07:33:44 +00:00