| Author | Commit | Message | Date |
|---|---|---|---|
| Ivan Kozik | 76ba117d34 | Document DIR/max_content_length | 2015-08-10 13:15:20 +00:00 |
| Ivan Kozik | bf080c7cb4 | Implement --max-content-length=N for skipping large responses | 2015-08-10 13:12:34 +00:00 |
| Ivan Kozik | 8b1791475d | Remove unused import | 2015-08-10 13:00:37 +00:00 |
| Ivan Kozik | dfd1e8cd47 | singletumblr igset: explain | 2015-08-10 11:51:33 +00:00 |
| Ivan Kozik | 1cb9331939 | nosortedindex igset: add comment | 2015-08-10 11:49:45 +00:00 |
| Ivan Kozik | 33cc3040ed | mediawiki igset: add comments | 2015-08-10 11:48:53 +00:00 |
| Ivan Kozik | 40cae40dc5 | blogs igset: comment more | 2015-08-10 11:46:32 +00:00 |
| Ivan Kozik | 4e517e2994 | blogs igset: remove ignores that are already covered by 'global' | 2015-08-10 11:45:28 +00:00 |
| Ivan Kozik | 4d570d88bd | Add some comments to 'blogs' ignore set | 2015-08-10 11:44:20 +00:00 |
| Ivan Kozik | 6f03c5137d | Move pixel.redditmedia.com from reddit to global ignore set | 2015-08-10 11:42:03 +00:00 |
| Ivan Kozik | e304c60586 | Describe why various ignores are in the 'global' ignore set; add support for comments in ignore sets | 2015-08-10 11:41:16 +00:00 |
| Ivan Kozik | aa9b877843 | Don't crash with "error: unrecognized arguments" if cwd contains space. Closes #32. | 2015-08-02 03:51:37 +00:00 |
| Ivan Kozik | 9f071a706d | setup.py: specify minimum version for all dependencies. Specifically, this solves a problem where trollius is too old to have ensure_future. | 2015-08-02 01:47:03 +00:00 |
| Ivan Kozik | e55fa13004 | Make wpull write .cdx file (its impl does one .cdx covering all WARC files) | 2015-07-31 23:55:27 +00:00 |
| Ivan Kozik | e1bb1ec749 | README: tweak | 2015-07-31 22:47:58 +00:00 |
| Ivan Kozik | ed869864d4 | README: link to ArchiveBot | 2015-07-31 03:52:42 +00:00 |
| Ivan Kozik | 6cd50f9688 | README: tweak | 2015-07-31 03:50:39 +00:00 |
| Ivan Kozik | 412ea7791f | README: changes to ignores may take up to 3 seconds to apply | 2015-07-30 23:36:12 +00:00 |
| Ivan Kozik | 19f6971261 | dashboard: don't handle ctrl-f, alt-f, and other ctrl/alt- key combinations | 2015-07-29 23:04:20 +00:00 |
| Ivan Kozik | d72e4094d1 | Bump version | 2015-07-29 18:38:31 +00:00 |
| Ivan Kozik | 91ed7689a2 | Remove unused local | 2015-07-29 18:37:43 +00:00 |
| Ivan Kozik | 73d9c03e5e | Remove unused import | 2015-07-29 18:35:46 +00:00 |
| Ivan Kozik | a418beaff8 | README: tweak for the non-ArchiveBot audience | 2015-07-29 08:55:06 +00:00 |
| Ivan Kozik | 4f437ae2d0 | dashboard: remove mentions of ignore sets | 2015-07-29 08:46:35 +00:00 |
| Ivan Kozik | deb05d981d | README: link to correct ignore sets | 2015-07-29 08:45:25 +00:00 |
| Ivan Kozik | b806316cb1 | Use built-in ignore sets; don't crash if invalid ignore set is specified | 2015-07-29 08:36:36 +00:00 |
| Ivan Kozik | 22835a5ddc | igsets: global: don't exclude archive.org (that ignore made sense for ArchiveBot, which sent WARCs to IA) | 2015-07-29 08:24:42 +00:00 |
| Ivan Kozik | 51d3b1f794 | igsets: rm internetcentrum - it is long gone | 2015-07-29 07:45:58 +00:00 |
| Ivan Kozik | 5276fec1a9 | Convert JSON ignore sets to plain text to avoid the backslash doubling | 2015-07-29 07:44:12 +00:00 |
| Ivan Kozik | 68f5fc0dd2 | igsets: noonion: fix backslash | 2015-07-29 07:40:20 +00:00 |
| Ivan Kozik | 4c0f60cf06 | Don't try to install patched-wpull as it doesn't exist | 2015-07-29 07:37:52 +00:00 |
| Ivan Kozik | e53f4465e5 | db/ignore_patterns -> libgrabsite/ignore_sets | 2015-07-29 07:37:19 +00:00 |
| Ivan Kozik | 5e70cd4acc | Remove questionable /(.*)/(\1/){3,} ignore | 2015-07-29 07:33:51 +00:00 |
| David Yip | 62d1dbc0ad | Revert "Temporarily ignore voat.co, as it is not responding". This reverts commit f6fb34ad5b46cf730d5e07475b1c1fc73b3570a8. voat.co is back up. | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 04a6c18054 | Add .kr TLD for blogspot | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | a6f8d510c0 | Ignore simple.reddit.com | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 7c4c5e42cd | Ignore /.mobile on reddit | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | ed5fb60cce | Temporarily ignore voat.co, as it is not responding. Please revert this when it comes back up | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 8aae334c25 | Ignore another streaming site | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 4d2a496fbb | Ignore another share link | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 1d388ae969 | Ignore another streaming site | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | e014c48215 | Fix filename | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 45ec93cc1a | Add noonion ignore set to ignore .onion sites | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 7320865fd7 | Ignore another share link | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | f34ed18ce6 | Ignore Yahoo beacon | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 6cb0fa49f5 | Ignore more ?sort= pages on reddit | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | cec78653cb | Ignore another share link | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 9face53dba | Ignore Special:Diff and Special:MobileDiff | 2015-07-29 07:33:51 +00:00 |
| Ivan Kozik | 8110d41ac4 | Ignore another Google Analytics endpoint | 2015-07-29 07:33:51 +00:00 |
| Start | 3b86cb984e | minor improvements | 2015-07-29 07:33:51 +00:00 |