77 Commits

Author SHA1 Message Date
Ivan Kozik db212f6716 Reorder options in README 2015-08-12 05:39:16 +00:00
Ivan Kozik 668c03d5d2 Implement -i / --input-file, supporting both local input files and URLs 2015-08-12 05:24:09 +00:00
Ivan Kozik 37a6cb655c README: tweak 2015-08-10 13:40:52 +00:00
Ivan Kozik b7743e780a Implement --ua= for setting the User-Agent 2015-08-10 13:38:00 +00:00
Ivan Kozik ee4dbe162e Implement --igon / --igoff 2015-08-10 13:23:43 +00:00
Ivan Kozik 76ba117d34 Document DIR/max_content_length 2015-08-10 13:15:20 +00:00
Ivan Kozik bf080c7cb4 Implement --max-content-length=N for skipping large responses 2015-08-10 13:12:34 +00:00
Ivan Kozik e1bb1ec749 README: tweak 2015-07-31 22:47:58 +00:00
Ivan Kozik ed869864d4 README: link to ArchiveBot 2015-07-31 03:52:42 +00:00
Ivan Kozik 6cd50f9688 README: tweak 2015-07-31 03:50:39 +00:00
Ivan Kozik 412ea7791f README: changes to ignores may take up to 3 seconds to apply 2015-07-30 23:36:12 +00:00
Ivan Kozik a418beaff8 README: tweak for the non-ArchiveBot audience 2015-07-29 08:55:06 +00:00
Ivan Kozik deb05d981d README: link to correct ignore sets 2015-07-29 08:45:25 +00:00
Ivan Kozik e6f830764e Allow changing concurrency using DIR/concurrency file 2015-07-28 14:21:28 +00:00
Ivan Kozik 1198c88f2a Document --delay in README 2015-07-28 14:01:28 +00:00
Ivan Kozik 7cf8db39d3 Mention pipe to sort | less -S 2015-07-28 11:40:14 +00:00
Ivan Kozik 0dc440ffd8 Tweak README 2015-07-28 11:34:58 +00:00
Ivan Kozik 975f328c95 Document gs-dump-urls 2015-07-28 11:33:51 +00:00
Ivan Kozik dbe1deb9f0 Clarify ignore sets 2015-07-27 13:28:49 +00:00
Ivan Kozik 41f7683d98 +1 is OK 2015-07-27 08:52:48 +00:00
Ivan Kozik 015df2a0df Link yipdw 2015-07-27 08:06:33 +00:00
Ivan Kozik a89ef4b22b README: add Thanks and P.S. 2015-07-27 07:59:57 +00:00
Ivan Kozik 3b5f8b4be3 Clarify --concurrency 2015-07-27 07:38:06 +00:00
Ivan Kozik b7c2f1d1bd Add --sitemaps/--no-sitemaps 2015-07-27 06:55:20 +00:00
Ivan Kozik 2e7d928614 Update README 2015-07-27 06:50:48 +00:00
Ivan Kozik 8d2acd669a README: minor tweaks 2015-07-20 09:53:13 +00:00
Ivan Kozik a7f2ee7684 Document webarchiveplayer for viewing your WARCs 2015-07-20 09:47:22 +00:00
Ivan Kozik 5e85e00201 README: document --concurrency= 2015-07-20 09:30:51 +00:00
Ivan Kozik 08933f60e2 Clarify ?host= dashboard option 2015-07-20 08:50:47 +00:00
Ivan Kozik 3f78e5f4bf README: use an <h3> 2015-07-20 08:30:57 +00:00
Ivan Kozik 58b560257a README: improve docs for options 2015-07-20 08:29:37 +00:00
Ivan Kozik 9af02f122b Unbreak README 2015-07-20 08:25:33 +00:00
Ivan Kozik 1fce3af4a0 Add --1 option for turning off recursion; document options 2015-07-20 08:23:35 +00:00
Ivan Kozik e83375382d README: there are control files in DIR too 2015-07-20 08:04:14 +00:00
Ivan Kozik 210c3d03b5 README: include suggestions from @ethus3h (thanks!) and wrap long lines 2015-07-20 07:50:49 +00:00
Ivan Kozik c7a272d7ba Document how to fix your PATH for grab-site 2015-07-20 07:25:06 +00:00
Ivan Kozik 0e38441234 Add OS X support 2015-07-20 06:35:32 +00:00
Ivan Kozik 3ffed7dfbb Tell people to use GitHub issues 2015-07-19 20:15:23 +00:00
Ivan Kozik 55e3507122 Tweak README 2015-07-18 12:09:51 +00:00
Ivan Kozik 9f872f4fae Recommend starting gs-server first 2015-07-18 12:06:00 +00:00
Ivan Kozik 210baaa156 Tweak README 2015-07-18 11:25:00 +00:00
Ivan Kozik b1d5f677b0 Link to raw.githubusercontent.com for screenshot 2015-07-18 11:22:28 +00:00
Ivan Kozik bec8615d46 Add dashboard screenshot 2015-07-18 11:19:07 +00:00
Ivan Kozik 47940fd09e Explain how to stop a crawl 2015-07-18 10:51:17 +00:00
Ivan Kozik dc7fe9ed06 Update install and usage instructions 2015-07-18 10:41:24 +00:00
Ivan Kozik 1266cf6c97 Fix typo 2015-07-18 10:02:10 +00:00
Ivan Kozik bcd29c1837 Mention duplicate page detection 2015-07-18 10:01:25 +00:00
Ivan Kozik 4aeb715c0f Mention ignore sets 2015-07-18 09:58:17 +00:00
Ivan Kozik 8e47415e83 Tweak README 2015-07-18 09:54:07 +00:00
Ivan Kozik 266cf34a23 aiohttp is required as well 2015-07-18 09:50:42 +00:00