Ivan Kozik | 210c3d03b5 | README: include suggestions from @ethus3h (thanks!) and wrap long lines | 2015-07-20 07:50:49 +00:00
Ivan Kozik | c7a272d7ba | Document how to fix your PATH for grab-site | 2015-07-20 07:25:06 +00:00
Ivan Kozik | 0e38441234 | Add OS X support | 2015-07-20 06:35:32 +00:00
Ivan Kozik | cd893cb1e3 | Keeping your crawling problems in perspective. Spanish scriptorium? (Madrid, Biblioteca de San Lorenzo de El Escorial, 14th century). Credit: https://medievalfragments.wordpress.com/2013/11/05/where-are-the-scriptoria/ | 2015-07-20 03:43:33 +00:00
Ivan Kozik | 6c6c3197e7 | Accept more exit codes from wpull as clean exit | 2015-07-19 22:11:01 +00:00
Ivan Kozik | a95ee28c8d | Tell user where the output files are | 2015-07-19 20:47:39 +00:00
Ivan Kozik | a5cc1d84c6 | Bump version | 2015-07-19 20:44:11 +00:00
Ivan Kozik | 7566de05e3 | Mark finished jobs as finished on dashboard | 2015-07-19 20:42:29 +00:00
Ivan Kozik | 8e2e1c5f58 | Camelcase | 2015-07-19 20:20:56 +00:00
Ivan Kozik | 3ffed7dfbb | Tell people to use GitHub issues | 2015-07-19 20:15:23 +00:00
Ivan Kozik | a66b970bfb | Enable faulthandler | 2015-07-18 21:32:27 +00:00
Ivan Kozik | 227052371e | Allow only grabbers to announce download/stdout/stderr/ignore | 2015-07-18 13:35:11 +00:00
Ivan Kozik | fe659e21a6 | Don't allow setting mode more than once | 2015-07-18 13:32:26 +00:00
Ivan Kozik | 6a866ad530 | Don't assume WebSocket clients are dashboards by default; announce user agents | 2015-07-18 13:29:19 +00:00
Ivan Kozik | 55e3507122 | Tweak README | 2015-07-18 12:09:51 +00:00
Ivan Kozik | 9f872f4fae | Recommend starting gs-server first | 2015-07-18 12:06:00 +00:00
Ivan Kozik | 877b170fde | Move some code around | 2015-07-18 11:57:34 +00:00
Ivan Kozik | 318fb3c03d | Set --max-redirect 8 like ArchiveBot | 2015-07-18 11:52:31 +00:00
Ivan Kozik | 63e8f1813b | Add --page-requisites-level= and --concurrency= options; use default concurrency of 2 | 2015-07-18 11:45:11 +00:00
Ivan Kozik | 5331f4c9fe | Require 400MB disk free | 2015-07-18 11:32:38 +00:00
Ivan Kozik | f848b28810 | Use --level inf by default; add --level option | 2015-07-18 11:32:28 +00:00
Ivan Kozik | 210baaa156 | Tweak README | 2015-07-18 11:25:00 +00:00
Ivan Kozik | b1d5f677b0 | Link to raw.githubusercontent.com for screenshot | 2015-07-18 11:22:28 +00:00
Ivan Kozik | bec8615d46 | Add dashboard screenshot | 2015-07-18 11:19:07 +00:00
Ivan Kozik | 5da054a837 | Report concurrency level | 2015-07-18 11:18:50 +00:00
Ivan Kozik | d5d2d49f5f | chmod +x | 2015-07-18 11:02:24 +00:00
Ivan Kozik | d3715fe888 | Dup -> Dupe | 2015-07-18 11:00:33 +00:00
Ivan Kozik | 2222aafa74 | Spaces -> tabs | 2015-07-18 11:00:08 +00:00
Ivan Kozik | e42e33d82f | My* -> Grabber* | 2015-07-18 10:57:57 +00:00
Ivan Kozik | 47940fd09e | Explain how to stop a crawl | 2015-07-18 10:51:17 +00:00
Ivan Kozik | dc7fe9ed06 | Update install and usage instructions | 2015-07-18 10:41:24 +00:00
Ivan Kozik | 43d8a9594f | Move everything and make grab-site installable with pip3 | 2015-07-18 10:39:04 +00:00
Ivan Kozik | 1266cf6c97 | Fix typo | 2015-07-18 10:02:10 +00:00
Ivan Kozik | bcd29c1837 | Mention duplicate page detection | 2015-07-18 10:01:25 +00:00
Ivan Kozik | 4aeb715c0f | Mention ignore sets | 2015-07-18 09:58:17 +00:00
Ivan Kozik | 8e47415e83 | Tweak README | 2015-07-18 09:54:07 +00:00
Ivan Kozik | 266cf34a23 | aiohttp is required as well | 2015-07-18 09:50:42 +00:00
Ivan Kozik | c907a9313b | Increment runk too | 2015-07-18 09:49:50 +00:00
Ivan Kozik | 66d869a545 | Report networking errors correctly | 2015-07-18 09:49:03 +00:00
Ivan Kozik | 8a8ea70a7d | On ctrl-c, touch 'stop' file instead of letting wpull handle it, so that server gets notified of stop | 2015-07-18 09:21:14 +00:00
Ivan Kozik | fc2e802999 | Implement stopping via stop file | 2015-07-18 08:53:38 +00:00
Ivan Kozik | 7cbfe0f2ca | Update igoff status; highlight lines on dashboard based on response code instead of is_warning/is_error | 2015-07-18 08:41:18 +00:00
Ivan Kozik | 8b9c283d0f | Indicate start url for job in hello frame | 2015-07-18 08:26:58 +00:00
Ivan Kozik | f4f445b7dd | igoff by default | 2015-07-18 08:23:56 +00:00
Ivan Kozik | 0d8288e6fb | Send more job_data stats | 2015-07-18 08:21:34 +00:00
Ivan Kozik | 8c9ce8c24b | ignore_sets -> igsets | 2015-07-18 06:23:24 +00:00
Ivan Kozik | 3965233862 | Tweak README | 2015-07-18 06:22:47 +00:00
Ivan Kozik | 02502c5260 | Tweak README | 2015-07-18 06:21:03 +00:00
Ivan Kozik | 804cb0a1ee | Document grab-site dashboard | 2015-07-18 06:16:46 +00:00
Ivan Kozik | 35ad90cecb | Allow customizing grab-site server location | 2015-07-18 05:57:42 +00:00