218 Commits

Author SHA1 Message Date
Ivan Kozik
d3715fe888 Dup -> Dupe 2015-07-18 11:00:33 +00:00
Ivan Kozik
2222aafa74 Spaces -> tabs 2015-07-18 11:00:08 +00:00
Ivan Kozik
e42e33d82f My* -> Grabber* 2015-07-18 10:57:57 +00:00
Ivan Kozik
47940fd09e Explain how to stop a crawl 2015-07-18 10:51:17 +00:00
Ivan Kozik
dc7fe9ed06 Update install and usage instructions 2015-07-18 10:41:24 +00:00
Ivan Kozik
43d8a9594f Move everything and make grab-site installable with pip3 2015-07-18 10:39:04 +00:00
Ivan Kozik
1266cf6c97 Fix typo 2015-07-18 10:02:10 +00:00
Ivan Kozik
bcd29c1837 Mention duplicate page detection 2015-07-18 10:01:25 +00:00
Ivan Kozik
4aeb715c0f Mention ignore sets 2015-07-18 09:58:17 +00:00
Ivan Kozik
8e47415e83 Tweak README 2015-07-18 09:54:07 +00:00
Ivan Kozik
266cf34a23 aiohttp is required as well 2015-07-18 09:50:42 +00:00
Ivan Kozik
c907a9313b Increment runk too 2015-07-18 09:49:50 +00:00
Ivan Kozik
66d869a545 Report networking errors correctly 2015-07-18 09:49:03 +00:00
Ivan Kozik
8a8ea70a7d On ctrl-c, touch 'stop' file instead of letting wpull handle it, so that server gets notified of stop 2015-07-18 09:21:14 +00:00
Ivan Kozik
fc2e802999 Implement stopping via stop file 2015-07-18 08:53:38 +00:00
Ivan Kozik
7cbfe0f2ca Update igoff status; highlight lines on dashboard based on response code instead of is_warning/is_error 2015-07-18 08:41:18 +00:00
Ivan Kozik
8b9c283d0f Indicate start url for job in hello frame 2015-07-18 08:26:58 +00:00
Ivan Kozik
f4f445b7dd igoff by default 2015-07-18 08:23:56 +00:00
Ivan Kozik
0d8288e6fb Send more job_data stats 2015-07-18 08:21:34 +00:00
Ivan Kozik
8c9ce8c24b ignore_sets -> igsets 2015-07-18 06:23:24 +00:00
Ivan Kozik
3965233862 Tweak README 2015-07-18 06:22:47 +00:00
Ivan Kozik
02502c5260 Tweak README 2015-07-18 06:21:03 +00:00
Ivan Kozik
804cb0a1ee Document grab-site dashboard 2015-07-18 06:16:46 +00:00
Ivan Kozik
35ad90cecb Allow customizing grab-site server location 2015-07-18 05:57:42 +00:00
Ivan Kozik
14b59e17dc Broadcast ignores 2015-07-18 05:55:51 +00:00
Ivan Kozik
09418a905a Better asserts 2015-07-18 05:51:53 +00:00
Ivan Kozik
3cd416f477 Fix formatting 2015-07-18 05:51:25 +00:00
Ivan Kozik
19107cbe28 Fix formatting 2015-07-18 05:51:15 +00:00
Ivan Kozik
1f789d204c Fix just in case wpull stops titlecasing headers 2015-07-18 05:50:07 +00:00
Ivan Kozik
0526f8c96e Log URLs being fetched to real stdout 2015-07-18 05:48:33 +00:00
Ivan Kozik
758ec1301a Camelcase 2015-07-18 05:40:35 +00:00
Ivan Kozik
952eb4c33f print some messages only to real stdout 2015-07-18 05:38:59 +00:00
Ivan Kozik
787db7da55 Make stdout/stderr capture actually work 2015-07-18 05:35:57 +00:00
Ivan Kozik
f1100e7223 Try to send stdout/stderr to dashboard and fail at it 2015-07-18 05:24:54 +00:00
Ivan Kozik
93adc1ad48 Refactor job_data broadcasting 2015-07-18 04:34:24 +00:00
Ivan Kozik
937908ef52 Report bytes downloaded to dashboard 2015-07-18 04:31:54 +00:00
Ivan Kozik
dcbcb28852 Reported started_at to dashboard 2015-07-18 04:21:35 +00:00
Ivan Kozik
a8eb218fa2 Don't say '1 crawls' 2015-07-18 04:17:08 +00:00
Ivan Kozik
e804f7171e Show job URLs on dashboard 2015-07-18 04:14:50 +00:00
Ivan Kozik
933d293305 Guess WebSocket port based on HTTP port; read dashboard.html on each request 2015-07-18 04:10:15 +00:00
Ivan Kozik
f155cbc4ed Make the dashboard sort-of work 2015-07-18 03:49:24 +00:00
Ivan Kozik
a4ce13001d Link to AutobahnPython bug 2015-07-18 03:33:57 +00:00
Ivan Kozik
53fd04a29e Make reconnecting work 2015-07-18 03:17:27 +00:00
Ivan Kozik
18a192739b Make WebSocket client/server sort of work; rename ignore_sets to igsets 2015-07-18 02:11:18 +00:00
Ivan Kozik
db21e530e2 Generate a grab id and put in the dir name; add some temporary print debugging 2015-07-18 01:06:56 +00:00
Ivan Kozik
5621863b09 Refactor the WebSocket client in hooks 2015-07-17 23:57:46 +00:00
Ivan Kozik
cc43d39a8e Remove License note in README 2015-07-17 23:53:28 +00:00
Ivan Kozik
aa0d7b280f Fix LICENSE 2015-07-17 23:53:11 +00:00
Ivan Kozik
69e56e3d12 grab-site -> grab site server 2015-07-17 23:47:18 +00:00
Ivan Kozik
8353827a02 Update UA 2015-07-17 23:45:18 +00:00