170 Commits

Author SHA1 Message Date
Ivan Kozik
659b25481e Set each timeout individually and use a session-timeout of two days
(we want to avoid hanging crawls forever, but we don't want to prevent the
downloading of large files from very slow hosts.)
2016-08-03 10:23:24 +00:00
Ivan Kozik
6685bf51a8 global igset: fix addtoany.com ignore and use https?:// for all ignores 2016-08-01 22:08:42 +00:00
Ivan Kozik
63fdb9d5c6 Lock html5lib version to work around https://github.com/chfoo/wpull/issues/332 2016-07-15 05:38:38 +00:00
Ivan Kozik
32b68e9342 dashboard: fix '?' shortcut key 2016-07-13 01:07:25 +00:00
Ivan Kozik
ca7bc71045 README: Add warning about tmux 2.1 2016-06-21 06:42:46 +00:00
Ivan Kozik
e7fcec6f85 global igset: make libsyn ignore libsyn-specific 2016-06-21 06:39:15 +00:00
Ivan Kozik
76f8b2cf48 Lock wpull dependency to 1.2.3 for now 2016-06-12 08:47:36 +00:00
Ivan Kozik
c08f229e1d global igset: Add facebook login.php 2016-06-03 01:02:36 +00:00
Ivan Kozik
450a5c394f Bump Firefox UA 2016-06-01 02:16:36 +00:00
Ivan Kozik
febee9c85e global igset: Add /%3Ca%20href= pattern 2016-05-29 14:00:15 +00:00
Ivan Kozik
fb6e01caa7 global igset: Add /%20https?:/ pattern 2016-05-27 14:03:28 +00:00
Ivan Kozik
aa366bbb27 Update grab-site URL in setup.py and dashboard 2016-05-27 13:53:35 +00:00
Ivan Kozik
842fab4b23 Stop listening on legacy ws port 29001 2016-05-22 11:06:38 +00:00
Ivan Kozik
f0bb696dc8 Actually install the favicon.ico 2016-05-22 11:01:53 +00:00
Ivan Kozik
b3c75b0ffb dashboard: Add a favicon 2016-05-22 10:59:32 +00:00
Ivan Kozik
7df1761bf0 dashboard: Allow for another digit in the MB stat 2016-05-22 10:22:11 +00:00
Ivan Kozik
8d93776742 dashboard: Align the req/s stat properly 2016-05-22 10:18:20 +00:00
Ivan Kozik
38877106ef Don't raise an exception if client lacks User-Agent 2016-05-03 07:16:48 +00:00
Ivan Kozik
fe2530e667 global igset: ignore amp%3Bamp%3Bamp%3B loops 2016-04-22 03:55:09 +00:00
Ivan Kozik
01ac84da06 global igset: tumblr serves 16px avatars on https now as well 2016-04-04 18:38:59 +00:00
Ivan Kozik
ffecfcabda global igset: Ignore instapaper share links 2016-03-30 17:11:03 +00:00
Ivan Kozik
316db6eec4 grab-site 0.11 2016-02-25 01:08:17 +00:00
Ivan Kozik
bfb866dfed s/started/listening/ 2016-02-25 00:32:42 +00:00
Ivan Kozik
fcc3a7f904 Listen on port 29001 as well until everyone's old port-29001 crawls finish 2016-02-25 00:20:23 +00:00
Ivan Kozik
be8418f52c Don't allow anyone to frame/iframe us 2016-02-25 00:01:27 +00:00
Ivan Kozik
0f5342b9c0 Factor out a sendPage 2016-02-24 23:58:43 +00:00
Ivan Kozik
25b3c9c076 Read bytes from file and send the correct Content-Length 2016-02-24 23:52:37 +00:00
Ivan Kozik
7486daa33f Use \r\n instead of \x0d\x0a 2016-02-24 23:47:03 +00:00
Ivan Kozik
7024f1e232 Fix comment 2016-02-24 23:46:21 +00:00
Ivan Kozik
c20a0f76ba Merge branch 'pr_76' into ws-http-same-port 2016-02-24 23:43:23 +00:00
12As
5da8e85d43 Update server.py
fixed errors
2016-02-24 16:44:13 -06:00
12As
e7e3030d4e Update server.py based on comments in PR #76
Changed server to handle paths, added a send404 method and fixed letter case on variables
2016-02-24 16:03:34 -06:00
12As
3f0c433668 Update dashboard.html, per comments in PR. 2016-02-24 15:22:31 -06:00
12As
4a51ee1c83 Create 404 Not Found page 2016-02-24 15:10:33 -06:00
Ivan Kozik
506a7604ef Rename --which-wpull-args-full to --which-wpull-command 2016-02-21 04:49:53 +00:00
Ivan Kozik
5805e4c155 Implement --which-wpull-args-partial and --which-wpull-args-full for figuring out which wpull arguments grab-site would use, without actually starting wpull 2016-02-21 04:33:14 +00:00
Ivan Kozik
bda4d8cf6d Pass maybe_log_ignore and print_to_terminal as globals to custom_hooks.py as well 2016-02-21 00:53:03 +00:00
Ivan Kozik
c37b32bd1c Implement --custom-hooks so that users can modify wpull_hook 2016-02-21 00:23:18 +00:00
12As
dec4150969 Change default port in wpull_hooks.py 2016-02-20 15:25:20 -06:00
12As
492625a09d Change default port in dashboard.html 2016-02-20 15:23:31 -06:00
12As
53e079228d Make gs-server use single port
Remove the need for gs-server to require multiple ports
2016-02-20 15:20:53 -06:00
Ivan Kozik
292682a48f Bump version 2016-02-16 17:25:11 +00:00
Daniel Oaks
95012d1e0c gs-server: Use env instead of py3 directly, makes virtualenvs nicer 2016-02-16 17:24:20 +00:00
Ivan Kozik
7c1afbefe0 Use wpull 1.2.3 2016-02-05 19:15:25 +00:00
Ivan Kozik
ef5137ae86 Update UA 2016-02-01 16:19:49 +00:00
Ivan Kozik
7ec2f90534 global igset: Ignore /CSI/CSI/ loops on blogspot 2016-01-12 22:44:57 +00:00
Ivan Kozik
0214558d5e global igset: ignore bogus /search/label/CSI/ links on blogspot 2016-01-12 03:20:44 +00:00
Ivan Kozik
3b9f8c1a4c global igset: ignore /CaptchaImage.axd 2016-01-09 21:46:35 +00:00
Ivan Kozik
dff87eba2f global igset: also ignore www.digg.com/submit 2016-01-09 02:19:18 +00:00
Ivan Kozik
01bf6d527b Bump version 2016-01-05 01:08:14 +00:00