Ivan Kozik
4699e581fc
README: add install steps for Debian 8 (jessie)
2017-12-07 02:36:14 +00:00
Ivan Kozik
26655fb28c
README: switch from PPA-based python3.4 install to pyenv-based install; add install steps for Debian 9 and 10
2017-12-07 02:28:45 +00:00
Ivan Kozik
95e98ecefe
README: link to wpull v1.2.3
2017-11-22 18:34:50 +00:00
Ivan Kozik
b3c83f203c
README: add note about gs-server listening on all interfaces by default
2017-11-22 18:09:49 +00:00
Ivan Kozik
62d4575b0c
README: point to the newer ppa:deadsnakes/ppa PPA with Python 3.4.7
2017-11-22 17:57:36 +00:00
Ivan Kozik
2276adefe8
README: be less confusing about "start a new shell"
2017-11-22 17:25:31 +00:00
Ivan Kozik
fc09d22028
README: ask users to file issues
2017-11-19 04:11:57 +00:00
Ivan Kozik
c677c29aaf
global igset: ignore new facebook like.php links
...
e.g. https://www.facebook.com/v2.9/plugins/like.php?href=
2017-11-16 20:23:28 +00:00
Ivan Kozik
6119aef9ed
global igset: ignore pixel.wp.com tracking pixels
2017-11-09 16:52:54 +00:00
Ivan Kozik
297c5b1b8d
Patch dns.inet.is_multicast to not crash wpull
2017-11-09 11:33:14 +00:00
Ivan Kozik
90300f0f57
Document how to grab a website that requires login / cookies
2017-11-09 11:10:54 +00:00
Ivan Kozik
469974864e
Rename some unused bindings
2017-11-09 09:44:46 +00:00
Ivan Kozik
82a5fa6650
Use wpull v3 hooks so that custom hooks get more information passed into wait_time
2017-11-09 09:37:26 +00:00
Ivan Kozik
a8a50f523c
youtube igset: remove redundant ignore
2017-11-02 02:58:50 +00:00
Ivan Kozik
5442414d28
Remove googleplus ignore set and add accounts.google.com-related ignores to global igset
2017-11-02 02:31:30 +00:00
Ivan Kozik
7200878118
extra_docs/custom_hooks_sample.py: add a hook that queues additional URLs
2017-10-25 20:53:51 +00:00
Ivan Kozik
2b56a73aaa
dashboard: adjust code formatting
2017-10-24 18:15:44 +00:00
Ivan Kozik
87e4bd79a6
dashboard: enable context menu for all browsers (Safari 10+ has document.execCommand('copy').)
2017-10-24 18:13:48 +00:00
Ivan Kozik
d9f75f5ae3
README: update "Install on a non-Ubuntu distribution" steps to also use a virtualenv
2017-10-24 17:55:48 +00:00
Ivan Kozik
ad5c4d2449
README: OS X -> macOS and update instructions to use virtualenv
2017-10-24 17:42:33 +00:00
Ivan Kozik
d5698bc08a
README: fix TOC order
2017-10-24 17:25:33 +00:00
Ivan Kozik
112a3175c2
Bump version to 1.3.0
2017-10-24 17:24:57 +00:00
Ivan Kozik
d9b89f551b
README: rework instructions to not require activating the virtualenv
2017-10-24 17:24:27 +00:00
Ivan Kozik
be5db3f397
README: rework the Ubuntu 14.04 install steps to use virtualenv; assume grab-site and related executables are in PATH
2017-10-24 17:14:57 +00:00
Ivan Kozik
0ad6bdf89f
README: ancient non-LTS Ubuntu releases are not supported
2017-10-24 17:02:33 +00:00
Ivan Kozik
a954a0caca
README: "Python 3.5 or newer"
2017-10-24 16:45:40 +00:00
Ivan Kozik
cd3931b5fc
Add install instructions for Windows 10
2017-10-24 16:43:53 +00:00
Ivan Kozik
96e1f229dc
dashboard: adjust the font stacks; add Segoe UI for Windows
2017-10-24 16:31:23 +00:00
Ivan Kozik
6680cf7e50
README: add install steps for Ubuntu 17.10
2017-10-24 01:02:42 +00:00
Ivan Kozik
eefb6a3eba
global igset: ignore another unwanted medium.com URL
2017-09-30 13:36:04 +00:00
Ivan Kozik
0c2f160db6
global igset: ignore unwanted medium.com URLs
2017-09-30 13:29:52 +00:00
Ivan Kozik
a941fdfb9c
global igset: ignore more incorrectly extracted links on YouTube
...
e.g.
404 Not Found https://www.youtube.com/{{data}}
2017-09-22 06:05:02 +00:00
Ivan Kozik
7863f344a7
global igset: ignore incorrectly extracted YouTube links
...
e.g.
404 Not Found https://www.youtube.com/[[data.videoNavigationEndpoint]]
404 Not Found https://www.youtube.com/[[menuRendererData]]
404 Not Found https://www.youtube.com/[[videoReportActionResultRenderer_]]
404 Not Found https://www.youtube.com/[[speedyGData_.videoQualityPromoRenderer]]
2017-09-22 06:02:55 +00:00
Ivan Kozik
a6d5bd0227
global igset: also handle http://finance.google.com/finance
2017-09-18 01:02:41 +00:00
Ivan Kozik
e43bdcbf24
Bump Firefox UA
2017-08-29 19:03:16 +00:00
Ivan Kozik
f1b3501505
global igset: ignore another never-ending video stream
2017-08-29 19:02:09 +00:00
Ivan Kozik
a72272d5c3
Bump Firefox UA
2017-06-26 18:17:01 +00:00
Ivan Kozik
de32a1bfe9
Bump version
2017-06-26 18:14:05 +00:00
Ivan Kozik
0c43dc11f3
dashboard: opt out of DNS prefetching to avoid making DNS lookups on every host
...
Before this fix, if "Use a prediction service to load pages more quickly" was
enabled in Chrome, it would make DNS lookups on the hostname in every URL that
flew by in the dashboard.
https://www.chromium.org/developers/design-documents/dns-prefetching
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-DNS-Prefetch-Control
2017-06-26 18:12:06 +00:00
Ivan Kozik
2d7222cc5f
Remove completely ineffective protection against crawling sites on localhost
...
Any hostname can resolve to 127.0.0.1, 192.168.x.y, etc.
If you care about this protection, run grab-site in a container or use
iptables/ferm to block outbound traffic on loopback for the user that runs
grab-site.
2017-04-27 16:32:34 +00:00
Ivan Kozik
25a19d1dc3
Update install instructions for Ubuntu 17.04 and fold Ubuntu 16.10 instructions into 16.04 instructions
2017-04-09 08:18:59 +00:00
Ivan Kozik
ae400137d3
README: update Help section
2017-03-09 07:58:28 +00:00
Ivan Kozik
69d1dab393
Mention grab-site 'URL' instead of grab-site URL to avoid issues with ? or &
2017-02-26 21:42:53 +00:00
Ivan Kozik
d88dccac27
Fix link to Python installer for OS X (there is no 3.4.5 installer)
2017-02-17 23:33:11 +00:00
Ivan Kozik
bd3c896146
Fix .travis.yml
2017-02-08 20:45:21 +00:00
Ivan Kozik
1f2d915fef
Rename a metavar
2017-02-08 20:25:40 +00:00
Ivan Kozik
57e3455189
Bump Firefox UA
2017-02-08 20:24:07 +00:00
Ivan Kozik
94e486c7cf
Document --permanent-error-status-codes
2017-02-08 20:23:12 +00:00
Ivan Kozik
bf6382d724
Add --permanent-error-status-codes argument
...
https://github.com/ludios/grab-site/issues/97
2017-02-08 20:16:31 +00:00
Ivan Kozik
4fd740e815
Point to Python 3.4.5 instead of 3.4.3
2017-02-04 13:39:46 +00:00