744 Commits

Author SHA1 Message Date
Ivan Kozik
17464e7b74 README: Tweak 2015-12-12 18:35:20 +00:00
Ivan Kozik
f8fece9ebb Set NCR=1 cookie for .blogspot.com to avoid getting redirected 2015-12-12 18:29:53 +00:00
Ivan Kozik
7975566add README: Add travis link and shields.io button 2015-12-12 17:58:52 +00:00
Ivan Kozik
0e5f8c8b83 .travis.yml: Test a few more things 2015-12-12 17:50:09 +00:00
Ivan Kozik
9c3c3c3afd .travis.yml: Remove osx until https://github.com/travis-ci/travis-ci/issues/2312 is fixed 2015-12-12 17:46:31 +00:00
Ivan Kozik
e023105c5e .travis.yml: pip3 install verbosely 2015-12-12 17:41:23 +00:00
Ivan Kozik
bdf1684834 .travis.yml: Fix script 2015-12-12 17:40:48 +00:00
Ivan Kozik
37ddc8c6b8 .travis.yml: remove --user because we're in a virtualenv 2015-12-12 17:39:18 +00:00
Ivan Kozik
2f121a864b Add a .travis.yml 2015-12-12 17:36:35 +00:00
Ivan Kozik
86e92d684c Send an over18=1 cookie to reddit.com to avoid the age gate on many subreddits 2015-12-12 16:49:42 +00:00
Ivan Kozik
6a647f637e If using Python 3.4.0, depend on an older version of aiohttp that works on Python 3.4.0 2015-12-12 10:08:02 +00:00
Ivan Kozik
d96809ea78 README: Tweak 2015-12-11 09:10:07 +00:00
Ivan Kozik
a9a06127ea README: More subreddit crawling 2015-12-11 09:03:02 +00:00
Ivan Kozik
01a72b910a README: Tweak 2015-12-11 08:56:27 +00:00
Ivan Kozik
a2f1192b92 README: Document how to archive websites whose domains have just expired 2015-12-11 08:54:09 +00:00
Ivan Kozik
0363f825d5 README: Document how to archive subreddits 2015-12-11 08:48:18 +00:00
Ivan Kozik
52895892d5 README: Tweak 2015-12-11 08:44:23 +00:00
Ivan Kozik
e82937a9f6 README: Tweak 2015-12-11 08:43:30 +00:00
Ivan Kozik
6d3f0f901d README: Fix headings 2015-12-11 08:40:30 +00:00
Ivan Kozik
140efe082a README: Write about how to archive specific websites 2015-12-11 08:39:36 +00:00
Ivan Kozik
b000c1a759 README: Tweak 2015-12-06 06:00:04 +00:00
Ivan Kozik
8a6eaea16c Bump Firefox version in UA string 2015-12-04 11:45:28 +00:00
Ivan Kozik
d80bde8c8c Add warning about warcprox 2015-11-30 06:31:21 +00:00
Ivan Kozik
58a2711058 Use Roboto font if installed 2015-11-30 05:16:29 +00:00
Ivan Kozik
e72c5fc3a7 Don't crash if psutil is not available on non-Windows OS (it is no longer installed by wpull 1.2.2) 2015-11-21 19:54:13 +00:00
Ivan Kozik
8c306aed50 Don't use --debug-manhole on Windows to avoid a crash
  File "C:\Python34\lib\site-packages\click\core.py", line 680, in main
    rv = self.invoke(ctx)
  File "C:\Python34\lib\site-packages\click\core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Python34\lib\site-packages\click\core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "C:\Python34\lib\site-packages\libgrabsite\main.py", line 276, in main
    wpull.__main__.main()
  File "C:\Python34\lib\site-packages\wpull\__main__.py", line 38, in main
    manhole.install()
  File "C:\Python34\lib\site-packages\manhole.py", line 565, in install
    _MANHOLE.configure(**kwargs)  # Threads might be started here
  File "C:\Python34\lib\site-packages\manhole.py", line 412, in configure
    self.patch_os_fork_functions()
  File "C:\Python34\lib\site-packages\manhole.py", line 505, in patch_os_fork_functions
    self.original_os_fork, os.fork = os.fork, self.patched_fork
AttributeError: 'module' object has no attribute 'fork'
Exception in thread Manhole:
Traceback (most recent call last):
  File "C:\Python34\lib\threading.py", line 920, in _bootstrap_inner
    self.run()
  File "C:\Python34\lib\site-packages\manhole.py", line 192, in run
    sock = self.get_socket()
  File "C:\Python34\lib\site-packages\manhole.py", line 445, in get_socket
    sock = _ORIGINAL_SOCKET(socket.AF_UNIX, socket.SOCK_STREAM)
AttributeError: 'module' object has no attribute 'AF_UNIX'
2015-10-28 16:31:39 +00:00
Ivan Kozik
ca5c4611fa Don't crash on lack of SIGINT support on Windows 2015-10-28 16:29:24 +00:00
Ivan Kozik
7a0c1b73bc dupes: Also catch lmdb.Error for Windows, which complains about lacking disk space 2015-10-28 16:24:28 +00:00
Ivan Kozik
e6ebbd7b51 Don't try to use the unavailable --monitor- options on Windows 2015-10-28 16:19:11 +00:00
Ivan Kozik
ec9f7bdb43 setup.py: if GRAB_SITE_NO_CCHARDET env var set, don't require cchardet; wpull will fall back on chardet 2015-10-28 16:11:23 +00:00
Ivan Kozik
f00ab84949 README: Tweak 2015-10-25 02:23:23 +00:00
Ivan Kozik
8dc43fe73c Clarify upgrade instructions 2015-10-25 02:21:40 +00:00
Ivan Kozik
1bb8bcc4d8 global igset: also ignore recaptcha /mailhide/d links 2015-10-23 00:41:41 +00:00
Ivan Kozik
40ca80638d global igset: Ignore /impixu on a new tumblr domain 2015-10-22 15:10:58 +00:00
Ivan Kozik
6ac6bf69d4 Add Ubuntu 15.10 install instructions 2015-10-21 22:44:14 +00:00
Ivan Kozik
4487c43c83 Use wpull>=1.2.2 2015-10-21 22:38:47 +00:00
Ivan Kozik
6da093a817 README: Update TOC 2015-10-20 10:50:41 +00:00
Ivan Kozik
3dfb9dc757 README: Remove instructions for grabbing a site that requires a cookie because the --header solution results in the cookie being sent to all domains instead of just the intended domain 2015-10-20 10:50:18 +00:00
Ivan Kozik
77248eb47c README: Be more specific about which Python 3 must be installed 2015-10-18 04:38:38 +00:00
Ivan Kozik
8101d4b04c README: OS X install also works on 10.11 2015-10-18 04:36:21 +00:00
Ivan Kozik
9f3fe3f4cb Add a table of contents 2015-10-12 08:24:26 +00:00
Ivan Kozik
74c385332a Document zless and how to grab a site that requires a cookie 2015-10-11 06:35:39 +00:00
Ivan Kozik
1957154321 Mention --no-skip-getaddrinfo 2015-10-07 08:13:19 +00:00
Ivan Kozik
b3c433a60b Fix: new Click gives us () instead of None when no start_url's are given 2015-10-03 22:53:20 +00:00
Ivan Kozik
ea483bc74b README: Tweak 2015-09-30 23:46:46 +00:00
Ivan Kozik
8260a0d7cf Add note about Python 3.5 2015-09-30 23:44:25 +00:00
Ivan Kozik
bfe92c9605 Document how to upgrade grab-site 2015-09-30 22:31:46 +00:00
Ivan Kozik
7a63a3dcd1 Add --no-dupespotter for turning off dupespotter which sometimes has false positives 2015-09-30 22:16:56 +00:00
Ivan Kozik
17c1b9caaa Update default user agent 2015-09-25 20:32:08 +00:00
Ivan Kozik
f1548521ec Write URLs skipped by --max-content-length= to DIR/skipped_max_content_length 2015-09-02 19:15:00 +00:00
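
The Windows traceback recorded with commit 8c306aed50 shows two missing attributes: `os.fork` and `socket.AF_UNIX`. A minimal sketch of a feature check that would avoid both errors is below. It is not the change made in that commit; the helper names `manhole_supported` and `build_debug_args` are hypothetical, and only the `--debug-manhole` flag name comes from the log.

```python
import os
import socket


def manhole_supported(os_mod=os, socket_mod=socket):
    """Return True only if both features missing in the traceback are present.

    The modules are parameters (an assumption made for testability) so the
    check can be exercised with stand-in objects on any platform.
    """
    return hasattr(os_mod, "fork") and hasattr(socket_mod, "AF_UNIX")


def build_debug_args(os_mod=os, socket_mod=socket):
    # Hypothetical helper: pass --debug-manhole only where it can work.
    if manhole_supported(os_mod, socket_mod):
        return ["--debug-manhole"]
    return []


if __name__ == "__main__":
    print(build_debug_args())
```

Checking for the attributes directly, rather than comparing platform names, keeps the guard tied to the exact failures shown in the traceback.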