Commit Graph

1158 Commits (master)

Author SHA1 Message Date
Ivan Kozik b297c9e280 README: tweak BrowserStack message 2022-09-01 05:28:58 +00:00
Ivan Kozik eed00f234a Revert "README: remove BrowserStack mention"
This reverts commit e45f6f5b97.
2022-09-01 05:27:57 +00:00
Ivan Kozik b659255c1c README: remove old tmux 2.1 note 2022-08-08 19:23:47 +00:00
Ivan Kozik 32b127067e README: add nix-env install steps for installing the latest version of grab-site 2022-08-07 08:49:12 +00:00
Ivan Kozik a2e49a1d96 2.2.7 2022-08-07 08:16:46 +00:00
Ivan Kozik 077d61af03 Update Firefox UA 2022-08-07 08:16:07 +00:00
Ivan Kozik 951522ce65 2.2.6 2022-08-07 08:11:46 +00:00
Ivan Kozik f04e5f6166 README: update link to ludios_wpull 2022-08-07 08:11:09 +00:00
Ivan Kozik 5ee2132472 README: document --no-global-igset 2022-08-07 08:10:58 +00:00
Ivan Kozik d2bf2844dc Fix: when there are no ignores, ignore nothing rather than everything 2022-08-07 08:07:59 +00:00
Ivan Kozik a178d76ea7 Don't crash when igsets file is empty
This fixes:

Traceback (most recent call last):
  File "gs-venv/lib/python3.8/site-packages/libgrabsite/wpull_hooks.py", line 56, in wrapper
    return f(*args, **kwargs)
  File "gs-venv/lib/python3.8/site-packages/libgrabsite/wpull_hooks.py", line 318, in update_ignores
    for pattern in get_patterns_for_ignore_set(igset):
  File "gs-venv/lib/python3.8/site-packages/libgrabsite/wpull_hooks.py", line 48, in get_patterns_for_ignore_set
    assert name != "", name
AssertionError
2022-08-07 07:52:23 +00:00
Ivan Kozik 4994331eea Add --no-global-igset for starting a crawl without the "global" ignore set 2022-08-07 07:46:55 +00:00
Ivan Kozik 0e67d79ae7 2.2.5 2022-07-28 16:15:06 +00:00
Ivan Kozik 09b26c88fd global igset: don't ignore Wikipedia thumbnails
If you would like to keep using this ignore, add it to a file and use --import-ignores=FILE
2022-07-28 16:12:26 +00:00
Ivan Kozik a8538b0118 2.2.4 2022-06-27 23:24:30 +00:00
Ivan Kozik df06e14415 Merge pull request #222 from JustAnotherArchivist/warc-header-gs-version
Record grab-site version in WARC headers
2022-06-27 16:15:01 -07:00
Ivan Kozik cb477e68a5 README: nixpkgs 21.11 -> 22.05 2022-06-27 23:13:49 +00:00
JustAnotherArchivist a2ca380534 Record grab-site version in WARC headers 2022-05-07 17:29:51 +00:00
Ivan Kozik 9e6e95b513 2.2.3 2022-05-04 20:13:54 +00:00
Ivan Kozik 4f2526dbc6 gs-server: fix RuntimeError: To use txaio, you must first select a framework with .use_twisted() or .use_asyncio()
Caused by an updated autobahn, probably.

Fixes https://github.com/ArchiveTeam/grab-site/issues/220
2022-05-04 20:12:38 +00:00
Ivan Kozik 24a67521ef README: Python 3.8.12 -> 3.8.13 2022-05-04 20:12:31 +00:00
Ivan Kozik 53899380a9 README: Debian install: add packages `wget ca-certificates` for the subsequent steps 2022-05-04 20:12:04 +00:00
Ivan Kozik 9987b25af9 README: add a note 2022-05-03 07:23:28 +00:00
Ivan Kozik 87c725a333 README: fix anchor link 2022-05-03 07:18:26 +00:00
Ivan Kozik 20e5fef01d README: remove the Nix-based macOS install because it fails due to Yapsy test failures
https://github.com/ArchiveTeam/grab-site/issues/218
2022-04-10 18:23:30 +00:00
Ivan Kozik 3d2699fb2f README: macOS Nix-based install: no need to edit shell startup files yourself now 2022-04-10 17:58:13 +00:00
Ivan Kozik 81995d67c2 README: update the macOS Nix-based install 2022-04-10 17:57:14 +00:00
Ivan Kozik fe006e4fe1 README: fix macOS homebrew-based install: update for Python 3.8 and M1 Macs 2022-04-10 17:52:02 +00:00
Ivan Kozik 14c3bbdf71 README: nixpkgs 21.05 -> 21.11 2022-01-02 07:21:18 +00:00
Ivan Kozik 5a306b6917 README: Python 3.7.11 -> 3.8.12; mention Debian 11 2022-01-02 07:20:55 +00:00
Ivan Kozik 41a77f0812 README: have consistency among the 'and then restart your shell' text 2022-01-02 07:00:21 +00:00
Ivan Kozik d696604509 README: tabs before spaces 2022-01-02 06:58:29 +00:00
Preservation-Quest 865aae75c9 Update README.md
Updated as requested.
2022-01-02 06:57:57 +00:00
Preservation-Quest 9a3e577295 Update README.md 2022-01-02 06:57:57 +00:00
Preservation-Quest 9f6e012c83 Update README.md 2022-01-02 06:57:57 +00:00
Ivan Kozik 6269289a2c README: document --wpull-args=--no-warc-compression 2021-10-25 19:58:55 +00:00
Ivan Kozik 6f4d435bf2 Merge pull request #203 from TheTechRobo/simplemachineforums-igsets 2021-10-25 12:53:33 -07:00
TheTechRobo f0df373701 Remove printpage from forum igset 2021-10-25 11:07:34 -04:00
TheTechRobo a376f67130 Add SimpleMachineForum ignores to `forums` igset
I will test it on an SMF.
2021-10-24 18:12:04 -04:00
Ivan Kozik 4fbf6469a1 README: fix the nix-based install steps to use release-21.05 because master has the incompatible sqlalchemy 1.4 2021-08-29 04:55:59 +00:00
Ivan Kozik 4fec250a32 README: mention Ubuntu 20.04 2021-08-29 04:33:48 +00:00
Ivan Kozik bf7da79ce5 2.2.2 2021-08-29 04:25:29 +00:00
Ivan Kozik 0f9585db6f README: Python 3.7.10 -> 3.7.11 2021-08-29 04:25:01 +00:00
Ivan Kozik b5962676be Use ludios_wpull 3.0.9 to fix https://github.com/ArchiveTeam/ludios_wpull/issues/16 2021-08-29 04:23:40 +00:00
Ivan Kozik fe3cc6ab14 README: Python 3.7.9 -> 3.7.10 2021-05-10 01:00:13 +00:00
Ivan Kozik 2e3be5dd29 2.2.1 2021-05-10 00:56:09 +00:00
Ivan Kozik cb3068c0ca Use ludios_wpull 3.0.8 to fix https://github.com/ArchiveTeam/grab-site/issues/181 2021-05-10 00:55:45 +00:00
Ivan Kozik 132064a24e README: Debian-based install: Python 3.7.8 -> 3.7.9 2021-01-03 02:37:11 +00:00
Ivan Kozik 8c5681c726 README: macOS: update maximum version 2021-01-03 02:35:33 +00:00
Ivan Kozik d25e419fb6 README: remove the note about the macOS-specific lxml bug because I can't repro it 2021-01-03 02:31:44 +00:00