From 73587696f2f943de285a43e39bd0f03a712e1081 Mon Sep 17 00:00:00 2001
From: Ivan Kozik
Date: Tue, 9 Oct 2018 17:42:35 +0000
Subject: [PATCH] README: http:// -> https:// links

---
 README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 607cb46..dc6fe8c 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@ grab-site
 
 grab-site is an easy preconfigured web crawler designed for backing up websites.
 Give grab-site a URL and it will recursively crawl the site and write
-[WARC files](http://www.archiveteam.org/index.php?title=The_WARC_Ecosystem).
+[WARC files](https://www.archiveteam.org/index.php?title=The_WARC_Ecosystem).
 Internally, grab-site uses [a fork](https://github.com/ludios/wpull) of
 [wpull](https://github.com/chfoo/wpull) for crawling.
 
@@ -317,7 +317,7 @@ grab-site does not respect `robots.txt` files, because they frequently
 [whitelist only approved robots](https://github.com/robots.txt),
 [hide pages embarrassing to the site owner](https://web.archive.org/web/20140401024610/http://www.thecrimson.com/robots.txt),
 or block image or stylesheet resources needed for proper archival.
-[See also](http://www.archiveteam.org/index.php?title=Robots.txt).
+[See also](https://www.archiveteam.org/index.php?title=Robots.txt).
 Because of this, very rarely you might run into a robot honeypot and receive an
 abuse@ complaint. Your host may require a prompt response to such a complaint
 for your server to stay online. Therefore, we recommend against crawling the
@@ -326,8 +326,8 @@ web from a server that hosts your critical infrastructure.
 Don't run grab-site on GCE (Google Compute Engine); as happened to me, your
 entire API project may get nuked after a few days of crawling the web, with no
 recourse. Good alternatives include OVH ([OVH](https://www.ovh.com/us/dedicated-servers/),
-[So You Start](http://www.soyoustart.com/us/essential-servers/),
-[Kimsufi](http://www.kimsufi.com/us/en/index.xml)), and online.net's
+[So You Start](https://www.soyoustart.com/us/essential-servers/),
+[Kimsufi](https://www.kimsufi.com/us/en/index.xml)), and online.net's
 [dedicated](https://www.online.net/en/dedicated-server) and
 [Scaleway](https://www.scaleway.com/) offerings.
 
@@ -352,10 +352,10 @@ The defaults work fine except for blogs with a JavaScript-only Dynamic Views the
 Some blogspot.com blogs use "[Dynamic Views](https://support.google.com/blogger/answer/1229061?hl=en)"
 themes that require JavaScript and serve absolutely no HTML content. In rare
 cases, you can get JavaScript-free pages by appending `?m=1`
-([example](http://happinessbeyondthought.blogspot.com/?m=1)). Otherwise, you
+([example](https://happinessbeyondthought.blogspot.com/?m=1)). Otherwise, you
 can archive parts of these blogs through Google Cache instead
 ([example](https://webcache.googleusercontent.com/search?q=cache:http://blog.datomic.com/))
-or by using http://archive.is/ instead of grab-site.
+or by using https://archive.is/ instead of grab-site.
 
 #### Tumblr blogs
 
@@ -370,7 +370,7 @@ crawl's `ignores`.
 
 Some tumblr blogs appear to require JavaScript, but they are actually just
 hiding the page content with CSS. You are still likely to get a complete crawl.
-(See the links in the page source for http://X.tumblr.com/archive).
+(See the links in the page source for https://X.tumblr.com/archive).
 
 #### Subreddits
 
@@ -470,7 +470,7 @@ changes will be applied within a few seconds.
 
 `DIR/igsets` is a comma-separated list of ignore sets to use.
 
-`DIR/ignores` is a newline-separated list of [Python 3 regular expressions](http://pythex.org/)
+`DIR/ignores` is a newline-separated list of [Python 3 regular expressions](https://pythex.org/)
 to use in addition to the ignore sets.
 
 You can `rm DIR/igoff` to display all URLs that are being filtered out