2165 Commits

Author SHA1 Message Date
Mike Fährmann
db6685eeae
[aryion] support downloading from folders (fixes #694) 2020-04-18 01:25:54 +02:00
Mike Fährmann
fa2952ac55
[furaffinity] add 'following' extractor (#515) 2020-04-17 22:18:39 +02:00
Mike Fährmann
9b194520db
[newgrounds] add 'following' extractor (closes #684) 2020-04-17 22:17:43 +02:00
Mike Fährmann
6386ee54e1
[deviantart] add extractor info to 'following' results 2020-04-16 23:20:07 +02:00
Mike Fährmann
d5273f9b0c
[hiperdex] update domain to hiperdex.net 2020-04-16 20:39:56 +02:00
Mike Fährmann
08674a91f3
[patreon] fix hash extraction from download URLs (closes #693)
The old method was assuming every URL path ends with '/1'. For URLs
where this is not the case, the segment containing the post ID was
used as file hash.
2020-04-15 23:28:57 +02:00
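The fix described in this commit can be illustrated with a minimal sketch. It assumes the file hash is a 32-character hexadecimal path segment; the helper name `file_hash` and the example URLs are hypothetical and not taken from the commit itself.

```python
import re
from urllib.parse import urlsplit

# Assumption: a file hash is a 32-character hexadecimal path segment.
HASH_RE = re.compile(r"[0-9a-f]{32}")


def file_hash(url):
    """Return the last path segment that looks like a hash, or ''."""
    segments = urlsplit(url).path.split("/")
    # Search from the end instead of assuming a fixed position
    # (such as "the segment before a trailing '/1'").
    for segment in reversed(segments):
        if HASH_RE.fullmatch(segment):
            return segment
    return ""
```

Matching on the shape of the segment rather than its position avoids picking up the post ID when the path does not end with '/1'.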
Mike Fährmann
a31c1aae72
release version 1.13.4 2020-04-12 21:24:52 +02:00
Mike Fährmann
a6286bb551
[hiperdex] add 'artist' extractor (#606) 2020-04-12 02:32:37 +02:00
Mike Fährmann
291033720a
[hiperdex] fix manga extraction 2020-04-12 02:27:13 +02:00
Mike Fährmann
dfc0557807
[vsco] fix collection extraction 2020-04-11 23:06:29 +02:00
Mike Fährmann
fd438f0d78
update extractor test results 2020-04-11 23:00:42 +02:00
Mike Fährmann
bae1e8ed12
[deviantart] fix JPEG quality replacement pattern
'q_\d+' would sometimes also replace something in the 'token' query
parameter, invalidating the URL.
2020-04-11 02:37:06 +02:00
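The problem described in this commit can be reproduced in a few lines. The URL is made up for illustration, and anchoring the pattern on the preceding comma is an assumption about one possible fix; the commit message does not state the new pattern.

```python
import re

# Hypothetical URL; the 'token' value happens to contain 'q_12'.
url = ("https://example.org/v1/fill/w_800,h_600,q_75,strp/image.jpg"
       "?token=abq_12cd")

# Unanchored pattern: also rewrites the match inside the token.
broken = re.sub(r"q_\d+", "q_100", url)

# Anchored on the preceding comma: only the quality parameter
# in the path is replaced, the token stays intact.
fixed = re.sub(r",q_\d+", ",q_100", url)
```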
Mike Fährmann
cf4cef3d63
[aryion] adjust 'date' to UTC time 2020-04-11 02:08:05 +02:00
Mike Fährmann
a0f4c295c0
add optional 'utcoffset' argument to 'parse_datetime()' 2020-04-11 02:05:00 +02:00
Mike Fährmann
6c531be294
[aryion] fix malformed 'last-modified' headers (#390) 2020-04-10 23:08:52 +02:00
Mike Fährmann
38bc6430d3
[downloader:http] don't overwrite existing '_mtime' fields 2020-04-10 23:08:03 +02:00
Mike Fährmann
dc65f7d8dc
[aryion] use generic download URLs (#390)
i.e. /g4/data.php?id=…

- get filename & extension from Content-Disposition header
- handle all downloadable file types (docx, swf, etc)
2020-04-10 22:08:45 +02:00
Mike Fährmann
96b78bcf04
[aryion] include path in default directory format (#390) 2020-04-10 21:58:46 +02:00
Mike Fährmann
300264f676
read config files from PyInstaller exe directory (closes #682) 2020-04-08 21:53:50 +02:00
Mike Fährmann
6143050980
[aryion] add gallery and post extractors (#390, #673) 2020-04-08 21:52:51 +02:00
Mike Fährmann
9e7dfc0cfc
[myportfolio] fix extraction of galleries without title 2020-04-08 21:08:05 +02:00
Mike Fährmann
88fca0a172
[mastodon] update OAuth credentials for pawoo.net (#665) 2020-04-06 00:50:30 +02:00
Mike Fährmann
4ae8a25567
[mastodon] use 'combine_dict()' to combine extractor info dicts 2020-04-05 21:45:00 +02:00
Mike Fährmann
220c06b86e
[mastodon] handle rate limits 2020-04-05 21:44:00 +02:00
Mike Fährmann
d02f7c1118
improve Extractor.wait()
- allow 'until' to be a datetime object
- do "time calculations" with UTC timestamps
- set a default 'reason'
2020-04-05 21:23:05 +02:00
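A rough sketch of the three points listed in this commit, not the actual Extractor.wait() code. The function name `seconds_until`, its parameters, the default reason text, and the treatment of naive datetime objects as UTC are all assumptions made for illustration.

```python
import datetime
import time


def seconds_until(until, now=None, reason="rate limit reset"):
    """Return (seconds_to_wait, log_message) for a wait deadline.

    'until' may be a UTC timestamp (int/float) or a datetime object.
    Assumption: naive datetime objects represent UTC.
    """
    if now is None:
        now = time.time()
    if isinstance(until, datetime.datetime):
        if until.tzinfo is None:
            until = until.replace(tzinfo=datetime.timezone.utc)
        # Do the arithmetic on UTC timestamps, not on local time.
        until = until.timestamp()
    seconds = max(0.0, float(until) - now)
    message = "Waiting {:.0f}s for {}".format(seconds, reason)
    return seconds, message
```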
Mike Fährmann
5d7404ab58
[oauth] use the new name for 'DeviantartAPI' (fixes #670) 2020-04-04 20:34:47 +02:00
Mike Fährmann
762c758af4
[hiperdex] fix extraction 2020-04-03 21:25:25 +02:00
Mike Fährmann
f9a590f92b
[deviantart] apply HTTP request limits in more places
"Request blocked" can also happen on sta.sh and for *any* HTTP
request directed at deviantart.com
2020-04-03 21:21:59 +02:00
Mike Fährmann
2587296deb
[mastodon] add access tokens for mastodon.social and baraag.net
(closes #665)
2020-04-02 22:34:32 +02:00
Mike Fährmann
ff7c0b7eff
[deviantart] handle "Request blocked" errors (#655)
- add a 2 second wait time between requests to deviantart.com
- catch 403 "Request blocked" errors and wait for 3 minutes until
  retrying
2020-04-02 22:14:02 +02:00
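The two measures in this commit (a fixed delay between requests and a long pause after a 403) could look roughly like the sketch below. The `fetch` callable, the `max_retries` limit, and the function name are hypothetical; only the 2-second and 3-minute values come from the commit message.

```python
import time

REQUEST_DELAY = 2.0   # seconds between requests
BLOCK_WAIT = 180.0    # 3 minutes after a "Request blocked" response


def fetch_with_backoff(fetch, url, sleep=time.sleep, max_retries=3):
    """Call fetch(url) and retry after a long pause on HTTP 403.

    'fetch' is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(max_retries + 1):
        if attempt:
            # The previous attempt was blocked; wait before retrying.
            sleep(BLOCK_WAIT)
        sleep(REQUEST_DELAY)
        status, body = fetch(url)
        if status != 403:
            return status, body
    raise RuntimeError("still blocked after {} retries".format(max_retries))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.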
Mike Fährmann
c874684f05
[deviantart] retrieve *all* download URLs through OAuth API
'/extended_fetch' as well as Deviation webpages now again contain
Deviation UUIDs needed to grab Deviation info through the OAuth API,
meaning cookies are no longer necessary to grab original files.

The only instance where cookies are still needed is scraps marked as
"mature", since those entries are hidden from public users.

(#655, #657, #660)
2020-04-02 22:10:33 +02:00
Mike Fährmann
5c27b25a8f
[deviantart] improve sta.sh extraction
Extract all sta.sh items in a single extractor run.
Don't spawn a new StashExtractor for each individual sta.sh item to
preserve the current requests.Session and its opened TCP connections.
2020-04-01 03:17:25 +02:00
Mike Fährmann
e2fc4eaa6f
[deviantart] detect stash folders (fixes #659) 2020-04-01 01:59:03 +02:00
Mike Fährmann
c034159701
[piczel] fix extraction for single images 2020-03-31 22:47:23 +02:00
Mike Fährmann
699036ea0c
[weibo] accept status URLs with non-numeric IDs (#664) 2020-03-31 22:46:50 +02:00
Mike Fährmann
fe96f99e4b
[hentainexus] reduce line length (flake8) & update test 2020-03-31 22:08:43 +02:00
墨焓
6f81cac8fa
Add metadata to hentainexus: circle, event, title_conventional. (#661) 2020-03-31 21:59:02 +02:00
Mike Fährmann
3ed72f82dc
release version 1.13.3 2020-03-28 22:03:33 +01:00
Mike Fährmann
6f911aeb1c
[deviantart] add error message for CloudFront blocks (#655) 2020-03-28 21:18:04 +01:00
Mike Fährmann
7499d71d02
[simplyhentai] ignore certificate errors in video test 2020-03-28 21:07:30 +01:00
Mike Fährmann
4203dc0bdc
[mangapark] fix metadata extraction 2020-03-28 03:00:26 +01:00
Mike Fährmann
6ecb0a19cf
handle sys.stdin being None when using '-' as input file (#653) 2020-03-25 22:33:39 +01:00
Mike Fährmann
1b82d36ab2
[deviantart] handle decode errors for extended_fetch results (#655)
This isn't going to solve the underlying problem, but it should at
least provide the server response when those errors happen.
2020-03-24 20:56:41 +01:00
Mike Fährmann
09f2271528
[35photo] add 'tag' extractor 2020-03-24 02:49:00 +01:00
Mike Fährmann
77fda8190c
[35photo] simplify/remove tests for the 'genre' extractor
There is still a nice genre overview page (https://35photo.pro/genre/)
but the individual sub-pages don't list photos anymore
2020-03-24 02:48:25 +01:00
Mike Fährmann
4bc161ca0f
prevent crash when sys.stdout and co. are None (#653) 2020-03-23 23:38:55 +01:00
Mike Fährmann
fb846c9ee5
[instagram] reduce line lengths and make flake8 happy 2020-03-23 22:56:43 +01:00
Mike Fährmann
ad2efa8509
[e621] derive from Danbooru extractors (#651)
- use extractor implementations from 'danbooru'
- use "page": "b[ID]" to paginate over results instead of
  "tags": "id:<[ID]", avoiding infinite loops with certain
  post orders
- bump User-Agent version
2020-03-22 21:08:45 +01:00
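The pagination change in this commit can be sketched as follows. The `fetch` callable and the `limit` parameter are hypothetical stand-ins for the real API request; only the "page": "b[ID]" idea comes from the commit message.

```python
def paginate(fetch, limit=3):
    """Yield posts page by page using 'b<ID>' ("before ID") pages.

    'fetch' is a hypothetical callable taking a params dict and
    returning a list of post dicts, each with an integer 'id'.
    """
    params = {"limit": limit}
    while True:
        posts = fetch(params)
        yield from posts
        if len(posts) < limit:
            return
        # Continue below the lowest ID seen, regardless of the
        # order in which the posts were returned.
        params["page"] = "b{}".format(min(p["id"] for p in posts))
```

Because the cursor always moves to a strictly lower ID, the loop ends once fewer than `limit` posts come back, as long as the server honors the "before ID" page and does not return a full page that is empty of new posts.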
Mike Fährmann
9b39e1cd7e
[e621] fix bug in API rate limiting (#651) 2020-03-22 14:01:23 +01:00
Mike Fährmann
b607d0ad7f
[twitter] fix typo in 'x-twitter-auth-type' header (#625) 2020-03-21 23:11:39 +01:00