Mike Fährmann
b233531aaa
[sankaku] use '/posts' endpoint for single posts
2020-12-22 02:44:40 +01:00
Mike Fährmann
459a0af4f8
[sankaku] add support for sankaku.app URLs (closes #1193)
2020-12-22 01:57:53 +01:00
Mike Fährmann
371e9ca6df
[pinterest] implement video support (closes #1189)
2020-12-21 16:09:06 +01:00
Mike Fährmann
537742c0ee
[sankaku] normalize 'created_at' metadata (closes #1190)
2020-12-21 02:06:29 +01:00
Mike Fährmann
ae6748996a
[pornhub] update tests
2020-12-21 02:06:28 +01:00
Mike Fährmann
bf629a2818
[instagram] add 'include' option (closes #1180)
...
Split the functionality of the old 'user' extractor into separate
'posts' and 'highlights' extractors, which respond to virtual URLs
('/<user>/posts' and '/<user>/highlights')
2020-12-21 02:06:28 +01:00
Mike Fährmann
78061658ea
[booru] reduce exceptions caught during _prepare_post()
...
don't catch HttpErrors etc.
2020-12-21 02:05:59 +01:00
Mike Fährmann
212ae0c399
[mangapanda] remove module
...
site now redirects to mangareader.net
2020-12-20 17:42:15 +01:00
Mike Fährmann
337b118e25
[instagram] warn about private profiles (#1187)
2020-12-19 22:32:28 +01:00
Mike Fährmann
e8c64dd961
[postprocessor:exec] do not auto-add '{}' to command (#1185)
...
This was initially done to mimic youtube-dl's behavior and
implementation of --exec, and it seemed reasonable at the time.
2020-12-19 20:53:46 +01:00
Mike Fährmann
0a3bbc9c63
[postprocessor:exec] update output
2020-12-19 20:36:39 +01:00
Mike Fährmann
511d8d3fa3
increase SQLite connection timeouts (#1173)
2020-12-19 20:15:07 +01:00
Mike Fährmann
465015f75a
[sankaku] reimplement login support (#1176, #1182)
2020-12-17 16:12:59 +01:00
Mike Fährmann
8d2e4e5f13
[booru] improve error handling
...
e.g. for posts without a valid 'file_url' (#1176)
2020-12-17 01:16:45 +01:00
Mike Fährmann
1f9121fecb
release version 1.16.0
2020-12-12 23:08:25 +01:00
Mike Fährmann
1d753542c2
[hentainexus] fix extraction (fixes #1166)
2020-12-12 20:30:51 +01:00
Mike Fährmann
b6f1fe59cb
add deprecation warnings for exec.final and metadata.bypost
2020-12-12 16:58:23 +01:00
Mike Fährmann
1e59aa6123
update README.rst and setup.py
...
- remove superfluous '-'s
- use definition list for config paths
- add Python 3.9 support to setup.py classifiers
2020-12-12 16:16:29 +01:00
Mike Fährmann
b8f2e42f7a
update pip install instructions
2020-12-12 15:45:34 +01:00
Mike Fährmann
476d563ec2
[downloader:http] add MIME type and signature for .swf files
2020-12-11 14:21:04 +01:00
Mike Fährmann
a00b60fbe7
[twitter] update 'x-csrf-token' header (fixes #1170)
...
Twitter started using a bigger (80 instead of 16 bytes) CSRF token for
logged-in users, and expects it to be used as the 'x-csrf-token' header
when it is sent via the 'ct0' cookie.
Generating an 80 byte token ourselves doesn't work, and Twitter will
still insist on using its own.
2020-12-11 13:46:58 +01:00
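The header/cookie relationship described in a00b60fbe7 can be sketched as
follows. This is a minimal illustration, not the extractor's code: the
function name 'csrf_headers' and the plain-dict cookie jar are hypothetical,
and falling back to a self-generated short token when no 'ct0' cookie exists
is an assumption about the logged-out case.

```python
import secrets


def csrf_headers(cookies):
    """Return request headers whose CSRF token matches the 'ct0' cookie.

    'cookies' is a plain dict standing in for a real cookie jar
    (hypothetical simplification).
    """
    token = cookies.get("ct0")
    if not token:
        # Assumption: without a server-issued token, generate a short one
        # and store it as the cookie so that header and cookie agree.
        token = secrets.token_hex(16)
        cookies["ct0"] = token
    # A server-issued token is reused verbatim, never replaced.
    return {"x-csrf-token": token}
```

The point of the sketch is only that a token already present in 'ct0' is
echoed back unchanged instead of being regenerated.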
Mike Fährmann
b88c97b873
[instagram] add 'cursor' option (#1149)
...
To enable at least 'some' way to continue downloading from the middle
of a user profile listing.
2020-12-11 13:46:58 +01:00
Mike Fährmann
0d406c8daf
[common] restrict values used in 'generate_extractors()'
2020-12-11 13:46:47 +01:00
Mike Fährmann
fe0265c7a5
[downloader.http] small improvements to file signature list
...
- specify multiple entries for gif, mp3, zip
- add entries for pdf
2020-12-08 21:20:18 +01:00
Mike Fährmann
b2c55f0a72
[sankaku] remove login support
...
The old login method for 'https://chan.sankakucomplex.com/user/login'
and the cookies it produces have no effect on the results from
'beta.sankakucomplex.com'.
2020-12-08 21:05:47 +01:00
Mike Fährmann
7f3d811d7b
[moebooru] inherit from BooruExtractor
2020-12-08 18:34:56 +01:00
Mike Fährmann
a3a863fc13
[booru] add generalized extractors for *booru sites
...
similar to cc15fbe7
2020-12-08 18:34:30 +01:00
Mike Fährmann
5f23441e12
[piczel] update API URLs
2020-12-07 15:56:32 +01:00
Mike Fährmann
47114339a2
[webtoons] update 'ageGate' cookie
2020-12-07 14:56:32 +01:00
Mike Fährmann
4225f12783
[nozomi] handle empty 'date' fields (fixes #1163)
2020-12-07 00:08:53 +01:00
Mike Fährmann
2b93515ee0
[instagram] reimplement support for stories (#1149)
2020-12-06 21:32:10 +01:00
Mike Fährmann
ecdea799dd
[sankaku] use 'beta.sankakucomplex.com' API endpoints
2020-12-05 22:08:58 +01:00
Mike Fährmann
b3ecc89a9a
[instagram] use double quotes for strings when possible
2020-12-05 19:33:42 +01:00
Mike Fährmann
76285eb60d
[instagram] reimplement support for story highlights (#1149)
2020-12-05 19:13:00 +01:00
Mike Fährmann
8ca7f54750
rename '_request_…' variables
...
- remove '_' at the beginning
- _request_last -> request_timestamp
2020-12-05 00:09:15 +01:00
Mike Fährmann
15a122aff3
[instagram] update 'X-IG-WWW-Claim' headers
2020-12-04 20:58:34 +01:00
Mike Fährmann
e5d81bdc7b
[mangadex] handle 'external' chapters (closes #1154)
2020-12-04 20:56:30 +01:00
Mike Fährmann
447488fb18
[instagram] rewrite
...
(#1113, #1122, #1128, #1130, #1149)
Rely on the results of GraphQL queries instead of requesting data
for each post separately via '/p/<shortcode>/?__a=1'.
This might result in some missing metadata, and there might be some
issues for '/channel/' and '/saved/' URLs, but at least downloading
from the regular post listings should work without issues and without
getting users blocked/banned.
TODO: reimplement support for stories
2020-12-03 14:30:59 +01:00
Mike Fährmann
cc15fbe71a
[moebooru] add generalized extractors for moebooru sites
...
- add support for sakugabooru.com (closes #1136)
- add support for lolibooru.moe (closes #1050)
This allows users to dynamically add support for moebooru/myimouto-based
sites by adding an entry to their config file
(as for foolslide, foolfuuka, etc.).
For example:
    {
        "extractor": {
            "moebooru": {
                "new-site-1": {"root": "https://site1.net"},
                "new-site-2": {"root": "https://www.site2.moe"}
            }
        }
    }
2020-12-01 22:27:18 +01:00
Mike Fährmann
43120407cc
[paheal] create directory for each post (closes #1147)
2020-12-01 12:14:55 +01:00
Mike Fährmann
63e61a0932
[twitter] update image URL format (#1145)
...
use
'/<name>?format=<fmt>&name=<size>'
instead of the potentially deprecated
'/<name>.<fmt>:<size>'
but keep all of them as fallback URLs
2020-12-01 11:53:51 +01:00
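The URL scheme change in 63e61a0932 can be illustrated with a short sketch.
The function name 'image_urls', the example host, and the list of size names
are assumptions made for illustration; the ordering (new-style URLs first,
old-style ones afterwards as fallbacks) is one reading of the commit message,
not a copy of the extractor's code.

```python
def image_urls(base, fmt, sizes=("orig", "large", "medium", "small")):
    """Build candidate URLs for one image, preferred candidates first.

    base  -- URL up to and including '/<name>' (no extension)
    fmt   -- file format, e.g. 'jpg'
    sizes -- size names to try (assumed values)
    """
    # New style: '/<name>?format=<fmt>&name=<size>'
    new_style = [f"{base}?format={fmt}&name={size}" for size in sizes]
    # Old, potentially deprecated style: '/<name>.<fmt>:<size>'
    old_style = [f"{base}.{fmt}:{size}" for size in sizes]
    return new_style + old_style


urls = image_urls("https://example.org/media/NAME", "jpg")
primary, fallbacks = urls[0], urls[1:]
```

A downloader would try 'primary' first and walk through 'fallbacks' only if
that request fails.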
Mike Fährmann
1a4b61f7eb
[downloader:http] fix issues with chunked transfer encoding
...
(fixes #1144)
2020-11-30 01:10:45 +01:00
Mike Fährmann
536c088462
[downloader:http] improve 'adjust-extensions' (#776)
...
Check file headers against a list of file signatures before
downloading the whole file and writing it to disk.
The file signature check needs some improvements (*),
but it produces usable results for the most part.
(*)
- 'webp', 'wav', and others start with 'RIFF'
- 'svg' uses the same "signature" as all XML documents
- 'webm' has the same signature as 'mkv' files
- only 'mp3' files in an ID3v2 container get recognized
2020-11-29 20:55:35 +01:00
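A signature check of the kind described in 536c088462 might look like the
sketch below. The table contents and the names 'SIGNATURES', 'RIFF_FORMS' and
'guess_extension' are illustrative assumptions, not the downloader's actual
list. Inspecting the form type at bytes 8 to 12 is one possible way to tell
RIFF-based formats apart; the commit itself only lists this as a known
limitation.

```python
# Leading bytes ("magic numbers") for a few common formats.
SIGNATURES = {
    "jpg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
    "pdf": (b"%PDF-",),
    "mp3": (b"ID3",),  # only matches files with an ID3v2 tag
}

# RIFF containers share the 'RIFF' prefix; the form type at offset 8
# distinguishes them.
RIFF_FORMS = {
    b"WEBP": "webp",
    b"WAVE": "wav",
    b"AVI ": "avi",
}


def guess_extension(header):
    """Return a file extension for the first bytes of a file, or None."""
    if header[:4] == b"RIFF":
        return RIFF_FORMS.get(header[8:12])
    for ext, prefixes in SIGNATURES.items():
        # bytes.startswith() accepts a tuple of candidate prefixes
        if header.startswith(prefixes):
            return ext
    return None
```

Only the first few bytes of the response are needed, so the check can run
before the rest of the file is downloaded and written to disk.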
Mike Fährmann
46323ae6ff
initialize 'hooks' as empty tuple
...
follow-up to 9c29fc4e
Prevent a "race" between initializing 'pathfmt' and 'hooks',
and receiving a signal in between (e.g. ctrl+c),
which would then crash in 'handle_finalize()'.
2020-11-28 18:18:49 +01:00
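The reasoning behind 46323ae6ff can be shown with a small stand-in class.
'SketchJob' and its methods are hypothetical and only mirror the attribute
names from the commit message; the assumption is that 'hooks' later becomes a
mapping from event name to callbacks, so an empty tuple is a safe placeholder
for a membership test.

```python
class SketchJob:
    """Hypothetical stand-in for a job with finalize hooks."""

    def __init__(self):
        self.pathfmt = None
        # Empty tuple: '"finalize" in self.hooks' is simply False until
        # the real mapping has been assigned.
        self.hooks = ()

    def initialize(self, callbacks):
        self.pathfmt = object()
        # A signal (e.g. ctrl+c) could arrive between these two
        # assignments; the default set in __init__ covers that window.
        self.hooks = {"finalize": list(callbacks)}

    def handle_finalize(self):
        """Run all finalize callbacks and return how many were called."""
        called = 0
        if "finalize" in self.hooks:
            for callback in self.hooks["finalize"]:
                callback(self.pathfmt)
                called += 1
        return called
```

Calling 'handle_finalize()' on a job that was interrupted before
'initialize()' finished then does nothing instead of raising.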
Mike Fährmann
06af57e84a
update CHANGELOG and README for 1.15.4
2020-11-28 00:09:34 +01:00
Mike Fährmann
9c29fc4e55
always initialize DownloadJob.hooks (fixes #1135)
...
and not just when any (potential) post processors are defined
2020-11-28 00:09:19 +01:00
Mike Fährmann
ae6a1d5fbc
[mangoxo] fix extraction 2
2020-11-27 13:55:30 +01:00
Mike Fährmann
0bc492c0fa
add docs for 'event' and 'filename' options
...
from 9c3568c3 and ca59bd69
2020-11-25 12:12:41 +01:00
Mike Fährmann
f6a684bc37
[hentainexus] update data decoding procedure (#1125)
2020-11-25 11:26:26 +01:00
Mike Fährmann
c57a918f4a
[e621] implement delay via '_request_interval_min'
2020-11-25 00:19:32 +01:00