Mike Fährmann
6bcdb264e0
[imgur] treat 't/unmuted' URLs as galleries
2020-05-25 22:21:57 +02:00
Mike Fährmann
b6cee3e45b
[imgur] fix extraction of animated images without 'mp4' entry
2020-05-25 22:21:57 +02:00
Leonardo Taccari
bcac31b7c7
[webtoons] make archive_fmt unique (#779)
...
close #778
2020-05-25 21:23:54 +02:00
Mike Fährmann
e19f665a44
[danbooru] change default for 'ugoira' to 'false'
...
Downloading the pre-rendered versions should be a better default
than .zip files with individual frames.
2020-05-20 19:57:28 +02:00
Mike Fährmann
3201fe3521
add global SENTINEL object
2020-05-19 22:32:53 +02:00
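The commit above only names the object, so the following is a minimal sketch of the general sentinel pattern, not the project's actual code. The `lookup` helper is hypothetical and exists only to show why a unique marker is useful when `None` is itself a valid value.

```python
# Hypothetical sketch of the sentinel pattern (names other than SENTINEL are made up).
SENTINEL = object()  # unique marker that cannot collide with any real value


def lookup(mapping, key, default=SENTINEL):
    """Return mapping[key]; raise KeyError only if no default was supplied."""
    value = mapping.get(key, SENTINEL)
    if value is not SENTINEL:
        return value  # key exists, even if its value is None
    if default is SENTINEL:
        raise KeyError(key)  # caller gave no default at all
    return default
```

Comparing with `is` rather than `==` keeps the check independent of how stored values implement equality.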
Mike Fährmann
c8787647ed
add global WINDOWS bool
2020-05-19 22:32:53 +02:00
Mike Fährmann
6294e2c540
add 'text.ensure_http_scheme()'
2020-05-19 22:32:53 +02:00
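A helper with this name could plausibly look like the sketch below. The signature, the default scheme, and the stripping of leading "/" and ":" characters are assumptions made for illustration; the commit itself does not show them.

```python
def ensure_http_scheme(url, scheme="https://"):
    """Prepend 'scheme' to 'url' unless it already starts with http(s)://."""
    if url and not url.startswith(("https://", "http://")):
        # Drop leftovers such as the '//' of a protocol-relative URL.
        return scheme + url.lstrip("/:")
    return url
```

With this sketch, both "example.org/a" and "//example.org/a" become "https://example.org/a", while URLs that already carry a scheme are returned unchanged.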
Mike Fährmann
0378d079a5
[webtoons] fixes and simplifications (#593, #761)
...
- fix episode listings for french comics
- allow input URLs without explicit scheme
- add 'lang'/'language' metadata
- use str.format() instead of '+' to assemble URLs
2020-05-18 20:20:03 +02:00
Mike Fährmann
ab11b1c896
[imagechest] simplify code (#750)
2020-05-18 19:11:26 +02:00
Mike Fährmann
846d3a2466
[sexcom] replace 404ed test
2020-05-18 19:04:51 +02:00
Mike Fährmann
9b4635917f
[gelbooru] simplify and fix pool extraction
...
use 'pool:<pool id>' as search tag to get pool posts
2020-05-18 19:04:51 +02:00
Leonardo Taccari
39cd389679
[webtoons] Add a new extractor for webtoons.com (#761)
...
The webtoons extractors can extract a single episode or an entire comic
(all episodes) from webtoons.com.
The logic of the extractors should be straightforward, except for two
necessary kludges:
- The `ageGatePass' cookie is always set to avoid a possible redirect that
would stop extraction, especially in the comic extractor
- The image URLs returned by the episode extractor cannot be fetched
directly; the `Referer:' HTTP header must be sent to fetch them
Close #593.
2020-05-18 19:04:20 +02:00
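The two kludges described in this commit can be sketched with the standard library alone. The function name, the example URLs, and the cookie value "true" are assumptions for illustration; the commit only names the cookie and the header. The request is built but never sent.

```python
import urllib.request


def build_image_request(image_url, episode_url):
    """Build (but do not send) a request for an episode image."""
    return urllib.request.Request(
        image_url,
        headers={
            # The image host is said to reject requests without a Referer.
            "Referer": episode_url,
            # Cookie value is an assumption; only the name comes from the commit.
            "Cookie": "ageGatePass=true",
        },
    )


request = build_image_request(
    "https://example.org/image.jpg",   # hypothetical image URL
    "https://example.org/episode/1",   # hypothetical episode page
)
```

Passing the request to `urllib.request.urlopen()` would then send both headers along with it.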
Bepis
7b5711ee04
[imagechest] Add new extractor for ImageChest (#750)
...
* [imagechest] Add new extractor for ImageChest
* [imagechest] Fix flake8 compliance issues
2020-05-18 19:02:56 +02:00
Mike Fährmann
a1e739b96c
reuse connection adapters from parent extractors
2020-05-12 23:52:01 +02:00
Mike Fährmann
f8f95e68a7
improve '--write-pages' (#737)
...
- move code into its own function
- add enumeration index to filenames
- dump responses regardless of status code
2020-05-12 20:40:25 +02:00
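The three points of this commit (separate function, enumeration index in the filename, no status-code check) could be sketched as follows. The function name, the filename layout, and the sanitizing pattern are assumptions, not the project's actual implementation.

```python
import os
import re


def dump_page(content, url, index, directory="."):
    """Write raw response bytes to '<index>_<sanitized url>.dump'."""
    # Replace characters that are not allowed in Windows file names.
    name = re.sub(r'[\\/:*?"<>|]', "_", url)[:100]
    path = os.path.join(directory, "{:02d}_{}.dump".format(index, name))
    # Binary mode keeps the body byte-for-byte, whatever its encoding.
    with open(path, "wb") as fp:
        fp.write(content)
    return path
```

Because the function takes bytes rather than a response object, the caller decides what to dump, so error pages can be written as easily as successful ones.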
Mike Fährmann
09cc9dbec0
prevent flake8 errors from comments looking like type annotations
2020-05-12 20:08:05 +02:00
Mike Fährmann
2d6724180b
[hiperdex] update domain to hiperdex.info
2020-05-12 17:00:51 +02:00
Vrihub
4cc761c730
Implement --write-pages option (#736)
...
* Implement --write-pages option
* Fix long lines
* Fix file mode to binary
* Fix pattern for Windows compatibility
2020-05-12 14:25:21 +02:00
Mike Fährmann
f557cac074
[redgifs] add image extractor (#724)
2020-05-10 00:31:42 +02:00
Mike Fährmann
65b1cb7acd
[deviantart] use private access tokens for Journals (fixes #738)
2020-05-08 21:45:01 +02:00
Mike Fährmann
0bf0146bfe
[reddit] don't send OAuth headers for file downloads (fixes #729)
2020-05-08 21:42:52 +02:00
Mike Fährmann
d6a480682f
update test results
2020-05-02 21:13:00 +02:00
Leonardo Taccari
b47cfc5ac9
[speakerdeck] Add a new extractor for speakerdeck.com (#726)
2020-05-01 22:32:22 +02:00
Mike Fährmann
90491ab606
[artstation] improve embed extraction (#720)
2020-04-30 21:25:03 +02:00
Mike Fährmann
999efec5cc
[deviantart] limit API wait times to 2**9=512 seconds (#721)
2020-04-30 21:16:09 +02:00
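The cap mentioned in this commit is the usual way to bound exponential backoff. The sketch below assumes the wait doubles with every attempt, starting at 2**0; the function name and the starting exponent are illustrative assumptions, and only the 2**9 limit comes from the commit subject.

```python
def wait_time(attempt, max_exponent=9):
    """Seconds to wait before retry number 'attempt' (0-based), capped at 2**9."""
    return 2 ** min(attempt, max_exponent)


# Waits grow 1, 2, 4, ... and then stay at 512 seconds.
delays = [wait_time(n) for n in range(12)]
```

Without the cap, a long series of rate-limit responses would produce waits of hours; with it, no single wait exceeds 512 seconds.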
Mike Fährmann
504de79d8b
[vsco] fix extraction
2020-04-30 21:12:06 +02:00
Mike Fährmann
5e2974d699
[weibo] add 'videos' option
2020-04-30 00:00:30 +02:00
Mike Fährmann
9f638c2e01
[twitter] add 'replies' option (closes #705)
2020-04-29 23:20:06 +02:00
Mike Fährmann
fc3e54275b
[patreon] respect filters and sort order in query params (#711)
2020-04-28 23:58:03 +02:00
Mike Fährmann
46b9a4d8ff
[patreon] improve hash extraction (#693, #713)
...
Instead of accessing a specific part of a download URL, potentially
causing an exception if it doesn't exist, we're now searching through
all parts for a potential MD5 hash without ever raising an exception.
2020-04-28 21:47:18 +02:00
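The approach described in this commit, scanning every part of the URL instead of indexing one fixed position, can be sketched as below. The function name, the regular expression, and the empty-string fallback are assumptions; the commit only states that all parts are searched and that no exception is raised.

```python
import re

# An MD5 digest is written as exactly 32 hexadecimal characters.
MD5_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def find_md5(url):
    """Return the first URL part that looks like an MD5 hash, else ''."""
    for part in url.split("/"):
        if MD5_PATTERN.fullmatch(part):
            return part
    return ""  # nothing found: report "no hash" instead of raising
```

Indexing a fixed position such as `url.split("/")[5]` raises IndexError on shorter URLs, whereas the loop simply finds nothing.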
Mike Fährmann
c56a751dae
[newgrounds] fix URLs produced by 'following' extractors (#684)
2020-04-28 21:33:37 +02:00
Mike Fährmann
a4fd620a25
[hiperdex] revert domain back to hiperdex.com
2020-04-27 20:42:31 +02:00
Mike Fährmann
233b6f93a2
[patreon] recognize URLs with creator IDs (#711)
...
e.g. https://www.patreon.com/user/posts?u=…
2020-04-26 22:19:10 +02:00
Mike Fährmann
38b6bd66b0
[500px] match 'web.500px.com' subdomains
2020-04-26 22:17:20 +02:00
Mike Fährmann
d3b3b30107
update test results
2020-04-26 22:14:28 +02:00
Mike Fährmann
5d7ca76885
retry Cloudflare challenges
2020-04-24 22:47:27 +02:00
Mike Fährmann
3eab07739f
[twitter] ensure videos have a 'filename'
...
This usually gets set when invoking the 'ytdl' downloader, but when
that fails, the error message would use 'None' as filename.
2020-04-24 22:34:19 +02:00
Mike Fährmann
c4371a6970
[twitter] add 'reply' metadata field ( #705 )
2020-04-24 22:31:24 +02:00
Mike Fährmann
12ff23b6cc
[mastodon] improve account searches (fixes #704)
...
Searching for just the username ("@NAME") can produce multiple
unrelated results, so we now search for username + mastodon instance
("@NAME@INSTANCE")
2020-04-23 20:23:10 +02:00
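Building the fully qualified search term described in this commit is a one-liner. The helper below is hypothetical (its name and arguments are not taken from the project) and only illustrates the "@NAME@INSTANCE" form.

```python
def account_query(username, instance):
    """Build a fully qualified '@NAME@INSTANCE' search term."""
    # Accept the name with or without its leading '@'.
    return "@{}@{}".format(username.lstrip("@"), instance)
```

Including the instance narrows the search to a single account, whereas the bare "@NAME" form can match accounts with the same name on other instances.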
Mike Fährmann
400a0df661
[jaiminisbox] update decoding procedure (fixes #702)
2020-04-23 20:21:48 +02:00
Mike Fährmann
8fe858eb0e
improve parameter extraction when solving Cloudflare challenge
2020-04-22 22:08:17 +02:00
Mike Fährmann
fb98b567fa
[gelbooru] improve post ID extraction for pools
2020-04-22 21:28:18 +02:00
Mike Fährmann
d6facdee7b
[mastodon] add tests (#701)
2020-04-22 21:10:34 +02:00
Mike Fährmann
12eebb6f16
[xhamster] support xhamster.porncache.net domains (closes #700)
2020-04-22 18:31:05 +02:00
Mike Fährmann
e749402191
[mastodon] fix pagination (#701)
2020-04-22 17:58:55 +02:00
Mike Fährmann
921914141e
[imgbb] improve redirect handling
2020-04-20 23:36:57 +02:00
Mike Fährmann
6cc800aad4
[instagram] add 'post_id' and 'num' metadata fields (closes #698)
2020-04-20 22:22:29 +02:00
Mike Fährmann
a3de234e70
[hitomi] add extractor for tag searches (closes #697)
2020-04-20 21:55:19 +02:00
Mike Fährmann
456f6e8d05
[nozomi] move '_unpack()' method to global scope
2020-04-20 21:44:16 +02:00
Mike Fährmann
55ac408bdf
[hitomi] fix extraction of galleries without tags
2020-04-20 21:42:14 +02:00