Mike Fährmann
8a909e478d
[imagebam] fix extraction of NSFW images ( #1534 )
2021-05-22 21:41:44 +02:00
Mike Fährmann
b5affc62aa
[twitter] rename 'text-only' to 'text-tweets' ( #570 )
2021-05-22 21:41:12 +02:00
Mike Fährmann
724ca61f36
[twitter] add 'text-only' option ( #570 )
2021-05-22 17:01:49 +02:00
Mike Fährmann
8fd8126117
fix ISO 639-1 code for Japanese
...
"jp" -> "ja"
2021-05-22 16:07:04 +02:00
Mike Fährmann
2c60c7d798
[reactor] skip deleted/empty posts
2021-05-21 16:14:09 +02:00
Mike Fährmann
532ac79fb0
update extractor test results
2021-05-21 02:28:53 +02:00
Mike Fährmann
d7bc4a2b8b
[500px] update query hashes
2021-05-21 01:20:31 +02:00
Mike Fährmann
0f35aca728
[aryion] minor code updates
2021-05-19 23:46:33 +02:00
Mike Fährmann
2eb46452ad
[aryion] update 'needle' to not skip text posts ( fixes #1568 )
...
on "Latest Updates" pages
"class='thumb scrollthumb' href='/g4/view/" and
"class='thumb' href='/g4/view/" both end with
"thumb' href='/g4/view/"
2021-05-19 23:35:05 +02:00
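Note on the commit above: a minimal sketch of why the shorter needle matches both thumbnail variants. The two attribute strings are the ones quoted in the commit message; the helper `extract_ids`, the sample markup, and the numeric IDs are hypothetical and are not taken from the extractor's code.

```python
# The two attribute strings quoted in the commit message.
GALLERY_THUMB = "class='thumb' href='/g4/view/"
UPDATES_THUMB = "class='thumb scrollthumb' href='/g4/view/"

# The common suffix of both strings, used as the search needle.
NEEDLE = "thumb' href='/g4/view/"


def extract_ids(page, needle=NEEDLE):
    """Hypothetical helper: collect the text between each needle
    occurrence and the next single quote."""
    ids = []
    pos = 0
    while True:
        start = page.find(needle, pos)
        if start < 0:
            return ids
        start += len(needle)
        end = page.find("'", start)
        if end < 0:
            return ids
        ids.append(page[start:end])
        pos = end


# Made-up markup containing one link of each kind.
sample = (
    "<a class='thumb' href='/g4/view/111'>"
    "<a class='thumb scrollthumb' href='/g4/view/222'>"
)

print(extract_ids(sample))                 # both links are found
print(extract_ids(sample, GALLERY_THUMB))  # the longer needle misses one
```

Searching for the longer `class='thumb' href=...` string skips the `scrollthumb` variant, which is the behaviour the commit message describes for "Latest Updates" pages.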
Mike Fährmann
4fc9668922
[imgur] update URL patterns ( #1561 )
2021-05-19 15:44:10 +02:00
Mike Fährmann
1eabfa5c7a
[pillowfort] implement login with username & password ( #846 )
2021-05-19 02:59:16 +02:00
Mike Fährmann
24dd10ac3c
[patreon] extract user defined 'tags' ( #1539 , closes #1540 )
2021-05-18 00:35:52 +02:00
Mike Fährmann
a7e4917ee1
[pillowfort] add 'inline' option ( #846 )
...
to support images present in a post's 'content',
but not listed in 'media'.
also separates the file hash present at the beginning
of each 'filename' into its own field.
2021-05-17 03:03:58 +02:00
Mike Fährmann
efa6cc8ec3
[pillowfort] add 'external' option ( #846 )
...
for links to external Twitter posts etc.
2021-05-17 01:46:42 +02:00
Mike Fährmann
394fbb5f56
[twitter] strip useless t.co links ( #1532 )
...
The 'full_text' of Tweets with media content usually ends with a t.co
link to itself. This commit removes those.
2021-05-17 00:20:29 +02:00
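Note on the commit above: a rough sketch of removing a trailing t.co link from a Tweet's text. This is an assumption about one way to do it with a regular expression, not the code from this commit; the function name `strip_trailing_tco` and the sample strings are made up for illustration.

```python
import re

# Assumed pattern: a single t.co link at the very end of the text,
# together with the whitespace around it.
TRAILING_TCO = re.compile(r"\s*https://t\.co/\w+\s*$")


def strip_trailing_tco(text):
    """Hypothetical helper: drop one t.co link from the end of 'text'."""
    return TRAILING_TCO.sub("", text)


print(strip_trailing_tco("new artwork! https://t.co/AbC123xyz"))
print(strip_trailing_tco("see https://t.co/AbC123xyz for details"))
```

Only a link at the end of the text is removed; links in the middle of the text are left alone.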
Mike Fährmann
41457dbb1b
[twitter] resolve t.co URLs in 'content' ( #1532 )
2021-05-15 18:52:37 +02:00
Mike Fährmann
2b5d80862e
[kemonoparty] add 'type' metadata field ( #1556 )
...
'file', 'attachment', or 'inline'
2021-05-15 01:13:41 +02:00
Mike Fährmann
17b0ccb071
[twitter] add missing retweet media entities ( fixes #1555 )
...
from the original tweets
2021-05-14 22:51:01 +02:00
Mike Fährmann
5eeaaee01d
[pixiv] add 'metadata' option ( #1551 )
2021-05-14 20:30:28 +02:00
Mike Fährmann
0717456b4e
[kemonoparty] add 'metadata' option ( closes #1548 )
...
to fetch creator names with an additional HTTP request
2021-05-14 19:56:49 +02:00
Mike Fährmann
36ed1efcfb
[pixiv] rename "noop" value for 'tags' option to "original"
...
(#1507 )
2021-05-07 20:41:54 +02:00
Mike Fährmann
14f983eab6
[deviantart] use default ID when 'client-id' is None
2021-05-07 16:14:38 +02:00
Mike Fährmann
3e4ffb0821
[gelbooru] add extractor for '/redirect.php' URLs ( #1530 )
2021-05-07 15:34:53 +02:00
Mike Fährmann
5e54105ae4
[instagram] update query hashes
2021-05-06 19:15:18 +02:00
Mike Fährmann
b3ee10a7fb
[500px] update query hashes
2021-05-06 17:28:26 +02:00
Mike Fährmann
15b0241bbc
[imagebam] fix extraction
2021-05-06 16:47:36 +02:00
Mike Fährmann
38ae61edd4
[inkbunny] add 'favorite' extractor ( #1521 )
2021-05-04 19:28:48 +02:00
Mike Fährmann
577fffad5f
[nozomi] update 'archive_fmt' values for tag and search extractors
...
… so they actually work for posts with more than 1 file.
(fixes #1523 )
2021-05-04 19:28:37 +02:00
Mike Fährmann
c5ca7905ce
add 'noop()' and 'identity()' functions
2021-05-04 19:27:17 +02:00
HRXN
e13cae182b
[nozomi] Extend default archive-fmt for Tag and Search Extractor ( #1529 )
...
Closes #1523
2021-05-04 19:26:35 +02:00
Mike Fährmann
2133f1d77f
[readcomiconline] change domain to 'readcomiconline.li'
...
(closes #1517 )
2021-05-01 16:41:16 +02:00
Mike Fährmann
66f28e471c
[kemonoparty] update file URLs directly linking to kemono.party
...
(#1514 )
2021-05-01 02:30:10 +02:00
Mike Fährmann
6fa20d456b
[sankaku] update invalid-token detection ( fixes #1515 )
2021-04-30 22:04:45 +02:00
Mike Fährmann
4b65ebf652
[kemonoparty] fix file URLs ( #1514 )
...
files are now hosted on https://data.kemono.party/
2021-04-29 19:36:34 +02:00
Mike Fährmann
fa519f9202
[pixiv] change 'translated-tags' option ( #1507 )
...
- rename to 'tags'
- use string-values: "japanese", "translated", "noop"
- remove duplicate entries for "translated" tags
2021-04-29 19:30:43 +02:00
Mike Fährmann
221015e586
[downloader:http] disable filename extension changes for ugoira
...
(#1507 )
2021-04-27 01:29:09 +02:00
thatfuckingbird
e47952ac14
add extractors for fantia and fanbox ( #1459 )
...
* add extractors for fantia and fanbox
* appease linter
* make docstrings unique
* [fantia] refactor post extraction
* [fantia] capitalize
* [fantia] improve regex pattern
* code style
* capitalize
* [fanbox] use BASE_PATTERN for url regexes
* [fanbox] refactor metadata and post extraction
* [fanbox] improve url base pattern
* [fanbox] accept creator page links ending with /posts
* [fanbox] more tests
* [fantia] improved pagination
* [fanbox] misc. code logic improvements
* [fantia] finish restructuring pagination code
* [fanbox] avoid making a request for each individual post when processing a creator page
* [fanbox] support embedded videos
* [fanbox] fix errors
* [fanbox] document extractor.fanbox.videos
* [fanbox] handle "article" and "entry" post types, all embeds
* [fanbox] fix downloading of embedded fanbox posts
2021-04-25 19:39:13 +02:00
Mike Fährmann
d900edfcfb
[simplyhentai] fix extraction
2021-04-25 18:51:43 +02:00
Mike Fährmann
ba8180b5e6
[bcy] don't crash with deleted posts
2021-04-25 18:51:09 +02:00
Mike Fährmann
d108421461
[myportfolio] fix extraction
2021-04-24 01:22:57 +02:00
Mike Fährmann
8b22d4e667
[mangapark] use '"browser": "firefox"' by default
...
to get rid of Cloudflare CAPTCHA responses
2021-04-23 23:21:02 +02:00
Mike Fährmann
9514cb8c12
[exhentai] update 'limits' check ( #1487 )
...
Only use 'limits' to set a custom upper bound.
Checking if the actual maximum gets exceeded is not necessary.
2021-04-23 23:20:45 +02:00
thatfuckingbird
141ca4ac0a
[pixiv] also save untranslated tags when translated-tags is enabled ( #1501 )
2021-04-23 23:02:41 +02:00
Renan Vedovato Traba
9322c5e43b
[exhentai] restore limit config ( #1487 )
...
This partially reverts commit e9ec91c8
2021-04-22 21:21:41 +02:00
Mike Fährmann
cb86bb9cc9
[hentaicosplays] add 'slug' metadata field ( closes #1483 )
2021-04-19 16:28:01 +02:00
Mike Fährmann
dddda7d0e7
[hentaicosplays] use GalleryExtractor ( #1473 )
2021-04-18 20:30:39 +02:00
Mike Fährmann
d88e34f17e
[webtoons] use GalleryExtractor
2021-04-18 20:28:31 +02:00
Mike Fährmann
c4210b5371
[webtoons] update agegate/GDPR cookies
2021-04-18 20:28:31 +02:00
Mike Fährmann
d89eb7536b
[naverwebtoon] use GalleryExtractor
2021-04-18 20:28:31 +02:00
Mike Fährmann
9b52eb9bf1
[naverwebtoon] ignore non-comic images
2021-04-18 20:28:30 +02:00