2817 Commits

Author SHA1 Message Date
Mike Fährmann
105f3c9666
[twitter] add extractor for direct image links (closes #1417) 2021-04-02 02:45:23 +02:00
Mike Fährmann
ec3d5d58a8
[vk] improve extractor (#474)
- fetch all photos
- add 'metadata' option
- fix extracting photos without '?' in URL
2021-04-01 14:35:56 +02:00
Mike Fährmann
ebd142e2a8
[twitter] don't use youtube-dl for cards when videos are disabled (#1416)
2021-04-01 14:26:08 +02:00
Mike Fährmann
d5aad999dc
[tapas] implement login with username & password (#692) 2021-03-30 01:45:28 +02:00
Mike Fährmann
e9ec91c811
[exhentai] improve image limits check
- check if current image is the '509 Bandwidth Exceeded' notification
  (https://ehgt.org/g/509.gif or https://exhentai.org/img/509.gif)
- remove 'limits' option
2021-03-29 19:01:13 +02:00
Mike Fährmann
387fe415d5
unescape items in text.split_html() 2021-03-29 02:12:29 +02:00
Mike Fährmann
36291176bc
[pinterest] add 'search' extractor (#1411) 2021-03-29 01:41:28 +02:00
Mike Fährmann
058cc47e9b
[bcy] improve pagination 2021-03-28 23:08:26 +02:00
Mike Fährmann
ddd48ceee5
update extractor test results 2021-03-28 23:06:44 +02:00
Mike Fährmann
1a540fbe00
[komikcast] fix extraction 2021-03-28 21:18:58 +02:00
Mike Fährmann
78fd63b8f0
remove 'text.clean_xml()'
was not used anywhere
2021-03-28 04:05:16 +02:00
Mike Fährmann
8553b218d9
replace calls to 'os.path.splitext()' with 'str.rpartition()'
Makes the functions that used it more than twice as fast,
and we can get rid of an import as well.
2021-03-28 04:01:27 +02:00
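A minimal sketch of the replacement described in the commit above, assuming the goal is to read a file extension from a plain filename. The helper names `ext_splitext` and `ext_rpartition` are hypothetical and are not taken from the project's code.

```python
import os.path


def ext_splitext(filename):
    """Return the extension without the leading dot, via os.path.splitext()."""
    return os.path.splitext(filename)[1][1:]


def ext_rpartition(filename):
    """Return the text after the last dot, or "" if there is no dot."""
    _, sep, ext = filename.rpartition(".")
    return ext if sep else ""


if __name__ == "__main__":
    for name in ("image_01.final.jpg", "noext", ".bashrc"):
        print(repr(name), repr(ext_splitext(name)), repr(ext_rpartition(name)))
```

The two approaches are not equivalent in every case: `os.path.splitext()` treats a leading dot as part of the name, and `str.rpartition()` knows nothing about path separators, so the swap is only safe where the input is known to be a simple filename.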
Mike Fährmann
5aa30c3669
[tapas] add 'series' and 'episode' extractors (#692) 2021-03-27 18:28:16 +01:00
Mike Fährmann
ccfa5a8694
[twitter] better error message when logging in with 2FA (#1409) 2021-03-27 18:26:37 +01:00
Mike Fährmann
214ecf62ce
[deviantart] fix arguments for search/popular results (#1408) 2021-03-27 18:26:10 +01:00
Magnus Boman
522d0a834c
[aryion] Unescape paths too (#1414)
Without this you'll get paths like this:
  - Starcross - Ch. 2 &quot;The Ins and Outs of Sarah&quot;

This commit changes it to:
  - Starcross - Ch. 2 "The Ins and Outs of Sarah"
2021-03-27 18:25:38 +01:00
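A small illustration of the unescaping step mentioned in the commit above, using the standard library's `html.unescape()`. The escaped form of the title (with `&quot;` entities) is an assumption, and `unescape_path` is a hypothetical helper, not the extractor's actual code.

```python
import html


def unescape_path(segments):
    """Unescape HTML entities in every path segment (hypothetical helper)."""
    return [html.unescape(segment) for segment in segments]


if __name__ == "__main__":
    # Assumed escaped input; the real page markup may differ.
    escaped = ["Starcross", "Ch. 2 &quot;The Ins and Outs of Sarah&quot;"]
    print(" - ".join(unescape_path(escaped)))
```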
beesdotjson
5ad615f0db
fix PixivFavoriteExtractor regex (#1405)
* fix PixivFavoriteExtractor regex

* do not use lookbehind
2021-03-25 14:59:33 +01:00
Mike Fährmann
62cfee4d28
[vk] initial support for albums (#474) 2021-03-23 19:02:16 +01:00
Mike Fährmann
0e601de67b
[sankaku] simplify 'pool' tags (#1388)
normalize 'tags' and 'artist_tags' to a string-list
2021-03-23 18:45:45 +01:00
Mike Fährmann
d085ade9d5
[sankaku] add 'tag_string' metadata field (#1388)
The 'join()'ed version of 'tags'.
Handling lists in format strings isn't properly supported yet.
2021-03-23 15:42:13 +01:00
Mike Fährmann
2dffd231b7
[sankaku] add enumeration index for books (#1388) 2021-03-23 15:32:54 +01:00
Mike Fährmann
139fb84108
[deviantart] fix username for 'watch' results (#794)
before it'd use "/" as username
2021-03-22 22:14:21 +01:00
Mike Fährmann
91c2e15da9
[deviantart] add support for posts from watched users (#794) 2021-03-22 19:25:04 +01:00
Mike Fährmann
03c20d8c8e
[deviantart] update 'watch' URL pattern (#794) 2021-03-21 22:48:06 +01:00
Mike Fährmann
2846235669
[twitter] allow specifying a custom format for user results (#1337)
2021-03-21 22:26:26 +01:00
Mike Fährmann
bf241811dd
allow '_extractor' fields to be None or empty 2021-03-20 01:19:31 +01:00
Mike Fährmann
dc23cfd684
[deviantart] use fallback for /intermediary/ URLs
instead of checking availability with HEAD requests
2021-03-20 00:10:53 +01:00
Mike Fährmann
15daa62842
release version 1.17.1 2021-03-19 19:14:04 +01:00
Mike Fährmann
b0438c8f99
Revert "[deviantart] extend 'extra' option"
This reverts commit
5ad2b9c82bd9a92b80b935cb268cedb35008da86,
5c32a7bf58176bb5d2c5e22260cfe1d8a0844808, and
83f465faca3107c6406972d913d3f194412d9494.

(#1387, #1356)
2021-03-19 16:24:23 +01:00
Mike Fährmann
58b93635ee
[architizer] add 'firm' extractor (#1369) 2021-03-19 01:31:34 +01:00
Mike Fährmann
204523611c
[imgclick] use 'http://' for image URLs
The TLS certificate for main.imgclick.net is invalid.
2021-03-19 01:30:49 +01:00
Mike Fährmann
0b55f5ad84
[imgur] fix/improve rate limit handling (#1386)
- also wait-and-retry on 429 status codes
- use infinite loop instead of recursive calls
- 'extractor.sleep()' -> 'extractor.wait()'
2021-03-18 15:45:26 +01:00
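The wait-and-retry pattern named in the commit above can be sketched roughly as follows. This is not the extractor's implementation: `fetch` is a hypothetical callable returning `(status_code, body)`, and the `delay` and `max_tries` parameters are assumptions added so the sketch terminates (the commit itself describes an infinite loop).

```python
import time


def request_with_retry(fetch, wait=time.sleep, delay=60.0, max_tries=5):
    """Call fetch() in a loop until it returns a status other than 429.

    fetch     -- hypothetical callable returning (status_code, body)
    wait      -- sleep function, injectable so tests do not block
    delay     -- seconds to wait between attempts (assumed value)
    max_tries -- upper bound on attempts (assumption, not in the commit)
    """
    tries = 0
    while True:  # a loop instead of a recursive call, so the stack stays flat
        status, body = fetch()
        if status != 429:
            return status, body
        tries += 1
        if tries >= max_tries:
            raise RuntimeError("still rate limited after %d tries" % tries)
        wait(delay)
```

Using a loop keeps the call stack at constant depth no matter how long the rate limit lasts, which a recursive retry cannot guarantee.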
Mike Fährmann
69ca4e29f1
[deviantart] add 'watch' extractor (#794) 2021-03-17 22:50:02 +01:00
Mike Fährmann
fcdda6128c
[mangastream] remove module 2021-03-16 23:52:36 +01:00
Mike Fährmann
c677ea19dd
[mangareader] remove module 2021-03-16 23:48:55 +01:00
Mike Fährmann
71523aaab6
[architizer] add 'project' extractor (#1369) 2021-03-16 03:24:29 +01:00
Mike Fährmann
3378b39719
[twitter] implement 'users' option (#1337) 2021-03-16 00:51:05 +01:00
Mike Fährmann
847e9b0ed7
[philomena] support post URLs without '/images/'
e.g. 'derpibooru.org/1'
2021-03-14 18:26:39 +01:00
Mike Fährmann
466966bf83
[hentaicafe] remove module 2021-03-14 17:19:57 +01:00
Mike Fährmann
97641cd151
[hentainexus] remove module 2021-03-14 17:19:57 +01:00
Mike Fährmann
23641742a3
improve 'parent-directory' (#1364)
Allow forwarding metadata from the top-level extractor to all children
if 'parent-directory' is enabled for all extractors along the way.

For example 'reddit' -> 'gfycat' -> 'redgifs'
2021-03-14 17:19:57 +01:00
Mike Fährmann
c485d0a956
[philomena] add generalized extractors for philomena sites (closes #1379)
2021-03-14 17:19:57 +01:00
Mike Fährmann
6be7df53da
[hentaifox] improve metadata extraction (fixes #1378) 2021-03-14 17:19:56 +01:00
Mike Fährmann
72fe9ac0f3
[gelbooru_v01] support some more boorus by default
- https://drawfriends.booru.org/
- https://vidyart.booru.org/
- https://tlb.booru.org/
2021-03-14 17:19:56 +01:00
tux93
10c279f285
Weasyl: Drop the &feature=submit part of the favourite extractor URL (#1374)
It's optional and requiring it forces users to escape those URLs because
of the ampersand
2021-03-12 16:56:04 +01:00
Mike Fährmann
df94182e11
implement 'parent-metadata' option (#1364)
experimental, might not work as expected, etc.
2021-03-11 01:10:34 +01:00
Mike Fährmann
4be27ff0fe
[nozomi] support '/index-N.html' URLs (closes #1365)
and '/index-Popular-N.html'
2021-03-11 01:06:47 +01:00
Mike Fährmann
780bac4c8a
[gelbooru] update video server (fixes #1368)
from 'https://img2.gelbooru.com' to 'https://img3.gelbooru.com'
and provide fallback URLs
2021-03-10 01:48:07 +01:00
Mike Fährmann
f8441e851a
[hentaifox] improve image extraction (fixes #1366)
build image URLs from embedded JSON data
instead of rewriting thumbnail URLs
2021-03-10 01:38:32 +01:00
Mike Fährmann
c7c3fef0bc
[exhentai] support '/tag/' URLs (closes #1363) 2021-03-08 22:40:51 +01:00