970 Commits

Author SHA1 Message Date
Mike Fährmann
d69db60e2a
update unit test results 2018-10-02 20:37:46 +02:00
Mike Fährmann
f8b3b00249
[twitter] add experimental 'videos' option (#99)
Enabling this option will detect videos in tweets and output them as
"unsupported" URLs, so that these can then be downloaded with youtube-dl

There are a lot of improvements to be made to the current
implementation, but it works and does what it is supposed to, even if
it is as inefficient as can be ...
2018-09-30 21:52:23 +02:00
Mike Fährmann
5507f5ce2e
[komikcast] fix extraction 2018-09-29 16:37:30 +02:00
Mike Fährmann
8080071174
[flickr] improve album metadata (closes #109) 2018-09-29 16:21:55 +02:00
Mike Fährmann
537448ba6e
[yuki] fix extraction of older threads (closes #112) 2018-09-29 11:38:55 +02:00
Mike Fährmann
1acaed73e0
[warosu] improve extraction and metadata
- convert values to int
- unquote original filenames
- don't parse posts twice
2018-09-28 13:03:12 +02:00
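The first two points of the warosu commit above can be sketched roughly as follows. This is an illustration only: the post dict and its keys ("no", "w", "h", "filename") are hypothetical and not taken from the extractor.

```python
from urllib.parse import unquote


def clean_post(post):
    """Return a copy of `post` with numeric strings as int and the filename unquoted."""
    cleaned = dict(post)
    # Hypothetical numeric fields; the real extractor may use other names.
    for key in ("no", "w", "h"):
        if key in cleaned:
            cleaned[key] = int(cleaned[key])
    # Percent-encoded original filenames become readable text again.
    if "filename" in cleaned:
        cleaned["filename"] = unquote(cleaned["filename"])
    return cleaned
```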
Mike Fährmann
2cf3f53839
[yuki] add thread extractor (closes #111) 2018-09-28 12:47:32 +02:00
Mike Fährmann
c402cc4047
[hentaifoundry] add 'popular' and 'recent' extractors
for "Popular Pictures" and "Recent Pictures" listings
2018-09-24 13:11:18 +02:00
Mike Fährmann
a5fc311dfa
[hentaifoundry] add 'favorite' extractor 2018-09-22 21:23:29 +02:00
Mike Fährmann
1c95a0173f
[hentaifoundry] split 'artist' into 'user'+'artist'
and some smaller changes ...

'user' is the name of the account an image is listed at and
'artist' is now the name of the account who created the image.

For example "https://www.hentai-foundry.com/user/Tenpura/faves/pictures"
- 'user': Tenpura
- 'artist' of the only image: LewdBrush
2018-09-22 21:21:07 +02:00
Mike Fährmann
e066f35118
update extractor tests 2018-09-21 11:25:56 +02:00
Mike Fährmann
006f75b538
[hentaifoundry] rewrite + more metadata
- extract width, height, artist per image
- improve pattern regex
- better extensibility for other listings
2018-09-21 11:23:51 +02:00
Mike Fährmann
eeb7424783
[hentaifoundry] add support for "scraps" (#110) 2018-09-20 13:41:23 +02:00
Mike Fährmann
6ea9a78588
[wallhaven] add login capabilities
Being logged in is required to access NSFW wallpapers.
2018-09-19 21:04:01 +02:00
Mike Fährmann
c9290d8212
[wallhaven] add wallpaper and search extractors
todo:
- login support to gain access to NSFW wallpapers
- extractors for tag-, similar-, latest-listings
- skip() support
2018-09-17 21:26:13 +02:00
Mike Fährmann
26cbcb3a72
[flickr] improve error handling (#109) 2018-09-17 10:12:14 +02:00
Mike Fährmann
2be4c9ffe3
[sankaku] small code improvements 2018-09-16 21:01:28 +02:00
Mike Fährmann
529aa21dd9
move FileAdapter definition into recursive.py 2018-09-16 20:59:22 +02:00
Mike Fährmann
22ab509a70
[bobx] rename "model" to "idol" extractor 2018-09-14 18:11:36 +02:00
Mike Fährmann
99137f1bee
[sankaku] send login info as formdata
Previously they were erroneously sent as URL parameters.
2018-09-14 17:54:15 +02:00
Mike Fährmann
fa64c38d5b
[sankaku] fix pagination for user favorites (#106) 2018-09-14 17:51:46 +02:00
Mike Fährmann
69fd61ea86
[bobx] add gallery and model extractors 2018-09-13 20:13:12 +02:00
Mike Fährmann
0232d80cec
[deviantart] convert 'published_time' to int (fixes #108)
The 'published_time' field (a timestamp) changed from integer to string
and caused journal creation to fail.
2018-09-13 19:52:01 +02:00
Mike Fährmann
7742cf8601
[tumblr] change 'reblogs' option (#103)
- rename "deleted" to "same-blog"
- change test for deleted original post to test if
  original post owner has the same UUID (full blog name) as the one
  being downloaded from
- add 'blog[uuid]' metadata to allow comparison with
  'reblogged_from_uuid'
2018-09-10 15:40:25 +02:00
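A minimal sketch of the "same-blog" comparison described in the tumblr commit above. It assumes 'blog[uuid]' maps to a nested dict entry and that 'reblogged_from_uuid' is absent for posts that are not reblogs. The function name is hypothetical.

```python
def is_same_blog_reblog(post):
    """True if `post` is not a reblog, or was reblogged from the blog it sits on."""
    source_uuid = post.get("reblogged_from_uuid")
    if source_uuid is None:
        return True  # assumption: a missing field means an original post
    # Compare the original post owner with the blog being downloaded from.
    return source_uuid == post["blog"]["uuid"]
```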
Mike Fährmann
d4d95d3154
[tumblr] improve rewrite rules for video URLs 2018-09-09 14:09:47 +02:00
Mike Fährmann
542a25c389
[ngomik] fix extraction 2018-09-09 13:45:40 +02:00
Mike Fährmann
a666ddd16b
[tumblr] extend 'reblogs' functionality (#103)
Setting 'reblogs' to "deleted" will check if the parent post of a
reblog has been deleted and download its media content if that is the
case, otherwise it will be skipped.

This is a rather costly operation (1 API request per reblogged post)
and should therefore be used with care.
2018-09-07 19:13:52 +02:00
Mike Fährmann
c9b8e6aefc
[reddit] fix submission-ID parsing (#104)
Uppercase characters caused a ValueError exception
2018-09-07 18:27:54 +02:00
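The reddit commit above does not say where the ValueError came from. One plausible cause, shown here purely as an assumption, is a hand-written base36 decoder that looks each character up in a lowercase alphabet with str.index, which raises ValueError for uppercase input. Lowercasing the ID first avoids that. The names below are hypothetical.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def decode_base36(text):
    """Decode a base36 string, treating uppercase and lowercase letters alike."""
    number = 0
    # str.index raises ValueError for characters outside ALPHABET,
    # so normalize the case before the lookup.
    for char in text.lower():
        number = number * 36 + ALPHABET.index(char)
    return number
```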
Mike Fährmann
488abeca0b
[hentaicafe] adjust default directory format
A separate folder for each chapter is rather pointless if almost all
manga have only one chapter each.
2018-09-07 18:25:58 +02:00
Mike Fährmann
b4eca2633e
[tumblr] support /archive URLs 2018-09-06 11:09:13 +02:00
Mike Fährmann
aa1de70da0
[tumblr] recognize inline videos (#102) 2018-09-06 10:37:40 +02:00
Mike Fährmann
3ecea4cf36
[hentaicafe] add chapter and manga extractors (#101) 2018-09-05 21:08:40 +02:00
Mike Fährmann
0bc8ef51c8
[smugmug] Handle albums with no explicit owner (#100) 2018-09-01 12:55:02 +02:00
Mike Fährmann
b47af4637a
[mangadex] update URL pattern
Manga URLs now begin with /title/ instead of /manga/
2018-08-31 20:16:50 +02:00
Mike Fährmann
75862715ac
[behance] add user extractor 2018-08-31 17:42:09 +02:00
Mike Fährmann
a493fed376
[deviantart] fix journal creation if no 'username' is set 2018-08-31 17:38:12 +02:00
Mike Fährmann
5b8a314de7
[tumblr] replace inline URLs with higher quality ones (#98) 2018-08-25 18:43:51 +02:00
Mike Fährmann
2af2bb7911
[mangadex] fix relative page URLs 2018-08-25 11:07:26 +02:00
Mike Fährmann
34b556922d
update/restore tests 2018-08-23 15:47:40 +02:00
Mike Fährmann
ab2bfaeb46
[ngomik] add replacement for 'subapics'
http://subapics.com/ got discontinued and replaced by http://ngomik.in/.

ngomik.in is still displaying a link to the "old site" showing a big
"Account Suspended" sign.
2018-08-23 15:29:53 +02:00
Mike Fährmann
a2eeef1f5e
[behance] replace test
The "UVMW Studio" account and their galleries are gone.
2018-08-19 21:17:21 +02:00
Mike Fährmann
e9dd2eff1d
[twitter] add extractor for media-tweet timelines (#96)
For example "https://twitter.com/PicturesEarth/media".
They are different from normal timelines in that they do not contain
any (re)tweets from other users and feature all media the user ever
posted, including responses to other tweets.
2018-08-19 20:46:12 +02:00
Mike Fährmann
f45c9f2141
[gfycat] test-updates and code-adjustments 2018-08-18 23:04:45 +02:00
Mike Fährmann
9b1c39032c
[twitter] changes and improvements
- rename User- to TimelineExtractor
- rename 'userid' to 'user_id' to conform to the other ..._id values
- adjust archive_fmt to deal with retweets
- emulate browser behavior for API calls
2018-08-18 23:04:45 +02:00
Mike Fährmann
10365394d7
[twitter] add support for user-timelines (closes #96)
also adds a 'retweets' option to filter retweeted content
2018-08-17 20:04:11 +02:00
Mike Fährmann
d3f1eed2a6
[pinterest] improvements
- add stop condition for pin-related pins
- improve URL patterns
- make Pylint happy
2018-08-16 18:11:39 +02:00
Mike Fährmann
2801a0d997
[exhentai] skip "Content Warning" page when not logged in
(closes #97)
2018-08-16 09:17:22 +02:00
Mike Fährmann
63fa0b2006
[pinterest] add extractors for related pins
Related pins can now be accessed by adding a "#related" fragment
to the end of a Pinterest URL, for example:
- https://www.pinterest.com/pin/858146903966145189/#related
- https://www.pinterest.com/g1952849/test-/#related

There are no explicit real URLs for related pins,
using an option to enable them results in "clunky" code,
and a custom "related:<URL>" scheme doesn't feel right either.
2018-08-15 21:49:45 +02:00
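A minimal sketch of detecting the "#related" fragment mentioned in the pinterest commit above, using only the standard library. The function name is hypothetical and the extractor itself may match the fragment differently, for example in its URL pattern.

```python
from urllib.parse import urlsplit


def wants_related(url):
    """True if the URL ends in a '#related' fragment."""
    return urlsplit(url).fragment == "related"
```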
Mike Fährmann
1694039de0
[komikcast] update ad-filter 2018-08-15 21:49:44 +02:00
Mike Fährmann
a74591b84b
[tumblr] remove "original image" functionality
Accessing higher/original quality images on
https://s3.amazonaws.com/data.tumblr.com and http://data.tumblr.com
is no longer possible and any HTTP request results in 403 Forbidden.

A few images can still be accessed through https://a.tumblr.com [1][2],
but not as "_raw", just "_1280", and that might also be "fixed" in
the near future.

[1] https://a.tumblr.com/tumblr_kzjlfiTnfe1qz4rgho1_1280.jpg
[2] https://a.tumblr.com/ee589c6345f29d2d5935cecb49b0a705/tumblr_oztu02dIHp1wgha4yo1_1280.png
2018-08-14 11:51:17 +02:00