540 Commits

Author SHA1 Message Date
Mike Fährmann
67bad04dda
[formatter] add 'g' conversion to sluGify a string () 2022-08-26 17:57:17 +02:00
Mike Fährmann
6990ad0ba8
[formatter] do NOT apply :J to strings () 2022-08-16 16:41:19 +02:00
Mike Fährmann
c0051d7d4c
fix test 2022-08-01 21:40:35 +02:00
Mike Fährmann
dd3a6a9fd1
make 'enumerate_reversed()' work with generators () 2022-08-01 14:08:44 +02:00
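A minimal sketch of how a reversed enumeration can accept generators, which have no length and cannot be passed to reversed(). Materializing the input into a list is an assumption made for illustration, not necessarily what the commit does.

```python
def enumerate_reversed(iterable, start=0):
    """Return (index, element) pairs of 'iterable' in reverse order.

    Assumption: the input is materialized into a list first, so that
    generators (no len(), no reversed()) are handled like sequences.
    """
    pairs = list(enumerate(iterable, start))
    pairs.reverse()
    return pairs


# Works with a generator expression as well as with a list.
for index, char in enumerate_reversed(c for c in "abc"):
    print(index, char)
```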
Mike Fährmann
0c73914848
[postprocessor:metadata] implement 'mode: modify' () 2022-07-19 12:24:26 +02:00
Mike Fährmann
f3de6b7a87
[postprocessor:metadata] implement 'mode: delete' () 2022-07-19 00:57:29 +02:00
Mike Fährmann
9704c04172
[postprocessor:zip] ensure target directory exists () 2022-07-14 11:55:39 +02:00
Mike Fährmann
74865adae5
implement 'format-separator' option ()
a global option that serves as a workaround for shortcomings due to
the lack of a proper format string parser
2022-07-10 13:31:43 +02:00
bradenhilton
117eeefda0
[postprocessor:mtime] add 'value' option () 2022-07-08 20:56:01 +02:00
Mike Fährmann
90ae48c40c
[formatter] implement 'O' format specifier ()
to apply a UTC offset to 'date' values and other datetime objects
2022-07-08 12:51:03 +02:00
Mike Fährmann
04bed1eba3
[formatter] allow for custom "format" functions () 2022-07-05 12:22:01 +02:00
Mike Fährmann
54525d2e21
[formatter] implement slice operator as format specifier
this allows using a slice operator alongside other (special) format
specifiers like J, to first join list elements to a string and then
trim that with a slice.

{tags:J, /[:50]}
2022-06-25 16:52:58 +02:00
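The effect of {tags:J, /[:50]} can be approximated in plain Python as a join followed by a slice. The helper name join_then_slice is hypothetical and only illustrates the order of operations described in the commit message.

```python
def join_then_slice(values, separator=", ", stop=50):
    """Join list elements into one string, then trim it with a slice.

    Hypothetical helper: mirrors 'J<separator>/' followed by '[:stop]'.
    """
    joined = separator.join(values)
    return joined[:stop]


tags = ["landscape", "sunset", "mountains"]
print(join_then_slice(tags, stop=20))
```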
Mike Fährmann
241e82e18d
[horne] add support for horne.red () 2022-06-25 16:52:16 +02:00
Mike Fährmann
42525cfe8d
fix '{…!j}' for otherwise non-serializable types (#2624)
like 'datetime'
2022-06-07 17:47:07 +02:00
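For context, json.dumps() raises TypeError for objects such as 'datetime' unless a fallback is supplied. The sketch below uses default=str as that fallback; whether the commit uses the same fallback is an assumption.

```python
import json
from datetime import datetime


def to_json(value):
    """Serialize 'value' to JSON, converting unsupported types via str().

    Assumption: str() is an acceptable fallback representation.
    """
    return json.dumps(value, default=str)


print(to_json({"title": "example", "date": datetime(2010, 1, 1)}))
```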
Mike Fährmann
5b43faffed
[postprocessor:metadata] write to stdout by setting filename to "-"
()
2022-05-30 21:17:31 +02:00
Mike Fährmann
6ad39f2b68
add ytdl tests
they only run when youtube-dl or yt-dlp are installed,
i.e. if __import__("<ytdl-package>") succeeds
2022-05-23 18:30:26 +02:00
Mike Fährmann
688d6553b4
replace calls to print() with stdout_write() () 2022-05-19 17:09:24 +02:00
Mike Fährmann
f3408a9d92
implement string literals in replacement fields
- either {_lit[foo]} or {'foo'}
- useful as alternative for empty metadata fields: {title|'no title'}
- due to using '_string.formatter_field_name_split()' to parse format
  strings, using certain characters will result in an error: [].:!
2022-05-09 23:49:33 +02:00
Mike Fährmann
c4b9f7bab8
update functions working with cookies.txt files
- rename
  - load_cookiestxt -> cookiestxt_load
  - save_cookiestxt -> cookiestxt_store
- in cookiestxt_load, add cookies directly to a cookie jar
  instead of storing them in a list first
- other unnoticeable performance increases
2022-05-06 13:21:29 +02:00
Mike Fährmann
ca3a364db7
fix build_duration_func() ()
for extractors with request_interval_min > 0
2022-04-27 20:28:14 +02:00
Mike Fährmann
7fe54bab2a
attempt to fix some issues with 'contains()' ()
add a third argument that gets used
when the values to search are given as a string
2022-04-08 14:40:26 +02:00
Mike Fährmann
d78a2c7163
re.escape() arguments for 'contains()' () 2022-04-07 15:35:54 +02:00
Mike Fährmann
413b77757b
implement 'contains()' ()
and add it to globals() in compiled expressions for --filter etc
2022-03-30 16:18:33 +02:00
Mike Fährmann
e7b30866d0
[postprocessor:mtime] fix timestamps from datetime objects ()
'datetime.timestamp()', which got used to convert datetime objects to
POSIX timestamps, assumes naive datetimes represent LOCAL time, while
datetimes in 'date' metadata fields represent UTC time.

Ref: https://docs.python.org/3/library/datetime.html#datetime.datetime.timestamp
> Naive datetime instances are assumed to represent local time
> you can obtain the POSIX timestamp by … calculating the timestamp directly
2022-03-23 23:05:14 +01:00
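The difference described above can be sketched with the standard library alone. The function name utc_naive_to_timestamp is hypothetical; the calculation follows the "calculating the timestamp directly" approach from the Python documentation quoted in the commit message.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1)


def utc_naive_to_timestamp(dt):
    """Convert a naive datetime that represents UTC to a POSIX timestamp.

    Unlike dt.timestamp(), this does not interpret 'dt' as local time.
    """
    return (dt - EPOCH).total_seconds()


date = datetime(2010, 1, 1)
# Same result as attaching UTC explicitly before calling timestamp():
print(utc_naive_to_timestamp(date))
print(date.replace(tzinfo=timezone.utc).timestamp())
```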
Mike Fährmann
29db716a63
implement 'datetime_to_timestamp()'
and rename 'to_timestamp()'
to the more descriptive 'datetime_to_timestamp_string()'
2022-03-23 22:36:01 +01:00
Mike Fährmann
8295bc6d97
fix loading/storing cookies without domain 2022-03-19 15:14:55 +01:00
Mike Fährmann
500a479026
fix a third(!) bug in _check_cookies() ()
turns out tests are worthless if you get em wrong ...
2022-03-18 19:52:37 +01:00
Mike Fährmann
cf44aba333
[formatter] allow evaluating f-string literals
by starting a format string with '\fF'.

This was technically already possible with '\fE',
but this makes it a bit more convenient.
2022-03-18 13:31:01 +01:00
Mike Fährmann
94452761ed
fix cookies tests 2022-03-11 18:16:00 +01:00
Mike Fährmann
bddcec49f1
implement 'text.root_from_url()'
use domain from input URL for kemono
2022-03-01 03:09:57 +01:00
Mike Fährmann
f5b2b9333f
fix another bug in _check_cookies ()
regression introduced in ed317bfc

Added a couple of tests to hopefully catch such bugs
before they land in a release.
2022-02-16 22:58:57 +01:00
Mike Fährmann
563bd0ecf4
[danbooru] inherit from BaseExtractor
- merge danbooru and e621 code
- support booru.allthefallen.moe (closes )
- remove support for old e621 tag search URLs
2022-02-11 21:01:51 +01:00
Mike Fährmann
b5b4f5a168
use 'build_extractor_filter' in test_results.py 2021-12-28 17:25:07 +01:00
Mike Fährmann
64cf26eaf4
allow specifying sleep-* options as string
either as single value or as range: "3.5", "2.1 - 5.0"
2021-12-18 23:28:56 +01:00
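One way to interpret such string values is sketched below. The function name parse_duration and the choice of random.uniform for ranges are assumptions for illustration; the project's own parsing may differ.

```python
import random


def parse_duration(value):
    """Return a function that yields a sleep duration in seconds.

    Accepts a number, a string like "3.5", or a range like "2.1 - 5.0".
    Assumption: ranges are sampled uniformly between both bounds.
    """
    if isinstance(value, str):
        bounds = [float(part) for part in value.split("-")]
    else:
        bounds = [float(value)]

    if len(bounds) == 1:
        return lambda: bounds[0]
    low, high = bounds
    return lambda: random.uniform(low, high)


print(parse_duration("3.5")())
print(parse_duration("2.1 - 5.0")())
```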
Mike Fährmann
010d65dcec
extend blacklist/whitelist syntax ()
Each entry in such a list can now also include a subcategory
'<category>:<subcategory>'
and it is possible to use '*' or an empty string as placeholder
'*:<subcategory>', ':<subcategory>', '<category>:*'

For example
  "blacklist": "imgur,*:tag,gfycat:user" or
  "blacklist": ["imgur", "*:tag", "gfycat:user"]
will filter all 'imgur' extractors, all extractors with a 'tag'
subcategory (e.g. https://danbooru.donmai.us/posts?tags=bonocho),
and all 'gfycat' user extractors.
2021-11-23 20:31:43 +01:00
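The matching rules described above can be sketched as follows. Both function names are hypothetical, and treating a bare '<category>' entry as '<category>:*' is an assumption based on the example.

```python
def parse_entries(spec):
    """Turn a blacklist/whitelist value into (category, subcategory) pairs.

    Accepts a comma-separated string or a list of strings; '*' and the
    empty string both act as placeholders that match anything.
    """
    if isinstance(spec, str):
        spec = spec.split(",")
    pairs = []
    for entry in spec:
        category, _, subcategory = entry.strip().partition(":")
        pairs.append((category or "*", subcategory or "*"))
    return pairs


def is_listed(category, subcategory, pairs):
    """Check whether an extractor matches any parsed entry."""
    return any(
        cat in ("*", category) and sub in ("*", subcategory)
        for cat, sub in pairs
    )


pairs = parse_entries("imgur,*:tag,gfycat:user")
print(is_listed("danbooru", "tag", pairs))
print(is_listed("gfycat", "image", pairs))
```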
Mike Fährmann
af6424f398
allow testing metadata in list elements 2021-11-21 22:46:34 +01:00
Mike Fährmann
3842cdcd8f
[formatter] implement 'D' format specifier
To be able to parse any string into a 'datetime' object
and format it as necessary.

Example:

{created_at:D%Y-%m-%dT%H:%M:%S%z}
->
"2010-01-01 00:00:00"

{created_at:D%Y-%m-%dT%H:%M:%S%z/%b %d %Y %I:%M %p}
->
"Jan 01 2010 12:00 AM"

with 'created_at' == "2010-01-01T01:00:00+0100"
2021-11-20 23:04:34 +01:00
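The example results above are consistent with parsing the string via strptime(), converting to naive UTC, and then formatting. The sketch below reproduces that with the standard library; the function name and the UTC normalization step are assumptions inferred from the example output, and the month and AM/PM names assume the default C locale.

```python
from datetime import datetime, timezone


def parse_and_format(value, parse_format, output_format=None):
    """Parse 'value' into a datetime and optionally re-format it.

    Assumption: timezone-aware results are converted to naive UTC,
    which matches the example ('+0100' input, 00:00:00 output).
    """
    dt = datetime.strptime(value, parse_format)
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt.strftime(output_format) if output_format else str(dt)


created_at = "2010-01-01T01:00:00+0100"
print(parse_and_format(created_at, "%Y-%m-%dT%H:%M:%S%z"))
print(parse_and_format(created_at, "%Y-%m-%dT%H:%M:%S%z", "%b %d %Y %I:%M %p"))
```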
Mike Fährmann
2ab190ce08
add tests for special format strings 2021-11-01 23:26:18 +01:00
Mike Fährmann
46e17c5e61
support accessing the current local datetime in format strings
{_now}, {_now:%Y-%m-%d}, etc
()
2021-10-30 21:41:09 +02:00
Mike Fährmann
38193dba46
support accessing environment variables in format strings ()
{_env[HOME]} to get the value of $HOME
every other format string feature is supported as well
2021-10-28 19:18:55 +02:00
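Python's built-in str.format() already supports the {name[key]} syntax, so the idea can be sketched by passing os.environ under the name _env. This only illustrates the lookup; how the formatter actually exposes _env is not shown in the commit message, and GDL_SKETCH_VAR is a made-up variable name.

```python
import os


def expand_env(template):
    """Format 'template' with os.environ available as '_env'.

    Sketch only: relies on str.format()'s '{name[key]}' item access.
    """
    return template.format(_env=os.environ)


os.environ["GDL_SKETCH_VAR"] = "demo"  # hypothetical variable for the example
print(expand_env("{_env[GDL_SKETCH_VAR]}/downloads"))
```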
Mike Fährmann
f2d6b3e6b4
run tests without using 'nose'
run_tests.sh -> run_tests.py
2021-10-13 04:07:41 +02:00
Mike Fährmann
12fc646c53
fix filename formatting tests 2021-09-29 23:39:02 +02:00
Mike Fährmann
e0bdacd932
[fappic] add 'image' extractor (closes ) 2021-09-28 23:35:29 +02:00
Mike Fährmann
c22ff97743
remove 'unit' argument from 'util.format_value()' 2021-09-28 23:07:55 +02:00
Mike Fährmann
cad85640de
move 'util.PathFormat' into its own 'path' module
to prevent circular imports between 'formatter' and 'util'
2021-09-27 21:29:37 +02:00
Mike Fährmann
74145467dd
move 'util.Formatter' into its own 'formatter' module 2021-09-27 02:37:04 +02:00
Mike Fährmann
9377543162
[mastodon] add 'following' extractor () 2021-09-26 00:12:34 +02:00
Mike Fährmann
bd845303ad
implement a way to shorten filenames with east-asian characters
()

Setting 'output.shorten' to "eaw" (East-Asian Width) uses a slower
algorithm that also considers characters with a width > 1.
2021-09-13 21:38:33 +02:00
Mike Fährmann
292fffc83c
add 'j' format string conversion
to convert to a JSON formatted string
2021-08-28 01:19:36 +02:00
Mike Fährmann
bb6a130942
automatically set required DDoS-GUARD cookies ()
for kemono.party and seiso.party
2021-08-16 17:40:29 +02:00