1667 Commits

Author SHA1 Message Date
rofl0r
094db9d670 version.sh: relax regex for release tag detection
this also allows tag names with a custom suffix.
2020-09-27 15:44:50 +01:00
rofl0r
4dfac863a5 version.sh: replace -g with -git-
git describe prefixes the sha1 commit hash with -g, which is exactly what
we're after. this change gets rid of the confusing "g" in the commit hash
and allows tag names that include "-".
2020-09-27 15:41:54 +01:00
rofl0r
c74fe57262 transparent: workaround old glibc bug on RHEL7
it's been reported[0] that RHEL7 fails to properly set the length
parameter of the getsockname() call to the length of the required
struct sockaddr type, and always returns the length passed if it
is big enough.

the SOCKADDR_UNION_* macros originate from my microsocks[1] project,
and facilitate handling of the sockaddr mess without nasty casts.

[0]: https://github.com/tinyproxy/tinyproxy/issues/45#issuecomment-694594990
[1]: https://github.com/rofl0r/microsocks
2020-09-18 12:12:14 +01:00
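
The workaround described above can be sketched roughly as follows. This is a minimal illustration, not tinyproxy's actual SOCKADDR_UNION_* macros: the union type and both function names are hypothetical. The idea is to ignore the length reported by getsockname() and derive it from the address family instead.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical union covering the address types a proxy usually sees.
 * It avoids casting between unrelated struct sockaddr_* pointers. */
union sockaddr_union {
    struct sockaddr     sa;
    struct sockaddr_in  v4;
    struct sockaddr_in6 v6;
};

/* Derive the address length from the family field rather than trusting
 * the length value that getsockname() wrote back. */
socklen_t sockaddr_union_length(const union sockaddr_union *u)
{
    switch (u->sa.sa_family) {
    case AF_INET:  return sizeof(u->v4);
    case AF_INET6: return sizeof(u->v6);
    default:       return 0; /* unknown family */
    }
}

/* Fetch the local address of fd. Returns the family-derived length,
 * or 0 if getsockname() fails or the family is not handled. */
socklen_t get_local_addr(int fd, union sockaddr_union *out)
{
    socklen_t len = sizeof(*out);

    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    if (getsockname(fd, &out->sa, &len) < 0)
        return 0;
    /* len may simply echo sizeof(*out) on affected systems, so ignore it */
    return sockaddr_union_length(out);
}
```

Because the returned length depends only on sa_family, the caller gets the same result whether or not the system updates the length argument.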
rofl0r
d4ef2cfa62 child_kill_children(): use method that actually works
it turned out that close()ing an fd behind the back of a thread
doesn't actually cause blocking operations to get a read/write event,
because the fd will stay valid to in-progress operations.
2020-09-17 21:24:45 +01:00
rofl0r
da1bc1425d tune error messages to show select or poll depending on what is used 2020-09-17 21:03:51 +01:00
rofl0r
22e4898519 add autoconf test and fallback code for systems without gperf 2020-09-16 23:04:12 +01:00
rofl0r
45b238fc6f main: print error when config_init() fails 2020-09-16 21:01:02 +01:00
rofl0r
45323584a0 speed up big config parsing by 2x using gperf 2020-09-16 21:01:02 +01:00
rofl0r
caeab31fca conf.c: simplify the huge IPV6 regex
even though the existing IPV6 regex caught (almost?) all invalid
ipv6 addresses, it did so with a huge performance penalty.
parsing a file with 32K allow or deny statements took 30 secs in
a test setup; after this change it takes less than 3.

the new regex is sufficient to recognize all valid ipv6 addresses,
and hands down the responsibility to detect corner cases to the
system's inet_pton() function, which is e.g. called from insert_acl(),
which now causes a warning to be printed in the log if a seemingly
valid address is in fact invalid.

the new regex has been tested with 486 testcases from
http://download.dartware.com/thirdparty/test-ipv6-regex.pl
and accepts all valid ones and rejects most of the invalid ones.

note that the IPV4 regex already did a similar thing and checked only
whether the ip looks like [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+ without pedantry.
2020-09-16 21:01:02 +01:00
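
The two-stage validation described above (a cheap, permissive syntax check followed by a strict check in inet_pton()) might look roughly like this. It is a sketch under assumptions: the character-class pre-filter stands in for the simplified regex, and both function names are hypothetical rather than taken from conf.c or acl.c.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Permissive pre-filter: only hex digits, ':' and '.' are allowed, and
 * at least two colons must be present. It accepts every valid IPv6
 * address, but also some invalid ones. */
int looks_like_ipv6(const char *s)
{
    int colons = 0;

    assert(s != NULL);
    for (; *s; s++) {
        if (*s == ':')
            colons++;
        else if (!strchr("0123456789abcdefABCDEF.", *s))
            return 0;
    }
    return colons >= 2;
}

/* Strict check: corner cases are left to the system's inet_pton(). */
int is_valid_ipv6(const char *s)
{
    struct in6_addr addr;

    return looks_like_ipv6(s) && inet_pton(AF_INET6, s, &addr) == 1;
}
```

An input such as "1:2:3:4:5:6:7:8:9" passes the pre-filter but is rejected by inet_pton(), which is the point where a caller could log a warning.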
rofl0r
0ad8904b40 acl.c: detect invalid ipv6 string 2020-09-16 21:00:50 +01:00
rofl0r
99ed66cbc4 conf.c: warn when encountering invalid address 2020-09-16 21:00:50 +01:00
rofl0r
880a8b0ab6 conf: use cpp stringification for STDCONF macro 2020-09-16 21:00:04 +01:00
rofl0r
551e914d24 conf: merge upstream/upstream_none into single regex/handler 2020-09-16 21:00:04 +01:00
rofl0r
bad36cd9cd move config reload message to reload_config()
move it to before disabling logging, so a message with the correct
timestamp is printed if logging was already enabled.
also add a message when loading finished, so one can see from the
timestamp how long it took.

note that this only works on a real config reload triggered by
SIGHUP/SIGUSR1, because on startup we don't know yet where to log to.
2020-09-16 21:00:04 +01:00
rofl0r
683a354196 remove vector remains 2020-09-16 02:39:09 +01:00
rofl0r
06c96761d5 log_message_storage: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
54ae2d2a19 tests: add some AddHeader directives 2020-09-16 02:39:09 +01:00
rofl0r
e843519fb8 listen_addrs: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
a5381223df basicauth: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
487f2aba47 connect_ports: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
e929e81a55 add_header: use sblist
note that the old code inserted added headers at the beginning of the
list, reasoning unknown. this seems counter-intuitive as the headers
would end up in the request in the reverse order they were added,
but this was irrelevant, as the headers were originally first put
into the hashmap hashofheaders before sending it to the client.
since the hashmap didn't preserve ordering, the headers would appear
in random order anyway.
2020-09-16 02:39:09 +01:00
rofl0r
7d33fc8e8a listen_fds: use sblist 2020-09-16 01:05:58 +01:00
rofl0r
a5890b621b run_tests_valgrind: use tougher valgrind settings 2020-09-15 23:39:04 +01:00
rofl0r
2037bc64f5 fix a mem leak by statically allocating global statsbuf 2020-09-15 23:28:33 +01:00
rofl0r
d453a4c2a4 main: include loop header 2020-09-15 23:20:14 +01:00
rofl0r
192f8194e1 free() loop records too 2020-09-15 23:12:00 +01:00
rofl0r
bd92446184 use poll() where available 2020-09-15 23:12:00 +01:00
rofl0r
10cdee3bc5 prepare transition to poll()
usage of select() is inefficient (because a huge fd_set array has to
be initialized on each call) and insecure (because an fd >= FD_SETSIZE
will cause out-of-bounds accesses using the FD_*SET macros, and a system
can be set up to allow more than that number of fds using ulimit).
for the moment we prepared a poll-like wrapper that still runs select()
to test for regressions, and so we have fallback code for systems without
poll().
2020-09-15 23:12:00 +01:00
rofl0r
0c8275a90e refactor conns.[ch], put conn_s into child struct
this allows accessing the conn member from the main thread handling
the children, and simplifies the code.
2020-09-15 23:12:00 +01:00
rofl0r
5779ba8697 hsearch: add seed to prevent another CVE-2012-3505 instance 2020-09-15 23:12:00 +01:00
rofl0r
155bfbbe87 replace leftover users of hashmap with htab
also fixes a bug where the ErrorFile directive would create a
new hashmap on every added item, effectively allowing only
the use of the last specified errornumber, and producing memory
leaks on each config reload.
2020-09-15 23:12:00 +01:00
rofl0r
34a8b28414 save headers in an ordered dictionary
due to the usage of a hashmap to store headers, their order was not
preserved when relaying them to the other side.
even though correct from a standards point-of-view, this caused
issues with various programs, and it allowed fingerprinting the use
of tinyproxy.

to implement this, i imported the MIT-licensed hsearch.[ch] from
https://github.com/rofl0r/htab which was originally taken from
musl libc. it's a simple and efficient hashtable implementation
with far better performance characteristics than the one previously
used by tinyproxy. additionally it has an API much better suited
for this purpose.

orderedmap.[ch] was implemented from scratch to address this issue.
behind the scenes it uses an sblist to store string values, and a htab
to store keys and the indices into the sblist.
this allows us to iterate linearly over the sblist and then find the
corresponding key in the hash table, so the headers can be reproduced
in the order they were received.

closes #73
2020-09-15 23:11:59 +01:00
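
The ordered-dictionary idea can be illustrated with a small self-contained sketch. It does not use sblist or htab and is not the orderedmap.[ch] API: all names are hypothetical, the capacity is fixed, keys are compared case-sensitively, and a repeated key replaces the stored value, all of which are simplifications. Entries live in insertion-ordered arrays, and a hash index maps each key to its position.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OM_SLOTS 64 /* power of two; at most OM_SLOTS/2 entries */

struct omap {
    char *keys[OM_SLOTS / 2]; /* insertion order */
    char *vals[OM_SLOTS / 2];
    size_t count;
    int index[OM_SLOTS];      /* hash slot -> position + 1, 0 = empty */
};

static size_t om_hash(const char *s)
{
    size_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

static char *om_dup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p) memcpy(p, s, n);
    return p;
}

/* Linear probing: returns the slot holding key, or the empty slot where
 * it would go. Terminates because the table is never more than half full. */
static size_t om_slot(const struct omap *m, const char *key)
{
    size_t i = om_hash(key) & (OM_SLOTS - 1);
    while (m->index[i] && strcmp(m->keys[m->index[i] - 1], key) != 0)
        i = (i + 1) & (OM_SLOTS - 1);
    return i;
}

void om_init(struct omap *m) { memset(m, 0, sizeof *m); }

/* Append key/value; an existing key keeps its position and gets the new
 * value. Returns 0 on success, -1 when full or out of memory. */
int om_append(struct omap *m, const char *key, const char *val)
{
    size_t slot;
    char *k, *v;

    assert(m && key && val);
    slot = om_slot(m, key);
    if (m->index[slot]) {
        if (!(v = om_dup(val))) return -1;
        free(m->vals[m->index[slot] - 1]);
        m->vals[m->index[slot] - 1] = v;
        return 0;
    }
    if (m->count == OM_SLOTS / 2) return -1;
    k = om_dup(key);
    v = om_dup(val);
    if (!k || !v) { free(k); free(v); return -1; }
    m->keys[m->count] = k;
    m->vals[m->count] = v;
    m->index[slot] = (int)++m->count;
    return 0;
}

const char *om_find(const struct omap *m, const char *key)
{
    size_t slot = om_slot(m, key);
    return m->index[slot] ? m->vals[m->index[slot] - 1] : NULL;
}

void om_free(struct omap *m)
{
    size_t i;
    for (i = 0; i < m->count; i++) { free(m->keys[i]); free(m->vals[i]); }
    om_init(m);
}
```

Iterating over keys[0..count-1] and vals[0..count-1] reproduces the entries in the order they were added, while om_find() keeps lookups by name cheap.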
rofl0r
9d5ee85c3e fix free()ing of config items
- we need to free the config after it has been successfully loaded,
  not unconditionally before reloading.
- we also need to free them before exiting from the main program
  to have clean valgrind output.
2020-09-15 23:11:59 +01:00
rofl0r
372d7ff824 shutdown: free children from right place 2020-09-15 22:32:42 +01:00
rofl0r
2f3a3828ac Revert "childs.c: fix minor memory leak"
This reverts commit 6dd3806f7d1a337fb89e335e986e1fa4eab8340c.
2020-09-15 22:25:53 +01:00
rofl0r
6dd3806f7d childs.c: fix minor memory leak
this would leak only once on program termination, so it's no big
deal apart from having spurious reachable memory in valgrind logs.
2020-09-15 20:02:12 +01:00
rofl0r
7eb6600aeb main: orderly shutdown on SIGINT too
the appropriate code in the signal handler was already set up,
but for some reason the signal itself was not being handled.
2020-09-14 20:59:02 +01:00
rofl0r
7014d050d9 run_tests: make travis happy, use signal nr instead of name 2020-09-14 17:02:36 +01:00
rofl0r
ff23f3249b conf.c: include common.h 2020-09-14 17:02:36 +01:00
rofl0r
17e19a67cf run_tests: do some more extensive testing
1) force a config reload after some initial tests.
   this allows identifying memleaks using the valgrind test,
   as this will free all structures allocated for the config, and
   recreate them.
2) test ErrorFile directive by adding several of them.
   this should help catch regressions such as the one fixed in
   4847d8cdb3bfd9b30a10bfed848174250475a69b.
   it will also test memleaks in the related code paths.
3) test some scenarios that should produce errors and use the
   configured ErrorFile directives.
2020-09-13 01:09:21 +01:00
rofl0r
c64ac9edbe fix get_request_entity()
get_request_entity()'s purpose is to drain remaining unread bytes
in the request read pipe before handing out an error page,
and kinda surprisingly, also when connection to the stathost is
done.

in the stathost case tinyproxy just skipped proper processing and
jumped to the error handler code, remembering in a variable whether
a connection to the stathost was desired, then doing things a bit
differently depending on whether it's set.

i tried to fix issues with get_request_entity in
88153e944f7d28f57cccc77f3228a3f54f78ce4e (which is basically the
right fix for the issue it tried to solve, but incomplete),
and resulting from there in 78cc5b72b18a3c0d196126bfbc5d3b6473386da9.
the latter fix wasn't quite right since we're not supposed to check
whether the socket is ready for writing, and having a return value
of 2 instead of 1 resulted in some of the if statements not
kicking in when they should have.
this also resulted in the stathost page no longer working.

after in-depth study of the issue i realized that we only need to
call get_request_entity() when the headers aren't completely read,
in addition to setting the proper connection timeout as
88153e944f7d28f57cccc77f3228a3f54f78ce4e already implemented.
the changes of 78cc5b72b18a3c0d196126bfbc5d3b6473386da9 have been
reverted.
2020-09-13 00:37:19 +01:00
rofl0r
bfe59856b2 tests/webclient: return error when HTTP status > 399 2020-09-13 00:35:38 +01:00
rofl0r
4847d8cdb3 add_new_errorpage(): fix segfault accessing global config
another fallout of the config refactoring finished by
2e02dce0c3de4a231f74b44c34647406de507768.

apparently no one using the ErrorFile directive used git master
during the last months, as there have been no reports about this issue.
2020-09-12 21:38:04 +01:00
rofl0r
df9074db6e vector.h: missing include <unistd.h> for ssize_t 2020-09-12 15:56:36 +01:00
rofl0r
9e40f8311f handle_connection(): print process_*_headers errno information 2020-09-10 21:13:31 +01:00
rofl0r
f1bd259e6e handle_connection: replace "goto fail" with func call
this makes it possible to see in a backtrace where the error was
triggered.
2020-09-10 14:48:39 +01:00
rofl0r
e94cbdb3a5 handle_connection(): factor out failure code
this allows us in a next step to replace goto fail with a call to that
function, so we can see in a backtrace from where the failure was
triggered.
2020-09-10 14:37:56 +01:00
rofl0r
b549ba5af3 remove bogus custom timeout handling code
in networking, hitting a timeout requires that *nothing* happens during the
interval. whenever anything happens, the timeout is reset.
there's no need to do custom time calculations, it's perfectly fine to let
the kernel handle it using the select() syscall.

additionally the code added in 0b9a74c29036f9215b2b97a301b7b25933054302
ensures that read() and write() syscalls don't block indefinitely and
return on the timeout too, so there's no need to switch sockets back and
forth between blocking/nonblocking.
2020-09-09 12:37:23 +01:00
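
The point about letting the kernel handle idle timeouts can be sketched as follows. The function name and the -2 return code are hypothetical and this is not tinyproxy's code. Each call hands a fresh timeout to select(), so any activity on the fd effectively resets the interval without manual time arithmetic. The clamp on negative values guards against select() failing with EINVAL.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Read from fd, giving up if nothing arrives within idle_sec seconds.
 * Returns bytes read, 0 on EOF, -1 on error, -2 on idle timeout. */
ssize_t read_idle_timeout(int fd, void *buf, size_t len, int idle_sec)
{
    fd_set rfds;
    struct timeval tv;
    int ret;

    assert(fd >= 0 && fd < FD_SETSIZE);
    if (idle_sec < 0)
        idle_sec = 0; /* a negative timeval makes select() return EINVAL */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = idle_sec;
    tv.tv_usec = 0;

    /* The kernel does the waiting; no custom time calculations needed. */
    ret = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ret < 0)
        return -1;
    if (ret == 0)
        return -2; /* nothing happened during the whole interval */
    return read(fd, buf, len);
}
```

Calling this in a loop gives the behaviour described above: the timeout is only hit when the connection stays completely idle for the full interval.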
rofl0r
b4e3f1a896 fix negative timeout resulting in select() EINVAL 2020-09-09 11:59:40 +01:00
rofl0r
78cc5b72b1 get_request_entity: fix regression w/ CONNECT method
introduced in 88153e944f7d28f57cccc77f3228a3f54f78ce4e.
when the connect method is used (HTTPS), and e.g. a filtered domain is
requested, there's no data on readfds, only on writefds.

this caused the response from the connection to hang until the timeout was
hit. in the past, such a scenario always produced a "no entity" response
in tinyproxy logs.
2020-09-08 14:45:24 +01:00