Commit Graph

39 Commits (master)

Author SHA1 Message Date
rofl0r 235b1c10a7 implement filtertype keyword and fnmatch-based filtering
as suggested in #212, it seems the majority of people don't understand
that input was expected to be in regex format and people were using
filter lists containing plain hostnames, e.g. `www.google.com`.

apart from that, using fnmatch() for matching is actually a lot less
computationally expensive and allows to use big blacklists without
incurring a huge performance hit.

the config file now understands a new option `FilterType` which can
be one of `bre`, `ere` and `fnmatch`.
The `FilterExtended` option was deprecated in favor of it.
It still works, but will be removed in the release after the next.
2022-05-02 13:13:40 +00:00
rofl0r 9d815f69a4 filter: hard error when filter file doesn't exist 2021-05-09 23:41:49 +01:00
rofl0r 35c8edcf73 speed up build by only including regex.h where needed 2020-09-30 05:13:45 +01:00
rofl0r 233ce6de3b filter: reduce memory usage, fix OOM crashes
* check return values of memory allocation and abort gracefully
  in out-of-memory situations

* use sblist (linear dynamic array) instead of linked list
  - this removes one pointer per filter rule
  - removes need to manually allocate/free every single list item
    (instead block allocation is used)
  - simplifies code

* remove storage of (unused) input rule
  - removes one char* pointer per filter rule
  - removes storage of the raw bytes of each filter rule

* add line number to display on out-of-memory/invalid regex situation

* replace duplicate filter_domain()/filter_host() code with a single
  function filter_run()
  - reduces code size and management effort

with these improvements, >1 million regex rules can be loaded with
4 GB of RAM, whereas previously it crashed with about 950K.

the list for testing was assembled from
http://www.shallalist.de/Downloads/shallalist.tar.gz

closes #20
2020-09-05 19:42:34 +01:00
rofl0r c63d5d26b4 access config via a pointer, not a hardcoded struct address
this is required so we can elegantly swap out an old config for a
new one in the future and remove lots of boilerplate from config
initialization code.

unfortunately this is a quite intrusive change as the config struct
was accessed in numerous places, but frankly it should have been
done via a pointer right from the start.

right now, we simply point to a static struct in main.c, so there
shouldn't be any noticeable changes in behaviour.
2020-01-15 16:09:41 +00:00
Janosch Hoffmann e666e4a35b filter file: Don't ignore lines with leading whitespace (#239)
The new code skips leading whitespaces before removing trailing
whitespaces and comments.
Without doing this, lines with leading whitespace are treated like empty
lines (i.e. they are ignored).
2019-05-05 19:13:38 +01:00
Michael Adam 7290691142 Move definition of "struct config_s" from main.h to conf.h
Michael
2009-12-07 22:33:27 +01:00
Michael Adam 83987babd3 filter: add function filter_reload() 2009-10-25 23:33:37 +01:00
Michael Adam 574a65c28e filter: un-linebreak after un-indent...
Michael
2009-09-15 02:25:10 +02:00
Michael Adam 93b00446b9 filter: reduce indentation in filter_init by 16 characters by using return.
Michael
2009-09-15 02:25:09 +02:00
Michael Adam f25b0e2872 filter: untangle assignment and check in filter_init().
Michael
2009-09-14 22:18:28 +02:00
Mukund Sivaraman 7b9234f394 Indent code to Tinyproxy coding style
The modified files were indented with GNU indent using the
following command:

indent -npro -kr -i8 -ts8 -sob -l80 -ss -cs -cp1 -bs -nlps -nprs -pcs \
    -saf -sai -saw -sc -cdw -ce -nut -il0

No other changes of any sort were made.
2009-09-15 01:11:25 +05:30
Mukund Sivaraman a21cd7e3ed Rename tinyproxy.[ch] to main.[ch] 2009-08-07 03:42:53 +05:30
Michael Adam 867b4c9813 filter_init(): kill implicit cast warnings by adding explicit casts.
Michael
2009-08-04 23:57:30 +02:00
Mukund Sivaraman 024b317de0 Convert tabs to spaces 2008-12-08 13:39:44 +00:00
Mukund Sivaraman a257703e59 Reformat code to GNU coding style
This is a commit which simply ran all C source code files
through GNU indent. No other modifications were made.
2008-12-01 15:01:11 +00:00
Mukund Sivaraman 249d4b7f33 Updated copyright, license notices in source code
The notices have been changed to a more GNU look. Documentation
comments have been separated from the copyright header. I've tried to
keep all copyright notices intact. Some author contact details have
been updated.
2008-05-24 13:35:49 +05:30
Robert James Kaes c0299e1868 * [Indent] Ran Source Through indent
I re-indented the source code using indent with the following options:

indent -kr -bad -bap -nut -i8 -l80 -psl -sob -ss -ncs

There are now _no_ tabs in the source files, and all indentation is
eight spaces.  Lines are 80 characters long, and the procedure type is
on it's own line.  Read the indent manual for more information about
what each option means.
2005-08-15 03:54:31 +00:00
Robert James Kaes a59117c7ca * Updated Copyright Email Addresses
Updated the copyright email addresses for Robert James Kaes.  The
users.sourceforge.net address should always exist.
2005-07-12 17:39:44 +00:00
Robert James Kaes aee5a63849 Removed unnecessary casts (mostly dealing with memory allocation.) I
should never have added them in the first place.  They don't really
buy anything, and they can hide bugs.
2004-02-13 21:27:42 +00:00
Robert James Kaes f2d846d057 Merged in changes from the 1.6.2 release. (Fixes for the filtering code
and the HTML installation script.)
2003-10-17 16:11:00 +00:00
Robert James Kaes d2098f638f tinyproxy no longer includes a fall-back regular expression library,
so these files needed to be modified to only use the system's
installed regular expression library.
2003-08-07 16:32:12 +00:00
Robert James Kaes 6aaa863432 Added appropriate casts from (void*) so that the code will compile
cleanly with a C++ compiler.  (Tested using GCC 3.3)
2003-07-31 23:38:28 +00:00
Robert James Kaes cb7e3eef04 Added support for conditionally using case sensitive filtering files.
Code changes from James E. Flemer.
2003-01-27 17:57:45 +00:00
Robert James Kaes 7fd291f407 Filtering is now case insensitive. 2002-10-03 20:40:39 +00:00
Robert James Kaes 1f2fe53c4b Added myself to the copyright since I've made a bunch of changes to this file. 2002-06-07 19:10:05 +00:00
Robert James Kaes 7e1de2012c Added code to handle the "FilterDefaultDeny" directive. The filter_set_default_policy() function is used to select the default policy (either default allow or default deny) for the filtering code. Also, the two filtering functions now support the policy code. 2002-06-07 18:36:22 +00:00
Robert James Kaes 0242d89877 (filter_domain): Removed code which stripped of a port number from the host name. The "host" variable will _always_ be just the name by the time filter_domain() is called. 2002-06-06 20:30:04 +00:00
Robert James Kaes b11015c2e1 Added a copyright for James E. Flemer since these are his changes.
(filter_init): Added code to handle both host and URLs.  Also include code to use extended regular expressions.
(filter_domain): The old filter_url function has been renamed filter_domain().
(filter_url): This function now actually filters complete URLs.
2002-05-27 01:56:22 +00:00
Robert James Kaes 451fad1ed2 Changed the header includes around to reflect the new source layout. 2002-05-23 18:20:27 +00:00
Robert James Kaes 9a8d732a13 Changed all calls to strdup to safestrdup. This should provide better
memory usage tracking.
2002-04-18 17:59:21 +00:00
Robert James Kaes 787ece6c01 Reformated text. 2001-11-22 00:31:10 +00:00
Robert James Kaes 4ac03908fc Header reorganization. Basically all system headers are now included in
tinyproxy.h and all the other files include the tinyproxy.h header. This
moves all the dependancy issues into one file.
2001-10-25 17:27:39 +00:00
Robert James Kaes 25457361c7 Changed mallocs to callocs. 2001-09-12 03:32:54 +00:00
Robert James Kaes 0668e42e8f Changed all the mallocs and callocs to use the new safemalloc and
safecalloc.
2001-09-08 18:58:37 +00:00
Robert James Kaes 69617f6d56 Just a bit of a cleanup. Nothing major. 2001-05-27 02:24:40 +00:00
Robert James Kaes b023ff577f Changed the filter_host command to filter_url. 2000-11-23 04:46:25 +00:00
Robert James Kaes f807f4b96c Just using standard malloc() since the xmalloc() didn't really add
anything useful to the command.
2000-09-11 23:43:59 +00:00
Steven Young 37e63909c0 This commit was generated by cvs2svn to compensate for changes in r2,
which included commits to RCS files with non-trunk default branches.
2000-02-16 17:32:49 +00:00