Update the HTML encoding detection regex to accept more valid inputs

HTML 4.01 doesn't seem to require the http-equiv value to be quoted, so
make the quotes optional.
Also allow more than one whitespace character between the meta tag name
and its http-equiv attribute.

Closes #3300703.

git-svn-id: https://geany.svn.sourceforge.net/svnroot/geany/trunk@5796 ea778897-0a13-0410-b9d1-a72fbfd435f5
Colomban Wendling 2011-05-11 22:52:05 +00:00
parent edc8457e8a
commit 36649da8d4
2 changed files with 8 additions and 1 deletions


@@ -1,3 +1,10 @@
+2011-05-12 Colomban Wendling <colomban(at)geany(dot)org>
+* src/encodings.c:
+Update the HTML content-type encoding detection regexp to accept some
+more valid inputs (closes #3300703).
 2011-05-11 Colomban Wendling <colomban(at)geany(dot)org>
 * src/plugins.c:


@@ -51,7 +51,7 @@
 #endif
 /* <meta http-equiv="content-type" content="text/html; charset=UTF-8" /> */
-#define PATTERN_HTMLMETA "<meta[ \t\n\r\f]http-equiv[ \t\n\r\f]*=[ \t\n\r\f]*\"content-type\"[ \t\n\r\f]+content[ \t\n\r\f]*=[ \t\n\r\f]*\"text/x?html;[ \t\n\r\f]*charset=([a-z0-9_-]+)\"[ \t\n\r\f]*/?>"
+#define PATTERN_HTMLMETA "<meta[ \t\n\r\f]+http-equiv[ \t\n\r\f]*=[ \t\n\r\f]*\"?content-type\"?[ \t\n\r\f]+content[ \t\n\r\f]*=[ \t\n\r\f]*\"text/x?html;[ \t\n\r\f]*charset=([a-z0-9_-]+)\"[ \t\n\r\f]*/?>"
 /* " geany_encoding=utf-8 " or " coding: utf-8 " */
 #define PATTERN_CODING "coding[\t ]*[:=][\t ]*\"?([a-z0-9-]+)\"?[\t ]*"
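
To see what the change buys, the two versions of the pattern can be run against a few sample meta tags. The following is only a minimal sketch: it assumes the pattern is compiled as a case-insensitive POSIX extended regular expression (`REG_EXTENDED | REG_ICASE`), which this diff does not show, and the names `PATTERN_HTMLMETA_OLD`, `PATTERN_HTMLMETA_NEW`, `detect_charset` and `check_examples` are hypothetical and are not part of src/encodings.c.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the pattern strings are copied from the diff. */
#define PATTERN_HTMLMETA_OLD "<meta[ \t\n\r\f]http-equiv[ \t\n\r\f]*=[ \t\n\r\f]*\"content-type\"[ \t\n\r\f]+content[ \t\n\r\f]*=[ \t\n\r\f]*\"text/x?html;[ \t\n\r\f]*charset=([a-z0-9_-]+)\"[ \t\n\r\f]*/?>"
#define PATTERN_HTMLMETA_NEW "<meta[ \t\n\r\f]+http-equiv[ \t\n\r\f]*=[ \t\n\r\f]*\"?content-type\"?[ \t\n\r\f]+content[ \t\n\r\f]*=[ \t\n\r\f]*\"text/x?html;[ \t\n\r\f]*charset=([a-z0-9_-]+)\"[ \t\n\r\f]*/?>"

/* Search html for pattern. On a match, copy the first capture group (the
 * charset name) into out, truncated to fit, and return 1; otherwise return 0.
 * Assumption: case-insensitive POSIX extended regex semantics. */
int detect_charset(const char *pattern, const char *html, char *out, size_t out_len)
{
    regex_t re;
    regmatch_t m[2];
    int found = 0;

    if (out_len == 0 || regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0)
        return 0;

    if (regexec(&re, html, 2, m, 0) == 0 && m[1].rm_so >= 0)
    {
        size_t len = (size_t) (m[1].rm_eo - m[1].rm_so);

        if (len >= out_len)
            len = out_len - 1;
        memcpy(out, html + m[1].rm_so, len);
        out[len] = '\0';
        found = 1;
    }
    regfree(&re);
    return found;
}

/* Usage sketch: compare both patterns on three sample tags. */
void check_examples(void)
{
    char cs[32];
    const char *quoted =
        "<meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\" />";
    const char *unquoted =
        "<meta http-equiv=content-type content=\"text/html; charset=ISO-8859-1\">";
    const char *spaced =
        "<meta   http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">";

    /* The fully quoted, single-space form matches both patterns. */
    assert(detect_charset(PATTERN_HTMLMETA_OLD, quoted, cs, sizeof cs));
    assert(detect_charset(PATTERN_HTMLMETA_NEW, quoted, cs, sizeof cs));
    assert(strcmp(cs, "UTF-8") == 0);

    /* An unquoted http-equiv value is only accepted by the new pattern. */
    assert(!detect_charset(PATTERN_HTMLMETA_OLD, unquoted, cs, sizeof cs));
    assert(detect_charset(PATTERN_HTMLMETA_NEW, unquoted, cs, sizeof cs));
    assert(strcmp(cs, "ISO-8859-1") == 0);

    /* Several spaces after "<meta" are only accepted by the new pattern. */
    assert(!detect_charset(PATTERN_HTMLMETA_OLD, spaced, cs, sizeof cs));
    assert(detect_charset(PATTERN_HTMLMETA_NEW, spaced, cs, sizeof cs));
    assert(strcmp(cs, "utf-8") == 0);
}
```

Calling `check_examples()` from a test program should pass silently under these assumptions. Note that both patterns still require the content attribute value to be quoted; only the http-equiv quoting and the whitespace after `<meta` are relaxed.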