Reinstate the java->c++ source, generator code.

master
Fedor 2020-03-12 20:40:42 +03:00
parent 30e6400e30
commit 1b648eee31
390 changed files with 91467 additions and 135 deletions

.gitignore

@ -49,9 +49,11 @@ js/src/autom4te.cache
js/src/tests/results-*.html
js/src/tests/results-*.txt
# Java HTML5 parser classes
parser/html/java/htmlparser/
parser/html/java/javaparser/
# Java HTML5 parser codegen artifacts
parser/html/java/htmlparser/bin/
parser/html/java/javaparser/bin/
parser/html/java/*.jar
parser/html/javasrc/
# Ignore the files and directory that Eclipse IDE creates
.project


@ -0,0 +1,63 @@
# Updating HTML5 parser code
Our HTML5 parser is based on the Java HTML5 parser from [Validator.nu](http://about.validator.nu/htmlparser/) by Henri Sivonen. It was adopted and further updated by Mozilla, and has been imported as a whole into the UXP tree so that we have an independent, maintainable copy of it that does not rely on external sources.
## Stages
Updating the parser code consists of 3 stages:
- Make updates to the html parser source in java
- Let the java parser regenerate part of its own code after the change
- Translate the java source to C++
This process was best explained in the [following Bugzilla comment](https://bugzilla.mozilla.org/show_bug.cgi?id=1378079#c6), which explains how to add a new attribute name ("is") to HTML5 and is reproduced in this document for convenience:
>> Is
>> there any documentation on how to add a new nsHtml5AttributeName?
>
> I don't recall. I should get around to writing it.
>
>> Looks like
>> I need to clone hg.mozilla.org/projects/htmlparser/ and generate a hash with
>> it?
>
> Yes. Here's how:
>
> `cd parser/html/java/`
> `make sync`
>
> Now you have a clone of [https://hg.mozilla.org/projects/htmlparser/](https://hg.mozilla.org/projects/htmlparser/) in parser/html/java/htmlparser/
>
> `cd htmlparser/src/`
> `$EDITOR nu/validator/htmlparser/impl/AttributeName.java`
>
> Search for the word "uncomment" and uncomment stuff according to the two comments that talk about uncommenting.
> Duplicate the declaration of a normal attribute (nothing special in SVG mode, etc.). Let's use "alt", since it's the first one.
> In the duplicate, replace ALT with IS and "alt" with "is".
> Search for "ALT,", duplicate that line and change the duplicate to say "IS,"
> Save.
>
> `javac nu/validator/htmlparser/impl/AttributeName.java`
> `java nu.validator.htmlparser.impl.AttributeName`
>
> Copy and paste the output into nu/validator/htmlparser/impl/AttributeName.java replacing the text below the comment "START GENERATED CODE" and above the very last "}".
> Recomment the bits that you uncommented earlier.
> Save.
>
> `cd ../..` - Back to parser/html/java/
> `make translate`
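
The copy-and-paste step in the quoted instructions (replacing the text below the "START GENERATED CODE" comment and above the very last "}") can be sketched in Python 3 as follows. This is only an illustration and not part of the tree: the helper name `splice_generated` is hypothetical, and it assumes that the marker text appears on a single line and that the final "}" in the file closes the class.

```python
MARKER = "START GENERATED CODE"  # marker text named in the instructions above


def splice_generated(source: str, generated: str) -> str:
    """Replace the text below the marker line and above the very last '}'.

    `source` is the current text of the Java file and `generated` is the
    output printed by the generator run. Both are plain strings.
    """
    marker_pos = source.find(MARKER)
    if marker_pos < 0:
        raise ValueError("marker comment not found")
    # Keep everything up to and including the line that holds the marker.
    line_end = source.find("\n", marker_pos)
    if line_end < 0:
        raise ValueError("nothing follows the marker line")
    last_brace = source.rfind("}")
    if last_brace <= line_end:
        raise ValueError("no closing brace after the marker")
    head = source[: line_end + 1]
    tail = source[last_brace:]
    # Normalise the generated text so that it ends with exactly one newline.
    return head + generated.rstrip("\n") + "\n" + tail


if __name__ == "__main__":
    before = (
        "class AttributeName {\n"
        "    // START GENERATED CODE\n"
        "    old generated text\n"
        "}\n"
    )
    print(splice_generated(before, "    new generated text\n"))
```

Doing the replacement by hand, as the comment describes, is just as valid; a helper like this mainly guards against pasting over the final closing brace.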
## Organizing commits
**The html5 parser code is fragile due to its generation and translation before being used as C++ in our tree. Do not touch or commit anything without a code peer nearby with knowledge of the parser and the commit process (at this moment that means Gaming4JC (@g4jc)), and communicate the changes thoroughly.**
To organize this properly in our repo, commits should be split up when making these kinds of changes:
1. Commit your code edits to the html parser
2. Regenerate java into a translation-ready source
3. Commit
4. Translate and regenerate C++ code
5. Check a build to make sure the changes have the intended result
6. Commit
This is needed because the source edit will sometimes be in parts that are self-generated and may otherwise be lost in generation noise, and because we want to keep a strict separation between commits resulting from developer work and those resulting from running scripts/automated processes.
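
As a rough aid for keeping these commits separate, the following Python 3 sketch flags a list of changed paths that mixes Java-side files with files on the translated side. It is a simplification under stated assumptions: the function names are hypothetical, the example file names are illustrative, and it treats everything under `parser/html/` outside `java/` as translator output even though that directory may also hold hand-written C++. It cannot tell a hand edit of a Java file from a regenerated one, so it only helps with the boundary between steps 3 and 4.

```python
JAVA_ROOT = "parser/html/java/"  # Java sources and tooling (location of the Makefile)
HTML_ROOT = "parser/html/"       # assumed destination of the translated files


def classify(path: str) -> str:
    """Return 'java', 'translated' or 'other' for a repository-relative path."""
    if path.startswith(JAVA_ROOT):
        return "java"
    if path.startswith(HTML_ROOT):
        return "translated"
    return "other"


def mixes_java_and_translated(paths) -> bool:
    """True if one change set touches both the Java side and the translated side."""
    kinds = {classify(p) for p in paths}
    return "java" in kinds and "translated" in kinds


if __name__ == "__main__":
    staged = [
        "parser/html/java/htmlparser/src/nu/validator/htmlparser/impl/AttributeName.java",
        "parser/html/nsHtml5AttributeName.cpp",  # illustrative file name
    ]
    if mixes_java_and_translated(staged):
        print("Split this change: Java edits and translated output are mixed.")
```

A check like this does not replace review by a code peer; it only catches the most obvious case of a commit that combines both kinds of change.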

parser/html/java/Makefile

@ -0,0 +1,44 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
libs:: translator
translator:: javaparser \
; mkdir -p htmlparser/bin && \
find htmlparser/translator-src/nu/validator/htmlparser -name "*.java" | \
xargs javac -cp javaparser.jar -g -d htmlparser/bin && \
jar cfm translator.jar manifest.txt -C htmlparser/bin .
javaparser:: \
; mkdir -p javaparser/bin && find javaparser/src -name "*.java" | \
xargs javac -encoding ISO-8859-1 -g -d javaparser/bin && \
jar cf javaparser.jar -C javaparser/bin .
translate:: translator \
; mkdir -p ../javasrc ; \
java -jar translator.jar \
htmlparser/src/nu/validator/htmlparser/impl \
.. ../nsHtml5AtomList.h
translate-javasrc:: translator \
; mkdir -p ../javasrc ; \
java -jar translator.jar \
../javasrc \
.. ../nsHtml5AtomList.h
named-characters:: translator \
; java -cp translator.jar \
nu.validator.htmlparser.generator.GenerateNamedCharactersCpp \
named-character-references.html ../
clean-javaparser:: \
; rm -rf javaparser/bin javaparser.jar
clean-htmlparser:: \
; rm -rf htmlparser/bin translator.jar
clean-javasrc:: \
; rm -rf ../javasrc
clean:: clean-javaparser clean-htmlparser clean-javasrc


@ -0,0 +1,41 @@
If this is your first time building the HTML5 parser, you need to execute the
following commands (from this directory) to accomplish the translation:
make translate # perform the Java-to-C++ translation from the remote
# sources
make named-characters # Generate tables for named character tokenization
If you make changes to the translator or the javaparser, you can rebuild by
retyping 'make' in this directory. If you make changes to the HTML5 Java
implementation, you can retranslate the Java sources from the htmlparser
repository by retyping 'make translate' in this directory.
The makefile supports the following targets:
javaparser:
Builds the javaparser library retrieved earlier by sync_javaparser.
translator:
Runs the javaparser target and then builds the Java to C++ translator from
sources.
libs:
The default target. Alias for translator
translate:
Runs the translator target and then translates the HTML parser sources and
copies the parser impl java sources to ../javasrc.
translate-javasrc:
Runs the translator target and then translates the HTML parser sources
stored in ../javasrc. (Deprecated)
named-characters:
Generates data tables for named character tokenization.
clean-javaparser:
Removes the build products of the javaparser target.
clean-htmlparser:
Removes the build products of the translator target.
clean-javasrc:
Removes the javasrc snapshot code in ../javasrc
clean:
Runs clean-javaparser, clean-htmlparser, and clean-javasrc.
Ben Newman (23 September 2009)
Henri Sivonen (11 August 2016)
Matt A. Tobin (16 January 2020)


@ -0,0 +1,3 @@
#!/bin/sh
APPDIR=`dirname $0`;
java -XstartOnFirstThread -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:/Developer/gwt-mac-1.5.1/gwt-user.jar:/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar" com.google.gwt.dev.GWTCompiler -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser;


@ -0,0 +1,3 @@
#!/bin/sh
APPDIR=`dirname $0`;
java -XstartOnFirstThread -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:/Developer/gwt-mac-1.5.1/gwt-user.jar:/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar" com.google.gwt.dev.GWTCompiler -style DETAILED -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser;


@ -0,0 +1,24 @@
<?xml version="1.0" encoding="UTF-8"?>
<launchConfiguration type="org.eclipse.jdt.launching.localJavaApplication">
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
<listEntry value="/htmlparser"/>
</listAttribute>
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
<listEntry value="4"/>
</listAttribute>
<booleanAttribute key="org.eclipse.debug.core.appendEnvironmentVariables" value="true"/>
<listAttribute key="org.eclipse.jdt.launching.CLASSPATH">
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#10;&lt;runtimeClasspathEntry containerPath=&quot;org.eclipse.jdt.launching.JRE_CONTAINER&quot; javaProject=&quot;htmlparser&quot; path=&quot;1&quot; type=&quot;4&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/gwt-src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/super&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#10;&lt;runtimeClasspathEntry id=&quot;org.eclipse.jdt.launching.classpathentry.defaultClasspath&quot;&gt;&#10;&lt;memento exportedEntriesOnly=&quot;false&quot; project=&quot;htmlparser&quot;/&gt;&#10;&lt;/runtimeClasspathEntry&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#10;&lt;runtimeClasspathEntry externalArchive=&quot;/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#10;&lt;runtimeClasspathEntry externalArchive=&quot;/Developer/gwt-mac-1.5.1/gwt-user.jar&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
</listAttribute>
<booleanAttribute key="org.eclipse.jdt.launching.DEFAULT_CLASSPATH" value="false"/>
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="com.google.gwt.dev.GWTCompiler"/>
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="-style DETAILED -out /Users/hsivonen/Projects/whattf/htmlparser/www nu.validator.htmlparser.HtmlParser"/>
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="htmlparser"/>
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-XstartOnFirstThread -Xmx256M"/>
</launchConfiguration>


@ -0,0 +1,22 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<launchConfiguration type="org.eclipse.jdt.launching.localJavaApplication">
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
<listEntry value="/htmlparser"/>
</listAttribute>
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
<listEntry value="4"/>
</listAttribute>
<booleanAttribute key="org.eclipse.debug.core.appendEnvironmentVariables" value="true"/>
<listAttribute key="org.eclipse.jdt.launching.CLASSPATH">
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;&#10;&lt;runtimeClasspathEntry containerPath=&quot;org.eclipse.jdt.launching.JRE_CONTAINER&quot; javaProject=&quot;htmlparser&quot; path=&quot;1&quot; type=&quot;4&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/gwt-src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/super&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;&#10;&lt;runtimeClasspathEntry id=&quot;org.eclipse.jdt.launching.classpathentry.defaultClasspath&quot;&gt;&#10;&lt;memento exportedEntriesOnly=&quot;false&quot; project=&quot;htmlparser&quot;/&gt;&#10;&lt;/runtimeClasspathEntry&gt;&#10;"/>
</listAttribute>
<booleanAttribute key="org.eclipse.jdt.launching.DEFAULT_CLASSPATH" value="false"/>
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="com.google.gwt.dev.GWTCompiler"/>
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="-out /home/hsivonen/Projects/whattf/htmlparser/www nu.validator.htmlparser.HtmlParser"/>
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="htmlparser"/>
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-Xmx256M"/>
</launchConfiguration>


@ -0,0 +1,3 @@
#!/bin/sh
APPDIR=`dirname $0`;
java -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:$APPDIR/bin:/home/hsivonen/gwt-linux-1.5.1/gwt-user.jar:/home/hsivonen/gwt-linux-1.5.1/gwt-dev-linux.jar" com.google.gwt.dev.GWTShell -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser/HtmlParser.html;


@ -0,0 +1,3 @@
#!/bin/sh
APPDIR=`dirname $0`;
java -XstartOnFirstThread -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:$APPDIR/bin:/Developer/gwt-mac-1.5.1/gwt-user.jar:/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar" com.google.gwt.dev.GWTShell -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser/HtmlParser.html;


@ -0,0 +1,23 @@
<?xml version="1.0" encoding="UTF-8"?>
<launchConfiguration type="org.eclipse.jdt.launching.localJavaApplication">
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
<listEntry value="/htmlparser"/>
</listAttribute>
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
<listEntry value="4"/>
</listAttribute>
<booleanAttribute key="org.eclipse.debug.core.appendEnvironmentVariables" value="true"/>
<listAttribute key="org.eclipse.jdt.launching.CLASSPATH">
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;runtimeClasspathEntry containerPath=&quot;org.eclipse.jdt.launching.JRE_CONTAINER&quot; javaProject=&quot;htmlparser&quot; path=&quot;1&quot; type=&quot;4&quot;/&gt;&#13;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#13;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/gwt-src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#13;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/super&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#13;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;runtimeClasspathEntry id=&quot;org.eclipse.jdt.launching.classpathentry.defaultClasspath&quot;&gt;&#13;&#10;&lt;memento project=&quot;htmlparser&quot;/&gt;&#13;&#10;&lt;/runtimeClasspathEntry&gt;&#13;&#10;"/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;runtimeClasspathEntry externalArchive=&quot;/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt;&#13;&#10;"/>
</listAttribute>
<booleanAttribute key="org.eclipse.jdt.launching.DEFAULT_CLASSPATH" value="false"/>
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="com.google.gwt.dev.GWTShell"/>
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="-out www nu.validator.htmlparser.HtmlParser/HtmlParser.html"/>
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="htmlparser"/>
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-XstartOnFirstThread -Xmx256M"/>
</launchConfiguration>


@ -0,0 +1,96 @@
This is for the HTML parser as a whole except the rewindable input stream,
the named character classes and the Live DOM Viewer.
For the copyright notices for individual files, please see individual files.
/*
* Copyright (c) 2005, 2006, 2007 Henri Sivonen
* Copyright (c) 2007-2012 Mozilla Foundation
* Portions of comments Copyright 2004-2007 Apple Computer, Inc., Mozilla
* Foundation, and Opera Software ASA.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
The following license is for the WHATWG spec from which the named character
data was extracted.
/*
* Copyright 2004-2010 Apple Computer, Inc., Mozilla Foundation, and Opera
* Software ASA.
*
* You are granted a license to use, reproduce and create derivative works of
* this document.
*/
The following license is for the rewindable input stream.
/*
* Copyright (c) 2001-2003 Thai Open Source Software Center Ltd
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials provided
* with the distribution.
* * Neither the name of the Thai Open Source Software Center Ltd nor
* the names of its contributors may be used to endorse or promote
* products derived from this software without specific prior
* written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
The following license applies to the Live DOM Viewer:
Copyright (c) 2000, 2006, 2008 Ian Hickson and various contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


@ -0,0 +1,5 @@
An HTML5 parser.
Please see http://about.validator.nu/htmlparser/
-- Henri Sivonen (hsivonen@iki.fi).


@ -0,0 +1,15 @@
tokenization.txt represents the state of the spec implemented in Tokenizer.java.
To get a diffable version corresponding to the current spec:
lynx -display_charset=utf-8 -dump -nolist http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html > current.txt
tree-construction.txt represents the state of the spec implemented in TreeBuilder.java.
To get a diffable version corresponding to the current spec:
lynx -display_charset=utf-8 -dump -nolist http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html > current.txt
The text of the files in this directory comes from the WHATWG HTML 5 spec
which carries the following notice:
© Copyright 2004-2010 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA.
You are granted a license to use, reproduce and create derivative works of this document.

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -0,0 +1,745 @@
#!/usr/bin/python
# Copyright (c) 2013-2015 Mozilla Foundation
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
import json
class Label:
    def __init__(self, label, preferred):
        self.label = label
        self.preferred = preferred
    def __cmp__(self, other):
        return cmp(self.label, other.label)
# If a multi-byte encoding is on this list, it is assumed to have a
# non-generated decoder implementation class. Otherwise, the JDK default
# decoder is used as a placeholder.
MULTI_BYTE_DECODER_IMPLEMENTED = [
u"x-user-defined",
u"replacement",
u"big5",
]
MULTI_BYTE_ENCODER_IMPLEMENTED = [
u"big5",
]
preferred = []
labels = []
data = json.load(open("../encoding/encodings.json", "r"))
indexes = json.load(open("../encoding/indexes.json", "r"))
single_byte = []
multi_byte = []
def to_camel_name(name):
    if name == u"iso-8859-8-i":
        return u"Iso8I"
    if name.startswith(u"iso-8859-"):
        return name.replace(u"iso-8859-", u"Iso")
    return name.title().replace(u"X-", u"").replace(u"-", u"").replace(u"_", u"")
def to_constant_name(name):
    return name.replace(u"-", u"_").upper()
# Encoding.java
for group in data:
    if group["heading"] == "Legacy single-byte encodings":
        single_byte = group["encodings"]
    else:
        multi_byte.extend(group["encodings"])
    for encoding in group["encodings"]:
        preferred.append(encoding["name"])
        for label in encoding["labels"]:
            labels.append(Label(label, encoding["name"]))
preferred.sort()
labels.sort()
label_file = open("src/nu/validator/encoding/Encoding.java", "w")
label_file.write("""/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.nio.charset.spi.CharsetProvider;
import java.util.Arrays;
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
/**
* Represents an <a href="https://encoding.spec.whatwg.org/#encoding">encoding</a>
* as defined in the <a href="https://encoding.spec.whatwg.org/">Encoding
* Standard</a>, provides access to each encoding defined in the Encoding
* Standard via a static constant and provides the
* "<a href="https://encoding.spec.whatwg.org/#concept-encoding-get">get an
* encoding</a>" algorithm defined in the Encoding Standard.
*
* <p>This class inherits from {@link Charset} to allow the Encoding
* Standard-compliant encodings to be used in contexts that support
* <code>Charset</code> instances. However, by design, the Encoding
* Standard-compliant encodings are not supplied via a {@link CharsetProvider}
* and, therefore, are not available via and do not interfere with the static
* methods provided by <code>Charset</code>. (This class provides methods of
* the same name to hide each static method of <code>Charset</code> to help
* avoid accidental calls to the static methods of the superclass when working
* with Encoding Standard-compliant encodings.)
*
* <p>When an application needs to use a particular encoding, such as utf-8
* or windows-1252, the corresponding constant, i.e.
* {@link #UTF_8 Encoding.UTF_8} and {@link #WINDOWS_1252 Encoding.WINDOWS_1252}
* respectively, should be used. However, when the application receives an
* encoding label from external input, the method {@link #forName(String)
* forName()} should be used to obtain the object representing the encoding
* identified by the label. In contexts where labels that map to the
* <a href="https://encoding.spec.whatwg.org/#replacement">replacement
* encoding</a> should be treated as unknown, the method {@link
* #forNameNoReplacement(String) forNameNoReplacement()} should be used instead.
*
*
* @author hsivonen
*/
public abstract class Encoding extends Charset {
private static final String[] LABELS = {
""")
for label in labels:
    label_file.write(" \"%s\",\n" % label.label)
label_file.write(""" };
private static final Encoding[] ENCODINGS_FOR_LABELS = {
""")
for label in labels:
    label_file.write(" %s.INSTANCE,\n" % to_camel_name(label.preferred))
label_file.write(""" };
private static final Encoding[] ENCODINGS = {
""")
for label in preferred:
    label_file.write(" %s.INSTANCE,\n" % to_camel_name(label))
label_file.write(""" };
""")
for label in preferred:
    label_file.write(""" /**
* The %s encoding.
*/
public static final Encoding %s = %s.INSTANCE;
""" % (label, to_constant_name(label), to_camel_name(label)))
label_file.write("""
private static SortedMap<String, Charset> encodings = null;
protected Encoding(String canonicalName, String[] aliases) {
super(canonicalName, aliases);
}
private enum State {
HEAD, LABEL, TAIL
};
public static Encoding forName(String label) {
if (label == null) {
throw new IllegalArgumentException("Label must not be null.");
}
if (label.length() == 0) {
throw new IllegalCharsetNameException(label);
}
// First try the fast path
int index = Arrays.binarySearch(LABELS, label);
if (index >= 0) {
return ENCODINGS_FOR_LABELS[index];
}
// Else, slow path
StringBuilder sb = new StringBuilder();
State state = State.HEAD;
for (int i = 0; i < label.length(); i++) {
char c = label.charAt(i);
if ((c == ' ') || (c == '\\n') || (c == '\\r') || (c == '\\t')
|| (c == '\\u000C')) {
if (state == State.LABEL) {
state = State.TAIL;
}
continue;
}
if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')) {
switch (state) {
case HEAD:
state = State.LABEL;
// Fall through
case LABEL:
sb.append(c);
continue;
case TAIL:
throw new IllegalCharsetNameException(label);
}
}
if (c >= 'A' && c <= 'Z') {
c += 0x20;
switch (state) {
case HEAD:
state = State.LABEL;
// Fall through
case LABEL:
sb.append(c);
continue;
case TAIL:
throw new IllegalCharsetNameException(label);
}
}
if ((c == '-') || (c == '+') || (c == '.') || (c == ':')
|| (c == '_')) {
switch (state) {
case LABEL:
sb.append(c);
continue;
case HEAD:
case TAIL:
throw new IllegalCharsetNameException(label);
}
}
throw new IllegalCharsetNameException(label);
}
index = Arrays.binarySearch(LABELS, sb.toString());
if (index >= 0) {
return ENCODINGS_FOR_LABELS[index];
}
throw new UnsupportedCharsetException(label);
}
public static Encoding forNameNoReplacement(String label) {
Encoding encoding = Encoding.forName(label);
if (encoding == Encoding.REPLACEMENT) {
throw new UnsupportedCharsetException(label);
}
return encoding;
}
public static boolean isSupported(String label) {
try {
Encoding.forName(label);
} catch (UnsupportedCharsetException e) {
return false;
}
return true;
}
public static boolean isSupportedNoReplacement(String label) {
try {
Encoding.forNameNoReplacement(label);
} catch (UnsupportedCharsetException e) {
return false;
}
return true;
}
public static SortedMap<String, Charset> availableCharsets() {
if (encodings == null) {
TreeMap<String, Charset> map = new TreeMap<String, Charset>();
for (Encoding encoding : ENCODINGS) {
map.put(encoding.name(), encoding);
}
encodings = Collections.unmodifiableSortedMap(map);
}
return encodings;
}
public static Encoding defaultCharset() {
return WINDOWS_1252;
}
@Override public boolean canEncode() {
return false;
}
@Override public boolean contains(Charset cs) {
return false;
}
@Override public CharsetEncoder newEncoder() {
throw new UnsupportedOperationException("Encoder not implemented.");
}
}
""")
label_file.close()
# Single-byte encodings
for encoding in single_byte:
name = encoding["name"]
labels = encoding["labels"]
labels.sort()
class_name = to_camel_name(name)
mapping_name = name
if mapping_name == u"iso-8859-8-i":
mapping_name = u"iso-8859-8"
mapping = indexes[mapping_name]
class_file = open("src/nu/validator/encoding/%s.java" % class_name, "w")
class_file.write('''/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class ''')
class_file.write(class_name)
class_file.write(''' extends Encoding {
private static final char[] TABLE = {''')
fallible = False
comma = False
for code_point in mapping:
# XXX should we have error reporting?
if not code_point:
code_point = 0xFFFD
fallible = True
if comma:
class_file.write(",")
class_file.write("\n '\\u%04x'" % code_point)
comma = True
class_file.write('''
};
private static final String[] LABELS = {''')
comma = False
for label in labels:
if comma:
class_file.write(",")
class_file.write("\n \"%s\"" % label)
comma = True
class_file.write('''
};
private static final String NAME = "''')
class_file.write(name)
class_file.write('''";
static final Encoding INSTANCE = new ''')
class_file.write(class_name)
class_file.write('''();
private ''')
class_file.write(class_name)
class_file.write('''() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new ''')
class_file.write("Fallible" if fallible else "Infallible")
class_file.write('''SingleByteDecoder(this, TABLE);
}
}
''')
class_file.close()
# Multi-byte encodings
for encoding in multi_byte:
name = encoding["name"]
labels = encoding["labels"]
labels.sort()
class_name = to_camel_name(name)
class_file = open("src/nu/validator/encoding/%s.java" % class_name, "w")
class_file.write('''/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class ''')
class_file.write(class_name)
class_file.write(''' extends Encoding {
private static final String[] LABELS = {''')
comma = False
for label in labels:
if comma:
class_file.write(",")
class_file.write("\n \"%s\"" % label)
comma = True
class_file.write('''
};
private static final String NAME = "''')
class_file.write(name)
class_file.write('''";
static final ''')
class_file.write(class_name)
class_file.write(''' INSTANCE = new ''')
class_file.write(class_name)
class_file.write('''();
private ''')
class_file.write(class_name)
class_file.write('''() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
''')
if name == "gbk":
class_file.write('''return Charset.forName("gb18030").newDecoder();''')
elif name in MULTI_BYTE_DECODER_IMPLEMENTED:
class_file.write("return new %sDecoder(this);" % class_name)
else:
class_file.write('''return Charset.forName(NAME).newDecoder();''')
class_file.write('''
}
@Override public CharsetEncoder newEncoder() {
''')
if name in MULTI_BYTE_ENCODER_IMPLEMENTED:
class_file.write("return new %sEncoder(this);" % class_name)
else:
class_file.write('''return Charset.forName(NAME).newEncoder();''')
class_file.write('''
}
}
''')
class_file.close()
# Big5
def null_to_zero(code_point):
if not code_point:
code_point = 0
return code_point
index = []
for code_point in indexes["big5"]:
index.append(null_to_zero(code_point))
# There are four major gaps consisting of more than 4 consecutive invalid pointers
gaps = []
consecutive = 0
consecutive_start = 0
offset = 0
for code_point in index:
if code_point == 0:
if consecutive == 0:
consecutive_start = offset
consecutive += 1
else:
if consecutive > 4:
gaps.append((consecutive_start, consecutive_start + consecutive))
consecutive = 0
offset += 1
def invert_ranges(ranges, cap):
inverted = []
invert_start = 0
for (start, end) in ranges:
if start != 0:
inverted.append((invert_start, start))
invert_start = end
inverted.append((invert_start, cap))
return inverted
cap = len(index)
ranges = invert_ranges(gaps, cap)
# Now compute a compressed lookup table for astralness
gaps = []
consecutive = 0
consecutive_start = 0
offset = 0
for code_point in index:
if code_point <= 0xFFFF:
if consecutive == 0:
consecutive_start = offset
consecutive += 1
else:
if consecutive > 40:
gaps.append((consecutive_start, consecutive_start + consecutive))
consecutive = 0
offset += 1
astral_ranges = invert_ranges(gaps, cap)
class_file = open("src/nu/validator/encoding/Big5Data.java", "w")
class_file.write('''/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
final class Big5Data {
private static final String ASTRALNESS = "''')
bits = []
for (low, high) in astral_ranges:
for i in xrange(low, high):
bits.append(1 if index[i] > 0xFFFF else 0)
# pad length to the next multiple of 16 (this appends a full word of
# zero bits when the length is already a multiple of 16)
for j in xrange(16 - (len(bits) % 16)):
bits.append(0)
i = 0
while i < len(bits):
accu = 0
for j in xrange(16):
accu |= bits[i + j] << j
if accu == 0x22:
class_file.write('\\"')
else:
class_file.write('\\u%04X' % accu)
i += 16
class_file.write('''";
''')
j = 0
for (low, high) in ranges:
class_file.write(''' private static final String TABLE%d = "''' % j)
for i in xrange(low, high):
class_file.write('\\u%04X' % (index[i] & 0xFFFF))
class_file.write('''";
''')
j += 1
class_file.write(''' private static boolean readBit(int i) {
return (ASTRALNESS.charAt(i >> 4) & (1 << (i & 0xF))) != 0;
}
static char lowBits(int pointer) {
''')
j = 0
for (low, high) in ranges:
class_file.write(''' if (pointer < %d) {
return '\\u0000';
}
if (pointer < %d) {
return TABLE%d.charAt(pointer - %d);
}
''' % (low, high, j, low))
j += 1
class_file.write(''' return '\\u0000';
}
static boolean isAstral(int pointer) {
''')
base = 0
for (low, high) in astral_ranges:
if high - low == 1:
class_file.write(''' if (pointer < %d) {
return false;
}
if (pointer == %d) {
return true;
}
''' % (low, low))
else:
class_file.write(''' if (pointer < %d) {
return false;
}
if (pointer < %d) {
return readBit(%d + (pointer - %d));
}
''' % (low, high, base, low))
base += (high - low)
class_file.write(''' return false;
}
public static int findPointer(char lowBits, boolean isAstral) {
if (!isAstral) {
switch (lowBits) {
''')
hkscs_bound = (0xA1 - 0x81) * 157
prefer_last = [
0x2550,
0x255E,
0x2561,
0x256A,
0x5341,
0x5345,
]
for code_point in prefer_last:
# Python lists don't have .rindex() :-(
for i in xrange(len(index) - 1, -1, -1):
candidate = index[i]
if candidate == code_point:
class_file.write(''' case 0x%04X:
return %d;
''' % (code_point, i))
break
class_file.write(''' default:
break;
}
}''')
j = 0
for (low, high) in ranges:
if high > hkscs_bound:
start = 0
if low <= hkscs_bound and hkscs_bound < high:
# This is the first range we don't ignore and the
# range that contains the first non-HKSCS pointer.
# Avoid searching HKSCS.
start = hkscs_bound - low
class_file.write('''
for (int i = %d; i < TABLE%d.length(); i++) {
if (TABLE%d.charAt(i) == lowBits) {
int pointer = i + %d;
if (isAstral == isAstral(pointer)) {
return pointer;
}
}
}''' % (start, j, j, low))
j += 1
class_file.write('''
return 0;
}
}
''')
class_file.close()
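The Big5 table compression above is easier to follow outside the emitted Java. The following is a minimal standalone sketch of the same idea, written in Python 3 rather than the Python 2 used by the generator: find long runs of unmapped pointers, invert those gaps into the ranges worth storing, and resolve a pointer through the ranges the way the generated `lowBits()` does. `invert_ranges` mirrors the generator's helper; `find_gaps`, `lookup` and the toy `index` are names and data made up for illustration only.

```python
def find_gaps(index, threshold):
    """Return half-open (start, end) runs of more than `threshold` zeros.

    Like the generator loop, a run is only recorded when a non-zero entry
    follows it, so a run at the very end of the index is not reported.
    """
    gaps = []
    run_start = 0
    run_length = 0
    for offset, code_point in enumerate(index):
        if code_point == 0:
            if run_length == 0:
                run_start = offset
            run_length += 1
        else:
            if run_length > threshold:
                gaps.append((run_start, run_start + run_length))
            run_length = 0
    return gaps


def invert_ranges(ranges, cap):
    """Turn a sorted list of gaps into the ranges between them, up to cap."""
    inverted = []
    invert_start = 0
    for start, end in ranges:
        if start != 0:
            inverted.append((invert_start, start))
        invert_start = end
    inverted.append((invert_start, cap))
    return inverted


def lookup(index, ranges, pointer):
    """Emulate the generated lowBits(): 0 for any pointer inside a gap."""
    for low, high in ranges:
        if pointer < low:
            return 0
        if pointer < high:
            # The generated Java reads TABLEn.charAt(pointer - low) here.
            return index[pointer]
    return 0


# Toy index (hypothetical data): two gaps longer than 4, one short hole.
index = [0] * 6 + [0x4E00, 0x4E01] + [0] * 5 + [0x4E02, 0, 0x4E03]
gaps = find_gaps(index, 4)
ranges = invert_ranges(gaps, len(index))
```

With this toy data only the two long gaps are dropped; the single unmapped pointer between the last two entries stays inside a stored range, trading one wasted table slot for fewer range checks.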


@ -0,0 +1,12 @@
<module>
<inherits name="com.google.gwt.core.Core"/>
<inherits name="com.google.gwt.user.User"/>
<super-source path="translatable"/>
<source path="annotation"/>
<source path="common"/>
<source path="impl"/>
<source path="gwt"/>
<set-property name="user.agent" value="gecko1_8"/>
<entry-point class="nu.validator.htmlparser.gwt.HtmlParserModule"/>
<add-linker name="sso"/>
</module>


@ -0,0 +1,477 @@
/*
* Copyright (c) 2007 Henri Sivonen
* Copyright (c) 2008-2009 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.gwt;
import java.util.LinkedList;
import nu.validator.htmlparser.common.DocumentMode;
import nu.validator.htmlparser.impl.CoalescingTreeBuilder;
import nu.validator.htmlparser.impl.HtmlAttributes;
import org.xml.sax.SAXException;
import com.google.gwt.core.client.JavaScriptException;
import com.google.gwt.core.client.JavaScriptObject;
class BrowserTreeBuilder extends CoalescingTreeBuilder<JavaScriptObject> {
private JavaScriptObject document;
private JavaScriptObject script;
private JavaScriptObject placeholder;
private boolean readyToRun;
private final LinkedList<ScriptHolder> scriptStack = new LinkedList<ScriptHolder>();
private class ScriptHolder {
private final JavaScriptObject script;
private final JavaScriptObject placeholder;
/**
* @param script the real script element
* @param placeholder the placeholder element standing in for it in the tree
*/
public ScriptHolder(JavaScriptObject script,
JavaScriptObject placeholder) {
this.script = script;
this.placeholder = placeholder;
}
/**
* Returns the script.
*
* @return the script
*/
public JavaScriptObject getScript() {
return script;
}
/**
* Returns the placeholder.
*
* @return the placeholder
*/
public JavaScriptObject getPlaceholder() {
return placeholder;
}
}
protected BrowserTreeBuilder(JavaScriptObject document) {
super();
this.document = document;
installExplorerCreateElementNS(document);
}
private static native void installExplorerCreateElementNS(
JavaScriptObject doc) /*-{
if (!doc.createElementNS) {
doc.createElementNS = function (uri, local) {
if ("http://www.w3.org/1999/xhtml" == uri) {
return doc.createElement(local);
} else if ("http://www.w3.org/1998/Math/MathML" == uri) {
if (!doc.mathplayerinitialized) {
var obj = document.createElement("object");
obj.setAttribute("id", "mathplayer");
obj.setAttribute("classid", "clsid:32F66A20-7614-11D4-BD11-00104BD3F987");
document.getElementsByTagName("head")[0].appendChild(obj);
document.namespaces.add("m", "http://www.w3.org/1998/Math/MathML", "#mathplayer");
doc.mathplayerinitialized = true;
}
return doc.createElement("m:" + local);
} else if ("http://www.w3.org/2000/svg" == uri) {
if (!doc.renesisinitialized) {
var obj = document.createElement("object");
obj.setAttribute("id", "renesis");
obj.setAttribute("classid", "clsid:AC159093-1683-4BA2-9DCF-0C350141D7F2");
document.getElementsByTagName("head")[0].appendChild(obj);
document.namespaces.add("s", "http://www.w3.org/2000/svg", "#renesis");
doc.renesisinitialized = true;
}
return doc.createElement("s:" + local);
} else {
// throw
}
}
}
}-*/;
private static native boolean hasAttributeNS(JavaScriptObject element,
String uri, String localName) /*-{
return element.hasAttributeNS(uri, localName);
}-*/;
private static native void setAttributeNS(JavaScriptObject element,
String uri, String localName, String value) /*-{
element.setAttributeNS(uri, localName, value);
}-*/;
@Override protected void addAttributesToElement(JavaScriptObject element,
HtmlAttributes attributes) throws SAXException {
try {
for (int i = 0; i < attributes.getLength(); i++) {
String localName = attributes.getLocalNameNoBoundsCheck(i);
String uri = attributes.getURINoBoundsCheck(i);
if (!hasAttributeNS(element, uri, localName)) {
setAttributeNS(element, uri, localName,
attributes.getValueNoBoundsCheck(i));
}
}
} catch (JavaScriptException e) {
fatal(e);
}
}
private static native void appendChild(JavaScriptObject parent,
JavaScriptObject child) /*-{
parent.appendChild(child);
}-*/;
private static native JavaScriptObject createTextNode(JavaScriptObject doc,
String text) /*-{
return doc.createTextNode(text);
}-*/;
private static native JavaScriptObject getLastChild(JavaScriptObject node) /*-{
return node.lastChild;
}-*/;
private static native void extendTextNode(JavaScriptObject node, String text) /*-{
node.data += text;
}-*/;
@Override protected void appendCharacters(JavaScriptObject parent,
String text) throws SAXException {
try {
if (parent == placeholder) {
appendChild(script, createTextNode(document, text));
}
JavaScriptObject lastChild = getLastChild(parent);
if (lastChild != null && getNodeType(lastChild) == 3) {
extendTextNode(lastChild, text);
return;
}
appendChild(parent, createTextNode(document, text));
} catch (JavaScriptException e) {
fatal(e);
}
}
private static native boolean hasChildNodes(JavaScriptObject element) /*-{
return element.hasChildNodes();
}-*/;
private static native JavaScriptObject getFirstChild(
JavaScriptObject element) /*-{
return element.firstChild;
}-*/;
@Override protected void appendChildrenToNewParent(
JavaScriptObject oldParent, JavaScriptObject newParent)
throws SAXException {
try {
while (hasChildNodes(oldParent)) {
appendChild(newParent, getFirstChild(oldParent));
}
} catch (JavaScriptException e) {
fatal(e);
}
}
private static native JavaScriptObject createComment(JavaScriptObject doc,
String text) /*-{
return doc.createComment(text);
}-*/;
@Override protected void appendComment(JavaScriptObject parent,
String comment) throws SAXException {
try {
if (parent == placeholder) {
appendChild(script, createComment(document, comment));
}
appendChild(parent, createComment(document, comment));
} catch (JavaScriptException e) {
fatal(e);
}
}
@Override protected void appendCommentToDocument(String comment)
throws SAXException {
try {
appendChild(document, createComment(document, comment));
} catch (JavaScriptException e) {
fatal(e);
}
}
private static native JavaScriptObject createElementNS(
JavaScriptObject doc, String ns, String local) /*-{
return doc.createElementNS(ns, local);
}-*/;
@Override protected JavaScriptObject createElement(String ns, String name,
HtmlAttributes attributes) throws SAXException {
try {
JavaScriptObject rv = createElementNS(document, ns, name);
for (int i = 0; i < attributes.getLength(); i++) {
setAttributeNS(rv, attributes.getURINoBoundsCheck(i),
attributes.getLocalNameNoBoundsCheck(i),
attributes.getValueNoBoundsCheck(i));
}
if ("script" == name) {
if (placeholder != null) {
scriptStack.addLast(new ScriptHolder(script, placeholder));
}
script = rv;
placeholder = createElementNS(document,
"http://n.validator.nu/placeholder/", "script");
rv = placeholder;
for (int i = 0; i < attributes.getLength(); i++) {
setAttributeNS(rv, attributes.getURINoBoundsCheck(i),
attributes.getLocalNameNoBoundsCheck(i),
attributes.getValueNoBoundsCheck(i));
}
}
return rv;
} catch (JavaScriptException e) {
fatal(e);
throw new RuntimeException("Unreachable");
}
}
@Override protected JavaScriptObject createHtmlElementSetAsRoot(
HtmlAttributes attributes) throws SAXException {
try {
JavaScriptObject rv = createElementNS(document,
"http://www.w3.org/1999/xhtml", "html");
for (int i = 0; i < attributes.getLength(); i++) {
setAttributeNS(rv, attributes.getURINoBoundsCheck(i),
attributes.getLocalNameNoBoundsCheck(i),
attributes.getValueNoBoundsCheck(i));
}
appendChild(document, rv);
return rv;
} catch (JavaScriptException e) {
fatal(e);
throw new RuntimeException("Unreachable");
}
}
private static native JavaScriptObject getParentNode(
JavaScriptObject element) /*-{
return element.parentNode;
}-*/;
@Override protected void appendElement(JavaScriptObject child,
JavaScriptObject newParent) throws SAXException {
try {
if (newParent == placeholder) {
appendChild(script, cloneNodeDeep(child));
}
appendChild(newParent, child);
} catch (JavaScriptException e) {
fatal(e);
}
}
@Override protected boolean hasChildren(JavaScriptObject element)
throws SAXException {
try {
return hasChildNodes(element);
} catch (JavaScriptException e) {
fatal(e);
throw new RuntimeException("Unreachable");
}
}
private static native void insertBeforeNative(JavaScriptObject parent,
JavaScriptObject child, JavaScriptObject sibling) /*-{
parent.insertBefore(child, sibling);
}-*/;
private static native int getNodeType(JavaScriptObject node) /*-{
return node.nodeType;
}-*/;
private static native JavaScriptObject cloneNodeDeep(JavaScriptObject node) /*-{
return node.cloneNode(true);
}-*/;
/**
* Returns the document.
*
* @return the document
*/
JavaScriptObject getDocument() {
JavaScriptObject rv = document;
document = null;
return rv;
}
private static native JavaScriptObject createDocumentFragment(
JavaScriptObject doc) /*-{
return doc.createDocumentFragment();
}-*/;
JavaScriptObject getDocumentFragment() {
JavaScriptObject rv = createDocumentFragment(document);
JavaScriptObject rootElt = getFirstChild(document);
while (hasChildNodes(rootElt)) {
appendChild(rv, getFirstChild(rootElt));
}
document = null;
return rv;
}
/**
* @see nu.validator.htmlparser.impl.TreeBuilder#createElement(String,
* String, nu.validator.htmlparser.impl.HtmlAttributes, Object)
*/
@Override protected JavaScriptObject createElement(String ns, String name,
HtmlAttributes attributes, JavaScriptObject form)
throws SAXException {
try {
JavaScriptObject rv = createElement(ns, name, attributes);
// rv.setUserData("nu.validator.form-pointer", form, null);
return rv;
} catch (JavaScriptException e) {
fatal(e);
return null;
}
}
/**
* @see nu.validator.htmlparser.impl.TreeBuilder#start(boolean)
*/
@Override protected void start(boolean fragment) throws SAXException {
script = null;
placeholder = null;
readyToRun = false;
}
protected void documentMode(DocumentMode mode, String publicIdentifier,
String systemIdentifier, boolean html4SpecificAdditionalErrorChecks)
throws SAXException {
// document.setUserData("nu.validator.document-mode", mode, null);
}
/**
* @see nu.validator.htmlparser.impl.TreeBuilder#elementPopped(java.lang.String,
* java.lang.String, java.lang.Object)
*/
@Override protected void elementPopped(String ns, String name,
JavaScriptObject node) throws SAXException {
if (node == placeholder) {
readyToRun = true;
requestSuspension();
}
}
private static native void replace(JavaScriptObject oldNode,
JavaScriptObject newNode) /*-{
oldNode.parentNode.replaceChild(newNode, oldNode);
}-*/;
private static native JavaScriptObject getPreviousSibling(JavaScriptObject node) /*-{
return node.previousSibling;
}-*/;
void maybeRunScript() {
if (readyToRun) {
readyToRun = false;
replace(placeholder, script);
if (scriptStack.isEmpty()) {
script = null;
placeholder = null;
} else {
ScriptHolder scriptHolder = scriptStack.removeLast();
script = scriptHolder.getScript();
placeholder = scriptHolder.getPlaceholder();
}
}
}
@Override protected void insertFosterParentedCharacters(String text,
JavaScriptObject table, JavaScriptObject stackParent)
throws SAXException {
try {
JavaScriptObject parent = getParentNode(table);
if (parent != null) { // always an element if not null
JavaScriptObject previousSibling = getPreviousSibling(table);
if (previousSibling != null
&& getNodeType(previousSibling) == 3) {
extendTextNode(previousSibling, text);
return;
}
insertBeforeNative(parent, createTextNode(document, text), table);
return;
}
JavaScriptObject lastChild = getLastChild(stackParent);
if (lastChild != null && getNodeType(lastChild) == 3) {
extendTextNode(lastChild, text);
return;
}
appendChild(stackParent, createTextNode(document, text));
} catch (JavaScriptException e) {
fatal(e);
}
}
@Override protected void insertFosterParentedChild(JavaScriptObject child,
JavaScriptObject table, JavaScriptObject stackParent)
throws SAXException {
JavaScriptObject parent = getParentNode(table);
try {
if (parent != null && getNodeType(parent) == 1) {
insertBeforeNative(parent, child, table);
} else {
appendChild(stackParent, child);
}
} catch (JavaScriptException e) {
fatal(e);
}
}
private static native void removeChild(JavaScriptObject parent,
JavaScriptObject child) /*-{
parent.removeChild(child);
}-*/;
@Override protected void detachFromParent(JavaScriptObject element)
throws SAXException {
try {
JavaScriptObject parent = getParentNode(element);
if (parent != null) {
removeChild(parent, element);
}
} catch (JavaScriptException e) {
fatal(e);
}
}
}


@ -0,0 +1,265 @@
/*
* Copyright (c) 2007 Henri Sivonen
* Copyright (c) 2007-2008 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.gwt;
import java.util.LinkedList;
import nu.validator.htmlparser.common.XmlViolationPolicy;
import nu.validator.htmlparser.impl.ErrorReportingTokenizer;
import nu.validator.htmlparser.impl.Tokenizer;
import nu.validator.htmlparser.impl.UTF16Buffer;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import com.google.gwt.core.client.JavaScriptObject;
import com.google.gwt.user.client.Timer;
/**
* This class implements an HTML5 parser that exposes data through the DOM
* interface.
*
* <p>By default, when using the constructor without arguments,
* this parser treats XML 1.0-incompatible infosets as fatal errors.
* This corresponds to
* <code>FATAL</code> as the general XML violation policy. To make the parser
* support non-conforming HTML fully per the HTML 5 spec while on the other
* hand potentially violating the DOM API contract, set the general XML
* violation policy to <code>ALLOW</code>. This does not work with a standard
* DOM implementation. Handling all input without fatal errors and without
* violating the DOM API contract is possible by setting
* the general XML violation policy to <code>ALTER_INFOSET</code>. <em>This
* makes the parser non-conforming</em> but is probably the most useful
* setting for most applications.
*
* <p>The doctype is not represented in the tree.
*
* <p>The document mode is represented as user data <code>DocumentMode</code>
* object with the key <code>nu.validator.document-mode</code> on the document
* node.
*
* <p>The form pointer is also stored as user data with the key
* <code>nu.validator.form-pointer</code>.
*
* @version $Id: HtmlDocumentBuilder.java 255 2008-05-29 08:57:38Z hsivonen $
* @author hsivonen
*/
public class HtmlParser {
private static final int CHUNK_SIZE = 512;
private final Tokenizer tokenizer;
private final BrowserTreeBuilder domTreeBuilder;
private final StringBuilder documentWriteBuffer = new StringBuilder();
private ErrorHandler errorHandler;
private UTF16Buffer stream;
private int streamLength;
private boolean lastWasCR;
private boolean ending;
private ParseEndListener parseEndListener;
private final LinkedList<UTF16Buffer> bufferStack = new LinkedList<UTF16Buffer>();
/**
* Instantiates the parser.
*
* @param document
* the DOM document that the parsed tree is built into
*/
public HtmlParser(JavaScriptObject document) {
this.domTreeBuilder = new BrowserTreeBuilder(document);
this.tokenizer = new ErrorReportingTokenizer(domTreeBuilder);
this.domTreeBuilder.setNamePolicy(XmlViolationPolicy.ALTER_INFOSET);
this.tokenizer.setCommentPolicy(XmlViolationPolicy.ALTER_INFOSET);
this.tokenizer.setContentNonXmlCharPolicy(XmlViolationPolicy.ALTER_INFOSET);
this.tokenizer.setContentSpacePolicy(XmlViolationPolicy.ALTER_INFOSET);
this.tokenizer.setNamePolicy(XmlViolationPolicy.ALTER_INFOSET);
this.tokenizer.setXmlnsPolicy(XmlViolationPolicy.ALTER_INFOSET);
}
/**
* Parses a document from a string and reports completion through a callback.
* @param source the HTML source text
* @param callback the listener notified once parsing has finished
* @throws SAXException if starting the parse fails
*/
public void parse(String source, ParseEndListener callback) throws SAXException {
parseEndListener = callback;
domTreeBuilder.setFragmentContext(null);
tokenize(source, null);
}
/**
* @param source the HTML source text
* @param context the fragment context, or <code>null</code> when parsing
* a full document
* @throws SAXException
*/
private void tokenize(String source, String context) throws SAXException {
lastWasCR = false;
ending = false;
documentWriteBuffer.setLength(0);
streamLength = source.length();
stream = new UTF16Buffer(source.toCharArray(), 0,
(streamLength < CHUNK_SIZE ? streamLength : CHUNK_SIZE));
bufferStack.clear();
push(stream);
domTreeBuilder.setFragmentContext(context == null ? null : context.intern());
tokenizer.start();
pump();
}
private void pump() throws SAXException {
if (ending) {
tokenizer.end();
domTreeBuilder.getDocument(); // drops the internal reference
parseEndListener.parseComplete();
// Don't schedule timeout
return;
}
int docWriteLen = documentWriteBuffer.length();
if (docWriteLen > 0) {
char[] newBuf = new char[docWriteLen];
documentWriteBuffer.getChars(0, docWriteLen, newBuf, 0);
push(new UTF16Buffer(newBuf, 0, docWriteLen));
documentWriteBuffer.setLength(0);
}
for (;;) {
UTF16Buffer buffer = peek();
if (!buffer.hasMore()) {
if (buffer == stream) {
if (buffer.getEnd() == streamLength) {
// Stop parsing
tokenizer.eof();
ending = true;
break;
} else {
int newEnd = buffer.getStart() + CHUNK_SIZE;
buffer.setEnd(newEnd < streamLength ? newEnd
: streamLength);
continue;
}
} else {
pop();
continue;
}
}
// now we have a non-empty buffer
buffer.adjust(lastWasCR);
lastWasCR = false;
if (buffer.hasMore()) {
lastWasCR = tokenizer.tokenizeBuffer(buffer);
domTreeBuilder.maybeRunScript();
break;
} else {
continue;
}
}
// schedule
Timer timer = new Timer() {
@Override public void run() {
try {
pump();
} catch (SAXException e) {
ending = true;
if (errorHandler != null) {
try {
errorHandler.fatalError(new SAXParseException(
e.getMessage(), null, null, -1, -1, e));
} catch (SAXException e1) {
// Nothing more can be done if the error handler itself throws.
}
}
}
}
};
timer.schedule(1);
}
private void push(UTF16Buffer buffer) {
bufferStack.addLast(buffer);
}
private UTF16Buffer peek() {
return bufferStack.getLast();
}
private void pop() {
bufferStack.removeLast();
}
public void documentWrite(String text) throws SAXException {
UTF16Buffer buffer = new UTF16Buffer(text.toCharArray(), 0, text.length());
while (buffer.hasMore()) {
buffer.adjust(lastWasCR);
lastWasCR = false;
if (buffer.hasMore()) {
lastWasCR = tokenizer.tokenizeBuffer(buffer);
domTreeBuilder.maybeRunScript();
}
}
}
/**
* @see javax.xml.parsers.DocumentBuilder#setErrorHandler(org.xml.sax.ErrorHandler)
*/
public void setErrorHandler(ErrorHandler errorHandler) {
this.errorHandler = errorHandler;
domTreeBuilder.setErrorHandler(errorHandler);
tokenizer.setErrorHandler(errorHandler);
}
/**
* Sets whether comment nodes appear in the tree.
* @param ignoreComments <code>true</code> to ignore comments
* @see nu.validator.htmlparser.impl.TreeBuilder#setIgnoringComments(boolean)
*/
public void setIgnoringComments(boolean ignoreComments) {
domTreeBuilder.setIgnoringComments(ignoreComments);
}
/**
* Sets whether the parser considers scripting to be enabled for noscript treatment.
* @param scriptingEnabled <code>true</code> to enable
* @see nu.validator.htmlparser.impl.TreeBuilder#setScriptingEnabled(boolean)
*/
public void setScriptingEnabled(boolean scriptingEnabled) {
domTreeBuilder.setScriptingEnabled(scriptingEnabled);
}
}
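The `pump()` loop above hands the tokenizer at most `CHUNK_SIZE` UTF-16 code units per timer tick by widening the end of a single buffer over the source. As a language-neutral illustration (Python, not part of the GWT code), the window arithmetic can be sketched as below. The sketch assumes the tokenizer consumes each window completely, which the real tokenizer does not guarantee when a script requests suspension, and it ignores the extra buffers pushed for `document.write`. `chunk_windows` is a hypothetical helper name.

```python
def chunk_windows(stream_length, chunk_size=512):
    """Yield the (start, end) windows the stream buffer exposes, one per tick."""
    start = 0
    end = min(stream_length, chunk_size)
    while True:
        if start < end:
            yield (start, end)
            # Assumption: the tokenizer consumed the whole window.
            start = end
        if end == stream_length:
            # Corresponds to tokenizer.eof() and ending = true in pump().
            return
        # Corresponds to buffer.setEnd(min(start + CHUNK_SIZE, streamLength)).
        end = min(start + chunk_size, stream_length)


windows = list(chunk_windows(1200))
```

Keeping each tick to one bounded window is what lets the browser stay responsive between ticks; the trade-off is that a long document takes many timer callbacks to finish.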


@ -0,0 +1,87 @@
/*
* Copyright (c) 2008 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.gwt;
import org.xml.sax.SAXException;
import com.google.gwt.core.client.EntryPoint;
import com.google.gwt.core.client.JavaScriptObject;
public class HtmlParserModule implements EntryPoint {
private static native void zapChildren(JavaScriptObject node) /*-{
while (node.hasChildNodes()) {
node.removeChild(node.lastChild);
}
}-*/;
private static native void installDocWrite(JavaScriptObject doc, HtmlParser parser) /*-{
doc.write = function() {
if (arguments.length == 0) {
return;
}
var text = arguments[0];
for (var i = 1; i < arguments.length; i++) {
text += arguments[i];
}
parser.@nu.validator.htmlparser.gwt.HtmlParser::documentWrite(Ljava/lang/String;)(text);
}
doc.writeln = function() {
if (arguments.length == 0) {
parser.@nu.validator.htmlparser.gwt.HtmlParser::documentWrite(Ljava/lang/String;)("\n");
return;
}
var text = arguments[0];
for (var i = 1; i < arguments.length; i++) {
text += arguments[i];
}
text += "\n";
parser.@nu.validator.htmlparser.gwt.HtmlParser::documentWrite(Ljava/lang/String;)(text);
}
}-*/;
@SuppressWarnings("unused")
private static void parseHtmlDocument(String source, JavaScriptObject document, JavaScriptObject readyCallback, JavaScriptObject errorHandler) throws SAXException {
if (readyCallback == null) {
readyCallback = JavaScriptObject.createFunction();
}
zapChildren(document);
HtmlParser parser = new HtmlParser(document);
parser.setScriptingEnabled(true);
// XXX error handler
installDocWrite(document, parser);
parser.parse(source, new ParseEndListener(readyCallback));
}
private static native void exportEntryPoints() /*-{
$wnd.parseHtmlDocument = @nu.validator.htmlparser.gwt.HtmlParserModule::parseHtmlDocument(Ljava/lang/String;Lcom/google/gwt/core/client/JavaScriptObject;Lcom/google/gwt/core/client/JavaScriptObject;Lcom/google/gwt/core/client/JavaScriptObject;);
}-*/;
public void onModuleLoad() {
exportEntryPoints();
}
}

@ -0,0 +1,46 @@
/*
* Copyright (c) 2008 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.gwt;
import com.google.gwt.core.client.JavaScriptObject;
public class ParseEndListener {
private final JavaScriptObject callback;
/**
* @param callback the JavaScript function to invoke when parsing completes
*/
public ParseEndListener(JavaScriptObject callback) {
this.callback = callback;
}
public void parseComplete() {
call(callback);
}
private static native void call(JavaScriptObject callback) /*-{
callback();
}-*/;
}

@ -0,0 +1,225 @@
<!DOCTYPE HTML>
<html>
<head>
<title>Live DOM Viewer</title>
<script type="text/javascript" language="javascript" src="nu.validator.htmlparser.HtmlParser.nocache.js"></script>
<style>
h1 { margin: 0; }
h2 { font-size: small; margin: 1em 0 0; }
p, ul, pre { margin: 0; }
p { border: inset thin; }
textarea { width: 100%; -width: 99%; height: 8em; border: 0; }
iframe { width: 100%; height: 12em; border: 0; }
/* iframe.large { height: 24em; } */
pre { border: inset thin; padding: 0.5em; color: gray; }
pre samp { color: black; }
#dom { border: inset thin; padding: 0.5em 0.5em 0.5em 1em; color: black; min-height: 5em; font-family: monospace; background: white; }
#dom ul { padding: 0 0 0 1em; margin: 0; }
#dom li { padding: 0; margin: 0; list-style: none; position: relative; }
#dom li li { list-style: disc; }
#dom .t1 code { color: purple; font-weight: bold; }
#dom .t2 { font-style: normal; font-family: monospace; }
#dom .t2 .name { color: black; font-weight: bold; }
#dom .t2 .value { color: blue; font-weight: normal; }
#dom .t3 code, #dom .t4 code, #dom .t5 code { color: gray; }
#dom .t7 code, #dom .t8 code { color: green; }
#dom span { font-style: italic; font-family: serif; }
#dom .t10 code { color: teal; }
#dom .misparented, #dom .misparented code { color: red; font-weight: bold; }
#dom.hidden, .hidden { visibility: hidden; margin: 0.5em 0; padding: 0; height: 0; min-height: 0; }
pre#log { color: black; font: small monospace; }
script + p { border: none; font-size: smaller; margin: 0.8em 0.3em; }
</style>
<style title="Tree View">
#dom li li { list-style: none; }
#dom li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
#dom li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
</style>
<script>
if (navigator.userAgent.match('Gecko/(\\d+)') && RegExp.$1 == '20060217' && RegExp.$1 != '00000000') {
var style = document.getElementsByTagName('style')[1];
style.parentNode.removeChild(style);
}
</script>
</head>
<body onload="init()">
<h1>Live DOM Viewer</h1>
<h2>Markup to test (<a href="data:," id="permalink" rel="bookmark">permalink</a>, <a href="javascript:up()">upload</a>, <a href="javascript:down()">download</a>, <a href="#" onclick="toggleVisibility(this); return false">hide</a>): <span id="updown-status"></span></h2>
<p><textarea oninput="updateInput(event)" onkeydown="updateInput(event)">&lt;!DOCTYPE html>
...</textarea></p>
<h2><a href="data:," id="domview">DOM view</a> (<a href="#" onclick="toggleVisibility(this); return false;">hide</a>, <a href="#" onclick="updateDOM()">refresh</a>):</h2>
<ul id="dom"></ul>
<h2><a href="data:," id="link">Rendered view</a>: (<a href="#" onclick="toggleVisibility(this); return false;">hide</a><!--, <a href="#" onclick="grow(this)">grow</a>-->):</h2>
<p><iframe src="blank.html"></iframe></p> <!-- data:, -->
<h2>innerHTML view: (<a href="#" onclick="toggleVisibility(this); return false;">show</a>, <a href="#" onclick="updateDOM()">refresh</a>):</h2>
<pre class="hidden">&lt;!DOCTYPE HTML>&lt;html><samp></samp>&lt;/html></pre>
<h2>Log: (<a href="#" onclick="toggleVisibility(this); return false;">hide</a>):</h2>
<pre id="log">Script not loaded.</pre>
<script>
var iframe = document.getElementsByTagName('iframe')[0];
var textarea = document.getElementsByTagName('textarea')[0];
var pre = document.getElementsByTagName('samp')[0];
var dom = document.getElementsByTagName('ul')[0];
var log = document.getElementById('log');
var updownStatus = document.getElementById('updown-status');
var delayedUpdater = 0;
var lastString = '';
var logBuffer = '';
var logBuffering = false;
function updateInput(event) {
if (delayedUpdater) {
clearTimeout(delayedUpdater);
delayedUpdater = 0;
}
delayedUpdater = setTimeout(update, 100);
}
function afterParse() {
lastString = textarea.value;
setTimeout(updateDOM, 100);
updown('');
}
function update() {
if (lastString != textarea.value) {
logBuffering = true;
document.getElementById('link').href = 'data:text/html;charset=utf-8,' + encodeURIComponent(textarea.value);
iframe.contentWindow.onerror = function (a, b, c) {
record('error: ' + a + ' on line ' + c);
}
iframe.contentWindow.w = function (s) {
record('log: ' + s);
}
window.parseHtmlDocument(textarea.value, iframe.contentWindow.document, afterParse, null);
}
}
function updateDOM() {
while (pre.firstChild) pre.removeChild(pre.firstChild);
pre.appendChild(document.createTextNode(iframe.contentWindow.document.documentElement.innerHTML));
printDOM(dom, iframe.contentWindow.document);
document.getElementById('domview').href = 'data:text/plain;charset=utf-8,<ul class="domTree">' + encodeURIComponent(dom.innerHTML + '</ul>');
document.getElementById('permalink').href = '?' + encodeURIComponent(textarea.value);
record('rendering mode: ' + iframe.contentWindow.document.compatMode);
if (iframe.contentWindow.document.title)
record('document.title: ' + iframe.contentWindow.document.title);
else
record('document has no title');
while (log.firstChild != log.lastChild)
log.removeChild(log.lastChild);
log.firstChild.data = logBuffer;
logBuffering = false;
logBuffer = '';
}
function printDOM(ul, node) {
while (ul.firstChild) ul.removeChild(ul.firstChild);
for (var i = 0; i < node.childNodes.length; i += 1) {
var li = document.createElement('li');
li.className = 't' + node.childNodes[i].nodeType;
if (node.childNodes[i].nodeType == 10) {
li.appendChild(document.createTextNode('DOCTYPE: '));
}
var code = document.createElement('code');
code.appendChild(document.createTextNode(node.childNodes[i].nodeName));
li.appendChild(code);
if (node.childNodes[i].nodeValue) {
var span = document.createElement('span');
span.appendChild(document.createTextNode(node.childNodes[i].nodeValue));
li.appendChild(document.createTextNode(': '));
li.appendChild(span);
}
if (node.childNodes[i].attributes)
for (var j = 0; j < node.childNodes[i].attributes.length; j += 1) {
if (node.childNodes[i].attributes[j].specified) {
var attName = document.createElement('code');
attName.appendChild(document.createTextNode(node.childNodes[i].attributes[j].nodeName));
attName.className = 'attribute name';
var attValue = document.createElement('code');
attValue.appendChild(document.createTextNode(node.childNodes[i].attributes[j].nodeValue));
attValue.className = 'attribute value';
var att = document.createElement('span');
att.className = 't2';
att.appendChild(attName);
att.appendChild(document.createTextNode('="'));
att.appendChild(attValue);
att.appendChild(document.createTextNode('"'));
li.appendChild(document.createTextNode(' '));
li.appendChild(att);
}
}
if (node.childNodes[i].parentNode == node) {
if (node.childNodes[i].childNodes.length) {
var ul2 = document.createElement('ul');
li.appendChild(ul2);
printDOM(ul2, node.childNodes[i]);
}
} else {
li.className += ' misparented';
}
ul.appendChild(li);
}
}
function toggleVisibility(link) {
var n = link.parentNode.nextSibling;
if (n.nodeType == 3 /* text node */) n = n.nextSibling; // we should always do this but in IE, text nodes vanish
n.className = (n.className == "hidden") ? '' : 'hidden';
link.firstChild.data = n.className == "hidden" ? "show" : "hide";
}
/*
function grow(link) {
var n = link.parentNode.nextSibling;
if (n.nodeType == 3 /-* text node *-/) n = n.nextSibling; // we should always do this but in IE, text nodes vanish
n.className = (n.className == "large") ? '' : 'large';
link.firstChild.data = n.className == "grow" ? "shrink" : "grow";
}
*/
function down() {
updown('downloading...');
var request = window.XMLHttpRequest ? new XMLHttpRequest() : new ActiveXObject("Microsoft.XMLHTTP");
request.onreadystatechange = function () {
updown('downloading... ' + request.readyState + '/4');
if (request.readyState == 4) {
textarea.value = request.responseText;
update();
updown('downloaded');
}
};
request.open('GET', 'clipboard.cgi', true);
request.send(null);
}
function up() {
updown('uploading...');
var request = window.XMLHttpRequest ? new XMLHttpRequest() : new ActiveXObject("Microsoft.XMLHTTP");
request.onreadystatechange = function () {
updown('uploading... ' + request.readyState + '/4');
if (request.readyState == 4) {
updown('uploaded');
}
};
request.open('POST', 'clipboard.cgi', true);
request.setRequestHeader('Content-Type', 'text/plain');
request.send(textarea.value);
}
function init() {
var uri = location.search;
if (uri)
textarea.value = decodeURIComponent(uri.substring(1, uri.length));
update();
}
function record(s) {
if (logBuffering)
logBuffer += s + '\r\n';
else
log.appendChild(document.createTextNode(s + '\r\n'));
}
function updown(s) {
while (updownStatus.firstChild) updownStatus.removeChild(updownStatus.firstChild);
updownStatus.appendChild(document.createTextNode(s));
}
</script>
<p>This script puts a function <code>w(<var>s</var>)</code> into the
global scope of the test page, where <var>s</var> is a string to
output to the log. Also, five files are accessible in the current
directory for test purposes: <code>image</code> (a GIF image),
<code>flash</code> (a Flash file), <code>script</code> (a JS file),
<code>style</code> (a CSS file), and <code>document</code> (an HTML
file).</p>
</body>
</html>

@ -0,0 +1,25 @@
From:
http://software.hixie.ch/utilities/js/live-dom-viewer/LICENSE
regarding the upstream of HtmlParser.html:
The MIT License
Copyright (c) 2000, 2006, 2008 Ian Hickson and various contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

@ -0,0 +1,2 @@
<!DOCTYPE html>
<title></title>

@ -0,0 +1,25 @@
These scripts export the Java-to-C++ translator and the Java source files that
implement the HTML5 parser. The exported translator may be used (with no
external dependencies) to translate the exported Java source files into Gecko-
compatible C++.
Hacking the translator itself still requires a working copy of the Java HTML5
parser repository, but hacking the parser (modifying the Java source files and
performing the translation) should now be possible using only files committed
to the Mozilla source tree.
Run any of these scripts without arguments to receive usage instructions.
make-translator-jar.sh: compiles the Java-to-C++ translator into a .jar file
export-java-srcs.sh: exports minimal java source files implementing the
HTML5 parser
export-translator.sh: exports the compiled translator and javaparser.jar
export-all.sh: runs the previous two scripts
util.sh: provides various shell utility functions to the
scripts listed above (does nothing if run directly)
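The order in which the scripts are meant to be run can be read off their usage
messages. The following is only a dry-run sketch of that order: it prints the
commands instead of executing them. The function name print_workflow and both
path arguments are placeholders made up for this illustration.

```shell
#!/usr/bin/env sh
# Dry-run sketch: print (do not execute) the commands in the order that the
# scripts' own usage messages suggest. print_workflow is a hypothetical
# helper, and both arguments are placeholder paths.
print_workflow() {
    jar_path=$1      # e.g. /path/to/javaparser-1.0.7.jar
    parser_path=$2   # e.g. /path/to/mozilla-central/parser/html
    echo "sh make-translator-jar.sh $jar_path"
    echo "sh export-all.sh $parser_path"
    echo "cd $parser_path"
    echo "java -jar javalib/translator.jar javasrc . nsHtml5AtomList.h"
}

print_workflow /path/to/javaparser-1.0.7.jar /path/to/mozilla-central/parser/html
```

The first step builds translator.jar, the second exports it together with the
Java sources, and the last command is the one export-all.sh itself tells you
to run.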
All path arguments may be either absolute or relative. This includes the path
to the script itself ($0), so the directory from which you run these scripts
doesn't matter.
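As a rough illustration of why the working directory does not matter, a script
can turn both its own location and its arguments into absolute paths before
using them. The sketch below assumes a helper called abs_path, which is a
made-up name for this example and not the abs function shipped in util.sh.

```shell
#!/usr/bin/env sh
# abs_path: hypothetical helper for illustration (not the repository's abs).
# Prints an absolute path for an existing directory, or for a file whose
# parent directory exists. CDPATH is cleared so that cd prints nothing.
abs_path() {
    if [ -d "$1" ]; then
        (CDPATH= cd "$1" && pwd)
    else
        echo "$(CDPATH= cd "$(dirname "$1")" && pwd)/$(basename "$1")"
    fi
}

# Resolve the directory containing this script, wherever it is invoked from.
SCRIPT_DIR=$(abs_path "$(dirname "$0")")
echo "scripts live in: $SCRIPT_DIR"
```

Running cd inside a subshell keeps the caller's working directory unchanged,
so the helper can be called repeatedly without side effects.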
Ben Newman (7 July 2009)

@ -0,0 +1,24 @@
#!/usr/bin/env sh
SCRIPT_DIR=`dirname $0`
# Use the POSIX "." command; "source" is a bash extension that plain sh may lack.
. "$SCRIPT_DIR/util.sh"
SCRIPT_DIR=`abs $SCRIPT_DIR`
if [ $# -eq 1 ]
then
MOZ_PARSER_PATH=`abs $1`
else
echo
echo "Usage: sh `basename $0` /path/to/mozilla-central/parser/html"
echo "Note that relative paths will work just fine."
echo
exit 1
fi
$SCRIPT_DIR/export-translator.sh $MOZ_PARSER_PATH
$SCRIPT_DIR/export-java-srcs.sh $MOZ_PARSER_PATH
echo
echo "Now go to $MOZ_PARSER_PATH and run"
echo " java -jar javalib/translator.jar javasrc . nsHtml5AtomList.h"
echo

@ -0,0 +1,25 @@
#!/usr/bin/env sh
SCRIPT_DIR=`dirname $0`
# Use the POSIX "." command; "source" is a bash extension that plain sh may lack.
. "$SCRIPT_DIR/util.sh"
SCRIPT_DIR=`abs $SCRIPT_DIR`
SRCDIR=`abs $SCRIPT_DIR/../src/nu/validator/htmlparser/impl`
if [ $# -eq 1 ]
then
MOZ_PARSER_PATH=`abs $1`
else
echo
echo "Usage: sh `basename $0` /path/to/mozilla-central/parser/html"
echo "Note that relative paths will work just fine."
echo
exit 1
fi
SRCTARGET=$MOZ_PARSER_PATH/javasrc
rm -rf $SRCTARGET
mkdir $SRCTARGET
# Avoid copying the .svn directory:
cp -rv $SRCDIR/*.java $SRCTARGET

@ -0,0 +1,24 @@
#!/usr/bin/env sh
SCRIPT_DIR=`dirname $0`
# Use the POSIX "." command; "source" is a bash extension that plain sh may lack.
. "$SCRIPT_DIR/util.sh"
SCRIPT_DIR=`abs $SCRIPT_DIR`
LIBDIR=`abs $SCRIPT_DIR/../translator-lib`
if [ $# -eq 1 ]
then
MOZ_PARSER_PATH=`abs $1`
else
echo
echo "Usage: sh `basename $0` /path/to/mozilla-central/parser/html"
echo "Note that relative paths will work just fine."
echo "Be sure that you have run `dirname $0`/make-translator-jar.sh before running this script."
echo
exit 1
fi
LIBTARGET=$MOZ_PARSER_PATH/javalib
rm -rf $LIBTARGET
cp -rv $LIBDIR $LIBTARGET

@ -0,0 +1,63 @@
#!/usr/bin/env sh
SCRIPT_DIR=`dirname $0`
# Use the POSIX "." command; "source" is a bash extension that plain sh may lack.
. "$SCRIPT_DIR/util.sh"
SCRIPT_DIR=`abs $SCRIPT_DIR`
SRCDIR=`abs $SCRIPT_DIR/../translator-src`
BINDIR=`abs $SCRIPT_DIR/../translator-bin`
LIBDIR=`abs $SCRIPT_DIR/../translator-lib`
if [ $# -eq 1 ]
then
JAVAPARSER_JAR_PATH=`abs $1`
else
echo
echo "Usage: sh `basename $0` /path/to/javaparser-1.0.7.jar"
echo "Note that relative paths will work just fine."
echo "Obtain javaparser-1.0.7.jar from http://code.google.com/p/javaparser"
echo
exit 1
fi
set_up() {
rm -rf $BINDIR; mkdir $BINDIR
rm -rf $LIBDIR; mkdir $LIBDIR
cp $JAVAPARSER_JAR_PATH $LIBDIR/javaparser.jar
}
write_manifest() {
rm -f $LIBDIR/manifest
echo "Main-Class: nu.validator.htmlparser.cpptranslate.Main" > $LIBDIR/manifest
echo "Class-Path: javaparser.jar" >> $LIBDIR/manifest
}
compile_translator() {
find $SRCDIR -name "*.java" | \
xargs javac -cp $LIBDIR/javaparser.jar -g -d $BINDIR
}
generate_jar() {
jar cvfm $LIBDIR/translator.jar $LIBDIR/manifest -C $BINDIR .
}
clean_up() {
rm -f $LIBDIR/manifest
}
success_message() {
echo
echo "Successfully generated directory \"$LIBDIR\" with contents:"
echo
ls -al $LIBDIR
echo
echo "Now run `dirname $0`/export-all.sh with no arguments and follow the usage instructions."
echo
}
set_up && \
compile_translator && \
write_manifest && \
generate_jar && \
clean_up && \
success_message

@ -0,0 +1,23 @@
#!/usr/bin/env sh
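# Print an absolute path for $1 (defaults to the current directory).
# Uses cd in a subshell rather than pushd/popd and local, which are
# bash extensions that a plain POSIX sh may not provide.
abs() {
if [ $# -ne 1 ]
then
rel=.
else
rel=$1
fi
if [ -d "$rel" ]
then
(cd "$rel" && pwd)
else
echo "$(cd "$(dirname "$rel")" && pwd)/$(basename "$rel")"
fi
}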

@ -0,0 +1,240 @@
<!--
* Copyright (c) 2007-2012 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>nu.validator.htmlparser</groupId>
<artifactId>htmlparser</artifactId>
<packaging>bundle</packaging>
<version>1.4</version>
<name>htmlparser</name>
<url>http://about.validator.nu/htmlparser/</url>
<description>The Validator.nu HTML Parser is an implementation of the HTML5 parsing algorithm in Java for applications. The parser is designed to work as a drop-in replacement for the XML parser in applications that already support XHTML 1.x content with an XML parser and use SAX, DOM or XOM to interface with the parser.</description>
<!--
Usage notes for this POM:
To build without signing, run:
mvn clean source:jar javadoc:jar repository:bundle-create
(enter 0 <return> when prompted)
To build and sign, run:
mvn clean source:jar javadoc:jar package gpg:sign repository:bundle-create
(enter 0 <return> when prompted)
This POM file is used for creating the bundle for distribution via the
Maven Central Repository. It is not used as part of the normal development
process of the parser and the maintainer of the parser (Henri Sivonen)
isn't experienced in POM tweaking. If you need this POM to do something
that it currently does not do or do something better, you need to write
the changes you need yourself and contribute a patch via
http://bugzilla.validator.nu/
-->
<developers>
<developer>
<id>hsivonen</id>
<name>Henri Sivonen</name>
<email>hsivonen@iki.fi</email>
<url>http://hsivonen.iki.fi/</url>
</developer>
</developers>
<licenses>
<license>
<name>The MIT License</name>
<url>http://www.opensource.org/licenses/mit-license.php</url>
<distribution>repo</distribution>
</license>
<license>
<name>The (New) BSD License</name>
<url>http://www.opensource.org/licenses/bsd-license.php</url>
<distribution>repo</distribution>
</license>
</licenses>
<scm>
<connection>scm:hg:http://hg.mozilla.org/projects/htmlparser/</connection>
<url>http://hg.mozilla.org/projects/htmlparser/</url>
</scm>
<build>
<sourceDirectory>${project.build.directory}/src</sourceDirectory>
<testSourceDirectory>${basedir}/test-src</testSourceDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.5</source>
<target>1.5</target>
</configuration>
</plugin>
<plugin>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.7</version>
<dependencies>
<dependency>
<groupId>com.sun</groupId>
<artifactId>tools</artifactId>
<version>1.5.0</version>
<scope>system</scope>
<systemPath>${java.home}/../lib/tools.jar</systemPath>
</dependency>
</dependencies>
<executions>
<execution>
<id>initialize-sources</id>
<phase>initialize</phase>
<goals>
<goal>run</goal>
</goals>
<configuration>
<target>
<delete dir="${project.build.sourceDirectory}"/>
<mkdir dir="${project.build.sourceDirectory}"/>
<copy todir="${project.build.sourceDirectory}">
<fileset dir="${basedir}/src"/>
</copy>
</target>
</configuration>
</execution>
<execution>
<id>tokenizer-hotspot-workaround</id>
<phase>process-sources</phase>
<goals>
<goal>run</goal>
</goals>
<configuration>
<target>
<property name="translator.sources" value="${basedir}/translator-src"/>
<property name="translator.classes" value="${project.build.directory}/translator-classes"/>
<mkdir dir="${translator.classes}"/>
<javac srcdir="${translator.sources}" includes="nu/validator/htmlparser/generator/ApplyHotSpotWorkaround.java" destdir="${translator.classes}" includeantruntime="false"/>
<java classname="nu.validator.htmlparser.generator.ApplyHotSpotWorkaround">
<classpath>
<pathelement location="${translator.classes}"/>
</classpath>
<arg value="${project.build.sourceDirectory}/nu/validator/htmlparser/impl/Tokenizer.java"/>
<arg value="${project.build.sourceDirectory}/nu/validator/htmlparser/impl/HotSpotWorkaround.txt"/>
</java>
</target>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<skip>true</skip>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.felix</groupId>
<artifactId>maven-bundle-plugin</artifactId>
<version>2.3.7</version>
<extensions>true</extensions>
<configuration>
<archive>
<addMavenDescriptor>false</addMavenDescriptor>
</archive>
<instructions>
<Bundle-Name>${project.name}</Bundle-Name>
<Bundle-SymbolicName>nu.validator.htmlparser</Bundle-SymbolicName>
<Bundle-Version>${project.version}</Bundle-Version>
<Bundle-RequiredExecutionEnvironment>J2SE-1.5</Bundle-RequiredExecutionEnvironment>
<_removeheaders>Built-By,Bnd-LastModified</_removeheaders>
</instructions>
</configuration>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>rpm-maven-plugin</artifactId>
<configuration>
<release>1</release>
<copyright>The MIT License</copyright>
<group>Development/Java</group>
<workarea>/var/tmp/${project.build.finalName}</workarea>
<defineStatements>
<defineStatement>_javadir ${rpm.java.dir}</defineStatement>
<defineStatement>_javadocdir ${rpm.javadoc.dir}</defineStatement>
</defineStatements>
<mappings>
<mapping>
<directory>${rpm.java.dir}</directory>
<filemode>644</filemode>
<username>root</username>
<groupname>root</groupname>
<sources>
<source>
<location>${project.build.directory}/${project.build.finalName}.jar</location>
</source>
</sources>
</mapping>
<mapping>
<directory>${rpm.javadoc.dir}/${project.build.finalName}</directory>
<filemode>644</filemode>
<username>root</username>
<groupname>root</groupname>
<sources>
<source>
<location>${project.build.directory}/apidocs</location>
</source>
</sources>
</mapping>
</mappings>
<install>%__ln_s ${project.build.finalName}.jar %{buildroot}%{_javadir}/${project.name}.jar</install>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>com.ibm.icu</groupId>
<artifactId>icu4j</artifactId>
<version>4.0.1</version>
<scope>compile</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>xom</groupId>
<artifactId>xom</artifactId>
<version>1.1</version>
<scope>compile</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>net.sourceforge.jchardet</groupId>
<artifactId>jchardet</artifactId>
<version>1.0</version>
<scope>compile</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>com.sdicons.jsontools</groupId>
<artifactId>jsontools-core</artifactId>
<version>1.4</version>
<scope>test</scope>
</dependency>
</dependencies>
<properties>
<rpm.java.dir>/usr/share/java</rpm.java.dir>
<rpm.javadoc.dir>/usr/share/javadoc</rpm.javadoc.dir>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
</project>

@ -0,0 +1,36 @@
import java.util.HashSet;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
public class DomUtils {
private static HashSet<Document> pinned_list = new HashSet<Document>();
public static synchronized void pin(Document d) {
pinned_list.add(d);
}
public static synchronized void unpin(Document d) {
pinned_list.remove(d);
}
// return all the text content contained by a single element
public static void getElementContent(Element e, StringBuffer b) {
for (Node n = e.getFirstChild(); n!=null; n=n.getNextSibling()) {
if (n.getNodeType() == Node.TEXT_NODE) {
b.append(n.getNodeValue());
} else if (n.getNodeType() == Node.ELEMENT_NODE) {
// Recurse into the child element n; passing the parent e would never terminate.
getElementContent((Element) n, b);
}
}
}
// replace all child nodes of a given element with a single text element
public static void setElementContent(Element e, String s) {
while (e.hasChildNodes()) {
e.removeChild(e.getFirstChild());
}
e.appendChild(e.getOwnerDocument().createTextNode(s));
}
}

@ -0,0 +1,65 @@
Disclaimer:
This code is experimental.
When some people say experimental, they mean "it may not do what it is
intended to do; in fact, it might even wipe out your hard drive". I mean
that too. But I mean something more than that.
In this case, experimental means that I don't even know what it is intended
to do. I just have a vague vision, and I am trying out various things in
the hopes that one of them will work out.
Vision:
My vague vision is that I would like to see HTML 5 be a success. For me to
consider it to be a success, it needs to be a standard, be interoperable,
and be ubiquitous.
I believe that the Validator.nu parser can be used to bootstrap that
process. It is written in Java, has been compiled into JavaScript, and has
been translated into C++ based on the Mozilla libraries with the intent of
being included in Firefox. It tracks the standard very closely.
For the moment, the effort is on extending that to another language (Ruby)
on a single environment (i.e., Linux). Once that is complete, the intent is
to evaluate the results and decide what needs to be changed and what needs
to be done to support other languages and environments.
The bar I'm setting for myself isn't just another SWIG-generated low-level
interface to a DOM, but rather a best-of-breed interface, which for Ruby
seems to be the one pioneered by Hpricot and adopted by Nokogiri. Success
will mean passing all of the tests from one of those two parsers as well as
all of the HTML5 tests.
Build instructions:
You'll need icu4j and chardet jars. If you checked out and ran dldeps you
are already all set:
svn co http://svn.versiondude.net/whattf/build/trunk/ build
python build/build.py checkout dldeps
Fedora 11:
yum install ruby-devel rubygem-rake java-1.5.0-gcj-devel gcc-c++
Ubuntu 9.04:
apt-get install ruby ruby1.8-dev rake gcj g++
Also, at this time you need to install a JDK (e.g. sun-java6-jdk), simply
because the javac that comes with gcj doesn't support -sourcepath, and
I haven't spent the time to find a replacement.
Finally, make sure that libjaxp1.3-java is *not* installed.
http://gcc.gnu.org/ml/java/2009-06/msg00055.html
If this is done, you should be all set.
cd htmlparser/ruby-gcj
rake test
If things are successful, the last lines of the output will list the
font attributes and values found in the test/google.html file.

@ -0,0 +1,77 @@
deps = ENV['deps'] || '../../dependencies'
icu4j = "#{deps}/icu4j-4_0.jar"
chardet = "#{deps}/mozilla/intl/chardet/java/dist/lib/chardet.jar"
libgcj = Dir['/usr/share/java/libgcj*.jar'].grep(/gcj[-\d.]*jar$/).sort.last
task :default => %w(headers libs Makefile validator.so)
# headers
hdb = 'nu/validator/htmlparser/dom/HtmlDocumentBuilder'
task :headers => %W(headers/DomUtils.h headers/#{hdb}.h)
file 'headers/DomUtils.h' => 'DomUtils.java' do |t|
mkdir_p %w(classes headers), :verbose => false
sh "javac -d classes #{t.prerequisites.first}"
sh "gcjh -force -o #{t.name} -cp #{libgcj}:classes DomUtils"
end
file "headers/#{hdb}.h" => "../src/#{hdb}.java" do |t|
mkdir_p %w(classes headers), :verbose => false
sh "javac -cp #{icu4j}:#{chardet} -d classes -sourcepath ../src " +
t.prerequisites.first
sh "gcjh -force -o #{t.name} -cp #{libgcj}:classes " +
hdb.gsub('/','.')
end
# libs
task :libs => %w(htmlparser chardet icu).map {|name| "lib/libnu-#{name}.so"}
htmlparser = Dir['../src/**/*.java'].reject {|name| name.include? '/xom/'}
file 'lib/libnu-htmlparser.so' => htmlparser + ['DomUtils.java'] do |t|
mkdir_p 'lib', :verbose => false
sh "gcj -shared --classpath=#{icu4j}:#{chardet} -fPIC " +
"-o #{t.name} #{t.prerequisites.join(' ')}"
end
file 'lib/libnu-chardet.so' => chardet do |t|
mkdir_p 'lib', :verbose => false
sh "gcj -shared -fPIC -o #{t.name} #{t.prerequisites.join(' ')}"
end
file 'lib/libnu-icu.so' => icu4j do |t|
mkdir_p 'lib', :verbose => false
sh "gcj -shared -fPIC -o #{t.name} #{t.prerequisites.join(' ')}"
end
# module
file 'Makefile' do
sh "ruby extconf.rb --with-gcj=#{libgcj}"
end
file 'validator.so' => %w(Makefile validator.cpp headers/DomUtils.h) do
system 'make'
end
file 'nu/validator.so' do
mkdir_p 'nu', :verbose => false
system 'ln -s -t nu ../validator.so'
end
# tasks
task :test => [:default, 'nu/validator.so'] do
ENV['LD_LIBRARY_PATH']='lib'
sh 'ruby test/fonts.rb test/google.html'
end
task :clean do
rm_rf %W(classes lib nu mkmf.log headers/DomUtils.h headers/#{hdb}.h) +
Dir['*.o'] + Dir['*.so']
end
task :clobber => :clean do
rm_rf %w(headers Makefile)
end

@ -0,0 +1,45 @@
require 'mkmf'
# system dependencies
gcj = with_config('gcj', '/usr/share/java/libgcj.jar')
# headers for JAXP
CONFIG['CC'] = 'g++'
with_cppflags('-xc++') do
unless find_header('org/w3c/dom/Document.h', 'headers')
`jar tf #{gcj}`.split.each do |file|
next unless file =~ /\.class$/
next unless file =~ /^(javax|org)\/(w3c|xml)/
next if file.include? '$'
dest = 'headers/' + file.sub(/\.class$/,'.h')
name = file.sub(/\.class$/,'').gsub('/','.')
next if File.exist? dest
cmd = "gcjh -cp #{gcj} -o #{dest} #{name}"
puts cmd
break unless system cmd
system "ruby -pi -e '$_.sub!(/namespace namespace$/," +
"\"namespace namespace$\")' #{dest}"
system "ruby -pi -e '$_.sub!(/::namespace::/," +
"\"::namespace$::\")' #{dest}"
end
exit unless find_header('org/w3c/dom/Document.h', 'headers')
end
find_header 'nu/validator/htmlparser/dom/HtmlDocumentBuilder.h', 'headers'
end
# Java libraries
Config::CONFIG['CC'] = 'g++ -shared'
dir_config('nu-htmlparser', nil, 'lib')
have_library 'nu-htmlparser'
have_library 'nu-icu'
have_library 'nu-chardet'
# Ruby library
create_makefile 'nu/validator'


@ -0,0 +1,5 @@
require 'nu/validator'
ARGV.each do |arg|
puts Nu::Validator::parse(open(arg)).root.name
end


@ -0,0 +1,11 @@
require 'nu/validator'
require 'open-uri'
ARGV.each do |arg|
doc = Nu::Validator::parse(open(arg))
doc.xpath("//*[local-name()='font']").each do |font|
font.attributes.each do |name, attr|
puts "#{name} => #{attr.value}"
end
end
end


@ -0,0 +1,10 @@
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><title>Google</title><script>window.google={kEI:"vLhASujeGpTU9QT2iOnWAQ",kEXPI:"17259",kCSIE:"17259",kHL:"en"};
window.google.sn="webhp";window.google.timers={load:{t:{start:(new Date).getTime()}}};try{window.google.pt=window.gtbExternal&&window.gtbExternal.pageT()||window.external&&window.external.pageT}catch(b){}
window.google.jsrt_kill=1;
var _gjwl=location;function _gjuc(){var e=_gjwl.href.indexOf("#");if(e>=0){var a=_gjwl.href.substring(e);if(a.indexOf("&q=")>0||a.indexOf("#q=")>=0){a=a.substring(1);if(a.indexOf("#")==-1){for(var c=0;c<a.length;){var d=c;if(a.charAt(d)=="&")++d;var b=a.indexOf("&",d);if(b==-1)b=a.length;var f=a.substring(d,b);if(f.indexOf("fp=")==0){a=a.substring(0,c)+a.substring(b,a.length);b=c}else if(f=="cad=h")return 0;c=b}_gjwl.href="/search?"+a+"&cad=h";return 1}}}return 0}function _gjp(){!(window._gjwl.hash&&
window._gjuc())&&setTimeout(_gjp,500)};
window._gjp && _gjp();</script><style>td{line-height:.8em;}.gac_c{line-height:normal;}form{margin-bottom:20px;}body,td,a,p,.h{font-family:arial,sans-serif}.h{color:#36c;font-size:20px}.q{color:#00c}.ts td{padding:0}.ts{border-collapse:collapse}#gbar{height:22px;padding-left:0px}.gbh,.gbd{border-top:1px solid #c9d7f1;font-size:1px}.gbh{height:0;position:absolute;top:24px;width:100%}#guser{padding-bottom:7px !important;text-align:right}#gbar,#guser{font-size:13px;padding-top:1px !important}@media all{.gb1,.gb3{height:22px;margin-right:.5em;vertical-align:top}#gbar{float:left}}a.gb1,a.gb3{color:#00c !important}.gb3{text-decoration:none}</style><script>google.y={};google.x=function(e,g){google.y[e.id]=[e,g];return false};</script></head><body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onload="document.f.q.focus();if(document.images)new Image().src='/images/nav_logo4.png'" topmargin=3 marginheight=3><textarea id=csi style=display:none></textarea><iframe name=wgjf style="display:none"></iframe><div id=gbar><nobr><b class=gb1>Web</b> <a href="http://images.google.com/imghp?hl=en&tab=wi" class=gb1>Images</a> <a href="http://video.google.com/?hl=en&tab=wv" class=gb1>Video</a> <a href="http://maps.google.com/maps?hl=en&tab=wl" class=gb1>Maps</a> <a href="http://news.google.com/nwshp?hl=en&tab=wn" class=gb1>News</a> <a href="http://www.google.com/prdhp?hl=en&tab=wf" class=gb1>Shopping</a> <a href="http://mail.google.com/mail/?hl=en&tab=wm" class=gb1>Gmail</a> <a href="http://www.google.com/intl/en/options/" class=gb3><u>more</u> &raquo;</a></nobr></div><div id=guser width=100%><nobr><a href="/url?sa=p&pref=ig&pval=3&q=http://www.google.com/ig%3Fhl%3Den%26source%3Diglk&usg=AFQjCNFA18XPfgb7dKnXfKz7x7g1GDH1tg">iGoogle</a> | <a href="https://www.google.com/accounts/Login?hl=en&continue=http://www.google.com/">Sign in</a></nobr></div><div class=gbh style=left:0></div><div class=gbh style=right:0></div><center><br clear=all id=lgpd><img alt="Google" 
height=110 src="/intl/en_ALL/images/logo.gif" width=276 id=logo onload="window.lol&&lol()"><br><br><form action="/search" name=f><table cellpadding=0 cellspacing=0><tr valign=top><td width=25%>&nbsp;</td><td align=center nowrap><input name=hl type=hidden value=en><input type=hidden name=ie value="ISO-8859-1"><input autocomplete="off" maxlength=2048 name=q size=55 title="Google Search" value=""><br><input name=btnG type=submit value="Google Search"><input name=btnI type=submit value="I'm Feeling Lucky"></td><td nowrap width=25% align=left><font size=-2>&nbsp;&nbsp;<a href=/advanced_search?hl=en>Advanced Search</a><br>&nbsp;&nbsp;<a href=/preferences?hl=en>Preferences</a><br>&nbsp;&nbsp;<a href=/language_tools?hl=en>Language Tools</a></font></td></tr></table></form><br><font size=-1><a href="/aclk?sa=L&ai=CqVchLbNASrv7IZa68gS13KTwAc3__IMB29PoogzB2ZzZExABIMFUUK_O0JX______wFgyQaqBAlP0BcDOBRYhqw&num=1&sig=AGiWqty21CD7ixNXZILwCnH7c_3n9v2-tg&q=http://www.allforgood.org#source=hpp">Find an opportunity to volunteer</a> in your community today.</font><br><br><br><font size=-1><a href="/intl/en/ads/">Advertising&nbsp;Programs</a> - <a href="/services/">Business Solutions</a> - <a href="/intl/en/about.html">About Google</a></font><p><font size=-2>&copy;2009 - <a href="/intl/en/privacy.html">Privacy</a></font></p></center><div id=xjsd></div><div id=xjsi><script>if(google.y)google.y.first=[];if(google.y)google.y.first=[];google.dstr=[];google.rein=[];window.setTimeout(function(){var a=document.createElement("script");a.src="/extern_js/f/CgJlbhICdXMgACswCjggQAgsKzAOOAUsKzAYOAQsKzAlOMmIASwrMCY4BCwrMCc4ACw/1t0T7hspHT4.js";(document.getElementById("xjsd")||document.body).appendChild(a)},0);
;google.y.first.push(function(){google.ac.i(document.f,document.f.q,'','')});google.xjs&&google.j&&google.j.xi&&google.j.xi()</script></div><script>(function(){
function a(){google.timers.load.t.ol=(new Date).getTime();google.report&&google.report(google.timers.load,{ei:google.kEI,e:google.kCSIE})}if(window.addEventListener)window.addEventListener("load",a,false);else if(window.attachEvent)window.attachEvent("onload",a);google.timers.load.t.prt=(new Date).getTime();
})();
</script>


@ -0,0 +1,2 @@
<?xml version='1.0' encoding='iso-8859-7'?>
<root/>


@ -0,0 +1,210 @@
#include <gcj/cni.h>
#include <java/io/ByteArrayInputStream.h>
#include <java/lang/System.h>
#include <java/lang/Throwable.h>
#include <java/util/ArrayList.h>
#include <javax/xml/xpath/XPath.h>
#include <javax/xml/xpath/XPathFactory.h>
#include <javax/xml/xpath/XPathExpression.h>
#include <javax/xml/xpath/XPathConstants.h>
#include <javax/xml/parsers/DocumentBuilderFactory.h>
#include <javax/xml/parsers/DocumentBuilder.h>
#include <org/w3c/dom/Attr.h>
#include <org/w3c/dom/Document.h>
#include <org/w3c/dom/Element.h>
#include <org/w3c/dom/NodeList.h>
#include <org/w3c/dom/NamedNodeMap.h>
#include <org/xml/sax/InputSource.h>
#include "nu/validator/htmlparser/dom/HtmlDocumentBuilder.h"
#include "DomUtils.h"
#include "ruby.h"
using namespace java::io;
using namespace java::lang;
using namespace java::util;
using namespace javax::xml::parsers;
using namespace javax::xml::xpath;
using namespace nu::validator::htmlparser::dom;
using namespace org::w3c::dom;
using namespace org::xml::sax;
static VALUE jaxp_Document;
static VALUE jaxp_Attr;
static VALUE jaxp_Element;
static ID ID_read;
static ID ID_doc;
static ID ID_element;
// convert a Java string into a Ruby string
static VALUE j2r(String *string) {
if (string == NULL) return Qnil;
jint len = JvGetStringUTFLength(string);
char buf[len];
JvGetStringUTFRegion(string, 0, len, buf);
return rb_str_new(buf, len);
}
// convert a Ruby string into a Java string
static String *r2j(VALUE string) {
return JvNewStringUTF(RSTRING(string)->ptr);
}
// release the Java Document associated with this Ruby Document
static void vnu_document_free(Document *doc) {
DomUtils::unpin(doc);
}
// Nu::Validator::parse( string|file )
static VALUE vnu_parse(VALUE self, VALUE input) {
HtmlDocumentBuilder *parser = new HtmlDocumentBuilder();
// read file-like objects into memory. TODO: buffer such objects
if (rb_respond_to(input, ID_read))
input = rb_funcall(input, ID_read, 0);
// convert input into a ByteArrayInputStream
jbyteArray bytes = JvNewByteArray(RSTRING(input)->len);
memcpy(elements(bytes), RSTRING(input)->ptr, RSTRING(input)->len);
InputSource *source = new InputSource(new ByteArrayInputStream(bytes));
// parse, pin, and wrap
Document *doc = parser->parse(source);
DomUtils::pin(doc);
return Data_Wrap_Struct(jaxp_Document, NULL, vnu_document_free, doc);
}
// Jaxp::parse( string|file )
static VALUE jaxp_parse(VALUE self, VALUE input) {
DocumentBuilderFactory *factory = DocumentBuilderFactory::newInstance();
DocumentBuilder *parser = factory->newDocumentBuilder();
// read file-like objects into memory. TODO: buffer such objects
if (rb_respond_to(input, ID_read))
input = rb_funcall(input, ID_read, 0);
try {
jbyteArray bytes = JvNewByteArray(RSTRING(input)->len);
memcpy(elements(bytes), RSTRING(input)->ptr, RSTRING(input)->len);
Document *doc = parser->parse(new ByteArrayInputStream(bytes));
DomUtils::pin(doc);
return Data_Wrap_Struct(jaxp_Document, NULL, vnu_document_free, doc);
} catch (java::lang::Throwable *ex) {
ex->printStackTrace();
return Qnil;
}
}
// Jaxp::Document#encoding (also the class of documents returned by Nu::Validator::parse)
static VALUE jaxp_document_encoding(VALUE rdoc) {
Document *jdoc;
Data_Get_Struct(rdoc, Document, jdoc);
return j2r(jdoc->getXmlEncoding());
}
// Jaxp::Document#root
static VALUE jaxp_document_root(VALUE rdoc) {
Document *jdoc;
Data_Get_Struct(rdoc, Document, jdoc);
Element *jelement = jdoc->getDocumentElement();
if (jelement==NULL) return Qnil;
VALUE relement = Data_Wrap_Struct(jaxp_Element, NULL, NULL, jelement);
rb_ivar_set(relement, ID_doc, rdoc);
return relement;
}
// Jaxp::Document#xpath
static VALUE jaxp_document_xpath(VALUE rdoc, VALUE path) {
Document *jdoc;
Data_Get_Struct(rdoc, Document, jdoc);
Element *jelement = jdoc->getDocumentElement();
if (jelement==NULL) return Qnil;
XPath *xpath = XPathFactory::newInstance()->newXPath();
XPathExpression *expr = xpath->compile(r2j(path));
NodeList *list = (NodeList*) expr->evaluate(jdoc, XPathConstants::NODESET);
VALUE result = rb_ary_new();
for (int i=0; i<list->getLength(); i++) {
VALUE relement = Data_Wrap_Struct(jaxp_Element, NULL, NULL, list->item(i));
rb_ivar_set(relement, ID_doc, rdoc);
rb_ary_push(result, relement);
}
return result;
}
// Jaxp::Element#name
static VALUE jaxp_element_name(VALUE relement) {
Element *jelement;
Data_Get_Struct(relement, Element, jelement);
return j2r(jelement->getNodeName());
}
// Jaxp::Element#attributes
static VALUE jaxp_element_attributes(VALUE relement) {
Element *jelement;
Data_Get_Struct(relement, Element, jelement);
VALUE result = rb_hash_new();
NamedNodeMap *map = jelement->getAttributes();
for (int i=0; i<map->getLength(); i++) {
Attr *jattr = (Attr *) map->item(i);
VALUE rattr = Data_Wrap_Struct(jaxp_Attr, NULL, NULL, jattr);
rb_ivar_set(rattr, ID_element, relement);
rb_hash_aset(result, j2r(jattr->getName()), rattr);
}
return result;
}
// Jaxp::Attr#value
static VALUE jaxp_attribute_value(VALUE rattribute) {
Attr *jattribute;
Data_Get_Struct(rattribute, Attr, jattribute);
return j2r(jattribute->getValue());
}
typedef VALUE (ruby_method)(...);
// Nu::Validator module initialization
extern "C" void Init_validator() {
JvCreateJavaVM(NULL);
JvAttachCurrentThread(NULL, NULL);
JvInitClass(&DomUtils::class$);
JvInitClass(&XPathFactory::class$);
JvInitClass(&XPathConstants::class$);
VALUE jaxp = rb_define_module("Jaxp");
rb_define_singleton_method(jaxp, "parse", (ruby_method*)&jaxp_parse, 1);
VALUE nu = rb_define_module("Nu");
VALUE validator = rb_define_module_under(nu, "Validator");
rb_define_singleton_method(validator, "parse", (ruby_method*)&vnu_parse, 1);
jaxp_Document = rb_define_class_under(jaxp, "Document", rb_cObject);
rb_define_method(jaxp_Document, "encoding",
(ruby_method*)&jaxp_document_encoding, 0);
rb_define_method(jaxp_Document, "root",
(ruby_method*)&jaxp_document_root, 0);
rb_define_method(jaxp_Document, "xpath",
(ruby_method*)&jaxp_document_xpath, 1);
jaxp_Element = rb_define_class_under(jaxp, "Element", rb_cObject);
rb_define_method(jaxp_Element, "name",
(ruby_method*)&jaxp_element_name, 0);
rb_define_method(jaxp_Element, "attributes",
(ruby_method*)&jaxp_element_attributes, 0);
jaxp_Attr = rb_define_class_under(jaxp, "Attr", rb_cObject);
rb_define_method(jaxp_Attr, "value",
(ruby_method*)&jaxp_attribute_value, 0);
ID_read = rb_intern("read");
ID_doc = rb_intern("@doc");
ID_element = rb_intern("@element");
}


@ -0,0 +1,59 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Big5 extends Encoding {
private static final String[] LABELS = {
"big5",
"big5-hkscs",
"cn-big5",
"csbig5",
"x-x-big5"
};
private static final String NAME = "big5";
static final Big5 INSTANCE = new Big5();
private Big5() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new Big5Decoder(this);
}
@Override public CharsetEncoder newEncoder() {
return new Big5Encoder(this);
}
}

File diff suppressed because one or more lines are too long


@ -0,0 +1,184 @@
/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CoderResult;
public class Big5Decoder extends Decoder {
private int big5Lead = 0;
private char pendingTrail = '\u0000';
protected Big5Decoder(Charset cs) {
super(cs, 0.5f, 1.0f);
}
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
assert !(this.report && (big5Lead != 0)):
"When reporting, this method should never return with big5Lead set.";
if (pendingTrail != '\u0000') {
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
out.put(pendingTrail);
pendingTrail = '\u0000';
}
for (;;) {
if (!in.hasRemaining()) {
return CoderResult.UNDERFLOW;
}
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
int b = ((int) in.get() & 0xFF);
if (big5Lead == 0) {
if (b <= 0x7F) {
out.put((char) b);
continue;
}
if (b >= 0x81 && b <= 0xFE) {
if (this.report && !in.hasRemaining()) {
// The Java API is badly documented. Need to do this
// crazy thing and hope the caller knows about the
// undocumented aspects of the API!
in.position(in.position() - 1);
return CoderResult.UNDERFLOW;
}
big5Lead = b;
continue;
}
if (this.report) {
in.position(in.position() - 1);
return CoderResult.malformedForLength(1);
}
out.put('\uFFFD');
continue;
}
int lead = big5Lead;
big5Lead = 0;
int offset = (b < 0x7F) ? 0x40 : 0x62;
if ((b >= 0x40 && b <= 0x7E) || (b >= 0xA1 && b <= 0xFE)) {
int pointer = (lead - 0x81) * 157 + (b - offset);
char outTrail;
switch (pointer) {
case 1133:
out.put('\u00CA');
outTrail = '\u0304';
break;
case 1135:
out.put('\u00CA');
outTrail = '\u030C';
break;
case 1164:
out.put('\u00EA');
outTrail = '\u0304';
break;
case 1166:
out.put('\u00EA');
outTrail = '\u030C';
break;
default:
char lowBits = Big5Data.lowBits(pointer);
if (lowBits == '\u0000') {
// The following |if| block fixes
// https://github.com/whatwg/encoding/issues/5
if (b <= 0x7F) {
// prepend byte to stream
// Always legal, since we've always just read a byte
// if we come here.
in.position(in.position() - 1);
}
if (this.report) {
// This can go past the start of the buffer
// if the caller does not conform to the
// undocumented aspects of the API.
in.position(in.position() - 1);
return CoderResult.malformedForLength(b <= 0x7F ? 1 : 2);
}
out.put('\uFFFD');
continue;
}
if (Big5Data.isAstral(pointer)) {
int codePoint = lowBits | 0x20000;
out.put((char) (0xD7C0 + (codePoint >> 10)));
outTrail = (char) (0xDC00 + (codePoint & 0x3FF));
break;
}
out.put(lowBits);
continue;
}
if (!out.hasRemaining()) {
pendingTrail = outTrail;
return CoderResult.OVERFLOW;
}
out.put(outTrail);
continue;
}
// pointer is null
if (b <= 0x7F) {
// prepend byte to stream
// Always legal, since we've always just read a byte
// if we come here.
in.position(in.position() - 1);
}
if (this.report) {
// if position() == 0, the caller is not using the
// undocumented part of the API right and the line
// below will throw!
in.position(in.position() - 1);
return CoderResult.malformedForLength(b <= 0x7F ? 1 : 2);
}
out.put('\uFFFD');
continue;
}
}
@Override protected CoderResult implFlush(CharBuffer out) {
if (pendingTrail != '\u0000') {
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
out.put(pendingTrail);
pendingTrail = '\u0000';
}
if (big5Lead != 0) {
assert !this.report: "How come big5Lead got to be non-zero when decodeLoop() returned in the reporting mode?";
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
out.put('\uFFFD');
big5Lead = 0;
}
return CoderResult.UNDERFLOW;
}
@Override protected void implReset() {
big5Lead = 0;
pendingTrail = '\u0000';
}
}
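
The Plane 2 handling in the decoder above (OR-ing `0x20000` into the low 16 bits, then splitting into a surrogate pair) and the matching recombination in the encoder can be checked in isolation. The following is a minimal sketch, not part of the commit: the class name `Plane2SurrogateSketch` and its method names are hypothetical, and only the arithmetic is taken from `Big5Decoder` and `Big5Encoder`.

```java
// Hypothetical standalone sketch; only the arithmetic mirrors the sources above.
final class Plane2SurrogateSketch {
    // Decoder direction: the data table stores only the low 16 bits of a
    // Plane 2 code point, so the plane is restored before splitting.
    public static char[] toSurrogates(char lowBits) {
        int codePoint = lowBits | 0x20000;
        // 0xD7C0 is 0xD800 - (0x10000 >> 10), so no separate subtraction is needed.
        char high = (char) (0xD7C0 + (codePoint >> 10));
        char low = (char) (0xDC00 + (codePoint & 0x3FF));
        return new char[] { high, low };
    }

    // Encoder direction: 56613888 is (0xD800 << 10) + 0xDC00 - 0x10000.
    public static int toCodePoint(char high, char low) {
        return (high << 10) + low - 56613888;
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates((char) 0x0021);
        // Cast to int: %X does not accept char arguments.
        System.out.printf("%04X %04X -> U+%X%n",
                (int) pair[0], (int) pair[1], toCodePoint(pair[0], pair[1]));
    }
}
```

The magic constants fold the usual "subtract 0x10000" step into a single addition, which is why neither class above performs that subtraction explicitly.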


@ -0,0 +1,185 @@
/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CoderResult;
public class Big5Encoder extends Encoder {
private char utf16Lead = '\u0000';
private byte pendingTrail = 0;
protected Big5Encoder(Charset cs) {
super(cs, 1.5f, 2.0f);
}
@Override protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
assert !((this.reportMalformed || this.reportUnmappable) && (utf16Lead != '\u0000')):
"When reporting, this method should never return with utf16Lead set.";
if (pendingTrail != 0) {
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
out.put(pendingTrail);
pendingTrail = 0;
}
for (;;) {
if (!in.hasRemaining()) {
return CoderResult.UNDERFLOW;
}
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
boolean isAstral; // true means Plane 2, false means BMP
char lowBits; // The low 16 bits of the code point
char codeUnit = in.get();
int highBits = (codeUnit & 0xFC00);
if (highBits == 0xD800) {
// high surrogate
if (utf16Lead != '\u0000') {
// High surrogate follows another high surrogate. The
// *previous* code unit is in error.
if (this.reportMalformed) {
// The caller had better adhere to the API contract.
// Otherwise, this may throw.
in.position(in.position() - 2);
utf16Lead = '\u0000';
return CoderResult.malformedForLength(1);
}
out.put((byte) '?');
}
utf16Lead = codeUnit;
continue;
}
if (highBits == 0xDC00) {
// low surrogate
if (utf16Lead == '\u0000') {
// Got low surrogate without a previous high surrogate
if (this.reportMalformed) {
in.position(in.position() - 1);
return CoderResult.malformedForLength(1);
}
out.put((byte) '?');
continue;
}
int codePoint = (utf16Lead << 10) + codeUnit - 56613888;
utf16Lead = '\u0000';
// Plane 2 is the only astral plane that has potentially
// Big5-encodable characters.
if ((0xFF0000 & codePoint) != 0x20000) {
if (this.reportUnmappable) {
in.position(in.position() - 2);
return CoderResult.unmappableForLength(2);
}
out.put((byte) '?');
continue;
}
isAstral = true;
lowBits = (char)(codePoint & 0xFFFF);
} else {
// not a surrogate
if (utf16Lead != '\u0000') {
// Non-surrogate follows a high surrogate. The *previous*
// code unit is in error.
utf16Lead = '\u0000';
if (this.reportMalformed) {
// The caller had better adhere to the API contract.
// Otherwise, this may throw.
in.position(in.position() - 2);
return CoderResult.malformedForLength(1);
}
out.put((byte) '?');
// Let's unconsume this code unit and reloop in order to
// re-check if the output buffer still has space.
in.position(in.position() - 1);
continue;
}
isAstral = false;
lowBits = codeUnit;
}
// isAstral now tells us if we have a Plane 2 or a BMP character.
// lowBits tells us the low 16 bits.
// After all the above setup to deal with UTF-16, we are now
// finally ready to follow the spec.
if (!isAstral && lowBits <= 0x7F) {
out.put((byte)lowBits);
continue;
}
int pointer = Big5Data.findPointer(lowBits, isAstral);
if (pointer == 0) {
if (this.reportUnmappable) {
if (isAstral) {
in.position(in.position() - 2);
return CoderResult.unmappableForLength(2);
}
in.position(in.position() - 1);
return CoderResult.unmappableForLength(1);
}
out.put((byte)'?');
continue;
}
int lead = pointer / 157 + 0x81;
int trail = pointer % 157;
if (trail < 0x3F) {
trail += 0x40;
} else {
trail += 0x62;
}
out.put((byte)lead);
if (!out.hasRemaining()) {
pendingTrail = (byte)trail;
return CoderResult.OVERFLOW;
}
out.put((byte)trail);
continue;
}
}
@Override protected CoderResult implFlush(ByteBuffer out) {
if (pendingTrail != 0) {
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
out.put(pendingTrail);
pendingTrail = 0;
}
if (utf16Lead != '\u0000') {
assert !this.reportMalformed: "How come utf16Lead got to be non-zero when encodeLoop() returned in the reporting mode?";
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
out.put((byte)'?');
utf16Lead = '\u0000';
}
return CoderResult.UNDERFLOW;
}
@Override protected void implReset() {
utf16Lead = '\u0000';
pendingTrail = 0;
}
}
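
The pointer arithmetic shared by the encoder above and the decoder before it (157 trail positions per lead byte, with a gap between `0x7E` and `0xA1`) is easy to get wrong by one. The following is a minimal round-trip sketch, not part of the commit: the class `Big5PointerSketch` and its method names are hypothetical, and it performs no table lookup, only the pointer-to-byte mapping.

```java
// Hypothetical standalone sketch of the Big5 pointer <-> byte-pair mapping.
final class Big5PointerSketch {
    // Encoder direction: lead byte for a pointer.
    public static int leadByte(int pointer) {
        return pointer / 157 + 0x81;
    }

    // Encoder direction: trail offsets below 0x3F land in 0x40..0x7E,
    // the remaining ones land in 0xA1..0xFE.
    public static int trailByte(int pointer) {
        int trail = pointer % 157;
        return trail < 0x3F ? trail + 0x40 : trail + 0x62;
    }

    // Decoder direction: returns -1 when the trail byte is outside both ranges.
    public static int pointer(int lead, int trail) {
        boolean validTrail = (trail >= 0x40 && trail <= 0x7E)
                || (trail >= 0xA1 && trail <= 0xFE);
        if (!validTrail) {
            return -1;
        }
        int offset = trail < 0x7F ? 0x40 : 0x62;
        return (lead - 0x81) * 157 + (trail - offset);
    }

    public static void main(String[] args) {
        int p = 1133; // one of the pointers special-cased in the decoder's switch
        int lead = leadByte(p);
        int trail = trailByte(p);
        System.out.printf("pointer %d -> 0x%02X 0x%02X -> pointer %d%n",
                p, lead, trail, pointer(lead, trail));
    }
}
```

Checking that every pointer survives the round trip is a cheap way to confirm that the `0x3F` threshold in the encoder and the `0x7F` threshold in the decoder describe the same split.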


@ -0,0 +1,80 @@
/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
public abstract class Decoder extends CharsetDecoder {
protected boolean report = true;
protected Decoder(Charset cs, float averageCharsPerByte, float maxCharsPerByte) {
super(cs, averageCharsPerByte, maxCharsPerByte);
}
@Override protected final void implOnMalformedInput(CodingErrorAction newAction) {
if (newAction == null) {
throw new IllegalArgumentException("The argument must not be null.");
}
if (newAction == CodingErrorAction.IGNORE) {
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
}
if (newAction == CodingErrorAction.REPLACE) {
this.report = false;
return;
}
if (newAction == CodingErrorAction.REPORT) {
this.report = true;
return;
}
assert false: "Unreachable.";
throw new IllegalArgumentException("Unknown CodingErrorAction.");
}
@Override protected final void implOnUnmappableCharacter(
CodingErrorAction newAction) {
if (newAction == null) {
throw new IllegalArgumentException("The argument must not be null.");
}
if (newAction == CodingErrorAction.IGNORE) {
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
}
if (newAction == CodingErrorAction.REPLACE) {
return; // We don't actually care, since there are no unmappables.
}
if (newAction == CodingErrorAction.REPORT) {
return; // We don't actually care, since there are no unmappables.
}
assert false: "Unreachable.";
throw new IllegalArgumentException("Unknown CodingErrorAction.");
}
@Override protected final void implReplaceWith(String newReplacement) {
if (!"\uFFFD".equals(newReplacement)) {
throw new IllegalArgumentException("Only U+FFFD is allowed as the replacement.");
}
}
// TODO: Check if the JDK decoders reset the reporting state on reset()
}
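
The `Decoder` base class above accepts only `CodingErrorAction.REPORT` and `CodingErrorAction.REPLACE`, and maps them onto its `report` flag. As a rough illustration of what those two modes mean to a caller, the sketch below drives the JDK's own UTF-8 decoder through the standard `CharsetDecoder` API. This is an assumption made for the sake of a self-contained example: the classes from this commit are not used, and `ErrorActionSketch` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; uses the JDK UTF-8 decoder as a stand-in.
final class ErrorActionSketch {
    private static String decode(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // REPLACE mode: malformed input becomes U+FFFD and decoding continues.
    public static String decodeReplacing(byte[] bytes) {
        try {
            return decode(bytes, CodingErrorAction.REPLACE);
        } catch (CharacterCodingException e) {
            throw new IllegalStateException("not expected in REPLACE mode", e);
        }
    }

    // REPORT mode: malformed input surfaces as an exception to the caller.
    public static boolean failsWhenReporting(byte[] bytes) {
        try {
            decode(bytes, CodingErrorAction.REPORT);
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] bad = { 'A', (byte) 0xFF, 'B' }; // 0xFF never occurs in valid UTF-8
        System.out.println(decodeReplacing(bad).length());
        System.out.println(failsWhenReporting(bad));
    }
}
```

A caller of the classes in this commit would presumably select the mode the same way, through `onMalformedInput()`, since `implOnMalformedInput()` is the hook that the JDK invokes from it.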


@ -0,0 +1,95 @@
/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
public abstract class Encoder extends CharsetEncoder {
boolean reportMalformed = true;
boolean reportUnmappable = true;
protected Encoder(Charset cs, float averageBytesPerChar,
float maxBytesPerChar) {
super(cs, averageBytesPerChar, maxBytesPerChar);
}
@Override protected final void implOnMalformedInput(CodingErrorAction newAction) {
if (newAction == null) {
throw new IllegalArgumentException("The argument must not be null.");
}
if (newAction == CodingErrorAction.IGNORE) {
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
}
if (newAction == CodingErrorAction.REPLACE) {
this.reportMalformed = false;
return;
}
if (newAction == CodingErrorAction.REPORT) {
this.reportMalformed = true;
return;
}
assert false: "Unreachable.";
throw new IllegalArgumentException("Unknown CodingErrorAction.");
}
@Override protected final void implOnUnmappableCharacter(
CodingErrorAction newAction) {
if (newAction == null) {
throw new IllegalArgumentException("The argument must not be null.");
}
if (newAction == CodingErrorAction.IGNORE) {
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
}
if (newAction == CodingErrorAction.REPLACE) {
this.reportUnmappable = false;
return;
}
if (newAction == CodingErrorAction.REPORT) {
this.reportUnmappable = true;
return;
}
assert false: "Unreachable.";
throw new IllegalArgumentException("Unknown CodingErrorAction.");
}
@Override public boolean isLegalReplacement(byte[] repl) {
if (repl == null) {
return false;
}
if (repl.length != 1) {
return false;
}
if (repl[0] != '?') {
return false;
}
return true;
}
@Override protected final void implReplaceWith(byte[] newReplacement) {
}
}


@ -0,0 +1,886 @@
/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.nio.charset.spi.CharsetProvider;
import java.util.Arrays;
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
/**
* Represents an <a href="https://encoding.spec.whatwg.org/#encoding">encoding</a>
* as defined in the <a href="https://encoding.spec.whatwg.org/">Encoding
* Standard</a>, provides access to each encoding defined in the Encoding
* Standard via a static constant and provides the
* "<a href="https://encoding.spec.whatwg.org/#concept-encoding-get">get an
* encoding</a>" algorithm defined in the Encoding Standard.
*
* <p>This class inherits from {@link Charset} to allow the Encoding
* Standard-compliant encodings to be used in contexts that support
* <code>Charset</code> instances. However, by design, the Encoding
* Standard-compliant encodings are not supplied via a {@link CharsetProvider}
* and, therefore, are not available via and do not interfere with the static
* methods provided by <code>Charset</code>. (This class provides methods of
* the same name to hide each static method of <code>Charset</code> to help
* avoid accidental calls to the static methods of the superclass when working
* with Encoding Standard-compliant encodings.)
*
* <p>When an application needs to use a particular encoding, such as utf-8
* or windows-1252, the corresponding constant, i.e.
* {@link #UTF_8 Encoding.UTF_8} and {@link #WINDOWS_1252 Encoding.WINDOWS_1252}
* respectively, should be used. However, when the application receives an
* encoding label from external input, the method {@link #forName(String)
* forName()} should be used to obtain the object representing the encoding
* identified by the label. In contexts where labels that map to the
* <a href="https://encoding.spec.whatwg.org/#replacement">replacement
* encoding</a> should be treated as unknown, the method {@link
* #forNameNoReplacement(String) forNameNoReplacement()} should be used instead.
*
* @author hsivonen
*/
public abstract class Encoding extends Charset {
private static final String[] LABELS = {
"866",
"ansi_x3.4-1968",
"arabic",
"ascii",
"asmo-708",
"big5",
"big5-hkscs",
"chinese",
"cn-big5",
"cp1250",
"cp1251",
"cp1252",
"cp1253",
"cp1254",
"cp1255",
"cp1256",
"cp1257",
"cp1258",
"cp819",
"cp866",
"csbig5",
"cseuckr",
"cseucpkdfmtjapanese",
"csgb2312",
"csibm866",
"csiso2022jp",
"csiso2022kr",
"csiso58gb231280",
"csiso88596e",
"csiso88596i",
"csiso88598e",
"csiso88598i",
"csisolatin1",
"csisolatin2",
"csisolatin3",
"csisolatin4",
"csisolatin5",
"csisolatin6",
"csisolatin9",
"csisolatinarabic",
"csisolatincyrillic",
"csisolatingreek",
"csisolatinhebrew",
"cskoi8r",
"csksc56011987",
"csmacintosh",
"csshiftjis",
"cyrillic",
"dos-874",
"ecma-114",
"ecma-118",
"elot_928",
"euc-jp",
"euc-kr",
"gb18030",
"gb2312",
"gb_2312",
"gb_2312-80",
"gbk",
"greek",
"greek8",
"hebrew",
"hz-gb-2312",
"ibm819",
"ibm866",
"iso-2022-cn",
"iso-2022-cn-ext",
"iso-2022-jp",
"iso-2022-kr",
"iso-8859-1",
"iso-8859-10",
"iso-8859-11",
"iso-8859-13",
"iso-8859-14",
"iso-8859-15",
"iso-8859-16",
"iso-8859-2",
"iso-8859-3",
"iso-8859-4",
"iso-8859-5",
"iso-8859-6",
"iso-8859-6-e",
"iso-8859-6-i",
"iso-8859-7",
"iso-8859-8",
"iso-8859-8-e",
"iso-8859-8-i",
"iso-8859-9",
"iso-ir-100",
"iso-ir-101",
"iso-ir-109",
"iso-ir-110",
"iso-ir-126",
"iso-ir-127",
"iso-ir-138",
"iso-ir-144",
"iso-ir-148",
"iso-ir-149",
"iso-ir-157",
"iso-ir-58",
"iso8859-1",
"iso8859-10",
"iso8859-11",
"iso8859-13",
"iso8859-14",
"iso8859-15",
"iso8859-2",
"iso8859-3",
"iso8859-4",
"iso8859-5",
"iso8859-6",
"iso8859-7",
"iso8859-8",
"iso8859-9",
"iso88591",
"iso885910",
"iso885911",
"iso885913",
"iso885914",
"iso885915",
"iso88592",
"iso88593",
"iso88594",
"iso88595",
"iso88596",
"iso88597",
"iso88598",
"iso88599",
"iso_8859-1",
"iso_8859-15",
"iso_8859-1:1987",
"iso_8859-2",
"iso_8859-2:1987",
"iso_8859-3",
"iso_8859-3:1988",
"iso_8859-4",
"iso_8859-4:1988",
"iso_8859-5",
"iso_8859-5:1988",
"iso_8859-6",
"iso_8859-6:1987",
"iso_8859-7",
"iso_8859-7:1987",
"iso_8859-8",
"iso_8859-8:1988",
"iso_8859-9",
"iso_8859-9:1989",
"koi",
"koi8",
"koi8-r",
"koi8-ru",
"koi8-u",
"koi8_r",
"korean",
"ks_c_5601-1987",
"ks_c_5601-1989",
"ksc5601",
"ksc_5601",
"l1",
"l2",
"l3",
"l4",
"l5",
"l6",
"l9",
"latin1",
"latin2",
"latin3",
"latin4",
"latin5",
"latin6",
"logical",
"mac",
"macintosh",
"ms932",
"ms_kanji",
"shift-jis",
"shift_jis",
"sjis",
"sun_eu_greek",
"tis-620",
"unicode-1-1-utf-8",
"us-ascii",
"utf-16",
"utf-16be",
"utf-16le",
"utf-8",
"utf8",
"visual",
"windows-1250",
"windows-1251",
"windows-1252",
"windows-1253",
"windows-1254",
"windows-1255",
"windows-1256",
"windows-1257",
"windows-1258",
"windows-31j",
"windows-874",
"windows-949",
"x-cp1250",
"x-cp1251",
"x-cp1252",
"x-cp1253",
"x-cp1254",
"x-cp1255",
"x-cp1256",
"x-cp1257",
"x-cp1258",
"x-euc-jp",
"x-gbk",
"x-mac-cyrillic",
"x-mac-roman",
"x-mac-ukrainian",
"x-sjis",
"x-user-defined",
"x-x-big5",
};
private static final Encoding[] ENCODINGS_FOR_LABELS = {
Ibm866.INSTANCE,
Windows1252.INSTANCE,
Iso6.INSTANCE,
Windows1252.INSTANCE,
Iso6.INSTANCE,
Big5.INSTANCE,
Big5.INSTANCE,
Gbk.INSTANCE,
Big5.INSTANCE,
Windows1250.INSTANCE,
Windows1251.INSTANCE,
Windows1252.INSTANCE,
Windows1253.INSTANCE,
Windows1254.INSTANCE,
Windows1255.INSTANCE,
Windows1256.INSTANCE,
Windows1257.INSTANCE,
Windows1258.INSTANCE,
Windows1252.INSTANCE,
Ibm866.INSTANCE,
Big5.INSTANCE,
EucKr.INSTANCE,
EucJp.INSTANCE,
Gbk.INSTANCE,
Ibm866.INSTANCE,
Iso2022Jp.INSTANCE,
Replacement.INSTANCE,
Gbk.INSTANCE,
Iso6.INSTANCE,
Iso6.INSTANCE,
Iso8.INSTANCE,
Iso8I.INSTANCE,
Windows1252.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Windows1254.INSTANCE,
Iso10.INSTANCE,
Iso15.INSTANCE,
Iso6.INSTANCE,
Iso5.INSTANCE,
Iso7.INSTANCE,
Iso8.INSTANCE,
Koi8R.INSTANCE,
EucKr.INSTANCE,
Macintosh.INSTANCE,
ShiftJis.INSTANCE,
Iso5.INSTANCE,
Windows874.INSTANCE,
Iso6.INSTANCE,
Iso7.INSTANCE,
Iso7.INSTANCE,
EucJp.INSTANCE,
EucKr.INSTANCE,
Gb18030.INSTANCE,
Gbk.INSTANCE,
Gbk.INSTANCE,
Gbk.INSTANCE,
Gbk.INSTANCE,
Iso7.INSTANCE,
Iso7.INSTANCE,
Iso8.INSTANCE,
Replacement.INSTANCE,
Windows1252.INSTANCE,
Ibm866.INSTANCE,
Replacement.INSTANCE,
Replacement.INSTANCE,
Iso2022Jp.INSTANCE,
Replacement.INSTANCE,
Windows1252.INSTANCE,
Iso10.INSTANCE,
Windows874.INSTANCE,
Iso13.INSTANCE,
Iso14.INSTANCE,
Iso15.INSTANCE,
Iso16.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Iso5.INSTANCE,
Iso6.INSTANCE,
Iso6.INSTANCE,
Iso6.INSTANCE,
Iso7.INSTANCE,
Iso8.INSTANCE,
Iso8.INSTANCE,
Iso8I.INSTANCE,
Windows1254.INSTANCE,
Windows1252.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Iso7.INSTANCE,
Iso6.INSTANCE,
Iso8.INSTANCE,
Iso5.INSTANCE,
Windows1254.INSTANCE,
EucKr.INSTANCE,
Iso10.INSTANCE,
Gbk.INSTANCE,
Windows1252.INSTANCE,
Iso10.INSTANCE,
Windows874.INSTANCE,
Iso13.INSTANCE,
Iso14.INSTANCE,
Iso15.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Iso5.INSTANCE,
Iso6.INSTANCE,
Iso7.INSTANCE,
Iso8.INSTANCE,
Windows1254.INSTANCE,
Windows1252.INSTANCE,
Iso10.INSTANCE,
Windows874.INSTANCE,
Iso13.INSTANCE,
Iso14.INSTANCE,
Iso15.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Iso5.INSTANCE,
Iso6.INSTANCE,
Iso7.INSTANCE,
Iso8.INSTANCE,
Windows1254.INSTANCE,
Windows1252.INSTANCE,
Iso15.INSTANCE,
Windows1252.INSTANCE,
Iso2.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Iso4.INSTANCE,
Iso5.INSTANCE,
Iso5.INSTANCE,
Iso6.INSTANCE,
Iso6.INSTANCE,
Iso7.INSTANCE,
Iso7.INSTANCE,
Iso8.INSTANCE,
Iso8.INSTANCE,
Windows1254.INSTANCE,
Windows1254.INSTANCE,
Koi8R.INSTANCE,
Koi8R.INSTANCE,
Koi8R.INSTANCE,
Koi8U.INSTANCE,
Koi8U.INSTANCE,
Koi8R.INSTANCE,
EucKr.INSTANCE,
EucKr.INSTANCE,
EucKr.INSTANCE,
EucKr.INSTANCE,
EucKr.INSTANCE,
Windows1252.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Windows1254.INSTANCE,
Iso10.INSTANCE,
Iso15.INSTANCE,
Windows1252.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Windows1254.INSTANCE,
Iso10.INSTANCE,
Iso8I.INSTANCE,
Macintosh.INSTANCE,
Macintosh.INSTANCE,
ShiftJis.INSTANCE,
ShiftJis.INSTANCE,
ShiftJis.INSTANCE,
ShiftJis.INSTANCE,
ShiftJis.INSTANCE,
Iso7.INSTANCE,
Windows874.INSTANCE,
Utf8.INSTANCE,
Windows1252.INSTANCE,
Utf16Le.INSTANCE,
Utf16Be.INSTANCE,
Utf16Le.INSTANCE,
Utf8.INSTANCE,
Utf8.INSTANCE,
Iso8.INSTANCE,
Windows1250.INSTANCE,
Windows1251.INSTANCE,
Windows1252.INSTANCE,
Windows1253.INSTANCE,
Windows1254.INSTANCE,
Windows1255.INSTANCE,
Windows1256.INSTANCE,
Windows1257.INSTANCE,
Windows1258.INSTANCE,
ShiftJis.INSTANCE,
Windows874.INSTANCE,
EucKr.INSTANCE,
Windows1250.INSTANCE,
Windows1251.INSTANCE,
Windows1252.INSTANCE,
Windows1253.INSTANCE,
Windows1254.INSTANCE,
Windows1255.INSTANCE,
Windows1256.INSTANCE,
Windows1257.INSTANCE,
Windows1258.INSTANCE,
EucJp.INSTANCE,
Gbk.INSTANCE,
MacCyrillic.INSTANCE,
Macintosh.INSTANCE,
MacCyrillic.INSTANCE,
ShiftJis.INSTANCE,
UserDefined.INSTANCE,
Big5.INSTANCE,
};
private static final Encoding[] ENCODINGS = {
Big5.INSTANCE,
EucJp.INSTANCE,
EucKr.INSTANCE,
Gb18030.INSTANCE,
Gbk.INSTANCE,
Ibm866.INSTANCE,
Iso2022Jp.INSTANCE,
Iso10.INSTANCE,
Iso13.INSTANCE,
Iso14.INSTANCE,
Iso15.INSTANCE,
Iso16.INSTANCE,
Iso2.INSTANCE,
Iso3.INSTANCE,
Iso4.INSTANCE,
Iso5.INSTANCE,
Iso6.INSTANCE,
Iso7.INSTANCE,
Iso8.INSTANCE,
Iso8I.INSTANCE,
Koi8R.INSTANCE,
Koi8U.INSTANCE,
Macintosh.INSTANCE,
Replacement.INSTANCE,
ShiftJis.INSTANCE,
Utf16Be.INSTANCE,
Utf16Le.INSTANCE,
Utf8.INSTANCE,
Windows1250.INSTANCE,
Windows1251.INSTANCE,
Windows1252.INSTANCE,
Windows1253.INSTANCE,
Windows1254.INSTANCE,
Windows1255.INSTANCE,
Windows1256.INSTANCE,
Windows1257.INSTANCE,
Windows1258.INSTANCE,
Windows874.INSTANCE,
MacCyrillic.INSTANCE,
UserDefined.INSTANCE,
};
/**
* The big5 encoding.
*/
public static final Encoding BIG5 = Big5.INSTANCE;
/**
* The euc-jp encoding.
*/
public static final Encoding EUC_JP = EucJp.INSTANCE;
/**
* The euc-kr encoding.
*/
public static final Encoding EUC_KR = EucKr.INSTANCE;
/**
* The gb18030 encoding.
*/
public static final Encoding GB18030 = Gb18030.INSTANCE;
/**
* The gbk encoding.
*/
public static final Encoding GBK = Gbk.INSTANCE;
/**
* The ibm866 encoding.
*/
public static final Encoding IBM866 = Ibm866.INSTANCE;
/**
* The iso-2022-jp encoding.
*/
public static final Encoding ISO_2022_JP = Iso2022Jp.INSTANCE;
/**
* The iso-8859-10 encoding.
*/
public static final Encoding ISO_8859_10 = Iso10.INSTANCE;
/**
* The iso-8859-13 encoding.
*/
public static final Encoding ISO_8859_13 = Iso13.INSTANCE;
/**
* The iso-8859-14 encoding.
*/
public static final Encoding ISO_8859_14 = Iso14.INSTANCE;
/**
* The iso-8859-15 encoding.
*/
public static final Encoding ISO_8859_15 = Iso15.INSTANCE;
/**
* The iso-8859-16 encoding.
*/
public static final Encoding ISO_8859_16 = Iso16.INSTANCE;
/**
* The iso-8859-2 encoding.
*/
public static final Encoding ISO_8859_2 = Iso2.INSTANCE;
/**
* The iso-8859-3 encoding.
*/
public static final Encoding ISO_8859_3 = Iso3.INSTANCE;
/**
* The iso-8859-4 encoding.
*/
public static final Encoding ISO_8859_4 = Iso4.INSTANCE;
/**
* The iso-8859-5 encoding.
*/
public static final Encoding ISO_8859_5 = Iso5.INSTANCE;
/**
* The iso-8859-6 encoding.
*/
public static final Encoding ISO_8859_6 = Iso6.INSTANCE;
/**
* The iso-8859-7 encoding.
*/
public static final Encoding ISO_8859_7 = Iso7.INSTANCE;
/**
* The iso-8859-8 encoding.
*/
public static final Encoding ISO_8859_8 = Iso8.INSTANCE;
/**
* The iso-8859-8-i encoding.
*/
public static final Encoding ISO_8859_8_I = Iso8I.INSTANCE;
/**
* The koi8-r encoding.
*/
public static final Encoding KOI8_R = Koi8R.INSTANCE;
/**
* The koi8-u encoding.
*/
public static final Encoding KOI8_U = Koi8U.INSTANCE;
/**
* The macintosh encoding.
*/
public static final Encoding MACINTOSH = Macintosh.INSTANCE;
/**
* The replacement encoding.
*/
public static final Encoding REPLACEMENT = Replacement.INSTANCE;
/**
* The shift_jis encoding.
*/
public static final Encoding SHIFT_JIS = ShiftJis.INSTANCE;
/**
* The utf-16be encoding.
*/
public static final Encoding UTF_16BE = Utf16Be.INSTANCE;
/**
* The utf-16le encoding.
*/
public static final Encoding UTF_16LE = Utf16Le.INSTANCE;
/**
* The utf-8 encoding.
*/
public static final Encoding UTF_8 = Utf8.INSTANCE;
/**
* The windows-1250 encoding.
*/
public static final Encoding WINDOWS_1250 = Windows1250.INSTANCE;
/**
* The windows-1251 encoding.
*/
public static final Encoding WINDOWS_1251 = Windows1251.INSTANCE;
/**
* The windows-1252 encoding.
*/
public static final Encoding WINDOWS_1252 = Windows1252.INSTANCE;
/**
* The windows-1253 encoding.
*/
public static final Encoding WINDOWS_1253 = Windows1253.INSTANCE;
/**
* The windows-1254 encoding.
*/
public static final Encoding WINDOWS_1254 = Windows1254.INSTANCE;
/**
* The windows-1255 encoding.
*/
public static final Encoding WINDOWS_1255 = Windows1255.INSTANCE;
/**
* The windows-1256 encoding.
*/
public static final Encoding WINDOWS_1256 = Windows1256.INSTANCE;
/**
* The windows-1257 encoding.
*/
public static final Encoding WINDOWS_1257 = Windows1257.INSTANCE;
/**
* The windows-1258 encoding.
*/
public static final Encoding WINDOWS_1258 = Windows1258.INSTANCE;
/**
* The windows-874 encoding.
*/
public static final Encoding WINDOWS_874 = Windows874.INSTANCE;
/**
* The x-mac-cyrillic encoding.
*/
public static final Encoding X_MAC_CYRILLIC = MacCyrillic.INSTANCE;
/**
* The x-user-defined encoding.
*/
public static final Encoding X_USER_DEFINED = UserDefined.INSTANCE;
private static SortedMap<String, Charset> encodings = null;
protected Encoding(String canonicalName, String[] aliases) {
super(canonicalName, aliases);
}
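// Scanner states for the slow path of forName(): leading whitespace,
// inside the label, and trailing whitespace.
private enum State {
HEAD, LABEL, TAIL
}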
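/**
* Implements the "get an encoding" algorithm: returns the encoding for the
* given label after stripping leading and trailing ASCII whitespace and
* lowercasing ASCII letters.
*
* @throws IllegalArgumentException if the label is null
* @throws IllegalCharsetNameException if the label is empty or is not
* syntactically a label (for example, whitespace inside the label or a
* disallowed character)
* @throws UnsupportedCharsetException if the label names no known encoding
*/
public static Encoding forName(String label) {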
if (label == null) {
throw new IllegalArgumentException("Label must not be null.");
}
if (label.length() == 0) {
throw new IllegalCharsetNameException(label);
}
// Fast path: the label is already trimmed and lowercase, so it can be
// looked up directly in the sorted LABELS array.
int index = Arrays.binarySearch(LABELS, label);
if (index >= 0) {
return ENCODINGS_FOR_LABELS[index];
}
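// Slow path: skip leading and trailing ASCII whitespace, lowercase ASCII
// letters, and reject anything that cannot be part of a label.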
StringBuilder sb = new StringBuilder();
State state = State.HEAD;
for (int i = 0; i < label.length(); i++) {
char c = label.charAt(i);
if ((c == ' ') || (c == '\n') || (c == '\r') || (c == '\t')
|| (c == '\u000C')) {
if (state == State.LABEL) {
state = State.TAIL;
}
continue;
}
if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')) {
switch (state) {
case HEAD:
state = State.LABEL;
// Fall through
case LABEL:
sb.append(c);
continue;
case TAIL:
throw new IllegalCharsetNameException(label);
}
}
if (c >= 'A' && c <= 'Z') {
c += 0x20; // ASCII uppercase to lowercase
switch (state) {
case HEAD:
state = State.LABEL;
// Fall through
case LABEL:
sb.append(c);
continue;
case TAIL:
throw new IllegalCharsetNameException(label);
}
}
if ((c == '-') || (c == '+') || (c == '.') || (c == ':')
|| (c == '_')) {
switch (state) {
case LABEL:
sb.append(c);
continue;
case HEAD:
case TAIL:
throw new IllegalCharsetNameException(label);
}
}
throw new IllegalCharsetNameException(label);
}
index = Arrays.binarySearch(LABELS, sb.toString());
if (index >= 0) {
return ENCODINGS_FOR_LABELS[index];
}
throw new UnsupportedCharsetException(label);
}
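/**
* Like {@link #forName(String)}, but throws
* {@link UnsupportedCharsetException} for labels that map to the
* replacement encoding.
*/
public static Encoding forNameNoReplacement(String label) {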
Encoding encoding = Encoding.forName(label);
if (encoding == Encoding.REPLACEMENT) {
throw new UnsupportedCharsetException(label);
}
return encoding;
}
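/**
* Returns whether the label names a known encoding. As with
* {@link #forName(String)}, a null or syntactically invalid label throws
* instead of returning false.
*/
public static boolean isSupported(String label) {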
try {
Encoding.forName(label);
} catch (UnsupportedCharsetException e) {
return false;
}
return true;
}
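/**
* Like {@link #isSupported(String)}, but labels that map to the replacement
* encoding count as unsupported.
*/
public static boolean isSupportedNoReplacement(String label) {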
try {
Encoding.forNameNoReplacement(label);
} catch (UnsupportedCharsetException e) {
return false;
}
return true;
}
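/**
* Returns an unmodifiable map from canonical encoding name to encoding. The
* map is built lazily on the first call, without synchronization.
*/
public static SortedMap<String, Charset> availableCharsets() {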
if (encodings == null) {
TreeMap<String, Charset> map = new TreeMap<String, Charset>();
for (Encoding encoding : ENCODINGS) {
map.put(encoding.name(), encoding);
}
encodings = Collections.unmodifiableSortedMap(map);
}
return encodings;
}
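/**
* Always returns windows-1252. Hides {@link Charset#defaultCharset()} so that
* the JVM default is not picked up by accident.
*/
public static Encoding defaultCharset() {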
return WINDOWS_1252;
}
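// Encoding is unsupported by default. Subclasses such as EucJp override
// newEncoder() to delegate to the JDK charset of the same name.
@Override public boolean canEncode() {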
return false;
}
@Override public boolean contains(Charset cs) {
return false;
}
@Override public CharsetEncoder newEncoder() {
throw new UnsupportedOperationException("Encoder not implemented.");
}
}
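
The lookup in `forName()` above pairs a sorted label array with a parallel array of results. A minimal sketch of that pattern follows, assuming a hypothetical three-entry table and a simplified normalizer. `LabelLookupSketch`, `lookup` and `normalize` are illustrative names, not part of the imported source. Unlike the original, the sketch returns null for unknown labels instead of throwing and does not reject malformed labels.

```java
import java.util.Arrays;

// Illustrative sketch only: a tiny hypothetical table, not the real label list.
final class LabelLookupSketch {
    // Must stay sorted in String.compareTo order; Arrays.binarySearch relies on it.
    private static final String[] LABELS = { "latin1", "utf-8", "utf8" };
    // Parallel array: CANONICAL[i] is the encoding name for LABELS[i].
    private static final String[] CANONICAL = { "windows-1252", "utf-8", "utf-8" };

    /** Returns the canonical name for a label, or null if it is unknown. */
    static String lookup(String label) {
        // Fast path: the label is already trimmed and lowercase.
        int index = Arrays.binarySearch(LABELS, label);
        if (index < 0) {
            // Slow path: normalize, then search again.
            index = Arrays.binarySearch(LABELS, normalize(label));
        }
        return index >= 0 ? CANONICAL[index] : null;
    }

    /** Strips leading and trailing ASCII whitespace and lowercases ASCII letters. */
    static String normalize(String label) {
        int start = 0;
        int end = label.length();
        while (start < end && isAsciiWhitespace(label.charAt(start))) {
            start++;
        }
        while (end > start && isAsciiWhitespace(label.charAt(end - 1))) {
            end--;
        }
        StringBuilder sb = new StringBuilder(end - start);
        for (int i = start; i < end; i++) {
            char c = label.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                c += 0x20; // ASCII uppercase to lowercase
            }
            sb.append(c);
        }
        return sb.toString();
    }

    private static boolean isAsciiWhitespace(char c) {
        return c == ' ' || c == '\n' || c == '\r' || c == '\t' || c == '\u000C';
    }
}
```

The fast path costs one binary search for labels that are already in canonical form; only other inputs pay for the normalization pass. Keeping the two arrays in the same order is what makes the shared index valid, which is presumably why the real tables are best kept in step with each other rather than edited separately.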

@ -0,0 +1,57 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class EucJp extends Encoding {
private static final String[] LABELS = {
"cseucpkdfmtjapanese",
"euc-jp",
"x-euc-jp"
};
private static final String NAME = "euc-jp";
static final EucJp INSTANCE = new EucJp();
private EucJp() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return Charset.forName(NAME).newDecoder();
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}

@ -0,0 +1,64 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class EucKr extends Encoding {
private static final String[] LABELS = {
"cseuckr",
"csksc56011987",
"euc-kr",
"iso-ir-149",
"korean",
"ks_c_5601-1987",
"ks_c_5601-1989",
"ksc5601",
"ksc_5601",
"windows-949"
};
private static final String NAME = "euc-kr";
static final EucKr INSTANCE = new EucKr();
private EucKr() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return Charset.forName(NAME).newDecoder();
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}

@ -0,0 +1,61 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;
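/**
* Single-byte decoder for encodings whose upper half contains unmapped bytes,
* marked as U+FFFD in the table. When reporting is enabled, an unmapped byte
* is returned as a malformed-input result; otherwise decoding is delegated to
* {@link InfallibleSingleByteDecoder}, which emits the table entry as-is.
*/
public final class FallibleSingleByteDecoder extends InfallibleSingleByteDecoder {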
public FallibleSingleByteDecoder(Encoding cs, char[] upperHalf) {
super(cs, upperHalf);
}
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
if (!this.report) {
return super.decodeLoop(in, out);
} else {
for (;;) {
if (!in.hasRemaining()) {
return CoderResult.UNDERFLOW;
}
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
int b = (int) in.get();
if (b >= 0) {
out.put((char) b);
} else {
char mapped = this.upperHalf[b + 128];
if (mapped == '\uFFFD') {
// Unmapped byte: step back so the caller sees it as unconsumed.
in.position(in.position() - 1);
return CoderResult.malformedForLength(1);
}
out.put(mapped);
}
}
}
}
}
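
The reporting branch of `decodeLoop()` above can be sketched without the `java.nio` buffer machinery. The following is a simplified illustration under stated assumptions: `SingleByteDecodeSketch`, `sampleUpperHalf` and `decode` are hypothetical names, the table is a made-up one with a single mapped entry, and the return value (index of the first unmapped byte, or -1) stands in for the `CoderResult` that the real decoder returns.

```java
import java.util.Arrays;

// Illustrative sketch only: plain arrays instead of ByteBuffer/CharBuffer.
final class SingleByteDecodeSketch {
    /** Hypothetical 128-entry table: only byte 0x80 is mapped (to U+20AC). */
    static char[] sampleUpperHalf() {
        char[] table = new char[128];
        Arrays.fill(table, '\uFFFD'); // U+FFFD marks an unmapped byte
        table[0] = '\u20AC';          // index 0 corresponds to byte 0x80
        return table;
    }

    /**
     * Appends decoded chars to out. Returns -1 if every byte was decoded, or
     * the index of the first unmapped byte, which is left undecoded.
     */
    static int decode(byte[] in, char[] upperHalf, StringBuilder out) {
        for (int i = 0; i < in.length; i++) {
            int b = in[i]; // Java bytes are signed: 0x80..0xFF arrive as -128..-1
            if (b >= 0) {
                out.append((char) b); // ASCII range maps to itself
            } else {
                char mapped = upperHalf[b + 128];
                if (mapped == '\uFFFD') {
                    return i; // report the position instead of emitting U+FFFD
                }
                out.append(mapped);
            }
        }
        return -1;
    }
}
```

Returning the position of the offending byte plays the same role as rewinding `in.position()` by one in the original: the caller can tell exactly which byte was not consumed.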

@ -0,0 +1,55 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Gb18030 extends Encoding {
private static final String[] LABELS = {
"gb18030"
};
private static final String NAME = "gb18030";
static final Gb18030 INSTANCE = new Gb18030();
private Gb18030() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return Charset.forName(NAME).newDecoder();
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}

@ -0,0 +1,63 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Gbk extends Encoding {
private static final String[] LABELS = {
"chinese",
"csgb2312",
"csiso58gb231280",
"gb2312",
"gb_2312",
"gb_2312-80",
"gbk",
"iso-ir-58",
"x-gbk"
};
private static final String NAME = "gbk";
static final Gbk INSTANCE = new Gbk();
private Gbk() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
// The Encoding Standard defines gbk's decoder as gb18030's decoder.
return Charset.forName("gb18030").newDecoder();
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}

@ -0,0 +1,184 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Ibm866 extends Encoding {
private static final char[] TABLE = {
'\u0410',
'\u0411',
'\u0412',
'\u0413',
'\u0414',
'\u0415',
'\u0416',
'\u0417',
'\u0418',
'\u0419',
'\u041a',
'\u041b',
'\u041c',
'\u041d',
'\u041e',
'\u041f',
'\u0420',
'\u0421',
'\u0422',
'\u0423',
'\u0424',
'\u0425',
'\u0426',
'\u0427',
'\u0428',
'\u0429',
'\u042a',
'\u042b',
'\u042c',
'\u042d',
'\u042e',
'\u042f',
'\u0430',
'\u0431',
'\u0432',
'\u0433',
'\u0434',
'\u0435',
'\u0436',
'\u0437',
'\u0438',
'\u0439',
'\u043a',
'\u043b',
'\u043c',
'\u043d',
'\u043e',
'\u043f',
'\u2591',
'\u2592',
'\u2593',
'\u2502',
'\u2524',
'\u2561',
'\u2562',
'\u2556',
'\u2555',
'\u2563',
'\u2551',
'\u2557',
'\u255d',
'\u255c',
'\u255b',
'\u2510',
'\u2514',
'\u2534',
'\u252c',
'\u251c',
'\u2500',
'\u253c',
'\u255e',
'\u255f',
'\u255a',
'\u2554',
'\u2569',
'\u2566',
'\u2560',
'\u2550',
'\u256c',
'\u2567',
'\u2568',
'\u2564',
'\u2565',
'\u2559',
'\u2558',
'\u2552',
'\u2553',
'\u256b',
'\u256a',
'\u2518',
'\u250c',
'\u2588',
'\u2584',
'\u258c',
'\u2590',
'\u2580',
'\u0440',
'\u0441',
'\u0442',
'\u0443',
'\u0444',
'\u0445',
'\u0446',
'\u0447',
'\u0448',
'\u0449',
'\u044a',
'\u044b',
'\u044c',
'\u044d',
'\u044e',
'\u044f',
'\u0401',
'\u0451',
'\u0404',
'\u0454',
'\u0407',
'\u0457',
'\u040e',
'\u045e',
'\u00b0',
'\u2219',
'\u00b7',
'\u221a',
'\u2116',
'\u00a4',
'\u25a0',
'\u00a0'
};
private static final String[] LABELS = {
"866",
"cp866",
"csibm866",
"ibm866"
};
private static final String NAME = "ibm866";
static final Encoding INSTANCE = new Ibm866();
private Ibm866() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,57 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;
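/**
* Single-byte decoder that maps bytes 0x00..0x7F to the same code point and
* bytes 0x80..0xFF through a 128-entry table. Every byte produces exactly one
* char, so decoding never reports an error.
*/
public class InfallibleSingleByteDecoder extends Decoder {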
protected final char[] upperHalf;
protected InfallibleSingleByteDecoder(Encoding cs, char[] upperHalf) {
super(cs, 1.0f, 1.0f);
this.upperHalf = upperHalf;
}
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
// TODO figure out if it's worthwhile to optimize the case where both
// buffers are array-backed.
for (;;) {
if (!in.hasRemaining()) {
return CoderResult.UNDERFLOW;
}
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
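// Java bytes are signed, so 0x80..0xFF arrive here as -128..-1.
int b = (int) in.get();
if (b >= 0) {
out.put((char) b);
} else {
// Index 0 of the table corresponds to byte 0x80.
out.put(this.upperHalf[b + 128]);
}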
}
}
}
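
`decodeLoop()` above only signals `UNDERFLOW` (input exhausted) or `OVERFLOW` (output full); the caller is expected to drain the output buffer and call again. A minimal sketch of that caller side follows, using only standard `java.nio.charset` calls. `DecodeLoopCallerSketch` and `decodeInChunks` are hypothetical names, and the sketch can be exercised with any `CharsetDecoder`, for example the JDK's ISO-8859-1 decoder, rather than the classes in this commit.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Illustrative sketch only: drives a decoder with a deliberately small output buffer.
final class DecodeLoopCallerSketch {
    /** Decodes all bytes, draining an output buffer of outCapacity chars as it fills. */
    static String decodeInChunks(byte[] bytes, CharsetDecoder decoder, int outCapacity) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        CharBuffer out = CharBuffer.allocate(outCapacity);
        StringBuilder sb = new StringBuilder();
        for (;;) {
            // endOfInput is true because the whole input is already in the buffer.
            CoderResult result = decoder.decode(in, out, true);
            drain(out, sb);
            if (result.isError()) {
                throw new IllegalStateException("Decoding failed: " + result);
            }
            if (result.isUnderflow()) {
                break; // all input consumed
            }
            // Otherwise OVERFLOW: the buffer was drained, so decode again.
        }
        for (;;) {
            CoderResult result = decoder.flush(out);
            drain(out, sb);
            if (result.isUnderflow()) {
                break; // nothing left to flush
            }
        }
        return sb.toString();
    }

    /** Moves the chars written so far into sb and makes the buffer writable again. */
    private static void drain(CharBuffer out, StringBuilder sb) {
        out.flip();
        sb.append(out);
        out.clear();
    }
}
```

With an output capacity of 2 and three input bytes, the first `decode()` call returns `OVERFLOW` after two chars and the second returns `UNDERFLOW`, which is the same contract the `for (;;)` loop in `decodeLoop()` implements from the decoder side.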

@ -0,0 +1,187 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso10 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0104',
'\u0112',
'\u0122',
'\u012a',
'\u0128',
'\u0136',
'\u00a7',
'\u013b',
'\u0110',
'\u0160',
'\u0166',
'\u017d',
'\u00ad',
'\u016a',
'\u014a',
'\u00b0',
'\u0105',
'\u0113',
'\u0123',
'\u012b',
'\u0129',
'\u0137',
'\u00b7',
'\u013c',
'\u0111',
'\u0161',
'\u0167',
'\u017e',
'\u2015',
'\u016b',
'\u014b',
'\u0100',
'\u00c1',
'\u00c2',
'\u00c3',
'\u00c4',
'\u00c5',
'\u00c6',
'\u012e',
'\u010c',
'\u00c9',
'\u0118',
'\u00cb',
'\u0116',
'\u00cd',
'\u00ce',
'\u00cf',
'\u00d0',
'\u0145',
'\u014c',
'\u00d3',
'\u00d4',
'\u00d5',
'\u00d6',
'\u0168',
'\u00d8',
'\u0172',
'\u00da',
'\u00db',
'\u00dc',
'\u00dd',
'\u00de',
'\u00df',
'\u0101',
'\u00e1',
'\u00e2',
'\u00e3',
'\u00e4',
'\u00e5',
'\u00e6',
'\u012f',
'\u010d',
'\u00e9',
'\u0119',
'\u00eb',
'\u0117',
'\u00ed',
'\u00ee',
'\u00ef',
'\u00f0',
'\u0146',
'\u014d',
'\u00f3',
'\u00f4',
'\u00f5',
'\u00f6',
'\u0169',
'\u00f8',
'\u0173',
'\u00fa',
'\u00fb',
'\u00fc',
'\u00fd',
'\u00fe',
'\u0138'
};
private static final String[] LABELS = {
"csisolatin6",
"iso-8859-10",
"iso-ir-157",
"iso8859-10",
"iso885910",
"l6",
"latin6"
};
private static final String NAME = "iso-8859-10";
static final Encoding INSTANCE = new Iso10();
private Iso10() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso13 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u201d',
'\u00a2',
'\u00a3',
'\u00a4',
'\u201e',
'\u00a6',
'\u00a7',
'\u00d8',
'\u00a9',
'\u0156',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00c6',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u201c',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00f8',
'\u00b9',
'\u0157',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\u00e6',
'\u0104',
'\u012e',
'\u0100',
'\u0106',
'\u00c4',
'\u00c5',
'\u0118',
'\u0112',
'\u010c',
'\u00c9',
'\u0179',
'\u0116',
'\u0122',
'\u0136',
'\u012a',
'\u013b',
'\u0160',
'\u0143',
'\u0145',
'\u00d3',
'\u014c',
'\u00d5',
'\u00d6',
'\u00d7',
'\u0172',
'\u0141',
'\u015a',
'\u016a',
'\u00dc',
'\u017b',
'\u017d',
'\u00df',
'\u0105',
'\u012f',
'\u0101',
'\u0107',
'\u00e4',
'\u00e5',
'\u0119',
'\u0113',
'\u010d',
'\u00e9',
'\u017a',
'\u0117',
'\u0123',
'\u0137',
'\u012b',
'\u013c',
'\u0161',
'\u0144',
'\u0146',
'\u00f3',
'\u014d',
'\u00f5',
'\u00f6',
'\u00f7',
'\u0173',
'\u0142',
'\u015b',
'\u016b',
'\u00fc',
'\u017c',
'\u017e',
'\u2019'
};
private static final String[] LABELS = {
"iso-8859-13",
"iso8859-13",
"iso885913"
};
private static final String NAME = "iso-8859-13";
static final Encoding INSTANCE = new Iso13();
private Iso13() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso14 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u1e02',
'\u1e03',
'\u00a3',
'\u010a',
'\u010b',
'\u1e0a',
'\u00a7',
'\u1e80',
'\u00a9',
'\u1e82',
'\u1e0b',
'\u1ef2',
'\u00ad',
'\u00ae',
'\u0178',
'\u1e1e',
'\u1e1f',
'\u0120',
'\u0121',
'\u1e40',
'\u1e41',
'\u00b6',
'\u1e56',
'\u1e81',
'\u1e57',
'\u1e83',
'\u1e60',
'\u1ef3',
'\u1e84',
'\u1e85',
'\u1e61',
'\u00c0',
'\u00c1',
'\u00c2',
'\u00c3',
'\u00c4',
'\u00c5',
'\u00c6',
'\u00c7',
'\u00c8',
'\u00c9',
'\u00ca',
'\u00cb',
'\u00cc',
'\u00cd',
'\u00ce',
'\u00cf',
'\u0174',
'\u00d1',
'\u00d2',
'\u00d3',
'\u00d4',
'\u00d5',
'\u00d6',
'\u1e6a',
'\u00d8',
'\u00d9',
'\u00da',
'\u00db',
'\u00dc',
'\u00dd',
'\u0176',
'\u00df',
'\u00e0',
'\u00e1',
'\u00e2',
'\u00e3',
'\u00e4',
'\u00e5',
'\u00e6',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u00ec',
'\u00ed',
'\u00ee',
'\u00ef',
'\u0175',
'\u00f1',
'\u00f2',
'\u00f3',
'\u00f4',
'\u00f5',
'\u00f6',
'\u1e6b',
'\u00f8',
'\u00f9',
'\u00fa',
'\u00fb',
'\u00fc',
'\u00fd',
'\u0177',
'\u00ff'
};
private static final String[] LABELS = {
"iso-8859-14",
"iso8859-14",
"iso885914"
};
private static final String NAME = "iso-8859-14";
static final Encoding INSTANCE = new Iso14();
private Iso14() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,186 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso15 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u00a1',
'\u00a2',
'\u00a3',
'\u20ac',
'\u00a5',
'\u0160',
'\u00a7',
'\u0161',
'\u00a9',
'\u00aa',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u017d',
'\u00b5',
'\u00b6',
'\u00b7',
'\u017e',
'\u00b9',
'\u00ba',
'\u00bb',
'\u0152',
'\u0153',
'\u0178',
'\u00bf',
'\u00c0',
'\u00c1',
'\u00c2',
'\u00c3',
'\u00c4',
'\u00c5',
'\u00c6',
'\u00c7',
'\u00c8',
'\u00c9',
'\u00ca',
'\u00cb',
'\u00cc',
'\u00cd',
'\u00ce',
'\u00cf',
'\u00d0',
'\u00d1',
'\u00d2',
'\u00d3',
'\u00d4',
'\u00d5',
'\u00d6',
'\u00d7',
'\u00d8',
'\u00d9',
'\u00da',
'\u00db',
'\u00dc',
'\u00dd',
'\u00de',
'\u00df',
'\u00e0',
'\u00e1',
'\u00e2',
'\u00e3',
'\u00e4',
'\u00e5',
'\u00e6',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u00ec',
'\u00ed',
'\u00ee',
'\u00ef',
'\u00f0',
'\u00f1',
'\u00f2',
'\u00f3',
'\u00f4',
'\u00f5',
'\u00f6',
'\u00f7',
'\u00f8',
'\u00f9',
'\u00fa',
'\u00fb',
'\u00fc',
'\u00fd',
'\u00fe',
'\u00ff'
};
private static final String[] LABELS = {
"csisolatin9",
"iso-8859-15",
"iso8859-15",
"iso885915",
"iso_8859-15",
"l9"
};
private static final String NAME = "iso-8859-15";
static final Encoding INSTANCE = new Iso15();
private Iso15() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,181 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso16 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0104',
'\u0105',
'\u0141',
'\u20ac',
'\u201e',
'\u0160',
'\u00a7',
'\u0161',
'\u00a9',
'\u0218',
'\u00ab',
'\u0179',
'\u00ad',
'\u017a',
'\u017b',
'\u00b0',
'\u00b1',
'\u010c',
'\u0142',
'\u017d',
'\u201d',
'\u00b6',
'\u00b7',
'\u017e',
'\u010d',
'\u0219',
'\u00bb',
'\u0152',
'\u0153',
'\u0178',
'\u017c',
'\u00c0',
'\u00c1',
'\u00c2',
'\u0102',
'\u00c4',
'\u0106',
'\u00c6',
'\u00c7',
'\u00c8',
'\u00c9',
'\u00ca',
'\u00cb',
'\u00cc',
'\u00cd',
'\u00ce',
'\u00cf',
'\u0110',
'\u0143',
'\u00d2',
'\u00d3',
'\u00d4',
'\u0150',
'\u00d6',
'\u015a',
'\u0170',
'\u00d9',
'\u00da',
'\u00db',
'\u00dc',
'\u0118',
'\u021a',
'\u00df',
'\u00e0',
'\u00e1',
'\u00e2',
'\u0103',
'\u00e4',
'\u0107',
'\u00e6',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u00ec',
'\u00ed',
'\u00ee',
'\u00ef',
'\u0111',
'\u0144',
'\u00f2',
'\u00f3',
'\u00f4',
'\u0151',
'\u00f6',
'\u015b',
'\u0171',
'\u00f9',
'\u00fa',
'\u00fb',
'\u00fc',
'\u0119',
'\u021b',
'\u00ff'
};
private static final String[] LABELS = {
"iso-8859-16"
};
private static final String NAME = "iso-8859-16";
static final Encoding INSTANCE = new Iso16();
private Iso16() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,189 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso2 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0104',
'\u02d8',
'\u0141',
'\u00a4',
'\u013d',
'\u015a',
'\u00a7',
'\u00a8',
'\u0160',
'\u015e',
'\u0164',
'\u0179',
'\u00ad',
'\u017d',
'\u017b',
'\u00b0',
'\u0105',
'\u02db',
'\u0142',
'\u00b4',
'\u013e',
'\u015b',
'\u02c7',
'\u00b8',
'\u0161',
'\u015f',
'\u0165',
'\u017a',
'\u02dd',
'\u017e',
'\u017c',
'\u0154',
'\u00c1',
'\u00c2',
'\u0102',
'\u00c4',
'\u0139',
'\u0106',
'\u00c7',
'\u010c',
'\u00c9',
'\u0118',
'\u00cb',
'\u011a',
'\u00cd',
'\u00ce',
'\u010e',
'\u0110',
'\u0143',
'\u0147',
'\u00d3',
'\u00d4',
'\u0150',
'\u00d6',
'\u00d7',
'\u0158',
'\u016e',
'\u00da',
'\u0170',
'\u00dc',
'\u00dd',
'\u0162',
'\u00df',
'\u0155',
'\u00e1',
'\u00e2',
'\u0103',
'\u00e4',
'\u013a',
'\u0107',
'\u00e7',
'\u010d',
'\u00e9',
'\u0119',
'\u00eb',
'\u011b',
'\u00ed',
'\u00ee',
'\u010f',
'\u0111',
'\u0144',
'\u0148',
'\u00f3',
'\u00f4',
'\u0151',
'\u00f6',
'\u00f7',
'\u0159',
'\u016f',
'\u00fa',
'\u0171',
'\u00fc',
'\u00fd',
'\u0163',
'\u02d9'
};
private static final String[] LABELS = {
"csisolatin2",
"iso-8859-2",
"iso-ir-101",
"iso8859-2",
"iso88592",
"iso_8859-2",
"iso_8859-2:1987",
"l2",
"latin2"
};
private static final String NAME = "iso-8859-2";
static final Encoding INSTANCE = new Iso2();
private Iso2() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,56 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Iso2022Jp extends Encoding {
private static final String[] LABELS = {
"csiso2022jp",
"iso-2022-jp"
};
private static final String NAME = "iso-2022-jp";
static final Iso2022Jp INSTANCE = new Iso2022Jp();
private Iso2022Jp() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return Charset.forName(NAME).newDecoder();
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}

@ -0,0 +1,189 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso3 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0126',
'\u02d8',
'\u00a3',
'\u00a4',
'\ufffd',
'\u0124',
'\u00a7',
'\u00a8',
'\u0130',
'\u015e',
'\u011e',
'\u0134',
'\u00ad',
'\ufffd',
'\u017b',
'\u00b0',
'\u0127',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u0125',
'\u00b7',
'\u00b8',
'\u0131',
'\u015f',
'\u011f',
'\u0135',
'\u00bd',
'\ufffd',
'\u017c',
'\u00c0',
'\u00c1',
'\u00c2',
'\ufffd',
'\u00c4',
'\u010a',
'\u0108',
'\u00c7',
'\u00c8',
'\u00c9',
'\u00ca',
'\u00cb',
'\u00cc',
'\u00cd',
'\u00ce',
'\u00cf',
'\ufffd',
'\u00d1',
'\u00d2',
'\u00d3',
'\u00d4',
'\u0120',
'\u00d6',
'\u00d7',
'\u011c',
'\u00d9',
'\u00da',
'\u00db',
'\u00dc',
'\u016c',
'\u015c',
'\u00df',
'\u00e0',
'\u00e1',
'\u00e2',
'\ufffd',
'\u00e4',
'\u010b',
'\u0109',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u00ec',
'\u00ed',
'\u00ee',
'\u00ef',
'\ufffd',
'\u00f1',
'\u00f2',
'\u00f3',
'\u00f4',
'\u0121',
'\u00f6',
'\u00f7',
'\u011d',
'\u00f9',
'\u00fa',
'\u00fb',
'\u00fc',
'\u016d',
'\u015d',
'\u02d9'
};
private static final String[] LABELS = {
"csisolatin3",
"iso-8859-3",
"iso-ir-109",
"iso8859-3",
"iso88593",
"iso_8859-3",
"iso_8859-3:1988",
"l3",
"latin3"
};
private static final String NAME = "iso-8859-3";
static final Encoding INSTANCE = new Iso3();
private Iso3() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,189 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso4 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0104',
'\u0138',
'\u0156',
'\u00a4',
'\u0128',
'\u013b',
'\u00a7',
'\u00a8',
'\u0160',
'\u0112',
'\u0122',
'\u0166',
'\u00ad',
'\u017d',
'\u00af',
'\u00b0',
'\u0105',
'\u02db',
'\u0157',
'\u00b4',
'\u0129',
'\u013c',
'\u02c7',
'\u00b8',
'\u0161',
'\u0113',
'\u0123',
'\u0167',
'\u014a',
'\u017e',
'\u014b',
'\u0100',
'\u00c1',
'\u00c2',
'\u00c3',
'\u00c4',
'\u00c5',
'\u00c6',
'\u012e',
'\u010c',
'\u00c9',
'\u0118',
'\u00cb',
'\u0116',
'\u00cd',
'\u00ce',
'\u012a',
'\u0110',
'\u0145',
'\u014c',
'\u0136',
'\u00d4',
'\u00d5',
'\u00d6',
'\u00d7',
'\u00d8',
'\u0172',
'\u00da',
'\u00db',
'\u00dc',
'\u0168',
'\u016a',
'\u00df',
'\u0101',
'\u00e1',
'\u00e2',
'\u00e3',
'\u00e4',
'\u00e5',
'\u00e6',
'\u012f',
'\u010d',
'\u00e9',
'\u0119',
'\u00eb',
'\u0117',
'\u00ed',
'\u00ee',
'\u012b',
'\u0111',
'\u0146',
'\u014d',
'\u0137',
'\u00f4',
'\u00f5',
'\u00f6',
'\u00f7',
'\u00f8',
'\u0173',
'\u00fa',
'\u00fb',
'\u00fc',
'\u0169',
'\u016b',
'\u02d9'
};
private static final String[] LABELS = {
"csisolatin4",
"iso-8859-4",
"iso-ir-110",
"iso8859-4",
"iso88594",
"iso_8859-4",
"iso_8859-4:1988",
"l4",
"latin4"
};
private static final String NAME = "iso-8859-4";
static final Encoding INSTANCE = new Iso4();
private Iso4() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,188 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso5 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0401',
'\u0402',
'\u0403',
'\u0404',
'\u0405',
'\u0406',
'\u0407',
'\u0408',
'\u0409',
'\u040a',
'\u040b',
'\u040c',
'\u00ad',
'\u040e',
'\u040f',
'\u0410',
'\u0411',
'\u0412',
'\u0413',
'\u0414',
'\u0415',
'\u0416',
'\u0417',
'\u0418',
'\u0419',
'\u041a',
'\u041b',
'\u041c',
'\u041d',
'\u041e',
'\u041f',
'\u0420',
'\u0421',
'\u0422',
'\u0423',
'\u0424',
'\u0425',
'\u0426',
'\u0427',
'\u0428',
'\u0429',
'\u042a',
'\u042b',
'\u042c',
'\u042d',
'\u042e',
'\u042f',
'\u0430',
'\u0431',
'\u0432',
'\u0433',
'\u0434',
'\u0435',
'\u0436',
'\u0437',
'\u0438',
'\u0439',
'\u043a',
'\u043b',
'\u043c',
'\u043d',
'\u043e',
'\u043f',
'\u0440',
'\u0441',
'\u0442',
'\u0443',
'\u0444',
'\u0445',
'\u0446',
'\u0447',
'\u0448',
'\u0449',
'\u044a',
'\u044b',
'\u044c',
'\u044d',
'\u044e',
'\u044f',
'\u2116',
'\u0451',
'\u0452',
'\u0453',
'\u0454',
'\u0455',
'\u0456',
'\u0457',
'\u0458',
'\u0459',
'\u045a',
'\u045b',
'\u045c',
'\u00a7',
'\u045e',
'\u045f'
};
private static final String[] LABELS = {
"csisolatincyrillic",
"cyrillic",
"iso-8859-5",
"iso-ir-144",
"iso8859-5",
"iso88595",
"iso_8859-5",
"iso_8859-5:1988"
};
private static final String NAME = "iso-8859-5";
static final Encoding INSTANCE = new Iso5();
private Iso5() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,194 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso6 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\ufffd',
'\ufffd',
'\ufffd',
'\u00a4',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\u060c',
'\u00ad',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\u061b',
'\ufffd',
'\ufffd',
'\ufffd',
'\u061f',
'\ufffd',
'\u0621',
'\u0622',
'\u0623',
'\u0624',
'\u0625',
'\u0626',
'\u0627',
'\u0628',
'\u0629',
'\u062a',
'\u062b',
'\u062c',
'\u062d',
'\u062e',
'\u062f',
'\u0630',
'\u0631',
'\u0632',
'\u0633',
'\u0634',
'\u0635',
'\u0636',
'\u0637',
'\u0638',
'\u0639',
'\u063a',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\u0640',
'\u0641',
'\u0642',
'\u0643',
'\u0644',
'\u0645',
'\u0646',
'\u0647',
'\u0648',
'\u0649',
'\u064a',
'\u064b',
'\u064c',
'\u064d',
'\u064e',
'\u064f',
'\u0650',
'\u0651',
'\u0652',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd'
};
private static final String[] LABELS = {
"arabic",
"asmo-708",
"csiso88596e",
"csiso88596i",
"csisolatinarabic",
"ecma-114",
"iso-8859-6",
"iso-8859-6-e",
"iso-8859-6-i",
"iso-ir-127",
"iso8859-6",
"iso88596",
"iso_8859-6",
"iso_8859-6:1987"
};
private static final String NAME = "iso-8859-6";
static final Encoding INSTANCE = new Iso6();
private Iso6() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,192 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso7 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u2018',
'\u2019',
'\u00a3',
'\u20ac',
'\u20af',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u037a',
'\u00ab',
'\u00ac',
'\u00ad',
'\ufffd',
'\u2015',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u0384',
'\u0385',
'\u0386',
'\u00b7',
'\u0388',
'\u0389',
'\u038a',
'\u00bb',
'\u038c',
'\u00bd',
'\u038e',
'\u038f',
'\u0390',
'\u0391',
'\u0392',
'\u0393',
'\u0394',
'\u0395',
'\u0396',
'\u0397',
'\u0398',
'\u0399',
'\u039a',
'\u039b',
'\u039c',
'\u039d',
'\u039e',
'\u039f',
'\u03a0',
'\u03a1',
'\ufffd',
'\u03a3',
'\u03a4',
'\u03a5',
'\u03a6',
'\u03a7',
'\u03a8',
'\u03a9',
'\u03aa',
'\u03ab',
'\u03ac',
'\u03ad',
'\u03ae',
'\u03af',
'\u03b0',
'\u03b1',
'\u03b2',
'\u03b3',
'\u03b4',
'\u03b5',
'\u03b6',
'\u03b7',
'\u03b8',
'\u03b9',
'\u03ba',
'\u03bb',
'\u03bc',
'\u03bd',
'\u03be',
'\u03bf',
'\u03c0',
'\u03c1',
'\u03c2',
'\u03c3',
'\u03c4',
'\u03c5',
'\u03c6',
'\u03c7',
'\u03c8',
'\u03c9',
'\u03ca',
'\u03cb',
'\u03cc',
'\u03cd',
'\u03ce',
'\ufffd'
};
private static final String[] LABELS = {
"csisolatingreek",
"ecma-118",
"elot_928",
"greek",
"greek8",
"iso-8859-7",
"iso-ir-126",
"iso8859-7",
"iso88597",
"iso_8859-7",
"iso_8859-7:1987",
"sun_eu_greek"
};
private static final String NAME = "iso-8859-7";
static final Encoding INSTANCE = new Iso7();
private Iso7() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}


@ -0,0 +1,191 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso8 extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\ufffd',
'\u00a2',
'\u00a3',
'\u00a4',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u00d7',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u00b9',
'\u00f7',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\u2017',
'\u05d0',
'\u05d1',
'\u05d2',
'\u05d3',
'\u05d4',
'\u05d5',
'\u05d6',
'\u05d7',
'\u05d8',
'\u05d9',
'\u05da',
'\u05db',
'\u05dc',
'\u05dd',
'\u05de',
'\u05df',
'\u05e0',
'\u05e1',
'\u05e2',
'\u05e3',
'\u05e4',
'\u05e5',
'\u05e6',
'\u05e7',
'\u05e8',
'\u05e9',
'\u05ea',
'\ufffd',
'\ufffd',
'\u200e',
'\u200f',
'\ufffd'
};
private static final String[] LABELS = {
"csiso88598e",
"csisolatinhebrew",
"hebrew",
"iso-8859-8",
"iso-8859-8-e",
"iso-ir-138",
"iso8859-8",
"iso88598",
"iso_8859-8",
"iso_8859-8:1988",
"visual"
};
private static final String NAME = "iso-8859-8";
static final Encoding INSTANCE = new Iso8();
private Iso8() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}


@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Iso8I extends Encoding {
private static final char[] TABLE = {
'\u0080',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u0085',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u0091',
'\u0092',
'\u0093',
'\u0094',
'\u0095',
'\u0096',
'\u0097',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\ufffd',
'\u00a2',
'\u00a3',
'\u00a4',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u00d7',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u00b9',
'\u00f7',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\u2017',
'\u05d0',
'\u05d1',
'\u05d2',
'\u05d3',
'\u05d4',
'\u05d5',
'\u05d6',
'\u05d7',
'\u05d8',
'\u05d9',
'\u05da',
'\u05db',
'\u05dc',
'\u05dd',
'\u05de',
'\u05df',
'\u05e0',
'\u05e1',
'\u05e2',
'\u05e3',
'\u05e4',
'\u05e5',
'\u05e6',
'\u05e7',
'\u05e8',
'\u05e9',
'\u05ea',
'\ufffd',
'\ufffd',
'\u200e',
'\u200f',
'\ufffd'
};
private static final String[] LABELS = {
"csiso88598i",
"iso-8859-8-i",
"logical"
};
private static final String NAME = "iso-8859-8-i";
static final Encoding INSTANCE = new Iso8I();
private Iso8I() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}


@ -0,0 +1,185 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Koi8R extends Encoding {
private static final char[] TABLE = {
'\u2500',
'\u2502',
'\u250c',
'\u2510',
'\u2514',
'\u2518',
'\u251c',
'\u2524',
'\u252c',
'\u2534',
'\u253c',
'\u2580',
'\u2584',
'\u2588',
'\u258c',
'\u2590',
'\u2591',
'\u2592',
'\u2593',
'\u2320',
'\u25a0',
'\u2219',
'\u221a',
'\u2248',
'\u2264',
'\u2265',
'\u00a0',
'\u2321',
'\u00b0',
'\u00b2',
'\u00b7',
'\u00f7',
'\u2550',
'\u2551',
'\u2552',
'\u0451',
'\u2553',
'\u2554',
'\u2555',
'\u2556',
'\u2557',
'\u2558',
'\u2559',
'\u255a',
'\u255b',
'\u255c',
'\u255d',
'\u255e',
'\u255f',
'\u2560',
'\u2561',
'\u0401',
'\u2562',
'\u2563',
'\u2564',
'\u2565',
'\u2566',
'\u2567',
'\u2568',
'\u2569',
'\u256a',
'\u256b',
'\u256c',
'\u00a9',
'\u044e',
'\u0430',
'\u0431',
'\u0446',
'\u0434',
'\u0435',
'\u0444',
'\u0433',
'\u0445',
'\u0438',
'\u0439',
'\u043a',
'\u043b',
'\u043c',
'\u043d',
'\u043e',
'\u043f',
'\u044f',
'\u0440',
'\u0441',
'\u0442',
'\u0443',
'\u0436',
'\u0432',
'\u044c',
'\u044b',
'\u0437',
'\u0448',
'\u044d',
'\u0449',
'\u0447',
'\u044a',
'\u042e',
'\u0410',
'\u0411',
'\u0426',
'\u0414',
'\u0415',
'\u0424',
'\u0413',
'\u0425',
'\u0418',
'\u0419',
'\u041a',
'\u041b',
'\u041c',
'\u041d',
'\u041e',
'\u041f',
'\u042f',
'\u0420',
'\u0421',
'\u0422',
'\u0423',
'\u0416',
'\u0412',
'\u042c',
'\u042b',
'\u0417',
'\u0428',
'\u042d',
'\u0429',
'\u0427',
'\u042a'
};
private static final String[] LABELS = {
"cskoi8r",
"koi",
"koi8",
"koi8-r",
"koi8_r"
};
private static final String NAME = "koi8-r";
static final Encoding INSTANCE = new Koi8R();
private Koi8R() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}


@ -0,0 +1,182 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Koi8U extends Encoding {
private static final char[] TABLE = {
'\u2500',
'\u2502',
'\u250c',
'\u2510',
'\u2514',
'\u2518',
'\u251c',
'\u2524',
'\u252c',
'\u2534',
'\u253c',
'\u2580',
'\u2584',
'\u2588',
'\u258c',
'\u2590',
'\u2591',
'\u2592',
'\u2593',
'\u2320',
'\u25a0',
'\u2219',
'\u221a',
'\u2248',
'\u2264',
'\u2265',
'\u00a0',
'\u2321',
'\u00b0',
'\u00b2',
'\u00b7',
'\u00f7',
'\u2550',
'\u2551',
'\u2552',
'\u0451',
'\u0454',
'\u2554',
'\u0456',
'\u0457',
'\u2557',
'\u2558',
'\u2559',
'\u255a',
'\u255b',
'\u0491',
'\u045e',
'\u255e',
'\u255f',
'\u2560',
'\u2561',
'\u0401',
'\u0404',
'\u2563',
'\u0406',
'\u0407',
'\u2566',
'\u2567',
'\u2568',
'\u2569',
'\u256a',
'\u0490',
'\u040e',
'\u00a9',
'\u044e',
'\u0430',
'\u0431',
'\u0446',
'\u0434',
'\u0435',
'\u0444',
'\u0433',
'\u0445',
'\u0438',
'\u0439',
'\u043a',
'\u043b',
'\u043c',
'\u043d',
'\u043e',
'\u043f',
'\u044f',
'\u0440',
'\u0441',
'\u0442',
'\u0443',
'\u0436',
'\u0432',
'\u044c',
'\u044b',
'\u0437',
'\u0448',
'\u044d',
'\u0449',
'\u0447',
'\u044a',
'\u042e',
'\u0410',
'\u0411',
'\u0426',
'\u0414',
'\u0415',
'\u0424',
'\u0413',
'\u0425',
'\u0418',
'\u0419',
'\u041a',
'\u041b',
'\u041c',
'\u041d',
'\u041e',
'\u041f',
'\u042f',
'\u0420',
'\u0421',
'\u0422',
'\u0423',
'\u0416',
'\u0412',
'\u042c',
'\u042b',
'\u0417',
'\u0428',
'\u042d',
'\u0429',
'\u0427',
'\u042a'
};
private static final String[] LABELS = {
"koi8-ru",
"koi8-u"
};
private static final String NAME = "koi8-u";
static final Encoding INSTANCE = new Koi8U();
private Koi8U() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}


@ -0,0 +1,182 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class MacCyrillic extends Encoding {
private static final char[] TABLE = {
'\u0410',
'\u0411',
'\u0412',
'\u0413',
'\u0414',
'\u0415',
'\u0416',
'\u0417',
'\u0418',
'\u0419',
'\u041a',
'\u041b',
'\u041c',
'\u041d',
'\u041e',
'\u041f',
'\u0420',
'\u0421',
'\u0422',
'\u0423',
'\u0424',
'\u0425',
'\u0426',
'\u0427',
'\u0428',
'\u0429',
'\u042a',
'\u042b',
'\u042c',
'\u042d',
'\u042e',
'\u042f',
'\u2020',
'\u00b0',
'\u0490',
'\u00a3',
'\u00a7',
'\u2022',
'\u00b6',
'\u0406',
'\u00ae',
'\u00a9',
'\u2122',
'\u0402',
'\u0452',
'\u2260',
'\u0403',
'\u0453',
'\u221e',
'\u00b1',
'\u2264',
'\u2265',
'\u0456',
'\u00b5',
'\u0491',
'\u0408',
'\u0404',
'\u0454',
'\u0407',
'\u0457',
'\u0409',
'\u0459',
'\u040a',
'\u045a',
'\u0458',
'\u0405',
'\u00ac',
'\u221a',
'\u0192',
'\u2248',
'\u2206',
'\u00ab',
'\u00bb',
'\u2026',
'\u00a0',
'\u040b',
'\u045b',
'\u040c',
'\u045c',
'\u0455',
'\u2013',
'\u2014',
'\u201c',
'\u201d',
'\u2018',
'\u2019',
'\u00f7',
'\u201e',
'\u040e',
'\u045e',
'\u040f',
'\u045f',
'\u2116',
'\u0401',
'\u0451',
'\u044f',
'\u0430',
'\u0431',
'\u0432',
'\u0433',
'\u0434',
'\u0435',
'\u0436',
'\u0437',
'\u0438',
'\u0439',
'\u043a',
'\u043b',
'\u043c',
'\u043d',
'\u043e',
'\u043f',
'\u0440',
'\u0441',
'\u0442',
'\u0443',
'\u0444',
'\u0445',
'\u0446',
'\u0447',
'\u0448',
'\u0449',
'\u044a',
'\u044b',
'\u044c',
'\u044d',
'\u044e',
'\u20ac'
};
private static final String[] LABELS = {
"x-mac-cyrillic",
"x-mac-ukrainian"
};
private static final String NAME = "x-mac-cyrillic";
static final Encoding INSTANCE = new MacCyrillic();
private MacCyrillic() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}


@ -0,0 +1,184 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Macintosh extends Encoding {
private static final char[] TABLE = {
'\u00c4',
'\u00c5',
'\u00c7',
'\u00c9',
'\u00d1',
'\u00d6',
'\u00dc',
'\u00e1',
'\u00e0',
'\u00e2',
'\u00e4',
'\u00e3',
'\u00e5',
'\u00e7',
'\u00e9',
'\u00e8',
'\u00ea',
'\u00eb',
'\u00ed',
'\u00ec',
'\u00ee',
'\u00ef',
'\u00f1',
'\u00f3',
'\u00f2',
'\u00f4',
'\u00f6',
'\u00f5',
'\u00fa',
'\u00f9',
'\u00fb',
'\u00fc',
'\u2020',
'\u00b0',
'\u00a2',
'\u00a3',
'\u00a7',
'\u2022',
'\u00b6',
'\u00df',
'\u00ae',
'\u00a9',
'\u2122',
'\u00b4',
'\u00a8',
'\u2260',
'\u00c6',
'\u00d8',
'\u221e',
'\u00b1',
'\u2264',
'\u2265',
'\u00a5',
'\u00b5',
'\u2202',
'\u2211',
'\u220f',
'\u03c0',
'\u222b',
'\u00aa',
'\u00ba',
'\u03a9',
'\u00e6',
'\u00f8',
'\u00bf',
'\u00a1',
'\u00ac',
'\u221a',
'\u0192',
'\u2248',
'\u2206',
'\u00ab',
'\u00bb',
'\u2026',
'\u00a0',
'\u00c0',
'\u00c3',
'\u00d5',
'\u0152',
'\u0153',
'\u2013',
'\u2014',
'\u201c',
'\u201d',
'\u2018',
'\u2019',
'\u00f7',
'\u25ca',
'\u00ff',
'\u0178',
'\u2044',
'\u20ac',
'\u2039',
'\u203a',
'\ufb01',
'\ufb02',
'\u2021',
'\u00b7',
'\u201a',
'\u201e',
'\u2030',
'\u00c2',
'\u00ca',
'\u00c1',
'\u00cb',
'\u00c8',
'\u00cd',
'\u00ce',
'\u00cf',
'\u00cc',
'\u00d3',
'\u00d4',
'\uf8ff',
'\u00d2',
'\u00da',
'\u00db',
'\u00d9',
'\u0131',
'\u02c6',
'\u02dc',
'\u00af',
'\u02d8',
'\u02d9',
'\u02da',
'\u00b8',
'\u02dd',
'\u02db',
'\u02c7'
};
private static final String[] LABELS = {
"csmacintosh",
"mac",
"macintosh",
"x-mac-roman"
};
private static final String NAME = "macintosh";
static final Encoding INSTANCE = new Macintosh();
private Macintosh() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}


@ -0,0 +1,59 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Replacement extends Encoding {
private static final String[] LABELS = {
"csiso2022kr",
"hz-gb-2312",
"iso-2022-cn",
"iso-2022-cn-ext",
"iso-2022-kr"
};
private static final String NAME = "replacement";
static final Replacement INSTANCE = new Replacement();
private Replacement() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new ReplacementDecoder(this);
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}


@ -0,0 +1,75 @@
/*
* Copyright (c) 2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CoderResult;
class ReplacementDecoder extends Decoder {
private boolean haveEmitted = false;
ReplacementDecoder(Charset cs) {
super(cs, 1.0f, 1.0f);
}
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
for (;;) {
if (!in.hasRemaining()) {
return CoderResult.UNDERFLOW;
}
if (haveEmitted) {
in.position(in.limit());
return CoderResult.UNDERFLOW;
}
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
in.position(in.limit());
haveEmitted = true;
if (this.report) {
return CoderResult.malformedForLength(1);
}
out.put('\uFFFD');
}
}
/**
* @see java.nio.charset.CharsetDecoder#implFlush(java.nio.CharBuffer)
*/
@Override protected CoderResult implFlush(CharBuffer out) {
// Nothing is buffered between calls, so defer to the default flush.
return super.implFlush(out);
}
/**
* @see java.nio.charset.CharsetDecoder#implReset()
*/
@Override protected void implReset() {
haveEmitted = false; // let a reset decoder emit U+FFFD again
super.implReset();
}
}


@ -0,0 +1,62 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class ShiftJis extends Encoding {
private static final String[] LABELS = {
"csshiftjis",
"ms932",
"ms_kanji",
"shift-jis",
"shift_jis",
"sjis",
"windows-31j",
"x-sjis"
};
private static final String NAME = "shift_jis";
static final ShiftJis INSTANCE = new ShiftJis();
private ShiftJis() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return Charset.forName(NAME).newDecoder();
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}


@ -0,0 +1,55 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class UserDefined extends Encoding {
private static final String[] LABELS = {
"x-user-defined"
};
private static final String NAME = "x-user-defined";
static final UserDefined INSTANCE = new UserDefined();
private UserDefined() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new UserDefinedDecoder(this);
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}


@ -0,0 +1,56 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.encoding;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
class UserDefinedDecoder extends Decoder {
UserDefinedDecoder(Charset cs) {
super(cs, 1.0f, 1.0f);
}
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
// TODO figure out if it's worthwhile to optimize the case where both
// buffers are array-backed.
for (;;) {
if (!in.hasRemaining()) {
return CoderResult.UNDERFLOW;
}
if (!out.hasRemaining()) {
return CoderResult.OVERFLOW;
}
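int b = (int)in.get();
if (b >= 0) {
// Bytes 0x00..0x7F are ASCII and map to themselves.
out.put((char)b);
} else {
// Bytes 0x80..0xFF (negative as signed) map to U+F780..U+F7FF.
out.put((char)(b + 128 + 0xF780));
}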
}
}
}


@ -0,0 +1,55 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Utf16Be extends Encoding {
private static final String[] LABELS = {
"utf-16be"
};
private static final String NAME = "utf-16be";
static final Utf16Be INSTANCE = new Utf16Be();
private Utf16Be() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return Charset.forName(NAME).newDecoder();
}
@Override public CharsetEncoder newEncoder() {
return Charset.forName(NAME).newEncoder();
}
}


@ -0,0 +1,56 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Utf16Le extends Encoding {
    private static final String[] LABELS = {
        "utf-16",
        "utf-16le"
    };
    private static final String NAME = "utf-16le";
    static final Utf16Le INSTANCE = new Utf16Le();
    private Utf16Le() {
        super(NAME, LABELS);
    }
    @Override public CharsetDecoder newDecoder() {
        return Charset.forName(NAME).newDecoder();
    }
    @Override public CharsetEncoder newEncoder() {
        return Charset.forName(NAME).newEncoder();
    }
}

@ -0,0 +1,57 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
class Utf8 extends Encoding {
    private static final String[] LABELS = {
        "unicode-1-1-utf-8",
        "utf-8",
        "utf8"
    };
    private static final String NAME = "utf-8";
    static final Utf8 INSTANCE = new Utf8();
    private Utf8() {
        super(NAME, LABELS);
    }
    @Override public CharsetDecoder newDecoder() {
        return Charset.forName(NAME).newDecoder();
    }
    @Override public CharsetEncoder newEncoder() {
        return Charset.forName(NAME).newEncoder();
    }
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1250 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u201a',
'\u0083',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u0088',
'\u2030',
'\u0160',
'\u2039',
'\u015a',
'\u0164',
'\u017d',
'\u0179',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u0098',
'\u2122',
'\u0161',
'\u203a',
'\u015b',
'\u0165',
'\u017e',
'\u017a',
'\u00a0',
'\u02c7',
'\u02d8',
'\u0141',
'\u00a4',
'\u0104',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u015e',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u017b',
'\u00b0',
'\u00b1',
'\u02db',
'\u0142',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u0105',
'\u015f',
'\u00bb',
'\u013d',
'\u02dd',
'\u013e',
'\u017c',
'\u0154',
'\u00c1',
'\u00c2',
'\u0102',
'\u00c4',
'\u0139',
'\u0106',
'\u00c7',
'\u010c',
'\u00c9',
'\u0118',
'\u00cb',
'\u011a',
'\u00cd',
'\u00ce',
'\u010e',
'\u0110',
'\u0143',
'\u0147',
'\u00d3',
'\u00d4',
'\u0150',
'\u00d6',
'\u00d7',
'\u0158',
'\u016e',
'\u00da',
'\u0170',
'\u00dc',
'\u00dd',
'\u0162',
'\u00df',
'\u0155',
'\u00e1',
'\u00e2',
'\u0103',
'\u00e4',
'\u013a',
'\u0107',
'\u00e7',
'\u010d',
'\u00e9',
'\u0119',
'\u00eb',
'\u011b',
'\u00ed',
'\u00ee',
'\u010f',
'\u0111',
'\u0144',
'\u0148',
'\u00f3',
'\u00f4',
'\u0151',
'\u00f6',
'\u00f7',
'\u0159',
'\u016f',
'\u00fa',
'\u0171',
'\u00fc',
'\u00fd',
'\u0163',
'\u02d9'
};
private static final String[] LABELS = {
"cp1250",
"windows-1250",
"x-cp1250"
};
private static final String NAME = "windows-1250";
static final Encoding INSTANCE = new Windows1250();
private Windows1250() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1251 extends Encoding {
private static final char[] TABLE = {
'\u0402',
'\u0403',
'\u201a',
'\u0453',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u20ac',
'\u2030',
'\u0409',
'\u2039',
'\u040a',
'\u040c',
'\u040b',
'\u040f',
'\u0452',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u0098',
'\u2122',
'\u0459',
'\u203a',
'\u045a',
'\u045c',
'\u045b',
'\u045f',
'\u00a0',
'\u040e',
'\u045e',
'\u0408',
'\u00a4',
'\u0490',
'\u00a6',
'\u00a7',
'\u0401',
'\u00a9',
'\u0404',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u0407',
'\u00b0',
'\u00b1',
'\u0406',
'\u0456',
'\u0491',
'\u00b5',
'\u00b6',
'\u00b7',
'\u0451',
'\u2116',
'\u0454',
'\u00bb',
'\u0458',
'\u0405',
'\u0455',
'\u0457',
'\u0410',
'\u0411',
'\u0412',
'\u0413',
'\u0414',
'\u0415',
'\u0416',
'\u0417',
'\u0418',
'\u0419',
'\u041a',
'\u041b',
'\u041c',
'\u041d',
'\u041e',
'\u041f',
'\u0420',
'\u0421',
'\u0422',
'\u0423',
'\u0424',
'\u0425',
'\u0426',
'\u0427',
'\u0428',
'\u0429',
'\u042a',
'\u042b',
'\u042c',
'\u042d',
'\u042e',
'\u042f',
'\u0430',
'\u0431',
'\u0432',
'\u0433',
'\u0434',
'\u0435',
'\u0436',
'\u0437',
'\u0438',
'\u0439',
'\u043a',
'\u043b',
'\u043c',
'\u043d',
'\u043e',
'\u043f',
'\u0440',
'\u0441',
'\u0442',
'\u0443',
'\u0444',
'\u0445',
'\u0446',
'\u0447',
'\u0448',
'\u0449',
'\u044a',
'\u044b',
'\u044c',
'\u044d',
'\u044e',
'\u044f'
};
private static final String[] LABELS = {
"cp1251",
"windows-1251",
"x-cp1251"
};
private static final String NAME = "windows-1251";
static final Encoding INSTANCE = new Windows1251();
private Windows1251() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,197 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1252 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u201a',
'\u0192',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u02c6',
'\u2030',
'\u0160',
'\u2039',
'\u0152',
'\u008d',
'\u017d',
'\u008f',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u02dc',
'\u2122',
'\u0161',
'\u203a',
'\u0153',
'\u009d',
'\u017e',
'\u0178',
'\u00a0',
'\u00a1',
'\u00a2',
'\u00a3',
'\u00a4',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u00aa',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u00b9',
'\u00ba',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\u00bf',
'\u00c0',
'\u00c1',
'\u00c2',
'\u00c3',
'\u00c4',
'\u00c5',
'\u00c6',
'\u00c7',
'\u00c8',
'\u00c9',
'\u00ca',
'\u00cb',
'\u00cc',
'\u00cd',
'\u00ce',
'\u00cf',
'\u00d0',
'\u00d1',
'\u00d2',
'\u00d3',
'\u00d4',
'\u00d5',
'\u00d6',
'\u00d7',
'\u00d8',
'\u00d9',
'\u00da',
'\u00db',
'\u00dc',
'\u00dd',
'\u00de',
'\u00df',
'\u00e0',
'\u00e1',
'\u00e2',
'\u00e3',
'\u00e4',
'\u00e5',
'\u00e6',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u00ec',
'\u00ed',
'\u00ee',
'\u00ef',
'\u00f0',
'\u00f1',
'\u00f2',
'\u00f3',
'\u00f4',
'\u00f5',
'\u00f6',
'\u00f7',
'\u00f8',
'\u00f9',
'\u00fa',
'\u00fb',
'\u00fc',
'\u00fd',
'\u00fe',
'\u00ff'
};
private static final String[] LABELS = {
"ansi_x3.4-1968",
"ascii",
"cp1252",
"cp819",
"csisolatin1",
"ibm819",
"iso-8859-1",
"iso-ir-100",
"iso8859-1",
"iso88591",
"iso_8859-1",
"iso_8859-1:1987",
"l1",
"latin1",
"us-ascii",
"windows-1252",
"x-cp1252"
};
private static final String NAME = "windows-1252";
static final Encoding INSTANCE = new Windows1252();
private Windows1252() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1253 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u201a',
'\u0192',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u0088',
'\u2030',
'\u008a',
'\u2039',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u0098',
'\u2122',
'\u009a',
'\u203a',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0385',
'\u0386',
'\u00a3',
'\u00a4',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\ufffd',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u2015',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u0384',
'\u00b5',
'\u00b6',
'\u00b7',
'\u0388',
'\u0389',
'\u038a',
'\u00bb',
'\u038c',
'\u00bd',
'\u038e',
'\u038f',
'\u0390',
'\u0391',
'\u0392',
'\u0393',
'\u0394',
'\u0395',
'\u0396',
'\u0397',
'\u0398',
'\u0399',
'\u039a',
'\u039b',
'\u039c',
'\u039d',
'\u039e',
'\u039f',
'\u03a0',
'\u03a1',
'\ufffd',
'\u03a3',
'\u03a4',
'\u03a5',
'\u03a6',
'\u03a7',
'\u03a8',
'\u03a9',
'\u03aa',
'\u03ab',
'\u03ac',
'\u03ad',
'\u03ae',
'\u03af',
'\u03b0',
'\u03b1',
'\u03b2',
'\u03b3',
'\u03b4',
'\u03b5',
'\u03b6',
'\u03b7',
'\u03b8',
'\u03b9',
'\u03ba',
'\u03bb',
'\u03bc',
'\u03bd',
'\u03be',
'\u03bf',
'\u03c0',
'\u03c1',
'\u03c2',
'\u03c3',
'\u03c4',
'\u03c5',
'\u03c6',
'\u03c7',
'\u03c8',
'\u03c9',
'\u03ca',
'\u03cb',
'\u03cc',
'\u03cd',
'\u03ce',
'\ufffd'
};
private static final String[] LABELS = {
"cp1253",
"windows-1253",
"x-cp1253"
};
private static final String NAME = "windows-1253";
static final Encoding INSTANCE = new Windows1253();
private Windows1253() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,192 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1254 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u201a',
'\u0192',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u02c6',
'\u2030',
'\u0160',
'\u2039',
'\u0152',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u02dc',
'\u2122',
'\u0161',
'\u203a',
'\u0153',
'\u009d',
'\u009e',
'\u0178',
'\u00a0',
'\u00a1',
'\u00a2',
'\u00a3',
'\u00a4',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u00aa',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u00b9',
'\u00ba',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\u00bf',
'\u00c0',
'\u00c1',
'\u00c2',
'\u00c3',
'\u00c4',
'\u00c5',
'\u00c6',
'\u00c7',
'\u00c8',
'\u00c9',
'\u00ca',
'\u00cb',
'\u00cc',
'\u00cd',
'\u00ce',
'\u00cf',
'\u011e',
'\u00d1',
'\u00d2',
'\u00d3',
'\u00d4',
'\u00d5',
'\u00d6',
'\u00d7',
'\u00d8',
'\u00d9',
'\u00da',
'\u00db',
'\u00dc',
'\u0130',
'\u015e',
'\u00df',
'\u00e0',
'\u00e1',
'\u00e2',
'\u00e3',
'\u00e4',
'\u00e5',
'\u00e6',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u00ec',
'\u00ed',
'\u00ee',
'\u00ef',
'\u011f',
'\u00f1',
'\u00f2',
'\u00f3',
'\u00f4',
'\u00f5',
'\u00f6',
'\u00f7',
'\u00f8',
'\u00f9',
'\u00fa',
'\u00fb',
'\u00fc',
'\u0131',
'\u015f',
'\u00ff'
};
private static final String[] LABELS = {
"cp1254",
"csisolatin5",
"iso-8859-9",
"iso-ir-148",
"iso8859-9",
"iso88599",
"iso_8859-9",
"iso_8859-9:1989",
"l5",
"latin5",
"windows-1254",
"x-cp1254"
};
private static final String NAME = "windows-1254";
static final Encoding INSTANCE = new Windows1254();
private Windows1254() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1255 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u201a',
'\u0192',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u02c6',
'\u2030',
'\u008a',
'\u2039',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u02dc',
'\u2122',
'\u009a',
'\u203a',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u00a1',
'\u00a2',
'\u00a3',
'\u20aa',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u00d7',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u00b9',
'\u00f7',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\u00bf',
'\u05b0',
'\u05b1',
'\u05b2',
'\u05b3',
'\u05b4',
'\u05b5',
'\u05b6',
'\u05b7',
'\u05b8',
'\u05b9',
'\ufffd',
'\u05bb',
'\u05bc',
'\u05bd',
'\u05be',
'\u05bf',
'\u05c0',
'\u05c1',
'\u05c2',
'\u05c3',
'\u05f0',
'\u05f1',
'\u05f2',
'\u05f3',
'\u05f4',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\u05d0',
'\u05d1',
'\u05d2',
'\u05d3',
'\u05d4',
'\u05d5',
'\u05d6',
'\u05d7',
'\u05d8',
'\u05d9',
'\u05da',
'\u05db',
'\u05dc',
'\u05dd',
'\u05de',
'\u05df',
'\u05e0',
'\u05e1',
'\u05e2',
'\u05e3',
'\u05e4',
'\u05e5',
'\u05e6',
'\u05e7',
'\u05e8',
'\u05e9',
'\u05ea',
'\ufffd',
'\ufffd',
'\u200e',
'\u200f',
'\ufffd'
};
private static final String[] LABELS = {
"cp1255",
"windows-1255",
"x-cp1255"
};
private static final String NAME = "windows-1255";
static final Encoding INSTANCE = new Windows1255();
private Windows1255() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1256 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u067e',
'\u201a',
'\u0192',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u02c6',
'\u2030',
'\u0679',
'\u2039',
'\u0152',
'\u0686',
'\u0698',
'\u0688',
'\u06af',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u06a9',
'\u2122',
'\u0691',
'\u203a',
'\u0153',
'\u200c',
'\u200d',
'\u06ba',
'\u00a0',
'\u060c',
'\u00a2',
'\u00a3',
'\u00a4',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u06be',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u00b9',
'\u061b',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\u061f',
'\u06c1',
'\u0621',
'\u0622',
'\u0623',
'\u0624',
'\u0625',
'\u0626',
'\u0627',
'\u0628',
'\u0629',
'\u062a',
'\u062b',
'\u062c',
'\u062d',
'\u062e',
'\u062f',
'\u0630',
'\u0631',
'\u0632',
'\u0633',
'\u0634',
'\u0635',
'\u0636',
'\u00d7',
'\u0637',
'\u0638',
'\u0639',
'\u063a',
'\u0640',
'\u0641',
'\u0642',
'\u0643',
'\u00e0',
'\u0644',
'\u00e2',
'\u0645',
'\u0646',
'\u0647',
'\u0648',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u0649',
'\u064a',
'\u00ee',
'\u00ef',
'\u064b',
'\u064c',
'\u064d',
'\u064e',
'\u00f4',
'\u064f',
'\u0650',
'\u00f7',
'\u0651',
'\u00f9',
'\u0652',
'\u00fb',
'\u00fc',
'\u200e',
'\u200f',
'\u06d2'
};
private static final String[] LABELS = {
"cp1256",
"windows-1256",
"x-cp1256"
};
private static final String NAME = "windows-1256";
static final Encoding INSTANCE = new Windows1256();
private Windows1256() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1257 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u201a',
'\u0083',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u0088',
'\u2030',
'\u008a',
'\u2039',
'\u008c',
'\u00a8',
'\u02c7',
'\u00b8',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u0098',
'\u2122',
'\u009a',
'\u203a',
'\u009c',
'\u00af',
'\u02db',
'\u009f',
'\u00a0',
'\ufffd',
'\u00a2',
'\u00a3',
'\u00a4',
'\ufffd',
'\u00a6',
'\u00a7',
'\u00d8',
'\u00a9',
'\u0156',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00c6',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00f8',
'\u00b9',
'\u0157',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\u00e6',
'\u0104',
'\u012e',
'\u0100',
'\u0106',
'\u00c4',
'\u00c5',
'\u0118',
'\u0112',
'\u010c',
'\u00c9',
'\u0179',
'\u0116',
'\u0122',
'\u0136',
'\u012a',
'\u013b',
'\u0160',
'\u0143',
'\u0145',
'\u00d3',
'\u014c',
'\u00d5',
'\u00d6',
'\u00d7',
'\u0172',
'\u0141',
'\u015a',
'\u016a',
'\u00dc',
'\u017b',
'\u017d',
'\u00df',
'\u0105',
'\u012f',
'\u0101',
'\u0107',
'\u00e4',
'\u00e5',
'\u0119',
'\u0113',
'\u010d',
'\u00e9',
'\u017a',
'\u0117',
'\u0123',
'\u0137',
'\u012b',
'\u013c',
'\u0161',
'\u0144',
'\u0146',
'\u00f3',
'\u014d',
'\u00f5',
'\u00f6',
'\u00f7',
'\u0173',
'\u0142',
'\u015b',
'\u016b',
'\u00fc',
'\u017c',
'\u017e',
'\u02d9'
};
private static final String[] LABELS = {
"cp1257",
"windows-1257",
"x-cp1257"
};
private static final String NAME = "windows-1257";
static final Encoding INSTANCE = new Windows1257();
private Windows1257() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,183 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows1258 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u201a',
'\u0192',
'\u201e',
'\u2026',
'\u2020',
'\u2021',
'\u02c6',
'\u2030',
'\u008a',
'\u2039',
'\u0152',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u02dc',
'\u2122',
'\u009a',
'\u203a',
'\u0153',
'\u009d',
'\u009e',
'\u0178',
'\u00a0',
'\u00a1',
'\u00a2',
'\u00a3',
'\u00a4',
'\u00a5',
'\u00a6',
'\u00a7',
'\u00a8',
'\u00a9',
'\u00aa',
'\u00ab',
'\u00ac',
'\u00ad',
'\u00ae',
'\u00af',
'\u00b0',
'\u00b1',
'\u00b2',
'\u00b3',
'\u00b4',
'\u00b5',
'\u00b6',
'\u00b7',
'\u00b8',
'\u00b9',
'\u00ba',
'\u00bb',
'\u00bc',
'\u00bd',
'\u00be',
'\u00bf',
'\u00c0',
'\u00c1',
'\u00c2',
'\u0102',
'\u00c4',
'\u00c5',
'\u00c6',
'\u00c7',
'\u00c8',
'\u00c9',
'\u00ca',
'\u00cb',
'\u0300',
'\u00cd',
'\u00ce',
'\u00cf',
'\u0110',
'\u00d1',
'\u0309',
'\u00d3',
'\u00d4',
'\u01a0',
'\u00d6',
'\u00d7',
'\u00d8',
'\u00d9',
'\u00da',
'\u00db',
'\u00dc',
'\u01af',
'\u0303',
'\u00df',
'\u00e0',
'\u00e1',
'\u00e2',
'\u0103',
'\u00e4',
'\u00e5',
'\u00e6',
'\u00e7',
'\u00e8',
'\u00e9',
'\u00ea',
'\u00eb',
'\u0301',
'\u00ed',
'\u00ee',
'\u00ef',
'\u0111',
'\u00f1',
'\u0323',
'\u00f3',
'\u00f4',
'\u01a1',
'\u00f6',
'\u00f7',
'\u00f8',
'\u00f9',
'\u00fa',
'\u00fb',
'\u00fc',
'\u01b0',
'\u20ab',
'\u00ff'
};
private static final String[] LABELS = {
"cp1258",
"windows-1258",
"x-cp1258"
};
private static final String NAME = "windows-1258";
static final Encoding INSTANCE = new Windows1258();
private Windows1258() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new InfallibleSingleByteDecoder(this, TABLE);
}
}

@ -0,0 +1,186 @@
/*
* Copyright (c) 2013-2015 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/*
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
* Instead, please regenerate using generate-encoding-data.py
*/
package nu.validator.encoding;
import java.nio.charset.CharsetDecoder;
class Windows874 extends Encoding {
private static final char[] TABLE = {
'\u20ac',
'\u0081',
'\u0082',
'\u0083',
'\u0084',
'\u2026',
'\u0086',
'\u0087',
'\u0088',
'\u0089',
'\u008a',
'\u008b',
'\u008c',
'\u008d',
'\u008e',
'\u008f',
'\u0090',
'\u2018',
'\u2019',
'\u201c',
'\u201d',
'\u2022',
'\u2013',
'\u2014',
'\u0098',
'\u0099',
'\u009a',
'\u009b',
'\u009c',
'\u009d',
'\u009e',
'\u009f',
'\u00a0',
'\u0e01',
'\u0e02',
'\u0e03',
'\u0e04',
'\u0e05',
'\u0e06',
'\u0e07',
'\u0e08',
'\u0e09',
'\u0e0a',
'\u0e0b',
'\u0e0c',
'\u0e0d',
'\u0e0e',
'\u0e0f',
'\u0e10',
'\u0e11',
'\u0e12',
'\u0e13',
'\u0e14',
'\u0e15',
'\u0e16',
'\u0e17',
'\u0e18',
'\u0e19',
'\u0e1a',
'\u0e1b',
'\u0e1c',
'\u0e1d',
'\u0e1e',
'\u0e1f',
'\u0e20',
'\u0e21',
'\u0e22',
'\u0e23',
'\u0e24',
'\u0e25',
'\u0e26',
'\u0e27',
'\u0e28',
'\u0e29',
'\u0e2a',
'\u0e2b',
'\u0e2c',
'\u0e2d',
'\u0e2e',
'\u0e2f',
'\u0e30',
'\u0e31',
'\u0e32',
'\u0e33',
'\u0e34',
'\u0e35',
'\u0e36',
'\u0e37',
'\u0e38',
'\u0e39',
'\u0e3a',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd',
'\u0e3f',
'\u0e40',
'\u0e41',
'\u0e42',
'\u0e43',
'\u0e44',
'\u0e45',
'\u0e46',
'\u0e47',
'\u0e48',
'\u0e49',
'\u0e4a',
'\u0e4b',
'\u0e4c',
'\u0e4d',
'\u0e4e',
'\u0e4f',
'\u0e50',
'\u0e51',
'\u0e52',
'\u0e53',
'\u0e54',
'\u0e55',
'\u0e56',
'\u0e57',
'\u0e58',
'\u0e59',
'\u0e5a',
'\u0e5b',
'\ufffd',
'\ufffd',
'\ufffd',
'\ufffd'
};
private static final String[] LABELS = {
"dos-874",
"iso-8859-11",
"iso8859-11",
"iso885911",
"tis-620",
"windows-874"
};
private static final String NAME = "windows-874";
static final Encoding INSTANCE = new Windows874();
private Windows874() {
super(NAME, LABELS);
}
@Override public CharsetDecoder newDecoder() {
return new FallibleSingleByteDecoder(this, TABLE);
}
}
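
The 128-entry `TABLE` above appears to cover bytes 0x80 through 0xFF, with U+FFFD at the positions for bytes 0xDB-0xDE and 0xFC-0xFF. The following is a minimal sketch of how such a table could drive a fallible single-byte decode. It assumes the table is indexed by `byte - 0x80` and that a U+FFFD entry marks an unmapped byte. `SingleByteLookupSketch`, `sampleTable` and `decodeByte` are hypothetical names for illustration only; the real `FallibleSingleByteDecoder` is not shown in this part of the diff and may work differently.

```java
import java.util.Arrays;
import java.util.OptionalInt;

public class SingleByteLookupSketch {

    /** Builds a 128-entry table with only three windows-874 entries filled in. */
    static char[] sampleTable() {
        char[] table = new char[128];
        Arrays.fill(table, '\ufffd');   // every other slot is treated as unmapped in this sketch
        table[0x80 - 0x80] = '\u20ac';  // byte 0x80 -> EURO SIGN
        table[0x85 - 0x80] = '\u2026';  // byte 0x85 -> HORIZONTAL ELLIPSIS
        table[0xA1 - 0x80] = '\u0e01';  // byte 0xA1 -> THAI CHARACTER KO KAI
        return table;
    }

    /**
     * Decodes one unsigned byte value (0..255). ASCII passes through unchanged;
     * higher bytes are looked up in the table. An empty result stands for a
     * decoding error (assumption: U+FFFD in the table means "unmapped").
     */
    static OptionalInt decodeByte(int b, char[] highTable) {
        if (b < 0x80) {
            return OptionalInt.of(b);
        }
        char c = highTable[b - 0x80];
        return c == '\ufffd' ? OptionalInt.empty() : OptionalInt.of(c);
    }

    public static void main(String[] args) {
        char[] table = sampleTable();
        System.out.println(Integer.toHexString(decodeByte(0xA1, table).getAsInt())); // e01
        System.out.println(decodeByte(0xDB, table).isPresent());                     // false
    }
}
```

Returning an empty `OptionalInt` rather than U+FFFD keeps the error case distinct from a successfully decoded character, which is the distinction a fallible decoder needs to make.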


@ -0,0 +1,27 @@
/*
* Copyright (c) 2010 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
public @interface Auto {
}


@ -0,0 +1,27 @@
/*
* Copyright (c) 2010 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
public @interface CharacterName {
}


@ -0,0 +1,34 @@
/*
* Copyright (c) 2010 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
/**
* Marker for translating into the C++ const keyword on the declaration in
* question.
*
* @version $Id$
* @author hsivonen
*/
public @interface Const {
}


@ -0,0 +1,34 @@
/*
* Copyright (c) 2008 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
/**
* The type for attribute IDness. (In Java, an interned string
* <code>"CDATA"</code> or <code>"ID"</code>.)
*
* @version $Id$
* @author hsivonen
*/
public @interface IdType {
}


@ -0,0 +1,33 @@
/*
* Copyright (c) 2009-2010 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
/**
* Translates into the C++ inline keyword.
*
* @version $Id$
* @author hsivonen
*/
public @interface Inline {
}


@ -0,0 +1,34 @@
/*
* Copyright (c) 2009-2010 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
/**
* Marks a string type as being the literal string type (typically const char*)
* in C++.
*
* @version $Id$
* @author hsivonen
*/
public @interface Literal {
}


@ -0,0 +1,34 @@
/*
* Copyright (c) 2008 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
/**
* The local name of an element or attribute. Must be comparable with
* <code>==</code> (interned <code>String</code> in Java).
*
* @version $Id$
* @author hsivonen
*/
public @interface Local {
}
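
The Javadoc for `Local` requires that local names be comparable with `==`, which in Java means they must be interned. The sketch below illustrates why, using plain `String` values: identity comparison only succeeds once a name built at run time has been interned. `InternedLocalNameSketch` and `sameLocalName` are hypothetical names and are not part of the parser sources.

```java
public class InternedLocalNameSketch {

    /** Identity comparison; only meaningful when both arguments are interned. */
    static boolean sameLocalName(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // String literals are interned, so this is the canonical "div".
        String constant = "div";

        // A name assembled at run time, as a tokenizer might build it from input.
        String built = new String(new char[] {'d', 'i', 'v'});

        System.out.println(sameLocalName(built, constant));          // false
        System.out.println(sameLocalName(built.intern(), constant)); // true
        System.out.println(built.equals(constant));                  // true
    }
}
```

Interning once, when the name is created, lets every later comparison be a single reference check instead of a character-by-character `equals` call.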


@ -0,0 +1,34 @@
/*
* Copyright (c) 2008 Mozilla Foundation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
package nu.validator.htmlparser.annotation;
/**
* The array type marked with this annotation won't have its
* <code>.length</code> read.
*
* @version $Id$
* @author hsivonen
*/
public @interface NoLength {
}
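
As a rough illustration of the `NoLength` contract described above, the sketch below passes the number of valid elements explicitly and never reads `.length` on the annotated array. Presumably this is what allows the Java-to-C++ translator to use a type that carries no length for such arrays, but that mapping is an assumption here and is not shown in this part of the diff. `NoLengthSketch` and `sum` are hypothetical, and the nested `NoLength` is a local stand-in so the example compiles on its own.

```java
public class NoLengthSketch {

    /** Local stand-in for the annotation above so the sketch compiles alone. */
    @interface NoLength {
    }

    /** Sums the first {@code length} chars; never reads {@code buf.length}. */
    static int sum(@NoLength char[] buf, int length) {
        int total = 0;
        for (int i = 0; i < length; i++) {
            total += buf[i];
        }
        return total;
    }

    public static void main(String[] args) {
        char[] buf = {1, 2, 3, 4};
        System.out.println(sum(buf, 3)); // 6
    }
}
```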

Some files were not shown because too many files have changed in this diff.