Reinstate the java->c++ source, generator code.
parent
30e6400e30
commit
1b648eee31
|
@ -49,9 +49,11 @@ js/src/autom4te.cache
|
|||
js/src/tests/results-*.html
|
||||
js/src/tests/results-*.txt
|
||||
|
||||
# Java HTML5 parser classes
|
||||
parser/html/java/htmlparser/
|
||||
parser/html/java/javaparser/
|
||||
# Java HTML5 parser codegen artifacts
|
||||
parser/html/java/htmlparser/bin/
|
||||
parser/html/java/javaparser/bin/
|
||||
parser/html/java/*.jar
|
||||
parser/html/javasrc/
|
||||
|
||||
# Ignore the files and directory that Eclipse IDE creates
|
||||
.project
|
||||
|
|
|
@ -0,0 +1,63 @@
|
|||
# Updating HTML5 parser code

Our HTML5 parser is based on the Java HTML5 parser from [Validator.nu](http://about.validator.nu/htmlparser/) by Henri Sivonen. It was adopted and further updated by Mozilla, and has been imported in full into the UXP tree so that we have an independent, maintainable copy that doesn't rely on external sources.

## Stages

Updating the parser code consists of 3 stages:
- Make updates to the HTML parser source in Java
- Let the Java parser regenerate part of its own code after the change
- Translate the Java source to C++

This process is best explained in the [following Bugzilla comment](https://bugzilla.mozilla.org/show_bug.cgi?id=1378079#c6), which explains how to add a new attribute name ("is") to HTML5 and is reproduced here for convenience:
|
||||
|
||||
>> Is
>> there any documentation on how to add a new nsHtml5AttributeName?
>
> I don't recall. I should get around to writing it.
>
>> Looks like
>> I need to clone hg.mozilla.org/projects/htmlparser/ and generate a hash with
>> it?
>
> Yes. Here's how:
>
> `cd parser/html/java/`
> `make sync`
>
> Now you have a clone of [https://hg.mozilla.org/projects/htmlparser/](https://hg.mozilla.org/projects/htmlparser/) in parser/html/java/htmlparser/
>
> `cd htmlparser/src/`
> `$EDITOR nu/validator/htmlparser/impl/AttributeName.java`
>
> Search for the word "uncomment" and uncomment stuff according to the two comments that talk about uncommenting
> Duplicate the declaration of a normal attribute (nothing special in SVG mode, etc.). Let's use "alt", since it's the first one.
> In the duplicate, replace ALT with IS and "alt" with "is".
> Search for "ALT,", duplicate that line and change the duplicate to say "IS,"
> Save.
>
> `javac nu/validator/htmlparser/impl/AttributeName.java`
> `java nu.validator.htmlparser.impl.AttributeName`
>
> Copy and paste the output into nu/validator/htmlparser/impl/AttributeName.java replacing the text below the comment "START GENERATED CODE" and above the very last "}".
> Recomment the bits that you uncommented earlier.
> Save.
>
> `cd ../..` - Back to parser/html/java/
> `make translate`
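
The copy-and-paste step in the comment above can also be scripted. The following is a minimal Python sketch, not part of the original instructions: `splice_generated` is a hypothetical helper, and it assumes that the generator output has already been captured as text and that the "START GENERATED CODE" comment sits on a line of its own.

```python
GENERATED_MARKER = "START GENERATED CODE"


def splice_generated(java_source, generated):
    """Replace the text below the marker comment and above the last '}'.

    Hypothetical helper: mirrors the manual copy-and-paste step and makes
    no attempt to parse Java.
    """
    marker_pos = java_source.find(GENERATED_MARKER)
    if marker_pos < 0:
        raise ValueError("marker comment not found")
    # Keep everything up to and including the line that holds the marker.
    head_end = java_source.find("\n", marker_pos)
    if head_end < 0:
        raise ValueError("nothing follows the marker line")
    head_end += 1
    # The closing brace of the class is assumed to be the last '}' in the file.
    tail_start = java_source.rfind("}")
    if tail_start < head_end:
        raise ValueError("no closing brace after the marker")
    body = generated.rstrip("\n") + "\n"
    return java_source[:head_end] + body + java_source[tail_start:]
```

Reading `AttributeName.java`, passing its text and the captured generator output through this function, and writing the result back would stand in for the manual paste. The uncommenting and recommenting steps still have to be done by hand.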
|
||||
|
||||
## Organizing commits

**The HTML5 parser code is fragile because it is generated and translated before being used as C++ in our tree. Do not touch or commit anything without a code peer nearby who knows the parser and the commit process (at this moment that means Gaming4JC (@g4jc)), and communicate the changes thoroughly.**

To organize this properly in our repo, commits should be split up when making these kinds of changes:
1. Commit your code edits to the HTML parser
2. Regenerate Java into a translation-ready source
3. Commit
4. Translate and regenerate C++ code
5. Check a build to make sure the changes have the intended result
6. Commit

This is needed because the source edit will sometimes be in parts that are self-generated and may otherwise be lost in generation noise, and because we want to keep a strict separation between commits resulting from developer work and those resulting from running scripts/automated processes.
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,44 @@
|
|||
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

libs:: translator

translator:: javaparser \
; mkdir -p htmlparser/bin && \
  find htmlparser/translator-src/nu/validator/htmlparser -name "*.java" | \
    xargs javac -cp javaparser.jar -g -d htmlparser/bin && \
  jar cfm translator.jar manifest.txt -C htmlparser/bin .

javaparser:: \
; mkdir -p javaparser/bin && find javaparser/src -name "*.java" | \
    xargs javac -encoding ISO-8859-1 -g -d javaparser/bin && \
  jar cf javaparser.jar -C javaparser/bin .

translate:: translator \
; mkdir -p ../javasrc ; \
  java -jar translator.jar \
    htmlparser/src/nu/validator/htmlparser/impl \
    .. ../nsHtml5AtomList.h

translate-javasrc:: translator \
; mkdir -p ../javasrc ; \
  java -jar translator.jar \
    ../javasrc \
    .. ../nsHtml5AtomList.h

named-characters:: translator \
; java -cp translator.jar \
    nu.validator.htmlparser.generator.GenerateNamedCharactersCpp \
      named-character-references.html ../

clean-javaparser:: \
; rm -rf javaparser/bin javaparser.jar

clean-htmlparser:: \
; rm -rf htmlparser/bin translator.jar

clean-javasrc:: \
; rm -rf ../javasrc

clean:: clean-javaparser clean-htmlparser clean-javasrc
|
|
@ -0,0 +1,41 @@
|
|||
If this is your first time building the HTML5 parser, you need to execute the
following commands (from this directory) to accomplish the translation:

  make translate         # perform the Java-to-C++ translation from the remote
                         # sources
  make named-characters  # Generate tables for named character tokenization

If you make changes to the translator or the javaparser, you can rebuild by
retyping 'make' in this directory. If you make changes to the HTML5 Java
implementation, you can retranslate the Java sources from the htmlparser
repository by retyping 'make translate' in this directory.

The makefile supports the following targets:

javaparser:
  Builds the javaparser library retrieved earlier by sync_javaparser.
translator:
  Runs the javaparser target and then builds the Java to C++ translator from
  sources.
libs:
  The default target. Alias for translator.
translate:
  Runs the translator target and then translates the HTML parser sources and
  copies the parser impl java sources to ../javasrc.
translate-javasrc:
  Runs the translator target and then translates the HTML parser sources
  stored in ../javasrc. (Deprecated)
named-characters:
  Generates data tables for named character tokenization.
clean-javaparser:
  Removes the build products of the javaparser target.
clean-htmlparser:
  Removes the build products of the translator target.
clean-javasrc:
  Removes the javasrc snapshot code in ../javasrc.
clean:
  Runs clean-javaparser, clean-htmlparser, and clean-javasrc.
|
||||
|
||||
Ben Newman (23 September 2009)
|
||||
Henri Sivonen (11 August 2016)
|
||||
Matt A. Tobin (16 January 2020)
|
|
@ -0,0 +1,3 @@
|
|||
#!/bin/sh
|
||||
APPDIR=`dirname $0`;
|
||||
java -XstartOnFirstThread -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:/Developer/gwt-mac-1.5.1/gwt-user.jar:/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar" com.google.gwt.dev.GWTCompiler -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser;
|
|
@ -0,0 +1,3 @@
|
|||
#!/bin/sh
|
||||
APPDIR=`dirname $0`;
|
||||
java -XstartOnFirstThread -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:/Developer/gwt-mac-1.5.1/gwt-user.jar:/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar" com.google.gwt.dev.GWTCompiler -style DETAILED -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser;
|
|
@ -0,0 +1,24 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
<launchConfiguration type="org.eclipse.jdt.launching.localJavaApplication">
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
<listEntry value="/htmlparser"/>
</listAttribute>
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
<listEntry value="4"/>
</listAttribute>
<booleanAttribute key="org.eclipse.debug.core.appendEnvironmentVariables" value="true"/>
<listAttribute key="org.eclipse.jdt.launching.CLASSPATH">
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry containerPath=&quot;org.eclipse.jdt.launching.JRE_CONTAINER&quot; javaProject=&quot;htmlparser&quot; path=&quot;1&quot; type=&quot;4&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/gwt-src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/super&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry id=&quot;org.eclipse.jdt.launching.classpathentry.defaultClasspath&quot;&gt; &lt;memento exportedEntriesOnly=&quot;false&quot; project=&quot;htmlparser&quot;/&gt; &lt;/runtimeClasspathEntry&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry externalArchive=&quot;/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry externalArchive=&quot;/Developer/gwt-mac-1.5.1/gwt-user.jar&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
</listAttribute>
<booleanAttribute key="org.eclipse.jdt.launching.DEFAULT_CLASSPATH" value="false"/>
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="com.google.gwt.dev.GWTCompiler"/>
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="-style DETAILED -out /Users/hsivonen/Projects/whattf/htmlparser/www nu.validator.htmlparser.HtmlParser"/>
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="htmlparser"/>
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-XstartOnFirstThread -Xmx256M"/>
</launchConfiguration>
|
|
@ -0,0 +1,22 @@
|
|||
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<launchConfiguration type="org.eclipse.jdt.launching.localJavaApplication">
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
<listEntry value="/htmlparser"/>
</listAttribute>
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
<listEntry value="4"/>
</listAttribute>
<booleanAttribute key="org.eclipse.debug.core.appendEnvironmentVariables" value="true"/>
<listAttribute key="org.eclipse.jdt.launching.CLASSPATH">
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt; &lt;runtimeClasspathEntry containerPath=&quot;org.eclipse.jdt.launching.JRE_CONTAINER&quot; javaProject=&quot;htmlparser&quot; path=&quot;1&quot; type=&quot;4&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/gwt-src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/super&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt; &lt;runtimeClasspathEntry id=&quot;org.eclipse.jdt.launching.classpathentry.defaultClasspath&quot;&gt; &lt;memento exportedEntriesOnly=&quot;false&quot; project=&quot;htmlparser&quot;/&gt; &lt;/runtimeClasspathEntry&gt; "/>
</listAttribute>
<booleanAttribute key="org.eclipse.jdt.launching.DEFAULT_CLASSPATH" value="false"/>
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="com.google.gwt.dev.GWTCompiler"/>
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="-out /home/hsivonen/Projects/whattf/htmlparser/www nu.validator.htmlparser.HtmlParser"/>
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="htmlparser"/>
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-Xmx256M"/>
</launchConfiguration>
|
|
@ -0,0 +1,3 @@
|
|||
#!/bin/sh
|
||||
APPDIR=`dirname $0`;
|
||||
java -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:$APPDIR/bin:/home/hsivonen/gwt-linux-1.5.1/gwt-user.jar:/home/hsivonen/gwt-linux-1.5.1/gwt-dev-linux.jar" com.google.gwt.dev.GWTShell -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser/HtmlParser.html;
|
|
@ -0,0 +1,3 @@
|
|||
#!/bin/sh
|
||||
APPDIR=`dirname $0`;
|
||||
java -XstartOnFirstThread -Xmx256M -cp "$APPDIR/src:$APPDIR/gwt-src:$APPDIR/super:$APPDIR/bin:/Developer/gwt-mac-1.5.1/gwt-user.jar:/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar" com.google.gwt.dev.GWTShell -out "$APPDIR/www" "$@" nu.validator.htmlparser.HtmlParser/HtmlParser.html;
|
|
@ -0,0 +1,23 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
<launchConfiguration type="org.eclipse.jdt.launching.localJavaApplication">
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
<listEntry value="/htmlparser"/>
</listAttribute>
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
<listEntry value="4"/>
</listAttribute>
<booleanAttribute key="org.eclipse.debug.core.appendEnvironmentVariables" value="true"/>
<listAttribute key="org.eclipse.jdt.launching.CLASSPATH">
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry containerPath=&quot;org.eclipse.jdt.launching.JRE_CONTAINER&quot; javaProject=&quot;htmlparser&quot; path=&quot;1&quot; type=&quot;4&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/gwt-src&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry internalArchive=&quot;/htmlparser/super&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry id=&quot;org.eclipse.jdt.launching.classpathentry.defaultClasspath&quot;&gt; &lt;memento project=&quot;htmlparser&quot;/&gt; &lt;/runtimeClasspathEntry&gt; "/>
<listEntry value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt; &lt;runtimeClasspathEntry externalArchive=&quot;/Developer/gwt-mac-1.5.1/gwt-dev-mac.jar&quot; path=&quot;3&quot; type=&quot;2&quot;/&gt; "/>
</listAttribute>
<booleanAttribute key="org.eclipse.jdt.launching.DEFAULT_CLASSPATH" value="false"/>
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="com.google.gwt.dev.GWTShell"/>
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="-out www nu.validator.htmlparser.HtmlParser/HtmlParser.html"/>
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="htmlparser"/>
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-XstartOnFirstThread -Xmx256M"/>
</launchConfiguration>
|
|
@ -0,0 +1,96 @@
|
|||
This is for the HTML parser as a whole except the rewindable input stream,
|
||||
the named character classes and the Live DOM Viewer.
|
||||
For the copyright notices for individual files, please see individual files.
|
||||
|
||||
/*
|
||||
* Copyright (c) 2005, 2006, 2007 Henri Sivonen
|
||||
* Copyright (c) 2007-2012 Mozilla Foundation
|
||||
* Portions of comments Copyright 2004-2007 Apple Computer, Inc., Mozilla
|
||||
* Foundation, and Opera Software ASA.
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
The following license is for the WHATWG spec from which the named character
|
||||
data was extracted.
|
||||
|
||||
/*
|
||||
* Copyright 2004-2010 Apple Computer, Inc., Mozilla Foundation, and Opera
|
||||
* Software ASA.
|
||||
*
|
||||
* You are granted a license to use, reproduce and create derivative works of
|
||||
* this document.
|
||||
*/
|
||||
|
||||
The following license is for the rewindable input stream.
|
||||
|
||||
/*
|
||||
* Copyright (c) 2001-2003 Thai Open Source Software Center Ltd
|
||||
* All rights reserved.
|
||||
*
|
||||
* Redistribution and use in source and binary forms, with or without
|
||||
* modification, are permitted provided that the following conditions
|
||||
* are met:
|
||||
*
|
||||
* * Redistributions of source code must retain the above copyright
|
||||
* notice, this list of conditions and the following disclaimer.
|
||||
* * Redistributions in binary form must reproduce the above
|
||||
* copyright notice, this list of conditions and the following
|
||||
* disclaimer in the documentation and/or other materials provided
|
||||
* with the distribution.
|
||||
* * Neither the name of the Thai Open Source Software Center Ltd nor
|
||||
* the names of its contributors may be used to endorse or promote
|
||||
* products derived from this software without specific prior
|
||||
* written permission.
|
||||
*
|
||||
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
||||
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
|
||||
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
|
||||
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
|
||||
* REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
|
||||
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
|
||||
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
|
||||
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
|
||||
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
||||
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
|
||||
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
|
||||
* POSSIBILITY OF SUCH DAMAGE.
|
||||
*/
|
||||
|
||||
The following license applies to the Live DOM Viewer:
|
||||
|
||||
Copyright (c) 2000, 2006, 2008 Ian Hickson and various contributors
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in
|
||||
all copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
||||
THE SOFTWARE.
|
|
@ -0,0 +1,5 @@
|
|||
An HTML5 parser.
|
||||
|
||||
Please see http://about.validator.nu/htmlparser/
|
||||
|
||||
-- Henri Sivonen (hsivonen@iki.fi).
|
|
@ -0,0 +1,15 @@
|
|||
tokenization.txt represents the state of the spec implemented in Tokenizer.java.
|
||||
|
||||
To get a diffable version corresponding to the current spec:
|
||||
lynx -display_charset=utf-8 -dump -nolist http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html > current.txt
|
||||
|
||||
tree-construction.txt represents the state of the spec implemented in TreeBuilder.java.
|
||||
|
||||
To get a diffable version corresponding to the current spec:
|
||||
lynx -display_charset=utf-8 -dump -nolist http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html > current.txt
|
||||
|
||||
|
||||
The text of the files in this directory comes from the WHATWG HTML 5 spec
|
||||
which carries the following notice:
|
||||
© Copyright 2004-2010 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA.
|
||||
You are granted a license to use, reproduce and create derivative works of this document.
|
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large
File diff suppressed because it is too large
|
@ -0,0 +1,745 @@
|
|||
#!/usr/bin/python
|
||||
|
||||
# Copyright (c) 2013-2015 Mozilla Foundation
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a
|
||||
# copy of this software and associated documentation files (the "Software"),
|
||||
# to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
# and/or sell copies of the Software, and to permit persons to whom the
|
||||
# Software is furnished to do so, subject to the following conditions:
|
||||
#
|
||||
# The above copyright notice and this permission notice shall be included in
|
||||
# all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
# DEALINGS IN THE SOFTWARE.
|
||||
|
||||
import json

class Label:
  def __init__(self, label, preferred):
    self.label = label
    self.preferred = preferred
  def __cmp__(self, other):
    return cmp(self.label, other.label)

# If a multi-byte encoding is on this list, it is assumed to have a
# non-generated decoder implementation class. Otherwise, the JDK default
# decoder is used as a placeholder.
MULTI_BYTE_DECODER_IMPLEMENTED = [
  u"x-user-defined",
  u"replacement",
  u"big5",
]

MULTI_BYTE_ENCODER_IMPLEMENTED = [
  u"big5",
]

preferred = []

labels = []

data = json.load(open("../encoding/encodings.json", "r"))

indexes = json.load(open("../encoding/indexes.json", "r"))

single_byte = []

multi_byte = []

def to_camel_name(name):
  if name == u"iso-8859-8-i":
    return u"Iso8I"
  if name.startswith(u"iso-8859-"):
    return name.replace(u"iso-8859-", u"Iso")
  return name.title().replace(u"X-", u"").replace(u"-", u"").replace(u"_", u"")

def to_constant_name(name):
  return name.replace(u"-", u"_").upper()

# Encoding.java

for group in data:
  if group["heading"] == "Legacy single-byte encodings":
    single_byte = group["encodings"]
  else:
    multi_byte.extend(group["encodings"])
  for encoding in group["encodings"]:
    preferred.append(encoding["name"])
    for label in encoding["labels"]:
      labels.append(Label(label, encoding["name"]))

preferred.sort()
labels.sort()

label_file = open("src/nu/validator/encoding/Encoding.java", "w")
|
||||
|
||||
label_file.write("""/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
import java.nio.charset.IllegalCharsetNameException;
|
||||
import java.nio.charset.UnsupportedCharsetException;
|
||||
import java.nio.charset.spi.CharsetProvider;
|
||||
import java.util.Arrays;
|
||||
import java.util.Collections;
|
||||
import java.util.SortedMap;
|
||||
import java.util.TreeMap;
|
||||
|
||||
/**
|
||||
* Represents an <a href="https://encoding.spec.whatwg.org/#encoding">encoding</a>
|
||||
* as defined in the <a href="https://encoding.spec.whatwg.org/">Encoding
|
||||
* Standard</a>, provides access to each encoding defined in the Encoding
|
||||
* Standard via a static constant and provides the
|
||||
* "<a href="https://encoding.spec.whatwg.org/#concept-encoding-get">get an
|
||||
* encoding</a>" algorithm defined in the Encoding Standard.
|
||||
*
|
||||
* <p>This class inherits from {@link Charset} to allow the Encoding
|
||||
* Standard-compliant encodings to be used in contexts that support
|
||||
* <code>Charset</code> instances. However, by design, the Encoding
|
||||
* Standard-compliant encodings are not supplied via a {@link CharsetProvider}
|
||||
* and, therefore, are not available via and do not interfere with the static
|
||||
* methods provided by <code>Charset</code>. (This class provides methods of
|
||||
* the same name to hide each static method of <code>Charset</code> to help
|
||||
* avoid accidental calls to the static methods of the superclass when working
|
||||
* with Encoding Standard-compliant encodings.)
|
||||
*
|
||||
* <p>When an application needs to use a particular encoding, such as utf-8
|
||||
* or windows-1252, the corresponding constant, i.e.
|
||||
* {@link #UTF_8 Encoding.UTF_8} and {@link #WINDOWS_1252 Encoding.WINDOWS_1252}
|
||||
* respectively, should be used. However, when the application receives an
|
||||
* encoding label from external input, the method {@link #forName(String)
|
||||
* forName()} should be used to obtain the object representing the encoding
|
||||
* identified by the label. In contexts where labels that map to the
|
||||
* <a href="https://encoding.spec.whatwg.org/#replacement">replacement
|
||||
* encoding</a> should be treated as unknown, the method {@link
|
||||
* #forNameNoReplacement(String) forNameNoReplacement()} should be used instead.
|
||||
*
|
||||
*
|
||||
* @author hsivonen
|
||||
*/
|
||||
public abstract class Encoding extends Charset {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
""")
|
||||
|
||||
for label in labels:
  label_file.write(" \"%s\",\n" % label.label)
|
||||
|
||||
label_file.write(""" };
|
||||
|
||||
private static final Encoding[] ENCODINGS_FOR_LABELS = {
|
||||
""")
|
||||
|
||||
for label in labels:
  label_file.write(" %s.INSTANCE,\n" % to_camel_name(label.preferred))
|
||||
|
||||
label_file.write(""" };
|
||||
|
||||
private static final Encoding[] ENCODINGS = {
|
||||
""")
|
||||
|
||||
for label in preferred:
  label_file.write(" %s.INSTANCE,\n" % to_camel_name(label))
|
||||
|
||||
label_file.write(""" };
|
||||
|
||||
""")
|
||||
|
||||
for label in preferred:
  label_file.write("""    /**
     * The %s encoding.
     */
    public static final Encoding %s = %s.INSTANCE;

""" % (label, to_constant_name(label), to_camel_name(label)))
|
||||
|
||||
label_file.write("""
|
||||
private static SortedMap<String, Charset> encodings = null;
|
||||
|
||||
protected Encoding(String canonicalName, String[] aliases) {
|
||||
super(canonicalName, aliases);
|
||||
}
|
||||
|
||||
private enum State {
|
||||
HEAD, LABEL, TAIL
|
||||
};
|
||||
|
||||
public static Encoding forName(String label) {
|
||||
if (label == null) {
|
||||
throw new IllegalArgumentException("Label must not be null.");
|
||||
}
|
||||
if (label.length() == 0) {
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
// First try the fast path
|
||||
int index = Arrays.binarySearch(LABELS, label);
|
||||
if (index >= 0) {
|
||||
return ENCODINGS_FOR_LABELS[index];
|
||||
}
|
||||
// Else, slow path
|
||||
StringBuilder sb = new StringBuilder();
|
||||
State state = State.HEAD;
|
||||
for (int i = 0; i < label.length(); i++) {
|
||||
char c = label.charAt(i);
|
||||
if ((c == ' ') || (c == '\\n') || (c == '\\r') || (c == '\\t')
|
||||
|| (c == '\\u000C')) {
|
||||
if (state == State.LABEL) {
|
||||
state = State.TAIL;
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')) {
|
||||
switch (state) {
|
||||
case HEAD:
|
||||
state = State.LABEL;
|
||||
// Fall through
|
||||
case LABEL:
|
||||
sb.append(c);
|
||||
continue;
|
||||
case TAIL:
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
}
|
||||
if (c >= 'A' && c <= 'Z') {
|
||||
c += 0x20;
|
||||
switch (state) {
|
||||
case HEAD:
|
||||
state = State.LABEL;
|
||||
// Fall through
|
||||
case LABEL:
|
||||
sb.append(c);
|
||||
continue;
|
||||
case TAIL:
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
}
|
||||
if ((c == '-') || (c == '+') || (c == '.') || (c == ':')
|
||||
|| (c == '_')) {
|
||||
switch (state) {
|
||||
case LABEL:
|
||||
sb.append(c);
|
||||
continue;
|
||||
case HEAD:
|
||||
case TAIL:
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
}
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
index = Arrays.binarySearch(LABELS, sb.toString());
|
||||
if (index >= 0) {
|
||||
return ENCODINGS_FOR_LABELS[index];
|
||||
}
|
||||
throw new UnsupportedCharsetException(label);
|
||||
}
|
||||
|
||||
public static Encoding forNameNoReplacement(String label) {
|
||||
Encoding encoding = Encoding.forName(label);
|
||||
if (encoding == Encoding.REPLACEMENT) {
|
||||
throw new UnsupportedCharsetException(label);
|
||||
}
|
||||
return encoding;
|
||||
}
|
||||
|
||||
public static boolean isSupported(String label) {
|
||||
try {
|
||||
Encoding.forName(label);
|
||||
} catch (UnsupportedCharsetException e) {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
public static boolean isSupportedNoReplacement(String label) {
|
||||
try {
|
||||
Encoding.forNameNoReplacement(label);
|
||||
} catch (UnsupportedCharsetException e) {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
public static SortedMap<String, Charset> availableCharsets() {
|
||||
if (encodings == null) {
|
||||
TreeMap<String, Charset> map = new TreeMap<String, Charset>();
|
||||
for (Encoding encoding : ENCODINGS) {
|
||||
map.put(encoding.name(), encoding);
|
||||
}
|
||||
encodings = Collections.unmodifiableSortedMap(map);
|
||||
}
|
||||
return encodings;
|
||||
}
|
||||
|
||||
public static Encoding defaultCharset() {
|
||||
return WINDOWS_1252;
|
||||
}
|
||||
|
||||
@Override public boolean canEncode() {
|
||||
return false;
|
||||
}
|
||||
|
||||
@Override public boolean contains(Charset cs) {
|
||||
return false;
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
throw new UnsupportedOperationException("Encoder not implemented.");
|
||||
}
|
||||
}
|
||||
""")
|
||||
|
||||
label_file.close()
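
Review note: the slow path of `Encoding.forName` in the template above is hard to follow as an escaped Java string. Below is a minimal Python 3 sketch of the same normalization, for illustration only. The names `normalize_label`, `ASCII_WHITESPACE` and `PUNCTUATION` are hypothetical and not part of generate-encoding-data.py, and `ValueError` stands in for the Java exceptions.

```python
# Illustrative sketch only; not part of the generator script.
ASCII_WHITESPACE = " \n\r\t\x0c"   # same five characters the Java loop skips
PUNCTUATION = "-+.:_"              # allowed only inside a label

def normalize_label(label):
    """Trim ASCII whitespace, lowercase ASCII letters, reject anything else."""
    if label is None:
        raise ValueError("Label must not be null.")
    if not label:
        raise ValueError("Empty label")
    out = []
    state = "HEAD"                 # HEAD -> LABEL -> TAIL, as in the Java enum
    for c in label:
        if c in ASCII_WHITESPACE:
            if state == "LABEL":
                state = "TAIL"     # whitespace after the label starts the tail
            continue
        if "A" <= c <= "Z":
            c = chr(ord(c) + 0x20) # ASCII-only lowercasing
        is_alnum = ("a" <= c <= "z") or ("0" <= c <= "9")
        if is_alnum and state != "TAIL":
            state = "LABEL"
            out.append(c)
        elif c in PUNCTUATION and state == "LABEL":
            out.append(c)
        else:
            # Covers: any character after the tail began, punctuation before
            # the label started, and characters outside the allowed set.
            raise ValueError("Illegal charset label: %r" % label)
    return "".join(out)

print(normalize_label("  UTF-8\n"))  # utf-8
```

In the Java code the normalized string is then looked up in `LABELS` with a binary search. An all-whitespace label normalizes to the empty string here and would simply fail that lookup.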
|
||||
|
||||
# Single-byte encodings
|
||||
|
||||
for encoding in single_byte:
|
||||
name = encoding["name"]
|
||||
labels = encoding["labels"]
|
||||
labels.sort()
|
||||
class_name = to_camel_name(name)
|
||||
mapping_name = name
|
||||
if mapping_name == u"iso-8859-8-i":
|
||||
mapping_name = u"iso-8859-8"
|
||||
mapping = indexes[mapping_name]
|
||||
class_file = open("src/nu/validator/encoding/%s.java" % class_name, "w")
|
||||
class_file.write('''/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class ''')
|
||||
class_file.write(class_name)
|
||||
class_file.write(''' extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {''')
|
||||
fallible = False
|
||||
comma = False
|
||||
for code_point in mapping:
|
||||
# XXX should we have error reporting?
|
||||
if not code_point:
|
||||
code_point = 0xFFFD
|
||||
fallible = True
|
||||
if comma:
|
||||
class_file.write(",")
|
||||
class_file.write("\n '\u%04x'" % code_point);
|
||||
comma = True
|
||||
class_file.write('''
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {''')
|
||||
|
||||
comma = False
|
||||
for label in labels:
|
||||
if comma:
|
||||
class_file.write(",")
|
||||
class_file.write("\n \"%s\"" % label);
|
||||
comma = True
|
||||
class_file.write('''
|
||||
};
|
||||
|
||||
private static final String NAME = "''')
|
||||
class_file.write(name)
|
||||
class_file.write('''";
|
||||
|
||||
static final Encoding INSTANCE = new ''')
|
||||
class_file.write(class_name)
|
||||
class_file.write('''();
|
||||
|
||||
private ''')
|
||||
class_file.write(class_name)
|
||||
class_file.write('''() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new ''')
|
||||
class_file.write("Fallible" if fallible else "Infallible")
|
||||
class_file.write('''SingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
||||
''')
|
||||
class_file.close()
|
||||
|
||||
# Multi-byte encodings
|
||||
|
||||
for encoding in multi_byte:
|
||||
name = encoding["name"]
|
||||
labels = encoding["labels"]
|
||||
labels.sort()
|
||||
class_name = to_camel_name(name)
|
||||
class_file = open("src/nu/validator/encoding/%s.java" % class_name, "w")
|
||||
class_file.write('''/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class ''')
|
||||
class_file.write(class_name)
|
||||
class_file.write(''' extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {''')
|
||||
|
||||
comma = False
|
||||
for label in labels:
|
||||
if comma:
|
||||
class_file.write(",")
|
||||
class_file.write("\n \"%s\"" % label);
|
||||
comma = True
|
||||
class_file.write('''
|
||||
};
|
||||
|
||||
private static final String NAME = "''')
|
||||
class_file.write(name)
|
||||
class_file.write('''";
|
||||
|
||||
static final ''')
|
||||
class_file.write(class_name)
|
||||
class_file.write(''' INSTANCE = new ''')
|
||||
class_file.write(class_name)
|
||||
class_file.write('''();
|
||||
|
||||
private ''')
|
||||
class_file.write(class_name)
|
||||
class_file.write('''() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
''')
|
||||
if name == "gbk":
|
||||
class_file.write('''return Charset.forName("gb18030").newDecoder();''')
|
||||
elif name in MULTI_BYTE_DECODER_IMPLEMENTED:
|
||||
class_file.write("return new %sDecoder(this);" % class_name)
|
||||
else:
|
||||
class_file.write('''return Charset.forName(NAME).newDecoder();''')
|
||||
class_file.write('''
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
''')
|
||||
if name in MULTI_BYTE_ENCODER_IMPLEMENTED:
|
||||
class_file.write("return new %sEncoder(this);" % class_name)
|
||||
else:
|
||||
class_file.write('''return Charset.forName(NAME).newEncoder();''')
|
||||
class_file.write('''
|
||||
}
|
||||
}
|
||||
''')
|
||||
class_file.close()
|
||||
|
||||
# Big5
|
||||
|
||||
def null_to_zero(code_point):
  if not code_point:
    code_point = 0
  return code_point

index = []

for code_point in indexes["big5"]:
  index.append(null_to_zero(code_point))
|
||||
|
||||
# There are four major gaps consisting of more than 4 consecutive invalid pointers
gaps = []
consecutive = 0
consecutive_start = 0
offset = 0
for code_point in index:
  if code_point == 0:
    if consecutive == 0:
      consecutive_start = offset
    consecutive += 1
  else:
    if consecutive > 4:
      gaps.append((consecutive_start, consecutive_start + consecutive))
    consecutive = 0
  offset += 1
|
||||
|
||||
def invert_ranges(ranges, cap):
  inverted = []
  invert_start = 0
  for (start, end) in ranges:
    if start != 0:
      inverted.append((invert_start, start))
    invert_start = end
  inverted.append((invert_start, cap))
  return inverted

cap = len(index)
ranges = invert_ranges(gaps, cap)
|
||||
|
||||
# Now compute a compressed lookup table for astralness

gaps = []
consecutive = 0
consecutive_start = 0
offset = 0
for code_point in index:
  if code_point <= 0xFFFF:
    if consecutive == 0:
      consecutive_start = offset
    consecutive += 1
  else:
    if consecutive > 40:
      gaps.append((consecutive_start, consecutive_start + consecutive))
    consecutive = 0
  offset += 1

astral_ranges = invert_ranges(gaps, cap)
|
||||
|
||||
class_file = open("src/nu/validator/encoding/Big5Data.java", "w")
|
||||
class_file.write('''/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
final class Big5Data {
|
||||
|
||||
private static final String ASTRALNESS = "''')
|
||||
|
||||
bits = []
for (low, high) in astral_ranges:
  for i in xrange(low, high):
    bits.append(1 if index[i] > 0xFFFF else 0)
# pad length to multiple of 16
for j in xrange(16 - (len(bits) % 16)):
  bits.append(0)

i = 0
while i < len(bits):
  accu = 0
  for j in xrange(16):
    accu |= bits[i + j] << j
  if accu == 0x22:
    class_file.write('\\"')
  else:
    class_file.write('\\u%04X' % accu)
  i += 16
|
||||
|
||||
class_file.write('''";
|
||||
|
||||
''')
|
||||
|
||||
j = 0
|
||||
for (low, high) in ranges:
|
||||
class_file.write(''' private static final String TABLE%d = "''' % j)
|
||||
for i in xrange(low, high):
|
||||
class_file.write('\\u%04X' % (index[i] & 0xFFFF))
|
||||
class_file.write('''";
|
||||
|
||||
''')
|
||||
j += 1
|
||||
|
||||
class_file.write(''' private static boolean readBit(int i) {
|
||||
return (ASTRALNESS.charAt(i >> 4) & (1 << (i & 0xF))) != 0;
|
||||
}
|
||||
|
||||
static char lowBits(int pointer) {
|
||||
''')
|
||||
|
||||
j = 0
|
||||
for (low, high) in ranges:
|
||||
class_file.write(''' if (pointer < %d) {
|
||||
return '\\u0000';
|
||||
}
|
||||
if (pointer < %d) {
|
||||
return TABLE%d.charAt(pointer - %d);
|
||||
}
|
||||
''' % (low, high, j, low))
|
||||
j += 1
|
||||
|
||||
class_file.write(''' return '\\u0000';
|
||||
}
|
||||
|
||||
static boolean isAstral(int pointer) {
|
||||
''')
|
||||
|
||||
base = 0
|
||||
for (low, high) in astral_ranges:
|
||||
if high - low == 1:
|
||||
class_file.write(''' if (pointer < %d) {
|
||||
return false;
|
||||
}
|
||||
if (pointer == %d) {
|
||||
return true;
|
||||
}
|
||||
''' % (low, low))
|
||||
else:
|
||||
class_file.write(''' if (pointer < %d) {
|
||||
return false;
|
||||
}
|
||||
if (pointer < %d) {
|
||||
return readBit(%d + (pointer - %d));
|
||||
}
|
||||
''' % (low, high, base, low))
|
||||
base += (high - low)
|
||||
|
||||
class_file.write(''' return false;
|
||||
}
|
||||
|
||||
public static int findPointer(char lowBits, boolean isAstral) {
|
||||
if (!isAstral) {
|
||||
switch (lowBits) {
|
||||
''')
|
||||
|
||||
hkscs_bound = (0xA1 - 0x81) * 157
|
||||
|
||||
prefer_last = [
|
||||
0x2550,
|
||||
0x255E,
|
||||
0x2561,
|
||||
0x256A,
|
||||
0x5341,
|
||||
0x5345,
|
||||
]
|
||||
|
||||
for code_point in prefer_last:
|
||||
# Python lists don't have .rindex() :-(
|
||||
for i in xrange(len(index) - 1, -1, -1):
|
||||
candidate = index[i]
|
||||
if candidate == code_point:
|
||||
class_file.write(''' case 0x%04X:
|
||||
return %d;
|
||||
''' % (code_point, i))
|
||||
break
|
||||
|
||||
class_file.write(''' default:
|
||||
break;
|
||||
}
|
||||
}''')
|
||||
|
||||
j = 0
|
||||
for (low, high) in ranges:
|
||||
if high > hkscs_bound:
|
||||
start = 0
|
||||
if low <= hkscs_bound and hkscs_bound < high:
|
||||
# This is the first range we don't ignore and the
|
||||
# range that contains the first non-HKSCS pointer.
|
||||
# Avoid searching HKSCS.
|
||||
start = hkscs_bound - low
|
||||
class_file.write('''
|
||||
for (int i = %d; i < TABLE%d.length(); i++) {
|
||||
if (TABLE%d.charAt(i) == lowBits) {
|
||||
int pointer = i + %d;
|
||||
if (isAstral == isAstral(pointer)) {
|
||||
return pointer;
|
||||
}
|
||||
}
|
||||
}''' % (start, j, j, low))
|
||||
j += 1
|
||||
|
||||
class_file.write('''
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
''')
|
||||
class_file.close()
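
Review note: the Big5 section above compresses the index in two steps. It drops long runs of unused pointers by keeping only the inverted ranges, then packs the "is astral" flags into 16-bit units that the generated `readBit` reads back. Below is a minimal Python 3 sketch of that idea on toy data. `find_gaps`, `pack_bits` and `read_bit` are hypothetical helper names, not functions in generate-encoding-data.py; `invert_ranges` follows the logic of the script's function of the same name.

```python
# Illustrative sketch only; toy data, not the real Big5 index.

def find_gaps(values, is_gap, min_run):
    """Return (start, end) for each run longer than min_run where is_gap holds.

    Like the loops in the script, a run that reaches the end of the list is
    not recorded, because a gap is only emitted when a non-gap value follows.
    """
    gaps = []
    run_start = 0
    run_length = 0
    for offset, value in enumerate(values):
        if is_gap(value):
            if run_length == 0:
                run_start = offset
            run_length += 1
        else:
            if run_length > min_run:
                gaps.append((run_start, run_start + run_length))
            run_length = 0
    return gaps

def invert_ranges(ranges, cap):
    """Turn a list of gaps into the list of ranges between them, up to cap."""
    inverted = []
    invert_start = 0
    for (start, end) in ranges:
        if start != 0:
            inverted.append((invert_start, start))
        invert_start = end
    inverted.append((invert_start, cap))
    return inverted

def pack_bits(bits):
    """Pack 0/1 flags into 16-bit units, least significant bit first."""
    bits = list(bits)
    # Mirrors the script: this always pads, so an already aligned input
    # gains one extra all-zero unit.
    bits.extend([0] * (16 - len(bits) % 16))
    units = []
    for i in range(0, len(bits), 16):
        accu = 0
        for j in range(16):
            accu |= bits[i + j] << j
        units.append(accu)
    return units

def read_bit(units, i):
    """Python counterpart of the generated Java readBit(int)."""
    return (units[i >> 4] & (1 << (i & 0xF))) != 0

toy_index = [0x4E00, 0, 0, 0, 0, 0, 0, 0x20000, 0x4E01]
toy_gaps = find_gaps(toy_index, lambda v: v == 0, 4)
print(toy_gaps)                                  # [(1, 7)]
print(invert_ranges(toy_gaps, len(toy_index)))   # [(0, 1), (7, 9)]
```

Each inverted range becomes one `TABLEn` string in the generated `Big5Data` class, and `lowBits` subtracts the start of the range from the pointer to index into it.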
|
|
@ -0,0 +1,12 @@
|
|||
<module>
  <inherits name="com.google.gwt.core.Core"/>
  <inherits name="com.google.gwt.user.User"/>
  <super-source path="translatable"/>
  <source path="annotation"/>
  <source path="common"/>
  <source path="impl"/>
  <source path="gwt"/>
  <set-property name="user.agent" value="gecko1_8"/>
  <entry-point class="nu.validator.htmlparser.gwt.HtmlParserModule"/>
  <add-linker name="sso"/>
</module>
|
|
@ -0,0 +1,477 @@
|
|||
/*
|
||||
* Copyright (c) 2007 Henri Sivonen
|
||||
* Copyright (c) 2008-2009 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.htmlparser.gwt;
|
||||
|
||||
import java.util.LinkedList;
|
||||
|
||||
import nu.validator.htmlparser.common.DocumentMode;
|
||||
import nu.validator.htmlparser.impl.CoalescingTreeBuilder;
|
||||
import nu.validator.htmlparser.impl.HtmlAttributes;
|
||||
|
||||
import org.xml.sax.SAXException;
|
||||
|
||||
import com.google.gwt.core.client.JavaScriptException;
|
||||
import com.google.gwt.core.client.JavaScriptObject;
|
||||
|
||||
class BrowserTreeBuilder extends CoalescingTreeBuilder<JavaScriptObject> {
|
||||
|
||||
private JavaScriptObject document;
|
||||
|
||||
private JavaScriptObject script;
|
||||
|
||||
private JavaScriptObject placeholder;
|
||||
|
||||
private boolean readyToRun;
|
||||
|
||||
private final LinkedList<ScriptHolder> scriptStack = new LinkedList<ScriptHolder>();
|
||||
|
||||
private class ScriptHolder {
|
||||
private final JavaScriptObject script;
|
||||
|
||||
private final JavaScriptObject placeholder;
|
||||
|
||||
/**
 * @param script the real script element being built
 * @param placeholder the stand-in element inserted into the tree for it
 */
|
||||
public ScriptHolder(JavaScriptObject script,
|
||||
JavaScriptObject placeholder) {
|
||||
this.script = script;
|
||||
this.placeholder = placeholder;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the script.
|
||||
*
|
||||
* @return the script
|
||||
*/
|
||||
public JavaScriptObject getScript() {
|
||||
return script;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the placeholder.
|
||||
*
|
||||
* @return the placeholder
|
||||
*/
|
||||
public JavaScriptObject getPlaceholder() {
|
||||
return placeholder;
|
||||
}
|
||||
}
|
||||
|
||||
protected BrowserTreeBuilder(JavaScriptObject document) {
|
||||
super();
|
||||
this.document = document;
|
||||
installExplorerCreateElementNS(document);
|
||||
}
|
||||
|
||||
private static native boolean installExplorerCreateElementNS(
|
||||
JavaScriptObject doc) /*-{
|
||||
if (!doc.createElementNS) {
|
||||
doc.createElementNS = function (uri, local) {
|
||||
if ("http://www.w3.org/1999/xhtml" == uri) {
|
||||
return doc.createElement(local);
|
||||
} else if ("http://www.w3.org/1998/Math/MathML" == uri) {
|
||||
if (!doc.mathplayerinitialized) {
|
||||
var obj = document.createElement("object");
|
||||
obj.setAttribute("id", "mathplayer");
|
||||
obj.setAttribute("classid", "clsid:32F66A20-7614-11D4-BD11-00104BD3F987");
|
||||
document.getElementsByTagName("head")[0].appendChild(obj);
|
||||
document.namespaces.add("m", "http://www.w3.org/1998/Math/MathML", "#mathplayer");
|
||||
doc.mathplayerinitialized = true;
|
||||
}
|
||||
return doc.createElement("m:" + local);
|
||||
} else if ("http://www.w3.org/2000/svg" == uri) {
|
||||
if (!doc.renesisinitialized) {
|
||||
var obj = document.createElement("object");
|
||||
obj.setAttribute("id", "renesis");
|
||||
obj.setAttribute("classid", "clsid:AC159093-1683-4BA2-9DCF-0C350141D7F2");
|
||||
document.getElementsByTagName("head")[0].appendChild(obj);
|
||||
document.namespaces.add("s", "http://www.w3.org/2000/svg", "#renesis");
|
||||
doc.renesisinitialized = true;
|
||||
}
|
||||
return doc.createElement("s:" + local);
|
||||
} else {
|
||||
// throw
|
||||
}
|
||||
}
|
||||
}
|
||||
}-*/;
|
||||
|
||||
private static native boolean hasAttributeNS(JavaScriptObject element,
|
||||
String uri, String localName) /*-{
|
||||
return element.hasAttributeNS(uri, localName);
|
||||
}-*/;
|
||||
|
||||
private static native void setAttributeNS(JavaScriptObject element,
|
||||
String uri, String localName, String value) /*-{
|
||||
element.setAttributeNS(uri, localName, value);
|
||||
}-*/;
|
||||
|
||||
@Override protected void addAttributesToElement(JavaScriptObject element,
|
||||
HtmlAttributes attributes) throws SAXException {
|
||||
try {
|
||||
for (int i = 0; i < attributes.getLength(); i++) {
|
||||
String localName = attributes.getLocalNameNoBoundsCheck(i);
|
||||
String uri = attributes.getURINoBoundsCheck(i);
|
||||
if (!hasAttributeNS(element, uri, localName)) {
|
||||
setAttributeNS(element, uri, localName,
|
||||
attributes.getValueNoBoundsCheck(i));
|
||||
}
|
||||
}
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
private static native void appendChild(JavaScriptObject parent,
|
||||
JavaScriptObject child) /*-{
|
||||
parent.appendChild(child);
|
||||
}-*/;
|
||||
|
||||
private static native JavaScriptObject createTextNode(JavaScriptObject doc,
|
||||
String text) /*-{
|
||||
return doc.createTextNode(text);
|
||||
}-*/;
|
||||
|
||||
private static native JavaScriptObject getLastChild(JavaScriptObject node) /*-{
|
||||
return node.lastChild;
|
||||
}-*/;
|
||||
|
||||
private static native void extendTextNode(JavaScriptObject node, String text) /*-{
|
||||
node.data += text;
|
||||
}-*/;
|
||||
|
||||
@Override protected void appendCharacters(JavaScriptObject parent,
|
||||
String text) throws SAXException {
|
||||
try {
|
||||
if (parent == placeholder) {
|
||||
appendChild(script, createTextNode(document, text));
|
||||
|
||||
}
|
||||
JavaScriptObject lastChild = getLastChild(parent);
|
||||
if (lastChild != null && getNodeType(lastChild) == 3) {
|
||||
extendTextNode(lastChild, text);
|
||||
return;
|
||||
}
|
||||
appendChild(parent, createTextNode(document, text));
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
private static native boolean hasChildNodes(JavaScriptObject element) /*-{
|
||||
return element.hasChildNodes();
|
||||
}-*/;
|
||||
|
||||
private static native JavaScriptObject getFirstChild(
|
||||
JavaScriptObject element) /*-{
|
||||
return element.firstChild;
|
||||
}-*/;
|
||||
|
||||
@Override protected void appendChildrenToNewParent(
|
||||
JavaScriptObject oldParent, JavaScriptObject newParent)
|
||||
throws SAXException {
|
||||
try {
|
||||
while (hasChildNodes(oldParent)) {
|
||||
appendChild(newParent, getFirstChild(oldParent));
|
||||
}
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
private static native JavaScriptObject createComment(JavaScriptObject doc,
|
||||
String text) /*-{
|
||||
return doc.createComment(text);
|
||||
}-*/;
|
||||
|
||||
@Override protected void appendComment(JavaScriptObject parent,
|
||||
String comment) throws SAXException {
|
||||
try {
|
||||
if (parent == placeholder) {
|
||||
appendChild(script, createComment(document, comment));
|
||||
}
|
||||
appendChild(parent, createComment(document, comment));
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
@Override protected void appendCommentToDocument(String comment)
|
||||
throws SAXException {
|
||||
try {
|
||||
appendChild(document, createComment(document, comment));
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
private static native JavaScriptObject createElementNS(
|
||||
JavaScriptObject doc, String ns, String local) /*-{
|
||||
return doc.createElementNS(ns, local);
|
||||
}-*/;
|
||||
|
||||
@Override protected JavaScriptObject createElement(String ns, String name,
|
||||
HtmlAttributes attributes) throws SAXException {
|
||||
try {
|
||||
JavaScriptObject rv = createElementNS(document, ns, name);
|
||||
for (int i = 0; i < attributes.getLength(); i++) {
|
||||
setAttributeNS(rv, attributes.getURINoBoundsCheck(i),
|
||||
attributes.getLocalNameNoBoundsCheck(i),
|
||||
attributes.getValueNoBoundsCheck(i));
|
||||
}
|
||||
|
||||
if ("script" == name) {
|
||||
if (placeholder != null) {
|
||||
scriptStack.addLast(new ScriptHolder(script, placeholder));
|
||||
}
|
||||
script = rv;
|
||||
placeholder = createElementNS(document,
|
||||
"http://n.validator.nu/placeholder/", "script");
|
||||
rv = placeholder;
|
||||
for (int i = 0; i < attributes.getLength(); i++) {
|
||||
setAttributeNS(rv, attributes.getURINoBoundsCheck(i),
|
||||
attributes.getLocalNameNoBoundsCheck(i),
|
||||
attributes.getValueNoBoundsCheck(i));
|
||||
}
|
||||
}
|
||||
|
||||
return rv;
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
throw new RuntimeException("Unreachable");
|
||||
}
|
||||
}
|
||||
|
||||
@Override protected JavaScriptObject createHtmlElementSetAsRoot(
|
||||
HtmlAttributes attributes) throws SAXException {
|
||||
try {
|
||||
JavaScriptObject rv = createElementNS(document,
|
||||
"http://www.w3.org/1999/xhtml", "html");
|
||||
for (int i = 0; i < attributes.getLength(); i++) {
|
||||
setAttributeNS(rv, attributes.getURINoBoundsCheck(i),
|
||||
attributes.getLocalNameNoBoundsCheck(i),
|
||||
attributes.getValueNoBoundsCheck(i));
|
||||
}
|
||||
appendChild(document, rv);
|
||||
return rv;
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
throw new RuntimeException("Unreachable");
|
||||
}
|
||||
}
|
||||
|
||||
private static native JavaScriptObject getParentNode(
|
||||
JavaScriptObject element) /*-{
|
||||
return element.parentNode;
|
||||
}-*/;
|
||||
|
||||
@Override protected void appendElement(JavaScriptObject child,
|
||||
JavaScriptObject newParent) throws SAXException {
|
||||
try {
|
||||
if (newParent == placeholder) {
|
||||
appendChild(script, cloneNodeDeep(child));
|
||||
}
|
||||
appendChild(newParent, child);
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
@Override protected boolean hasChildren(JavaScriptObject element)
|
||||
throws SAXException {
|
||||
try {
|
||||
return hasChildNodes(element);
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
throw new RuntimeException("Unreachable");
|
||||
}
|
||||
}
|
||||
|
||||
private static native void insertBeforeNative(JavaScriptObject parent,
|
||||
JavaScriptObject child, JavaScriptObject sibling) /*-{
|
||||
parent.insertBefore(child, sibling);
|
||||
}-*/;
|
||||
|
||||
private static native int getNodeType(JavaScriptObject node) /*-{
|
||||
return node.nodeType;
|
||||
}-*/;
|
||||
|
||||
private static native JavaScriptObject cloneNodeDeep(JavaScriptObject node) /*-{
|
||||
return node.cloneNode(true);
|
||||
}-*/;
|
||||
|
||||
/**
|
||||
* Returns the document.
|
||||
*
|
||||
* @return the document
|
||||
*/
|
||||
JavaScriptObject getDocument() {
|
||||
JavaScriptObject rv = document;
|
||||
document = null;
|
||||
return rv;
|
||||
}
|
||||
|
||||
private static native JavaScriptObject createDocumentFragment(
|
||||
JavaScriptObject doc) /*-{
|
||||
return doc.createDocumentFragment();
|
||||
}-*/;
|
||||
|
||||
JavaScriptObject getDocumentFragment() {
|
||||
JavaScriptObject rv = createDocumentFragment(document);
|
||||
JavaScriptObject rootElt = getFirstChild(document);
|
||||
while (hasChildNodes(rootElt)) {
|
||||
appendChild(rv, getFirstChild(rootElt));
|
||||
}
|
||||
document = null;
|
||||
return rv;
|
||||
}
|
||||
|
||||
/**
 * @see nu.validator.htmlparser.impl.TreeBuilder#createElement(String,
 *      String, HtmlAttributes, Object)
 */
|
||||
@Override protected JavaScriptObject createElement(String ns, String name,
|
||||
HtmlAttributes attributes, JavaScriptObject form)
|
||||
throws SAXException {
|
||||
try {
|
||||
JavaScriptObject rv = createElement(ns, name, attributes);
|
||||
// rv.setUserData("nu.validator.form-pointer", form, null);
|
||||
return rv;
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/**
 * @see nu.validator.htmlparser.impl.TreeBuilder#start(boolean)
 */
|
||||
@Override protected void start(boolean fragment) throws SAXException {
|
||||
script = null;
|
||||
placeholder = null;
|
||||
readyToRun = false;
|
||||
}
|
||||
|
||||
protected void documentMode(DocumentMode mode, String publicIdentifier,
|
||||
String systemIdentifier, boolean html4SpecificAdditionalErrorChecks)
|
||||
throws SAXException {
|
||||
// document.setUserData("nu.validator.document-mode", mode, null);
|
||||
}
|
||||
|
||||
/**
|
||||
* @see nu.validator.htmlparser.impl.TreeBuilder#elementPopped(java.lang.String,
|
||||
* java.lang.String, java.lang.Object)
|
||||
*/
|
||||
@Override protected void elementPopped(String ns, String name,
|
||||
JavaScriptObject node) throws SAXException {
|
||||
if (node == placeholder) {
|
||||
readyToRun = true;
|
||||
requestSuspension();
|
||||
}
|
||||
}
|
||||
|
||||
private static native void replace(JavaScriptObject oldNode,
|
||||
JavaScriptObject newNode) /*-{
|
||||
oldNode.parentNode.replaceChild(newNode, oldNode);
|
||||
}-*/;
|
||||
|
||||
private static native JavaScriptObject getPreviousSibling(JavaScriptObject node) /*-{
|
||||
return node.previousSibling;
|
||||
}-*/;
|
||||
|
||||
void maybeRunScript() {
|
||||
if (readyToRun) {
|
||||
readyToRun = false;
|
||||
replace(placeholder, script);
|
||||
if (scriptStack.isEmpty()) {
|
||||
script = null;
|
||||
placeholder = null;
|
||||
} else {
|
||||
ScriptHolder scriptHolder = scriptStack.removeLast();
|
||||
script = scriptHolder.getScript();
|
||||
placeholder = scriptHolder.getPlaceholder();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@Override protected void insertFosterParentedCharacters(String text,
|
||||
JavaScriptObject table, JavaScriptObject stackParent)
|
||||
throws SAXException {
|
||||
try {
|
||||
JavaScriptObject parent = getParentNode(table);
|
||||
if (parent != null) { // always an element if not null
|
||||
JavaScriptObject previousSibling = getPreviousSibling(table);
|
||||
if (previousSibling != null
|
||||
&& getNodeType(previousSibling) == 3) {
|
||||
extendTextNode(previousSibling, text);
|
||||
return;
|
||||
}
|
||||
insertBeforeNative(parent, createTextNode(document, text), table);
|
||||
return;
|
||||
}
|
||||
JavaScriptObject lastChild = getLastChild(stackParent);
|
||||
if (lastChild != null && getNodeType(lastChild) == 3) {
|
||||
extendTextNode(lastChild, text);
|
||||
return;
|
||||
}
|
||||
appendChild(stackParent, createTextNode(document, text));
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
@Override protected void insertFosterParentedChild(JavaScriptObject child,
|
||||
JavaScriptObject table, JavaScriptObject stackParent)
|
||||
throws SAXException {
|
||||
JavaScriptObject parent = getParentNode(table);
|
||||
try {
|
||||
if (parent != null && getNodeType(parent) == 1) {
|
||||
insertBeforeNative(parent, child, table);
|
||||
} else {
|
||||
appendChild(stackParent, child);
|
||||
}
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
|
||||
private static native void removeChild(JavaScriptObject parent,
|
||||
JavaScriptObject child) /*-{
|
||||
parent.removeChild(child);
|
||||
}-*/;
|
||||
|
||||
@Override protected void detachFromParent(JavaScriptObject element)
|
||||
throws SAXException {
|
||||
try {
|
||||
JavaScriptObject parent = getParentNode(element);
|
||||
if (parent != null) {
|
||||
removeChild(parent, element);
|
||||
}
|
||||
} catch (JavaScriptException e) {
|
||||
fatal(e);
|
||||
}
|
||||
}
|
||||
}
|
|
@@ -0,0 +1,265 @@
|
|||
/*
|
||||
* Copyright (c) 2007 Henri Sivonen
|
||||
* Copyright (c) 2007-2008 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.htmlparser.gwt;
|
||||
|
||||
import java.util.LinkedList;
|
||||
|
||||
import nu.validator.htmlparser.common.XmlViolationPolicy;
|
||||
import nu.validator.htmlparser.impl.ErrorReportingTokenizer;
|
||||
import nu.validator.htmlparser.impl.Tokenizer;
|
||||
import nu.validator.htmlparser.impl.UTF16Buffer;
|
||||
|
||||
import org.xml.sax.ErrorHandler;
|
||||
import org.xml.sax.SAXException;
|
||||
import org.xml.sax.SAXParseException;
|
||||
|
||||
import com.google.gwt.core.client.JavaScriptObject;
|
||||
import com.google.gwt.user.client.Timer;
|
||||
|
||||
/**
|
||||
* This class implements an HTML5 parser that exposes data through the DOM
|
||||
* interface.
|
||||
*
|
||||
* <p>By default, when using the constructor without arguments,
|
||||
* this parser treats XML 1.0-incompatible infosets as fatal errors.
|
||||
* This corresponds to
|
||||
* <code>FATAL</code> as the general XML violation policy. To make the parser
|
||||
* support non-conforming HTML fully per the HTML 5 spec while on the other
|
||||
* hand potentially violating the DOM API contract, set the general XML
|
||||
* violation policy to <code>ALLOW</code>. This does not work with a standard
|
||||
* DOM implementation. Handling all input without fatal errors and without
|
||||
* violating the DOM API contract is possible by setting
|
||||
* the general XML violation policy to <code>ALTER_INFOSET</code>. <em>This
|
||||
* makes the parser non-conforming</em> but is probably the most useful
|
||||
* setting for most applications.
|
||||
*
|
||||
* <p>The doctype is not represented in the tree.
|
||||
*
|
||||
* <p>The document mode is represented as user data <code>DocumentMode</code>
|
||||
* object with the key <code>nu.validator.document-mode</code> on the document
|
||||
* node.
|
||||
*
|
||||
* <p>The form pointer is also stored as user data with the key
|
||||
* <code>nu.validator.form-pointer</code>.
|
||||
*
|
||||
* @version $Id: HtmlDocumentBuilder.java 255 2008-05-29 08:57:38Z hsivonen $
|
||||
* @author hsivonen
|
||||
*/
|
||||
public class HtmlParser {
|
||||
|
||||
private static final int CHUNK_SIZE = 512;
|
||||
|
||||
private final Tokenizer tokenizer;
|
||||
|
||||
private final BrowserTreeBuilder domTreeBuilder;
|
||||
|
||||
private final StringBuilder documentWriteBuffer = new StringBuilder();
|
||||
|
||||
private ErrorHandler errorHandler;
|
||||
|
||||
private UTF16Buffer stream;
|
||||
|
||||
private int streamLength;
|
||||
|
||||
private boolean lastWasCR;
|
||||
|
||||
private boolean ending;
|
||||
|
||||
private ParseEndListener parseEndListener;
|
||||
|
||||
private final LinkedList<UTF16Buffer> bufferStack = new LinkedList<UTF16Buffer>();
|
||||
|
||||
/**
|
||||
* Instantiates the parser
|
||||
*
|
||||
* @param document
|
||||
* the DOM document that receives the parsed tree
|
||||
*/
|
||||
public HtmlParser(JavaScriptObject document) {
|
||||
this.domTreeBuilder = new BrowserTreeBuilder(document);
|
||||
this.tokenizer = new ErrorReportingTokenizer(domTreeBuilder);
|
||||
this.domTreeBuilder.setNamePolicy(XmlViolationPolicy.ALTER_INFOSET);
|
||||
this.tokenizer.setCommentPolicy(XmlViolationPolicy.ALTER_INFOSET);
|
||||
this.tokenizer.setContentNonXmlCharPolicy(XmlViolationPolicy.ALTER_INFOSET);
|
||||
this.tokenizer.setContentSpacePolicy(XmlViolationPolicy.ALTER_INFOSET);
|
||||
this.tokenizer.setNamePolicy(XmlViolationPolicy.ALTER_INFOSET);
|
||||
this.tokenizer.setXmlnsPolicy(XmlViolationPolicy.ALTER_INFOSET);
|
||||
}
|
||||
|
||||
/**
|
||||
* Parses a document from a string.
|
||||
* @param source the markup to parse
|
||||
* @param callback the listener notified when parsing ends
|
||||
* @see javax.xml.parsers.DocumentBuilder#parse(org.xml.sax.InputSource)
|
||||
*/
|
||||
public void parse(String source, ParseEndListener callback) throws SAXException {
|
||||
parseEndListener = callback;
|
||||
domTreeBuilder.setFragmentContext(null);
|
||||
tokenize(source, null);
|
||||
}
|
||||
|
||||
/**
|
||||
* @param source the markup to tokenize
|
||||
* @param context the fragment context, or <code>null</code> when parsing a whole document
|
||||
* @throws SAXException
|
||||
*/
|
||||
private void tokenize(String source, String context) throws SAXException {
|
||||
lastWasCR = false;
|
||||
ending = false;
|
||||
documentWriteBuffer.setLength(0);
|
||||
streamLength = source.length();
|
||||
stream = new UTF16Buffer(source.toCharArray(), 0,
|
||||
(streamLength < CHUNK_SIZE ? streamLength : CHUNK_SIZE));
|
||||
bufferStack.clear();
|
||||
push(stream);
|
||||
domTreeBuilder.setFragmentContext(context == null ? null : context.intern());
|
||||
tokenizer.start();
|
||||
pump();
|
||||
}
|
||||
|
||||
private void pump() throws SAXException {
|
||||
if (ending) {
|
||||
tokenizer.end();
|
||||
domTreeBuilder.getDocument(); // drops the internal reference
|
||||
parseEndListener.parseComplete();
|
||||
// Don't schedule timeout
|
||||
return;
|
||||
}
|
||||
|
||||
int docWriteLen = documentWriteBuffer.length();
|
||||
if (docWriteLen > 0) {
|
||||
char[] newBuf = new char[docWriteLen];
|
||||
documentWriteBuffer.getChars(0, docWriteLen, newBuf, 0);
|
||||
push(new UTF16Buffer(newBuf, 0, docWriteLen));
|
||||
documentWriteBuffer.setLength(0);
|
||||
}
|
||||
|
||||
for (;;) {
|
||||
UTF16Buffer buffer = peek();
|
||||
if (!buffer.hasMore()) {
|
||||
if (buffer == stream) {
|
||||
if (buffer.getEnd() == streamLength) {
|
||||
// Stop parsing
|
||||
tokenizer.eof();
|
||||
ending = true;
|
||||
break;
|
||||
} else {
|
||||
int newEnd = buffer.getStart() + CHUNK_SIZE;
|
||||
buffer.setEnd(newEnd < streamLength ? newEnd
|
||||
: streamLength);
|
||||
continue;
|
||||
}
|
||||
} else {
|
||||
pop();
|
||||
continue;
|
||||
}
|
||||
}
|
||||
// now we have a non-empty buffer
|
||||
buffer.adjust(lastWasCR);
|
||||
lastWasCR = false;
|
||||
if (buffer.hasMore()) {
|
||||
lastWasCR = tokenizer.tokenizeBuffer(buffer);
|
||||
domTreeBuilder.maybeRunScript();
|
||||
break;
|
||||
} else {
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
// schedule
|
||||
Timer timer = new Timer() {
|
||||
|
||||
@Override public void run() {
|
||||
try {
|
||||
pump();
|
||||
} catch (SAXException e) {
|
||||
ending = true;
|
||||
if (errorHandler != null) {
|
||||
try {
|
||||
errorHandler.fatalError(new SAXParseException(
|
||||
e.getMessage(), null, null, -1, -1, e));
|
||||
} catch (SAXException e1) {
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
};
|
||||
timer.schedule(1);
|
||||
}
|
||||
|
||||
private void push(UTF16Buffer buffer) {
|
||||
bufferStack.addLast(buffer);
|
||||
}
|
||||
|
||||
private UTF16Buffer peek() {
|
||||
return bufferStack.getLast();
|
||||
}
|
||||
|
||||
private void pop() {
|
||||
bufferStack.removeLast();
|
||||
}
|
||||
|
||||
public void documentWrite(String text) throws SAXException {
|
||||
UTF16Buffer buffer = new UTF16Buffer(text.toCharArray(), 0, text.length());
|
||||
while (buffer.hasMore()) {
|
||||
buffer.adjust(lastWasCR);
|
||||
lastWasCR = false;
|
||||
if (buffer.hasMore()) {
|
||||
lastWasCR = tokenizer.tokenizeBuffer(buffer);
|
||||
domTreeBuilder.maybeRunScript();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* @see javax.xml.parsers.DocumentBuilder#setErrorHandler(org.xml.sax.ErrorHandler)
|
||||
*/
|
||||
public void setErrorHandler(ErrorHandler errorHandler) {
|
||||
this.errorHandler = errorHandler;
|
||||
domTreeBuilder.setErrorHandler(errorHandler);
|
||||
tokenizer.setErrorHandler(errorHandler);
|
||||
}
|
||||
|
||||
/**
|
||||
* Sets whether comment nodes appear in the tree.
|
||||
* @param ignoreComments <code>true</code> to ignore comments
|
||||
* @see nu.validator.htmlparser.impl.TreeBuilder#setIgnoringComments(boolean)
|
||||
*/
|
||||
public void setIgnoringComments(boolean ignoreComments) {
|
||||
domTreeBuilder.setIgnoringComments(ignoreComments);
|
||||
}
|
||||
|
||||
/**
|
||||
* Sets whether the parser considers scripting to be enabled for noscript treatment.
|
||||
* @param scriptingEnabled <code>true</code> to enable
|
||||
* @see nu.validator.htmlparser.impl.TreeBuilder#setScriptingEnabled(boolean)
|
||||
*/
|
||||
public void setScriptingEnabled(boolean scriptingEnabled) {
|
||||
domTreeBuilder.setScriptingEnabled(scriptingEnabled);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,87 @@
|
|||
/*
|
||||
* Copyright (c) 2008 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.htmlparser.gwt;
|
||||
|
||||
import org.xml.sax.SAXException;
|
||||
|
||||
import com.google.gwt.core.client.EntryPoint;
|
||||
import com.google.gwt.core.client.JavaScriptObject;
|
||||
|
||||
public class HtmlParserModule implements EntryPoint {
|
||||
|
||||
private static native void zapChildren(JavaScriptObject node) /*-{
|
||||
while (node.hasChildNodes()) {
|
||||
node.removeChild(node.lastChild);
|
||||
}
|
||||
}-*/;
|
||||
|
||||
private static native void installDocWrite(JavaScriptObject doc, HtmlParser parser) /*-{
|
||||
doc.write = function() {
|
||||
if (arguments.length == 0) {
|
||||
return;
|
||||
}
|
||||
var text = arguments[0];
|
||||
for (var i = 1; i < arguments.length; i++) {
|
||||
text += arguments[i];
|
||||
}
|
||||
parser.@nu.validator.htmlparser.gwt.HtmlParser::documentWrite(Ljava/lang/String;)(text);
|
||||
}
|
||||
doc.writeln = function() {
|
||||
if (arguments.length == 0) {
|
||||
parser.@nu.validator.htmlparser.gwt.HtmlParser::documentWrite(Ljava/lang/String;)("\n");
|
||||
return;
|
||||
}
|
||||
var text = arguments[0];
|
||||
for (var i = 1; i < arguments.length; i++) {
|
||||
text += arguments[i];
|
||||
}
|
||||
text += "\n";
|
||||
parser.@nu.validator.htmlparser.gwt.HtmlParser::documentWrite(Ljava/lang/String;)(text);
|
||||
}
|
||||
}-*/;
|
||||
|
||||
@SuppressWarnings("unused")
|
||||
private static void parseHtmlDocument(String source, JavaScriptObject document, JavaScriptObject readyCallback, JavaScriptObject errorHandler) throws SAXException {
|
||||
if (readyCallback == null) {
|
||||
readyCallback = JavaScriptObject.createFunction();
|
||||
}
|
||||
zapChildren(document);
|
||||
HtmlParser parser = new HtmlParser(document);
|
||||
parser.setScriptingEnabled(true);
|
||||
// XXX error handler
|
||||
|
||||
installDocWrite(document, parser);
|
||||
|
||||
parser.parse(source, new ParseEndListener(readyCallback));
|
||||
}
|
||||
|
||||
private static native void exportEntryPoints() /*-{
|
||||
$wnd.parseHtmlDocument = @nu.validator.htmlparser.gwt.HtmlParserModule::parseHtmlDocument(Ljava/lang/String;Lcom/google/gwt/core/client/JavaScriptObject;Lcom/google/gwt/core/client/JavaScriptObject;Lcom/google/gwt/core/client/JavaScriptObject;);
|
||||
}-*/;
|
||||
|
||||
|
||||
public void onModuleLoad() {
|
||||
exportEntryPoints();
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,46 @@
|
|||
/*
|
||||
* Copyright (c) 2008 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.htmlparser.gwt;
|
||||
|
||||
import com.google.gwt.core.client.JavaScriptObject;
|
||||
|
||||
public class ParseEndListener {
|
||||
|
||||
private final JavaScriptObject callback;
|
||||
|
||||
/**
|
||||
* @param callback
|
||||
*/
|
||||
public ParseEndListener(JavaScriptObject callback) {
|
||||
this.callback = callback;
|
||||
}
|
||||
|
||||
public void parseComplete() {
|
||||
call(callback);
|
||||
}
|
||||
|
||||
private static native void call(JavaScriptObject callback) /*-{
|
||||
callback();
|
||||
}-*/;
|
||||
|
||||
}
|
|
@@ -0,0 +1,225 @@
|
|||
<!DOCTYPE HTML>
|
||||
<html>
|
||||
<head>
|
||||
<title>Live DOM Viewer</title>
|
||||
<script type="text/javascript" language="javascript" src="nu.validator.htmlparser.HtmlParser.nocache.js"></script>
|
||||
<style>
|
||||
h1 { margin: 0; }
|
||||
h2 { font-size: small; margin: 1em 0 0; }
|
||||
p, ul, pre { margin: 0; }
|
||||
p { border: inset thin; }
|
||||
textarea { width: 100%; -width: 99%; height: 8em; border: 0; }
|
||||
iframe { width: 100%; height: 12em; border: 0; }
|
||||
/* iframe.large { height: 24em; } */
|
||||
pre { border: inset thin; padding: 0.5em; color: gray; }
|
||||
pre samp { color: black; }
|
||||
#dom { border: inset thin; padding: 0.5em 0.5em 0.5em 1em; color: black; min-height: 5em; font-family: monospace; background: white; }
|
||||
#dom ul { padding: 0 0 0 1em; margin: 0; }
|
||||
#dom li { padding: 0; margin: 0; list-style: none; position: relative; }
|
||||
#dom li li { list-style: disc; }
|
||||
#dom .t1 code { color: purple; font-weight: bold; }
|
||||
#dom .t2 { font-style: normal; font-family: monospace; }
|
||||
#dom .t2 .name { color: black; font-weight: bold; }
|
||||
#dom .t2 .value { color: blue; font-weight: normal; }
|
||||
#dom .t3 code, #dom .t4 code, #dom .t5 code { color: gray; }
|
||||
#dom .t7 code, #dom .t8 code { color: green; }
|
||||
#dom span { font-style: italic; font-family: serif; }
|
||||
#dom .t10 code { color: teal; }
|
||||
#dom .misparented, #dom .misparented code { color: red; font-weight: bold; }
|
||||
#dom.hidden, .hidden { visibility: hidden; margin: 0.5em 0; padding: 0; height: 0; min-height: 0; }
|
||||
pre#log { color: black; font: small monospace; }
|
||||
script + p { border: none; font-size: smaller; margin: 0.8em 0.3em; }
|
||||
</style>
|
||||
<style title="Tree View">
|
||||
#dom li li { list-style: none; }
|
||||
#dom li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
|
||||
#dom li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
|
||||
</style>
|
||||
<script>
|
||||
if (navigator.userAgent.match('Gecko/(\\d+)') && RegExp.$1 == '20060217' && RegExp.$1 != '00000000') {
|
||||
var style = document.getElementsByTagName('style')[1];
|
||||
style.parentNode.removeChild(style);
|
||||
}
|
||||
</script>
|
||||
</head>
|
||||
<body onload="init()">
|
||||
<h1>Live DOM Viewer</h1>
|
||||
<h2>Markup to test (<a href="data:," id="permalink" rel="bookmark">permalink</a>, <a href="javascript:up()">upload</a>, <a href="javascript:down()">download</a>, <a href="#" onclick="toggleVisibility(this); return false">hide</a>): <span id="updown-status"></span></h2>
|
||||
<p><textarea oninput="updateInput(event)" onkeydown="updateInput(event)"><!DOCTYPE html>
|
||||
...</textarea></p>
|
||||
<h2><a href="data:," id="domview">DOM view</a> (<a href="#" onclick="toggleVisibility(this); return false;">hide</a>, <a href="#" onclick="updateDOM()">refresh</a>):</h2>
|
||||
<ul id="dom"></ul>
|
||||
<h2><a href="data:," id="link">Rendered view</a>: (<a href="#" onclick="toggleVisibility(this); return false;">hide</a><!--, <a href="#" onclick="grow(this)">grow</a>-->):</h2>
|
||||
<p><iframe src="blank.html"></iframe></p> <!-- data:, -->
|
||||
<h2>innerHTML view: (<a href="#" onclick="toggleVisibility(this); return false;">show</a>, <a href="#" onclick="updateDOM()">refresh</a>):</h2>
|
||||
<pre class="hidden"><!DOCTYPE HTML><html><samp></samp></html></pre>
|
||||
<h2>Log: (<a href="#" onclick="toggleVisibility(this); return false;">hide</a>):</h2>
|
||||
<pre id="log">Script not loaded.</pre>
|
||||
<script>
|
||||
var iframe = document.getElementsByTagName('iframe')[0];
|
||||
var textarea = document.getElementsByTagName('textarea')[0];
|
||||
var pre = document.getElementsByTagName('samp')[0];
|
||||
var dom = document.getElementsByTagName('ul')[0];
|
||||
var log = document.getElementById('log');
|
||||
var updownStatus = document.getElementById('updown-status');
|
||||
var delayedUpdater = 0;
|
||||
var lastString = '';
|
||||
var logBuffer = '';
|
||||
var logBuffering = false;
|
||||
function updateInput(event) {
|
||||
if (delayedUpdater) {
|
||||
clearTimeout(delayedUpdater);
|
||||
delayedUpdater = 0;
|
||||
}
|
||||
delayedUpdater = setTimeout(update, 100);
|
||||
}
|
||||
function afterParse() {
|
||||
lastString = textarea.value;
|
||||
setTimeout(updateDOM, 100);
|
||||
updown('');
|
||||
}
|
||||
function update() {
|
||||
if (lastString != textarea.value) {
|
||||
logBuffering = true;
|
||||
document.getElementById('link').href = 'data:text/html;charset=utf-8,' + encodeURIComponent(textarea.value);
|
||||
iframe.contentWindow.onerror = function (a, b, c) {
|
||||
record('error: ' + a + ' on line ' + c);
|
||||
}
|
||||
iframe.contentWindow.w = function (s) {
|
||||
record('log: ' + s);
|
||||
}
|
||||
window.parseHtmlDocument(textarea.value, iframe.contentWindow.document, afterParse, null);
|
||||
}
|
||||
}
|
||||
function updateDOM() {
|
||||
while (pre.firstChild) pre.removeChild(pre.firstChild);
|
||||
pre.appendChild(document.createTextNode(iframe.contentWindow.document.documentElement.innerHTML));
|
||||
printDOM(dom, iframe.contentWindow.document);
|
||||
document.getElementById('domview').href = 'data:text/plain;charset=utf-8,<ul class="domTree">' + encodeURIComponent(dom.innerHTML + '</ul>');
|
||||
document.getElementById('permalink').href = '?' + encodeURIComponent(textarea.value);
|
||||
record('rendering mode: ' + iframe.contentWindow.document.compatMode);
|
||||
if (iframe.contentWindow.document.title)
|
||||
record('document.title: ' + iframe.contentWindow.document.title);
|
||||
else
|
||||
record('document has no title');
|
||||
while (log.firstChild != log.lastChild)
|
||||
log.removeChild(log.lastChild);
|
||||
log.firstChild.data = logBuffer;
|
||||
logBuffering = false;
|
||||
logBuffer = '';
|
||||
}
|
||||
function printDOM(ul, node) {
|
||||
while (ul.firstChild) ul.removeChild(ul.firstChild);
|
||||
for (var i = 0; i < node.childNodes.length; i += 1) {
|
||||
var li = document.createElement('li');
|
||||
li.className = 't' + node.childNodes[i].nodeType;
|
||||
if (node.childNodes[i].nodeType == 10) {
|
||||
li.appendChild(document.createTextNode('DOCTYPE: '));
|
||||
}
|
||||
var code = document.createElement('code');
|
||||
code.appendChild(document.createTextNode(node.childNodes[i].nodeName));
|
||||
li.appendChild(code);
|
||||
if (node.childNodes[i].nodeValue) {
|
||||
var span = document.createElement('span');
|
||||
span.appendChild(document.createTextNode(node.childNodes[i].nodeValue));
|
||||
li.appendChild(document.createTextNode(': '));
|
||||
li.appendChild(span);
|
||||
}
|
||||
if (node.childNodes[i].attributes)
|
||||
for (var j = 0; j < node.childNodes[i].attributes.length; j += 1) {
|
||||
if (node.childNodes[i].attributes[j].specified) {
|
||||
var attName = document.createElement('code');
|
||||
attName.appendChild(document.createTextNode(node.childNodes[i].attributes[j].nodeName));
|
||||
attName.className = 'attribute name';
|
||||
var attValue = document.createElement('code');
|
||||
attValue.appendChild(document.createTextNode(node.childNodes[i].attributes[j].nodeValue));
|
||||
attValue.className = 'attribute value';
|
||||
var att = document.createElement('span');
|
||||
att.className = 't2';
|
||||
att.appendChild(attName);
|
||||
att.appendChild(document.createTextNode('="'));
|
||||
att.appendChild(attValue);
|
||||
att.appendChild(document.createTextNode('"'));
|
||||
li.appendChild(document.createTextNode(' '));
|
||||
li.appendChild(att);
|
||||
}
|
||||
}
|
||||
if (node.childNodes[i].parentNode == node) {
|
||||
if (node.childNodes[i].childNodes.length) {
|
||||
var ul2 = document.createElement('ul');
|
||||
li.appendChild(ul2);
|
||||
printDOM(ul2, node.childNodes[i]);
|
||||
}
|
||||
} else {
|
||||
li.className += ' misparented';
|
||||
}
|
||||
ul.appendChild(li);
|
||||
}
|
||||
}
|
||||
function toggleVisibility(link) {
|
||||
var n = link.parentNode.nextSibling;
|
||||
if (n.nodeType == 3 /* text node */) n = n.nextSibling; // we should always do this but in IE, text nodes vanish
|
||||
n.className = (n.className == "hidden") ? '' : 'hidden';
|
||||
link.firstChild.data = n.className == "hidden" ? "show" : "hide";
|
||||
}
|
||||
/*
|
||||
function grow(link) {
|
||||
var n = link.parentNode.nextSibling;
|
||||
if (n.nodeType == 3 /-* text node *-/) n = n.nextSibling; // we should always do this but in IE, text nodes vanish
|
||||
n.className = (n.className == "large") ? '' : 'large';
|
||||
link.firstChild.data = n.className == "grow" ? "shrink" : "grow";
|
||||
}
|
||||
*/
|
||||
function down() {
|
||||
updown('downloading...');
|
||||
var request = window.XMLHttpRequest ? new XMLHttpRequest() : new ActiveXObject("Microsoft.XMLHTTP");
|
||||
request.onreadystatechange = function () {
|
||||
updown('downloading... ' + request.readyState + '/4');
|
||||
if (request.readyState == 4) {
|
||||
textarea.value = request.responseText;
|
||||
update();
|
||||
updown('downloaded');
|
||||
}
|
||||
};
|
||||
request.open('GET', 'clipboard.cgi', true);
|
||||
request.send(null);
|
||||
}
|
||||
function up() {
|
||||
updown('uploading...');
|
||||
var request = window.XMLHttpRequest ? new XMLHttpRequest() : new ActiveXObject("Microsoft.XMLHTTP");
|
||||
request.onreadystatechange = function () {
|
||||
updown('uploading... ' + request.readyState + '/4');
|
||||
if (request.readyState == 4) {
|
||||
updown('uploaded');
|
||||
}
|
||||
};
|
||||
request.open('POST', 'clipboard.cgi', true);
|
||||
request.setRequestHeader('Content-Type', 'text/plain');
|
||||
request.send(textarea.value);
|
||||
}
|
||||
function init() {
|
||||
var uri = location.search;
|
||||
if (uri)
|
||||
textarea.value = decodeURIComponent(uri.substring(1, uri.length));
|
||||
update();
|
||||
}
|
||||
function record(s) {
|
||||
if (logBuffering)
|
||||
logBuffer += s + '\r\n';
|
||||
else
|
||||
log.appendChild(document.createTextNode(s + '\r\n'));
|
||||
}
|
||||
function updown(s) {
|
||||
while (updownStatus.firstChild) updownStatus.removeChild(updownStatus.firstChild);
|
||||
updownStatus.appendChild(document.createTextNode(s));
|
||||
}
|
||||
</script>
|
||||
<p>This script puts a function <code>w(<var>s</var>)</code> into the
|
||||
global scope of the test page, where <var>s</var> is a string to
|
||||
output to the log. Also, five files are accessible in the current
|
||||
directory for test purposes: <code>image</code> (a GIF image),
|
||||
<code>flash</code> (a Flash file), <code>script</code> (a JS file),
|
||||
<code>style</code> (a CSS file), and <code>document</code> (an HTML
|
||||
file).</p>
|
||||
</body>
|
||||
</html>
|
|
@@ -0,0 +1,25 @@
|
|||
From:
|
||||
http://software.hixie.ch/utilities/js/live-dom-viewer/LICENSE
|
||||
regarding the upstream of HtmlParser.html:
|
||||
|
||||
The MIT License
|
||||
|
||||
Copyright (c) 2000, 2006, 2008 Ian Hickson and various contributors
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in
|
||||
all copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
||||
THE SOFTWARE.
|
|
@@ -0,0 +1,2 @@
|
|||
<!DOCTYPE html>
|
||||
<title></title>
|
|
@@ -0,0 +1,25 @@
|
|||
These scripts export the Java-to-C++ translator and the Java source files that
implement the HTML5 parser. The exported translator may be used (with no
external dependencies) to translate the exported Java source files into
Gecko-compatible C++.

Hacking the translator itself still requires a working copy of the Java HTML5
parser repository, but hacking the parser (modifying the Java source files and
performing the translation) should now be possible using only files committed
to the Mozilla source tree.

Run any of these scripts without arguments to receive usage instructions.

make-translator-jar.sh: compiles the Java-to-C++ translator into a .jar file
export-java-srcs.sh:    exports minimal Java source files implementing the
                        HTML5 parser
export-translator.sh:   exports the compiled translator and javaparser.jar
export-all.sh:          runs the previous two scripts
util.sh:                provides various shell utility functions to the
                        scripts listed above (does nothing if run directly)
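
The order in which the scripts are meant to be run can be read off their own usage and success messages: build the translator jar first, then export, then run the translator from the parser directory. The sketch below only prints that sequence and executes nothing. The function name `print_workflow`, its three parameters, and the example paths are hypothetical placeholders, not part of the scripts.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands in the order suggested by the scripts'
# own messages, without executing any of them.
# print_workflow and its parameter names are hypothetical.
print_workflow() {
    scripts_dir=$1      # directory containing the scripts described above
    javaparser_jar=$2   # path to the javaparser .jar
    parser_dir=$3       # path to parser/html in the target tree

    # 1. Compile the translator into translator.jar.
    printf 'sh %s/make-translator-jar.sh %s\n' "$scripts_dir" "$javaparser_jar"
    # 2. Export the translator and the Java sources.
    printf 'sh %s/export-all.sh %s\n' "$scripts_dir" "$parser_dir"
    # 3. Run the translator, as export-all.sh instructs when it finishes.
    printf 'cd %s && java -jar javalib/translator.jar javasrc . nsHtml5AtomList.h\n' "$parser_dir"
}

# Example with placeholder paths:
print_workflow ./scripts ./javaparser-1.0.7.jar ../parser/html
```

Printing rather than executing keeps the sketch safe to run anywhere; replace the placeholder paths with real ones and run the printed lines by hand once they look right.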
|
||||
|
||||
All path arguments may be either absolute or relative. This includes the path
to the script itself ($0), so the directory from which you run these scripts
doesn't matter.
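
The path handling described above relies on turning every argument into an absolute path before use. The scripts do this with the `abs` function from util.sh, which uses `pushd` and `popd`; those, like `source`, are not available in every `sh` implementation (dash, for example, lacks them). The following is a minimal sketch of the same idea using only POSIX features. The name `abs_path` is hypothetical and this is not the function the scripts actually ship.

```shell
#!/bin/sh
# Hypothetical POSIX-only variant of the abs helper: resolves a relative or
# absolute path to an absolute one using cd in a subshell plus pwd.
abs_path() {
    target=${1:-.}   # default to the current directory, like the original
    if [ -d "$target" ]; then
        # Directory: enter it in a subshell and report where we ended up.
        (cd "$target" && pwd)
    else
        # File (existing or not): resolve the parent directory, then
        # re-attach the final path component.
        dir=$(cd "$(dirname "$target")" && pwd) || return 1
        printf '%s/%s\n' "$dir" "$(basename "$target")"
    fi
}

# Example: resolve the current directory and a file next to it.
abs_path .
abs_path ./javaparser.jar
```

Because `cd` runs inside a subshell or command substitution, the caller's working directory is never changed, which is the property `pushd`/`popd` provide in the original.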
|
||||
|
||||
Ben Newman (7 July 2009)
|
|
@@ -0,0 +1,24 @@
|
|||
#!/usr/bin/env sh
|
||||
|
||||
SCRIPT_DIR=`dirname $0`
|
||||
source $SCRIPT_DIR/util.sh
|
||||
SCRIPT_DIR=`abs $SCRIPT_DIR`
|
||||
|
||||
if [ $# -eq 1 ]
|
||||
then
|
||||
MOZ_PARSER_PATH=`abs $1`
|
||||
else
|
||||
echo
|
||||
echo "Usage: sh `basename $0` /path/to/mozilla-central/parser/html"
|
||||
echo "Note that relative paths will work just fine."
|
||||
echo
|
||||
exit 1
|
||||
fi
|
||||
|
||||
$SCRIPT_DIR/export-translator.sh $MOZ_PARSER_PATH
|
||||
$SCRIPT_DIR/export-java-srcs.sh $MOZ_PARSER_PATH
|
||||
|
||||
echo
|
||||
echo "Now go to $MOZ_PARSER_PATH and run"
|
||||
echo " java -jar javalib/translator.jar javasrc . nsHtml5AtomList.h"
|
||||
echo
|
|
@@ -0,0 +1,25 @@
|
|||
#!/usr/bin/env sh
|
||||
|
||||
SCRIPT_DIR=`dirname $0`
|
||||
source $SCRIPT_DIR/util.sh
|
||||
SCRIPT_DIR=`abs $SCRIPT_DIR`
|
||||
|
||||
SRCDIR=`abs $SCRIPT_DIR/../src/nu/validator/htmlparser/impl`
|
||||
|
||||
if [ $# -eq 1 ]
|
||||
then
|
||||
MOZ_PARSER_PATH=`abs $1`
|
||||
else
|
||||
echo
|
||||
echo "Usage: sh `basename $0` /path/to/mozilla-central/parser/html"
|
||||
echo "Note that relative paths will work just fine."
|
||||
echo
|
||||
exit 1
|
||||
fi
|
||||
|
||||
SRCTARGET=$MOZ_PARSER_PATH/javasrc
|
||||
|
||||
rm -rf "$SRCTARGET"
|
||||
mkdir "$SRCTARGET"
|
||||
# Copy only the .java files, which avoids copying the .svn directory:
|
||||
cp -rv "$SRCDIR"/*.java "$SRCTARGET"
|
|
@@ -0,0 +1,24 @@
|
|||
#!/usr/bin/env sh
|
||||
|
||||
SCRIPT_DIR=`dirname $0`
|
||||
source $SCRIPT_DIR/util.sh
|
||||
SCRIPT_DIR=`abs $SCRIPT_DIR`
|
||||
|
||||
LIBDIR=`abs $SCRIPT_DIR/../translator-lib`
|
||||
|
||||
if [ $# -eq 1 ]
|
||||
then
|
||||
MOZ_PARSER_PATH=`abs $1`
|
||||
else
|
||||
echo
|
||||
echo "Usage: sh `basename $0` /path/to/mozilla-central/parser/html"
|
||||
echo "Note that relative paths will work just fine."
|
||||
echo "Be sure that you have run `dirname $0`/make-translator-jar.sh before running this script."
|
||||
echo
|
||||
exit 1
|
||||
fi
|
||||
|
||||
LIBTARGET=$MOZ_PARSER_PATH/javalib
|
||||
|
||||
rm -rf "$LIBTARGET"
|
||||
cp -rv "$LIBDIR" "$LIBTARGET"
|
|
@@ -0,0 +1,63 @@
#!/usr/bin/env sh

SCRIPT_DIR=`dirname "$0"`
source "$SCRIPT_DIR/util.sh"
SCRIPT_DIR=`abs "$SCRIPT_DIR"`

SRCDIR=`abs "$SCRIPT_DIR/../translator-src"`
BINDIR=`abs "$SCRIPT_DIR/../translator-bin"`
LIBDIR=`abs "$SCRIPT_DIR/../translator-lib"`

if [ $# -eq 1 ]
then
  JAVAPARSER_JAR_PATH=`abs "$1"`
else
  echo
  echo "Usage: sh `basename $0` /path/to/javaparser-1.0.7.jar"
  echo "Note that relative paths will work just fine."
  echo "Obtain javaparser-1.0.7.jar from http://code.google.com/p/javaparser"
  echo
  exit 1
fi

set_up() {
  rm -rf "$BINDIR"; mkdir "$BINDIR"
  rm -rf "$LIBDIR"; mkdir "$LIBDIR"
  cp "$JAVAPARSER_JAR_PATH" "$LIBDIR/javaparser.jar"
}

write_manifest() {
  rm -f "$LIBDIR/manifest"
  echo "Main-Class: nu.validator.htmlparser.cpptranslate.Main" > "$LIBDIR/manifest"
  echo "Class-Path: javaparser.jar" >> "$LIBDIR/manifest"
}

compile_translator() {
  find "$SRCDIR" -name "*.java" | \
    xargs javac -cp "$LIBDIR/javaparser.jar" -g -d "$BINDIR"
}

generate_jar() {
  jar cvfm "$LIBDIR/translator.jar" "$LIBDIR/manifest" -C "$BINDIR" .
}

clean_up() {
  rm -f "$LIBDIR/manifest"
}

success_message() {
  echo
  echo "Successfully generated directory \"$LIBDIR\" with contents:"
  echo
  ls -al "$LIBDIR"
  echo
  echo "Now run `dirname $0`/export-all.sh with no arguments and follow the usage instructions."
  echo
}

set_up && \
  compile_translator && \
  write_manifest && \
  generate_jar && \
  clean_up && \
  success_message
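For reference, the two-line manifest that `write_manifest` builds with a pair of `echo` calls could also be produced by a here-document. The sketch below is an illustration only: `write_manifest_to` and its destination parameter are hypothetical, whereas the script itself always writes to `$LIBDIR/manifest`.

```sh
#!/bin/sh
# Hypothetical variant of write_manifest; $1 is the file to (over)write.
write_manifest_to() {
  # The quoted delimiter keeps the body literal (no variable expansion).
  cat > "$1" <<'EOF'
Main-Class: nu.validator.htmlparser.cpptranslate.Main
Class-Path: javaparser.jar
EOF
}
```

Because `>` truncates the destination, this form needs no separate `rm -f` before writing.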
@ -0,0 +1,23 @@
#!/usr/bin/env sh

abs() {
  local rel
  local p
  if [ $# -ne 1 ]
  then
    rel=.
  else
    rel=$1
  fi
  if [ -d "$rel" ]
  then
    pushd "$rel" > /dev/null
    p=$(pwd)
    popd > /dev/null
  else
    pushd "$(dirname "$rel")" > /dev/null
    p="$(pwd)/$(basename "$rel")"
    popd > /dev/null
  fi
  echo "$p"
}
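The `abs` helper relies on `pushd` and `popd`, which are bash builtins rather than POSIX `sh` features, so it may fail where `sh` is a stricter shell such as dash. A minimal sketch of the same idea using only a subshell `cd` follows. `abs_path` is a hypothetical name and not part of `util.sh`, and the sketch assumes the parent directory of the argument exists.

```sh
#!/bin/sh
# Hypothetical POSIX-only counterpart to abs(); not part of util.sh.
# Note: 'target' and 'parent' are plain (global) variables because
# 'local' is not specified by POSIX either.
abs_path() {
  target=${1:-.}
  if [ -d "$target" ]; then
    # Directory: enter it inside a subshell and print where we ended up,
    # which leaves the caller's working directory untouched.
    (cd "$target" > /dev/null 2>&1 && pwd)
  else
    # File or not-yet-existing leaf: resolve the parent directory,
    # then re-attach the final path component.
    parent=$(cd "$(dirname "$target")" > /dev/null 2>&1 && pwd) || return 1
    printf '%s/%s\n' "$parent" "$(basename "$target")"
  fi
}
```

As with the original helper, the result is the logical path reported by `pwd`; symbolic links are not resolved.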
@ -0,0 +1,240 @@
<!--
 * Copyright (c) 2007-2012 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 -->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>nu.validator.htmlparser</groupId>
  <artifactId>htmlparser</artifactId>
  <packaging>bundle</packaging>
  <version>1.4</version>
  <name>htmlparser</name>
  <url>http://about.validator.nu/htmlparser/</url>
  <description>The Validator.nu HTML Parser is an implementation of the HTML5 parsing algorithm in Java for applications. The parser is designed to work as a drop-in replacement for the XML parser in applications that already support XHTML 1.x content with an XML parser and use SAX, DOM or XOM to interface with the parser.</description>
  <!--
    Usage notes for this POM:

    To build without signing, run:
    mvn clean source:jar javadoc:jar repository:bundle-create
    (enter 0 <return> when prompted)

    To build and sign, run:
    mvn clean source:jar javadoc:jar package gpg:sign repository:bundle-create
    (enter 0 <return> when prompted)

    This POM file is used for creating the bundle for distribution via the
    Maven Central Repository. It is not used as part of the normal development
    process of the parser and the maintainer of the parser (Henri Sivonen)
    isn't experienced in POM tweaking. If you need this POM to do something
    that it currently does not do or do something better, you need to write
    the changes you need yourself and contribute a patch via
    http://bugzilla.validator.nu/
  -->
  <developers>
    <developer>
      <id>hsivonen</id>
      <name>Henri Sivonen</name>
      <email>hsivonen@iki.fi</email>
      <url>http://hsivonen.iki.fi/</url>
    </developer>
  </developers>
  <licenses>
    <license>
      <name>The MIT License</name>
      <url>http://www.opensource.org/licenses/mit-license.php</url>
      <distribution>repo</distribution>
    </license>
    <license>
      <name>The (New) BSD License</name>
      <url>http://www.opensource.org/licenses/bsd-license.php</url>
      <distribution>repo</distribution>
    </license>
  </licenses>
  <scm>
    <connection>scm:hg:http://hg.mozilla.org/projects/htmlparser/</connection>
    <url>http://hg.mozilla.org/projects/htmlparser/</url>
  </scm>
  <build>
    <sourceDirectory>${project.build.directory}/src</sourceDirectory>
    <testSourceDirectory>${basedir}/test-src</testSourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.5</source>
          <target>1.5</target>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-antrun-plugin</artifactId>
        <version>1.7</version>
        <dependencies>
          <dependency>
            <groupId>com.sun</groupId>
            <artifactId>tools</artifactId>
            <version>1.5.0</version>
            <scope>system</scope>
            <systemPath>${java.home}/../lib/tools.jar</systemPath>
          </dependency>
        </dependencies>
        <executions>
          <execution>
            <id>intitialize-sources</id>
            <phase>initialize</phase>
            <goals>
              <goal>run</goal>
            </goals>
            <configuration>
              <target>
                <delete dir="${project.build.sourceDirectory}"/>
                <mkdir dir="${project.build.sourceDirectory}"/>
                <copy todir="${project.build.sourceDirectory}">
                  <fileset dir="${basedir}/src"/>
                </copy>
              </target>
            </configuration>
          </execution>
          <execution>
            <id>tokenizer-hotspot-workaround</id>
            <phase>process-sources</phase>
            <goals>
              <goal>run</goal>
            </goals>
            <configuration>
              <target>
                <property name="translator.sources" value="${basedir}/translator-src"/>
                <property name="translator.classes" value="${project.build.directory}/translator-classes"/>
                <mkdir dir="${translator.classes}"/>
                <javac srcdir="${translator.sources}" includes="nu/validator/htmlparser/generator/ApplyHotSpotWorkaround.java" destdir="${translator.classes}" includeantruntime="false"/>
                <java classname="nu.validator.htmlparser.generator.ApplyHotSpotWorkaround">
                  <classpath>
                    <pathelement location="${translator.classes}"/>
                  </classpath>
                  <arg value="${project.build.sourceDirectory}/nu/validator/htmlparser/impl/Tokenizer.java"/>
                  <arg value="${project.build.sourceDirectory}/nu/validator/htmlparser/impl/HotSpotWorkaround.txt"/>
                </java>
              </target>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <configuration>
          <skip>true</skip>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.felix</groupId>
        <artifactId>maven-bundle-plugin</artifactId>
        <version>2.3.7</version>
        <extensions>true</extensions>
        <configuration>
          <archive>
            <addMavenDescriptor>false</addMavenDescriptor>
          </archive>
          <instructions>
            <Bundle-Name>${project.name}</Bundle-Name>
            <Bundle-SymbolicName>nu.validator.htmlparser</Bundle-SymbolicName>
            <Bundle-Version>${project.version}</Bundle-Version>
            <Bundle-RequiredExecutionEnvironment>J2SE-1.5</Bundle-RequiredExecutionEnvironment>
            <_removeheaders>Built-By,Bnd-LastModified</_removeheaders>
          </instructions>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>rpm-maven-plugin</artifactId>
        <configuration>
          <release>1</release>
          <copyright>The MIT License</copyright>
          <group>Development/Java</group>
          <workarea>/var/tmp/${project.build.finalName}</workarea>
          <defineStatements>
            <defineStatement>_javadir ${rpm.java.dir}</defineStatement>
            <defineStatement>_javadocdir ${rpm.javadoc.dir}</defineStatement>
          </defineStatements>
          <mappings>
            <mapping>
              <directory>${rpm.java.dir}</directory>
              <filemode>644</filemode>
              <username>root</username>
              <groupname>root</groupname>
              <sources>
                <source>
                  <location>${project.build.directory}/${project.build.finalName}.jar</location>
                </source>
              </sources>
            </mapping>
            <mapping>
              <directory>${rpm.javadoc.dir}/${project.build.finalName}</directory>
              <filemode>644</filemode>
              <username>root</username>
              <groupname>root</groupname>
              <sources>
                <source>
                  <location>${project.build.directory}/apidocs</location>
                </source>
              </sources>
            </mapping>
          </mappings>
          <install>%__ln_s ${project.build.finalName}.jar %{buildroot}%{_javadir}/${project.name}.jar</install>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
      <groupId>com.ibm.icu</groupId>
      <artifactId>icu4j</artifactId>
      <version>4.0.1</version>
      <scope>compile</scope>
      <optional>true</optional>
    </dependency>
    <dependency>
      <groupId>xom</groupId>
      <artifactId>xom</artifactId>
      <version>1.1</version>
      <scope>compile</scope>
      <optional>true</optional>
    </dependency>
    <dependency>
      <groupId>net.sourceforge.jchardet</groupId>
      <artifactId>jchardet</artifactId>
      <version>1.0</version>
      <scope>compile</scope>
      <optional>true</optional>
    </dependency>
    <dependency>
      <groupId>com.sdicons.jsontools</groupId>
      <artifactId>jsontools-core</artifactId>
      <version>1.4</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <properties>
    <rpm.java.dir>/usr/share/java</rpm.java.dir>
    <rpm.javadoc.dir>/usr/share/javadoc</rpm.javadoc.dir>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
</project>
@ -0,0 +1,36 @@
import java.util.HashSet;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Element;

public class DomUtils {

  private static HashSet<Document> pinned_list = new HashSet<Document>();

  public static synchronized void pin(Document d) {
    pinned_list.add(d);
  }

  public static synchronized void unpin(Document d) {
    pinned_list.remove(d);
  }

  // return all the text content contained by a single element
  public static void getElementContent(Element e, StringBuffer b) {
    for (Node n = e.getFirstChild(); n!=null; n=n.getNextSibling()) {
      if (n.getNodeType() == Node.TEXT_NODE) {
        b.append(n.getNodeValue());
      } else if (n.getNodeType() == Node.ELEMENT_NODE) {
        // recurse into the child element (recursing on e itself would never terminate)
        getElementContent((Element) n, b);
      }
    }
  }

  // replace all child nodes of a given element with a single text node
  public static void setElementContent(Element e, String s) {
    while (e.hasChildNodes()) {
      e.removeChild(e.getFirstChild());
    }
    e.appendChild(e.getOwnerDocument().createTextNode(s));
  }
}
@ -0,0 +1,65 @@
Disclaimer:

This code is experimental.

When some people say experimental, they mean "it may not do what it is
intended to do; in fact, it might even wipe out your hard drive". I mean
that too. But I mean something more than that.

In this case, experimental means that I don't even know what it is intended
to do. I just have a vague vision, and I am trying out various things in
the hope that one of them will work out.

Vision:

My vague vision is that I would like to see HTML 5 be a success. For me to
consider it to be a success, it needs to be a standard, be interoperable,
and be ubiquitous.

I believe that the Validator.nu parser can be used to bootstrap that
process. It is written in Java. It has been compiled into JavaScript. It has
been translated into C++ based on the Mozilla libraries with the intent of
being included in Firefox. It tracks the standard very closely.

For the moment, the effort is on extending that to another language (Ruby)
on a single environment (i.e., Linux). Once that is complete, the intent is to
evaluate the results, decide what needs to be changed, and what needs to be
done to support other languages and environments.

The bar I'm setting for myself isn't just another SWIG-generated low-level
interface to a DOM, but rather a best-of-breed interface, which for Ruby
seems to be the one pioneered by Hpricot and adopted by Nokogiri. Success
will mean passing all of the tests from one of those two parsers as well as
all of the HTML5 tests.

Build instructions:

You'll need the icu4j and chardet jars. If you checked out and ran dldeps you
are already all set:

  svn co http://svn.versiondude.net/whattf/build/trunk/ build
  python build/build.py checkout dldeps

Fedora 11:

  yum install ruby-devel rubygem-rake java-1.5.0-gcj-devel gcc-c++

Ubuntu 9.04:

  apt-get install ruby ruby1.8-dev rake gcj g++

Also, at this time you need to install a JDK (e.g. sun-java6-jdk), simply
because the javac that comes with gcj doesn't support -sourcepath, and
I haven't spent the time to find a replacement.

Finally, make sure that libjaxp1.3-java is *not* installed.

  http://gcc.gnu.org/ml/java/2009-06/msg00055.html

If this is done, you should be all set.

  cd htmlparser/ruby-gcj
  rake test

If things are successful, the last lines of the output will list the
font attributes and values found in the test/google.html file.
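Before running `rake test`, it may help to confirm that the tools named in the build instructions are on the PATH. The following is a minimal sketch, not part of this README's build: `missing_tools` is a hypothetical helper, and the tool list in the usage note is an assumption drawn from the packages listed above.

```sh
#!/bin/sh
# Hypothetical pre-flight check; prints each named tool that cannot be found.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only when the shell can resolve the name.
    command -v "$tool" > /dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

A call such as `missing_tools ruby rake gcj gcjh g++ javac jar` prints nothing when every tool is available.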
@ -0,0 +1,77 @@
deps = ENV['deps'] || '../../dependencies'
icu4j = "#{deps}/icu4j-4_0.jar"
chardet = "#{deps}/mozilla/intl/chardet/java/dist/lib/chardet.jar"
libgcj = Dir['/usr/share/java/libgcj*.jar'].grep(/gcj[-\d.]*jar$/).sort.last

task :default => %w(headers libs Makefile validator.so)

# headers

hdb = 'nu/validator/htmlparser/dom/HtmlDocumentBuilder'
task :headers => %W(headers/DomUtils.h headers/#{hdb}.h)

file 'headers/DomUtils.h' => 'DomUtils.java' do |t|
  mkdir_p %w(classes headers), :verbose => false
  sh "javac -d classes #{t.prerequisites.first}"
  sh "gcjh -force -o #{t.name} -cp #{libgcj}:classes DomUtils"
end

file "headers/#{hdb}.h" => "../src/#{hdb}.java" do |t|
  mkdir_p %w(classes headers), :verbose => false
  sh "javac -cp #{icu4j}:#{chardet} -d classes -sourcepath ../src " +
    t.prerequisites.first
  sh "gcjh -force -o #{t.name} -cp #{libgcj}:classes " +
    hdb.gsub('/','.')
end

# libs

task :libs => %w(htmlparser chardet icu).map {|name| "lib/libnu-#{name}.so"}

htmlparser = Dir['../src/**/*.java'].reject {|name| name.include? '/xom/'}
file 'lib/libnu-htmlparser.so' => htmlparser + ['DomUtils.java'] do |t|
  mkdir_p 'lib', :verbose => false
  sh "gcj -shared --classpath=#{icu4j}:#{chardet} -fPIC " +
    "-o #{t.name} #{t.prerequisites.join(' ')}"
end

file 'lib/libnu-chardet.so' => chardet do |t|
  mkdir_p 'lib', :verbose => false
  sh "gcj -shared -fPIC -o #{t.name} #{t.prerequisites.join(' ')}"
end

file 'lib/libnu-icu.so' => icu4j do |t|
  mkdir_p 'lib', :verbose => false
  sh "gcj -shared -fPIC -o #{t.name} #{t.prerequisites.join(' ')}"
end

# module

file 'Makefile' do
  sh "ruby extconf.rb --with-gcj=#{libgcj}"
end

file 'validator.so' => %w(Makefile validator.cpp headers/DomUtils.h) do
  system 'make'
end

file 'nu/validator.so' do
  mkdir_p 'nu', :verbose => false
  system 'ln -s -t nu ../validator.so'
end

# tasks

task :test => [:default, 'nu/validator.so'] do
  ENV['LD_LIBRARY_PATH']='lib'
  sh 'ruby test/fonts.rb test/google.html'
end

task :clean do
  rm_rf %W(classes lib nu mkmf.log headers/DomUtils.h headers/#{hdb}.h) +
    Dir['*.o'] + Dir['*.so']
end

task :clobber => :clean do
  rm_rf %w(headers Makefile)
end
@ -0,0 +1,45 @@
require 'mkmf'

# system dependencies
gcj = with_config('gcj', '/usr/share/java/libgcj.jar')

# headers for JAXP
CONFIG['CC'] = 'g++'
with_cppflags('-xc++') do

  unless find_header('org/w3c/dom/Document.h', 'headers')

    `jar tf #{gcj}`.split.each do |file|
      next unless file =~ /\.class$/
      next unless file =~ /^(javax|org)\/(w3c|xml)/
      next if file.include? '$'

      dest = 'headers/' + file.sub(/\.class$/,'.h')
      name = file.sub(/\.class$/,'').gsub('/','.')

      next if File.exist? dest

      cmd = "gcjh -cp #{gcj} -o #{dest} #{name}"
      puts cmd
      break unless system cmd
      system "ruby -pi -e '$_.sub!(/namespace namespace$/," +
        "\"namespace namespace$\")' #{dest}"
      system "ruby -pi -e '$_.sub!(/::namespace::/," +
        "\"::namespace$::\")' #{dest}"
    end

    exit unless find_header('org/w3c/dom/Document.h', 'headers')
  end

  find_header 'nu/validator/htmlparser/dom/HtmlDocumentBuilder.h', 'headers'
end

# Java libraries
Config::CONFIG['CC'] = 'g++ -shared'
dir_config('nu-htmlparser', nil, 'lib')
have_library 'nu-htmlparser'
have_library 'nu-icu'
have_library 'nu-chardet'

# Ruby library
create_makefile 'nu/validator'
@ -0,0 +1,5 @@
require 'nu/validator'

ARGV.each do |arg|
  puts Nu::Validator::parse(open(arg)).root.name
end
@ -0,0 +1,11 @@
require 'nu/validator'
require 'open-uri'

ARGV.each do |arg|
  doc = Nu::Validator::parse(open(arg))
  doc.xpath("//*[local-name()='font']").each do |font|
    font.attributes.each do |name, attr|
      puts "#{name} => #{attr.value}"
    end
  end
end
@ -0,0 +1,10 @@
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><title>Google</title><script>window.google={kEI:"vLhASujeGpTU9QT2iOnWAQ",kEXPI:"17259",kCSIE:"17259",kHL:"en"};
window.google.sn="webhp";window.google.timers={load:{t:{start:(new Date).getTime()}}};try{window.google.pt=window.gtbExternal&&window.gtbExternal.pageT()||window.external&&window.external.pageT}catch(b){}
window.google.jsrt_kill=1;
var _gjwl=location;function _gjuc(){var e=_gjwl.href.indexOf("#");if(e>=0){var a=_gjwl.href.substring(e);if(a.indexOf("&q=")>0||a.indexOf("#q=")>=0){a=a.substring(1);if(a.indexOf("#")==-1){for(var c=0;c<a.length;){var d=c;if(a.charAt(d)=="&")++d;var b=a.indexOf("&",d);if(b==-1)b=a.length;var f=a.substring(d,b);if(f.indexOf("fp=")==0){a=a.substring(0,c)+a.substring(b,a.length);b=c}else if(f=="cad=h")return 0;c=b}_gjwl.href="/search?"+a+"&cad=h";return 1}}}return 0}function _gjp(){!(window._gjwl.hash&&
window._gjuc())&&setTimeout(_gjp,500)};
window._gjp && _gjp();</script><style>td{line-height:.8em;}.gac_c{line-height:normal;}form{margin-bottom:20px;}body,td,a,p,.h{font-family:arial,sans-serif}.h{color:#36c;font-size:20px}.q{color:#00c}.ts td{padding:0}.ts{border-collapse:collapse}#gbar{height:22px;padding-left:0px}.gbh,.gbd{border-top:1px solid #c9d7f1;font-size:1px}.gbh{height:0;position:absolute;top:24px;width:100%}#guser{padding-bottom:7px !important;text-align:right}#gbar,#guser{font-size:13px;padding-top:1px !important}@media all{.gb1,.gb3{height:22px;margin-right:.5em;vertical-align:top}#gbar{float:left}}a.gb1,a.gb3{color:#00c !important}.gb3{text-decoration:none}</style><script>google.y={};google.x=function(e,g){google.y[e.id]=[e,g];return false};</script></head><body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onload="document.f.q.focus();if(document.images)new Image().src='/images/nav_logo4.png'" topmargin=3 marginheight=3><textarea id=csi style=display:none></textarea><iframe name=wgjf style="display:none"></iframe><div id=gbar><nobr><b class=gb1>Web</b> <a href="http://images.google.com/imghp?hl=en&tab=wi" class=gb1>Images</a> <a href="http://video.google.com/?hl=en&tab=wv" class=gb1>Video</a> <a href="http://maps.google.com/maps?hl=en&tab=wl" class=gb1>Maps</a> <a href="http://news.google.com/nwshp?hl=en&tab=wn" class=gb1>News</a> <a href="http://www.google.com/prdhp?hl=en&tab=wf" class=gb1>Shopping</a> <a href="http://mail.google.com/mail/?hl=en&tab=wm" class=gb1>Gmail</a> <a href="http://www.google.com/intl/en/options/" class=gb3><u>more</u> »</a></nobr></div><div id=guser width=100%><nobr><a href="/url?sa=p&pref=ig&pval=3&q=http://www.google.com/ig%3Fhl%3Den%26source%3Diglk&usg=AFQjCNFA18XPfgb7dKnXfKz7x7g1GDH1tg">iGoogle</a> | <a href="https://www.google.com/accounts/Login?hl=en&continue=http://www.google.com/">Sign in</a></nobr></div><div class=gbh style=left:0></div><div class=gbh style=right:0></div><center><br clear=all id=lgpd><img alt="Google" height=110 
src="/intl/en_ALL/images/logo.gif" width=276 id=logo onload="window.lol&&lol()"><br><br><form action="/search" name=f><table cellpadding=0 cellspacing=0><tr valign=top><td width=25%> </td><td align=center nowrap><input name=hl type=hidden value=en><input type=hidden name=ie value="ISO-8859-1"><input autocomplete="off" maxlength=2048 name=q size=55 title="Google Search" value=""><br><input name=btnG type=submit value="Google Search"><input name=btnI type=submit value="I'm Feeling Lucky"></td><td nowrap width=25% align=left><font size=-2> <a href=/advanced_search?hl=en>Advanced Search</a><br> <a href=/preferences?hl=en>Preferences</a><br> <a href=/language_tools?hl=en>Language Tools</a></font></td></tr></table></form><br><font size=-1><a href="/aclk?sa=L&ai=CqVchLbNASrv7IZa68gS13KTwAc3__IMB29PoogzB2ZzZExABIMFUUK_O0JX______wFgyQaqBAlP0BcDOBRYhqw&num=1&sig=AGiWqty21CD7ixNXZILwCnH7c_3n9v2-tg&q=http://www.allforgood.org#source=hpp">Find an opportunity to volunteer</a> in your community today.</font><br><br><br><font size=-1><a href="/intl/en/ads/">Advertising Programs</a> - <a href="/services/">Business Solutions</a> - <a href="/intl/en/about.html">About Google</a></font><p><font size=-2>©2009 - <a href="/intl/en/privacy.html">Privacy</a></font></p></center><div id=xjsd></div><div id=xjsi><script>if(google.y)google.y.first=[];if(google.y)google.y.first=[];google.dstr=[];google.rein=[];window.setTimeout(function(){var a=document.createElement("script");a.src="/extern_js/f/CgJlbhICdXMgACswCjggQAgsKzAOOAUsKzAYOAQsKzAlOMmIASwrMCY4BCwrMCc4ACw/1t0T7hspHT4.js";(document.getElementById("xjsd")||document.body).appendChild(a)},0);
;google.y.first.push(function(){google.ac.i(document.f,document.f.q,'','')});google.xjs&&google.j&&google.j.xi&&google.j.xi()</script></div><script>(function(){
function a(){google.timers.load.t.ol=(new Date).getTime();google.report&&google.report(google.timers.load,{ei:google.kEI,e:google.kCSIE})}if(window.addEventListener)window.addEventListener("load",a,false);else if(window.attachEvent)window.attachEvent("onload",a);google.timers.load.t.prt=(new Date).getTime();
})();
</script>
@ -0,0 +1,2 @@
<?xml version='1.0' encoding='iso-8859-7'?>
<root/>
@ -0,0 +1,210 @@
#include <gcj/cni.h>

#include <java/io/ByteArrayInputStream.h>
#include <java/lang/System.h>
#include <java/lang/Throwable.h>
#include <java/util/ArrayList.h>
#include <javax/xml/xpath/XPath.h>
#include <javax/xml/xpath/XPathFactory.h>
#include <javax/xml/xpath/XPathExpression.h>
#include <javax/xml/xpath/XPathConstants.h>
#include <javax/xml/parsers/DocumentBuilderFactory.h>
#include <javax/xml/parsers/DocumentBuilder.h>
#include <org/w3c/dom/Attr.h>
#include <org/w3c/dom/Document.h>
#include <org/w3c/dom/Element.h>
#include <org/w3c/dom/NodeList.h>
#include <org/w3c/dom/NamedNodeMap.h>
#include <org/xml/sax/InputSource.h>

#include "nu/validator/htmlparser/dom/HtmlDocumentBuilder.h"

#include "DomUtils.h"

#include "ruby.h"

using namespace java::io;
using namespace java::lang;
using namespace java::util;
using namespace javax::xml::parsers;
using namespace javax::xml::xpath;
using namespace nu::validator::htmlparser::dom;
using namespace org::w3c::dom;
using namespace org::xml::sax;

static VALUE jaxp_Document;
static VALUE jaxp_Attr;
static VALUE jaxp_Element;
static ID ID_read;
static ID ID_doc;
static ID ID_element;

// convert a Java string into a Ruby string
static VALUE j2r(String *string) {
  if (string == NULL) return Qnil;
  jint len = JvGetStringUTFLength(string);
  char buf[len];
  JvGetStringUTFRegion(string, 0, len, buf);
  return rb_str_new(buf, len);
}

// convert a Ruby string into a Java string
static String *r2j(VALUE string) {
  return JvNewStringUTF(RSTRING(string)->ptr);
}

// release the Java Document associated with this Ruby Document
static void vnu_document_free(Document *doc) {
  DomUtils::unpin(doc);
}

// Nu::Validator::parse( string|file )
static VALUE vnu_parse(VALUE self, VALUE input) {
  HtmlDocumentBuilder *parser = new HtmlDocumentBuilder();

  // read file-like objects into memory.  TODO: buffer such objects
  if (rb_respond_to(input, ID_read))
    input = rb_funcall(input, ID_read, 0);

  // convert input into a ByteArrayInputStream
  jbyteArray bytes = JvNewByteArray(RSTRING(input)->len);
  memcpy(elements(bytes), RSTRING(input)->ptr, RSTRING(input)->len);
  InputSource *source = new InputSource(new ByteArrayInputStream(bytes));

  // parse, pin, and wrap
  Document *doc = parser->parse(source);
  DomUtils::pin(doc);
  return Data_Wrap_Struct(jaxp_Document, NULL, vnu_document_free, doc);
}

// Jaxp::parse( string|file )
static VALUE jaxp_parse(VALUE self, VALUE input) {
  DocumentBuilderFactory *factory = DocumentBuilderFactory::newInstance();
  DocumentBuilder *parser = factory->newDocumentBuilder();

  // read file-like objects into memory.  TODO: buffer such objects
  if (rb_respond_to(input, ID_read))
    input = rb_funcall(input, ID_read, 0);

  try {
    jbyteArray bytes = JvNewByteArray(RSTRING(input)->len);
    memcpy(elements(bytes), RSTRING(input)->ptr, RSTRING(input)->len);
    Document *doc = parser->parse(new ByteArrayInputStream(bytes));
    DomUtils::pin(doc);
    return Data_Wrap_Struct(jaxp_Document, NULL, vnu_document_free, doc);
  } catch (java::lang::Throwable *ex) {
    ex->printStackTrace();
    return Qnil;
  }
}


// Jaxp::Document#encoding
static VALUE jaxp_document_encoding(VALUE rdoc) {
  Document *jdoc;
  Data_Get_Struct(rdoc, Document, jdoc);
  return j2r(jdoc->getXmlEncoding());
}

// Jaxp::Document#root
static VALUE jaxp_document_root(VALUE rdoc) {
  Document *jdoc;
  Data_Get_Struct(rdoc, Document, jdoc);

  Element *jelement = jdoc->getDocumentElement();
  if (jelement==NULL) return Qnil;

  VALUE relement = Data_Wrap_Struct(jaxp_Element, NULL, NULL, jelement);
  rb_ivar_set(relement, ID_doc, rdoc);
  return relement;
}

// Jaxp::Document#xpath
static VALUE jaxp_document_xpath(VALUE rdoc, VALUE path) {
  Document *jdoc;
  Data_Get_Struct(rdoc, Document, jdoc);

  Element *jelement = jdoc->getDocumentElement();
  if (jelement==NULL) return Qnil;

  XPath *xpath = XPathFactory::newInstance()->newXPath();
  XPathExpression *expr = xpath->compile(r2j(path));
  NodeList *list = (NodeList*) expr->evaluate(jdoc, XPathConstants::NODESET);

  VALUE result = rb_ary_new();
  for (int i=0; i<list->getLength(); i++) {
    VALUE relement = Data_Wrap_Struct(jaxp_Element, NULL, NULL, list->item(i));
    rb_ivar_set(relement, ID_doc, rdoc);
    rb_ary_push(result, relement);
  }
  return result;
}

// Jaxp::Element#name
static VALUE jaxp_element_name(VALUE relement) {
  Element *jelement;
  Data_Get_Struct(relement, Element, jelement);
  return j2r(jelement->getNodeName());
}

// Jaxp::Element#attributes
static VALUE jaxp_element_attributes(VALUE relement) {
  Element *jelement;
  Data_Get_Struct(relement, Element, jelement);
  VALUE result = rb_hash_new();
  NamedNodeMap *map = jelement->getAttributes();
  for (int i=0; i<map->getLength(); i++) {
    Attr *jattr = (Attr *) map->item(i);
    VALUE rattr = Data_Wrap_Struct(jaxp_Attr, NULL, NULL, jattr);
    rb_ivar_set(rattr, ID_element, relement);
    rb_hash_aset(result, j2r(jattr->getName()), rattr);
  }
  return result;
}

// Jaxp::Attr#value
static VALUE jaxp_attribute_value(VALUE rattribute) {
  Attr *jattribute;
  Data_Get_Struct(rattribute, Attr, jattribute);
  return j2r(jattribute->getValue());
}

typedef VALUE (ruby_method)(...);

// Jaxp and Nu::Validator module initialization
extern "C" void Init_validator() {
  JvCreateJavaVM(NULL);
  JvAttachCurrentThread(NULL, NULL);
  JvInitClass(&DomUtils::class$);
  JvInitClass(&XPathFactory::class$);
  JvInitClass(&XPathConstants::class$);

  VALUE jaxp = rb_define_module("Jaxp");
  rb_define_singleton_method(jaxp, "parse", (ruby_method*)&jaxp_parse, 1);

  VALUE nu = rb_define_module("Nu");
  VALUE validator = rb_define_module_under(nu, "Validator");
  rb_define_singleton_method(validator, "parse", (ruby_method*)&vnu_parse, 1);

  jaxp_Document = rb_define_class_under(jaxp, "Document", rb_cObject);
  rb_define_method(jaxp_Document, "encoding",
    (ruby_method*)&jaxp_document_encoding, 0);
  rb_define_method(jaxp_Document, "root",
    (ruby_method*)&jaxp_document_root, 0);
  rb_define_method(jaxp_Document, "xpath",
    (ruby_method*)&jaxp_document_xpath, 1);

  jaxp_Element = rb_define_class_under(jaxp, "Element", rb_cObject);
  rb_define_method(jaxp_Element, "name",
    (ruby_method*)&jaxp_element_name, 0);
  rb_define_method(jaxp_Element, "attributes",
    (ruby_method*)&jaxp_element_attributes, 0);

  jaxp_Attr = rb_define_class_under(jaxp, "Attr", rb_cObject);
  rb_define_method(jaxp_Attr, "value",
    (ruby_method*)&jaxp_attribute_value, 0);

  ID_read = rb_intern("read");
  ID_doc = rb_intern("@doc");
  ID_element = rb_intern("@element");
}
@ -0,0 +1,59 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Big5 extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"big5",
|
||||
"big5-hkscs",
|
||||
"cn-big5",
|
||||
"csbig5",
|
||||
"x-x-big5"
|
||||
};
|
||||
|
||||
private static final String NAME = "big5";
|
||||
|
||||
static final Big5 INSTANCE = new Big5();
|
||||
|
||||
private Big5() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new Big5Decoder(this);
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return new Big5Encoder(this);
|
||||
}
|
||||
}
|
File diff suppressed because one or more lines are too long
|
@ -0,0 +1,184 @@
|
|||
/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.ByteBuffer;
|
||||
import java.nio.CharBuffer;
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CoderResult;
|
||||
|
||||
public class Big5Decoder extends Decoder {
|
||||
|
||||
private int big5Lead = 0;
|
||||
|
||||
private char pendingTrail = '\u0000';
|
||||
|
||||
protected Big5Decoder(Charset cs) {
|
||||
super(cs, 0.5f, 1.0f);
|
||||
}
|
||||
|
||||
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
|
||||
assert !(this.report && (big5Lead != 0)):
|
||||
"When reporting, this method should never return with big5Lead set.";
|
||||
if (pendingTrail != '\u0000') {
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put(pendingTrail);
|
||||
pendingTrail = '\u0000';
|
||||
}
|
||||
for (;;) {
|
||||
if (!in.hasRemaining()) {
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
int b = ((int) in.get() & 0xFF);
|
||||
if (big5Lead == 0) {
|
||||
if (b <= 0x7F) {
|
||||
out.put((char) b);
|
||||
continue;
|
||||
}
|
||||
if (b >= 0x81 && b <= 0xFE) {
|
||||
if (this.report && !in.hasRemaining()) {
|
||||
// The Java API is badly documented. Need to do this
|
||||
// crazy thing and hope the caller knows about the
|
||||
// undocumented aspects of the API!
|
||||
in.position(in.position() - 1);
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
big5Lead = b;
|
||||
continue;
|
||||
}
|
||||
if (this.report) {
|
||||
in.position(in.position() - 1);
|
||||
return CoderResult.malformedForLength(1);
|
||||
}
|
||||
out.put('\uFFFD');
|
||||
continue;
|
||||
}
|
||||
int lead = big5Lead;
|
||||
big5Lead = 0;
|
||||
int offset = (b < 0x7F) ? 0x40 : 0x62;
|
||||
if ((b >= 0x40 && b <= 0x7E) || (b >= 0xA1 && b <= 0xFE)) {
|
||||
int pointer = (lead - 0x81) * 157 + (b - offset);
|
||||
char outTrail;
|
||||
switch (pointer) {
|
||||
case 1133:
|
||||
out.put('\u00CA');
|
||||
outTrail = '\u0304';
|
||||
break;
|
||||
case 1135:
|
||||
out.put('\u00CA');
|
||||
outTrail = '\u030C';
|
||||
break;
|
||||
case 1164:
|
||||
out.put('\u00EA');
|
||||
outTrail = '\u0304';
|
||||
break;
|
||||
case 1166:
|
||||
out.put('\u00EA');
|
||||
outTrail = '\u030C';
|
||||
break;
|
||||
default:
|
||||
char lowBits = Big5Data.lowBits(pointer);
|
||||
if (lowBits == '\u0000') {
|
||||
// The following |if| block fixes
|
||||
// https://github.com/whatwg/encoding/issues/5
|
||||
if (b <= 0x7F) {
|
||||
// prepend byte to stream
|
||||
// Always legal, since we've always just read a byte
|
||||
// if we come here.
|
||||
in.position(in.position() - 1);
|
||||
}
|
||||
if (this.report) {
|
||||
// This can go past the start of the buffer
|
||||
// if the caller does not conform to the
|
||||
// undocumented aspects of the API.
|
||||
in.position(in.position() - 1);
|
||||
return CoderResult.malformedForLength(b <= 0x7F ? 1 : 2);
|
||||
}
|
||||
out.put('\uFFFD');
|
||||
continue;
|
||||
}
|
||||
if (Big5Data.isAstral(pointer)) {
|
||||
int codePoint = lowBits | 0x20000;
|
||||
// 0xD7C0 == 0xD800 - (0x10000 >> 10): yields the high surrogate directly from the code point.
out.put((char) (0xD7C0 + (codePoint >> 10)));
|
||||
outTrail = (char) (0xDC00 + (codePoint & 0x3FF));
|
||||
break;
|
||||
}
|
||||
out.put(lowBits);
|
||||
continue;
|
||||
}
|
||||
if (!out.hasRemaining()) {
|
||||
pendingTrail = outTrail;
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put(outTrail);
|
||||
continue;
|
||||
}
|
||||
// pointer is null
|
||||
if (b <= 0x7F) {
|
||||
// prepend byte to stream
|
||||
// Always legal, since we've always just read a byte
|
||||
// if we come here.
|
||||
in.position(in.position() - 1);
|
||||
}
|
||||
if (this.report) {
|
||||
// if position() == 0, the caller is not using the
|
||||
// undocumented part of the API right and the line
|
||||
// below will throw!
|
||||
in.position(in.position() - 1);
|
||||
return CoderResult.malformedForLength(b <= 0x7F ? 1 : 2);
|
||||
}
|
||||
out.put('\uFFFD');
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
@Override protected CoderResult implFlush(CharBuffer out) {
|
||||
if (pendingTrail != '\u0000') {
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put(pendingTrail);
|
||||
pendingTrail = '\u0000';
|
||||
}
|
||||
if (big5Lead != 0) {
|
||||
assert !this.report: "How come big5Lead got to be non-zero when decodeLoop() returned in the reporting mode?";
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put('\uFFFD');
|
||||
big5Lead = 0;
|
||||
}
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
|
||||
@Override protected void implReset() {
|
||||
big5Lead = 0;
|
||||
pendingTrail = '\u0000';
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,185 @@
|
|||
/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.ByteBuffer;
|
||||
import java.nio.CharBuffer;
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CoderResult;
|
||||
|
||||
public class Big5Encoder extends Encoder {
|
||||
|
||||
private char utf16Lead = '\u0000';
|
||||
|
||||
private byte pendingTrail = 0;
|
||||
|
||||
protected Big5Encoder(Charset cs) {
|
||||
super(cs, 1.5f, 2.0f);
|
||||
}
|
||||
|
||||
@Override protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
|
||||
assert !((this.reportMalformed || this.reportUnmappable) && (utf16Lead != '\u0000')):
|
||||
"When reporting, this method should never return with utf16Lead set.";
|
||||
if (pendingTrail != 0) {
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put(pendingTrail);
|
||||
pendingTrail = 0;
|
||||
}
|
||||
for (;;) {
|
||||
if (!in.hasRemaining()) {
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
boolean isAstral; // true means Plane 2, false means BMP
|
||||
char lowBits; // The low 16 bits of the code point
|
||||
char codeUnit = in.get();
|
||||
int highBits = (codeUnit & 0xFC00);
|
||||
if (highBits == 0xD800) {
|
||||
// high surrogate
|
||||
if (utf16Lead != '\u0000') {
|
||||
// High surrogate follows another high surrogate. The
|
||||
// *previous* code unit is in error.
|
||||
if (this.reportMalformed) {
|
||||
// The caller had better adhere to the API contract.
|
||||
// Otherwise, this may throw.
|
||||
in.position(in.position() - 2);
|
||||
utf16Lead = '\u0000';
|
||||
return CoderResult.malformedForLength(1);
|
||||
}
|
||||
out.put((byte) '?');
|
||||
}
|
||||
utf16Lead = codeUnit;
|
||||
continue;
|
||||
}
|
||||
if (highBits == 0xDC00) {
|
||||
// low surrogate
|
||||
if (utf16Lead == '\u0000') {
|
||||
// Got low surrogate without a previous high surrogate
|
||||
if (this.reportMalformed) {
|
||||
in.position(in.position() - 1);
|
||||
return CoderResult.malformedForLength(1);
|
||||
}
|
||||
out.put((byte) '?');
|
||||
continue;
|
||||
}
|
||||
// 56613888 == (0xD800 << 10) + 0xDC00 - 0x10000: removes the surrogate offsets in one step.
int codePoint = (utf16Lead << 10) + codeUnit - 56613888;
|
||||
utf16Lead = '\u0000';
|
||||
// Plane 2 is the only astral plane that has potentially
|
||||
// Big5-encodable characters.
|
||||
if ((0xFF0000 & codePoint) != 0x20000) {
|
||||
if (this.reportUnmappable) {
|
||||
in.position(in.position() - 2);
|
||||
return CoderResult.unmappableForLength(2);
|
||||
}
|
||||
out.put((byte) '?');
|
||||
continue;
|
||||
}
|
||||
isAstral = true;
|
||||
lowBits = (char)(codePoint & 0xFFFF);
|
||||
} else {
|
||||
// not a surrogate
|
||||
if (utf16Lead != '\u0000') {
|
||||
// Non-surrogate follows a high surrogate. The *previous*
|
||||
// code unit is in error.
|
||||
utf16Lead = '\u0000';
|
||||
if (this.reportMalformed) {
|
||||
// The caller had better adhere to the API contract.
|
||||
// Otherwise, this may throw.
|
||||
in.position(in.position() - 2);
|
||||
return CoderResult.malformedForLength(1);
|
||||
}
|
||||
out.put((byte) '?');
|
||||
// Let's unconsume this code unit and reloop in order to
|
||||
// re-check if the output buffer still has space.
|
||||
in.position(in.position() - 1);
|
||||
continue;
|
||||
}
|
||||
isAstral = false;
|
||||
lowBits = codeUnit;
|
||||
}
|
||||
// isAstral now tells us if we have a Plane 2 or a BMP character.
|
||||
// lowBits tells us the low 16 bits.
|
||||
// After all the above setup to deal with UTF-16, we are now
|
||||
// finally ready to follow the spec.
|
||||
if (!isAstral && lowBits <= 0x7F) {
|
||||
out.put((byte)lowBits);
|
||||
continue;
|
||||
}
|
||||
int pointer = Big5Data.findPointer(lowBits, isAstral);
|
||||
if (pointer == 0) {
|
||||
if (this.reportUnmappable) {
|
||||
if (isAstral) {
|
||||
in.position(in.position() - 2);
|
||||
return CoderResult.unmappableForLength(2);
|
||||
}
|
||||
in.position(in.position() - 1);
|
||||
return CoderResult.unmappableForLength(1);
|
||||
}
|
||||
out.put((byte)'?');
|
||||
continue;
|
||||
}
|
||||
int lead = pointer / 157 + 0x81;
|
||||
int trail = pointer % 157;
|
||||
if (trail < 0x3F) {
|
||||
trail += 0x40;
|
||||
} else {
|
||||
trail += 0x62;
|
||||
}
|
||||
out.put((byte)lead);
|
||||
if (!out.hasRemaining()) {
|
||||
pendingTrail = (byte)trail;
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put((byte)trail);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
@Override protected CoderResult implFlush(ByteBuffer out) {
|
||||
if (pendingTrail != 0) {
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put(pendingTrail);
|
||||
pendingTrail = 0;
|
||||
}
|
||||
if (utf16Lead != '\u0000') {
|
||||
assert !this.reportMalformed: "How come utf16Lead got to be non-zero when encodeLoop() returned in the reporting mode?";
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
out.put((byte)'?');
|
||||
utf16Lead = '\u0000';
|
||||
}
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
|
||||
@Override protected void implReset() {
|
||||
utf16Lead = '\u0000';
|
||||
pendingTrail = 0;
|
||||
}
|
||||
}
|
|
@ -0,0 +1,80 @@
|
|||
/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CodingErrorAction;
|
||||
|
||||
public abstract class Decoder extends CharsetDecoder {
|
||||
|
||||
protected boolean report = true;
|
||||
|
||||
protected Decoder(Charset cs, float averageCharsPerByte, float maxCharsPerByte) {
|
||||
super(cs, averageCharsPerByte, maxCharsPerByte);
|
||||
}
|
||||
|
||||
@Override protected final void implOnMalformedInput(CodingErrorAction newAction) {
|
||||
if (newAction == null) {
|
||||
throw new IllegalArgumentException("The argument must not be null.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.IGNORE) {
|
||||
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPLACE) {
|
||||
this.report = false;
|
||||
return;
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPORT) {
|
||||
this.report = true;
|
||||
return;
|
||||
}
|
||||
assert false: "Unreachable.";
|
||||
throw new IllegalArgumentException("Unknown CodingErrorAction.");
|
||||
}
|
||||
|
||||
@Override protected final void implOnUnmappableCharacter(
|
||||
CodingErrorAction newAction) {
|
||||
if (newAction == null) {
|
||||
throw new IllegalArgumentException("The argument must not be null.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.IGNORE) {
|
||||
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPLACE) {
|
||||
return; // We don't actually care, since there are no unmappables.
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPORT) {
|
||||
return; // We don't actually care, since there are no unmappables.
|
||||
}
|
||||
assert false: "Unreachable.";
|
||||
throw new IllegalArgumentException("Unknown CodingErrorAction.");
|
||||
}
|
||||
|
||||
@Override protected final void implReplaceWith(String newReplacement) {
|
||||
if (!"\uFFFD".equals(newReplacement)) {
|
||||
throw new IllegalArgumentException("Only U+FFFD is allowed as the replacement.");
|
||||
}
|
||||
}
|
||||
|
||||
// TODO: Check if the JDK decoders reset the reporting state on reset()
|
||||
}
|
|
@ -0,0 +1,95 @@
|
|||
/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
import java.nio.charset.CodingErrorAction;
|
||||
|
||||
public abstract class Encoder extends CharsetEncoder {
|
||||
|
||||
boolean reportMalformed = true;
|
||||
|
||||
boolean reportUnmappable = true;
|
||||
|
||||
protected Encoder(Charset cs, float averageBytesPerChar,
|
||||
float maxBytesPerChar) {
|
||||
super(cs, averageBytesPerChar, maxBytesPerChar);
|
||||
}
|
||||
|
||||
@Override protected final void implOnMalformedInput(CodingErrorAction newAction) {
|
||||
if (newAction == null) {
|
||||
throw new IllegalArgumentException("The argument must not be null.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.IGNORE) {
|
||||
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPLACE) {
|
||||
this.reportMalformed = false;
|
||||
return;
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPORT) {
|
||||
this.reportMalformed = true;
|
||||
return;
|
||||
}
|
||||
assert false: "Unreachable.";
|
||||
throw new IllegalArgumentException("Unknown CodingErrorAction.");
|
||||
}
|
||||
|
||||
@Override protected final void implOnUnmappableCharacter(
|
||||
CodingErrorAction newAction) {
|
||||
if (newAction == null) {
|
||||
throw new IllegalArgumentException("The argument must not be null.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.IGNORE) {
|
||||
throw new IllegalArgumentException("The Encoding Standard does not allow errors to be ignored.");
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPLACE) {
|
||||
this.reportUnmappable = false;
|
||||
return;
|
||||
}
|
||||
if (newAction == CodingErrorAction.REPORT) {
|
||||
this.reportUnmappable = true;
|
||||
return;
|
||||
}
|
||||
assert false: "Unreachable.";
|
||||
throw new IllegalArgumentException("Unknown CodingErrorAction.");
|
||||
}
|
||||
|
||||
@Override public boolean isLegalReplacement(byte[] repl) {
|
||||
if (repl == null) {
|
||||
return false;
|
||||
}
|
||||
if (repl.length != 1) {
|
||||
return false;
|
||||
}
|
||||
if (repl[0] != '?') {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
@Override protected final void implReplaceWith(byte[] newReplacement) {
// Nothing to do: isLegalReplacement() only accepts the single byte '?'.
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,886 @@
|
|||
/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
import java.nio.charset.IllegalCharsetNameException;
|
||||
import java.nio.charset.UnsupportedCharsetException;
|
||||
import java.nio.charset.spi.CharsetProvider;
|
||||
import java.util.Arrays;
|
||||
import java.util.Collections;
|
||||
import java.util.SortedMap;
|
||||
import java.util.TreeMap;
|
||||
|
||||
/**
|
||||
* Represents an <a href="https://encoding.spec.whatwg.org/#encoding">encoding</a>
|
||||
* as defined in the <a href="https://encoding.spec.whatwg.org/">Encoding
|
||||
* Standard</a>, provides access to each encoding defined in the Encoding
|
||||
* Standard via a static constant and provides the
|
||||
* "<a href="https://encoding.spec.whatwg.org/#concept-encoding-get">get an
|
||||
* encoding</a>" algorithm defined in the Encoding Standard.
|
||||
*
|
||||
* <p>This class inherits from {@link Charset} to allow the Encoding
|
||||
* Standard-compliant encodings to be used in contexts that support
|
||||
* <code>Charset</code> instances. However, by design, the Encoding
|
||||
* Standard-compliant encodings are not supplied via a {@link CharsetProvider}
|
||||
* and, therefore, are not available via and do not interfere with the static
|
||||
* methods provided by <code>Charset</code>. (This class provides methods of
|
||||
* the same name to hide each static method of <code>Charset</code> to help
|
||||
* avoid accidental calls to the static methods of the superclass when working
|
||||
* with Encoding Standard-compliant encodings.)
|
||||
*
|
||||
* <p>When an application needs to use a particular encoding, such as utf-8
|
||||
* or windows-1252, the corresponding constant, i.e.
|
||||
* {@link #UTF_8 Encoding.UTF_8} and {@link #WINDOWS_1252 Encoding.WINDOWS_1252}
|
||||
* respectively, should be used. However, when the application receives an
|
||||
* encoding label from external input, the method {@link #forName(String)
|
||||
* forName()} should be used to obtain the object representing the encoding
|
||||
* identified by the label. In contexts where labels that map to the
|
||||
* <a href="https://encoding.spec.whatwg.org/#replacement">replacement
|
||||
* encoding</a> should be treated as unknown, the method {@link
|
||||
* #forNameNoReplacement(String) forNameNoReplacement()} should be used instead.
|
||||
*
|
||||
*
|
||||
* @author hsivonen
|
||||
*/
|
||||
public abstract class Encoding extends Charset {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"866",
|
||||
"ansi_x3.4-1968",
|
||||
"arabic",
|
||||
"ascii",
|
||||
"asmo-708",
|
||||
"big5",
|
||||
"big5-hkscs",
|
||||
"chinese",
|
||||
"cn-big5",
|
||||
"cp1250",
|
||||
"cp1251",
|
||||
"cp1252",
|
||||
"cp1253",
|
||||
"cp1254",
|
||||
"cp1255",
|
||||
"cp1256",
|
||||
"cp1257",
|
||||
"cp1258",
|
||||
"cp819",
|
||||
"cp866",
|
||||
"csbig5",
|
||||
"cseuckr",
|
||||
"cseucpkdfmtjapanese",
|
||||
"csgb2312",
|
||||
"csibm866",
|
||||
"csiso2022jp",
|
||||
"csiso2022kr",
|
||||
"csiso58gb231280",
|
||||
"csiso88596e",
|
||||
"csiso88596i",
|
||||
"csiso88598e",
|
||||
"csiso88598i",
|
||||
"csisolatin1",
|
||||
"csisolatin2",
|
||||
"csisolatin3",
|
||||
"csisolatin4",
|
||||
"csisolatin5",
|
||||
"csisolatin6",
|
||||
"csisolatin9",
|
||||
"csisolatinarabic",
|
||||
"csisolatincyrillic",
|
||||
"csisolatingreek",
|
||||
"csisolatinhebrew",
|
||||
"cskoi8r",
|
||||
"csksc56011987",
|
||||
"csmacintosh",
|
||||
"csshiftjis",
|
||||
"cyrillic",
|
||||
"dos-874",
|
||||
"ecma-114",
|
||||
"ecma-118",
|
||||
"elot_928",
|
||||
"euc-jp",
|
||||
"euc-kr",
|
||||
"gb18030",
|
||||
"gb2312",
|
||||
"gb_2312",
|
||||
"gb_2312-80",
|
||||
"gbk",
|
||||
"greek",
|
||||
"greek8",
|
||||
"hebrew",
|
||||
"hz-gb-2312",
|
||||
"ibm819",
|
||||
"ibm866",
|
||||
"iso-2022-cn",
|
||||
"iso-2022-cn-ext",
|
||||
"iso-2022-jp",
|
||||
"iso-2022-kr",
|
||||
"iso-8859-1",
|
||||
"iso-8859-10",
|
||||
"iso-8859-11",
|
||||
"iso-8859-13",
|
||||
"iso-8859-14",
|
||||
"iso-8859-15",
|
||||
"iso-8859-16",
|
||||
"iso-8859-2",
|
||||
"iso-8859-3",
|
||||
"iso-8859-4",
|
||||
"iso-8859-5",
|
||||
"iso-8859-6",
|
||||
"iso-8859-6-e",
|
||||
"iso-8859-6-i",
|
||||
"iso-8859-7",
|
||||
"iso-8859-8",
|
||||
"iso-8859-8-e",
|
||||
"iso-8859-8-i",
|
||||
"iso-8859-9",
|
||||
"iso-ir-100",
|
||||
"iso-ir-101",
|
||||
"iso-ir-109",
|
||||
"iso-ir-110",
|
||||
"iso-ir-126",
|
||||
"iso-ir-127",
|
||||
"iso-ir-138",
|
||||
"iso-ir-144",
|
||||
"iso-ir-148",
|
||||
"iso-ir-149",
|
||||
"iso-ir-157",
|
||||
"iso-ir-58",
|
||||
"iso8859-1",
|
||||
"iso8859-10",
|
||||
"iso8859-11",
|
||||
"iso8859-13",
|
||||
"iso8859-14",
|
||||
"iso8859-15",
|
||||
"iso8859-2",
|
||||
"iso8859-3",
|
||||
"iso8859-4",
|
||||
"iso8859-5",
|
||||
"iso8859-6",
|
||||
"iso8859-7",
|
||||
"iso8859-8",
|
||||
"iso8859-9",
|
||||
"iso88591",
|
||||
"iso885910",
|
||||
"iso885911",
|
||||
"iso885913",
|
||||
"iso885914",
|
||||
"iso885915",
|
||||
"iso88592",
|
||||
"iso88593",
|
||||
"iso88594",
|
||||
"iso88595",
|
||||
"iso88596",
|
||||
"iso88597",
|
||||
"iso88598",
|
||||
"iso88599",
|
||||
"iso_8859-1",
|
||||
"iso_8859-15",
|
||||
"iso_8859-1:1987",
|
||||
"iso_8859-2",
|
||||
"iso_8859-2:1987",
|
||||
"iso_8859-3",
|
||||
"iso_8859-3:1988",
|
||||
"iso_8859-4",
|
||||
"iso_8859-4:1988",
|
||||
"iso_8859-5",
|
||||
"iso_8859-5:1988",
|
||||
"iso_8859-6",
|
||||
"iso_8859-6:1987",
|
||||
"iso_8859-7",
|
||||
"iso_8859-7:1987",
|
||||
"iso_8859-8",
|
||||
"iso_8859-8:1988",
|
||||
"iso_8859-9",
|
||||
"iso_8859-9:1989",
|
||||
"koi",
|
||||
"koi8",
|
||||
"koi8-r",
|
||||
"koi8-ru",
|
||||
"koi8-u",
|
||||
"koi8_r",
|
||||
"korean",
|
||||
"ks_c_5601-1987",
|
||||
"ks_c_5601-1989",
|
||||
"ksc5601",
|
||||
"ksc_5601",
|
||||
"l1",
|
||||
"l2",
|
||||
"l3",
|
||||
"l4",
|
||||
"l5",
|
||||
"l6",
|
||||
"l9",
|
||||
"latin1",
|
||||
"latin2",
|
||||
"latin3",
|
||||
"latin4",
|
||||
"latin5",
|
||||
"latin6",
|
||||
"logical",
|
||||
"mac",
|
||||
"macintosh",
|
||||
"ms932",
|
||||
"ms_kanji",
|
||||
"shift-jis",
|
||||
"shift_jis",
|
||||
"sjis",
|
||||
"sun_eu_greek",
|
||||
"tis-620",
|
||||
"unicode-1-1-utf-8",
|
||||
"us-ascii",
|
||||
"utf-16",
|
||||
"utf-16be",
|
||||
"utf-16le",
|
||||
"utf-8",
|
||||
"utf8",
|
||||
"visual",
|
||||
"windows-1250",
|
||||
"windows-1251",
|
||||
"windows-1252",
|
||||
"windows-1253",
|
||||
"windows-1254",
|
||||
"windows-1255",
|
||||
"windows-1256",
|
||||
"windows-1257",
|
||||
"windows-1258",
|
||||
"windows-31j",
|
||||
"windows-874",
|
||||
"windows-949",
|
||||
"x-cp1250",
|
||||
"x-cp1251",
|
||||
"x-cp1252",
|
||||
"x-cp1253",
|
||||
"x-cp1254",
|
||||
"x-cp1255",
|
||||
"x-cp1256",
|
||||
"x-cp1257",
|
||||
"x-cp1258",
|
||||
"x-euc-jp",
|
||||
"x-gbk",
|
||||
"x-mac-cyrillic",
|
||||
"x-mac-roman",
|
||||
"x-mac-ukrainian",
|
||||
"x-sjis",
|
||||
"x-user-defined",
|
||||
"x-x-big5",
|
||||
};
|
||||
|
||||
private static final Encoding[] ENCODINGS_FOR_LABELS = {
|
||||
Ibm866.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Big5.INSTANCE,
|
||||
Big5.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Big5.INSTANCE,
|
||||
Windows1250.INSTANCE,
|
||||
Windows1251.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Windows1253.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1255.INSTANCE,
|
||||
Windows1256.INSTANCE,
|
||||
Windows1257.INSTANCE,
|
||||
Windows1258.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Ibm866.INSTANCE,
|
||||
Big5.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
EucJp.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Ibm866.INSTANCE,
|
||||
Iso2022Jp.INSTANCE,
|
||||
Replacement.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Iso8I.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Iso15.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Koi8R.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
Macintosh.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Windows874.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
EucJp.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
Gb18030.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Replacement.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Ibm866.INSTANCE,
|
||||
Replacement.INSTANCE,
|
||||
Replacement.INSTANCE,
|
||||
Iso2022Jp.INSTANCE,
|
||||
Replacement.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Windows874.INSTANCE,
|
||||
Iso13.INSTANCE,
|
||||
Iso14.INSTANCE,
|
||||
Iso15.INSTANCE,
|
||||
Iso16.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Iso8I.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Windows874.INSTANCE,
|
||||
Iso13.INSTANCE,
|
||||
Iso14.INSTANCE,
|
||||
Iso15.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Windows874.INSTANCE,
|
||||
Iso13.INSTANCE,
|
||||
Iso14.INSTANCE,
|
||||
Iso15.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso15.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Koi8R.INSTANCE,
|
||||
Koi8R.INSTANCE,
|
||||
Koi8R.INSTANCE,
|
||||
Koi8U.INSTANCE,
|
||||
Koi8U.INSTANCE,
|
||||
Koi8R.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Iso15.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Iso8I.INSTANCE,
|
||||
Macintosh.INSTANCE,
|
||||
Macintosh.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Windows874.INSTANCE,
|
||||
Utf8.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Utf16Le.INSTANCE,
|
||||
Utf16Be.INSTANCE,
|
||||
Utf16Le.INSTANCE,
|
||||
Utf8.INSTANCE,
|
||||
Utf8.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Windows1250.INSTANCE,
|
||||
Windows1251.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Windows1253.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1255.INSTANCE,
|
||||
Windows1256.INSTANCE,
|
||||
Windows1257.INSTANCE,
|
||||
Windows1258.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
Windows874.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
Windows1250.INSTANCE,
|
||||
Windows1251.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Windows1253.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1255.INSTANCE,
|
||||
Windows1256.INSTANCE,
|
||||
Windows1257.INSTANCE,
|
||||
Windows1258.INSTANCE,
|
||||
EucJp.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
MacCyrillic.INSTANCE,
|
||||
Macintosh.INSTANCE,
|
||||
MacCyrillic.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
UserDefined.INSTANCE,
|
||||
Big5.INSTANCE,
|
||||
};
|
||||
|
||||
private static final Encoding[] ENCODINGS = {
|
||||
Big5.INSTANCE,
|
||||
EucJp.INSTANCE,
|
||||
EucKr.INSTANCE,
|
||||
Gb18030.INSTANCE,
|
||||
Gbk.INSTANCE,
|
||||
Ibm866.INSTANCE,
|
||||
Iso2022Jp.INSTANCE,
|
||||
Iso10.INSTANCE,
|
||||
Iso13.INSTANCE,
|
||||
Iso14.INSTANCE,
|
||||
Iso15.INSTANCE,
|
||||
Iso16.INSTANCE,
|
||||
Iso2.INSTANCE,
|
||||
Iso3.INSTANCE,
|
||||
Iso4.INSTANCE,
|
||||
Iso5.INSTANCE,
|
||||
Iso6.INSTANCE,
|
||||
Iso7.INSTANCE,
|
||||
Iso8.INSTANCE,
|
||||
Iso8I.INSTANCE,
|
||||
Koi8R.INSTANCE,
|
||||
Koi8U.INSTANCE,
|
||||
Macintosh.INSTANCE,
|
||||
Replacement.INSTANCE,
|
||||
ShiftJis.INSTANCE,
|
||||
Utf16Be.INSTANCE,
|
||||
Utf16Le.INSTANCE,
|
||||
Utf8.INSTANCE,
|
||||
Windows1250.INSTANCE,
|
||||
Windows1251.INSTANCE,
|
||||
Windows1252.INSTANCE,
|
||||
Windows1253.INSTANCE,
|
||||
Windows1254.INSTANCE,
|
||||
Windows1255.INSTANCE,
|
||||
Windows1256.INSTANCE,
|
||||
Windows1257.INSTANCE,
|
||||
Windows1258.INSTANCE,
|
||||
Windows874.INSTANCE,
|
||||
MacCyrillic.INSTANCE,
|
||||
UserDefined.INSTANCE,
|
||||
};
|
||||
|
||||
/**
|
||||
* The big5 encoding.
|
||||
*/
|
||||
public static final Encoding BIG5 = Big5.INSTANCE;
|
||||
|
||||
/**
|
||||
* The euc-jp encoding.
|
||||
*/
|
||||
public static final Encoding EUC_JP = EucJp.INSTANCE;
|
||||
|
||||
/**
|
||||
* The euc-kr encoding.
|
||||
*/
|
||||
public static final Encoding EUC_KR = EucKr.INSTANCE;
|
||||
|
||||
/**
|
||||
* The gb18030 encoding.
|
||||
*/
|
||||
public static final Encoding GB18030 = Gb18030.INSTANCE;
|
||||
|
||||
/**
|
||||
* The gbk encoding.
|
||||
*/
|
||||
public static final Encoding GBK = Gbk.INSTANCE;
|
||||
|
||||
/**
|
||||
* The ibm866 encoding.
|
||||
*/
|
||||
public static final Encoding IBM866 = Ibm866.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-2022-jp encoding.
|
||||
*/
|
||||
public static final Encoding ISO_2022_JP = Iso2022Jp.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-10 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_10 = Iso10.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-13 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_13 = Iso13.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-14 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_14 = Iso14.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-15 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_15 = Iso15.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-16 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_16 = Iso16.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-2 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_2 = Iso2.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-3 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_3 = Iso3.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-4 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_4 = Iso4.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-5 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_5 = Iso5.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-6 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_6 = Iso6.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-7 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_7 = Iso7.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-8 encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_8 = Iso8.INSTANCE;
|
||||
|
||||
/**
|
||||
* The iso-8859-8-i encoding.
|
||||
*/
|
||||
public static final Encoding ISO_8859_8_I = Iso8I.INSTANCE;
|
||||
|
||||
/**
|
||||
* The koi8-r encoding.
|
||||
*/
|
||||
public static final Encoding KOI8_R = Koi8R.INSTANCE;
|
||||
|
||||
/**
|
||||
* The koi8-u encoding.
|
||||
*/
|
||||
public static final Encoding KOI8_U = Koi8U.INSTANCE;
|
||||
|
||||
/**
|
||||
* The macintosh encoding.
|
||||
*/
|
||||
public static final Encoding MACINTOSH = Macintosh.INSTANCE;
|
||||
|
||||
/**
|
||||
* The replacement encoding.
|
||||
*/
|
||||
public static final Encoding REPLACEMENT = Replacement.INSTANCE;
|
||||
|
||||
/**
|
||||
* The shift_jis encoding.
|
||||
*/
|
||||
public static final Encoding SHIFT_JIS = ShiftJis.INSTANCE;
|
||||
|
||||
/**
|
||||
* The utf-16be encoding.
|
||||
*/
|
||||
public static final Encoding UTF_16BE = Utf16Be.INSTANCE;
|
||||
|
||||
/**
|
||||
* The utf-16le encoding.
|
||||
*/
|
||||
public static final Encoding UTF_16LE = Utf16Le.INSTANCE;
|
||||
|
||||
/**
|
||||
* The utf-8 encoding.
|
||||
*/
|
||||
public static final Encoding UTF_8 = Utf8.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1250 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1250 = Windows1250.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1251 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1251 = Windows1251.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1252 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1252 = Windows1252.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1253 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1253 = Windows1253.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1254 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1254 = Windows1254.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1255 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1255 = Windows1255.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1256 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1256 = Windows1256.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1257 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1257 = Windows1257.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-1258 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_1258 = Windows1258.INSTANCE;
|
||||
|
||||
/**
|
||||
* The windows-874 encoding.
|
||||
*/
|
||||
public static final Encoding WINDOWS_874 = Windows874.INSTANCE;
|
||||
|
||||
/**
|
||||
* The x-mac-cyrillic encoding.
|
||||
*/
|
||||
public static final Encoding X_MAC_CYRILLIC = MacCyrillic.INSTANCE;
|
||||
|
||||
/**
|
||||
* The x-user-defined encoding.
|
||||
*/
|
||||
public static final Encoding X_USER_DEFINED = UserDefined.INSTANCE;
|
||||
|
||||
|
||||
private static SortedMap<String, Charset> encodings = null;
|
||||
|
||||
protected Encoding(String canonicalName, String[] aliases) {
|
||||
super(canonicalName, aliases);
|
||||
}
|
||||
|
||||
private enum State {
|
||||
HEAD, LABEL, TAIL
|
||||
};
|
||||
|
||||
public static Encoding forName(String label) {
|
||||
if (label == null) {
|
||||
throw new IllegalArgumentException("Label must not be null.");
|
||||
}
|
||||
if (label.length() == 0) {
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
// Fast path: exact match for a label that is already lowercase and has no surrounding whitespace.
|
||||
int index = Arrays.binarySearch(LABELS, label);
|
||||
if (index >= 0) {
|
||||
return ENCODINGS_FOR_LABELS[index];
|
||||
}
|
||||
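// Slow path: strip leading/trailing ASCII whitespace, lowercase A-Z, reject any other character, then retry the lookup.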
|
||||
StringBuilder sb = new StringBuilder();
|
||||
State state = State.HEAD;
|
||||
for (int i = 0; i < label.length(); i++) {
|
||||
char c = label.charAt(i);
|
||||
if ((c == ' ') || (c == '\n') || (c == '\r') || (c == '\t')
|
||||
|| (c == '\u000C')) {
|
||||
if (state == State.LABEL) {
|
||||
state = State.TAIL;
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')) {
|
||||
switch (state) {
|
||||
case HEAD:
|
||||
state = State.LABEL;
|
||||
// Fall through
|
||||
case LABEL:
|
||||
sb.append(c);
|
||||
continue;
|
||||
case TAIL:
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
}
|
||||
if (c >= 'A' && c <= 'Z') {
|
||||
c += 0x20;
|
||||
switch (state) {
|
||||
case HEAD:
|
||||
state = State.LABEL;
|
||||
// Fall through
|
||||
case LABEL:
|
||||
sb.append(c);
|
||||
continue;
|
||||
case TAIL:
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
}
|
||||
if ((c == '-') || (c == '+') || (c == '.') || (c == ':')
|
||||
|| (c == '_')) {
|
||||
switch (state) {
|
||||
case LABEL:
|
||||
sb.append(c);
|
||||
continue;
|
||||
case HEAD:
|
||||
case TAIL:
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
}
|
||||
throw new IllegalCharsetNameException(label);
|
||||
}
|
||||
index = Arrays.binarySearch(LABELS, sb.toString());
|
||||
if (index >= 0) {
|
||||
return ENCODINGS_FOR_LABELS[index];
|
||||
}
|
||||
throw new UnsupportedCharsetException(label);
|
||||
}
|
||||
|
||||
public static Encoding forNameNoReplacement(String label) {
|
||||
Encoding encoding = Encoding.forName(label);
|
||||
if (encoding == Encoding.REPLACEMENT) {
|
||||
throw new UnsupportedCharsetException(label);
|
||||
}
|
||||
return encoding;
|
||||
}
|
||||
|
||||
public static boolean isSupported(String label) {
|
||||
try {
|
||||
Encoding.forName(label);
|
||||
} catch (UnsupportedCharsetException e) {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
public static boolean isSupportedNoReplacement(String label) {
|
||||
try {
|
||||
Encoding.forNameNoReplacement(label);
|
||||
} catch (UnsupportedCharsetException e) {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
public static SortedMap<String, Charset> availableCharsets() {
|
||||
if (encodings == null) {
|
||||
TreeMap<String, Charset> map = new TreeMap<String, Charset>();
|
||||
for (Encoding encoding : ENCODINGS) {
|
||||
map.put(encoding.name(), encoding);
|
||||
}
|
||||
encodings = Collections.unmodifiableSortedMap(map);
|
||||
}
|
||||
return encodings;
|
||||
}
|
||||
|
||||
public static Encoding defaultCharset() {
|
||||
return WINDOWS_1252;
|
||||
}
|
||||
|
||||
@Override public boolean canEncode() {
|
||||
return false;
|
||||
}
|
||||
|
||||
@Override public boolean contains(Charset cs) {
|
||||
return false;
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
throw new UnsupportedOperationException("Encoder not implemented.");
|
||||
}
|
||||
}
|
|
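The slow path of `Encoding.forName` above is a small three-state scanner (`HEAD`, `LABEL`, `TAIL`). The following is a minimal standalone sketch of that normalization step only, not the patch's code: the class name `LabelNormalizationSketch` and the helper `normalize` are hypothetical, and it throws a plain `IllegalArgumentException` where the original throws `IllegalCharsetNameException`.

```java
public class LabelNormalizationSketch {

    private enum State { HEAD, LABEL, TAIL }

    /**
     * Hypothetical helper (not part of the patch): skips leading and trailing
     * ASCII whitespace, lowercases A-Z, and rejects every label that the slow
     * path of Encoding.forName would reject.
     */
    public static String normalize(String label) {
        StringBuilder sb = new StringBuilder();
        State state = State.HEAD;
        for (int i = 0; i < label.length(); i++) {
            char c = label.charAt(i);
            boolean space = c == ' ' || c == '\n' || c == '\r' || c == '\t' || c == '\f';
            if (space) {
                // Whitespace after the label body means only whitespace may follow.
                if (state == State.LABEL) {
                    state = State.TAIL;
                }
                continue;
            }
            if (c >= 'A' && c <= 'Z') {
                c += 0x20; // ASCII lowercase
            }
            boolean alnum = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
            boolean punct = c == '-' || c == '+' || c == '.' || c == ':' || c == '_';
            // Punctuation may not start a label; nothing may follow trailing whitespace.
            if (state == State.TAIL || !(alnum || (punct && state == State.LABEL))) {
                throw new IllegalArgumentException(label);
            }
            state = State.LABEL;
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("  UTF-8\n"));
        System.out.println(normalize("Windows-1252"));
    }
}
```

In the patch the normalized string is then looked up with `Arrays.binarySearch(LABELS, ...)`, which relies on `LABELS` being sorted; the sketch leaves that lookup out.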
@ -0,0 +1,57 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class EucJp extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"cseucpkdfmtjapanese",
|
||||
"euc-jp",
|
||||
"x-euc-jp"
|
||||
};
|
||||
|
||||
private static final String NAME = "euc-jp";
|
||||
|
||||
static final EucJp INSTANCE = new EucJp();
|
||||
|
||||
private EucJp() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,64 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class EucKr extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"cseuckr",
|
||||
"csksc56011987",
|
||||
"euc-kr",
|
||||
"iso-ir-149",
|
||||
"korean",
|
||||
"ks_c_5601-1987",
|
||||
"ks_c_5601-1989",
|
||||
"ksc5601",
|
||||
"ksc_5601",
|
||||
"windows-949"
|
||||
};
|
||||
|
||||
private static final String NAME = "euc-kr";
|
||||
|
||||
static final EucKr INSTANCE = new EucKr();
|
||||
|
||||
private EucKr() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,61 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.ByteBuffer;
|
||||
import java.nio.CharBuffer;
|
||||
import java.nio.charset.CoderResult;
|
||||
|
||||
public final class FallibleSingleByteDecoder extends InfallibleSingleByteDecoder {
|
||||
|
||||
public FallibleSingleByteDecoder(Encoding cs, char[] upperHalf) {
|
||||
super(cs, upperHalf);
|
||||
}
|
||||
|
||||
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
|
||||
if (!this.report) {
|
||||
return super.decodeLoop(in, out);
|
||||
} else {
|
||||
for (;;) {
|
||||
if (!in.hasRemaining()) {
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
int b = (int) in.get(); // Java bytes are signed: 0x80..0xFF arrive as -128..-1.
|
||||
if (b >= 0) {
|
||||
out.put((char) b);
|
||||
} else {
|
||||
char mapped = this.upperHalf[b + 128]; // b + 128 maps -128..-1 to table index 0..127.
|
||||
if (mapped == '\uFFFD') { // U+FFFD in the table marks a byte with no mapping.
|
||||
in.position(in.position() - 1); // Un-consume the byte so the position points at the malformed input.
|
||||
return CoderResult.malformedForLength(1);
|
||||
}
|
||||
out.put(mapped);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
}
|
|
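`FallibleSingleByteDecoder` returns `CoderResult.malformedForLength(1)` only when its `report` flag is set. The excerpt does not show where that flag comes from, so the sketch below is an assumption: it illustrates the analogous replace-versus-report distinction using only the JDK's own US-ASCII decoder and `CodingErrorAction`. The class `MalformedInputSketch` and the helper `decodeOrNull` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class MalformedInputSketch {

    /**
     * Hypothetical helper: decodes the input as US-ASCII with the given
     * error action and returns null if the decoder reports an error.
     */
    public static String decodeOrNull(byte[] input, CodingErrorAction action) {
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(input)).toString();
        } catch (CharacterCodingException e) {
            // With CodingErrorAction.REPORT, a malformed byte surfaces here.
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] input = { 0x41, (byte) 0x80 }; // 'A' followed by a non-ASCII byte
        System.out.println(decodeOrNull(input, CodingErrorAction.REPLACE));
        System.out.println(decodeOrNull(input, CodingErrorAction.REPORT));
    }
}
```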
@ -0,0 +1,55 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Gb18030 extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"gb18030"
|
||||
};
|
||||
|
||||
private static final String NAME = "gb18030";
|
||||
|
||||
static final Gb18030 INSTANCE = new Gb18030();
|
||||
|
||||
private Gb18030() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,63 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Gbk extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"chinese",
|
||||
"csgb2312",
|
||||
"csiso58gb231280",
|
||||
"gb2312",
|
||||
"gb_2312",
|
||||
"gb_2312-80",
|
||||
"gbk",
|
||||
"iso-ir-58",
|
||||
"x-gbk"
|
||||
};
|
||||
|
||||
private static final String NAME = "gbk";
|
||||
|
||||
static final Gbk INSTANCE = new Gbk();
|
||||
|
||||
private Gbk() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName("gb18030").newDecoder(); // Per the Encoding Standard, gbk is decoded with the gb18030 decoder.
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,184 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Ibm866 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0410',
|
||||
'\u0411',
|
||||
'\u0412',
|
||||
'\u0413',
|
||||
'\u0414',
|
||||
'\u0415',
|
||||
'\u0416',
|
||||
'\u0417',
|
||||
'\u0418',
|
||||
'\u0419',
|
||||
'\u041a',
|
||||
'\u041b',
|
||||
'\u041c',
|
||||
'\u041d',
|
||||
'\u041e',
|
||||
'\u041f',
|
||||
'\u0420',
|
||||
'\u0421',
|
||||
'\u0422',
|
||||
'\u0423',
|
||||
'\u0424',
|
||||
'\u0425',
|
||||
'\u0426',
|
||||
'\u0427',
|
||||
'\u0428',
|
||||
'\u0429',
|
||||
'\u042a',
|
||||
'\u042b',
|
||||
'\u042c',
|
||||
'\u042d',
|
||||
'\u042e',
|
||||
'\u042f',
|
||||
'\u0430',
|
||||
'\u0431',
|
||||
'\u0432',
|
||||
'\u0433',
|
||||
'\u0434',
|
||||
'\u0435',
|
||||
'\u0436',
|
||||
'\u0437',
|
||||
'\u0438',
|
||||
'\u0439',
|
||||
'\u043a',
|
||||
'\u043b',
|
||||
'\u043c',
|
||||
'\u043d',
|
||||
'\u043e',
|
||||
'\u043f',
|
||||
'\u2591',
|
||||
'\u2592',
|
||||
'\u2593',
|
||||
'\u2502',
|
||||
'\u2524',
|
||||
'\u2561',
|
||||
'\u2562',
|
||||
'\u2556',
|
||||
'\u2555',
|
||||
'\u2563',
|
||||
'\u2551',
|
||||
'\u2557',
|
||||
'\u255d',
|
||||
'\u255c',
|
||||
'\u255b',
|
||||
'\u2510',
|
||||
'\u2514',
|
||||
'\u2534',
|
||||
'\u252c',
|
||||
'\u251c',
|
||||
'\u2500',
|
||||
'\u253c',
|
||||
'\u255e',
|
||||
'\u255f',
|
||||
'\u255a',
|
||||
'\u2554',
|
||||
'\u2569',
|
||||
'\u2566',
|
||||
'\u2560',
|
||||
'\u2550',
|
||||
'\u256c',
|
||||
'\u2567',
|
||||
'\u2568',
|
||||
'\u2564',
|
||||
'\u2565',
|
||||
'\u2559',
|
||||
'\u2558',
|
||||
'\u2552',
|
||||
'\u2553',
|
||||
'\u256b',
|
||||
'\u256a',
|
||||
'\u2518',
|
||||
'\u250c',
|
||||
'\u2588',
|
||||
'\u2584',
|
||||
'\u258c',
|
||||
'\u2590',
|
||||
'\u2580',
|
||||
'\u0440',
|
||||
'\u0441',
|
||||
'\u0442',
|
||||
'\u0443',
|
||||
'\u0444',
|
||||
'\u0445',
|
||||
'\u0446',
|
||||
'\u0447',
|
||||
'\u0448',
|
||||
'\u0449',
|
||||
'\u044a',
|
||||
'\u044b',
|
||||
'\u044c',
|
||||
'\u044d',
|
||||
'\u044e',
|
||||
'\u044f',
|
||||
'\u0401',
|
||||
'\u0451',
|
||||
'\u0404',
|
||||
'\u0454',
|
||||
'\u0407',
|
||||
'\u0457',
|
||||
'\u040e',
|
||||
'\u045e',
|
||||
'\u00b0',
|
||||
'\u2219',
|
||||
'\u00b7',
|
||||
'\u221a',
|
||||
'\u2116',
|
||||
'\u00a4',
|
||||
'\u25a0',
|
||||
'\u00a0'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"866",
|
||||
"cp866",
|
||||
"csibm866",
|
||||
"ibm866"
|
||||
};
|
||||
|
||||
private static final String NAME = "ibm866";
|
||||
|
||||
static final Encoding INSTANCE = new Ibm866();
|
||||
|
||||
private Ibm866() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,57 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.ByteBuffer;
|
||||
import java.nio.CharBuffer;
|
||||
import java.nio.charset.CoderResult;
|
||||
|
||||
public class InfallibleSingleByteDecoder extends Decoder {
|
||||
|
||||
protected final char[] upperHalf;
|
||||
|
||||
protected InfallibleSingleByteDecoder(Encoding cs, char[] upperHalf) {
|
||||
super(cs, 1.0f, 1.0f);
|
||||
this.upperHalf = upperHalf;
|
||||
}
|
||||
|
||||
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
|
||||
// TODO figure out if it's worthwhile to optimize the case where both
|
||||
// buffers are array-backed.
|
||||
for (;;) {
|
||||
if (!in.hasRemaining()) {
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
int b = (int) in.get(); // Java bytes are signed: 0x80..0xFF arrive as -128..-1.
|
||||
if (b >= 0) {
|
||||
out.put((char) b);
|
||||
} else {
|
||||
out.put(this.upperHalf[b + 128]);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
}
|
|
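The table lookup in `decodeLoop` above can be tried in isolation. The following is a minimal sketch, not part of the commit: the class `SingleByteLookupSketch` and its `decode` helper are hypothetical names, and the table is an identity mapping with a single entry borrowed from the `Iso15` table in this commit (byte 0xA4 maps to U+20AC), rather than a full generated table.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

public class SingleByteLookupSketch {

    // Hypothetical helper, not part of nu.validator.encoding: decodes all
    // remaining bytes of `in` with the same rule decodeLoop applies.
    public static String decode(ByteBuffer in, char[] upperHalf) {
        CharBuffer out = CharBuffer.allocate(in.remaining());
        while (in.hasRemaining()) {
            int b = in.get(); // signed: 0x80..0xFF arrive as -128..-1
            if (b >= 0) {
                out.put((char) b); // ASCII range maps to itself
            } else {
                out.put(upperHalf[b + 128]); // -128..-1 becomes index 0..127
            }
        }
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed table: identity for U+0080..U+00FF, plus one entry taken
        // from the Iso15 table (byte 0xA4 decodes to the euro sign).
        char[] upperHalf = new char[128];
        for (int i = 0; i < 128; i++) {
            upperHalf[i] = (char) (0x80 + i);
        }
        upperHalf[0xA4 - 0x80] = '\u20ac';

        byte[] input = { 0x41, (byte) 0xA4, (byte) 0xE9 };
        String decoded = decode(ByteBuffer.wrap(input), upperHalf);
        System.out.println(decoded.equals("A\u20ac\u00e9"));
    }
}
```

Because every one of the 128 upper-half slots holds a char, the lookup never needs an error path, which appears to be why the generated classes with complete tables return this decoder from `newDecoder()`.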
@@ -0,0 +1,187 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso10 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u0104',
|
||||
'\u0112',
|
||||
'\u0122',
|
||||
'\u012a',
|
||||
'\u0128',
|
||||
'\u0136',
|
||||
'\u00a7',
|
||||
'\u013b',
|
||||
'\u0110',
|
||||
'\u0160',
|
||||
'\u0166',
|
||||
'\u017d',
|
||||
'\u00ad',
|
||||
'\u016a',
|
||||
'\u014a',
|
||||
'\u00b0',
|
||||
'\u0105',
|
||||
'\u0113',
|
||||
'\u0123',
|
||||
'\u012b',
|
||||
'\u0129',
|
||||
'\u0137',
|
||||
'\u00b7',
|
||||
'\u013c',
|
||||
'\u0111',
|
||||
'\u0161',
|
||||
'\u0167',
|
||||
'\u017e',
|
||||
'\u2015',
|
||||
'\u016b',
|
||||
'\u014b',
|
||||
'\u0100',
|
||||
'\u00c1',
|
||||
'\u00c2',
|
||||
'\u00c3',
|
||||
'\u00c4',
|
||||
'\u00c5',
|
||||
'\u00c6',
|
||||
'\u012e',
|
||||
'\u010c',
|
||||
'\u00c9',
|
||||
'\u0118',
|
||||
'\u00cb',
|
||||
'\u0116',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u00cf',
|
||||
'\u00d0',
|
||||
'\u0145',
|
||||
'\u014c',
|
||||
'\u00d3',
|
||||
'\u00d4',
|
||||
'\u00d5',
|
||||
'\u00d6',
|
||||
'\u0168',
|
||||
'\u00d8',
|
||||
'\u0172',
|
||||
'\u00da',
|
||||
'\u00db',
|
||||
'\u00dc',
|
||||
'\u00dd',
|
||||
'\u00de',
|
||||
'\u00df',
|
||||
'\u0101',
|
||||
'\u00e1',
|
||||
'\u00e2',
|
||||
'\u00e3',
|
||||
'\u00e4',
|
||||
'\u00e5',
|
||||
'\u00e6',
|
||||
'\u012f',
|
||||
'\u010d',
|
||||
'\u00e9',
|
||||
'\u0119',
|
||||
'\u00eb',
|
||||
'\u0117',
|
||||
'\u00ed',
|
||||
'\u00ee',
|
||||
'\u00ef',
|
||||
'\u00f0',
|
||||
'\u0146',
|
||||
'\u014d',
|
||||
'\u00f3',
|
||||
'\u00f4',
|
||||
'\u00f5',
|
||||
'\u00f6',
|
||||
'\u0169',
|
||||
'\u00f8',
|
||||
'\u0173',
|
||||
'\u00fa',
|
||||
'\u00fb',
|
||||
'\u00fc',
|
||||
'\u00fd',
|
||||
'\u00fe',
|
||||
'\u0138'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csisolatin6",
|
||||
"iso-8859-10",
|
||||
"iso-ir-157",
|
||||
"iso8859-10",
|
||||
"iso885910",
|
||||
"l6",
|
||||
"latin6"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-10";
|
||||
|
||||
static final Encoding INSTANCE = new Iso10();
|
||||
|
||||
private Iso10() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,183 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso13 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u201d',
|
||||
'\u00a2',
|
||||
'\u00a3',
|
||||
'\u00a4',
|
||||
'\u201e',
|
||||
'\u00a6',
|
||||
'\u00a7',
|
||||
'\u00d8',
|
||||
'\u00a9',
|
||||
'\u0156',
|
||||
'\u00ab',
|
||||
'\u00ac',
|
||||
'\u00ad',
|
||||
'\u00ae',
|
||||
'\u00c6',
|
||||
'\u00b0',
|
||||
'\u00b1',
|
||||
'\u00b2',
|
||||
'\u00b3',
|
||||
'\u201c',
|
||||
'\u00b5',
|
||||
'\u00b6',
|
||||
'\u00b7',
|
||||
'\u00f8',
|
||||
'\u00b9',
|
||||
'\u0157',
|
||||
'\u00bb',
|
||||
'\u00bc',
|
||||
'\u00bd',
|
||||
'\u00be',
|
||||
'\u00e6',
|
||||
'\u0104',
|
||||
'\u012e',
|
||||
'\u0100',
|
||||
'\u0106',
|
||||
'\u00c4',
|
||||
'\u00c5',
|
||||
'\u0118',
|
||||
'\u0112',
|
||||
'\u010c',
|
||||
'\u00c9',
|
||||
'\u0179',
|
||||
'\u0116',
|
||||
'\u0122',
|
||||
'\u0136',
|
||||
'\u012a',
|
||||
'\u013b',
|
||||
'\u0160',
|
||||
'\u0143',
|
||||
'\u0145',
|
||||
'\u00d3',
|
||||
'\u014c',
|
||||
'\u00d5',
|
||||
'\u00d6',
|
||||
'\u00d7',
|
||||
'\u0172',
|
||||
'\u0141',
|
||||
'\u015a',
|
||||
'\u016a',
|
||||
'\u00dc',
|
||||
'\u017b',
|
||||
'\u017d',
|
||||
'\u00df',
|
||||
'\u0105',
|
||||
'\u012f',
|
||||
'\u0101',
|
||||
'\u0107',
|
||||
'\u00e4',
|
||||
'\u00e5',
|
||||
'\u0119',
|
||||
'\u0113',
|
||||
'\u010d',
|
||||
'\u00e9',
|
||||
'\u017a',
|
||||
'\u0117',
|
||||
'\u0123',
|
||||
'\u0137',
|
||||
'\u012b',
|
||||
'\u013c',
|
||||
'\u0161',
|
||||
'\u0144',
|
||||
'\u0146',
|
||||
'\u00f3',
|
||||
'\u014d',
|
||||
'\u00f5',
|
||||
'\u00f6',
|
||||
'\u00f7',
|
||||
'\u0173',
|
||||
'\u0142',
|
||||
'\u015b',
|
||||
'\u016b',
|
||||
'\u00fc',
|
||||
'\u017c',
|
||||
'\u017e',
|
||||
'\u2019'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"iso-8859-13",
|
||||
"iso8859-13",
|
||||
"iso885913"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-13";
|
||||
|
||||
static final Encoding INSTANCE = new Iso13();
|
||||
|
||||
private Iso13() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,183 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso14 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u1e02',
|
||||
'\u1e03',
|
||||
'\u00a3',
|
||||
'\u010a',
|
||||
'\u010b',
|
||||
'\u1e0a',
|
||||
'\u00a7',
|
||||
'\u1e80',
|
||||
'\u00a9',
|
||||
'\u1e82',
|
||||
'\u1e0b',
|
||||
'\u1ef2',
|
||||
'\u00ad',
|
||||
'\u00ae',
|
||||
'\u0178',
|
||||
'\u1e1e',
|
||||
'\u1e1f',
|
||||
'\u0120',
|
||||
'\u0121',
|
||||
'\u1e40',
|
||||
'\u1e41',
|
||||
'\u00b6',
|
||||
'\u1e56',
|
||||
'\u1e81',
|
||||
'\u1e57',
|
||||
'\u1e83',
|
||||
'\u1e60',
|
||||
'\u1ef3',
|
||||
'\u1e84',
|
||||
'\u1e85',
|
||||
'\u1e61',
|
||||
'\u00c0',
|
||||
'\u00c1',
|
||||
'\u00c2',
|
||||
'\u00c3',
|
||||
'\u00c4',
|
||||
'\u00c5',
|
||||
'\u00c6',
|
||||
'\u00c7',
|
||||
'\u00c8',
|
||||
'\u00c9',
|
||||
'\u00ca',
|
||||
'\u00cb',
|
||||
'\u00cc',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u00cf',
|
||||
'\u0174',
|
||||
'\u00d1',
|
||||
'\u00d2',
|
||||
'\u00d3',
|
||||
'\u00d4',
|
||||
'\u00d5',
|
||||
'\u00d6',
|
||||
'\u1e6a',
|
||||
'\u00d8',
|
||||
'\u00d9',
|
||||
'\u00da',
|
||||
'\u00db',
|
||||
'\u00dc',
|
||||
'\u00dd',
|
||||
'\u0176',
|
||||
'\u00df',
|
||||
'\u00e0',
|
||||
'\u00e1',
|
||||
'\u00e2',
|
||||
'\u00e3',
|
||||
'\u00e4',
|
||||
'\u00e5',
|
||||
'\u00e6',
|
||||
'\u00e7',
|
||||
'\u00e8',
|
||||
'\u00e9',
|
||||
'\u00ea',
|
||||
'\u00eb',
|
||||
'\u00ec',
|
||||
'\u00ed',
|
||||
'\u00ee',
|
||||
'\u00ef',
|
||||
'\u0175',
|
||||
'\u00f1',
|
||||
'\u00f2',
|
||||
'\u00f3',
|
||||
'\u00f4',
|
||||
'\u00f5',
|
||||
'\u00f6',
|
||||
'\u1e6b',
|
||||
'\u00f8',
|
||||
'\u00f9',
|
||||
'\u00fa',
|
||||
'\u00fb',
|
||||
'\u00fc',
|
||||
'\u00fd',
|
||||
'\u0177',
|
||||
'\u00ff'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"iso-8859-14",
|
||||
"iso8859-14",
|
||||
"iso885914"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-14";
|
||||
|
||||
static final Encoding INSTANCE = new Iso14();
|
||||
|
||||
private Iso14() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,186 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso15 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u00a1',
|
||||
'\u00a2',
|
||||
'\u00a3',
|
||||
'\u20ac',
|
||||
'\u00a5',
|
||||
'\u0160',
|
||||
'\u00a7',
|
||||
'\u0161',
|
||||
'\u00a9',
|
||||
'\u00aa',
|
||||
'\u00ab',
|
||||
'\u00ac',
|
||||
'\u00ad',
|
||||
'\u00ae',
|
||||
'\u00af',
|
||||
'\u00b0',
|
||||
'\u00b1',
|
||||
'\u00b2',
|
||||
'\u00b3',
|
||||
'\u017d',
|
||||
'\u00b5',
|
||||
'\u00b6',
|
||||
'\u00b7',
|
||||
'\u017e',
|
||||
'\u00b9',
|
||||
'\u00ba',
|
||||
'\u00bb',
|
||||
'\u0152',
|
||||
'\u0153',
|
||||
'\u0178',
|
||||
'\u00bf',
|
||||
'\u00c0',
|
||||
'\u00c1',
|
||||
'\u00c2',
|
||||
'\u00c3',
|
||||
'\u00c4',
|
||||
'\u00c5',
|
||||
'\u00c6',
|
||||
'\u00c7',
|
||||
'\u00c8',
|
||||
'\u00c9',
|
||||
'\u00ca',
|
||||
'\u00cb',
|
||||
'\u00cc',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u00cf',
|
||||
'\u00d0',
|
||||
'\u00d1',
|
||||
'\u00d2',
|
||||
'\u00d3',
|
||||
'\u00d4',
|
||||
'\u00d5',
|
||||
'\u00d6',
|
||||
'\u00d7',
|
||||
'\u00d8',
|
||||
'\u00d9',
|
||||
'\u00da',
|
||||
'\u00db',
|
||||
'\u00dc',
|
||||
'\u00dd',
|
||||
'\u00de',
|
||||
'\u00df',
|
||||
'\u00e0',
|
||||
'\u00e1',
|
||||
'\u00e2',
|
||||
'\u00e3',
|
||||
'\u00e4',
|
||||
'\u00e5',
|
||||
'\u00e6',
|
||||
'\u00e7',
|
||||
'\u00e8',
|
||||
'\u00e9',
|
||||
'\u00ea',
|
||||
'\u00eb',
|
||||
'\u00ec',
|
||||
'\u00ed',
|
||||
'\u00ee',
|
||||
'\u00ef',
|
||||
'\u00f0',
|
||||
'\u00f1',
|
||||
'\u00f2',
|
||||
'\u00f3',
|
||||
'\u00f4',
|
||||
'\u00f5',
|
||||
'\u00f6',
|
||||
'\u00f7',
|
||||
'\u00f8',
|
||||
'\u00f9',
|
||||
'\u00fa',
|
||||
'\u00fb',
|
||||
'\u00fc',
|
||||
'\u00fd',
|
||||
'\u00fe',
|
||||
'\u00ff'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csisolatin9",
|
||||
"iso-8859-15",
|
||||
"iso8859-15",
|
||||
"iso885915",
|
||||
"iso_8859-15",
|
||||
"l9"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-15";
|
||||
|
||||
static final Encoding INSTANCE = new Iso15();
|
||||
|
||||
private Iso15() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,181 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso16 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u0104',
|
||||
'\u0105',
|
||||
'\u0141',
|
||||
'\u20ac',
|
||||
'\u201e',
|
||||
'\u0160',
|
||||
'\u00a7',
|
||||
'\u0161',
|
||||
'\u00a9',
|
||||
'\u0218',
|
||||
'\u00ab',
|
||||
'\u0179',
|
||||
'\u00ad',
|
||||
'\u017a',
|
||||
'\u017b',
|
||||
'\u00b0',
|
||||
'\u00b1',
|
||||
'\u010c',
|
||||
'\u0142',
|
||||
'\u017d',
|
||||
'\u201d',
|
||||
'\u00b6',
|
||||
'\u00b7',
|
||||
'\u017e',
|
||||
'\u010d',
|
||||
'\u0219',
|
||||
'\u00bb',
|
||||
'\u0152',
|
||||
'\u0153',
|
||||
'\u0178',
|
||||
'\u017c',
|
||||
'\u00c0',
|
||||
'\u00c1',
|
||||
'\u00c2',
|
||||
'\u0102',
|
||||
'\u00c4',
|
||||
'\u0106',
|
||||
'\u00c6',
|
||||
'\u00c7',
|
||||
'\u00c8',
|
||||
'\u00c9',
|
||||
'\u00ca',
|
||||
'\u00cb',
|
||||
'\u00cc',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u00cf',
|
||||
'\u0110',
|
||||
'\u0143',
|
||||
'\u00d2',
|
||||
'\u00d3',
|
||||
'\u00d4',
|
||||
'\u0150',
|
||||
'\u00d6',
|
||||
'\u015a',
|
||||
'\u0170',
|
||||
'\u00d9',
|
||||
'\u00da',
|
||||
'\u00db',
|
||||
'\u00dc',
|
||||
'\u0118',
|
||||
'\u021a',
|
||||
'\u00df',
|
||||
'\u00e0',
|
||||
'\u00e1',
|
||||
'\u00e2',
|
||||
'\u0103',
|
||||
'\u00e4',
|
||||
'\u0107',
|
||||
'\u00e6',
|
||||
'\u00e7',
|
||||
'\u00e8',
|
||||
'\u00e9',
|
||||
'\u00ea',
|
||||
'\u00eb',
|
||||
'\u00ec',
|
||||
'\u00ed',
|
||||
'\u00ee',
|
||||
'\u00ef',
|
||||
'\u0111',
|
||||
'\u0144',
|
||||
'\u00f2',
|
||||
'\u00f3',
|
||||
'\u00f4',
|
||||
'\u0151',
|
||||
'\u00f6',
|
||||
'\u015b',
|
||||
'\u0171',
|
||||
'\u00f9',
|
||||
'\u00fa',
|
||||
'\u00fb',
|
||||
'\u00fc',
|
||||
'\u0119',
|
||||
'\u021b',
|
||||
'\u00ff'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"iso-8859-16"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-16";
|
||||
|
||||
static final Encoding INSTANCE = new Iso16();
|
||||
|
||||
private Iso16() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,189 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso2 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u0104',
|
||||
'\u02d8',
|
||||
'\u0141',
|
||||
'\u00a4',
|
||||
'\u013d',
|
||||
'\u015a',
|
||||
'\u00a7',
|
||||
'\u00a8',
|
||||
'\u0160',
|
||||
'\u015e',
|
||||
'\u0164',
|
||||
'\u0179',
|
||||
'\u00ad',
|
||||
'\u017d',
|
||||
'\u017b',
|
||||
'\u00b0',
|
||||
'\u0105',
|
||||
'\u02db',
|
||||
'\u0142',
|
||||
'\u00b4',
|
||||
'\u013e',
|
||||
'\u015b',
|
||||
'\u02c7',
|
||||
'\u00b8',
|
||||
'\u0161',
|
||||
'\u015f',
|
||||
'\u0165',
|
||||
'\u017a',
|
||||
'\u02dd',
|
||||
'\u017e',
|
||||
'\u017c',
|
||||
'\u0154',
|
||||
'\u00c1',
|
||||
'\u00c2',
|
||||
'\u0102',
|
||||
'\u00c4',
|
||||
'\u0139',
|
||||
'\u0106',
|
||||
'\u00c7',
|
||||
'\u010c',
|
||||
'\u00c9',
|
||||
'\u0118',
|
||||
'\u00cb',
|
||||
'\u011a',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u010e',
|
||||
'\u0110',
|
||||
'\u0143',
|
||||
'\u0147',
|
||||
'\u00d3',
|
||||
'\u00d4',
|
||||
'\u0150',
|
||||
'\u00d6',
|
||||
'\u00d7',
|
||||
'\u0158',
|
||||
'\u016e',
|
||||
'\u00da',
|
||||
'\u0170',
|
||||
'\u00dc',
|
||||
'\u00dd',
|
||||
'\u0162',
|
||||
'\u00df',
|
||||
'\u0155',
|
||||
'\u00e1',
|
||||
'\u00e2',
|
||||
'\u0103',
|
||||
'\u00e4',
|
||||
'\u013a',
|
||||
'\u0107',
|
||||
'\u00e7',
|
||||
'\u010d',
|
||||
'\u00e9',
|
||||
'\u0119',
|
||||
'\u00eb',
|
||||
'\u011b',
|
||||
'\u00ed',
|
||||
'\u00ee',
|
||||
'\u010f',
|
||||
'\u0111',
|
||||
'\u0144',
|
||||
'\u0148',
|
||||
'\u00f3',
|
||||
'\u00f4',
|
||||
'\u0151',
|
||||
'\u00f6',
|
||||
'\u00f7',
|
||||
'\u0159',
|
||||
'\u016f',
|
||||
'\u00fa',
|
||||
'\u0171',
|
||||
'\u00fc',
|
||||
'\u00fd',
|
||||
'\u0163',
|
||||
'\u02d9'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csisolatin2",
|
||||
"iso-8859-2",
|
||||
"iso-ir-101",
|
||||
"iso8859-2",
|
||||
"iso88592",
|
||||
"iso_8859-2",
|
||||
"iso_8859-2:1987",
|
||||
"l2",
|
||||
"latin2"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-2";
|
||||
|
||||
static final Encoding INSTANCE = new Iso2();
|
||||
|
||||
private Iso2() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,56 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Iso2022Jp extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csiso2022jp",
|
||||
"iso-2022-jp"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-2022-jp";
|
||||
|
||||
static final Iso2022Jp INSTANCE = new Iso2022Jp();
|
||||
|
||||
private Iso2022Jp() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@@ -0,0 +1,189 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso3 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u0126',
|
||||
'\u02d8',
|
||||
'\u00a3',
|
||||
'\u00a4',
|
||||
'\ufffd',
|
||||
'\u0124',
|
||||
'\u00a7',
|
||||
'\u00a8',
|
||||
'\u0130',
|
||||
'\u015e',
|
||||
'\u011e',
|
||||
'\u0134',
|
||||
'\u00ad',
|
||||
'\ufffd',
|
||||
'\u017b',
|
||||
'\u00b0',
|
||||
'\u0127',
|
||||
'\u00b2',
|
||||
'\u00b3',
|
||||
'\u00b4',
|
||||
'\u00b5',
|
||||
'\u0125',
|
||||
'\u00b7',
|
||||
'\u00b8',
|
||||
'\u0131',
|
||||
'\u015f',
|
||||
'\u011f',
|
||||
'\u0135',
|
||||
'\u00bd',
|
||||
'\ufffd',
|
||||
'\u017c',
|
||||
'\u00c0',
|
||||
'\u00c1',
|
||||
'\u00c2',
|
||||
'\ufffd',
|
||||
'\u00c4',
|
||||
'\u010a',
|
||||
'\u0108',
|
||||
'\u00c7',
|
||||
'\u00c8',
|
||||
'\u00c9',
|
||||
'\u00ca',
|
||||
'\u00cb',
|
||||
'\u00cc',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u00cf',
|
||||
'\ufffd',
|
||||
'\u00d1',
|
||||
'\u00d2',
|
||||
'\u00d3',
|
||||
'\u00d4',
|
||||
'\u0120',
|
||||
'\u00d6',
|
||||
'\u00d7',
|
||||
'\u011c',
|
||||
'\u00d9',
|
||||
'\u00da',
|
||||
'\u00db',
|
||||
'\u00dc',
|
||||
'\u016c',
|
||||
'\u015c',
|
||||
'\u00df',
|
||||
'\u00e0',
|
||||
'\u00e1',
|
||||
'\u00e2',
|
||||
'\ufffd',
|
||||
'\u00e4',
|
||||
'\u010b',
|
||||
'\u0109',
|
||||
'\u00e7',
|
||||
'\u00e8',
|
||||
'\u00e9',
|
||||
'\u00ea',
|
||||
'\u00eb',
|
||||
'\u00ec',
|
||||
'\u00ed',
|
||||
'\u00ee',
|
||||
'\u00ef',
|
||||
'\ufffd',
|
||||
'\u00f1',
|
||||
'\u00f2',
|
||||
'\u00f3',
|
||||
'\u00f4',
|
||||
'\u0121',
|
||||
'\u00f6',
|
||||
'\u00f7',
|
||||
'\u011d',
|
||||
'\u00f9',
|
||||
'\u00fa',
|
||||
'\u00fb',
|
||||
'\u00fc',
|
||||
'\u016d',
|
||||
'\u015d',
|
||||
'\u02d9'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csisolatin3",
|
||||
"iso-8859-3",
|
||||
"iso-ir-109",
|
||||
"iso8859-3",
|
||||
"iso88593",
|
||||
"iso_8859-3",
|
||||
"iso_8859-3:1988",
|
||||
"l3",
|
||||
"latin3"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-3";
|
||||
|
||||
static final Encoding INSTANCE = new Iso3();
|
||||
|
||||
private Iso3() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new FallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,189 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso4 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u0104',
|
||||
'\u0138',
|
||||
'\u0156',
|
||||
'\u00a4',
|
||||
'\u0128',
|
||||
'\u013b',
|
||||
'\u00a7',
|
||||
'\u00a8',
|
||||
'\u0160',
|
||||
'\u0112',
|
||||
'\u0122',
|
||||
'\u0166',
|
||||
'\u00ad',
|
||||
'\u017d',
|
||||
'\u00af',
|
||||
'\u00b0',
|
||||
'\u0105',
|
||||
'\u02db',
|
||||
'\u0157',
|
||||
'\u00b4',
|
||||
'\u0129',
|
||||
'\u013c',
|
||||
'\u02c7',
|
||||
'\u00b8',
|
||||
'\u0161',
|
||||
'\u0113',
|
||||
'\u0123',
|
||||
'\u0167',
|
||||
'\u014a',
|
||||
'\u017e',
|
||||
'\u014b',
|
||||
'\u0100',
|
||||
'\u00c1',
|
||||
'\u00c2',
|
||||
'\u00c3',
|
||||
'\u00c4',
|
||||
'\u00c5',
|
||||
'\u00c6',
|
||||
'\u012e',
|
||||
'\u010c',
|
||||
'\u00c9',
|
||||
'\u0118',
|
||||
'\u00cb',
|
||||
'\u0116',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u012a',
|
||||
'\u0110',
|
||||
'\u0145',
|
||||
'\u014c',
|
||||
'\u0136',
|
||||
'\u00d4',
|
||||
'\u00d5',
|
||||
'\u00d6',
|
||||
'\u00d7',
|
||||
'\u00d8',
|
||||
'\u0172',
|
||||
'\u00da',
|
||||
'\u00db',
|
||||
'\u00dc',
|
||||
'\u0168',
|
||||
'\u016a',
|
||||
'\u00df',
|
||||
'\u0101',
|
||||
'\u00e1',
|
||||
'\u00e2',
|
||||
'\u00e3',
|
||||
'\u00e4',
|
||||
'\u00e5',
|
||||
'\u00e6',
|
||||
'\u012f',
|
||||
'\u010d',
|
||||
'\u00e9',
|
||||
'\u0119',
|
||||
'\u00eb',
|
||||
'\u0117',
|
||||
'\u00ed',
|
||||
'\u00ee',
|
||||
'\u012b',
|
||||
'\u0111',
|
||||
'\u0146',
|
||||
'\u014d',
|
||||
'\u0137',
|
||||
'\u00f4',
|
||||
'\u00f5',
|
||||
'\u00f6',
|
||||
'\u00f7',
|
||||
'\u00f8',
|
||||
'\u0173',
|
||||
'\u00fa',
|
||||
'\u00fb',
|
||||
'\u00fc',
|
||||
'\u0169',
|
||||
'\u016b',
|
||||
'\u02d9'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csisolatin4",
|
||||
"iso-8859-4",
|
||||
"iso-ir-110",
|
||||
"iso8859-4",
|
||||
"iso88594",
|
||||
"iso_8859-4",
|
||||
"iso_8859-4:1988",
|
||||
"l4",
|
||||
"latin4"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-4";
|
||||
|
||||
static final Encoding INSTANCE = new Iso4();
|
||||
|
||||
private Iso4() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,188 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso5 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u0401',
|
||||
'\u0402',
|
||||
'\u0403',
|
||||
'\u0404',
|
||||
'\u0405',
|
||||
'\u0406',
|
||||
'\u0407',
|
||||
'\u0408',
|
||||
'\u0409',
|
||||
'\u040a',
|
||||
'\u040b',
|
||||
'\u040c',
|
||||
'\u00ad',
|
||||
'\u040e',
|
||||
'\u040f',
|
||||
'\u0410',
|
||||
'\u0411',
|
||||
'\u0412',
|
||||
'\u0413',
|
||||
'\u0414',
|
||||
'\u0415',
|
||||
'\u0416',
|
||||
'\u0417',
|
||||
'\u0418',
|
||||
'\u0419',
|
||||
'\u041a',
|
||||
'\u041b',
|
||||
'\u041c',
|
||||
'\u041d',
|
||||
'\u041e',
|
||||
'\u041f',
|
||||
'\u0420',
|
||||
'\u0421',
|
||||
'\u0422',
|
||||
'\u0423',
|
||||
'\u0424',
|
||||
'\u0425',
|
||||
'\u0426',
|
||||
'\u0427',
|
||||
'\u0428',
|
||||
'\u0429',
|
||||
'\u042a',
|
||||
'\u042b',
|
||||
'\u042c',
|
||||
'\u042d',
|
||||
'\u042e',
|
||||
'\u042f',
|
||||
'\u0430',
|
||||
'\u0431',
|
||||
'\u0432',
|
||||
'\u0433',
|
||||
'\u0434',
|
||||
'\u0435',
|
||||
'\u0436',
|
||||
'\u0437',
|
||||
'\u0438',
|
||||
'\u0439',
|
||||
'\u043a',
|
||||
'\u043b',
|
||||
'\u043c',
|
||||
'\u043d',
|
||||
'\u043e',
|
||||
'\u043f',
|
||||
'\u0440',
|
||||
'\u0441',
|
||||
'\u0442',
|
||||
'\u0443',
|
||||
'\u0444',
|
||||
'\u0445',
|
||||
'\u0446',
|
||||
'\u0447',
|
||||
'\u0448',
|
||||
'\u0449',
|
||||
'\u044a',
|
||||
'\u044b',
|
||||
'\u044c',
|
||||
'\u044d',
|
||||
'\u044e',
|
||||
'\u044f',
|
||||
'\u2116',
|
||||
'\u0451',
|
||||
'\u0452',
|
||||
'\u0453',
|
||||
'\u0454',
|
||||
'\u0455',
|
||||
'\u0456',
|
||||
'\u0457',
|
||||
'\u0458',
|
||||
'\u0459',
|
||||
'\u045a',
|
||||
'\u045b',
|
||||
'\u045c',
|
||||
'\u00a7',
|
||||
'\u045e',
|
||||
'\u045f'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csisolatincyrillic",
|
||||
"cyrillic",
|
||||
"iso-8859-5",
|
||||
"iso-ir-144",
|
||||
"iso8859-5",
|
||||
"iso88595",
|
||||
"iso_8859-5",
|
||||
"iso_8859-5:1988"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-5";
|
||||
|
||||
static final Encoding INSTANCE = new Iso5();
|
||||
|
||||
private Iso5() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,194 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso6 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u00a4',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u060c',
|
||||
'\u00ad',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u061b',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u061f',
|
||||
'\ufffd',
|
||||
'\u0621',
|
||||
'\u0622',
|
||||
'\u0623',
|
||||
'\u0624',
|
||||
'\u0625',
|
||||
'\u0626',
|
||||
'\u0627',
|
||||
'\u0628',
|
||||
'\u0629',
|
||||
'\u062a',
|
||||
'\u062b',
|
||||
'\u062c',
|
||||
'\u062d',
|
||||
'\u062e',
|
||||
'\u062f',
|
||||
'\u0630',
|
||||
'\u0631',
|
||||
'\u0632',
|
||||
'\u0633',
|
||||
'\u0634',
|
||||
'\u0635',
|
||||
'\u0636',
|
||||
'\u0637',
|
||||
'\u0638',
|
||||
'\u0639',
|
||||
'\u063a',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u0640',
|
||||
'\u0641',
|
||||
'\u0642',
|
||||
'\u0643',
|
||||
'\u0644',
|
||||
'\u0645',
|
||||
'\u0646',
|
||||
'\u0647',
|
||||
'\u0648',
|
||||
'\u0649',
|
||||
'\u064a',
|
||||
'\u064b',
|
||||
'\u064c',
|
||||
'\u064d',
|
||||
'\u064e',
|
||||
'\u064f',
|
||||
'\u0650',
|
||||
'\u0651',
|
||||
'\u0652',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"arabic",
|
||||
"asmo-708",
|
||||
"csiso88596e",
|
||||
"csiso88596i",
|
||||
"csisolatinarabic",
|
||||
"ecma-114",
|
||||
"iso-8859-6",
|
||||
"iso-8859-6-e",
|
||||
"iso-8859-6-i",
|
||||
"iso-ir-127",
|
||||
"iso8859-6",
|
||||
"iso88596",
|
||||
"iso_8859-6",
|
||||
"iso_8859-6:1987"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-6";
|
||||
|
||||
static final Encoding INSTANCE = new Iso6();
|
||||
|
||||
private Iso6() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new FallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,192 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso7 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\u2018',
|
||||
'\u2019',
|
||||
'\u00a3',
|
||||
'\u20ac',
|
||||
'\u20af',
|
||||
'\u00a6',
|
||||
'\u00a7',
|
||||
'\u00a8',
|
||||
'\u00a9',
|
||||
'\u037a',
|
||||
'\u00ab',
|
||||
'\u00ac',
|
||||
'\u00ad',
|
||||
'\ufffd',
|
||||
'\u2015',
|
||||
'\u00b0',
|
||||
'\u00b1',
|
||||
'\u00b2',
|
||||
'\u00b3',
|
||||
'\u0384',
|
||||
'\u0385',
|
||||
'\u0386',
|
||||
'\u00b7',
|
||||
'\u0388',
|
||||
'\u0389',
|
||||
'\u038a',
|
||||
'\u00bb',
|
||||
'\u038c',
|
||||
'\u00bd',
|
||||
'\u038e',
|
||||
'\u038f',
|
||||
'\u0390',
|
||||
'\u0391',
|
||||
'\u0392',
|
||||
'\u0393',
|
||||
'\u0394',
|
||||
'\u0395',
|
||||
'\u0396',
|
||||
'\u0397',
|
||||
'\u0398',
|
||||
'\u0399',
|
||||
'\u039a',
|
||||
'\u039b',
|
||||
'\u039c',
|
||||
'\u039d',
|
||||
'\u039e',
|
||||
'\u039f',
|
||||
'\u03a0',
|
||||
'\u03a1',
|
||||
'\ufffd',
|
||||
'\u03a3',
|
||||
'\u03a4',
|
||||
'\u03a5',
|
||||
'\u03a6',
|
||||
'\u03a7',
|
||||
'\u03a8',
|
||||
'\u03a9',
|
||||
'\u03aa',
|
||||
'\u03ab',
|
||||
'\u03ac',
|
||||
'\u03ad',
|
||||
'\u03ae',
|
||||
'\u03af',
|
||||
'\u03b0',
|
||||
'\u03b1',
|
||||
'\u03b2',
|
||||
'\u03b3',
|
||||
'\u03b4',
|
||||
'\u03b5',
|
||||
'\u03b6',
|
||||
'\u03b7',
|
||||
'\u03b8',
|
||||
'\u03b9',
|
||||
'\u03ba',
|
||||
'\u03bb',
|
||||
'\u03bc',
|
||||
'\u03bd',
|
||||
'\u03be',
|
||||
'\u03bf',
|
||||
'\u03c0',
|
||||
'\u03c1',
|
||||
'\u03c2',
|
||||
'\u03c3',
|
||||
'\u03c4',
|
||||
'\u03c5',
|
||||
'\u03c6',
|
||||
'\u03c7',
|
||||
'\u03c8',
|
||||
'\u03c9',
|
||||
'\u03ca',
|
||||
'\u03cb',
|
||||
'\u03cc',
|
||||
'\u03cd',
|
||||
'\u03ce',
|
||||
'\ufffd'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csisolatingreek",
|
||||
"ecma-118",
|
||||
"elot_928",
|
||||
"greek",
|
||||
"greek8",
|
||||
"iso-8859-7",
|
||||
"iso-ir-126",
|
||||
"iso8859-7",
|
||||
"iso88597",
|
||||
"iso_8859-7",
|
||||
"iso_8859-7:1987",
|
||||
"sun_eu_greek"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-7";
|
||||
|
||||
static final Encoding INSTANCE = new Iso7();
|
||||
|
||||
private Iso7() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new FallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,191 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso8 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\ufffd',
|
||||
'\u00a2',
|
||||
'\u00a3',
|
||||
'\u00a4',
|
||||
'\u00a5',
|
||||
'\u00a6',
|
||||
'\u00a7',
|
||||
'\u00a8',
|
||||
'\u00a9',
|
||||
'\u00d7',
|
||||
'\u00ab',
|
||||
'\u00ac',
|
||||
'\u00ad',
|
||||
'\u00ae',
|
||||
'\u00af',
|
||||
'\u00b0',
|
||||
'\u00b1',
|
||||
'\u00b2',
|
||||
'\u00b3',
|
||||
'\u00b4',
|
||||
'\u00b5',
|
||||
'\u00b6',
|
||||
'\u00b7',
|
||||
'\u00b8',
|
||||
'\u00b9',
|
||||
'\u00f7',
|
||||
'\u00bb',
|
||||
'\u00bc',
|
||||
'\u00bd',
|
||||
'\u00be',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u2017',
|
||||
'\u05d0',
|
||||
'\u05d1',
|
||||
'\u05d2',
|
||||
'\u05d3',
|
||||
'\u05d4',
|
||||
'\u05d5',
|
||||
'\u05d6',
|
||||
'\u05d7',
|
||||
'\u05d8',
|
||||
'\u05d9',
|
||||
'\u05da',
|
||||
'\u05db',
|
||||
'\u05dc',
|
||||
'\u05dd',
|
||||
'\u05de',
|
||||
'\u05df',
|
||||
'\u05e0',
|
||||
'\u05e1',
|
||||
'\u05e2',
|
||||
'\u05e3',
|
||||
'\u05e4',
|
||||
'\u05e5',
|
||||
'\u05e6',
|
||||
'\u05e7',
|
||||
'\u05e8',
|
||||
'\u05e9',
|
||||
'\u05ea',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u200e',
|
||||
'\u200f',
|
||||
'\ufffd'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csiso88598e",
|
||||
"csisolatinhebrew",
|
||||
"hebrew",
|
||||
"iso-8859-8",
|
||||
"iso-8859-8-e",
|
||||
"iso-ir-138",
|
||||
"iso8859-8",
|
||||
"iso88598",
|
||||
"iso_8859-8",
|
||||
"iso_8859-8:1988",
|
||||
"visual"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-8";
|
||||
|
||||
static final Encoding INSTANCE = new Iso8();
|
||||
|
||||
private Iso8() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new FallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,183 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Iso8I extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0080',
|
||||
'\u0081',
|
||||
'\u0082',
|
||||
'\u0083',
|
||||
'\u0084',
|
||||
'\u0085',
|
||||
'\u0086',
|
||||
'\u0087',
|
||||
'\u0088',
|
||||
'\u0089',
|
||||
'\u008a',
|
||||
'\u008b',
|
||||
'\u008c',
|
||||
'\u008d',
|
||||
'\u008e',
|
||||
'\u008f',
|
||||
'\u0090',
|
||||
'\u0091',
|
||||
'\u0092',
|
||||
'\u0093',
|
||||
'\u0094',
|
||||
'\u0095',
|
||||
'\u0096',
|
||||
'\u0097',
|
||||
'\u0098',
|
||||
'\u0099',
|
||||
'\u009a',
|
||||
'\u009b',
|
||||
'\u009c',
|
||||
'\u009d',
|
||||
'\u009e',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\ufffd',
|
||||
'\u00a2',
|
||||
'\u00a3',
|
||||
'\u00a4',
|
||||
'\u00a5',
|
||||
'\u00a6',
|
||||
'\u00a7',
|
||||
'\u00a8',
|
||||
'\u00a9',
|
||||
'\u00d7',
|
||||
'\u00ab',
|
||||
'\u00ac',
|
||||
'\u00ad',
|
||||
'\u00ae',
|
||||
'\u00af',
|
||||
'\u00b0',
|
||||
'\u00b1',
|
||||
'\u00b2',
|
||||
'\u00b3',
|
||||
'\u00b4',
|
||||
'\u00b5',
|
||||
'\u00b6',
|
||||
'\u00b7',
|
||||
'\u00b8',
|
||||
'\u00b9',
|
||||
'\u00f7',
|
||||
'\u00bb',
|
||||
'\u00bc',
|
||||
'\u00bd',
|
||||
'\u00be',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u2017',
|
||||
'\u05d0',
|
||||
'\u05d1',
|
||||
'\u05d2',
|
||||
'\u05d3',
|
||||
'\u05d4',
|
||||
'\u05d5',
|
||||
'\u05d6',
|
||||
'\u05d7',
|
||||
'\u05d8',
|
||||
'\u05d9',
|
||||
'\u05da',
|
||||
'\u05db',
|
||||
'\u05dc',
|
||||
'\u05dd',
|
||||
'\u05de',
|
||||
'\u05df',
|
||||
'\u05e0',
|
||||
'\u05e1',
|
||||
'\u05e2',
|
||||
'\u05e3',
|
||||
'\u05e4',
|
||||
'\u05e5',
|
||||
'\u05e6',
|
||||
'\u05e7',
|
||||
'\u05e8',
|
||||
'\u05e9',
|
||||
'\u05ea',
|
||||
'\ufffd',
|
||||
'\ufffd',
|
||||
'\u200e',
|
||||
'\u200f',
|
||||
'\ufffd'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csiso88598i",
|
||||
"iso-8859-8-i",
|
||||
"logical"
|
||||
};
|
||||
|
||||
private static final String NAME = "iso-8859-8-i";
|
||||
|
||||
static final Encoding INSTANCE = new Iso8I();
|
||||
|
||||
private Iso8I() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new FallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,185 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Koi8R extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u2500',
|
||||
'\u2502',
|
||||
'\u250c',
|
||||
'\u2510',
|
||||
'\u2514',
|
||||
'\u2518',
|
||||
'\u251c',
|
||||
'\u2524',
|
||||
'\u252c',
|
||||
'\u2534',
|
||||
'\u253c',
|
||||
'\u2580',
|
||||
'\u2584',
|
||||
'\u2588',
|
||||
'\u258c',
|
||||
'\u2590',
|
||||
'\u2591',
|
||||
'\u2592',
|
||||
'\u2593',
|
||||
'\u2320',
|
||||
'\u25a0',
|
||||
'\u2219',
|
||||
'\u221a',
|
||||
'\u2248',
|
||||
'\u2264',
|
||||
'\u2265',
|
||||
'\u00a0',
|
||||
'\u2321',
|
||||
'\u00b0',
|
||||
'\u00b2',
|
||||
'\u00b7',
|
||||
'\u00f7',
|
||||
'\u2550',
|
||||
'\u2551',
|
||||
'\u2552',
|
||||
'\u0451',
|
||||
'\u2553',
|
||||
'\u2554',
|
||||
'\u2555',
|
||||
'\u2556',
|
||||
'\u2557',
|
||||
'\u2558',
|
||||
'\u2559',
|
||||
'\u255a',
|
||||
'\u255b',
|
||||
'\u255c',
|
||||
'\u255d',
|
||||
'\u255e',
|
||||
'\u255f',
|
||||
'\u2560',
|
||||
'\u2561',
|
||||
'\u0401',
|
||||
'\u2562',
|
||||
'\u2563',
|
||||
'\u2564',
|
||||
'\u2565',
|
||||
'\u2566',
|
||||
'\u2567',
|
||||
'\u2568',
|
||||
'\u2569',
|
||||
'\u256a',
|
||||
'\u256b',
|
||||
'\u256c',
|
||||
'\u00a9',
|
||||
'\u044e',
|
||||
'\u0430',
|
||||
'\u0431',
|
||||
'\u0446',
|
||||
'\u0434',
|
||||
'\u0435',
|
||||
'\u0444',
|
||||
'\u0433',
|
||||
'\u0445',
|
||||
'\u0438',
|
||||
'\u0439',
|
||||
'\u043a',
|
||||
'\u043b',
|
||||
'\u043c',
|
||||
'\u043d',
|
||||
'\u043e',
|
||||
'\u043f',
|
||||
'\u044f',
|
||||
'\u0440',
|
||||
'\u0441',
|
||||
'\u0442',
|
||||
'\u0443',
|
||||
'\u0436',
|
||||
'\u0432',
|
||||
'\u044c',
|
||||
'\u044b',
|
||||
'\u0437',
|
||||
'\u0448',
|
||||
'\u044d',
|
||||
'\u0449',
|
||||
'\u0447',
|
||||
'\u044a',
|
||||
'\u042e',
|
||||
'\u0410',
|
||||
'\u0411',
|
||||
'\u0426',
|
||||
'\u0414',
|
||||
'\u0415',
|
||||
'\u0424',
|
||||
'\u0413',
|
||||
'\u0425',
|
||||
'\u0418',
|
||||
'\u0419',
|
||||
'\u041a',
|
||||
'\u041b',
|
||||
'\u041c',
|
||||
'\u041d',
|
||||
'\u041e',
|
||||
'\u041f',
|
||||
'\u042f',
|
||||
'\u0420',
|
||||
'\u0421',
|
||||
'\u0422',
|
||||
'\u0423',
|
||||
'\u0416',
|
||||
'\u0412',
|
||||
'\u042c',
|
||||
'\u042b',
|
||||
'\u0417',
|
||||
'\u0428',
|
||||
'\u042d',
|
||||
'\u0429',
|
||||
'\u0427',
|
||||
'\u042a'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"cskoi8r",
|
||||
"koi",
|
||||
"koi8",
|
||||
"koi8-r",
|
||||
"koi8_r"
|
||||
};
|
||||
|
||||
private static final String NAME = "koi8-r";
|
||||
|
||||
static final Encoding INSTANCE = new Koi8R();
|
||||
|
||||
private Koi8R() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@@ -0,0 +1,182 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Koi8U extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u2500',
|
||||
'\u2502',
|
||||
'\u250c',
|
||||
'\u2510',
|
||||
'\u2514',
|
||||
'\u2518',
|
||||
'\u251c',
|
||||
'\u2524',
|
||||
'\u252c',
|
||||
'\u2534',
|
||||
'\u253c',
|
||||
'\u2580',
|
||||
'\u2584',
|
||||
'\u2588',
|
||||
'\u258c',
|
||||
'\u2590',
|
||||
'\u2591',
|
||||
'\u2592',
|
||||
'\u2593',
|
||||
'\u2320',
|
||||
'\u25a0',
|
||||
'\u2219',
|
||||
'\u221a',
|
||||
'\u2248',
|
||||
'\u2264',
|
||||
'\u2265',
|
||||
'\u00a0',
|
||||
'\u2321',
|
||||
'\u00b0',
|
||||
'\u00b2',
|
||||
'\u00b7',
|
||||
'\u00f7',
|
||||
'\u2550',
|
||||
'\u2551',
|
||||
'\u2552',
|
||||
'\u0451',
|
||||
'\u0454',
|
||||
'\u2554',
|
||||
'\u0456',
|
||||
'\u0457',
|
||||
'\u2557',
|
||||
'\u2558',
|
||||
'\u2559',
|
||||
'\u255a',
|
||||
'\u255b',
|
||||
'\u0491',
|
||||
'\u045e',
|
||||
'\u255e',
|
||||
'\u255f',
|
||||
'\u2560',
|
||||
'\u2561',
|
||||
'\u0401',
|
||||
'\u0404',
|
||||
'\u2563',
|
||||
'\u0406',
|
||||
'\u0407',
|
||||
'\u2566',
|
||||
'\u2567',
|
||||
'\u2568',
|
||||
'\u2569',
|
||||
'\u256a',
|
||||
'\u0490',
|
||||
'\u040e',
|
||||
'\u00a9',
|
||||
'\u044e',
|
||||
'\u0430',
|
||||
'\u0431',
|
||||
'\u0446',
|
||||
'\u0434',
|
||||
'\u0435',
|
||||
'\u0444',
|
||||
'\u0433',
|
||||
'\u0445',
|
||||
'\u0438',
|
||||
'\u0439',
|
||||
'\u043a',
|
||||
'\u043b',
|
||||
'\u043c',
|
||||
'\u043d',
|
||||
'\u043e',
|
||||
'\u043f',
|
||||
'\u044f',
|
||||
'\u0440',
|
||||
'\u0441',
|
||||
'\u0442',
|
||||
'\u0443',
|
||||
'\u0436',
|
||||
'\u0432',
|
||||
'\u044c',
|
||||
'\u044b',
|
||||
'\u0437',
|
||||
'\u0448',
|
||||
'\u044d',
|
||||
'\u0449',
|
||||
'\u0447',
|
||||
'\u044a',
|
||||
'\u042e',
|
||||
'\u0410',
|
||||
'\u0411',
|
||||
'\u0426',
|
||||
'\u0414',
|
||||
'\u0415',
|
||||
'\u0424',
|
||||
'\u0413',
|
||||
'\u0425',
|
||||
'\u0418',
|
||||
'\u0419',
|
||||
'\u041a',
|
||||
'\u041b',
|
||||
'\u041c',
|
||||
'\u041d',
|
||||
'\u041e',
|
||||
'\u041f',
|
||||
'\u042f',
|
||||
'\u0420',
|
||||
'\u0421',
|
||||
'\u0422',
|
||||
'\u0423',
|
||||
'\u0416',
|
||||
'\u0412',
|
||||
'\u042c',
|
||||
'\u042b',
|
||||
'\u0417',
|
||||
'\u0428',
|
||||
'\u042d',
|
||||
'\u0429',
|
||||
'\u0427',
|
||||
'\u042a'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"koi8-ru",
|
||||
"koi8-u"
|
||||
};
|
||||
|
||||
private static final String NAME = "koi8-u";
|
||||
|
||||
static final Encoding INSTANCE = new Koi8U();
|
||||
|
||||
private Koi8U() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,182 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class MacCyrillic extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u0410',
|
||||
'\u0411',
|
||||
'\u0412',
|
||||
'\u0413',
|
||||
'\u0414',
|
||||
'\u0415',
|
||||
'\u0416',
|
||||
'\u0417',
|
||||
'\u0418',
|
||||
'\u0419',
|
||||
'\u041a',
|
||||
'\u041b',
|
||||
'\u041c',
|
||||
'\u041d',
|
||||
'\u041e',
|
||||
'\u041f',
|
||||
'\u0420',
|
||||
'\u0421',
|
||||
'\u0422',
|
||||
'\u0423',
|
||||
'\u0424',
|
||||
'\u0425',
|
||||
'\u0426',
|
||||
'\u0427',
|
||||
'\u0428',
|
||||
'\u0429',
|
||||
'\u042a',
|
||||
'\u042b',
|
||||
'\u042c',
|
||||
'\u042d',
|
||||
'\u042e',
|
||||
'\u042f',
|
||||
'\u2020',
|
||||
'\u00b0',
|
||||
'\u0490',
|
||||
'\u00a3',
|
||||
'\u00a7',
|
||||
'\u2022',
|
||||
'\u00b6',
|
||||
'\u0406',
|
||||
'\u00ae',
|
||||
'\u00a9',
|
||||
'\u2122',
|
||||
'\u0402',
|
||||
'\u0452',
|
||||
'\u2260',
|
||||
'\u0403',
|
||||
'\u0453',
|
||||
'\u221e',
|
||||
'\u00b1',
|
||||
'\u2264',
|
||||
'\u2265',
|
||||
'\u0456',
|
||||
'\u00b5',
|
||||
'\u0491',
|
||||
'\u0408',
|
||||
'\u0404',
|
||||
'\u0454',
|
||||
'\u0407',
|
||||
'\u0457',
|
||||
'\u0409',
|
||||
'\u0459',
|
||||
'\u040a',
|
||||
'\u045a',
|
||||
'\u0458',
|
||||
'\u0405',
|
||||
'\u00ac',
|
||||
'\u221a',
|
||||
'\u0192',
|
||||
'\u2248',
|
||||
'\u2206',
|
||||
'\u00ab',
|
||||
'\u00bb',
|
||||
'\u2026',
|
||||
'\u00a0',
|
||||
'\u040b',
|
||||
'\u045b',
|
||||
'\u040c',
|
||||
'\u045c',
|
||||
'\u0455',
|
||||
'\u2013',
|
||||
'\u2014',
|
||||
'\u201c',
|
||||
'\u201d',
|
||||
'\u2018',
|
||||
'\u2019',
|
||||
'\u00f7',
|
||||
'\u201e',
|
||||
'\u040e',
|
||||
'\u045e',
|
||||
'\u040f',
|
||||
'\u045f',
|
||||
'\u2116',
|
||||
'\u0401',
|
||||
'\u0451',
|
||||
'\u044f',
|
||||
'\u0430',
|
||||
'\u0431',
|
||||
'\u0432',
|
||||
'\u0433',
|
||||
'\u0434',
|
||||
'\u0435',
|
||||
'\u0436',
|
||||
'\u0437',
|
||||
'\u0438',
|
||||
'\u0439',
|
||||
'\u043a',
|
||||
'\u043b',
|
||||
'\u043c',
|
||||
'\u043d',
|
||||
'\u043e',
|
||||
'\u043f',
|
||||
'\u0440',
|
||||
'\u0441',
|
||||
'\u0442',
|
||||
'\u0443',
|
||||
'\u0444',
|
||||
'\u0445',
|
||||
'\u0446',
|
||||
'\u0447',
|
||||
'\u0448',
|
||||
'\u0449',
|
||||
'\u044a',
|
||||
'\u044b',
|
||||
'\u044c',
|
||||
'\u044d',
|
||||
'\u044e',
|
||||
'\u20ac'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"x-mac-cyrillic",
|
||||
"x-mac-ukrainian"
|
||||
};
|
||||
|
||||
private static final String NAME = "x-mac-cyrillic";
|
||||
|
||||
static final Encoding INSTANCE = new MacCyrillic();
|
||||
|
||||
private MacCyrillic() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,184 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Macintosh extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
|
||||
'\u00c4',
|
||||
'\u00c5',
|
||||
'\u00c7',
|
||||
'\u00c9',
|
||||
'\u00d1',
|
||||
'\u00d6',
|
||||
'\u00dc',
|
||||
'\u00e1',
|
||||
'\u00e0',
|
||||
'\u00e2',
|
||||
'\u00e4',
|
||||
'\u00e3',
|
||||
'\u00e5',
|
||||
'\u00e7',
|
||||
'\u00e9',
|
||||
'\u00e8',
|
||||
'\u00ea',
|
||||
'\u00eb',
|
||||
'\u00ed',
|
||||
'\u00ec',
|
||||
'\u00ee',
|
||||
'\u00ef',
|
||||
'\u00f1',
|
||||
'\u00f3',
|
||||
'\u00f2',
|
||||
'\u00f4',
|
||||
'\u00f6',
|
||||
'\u00f5',
|
||||
'\u00fa',
|
||||
'\u00f9',
|
||||
'\u00fb',
|
||||
'\u00fc',
|
||||
'\u2020',
|
||||
'\u00b0',
|
||||
'\u00a2',
|
||||
'\u00a3',
|
||||
'\u00a7',
|
||||
'\u2022',
|
||||
'\u00b6',
|
||||
'\u00df',
|
||||
'\u00ae',
|
||||
'\u00a9',
|
||||
'\u2122',
|
||||
'\u00b4',
|
||||
'\u00a8',
|
||||
'\u2260',
|
||||
'\u00c6',
|
||||
'\u00d8',
|
||||
'\u221e',
|
||||
'\u00b1',
|
||||
'\u2264',
|
||||
'\u2265',
|
||||
'\u00a5',
|
||||
'\u00b5',
|
||||
'\u2202',
|
||||
'\u2211',
|
||||
'\u220f',
|
||||
'\u03c0',
|
||||
'\u222b',
|
||||
'\u00aa',
|
||||
'\u00ba',
|
||||
'\u03a9',
|
||||
'\u00e6',
|
||||
'\u00f8',
|
||||
'\u00bf',
|
||||
'\u00a1',
|
||||
'\u00ac',
|
||||
'\u221a',
|
||||
'\u0192',
|
||||
'\u2248',
|
||||
'\u2206',
|
||||
'\u00ab',
|
||||
'\u00bb',
|
||||
'\u2026',
|
||||
'\u00a0',
|
||||
'\u00c0',
|
||||
'\u00c3',
|
||||
'\u00d5',
|
||||
'\u0152',
|
||||
'\u0153',
|
||||
'\u2013',
|
||||
'\u2014',
|
||||
'\u201c',
|
||||
'\u201d',
|
||||
'\u2018',
|
||||
'\u2019',
|
||||
'\u00f7',
|
||||
'\u25ca',
|
||||
'\u00ff',
|
||||
'\u0178',
|
||||
'\u2044',
|
||||
'\u20ac',
|
||||
'\u2039',
|
||||
'\u203a',
|
||||
'\ufb01',
|
||||
'\ufb02',
|
||||
'\u2021',
|
||||
'\u00b7',
|
||||
'\u201a',
|
||||
'\u201e',
|
||||
'\u2030',
|
||||
'\u00c2',
|
||||
'\u00ca',
|
||||
'\u00c1',
|
||||
'\u00cb',
|
||||
'\u00c8',
|
||||
'\u00cd',
|
||||
'\u00ce',
|
||||
'\u00cf',
|
||||
'\u00cc',
|
||||
'\u00d3',
|
||||
'\u00d4',
|
||||
'\uf8ff',
|
||||
'\u00d2',
|
||||
'\u00da',
|
||||
'\u00db',
|
||||
'\u00d9',
|
||||
'\u0131',
|
||||
'\u02c6',
|
||||
'\u02dc',
|
||||
'\u00af',
|
||||
'\u02d8',
|
||||
'\u02d9',
|
||||
'\u02da',
|
||||
'\u00b8',
|
||||
'\u02dd',
|
||||
'\u02db',
|
||||
'\u02c7'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csmacintosh",
|
||||
"mac",
|
||||
"macintosh",
|
||||
"x-mac-roman"
|
||||
};
|
||||
|
||||
private static final String NAME = "macintosh";
|
||||
|
||||
static final Encoding INSTANCE = new Macintosh();
|
||||
|
||||
private Macintosh() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new InfallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
|
|
@ -0,0 +1,59 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Replacement extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csiso2022kr",
|
||||
"hz-gb-2312",
|
||||
"iso-2022-cn",
|
||||
"iso-2022-cn-ext",
|
||||
"iso-2022-kr"
|
||||
};
|
||||
|
||||
private static final String NAME = "replacement";
|
||||
|
||||
static final Replacement INSTANCE = new Replacement();
|
||||
|
||||
private Replacement() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new ReplacementDecoder(this);
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,75 @@
|
|||
/*
|
||||
* Copyright (c) 2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.ByteBuffer;
|
||||
import java.nio.CharBuffer;
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CoderResult;
|
||||
|
||||
class ReplacementDecoder extends Decoder {
|
||||
|
||||
private boolean haveEmitted = false;
|
||||
|
||||
ReplacementDecoder(Charset cs) {
|
||||
super(cs, 1.0f, 1.0f);
|
||||
}
|
||||
|
||||
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
|
||||
for (;;) {
|
||||
if (!in.hasRemaining()) {
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
if (haveEmitted) {
|
||||
in.position(in.limit());
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
in.position(in.limit());
|
||||
haveEmitted = true;
|
||||
if (this.report) {
|
||||
return CoderResult.malformedForLength(1);
|
||||
}
|
||||
out.put('\uFFFD');
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* @see java.nio.charset.CharsetDecoder#implFlush(java.nio.CharBuffer)
|
||||
*/
|
||||
@Override protected CoderResult implFlush(CharBuffer out) {
|
||||
// Nothing is buffered between calls, so the default flush is sufficient.
|
||||
return super.implFlush(out);
|
||||
}
|
||||
|
||||
/**
|
||||
* @see java.nio.charset.CharsetDecoder#implReset()
|
||||
*/
|
||||
@Override protected void implReset() {
|
||||
haveEmitted = false; // clear the one-shot flag so a reset decoder emits U+FFFD again
|
||||
super.implReset();
|
||||
}
|
||||
|
||||
}
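
The decoder above emits a single U+FFFD for the whole input and then swallows everything else. A self-contained sketch of that one-shot behaviour follows, written directly against `java.nio.charset.CharsetDecoder`. The class name `OneShotReplacementSketch`, the helper `decodeChunks`, and the placeholder charset passed to the superclass are assumptions made for illustration. Clearing the flag in `implReset` is likewise an assumption about the intended behaviour, and the `report` branch of the original is left out.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; the commit's ReplacementDecoder extends its own Decoder class.
class OneShotReplacementSketch extends CharsetDecoder {

    private boolean haveEmitted = false;

    OneShotReplacementSketch() {
        // The charset argument is only a placeholder needed by the superclass constructor.
        super(StandardCharsets.UTF_8, 1.0f, 1.0f);
    }

    @Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
        if (!in.hasRemaining()) {
            return CoderResult.UNDERFLOW;
        }
        if (!haveEmitted) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW;
            }
            out.put('\uFFFD');       // emitted once per stream
            haveEmitted = true;
        }
        in.position(in.limit());     // discard the rest of the input
        return CoderResult.UNDERFLOW;
    }

    @Override protected void implReset() {
        haveEmitted = false;         // allow a reset decoder to emit again
    }

    // Feeds the chunks through one decoder; at least one chunk is required.
    static String decodeChunks(byte[]... chunks) {
        OneShotReplacementSketch decoder = new OneShotReplacementSketch();
        CharBuffer out = CharBuffer.allocate(16);
        for (int i = 0; i < chunks.length; i++) {
            decoder.decode(ByteBuffer.wrap(chunks[i]), out, i == chunks.length - 1);
        }
        decoder.flush(out);
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decodeChunks(new byte[] { 1, 2, 3 }, new byte[] { 4, 5 });
        System.out.println(decoded.length()); // prints 1
    }
}
```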
|
|
@ -0,0 +1,62 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class ShiftJis extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"csshiftjis",
|
||||
"ms932",
|
||||
"ms_kanji",
|
||||
"shift-jis",
|
||||
"shift_jis",
|
||||
"sjis",
|
||||
"windows-31j",
|
||||
"x-sjis"
|
||||
};
|
||||
|
||||
private static final String NAME = "shift_jis";
|
||||
|
||||
static final ShiftJis INSTANCE = new ShiftJis();
|
||||
|
||||
private ShiftJis() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,55 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class UserDefined extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"x-user-defined"
|
||||
};
|
||||
|
||||
private static final String NAME = "x-user-defined";
|
||||
|
||||
static final UserDefined INSTANCE = new UserDefined();
|
||||
|
||||
private UserDefined() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new UserDefinedDecoder(this);
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,56 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.ByteBuffer;
|
||||
import java.nio.CharBuffer;
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CoderResult;
|
||||
|
||||
class UserDefinedDecoder extends Decoder {
|
||||
|
||||
UserDefinedDecoder(Charset cs) {
|
||||
super(cs, 1.0f, 1.0f);
|
||||
}
|
||||
|
||||
@Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
|
||||
// TODO figure out if it's worthwhile to optimize the case where both
|
||||
// buffers are array-backed.
|
||||
for (;;) {
|
||||
if (!in.hasRemaining()) {
|
||||
return CoderResult.UNDERFLOW;
|
||||
}
|
||||
if (!out.hasRemaining()) {
|
||||
return CoderResult.OVERFLOW;
|
||||
}
|
||||
int b = (int)in.get();
|
||||
if (b >= 0) {
|
||||
out.put((char)b);
|
||||
} else {
|
||||
out.put((char)(b + 128 + 0xF780)); // b is -128..-1 here (bytes 0x80..0xFF), giving U+F780..U+F7FF
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
}
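
The arithmetic in `decodeLoop` above relies on Java's signed `byte`: values 0x80..0xFF arrive as -128..-1, so `b + 128 + 0xF780` lands in the range U+F780..U+F7FF. The sketch below restates that mapping on a plain byte array so it can be run in isolation. `UserDefinedSketch` and its `decode` method are hypothetical names and are not part of the commit.

```java
// Hypothetical stand-alone restatement of the x-user-defined byte-to-char mapping.
class UserDefinedSketch {

    static String decode(byte[] input) {
        StringBuilder out = new StringBuilder(input.length);
        for (byte b : input) {
            if (b >= 0) {
                out.append((char) b);                   // 0x00..0x7F map to themselves
            } else {
                out.append((char) (b + 128 + 0xF780));  // 0x80..0xFF map to U+F780..U+F7FF
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode(new byte[] { 0x41, (byte) 0x80, (byte) 0xFF });
        for (char c : decoded.toCharArray()) {
            System.out.println(Integer.toHexString(c));
        }
    }
}
```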
|
|
@ -0,0 +1,55 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Utf16Be extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"utf-16be"
|
||||
};
|
||||
|
||||
private static final String NAME = "utf-16be";
|
||||
|
||||
static final Utf16Be INSTANCE = new Utf16Be();
|
||||
|
||||
private Utf16Be() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,56 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Utf16Le extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"utf-16",
|
||||
"utf-16le"
|
||||
};
|
||||
|
||||
private static final String NAME = "utf-16le";
|
||||
|
||||
static final Utf16Le INSTANCE = new Utf16Le();
|
||||
|
||||
private Utf16Le() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,57 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.Charset;
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
import java.nio.charset.CharsetEncoder;
|
||||
|
||||
class Utf8 extends Encoding {
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"unicode-1-1-utf-8",
|
||||
"utf-8",
|
||||
"utf8"
|
||||
};
|
||||
|
||||
private static final String NAME = "utf-8";
|
||||
|
||||
static final Utf8 INSTANCE = new Utf8();
|
||||
|
||||
private Utf8() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return Charset.forName(NAME).newDecoder();
|
||||
}
|
||||
|
||||
@Override public CharsetEncoder newEncoder() {
|
||||
return Charset.forName(NAME).newEncoder();
|
||||
}
|
||||
}
|
|
@ -0,0 +1,183 @@
|
|||
/*
|
||||
* Copyright (c) 2013-2015 Mozilla Foundation
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
|
||||
* Instead, please regenerate using generate-encoding-data.py
|
||||
*/
|
||||
|
||||
package nu.validator.encoding;
|
||||
|
||||
import java.nio.charset.CharsetDecoder;
|
||||
|
||||
class Windows1250 extends Encoding {
|
||||
|
||||
private static final char[] TABLE = {
        '\u20ac',
        '\u0081',
        '\u201a',
        '\u0083',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u0088',
        '\u2030',
        '\u0160',
        '\u2039',
        '\u015a',
        '\u0164',
        '\u017d',
        '\u0179',
        '\u0090',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u0098',
        '\u2122',
        '\u0161',
        '\u203a',
        '\u015b',
        '\u0165',
        '\u017e',
        '\u017a',
        '\u00a0',
        '\u02c7',
        '\u02d8',
        '\u0141',
        '\u00a4',
        '\u0104',
        '\u00a6',
        '\u00a7',
        '\u00a8',
        '\u00a9',
        '\u015e',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u017b',
        '\u00b0',
        '\u00b1',
        '\u02db',
        '\u0142',
        '\u00b4',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u00b8',
        '\u0105',
        '\u015f',
        '\u00bb',
        '\u013d',
        '\u02dd',
        '\u013e',
        '\u017c',
        '\u0154',
        '\u00c1',
        '\u00c2',
        '\u0102',
        '\u00c4',
        '\u0139',
        '\u0106',
        '\u00c7',
        '\u010c',
        '\u00c9',
        '\u0118',
        '\u00cb',
        '\u011a',
        '\u00cd',
        '\u00ce',
        '\u010e',
        '\u0110',
        '\u0143',
        '\u0147',
        '\u00d3',
        '\u00d4',
        '\u0150',
        '\u00d6',
        '\u00d7',
        '\u0158',
        '\u016e',
        '\u00da',
        '\u0170',
        '\u00dc',
        '\u00dd',
        '\u0162',
        '\u00df',
        '\u0155',
        '\u00e1',
        '\u00e2',
        '\u0103',
        '\u00e4',
        '\u013a',
        '\u0107',
        '\u00e7',
        '\u010d',
        '\u00e9',
        '\u0119',
        '\u00eb',
        '\u011b',
        '\u00ed',
        '\u00ee',
        '\u010f',
        '\u0111',
        '\u0144',
        '\u0148',
        '\u00f3',
        '\u00f4',
        '\u0151',
        '\u00f6',
        '\u00f7',
        '\u0159',
        '\u016f',
        '\u00fa',
        '\u0171',
        '\u00fc',
        '\u00fd',
        '\u0163',
        '\u02d9'
    };

    private static final String[] LABELS = {
        "cp1250",
        "windows-1250",
        "x-cp1250"
    };

    private static final String NAME = "windows-1250";

    static final Encoding INSTANCE = new Windows1250();

    private Windows1250() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new InfallibleSingleByteDecoder(this, TABLE);
    }

}
@ -0,0 +1,183 @
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1251 extends Encoding {

    private static final char[] TABLE = {
        '\u0402',
        '\u0403',
        '\u201a',
        '\u0453',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u20ac',
        '\u2030',
        '\u0409',
        '\u2039',
        '\u040a',
        '\u040c',
        '\u040b',
        '\u040f',
        '\u0452',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u0098',
        '\u2122',
        '\u0459',
        '\u203a',
        '\u045a',
        '\u045c',
        '\u045b',
        '\u045f',
        '\u00a0',
        '\u040e',
        '\u045e',
        '\u0408',
        '\u00a4',
        '\u0490',
        '\u00a6',
        '\u00a7',
        '\u0401',
        '\u00a9',
        '\u0404',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u0407',
        '\u00b0',
        '\u00b1',
        '\u0406',
        '\u0456',
        '\u0491',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u0451',
        '\u2116',
        '\u0454',
        '\u00bb',
        '\u0458',
        '\u0405',
        '\u0455',
        '\u0457',
        '\u0410',
        '\u0411',
        '\u0412',
        '\u0413',
        '\u0414',
        '\u0415',
        '\u0416',
        '\u0417',
        '\u0418',
        '\u0419',
        '\u041a',
        '\u041b',
        '\u041c',
        '\u041d',
        '\u041e',
        '\u041f',
        '\u0420',
        '\u0421',
        '\u0422',
        '\u0423',
        '\u0424',
        '\u0425',
        '\u0426',
        '\u0427',
        '\u0428',
        '\u0429',
        '\u042a',
        '\u042b',
        '\u042c',
        '\u042d',
        '\u042e',
        '\u042f',
        '\u0430',
        '\u0431',
        '\u0432',
        '\u0433',
        '\u0434',
        '\u0435',
        '\u0436',
        '\u0437',
        '\u0438',
        '\u0439',
        '\u043a',
        '\u043b',
        '\u043c',
        '\u043d',
        '\u043e',
        '\u043f',
        '\u0440',
        '\u0441',
        '\u0442',
        '\u0443',
        '\u0444',
        '\u0445',
        '\u0446',
        '\u0447',
        '\u0448',
        '\u0449',
        '\u044a',
        '\u044b',
        '\u044c',
        '\u044d',
        '\u044e',
        '\u044f'
    };

    private static final String[] LABELS = {
        "cp1251",
        "windows-1251",
        "x-cp1251"
    };

    private static final String NAME = "windows-1251";

    static final Encoding INSTANCE = new Windows1251();

    private Windows1251() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new InfallibleSingleByteDecoder(this, TABLE);
    }

}
@ -0,0 +1,197 @
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1252 extends Encoding {

    private static final char[] TABLE = {
        '\u20ac',
        '\u0081',
        '\u201a',
        '\u0192',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u02c6',
        '\u2030',
        '\u0160',
        '\u2039',
        '\u0152',
        '\u008d',
        '\u017d',
        '\u008f',
        '\u0090',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u02dc',
        '\u2122',
        '\u0161',
        '\u203a',
        '\u0153',
        '\u009d',
        '\u017e',
        '\u0178',
        '\u00a0',
        '\u00a1',
        '\u00a2',
        '\u00a3',
        '\u00a4',
        '\u00a5',
        '\u00a6',
        '\u00a7',
        '\u00a8',
        '\u00a9',
        '\u00aa',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u00af',
        '\u00b0',
        '\u00b1',
        '\u00b2',
        '\u00b3',
        '\u00b4',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u00b8',
        '\u00b9',
        '\u00ba',
        '\u00bb',
        '\u00bc',
        '\u00bd',
        '\u00be',
        '\u00bf',
        '\u00c0',
        '\u00c1',
        '\u00c2',
        '\u00c3',
        '\u00c4',
        '\u00c5',
        '\u00c6',
        '\u00c7',
        '\u00c8',
        '\u00c9',
        '\u00ca',
        '\u00cb',
        '\u00cc',
        '\u00cd',
        '\u00ce',
        '\u00cf',
        '\u00d0',
        '\u00d1',
        '\u00d2',
        '\u00d3',
        '\u00d4',
        '\u00d5',
        '\u00d6',
        '\u00d7',
        '\u00d8',
        '\u00d9',
        '\u00da',
        '\u00db',
        '\u00dc',
        '\u00dd',
        '\u00de',
        '\u00df',
        '\u00e0',
        '\u00e1',
        '\u00e2',
        '\u00e3',
        '\u00e4',
        '\u00e5',
        '\u00e6',
        '\u00e7',
        '\u00e8',
        '\u00e9',
        '\u00ea',
        '\u00eb',
        '\u00ec',
        '\u00ed',
        '\u00ee',
        '\u00ef',
        '\u00f0',
        '\u00f1',
        '\u00f2',
        '\u00f3',
        '\u00f4',
        '\u00f5',
        '\u00f6',
        '\u00f7',
        '\u00f8',
        '\u00f9',
        '\u00fa',
        '\u00fb',
        '\u00fc',
        '\u00fd',
        '\u00fe',
        '\u00ff'
    };

    private static final String[] LABELS = {
        "ansi_x3.4-1968",
        "ascii",
        "cp1252",
        "cp819",
        "csisolatin1",
        "ibm819",
        "iso-8859-1",
        "iso-ir-100",
        "iso8859-1",
        "iso88591",
        "iso_8859-1",
        "iso_8859-1:1987",
        "l1",
        "latin1",
        "us-ascii",
        "windows-1252",
        "x-cp1252"
    };

    private static final String NAME = "windows-1252";

    static final Encoding INSTANCE = new Windows1252();

    private Windows1252() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new InfallibleSingleByteDecoder(this, TABLE);
    }

}
@ -0,0 +1,183 @
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1253 extends Encoding {

    private static final char[] TABLE = {
        '\u20ac',
        '\u0081',
        '\u201a',
        '\u0192',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u0088',
        '\u2030',
        '\u008a',
        '\u2039',
        '\u008c',
        '\u008d',
        '\u008e',
        '\u008f',
        '\u0090',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u0098',
        '\u2122',
        '\u009a',
        '\u203a',
        '\u009c',
        '\u009d',
        '\u009e',
        '\u009f',
        '\u00a0',
        '\u0385',
        '\u0386',
        '\u00a3',
        '\u00a4',
        '\u00a5',
        '\u00a6',
        '\u00a7',
        '\u00a8',
        '\u00a9',
        '\ufffd',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u2015',
        '\u00b0',
        '\u00b1',
        '\u00b2',
        '\u00b3',
        '\u0384',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u0388',
        '\u0389',
        '\u038a',
        '\u00bb',
        '\u038c',
        '\u00bd',
        '\u038e',
        '\u038f',
        '\u0390',
        '\u0391',
        '\u0392',
        '\u0393',
        '\u0394',
        '\u0395',
        '\u0396',
        '\u0397',
        '\u0398',
        '\u0399',
        '\u039a',
        '\u039b',
        '\u039c',
        '\u039d',
        '\u039e',
        '\u039f',
        '\u03a0',
        '\u03a1',
        '\ufffd',
        '\u03a3',
        '\u03a4',
        '\u03a5',
        '\u03a6',
        '\u03a7',
        '\u03a8',
        '\u03a9',
        '\u03aa',
        '\u03ab',
        '\u03ac',
        '\u03ad',
        '\u03ae',
        '\u03af',
        '\u03b0',
        '\u03b1',
        '\u03b2',
        '\u03b3',
        '\u03b4',
        '\u03b5',
        '\u03b6',
        '\u03b7',
        '\u03b8',
        '\u03b9',
        '\u03ba',
        '\u03bb',
        '\u03bc',
        '\u03bd',
        '\u03be',
        '\u03bf',
        '\u03c0',
        '\u03c1',
        '\u03c2',
        '\u03c3',
        '\u03c4',
        '\u03c5',
        '\u03c6',
        '\u03c7',
        '\u03c8',
        '\u03c9',
        '\u03ca',
        '\u03cb',
        '\u03cc',
        '\u03cd',
        '\u03ce',
        '\ufffd'
    };

    private static final String[] LABELS = {
        "cp1253",
        "windows-1253",
        "x-cp1253"
    };

    private static final String NAME = "windows-1253";

    static final Encoding INSTANCE = new Windows1253();

    private Windows1253() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new FallibleSingleByteDecoder(this, TABLE);
    }

}
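Reviewer note: the generated `TABLE` arrays in these files each hold 128 entries, one per byte value 0x80 through 0xFF, and positions with no assigned character carry U+FFFD. The following is a minimal sketch of how such a table can drive decoding. It is not the code of `InfallibleSingleByteDecoder` or `FallibleSingleByteDecoder`, which this diff does not show; the class `SingleByteTableSketch`, its methods, and the three-entry stand-in table are hypothetical, and the assumption that bytes below 0x80 map to themselves comes from how single-byte encodings are generally defined rather than from this commit.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the commit.
public class SingleByteTableSketch {

    /** Decodes bytes using a 128-entry table covering 0x80..0xFF. */
    static String decode(byte[] input, char[] table) {
        StringBuilder out = new StringBuilder(input.length);
        for (byte b : input) {
            int unsigned = b & 0xFF;
            if (unsigned < 0x80) {
                // Assumption: the lower half is plain ASCII.
                out.append((char) unsigned);
            } else {
                // Upper half: look up the character at (byte - 0x80).
                out.append(table[unsigned - 0x80]);
            }
        }
        return out.toString();
    }

    /** Stand-in table: every slot unmapped except two entries copied from Windows1253. */
    static char[] sampleTable() {
        char[] table = new char[128];
        Arrays.fill(table, '\ufffd');
        table[0x80 - 0x80] = '\u20ac'; // first entry of the Windows1253 table
        table[0xC1 - 0x80] = '\u0391'; // 66th entry of the Windows1253 table
        return table;
    }

    public static void main(String[] args) {
        byte[] input = {0x41, (byte) 0x80, (byte) 0xC1, (byte) 0xAA};
        for (char c : decode(input, sampleTable()).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

In this sketch an unmapped byte simply yields U+FFFD. A decoder that reports malformed input for such bytes would need an extra check at the lookup, which may be what separates the fallible and infallible decoder classes named in these files, though that is a guess.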
@ -0,0 +1,192 @
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1254 extends Encoding {

    private static final char[] TABLE = {
        '\u20ac',
        '\u0081',
        '\u201a',
        '\u0192',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u02c6',
        '\u2030',
        '\u0160',
        '\u2039',
        '\u0152',
        '\u008d',
        '\u008e',
        '\u008f',
        '\u0090',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u02dc',
        '\u2122',
        '\u0161',
        '\u203a',
        '\u0153',
        '\u009d',
        '\u009e',
        '\u0178',
        '\u00a0',
        '\u00a1',
        '\u00a2',
        '\u00a3',
        '\u00a4',
        '\u00a5',
        '\u00a6',
        '\u00a7',
        '\u00a8',
        '\u00a9',
        '\u00aa',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u00af',
        '\u00b0',
        '\u00b1',
        '\u00b2',
        '\u00b3',
        '\u00b4',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u00b8',
        '\u00b9',
        '\u00ba',
        '\u00bb',
        '\u00bc',
        '\u00bd',
        '\u00be',
        '\u00bf',
        '\u00c0',
        '\u00c1',
        '\u00c2',
        '\u00c3',
        '\u00c4',
        '\u00c5',
        '\u00c6',
        '\u00c7',
        '\u00c8',
        '\u00c9',
        '\u00ca',
        '\u00cb',
        '\u00cc',
        '\u00cd',
        '\u00ce',
        '\u00cf',
        '\u011e',
        '\u00d1',
        '\u00d2',
        '\u00d3',
        '\u00d4',
        '\u00d5',
        '\u00d6',
        '\u00d7',
        '\u00d8',
        '\u00d9',
        '\u00da',
        '\u00db',
        '\u00dc',
        '\u0130',
        '\u015e',
        '\u00df',
        '\u00e0',
        '\u00e1',
        '\u00e2',
        '\u00e3',
        '\u00e4',
        '\u00e5',
        '\u00e6',
        '\u00e7',
        '\u00e8',
        '\u00e9',
        '\u00ea',
        '\u00eb',
        '\u00ec',
        '\u00ed',
        '\u00ee',
        '\u00ef',
        '\u011f',
        '\u00f1',
        '\u00f2',
        '\u00f3',
        '\u00f4',
        '\u00f5',
        '\u00f6',
        '\u00f7',
        '\u00f8',
        '\u00f9',
        '\u00fa',
        '\u00fb',
        '\u00fc',
        '\u0131',
        '\u015f',
        '\u00ff'
    };

    private static final String[] LABELS = {
        "cp1254",
        "csisolatin5",
        "iso-8859-9",
        "iso-ir-148",
        "iso8859-9",
        "iso88599",
        "iso_8859-9",
        "iso_8859-9:1989",
        "l5",
        "latin5",
        "windows-1254",
        "x-cp1254"
    };

    private static final String NAME = "windows-1254";

    static final Encoding INSTANCE = new Windows1254();

    private Windows1254() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new InfallibleSingleByteDecoder(this, TABLE);
    }

}
@ -0,0 +1,183 @
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1255 extends Encoding {

    private static final char[] TABLE = {
        '\u20ac',
        '\u0081',
        '\u201a',
        '\u0192',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u02c6',
        '\u2030',
        '\u008a',
        '\u2039',
        '\u008c',
        '\u008d',
        '\u008e',
        '\u008f',
        '\u0090',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u02dc',
        '\u2122',
        '\u009a',
        '\u203a',
        '\u009c',
        '\u009d',
        '\u009e',
        '\u009f',
        '\u00a0',
        '\u00a1',
        '\u00a2',
        '\u00a3',
        '\u20aa',
        '\u00a5',
        '\u00a6',
        '\u00a7',
        '\u00a8',
        '\u00a9',
        '\u00d7',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u00af',
        '\u00b0',
        '\u00b1',
        '\u00b2',
        '\u00b3',
        '\u00b4',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u00b8',
        '\u00b9',
        '\u00f7',
        '\u00bb',
        '\u00bc',
        '\u00bd',
        '\u00be',
        '\u00bf',
        '\u05b0',
        '\u05b1',
        '\u05b2',
        '\u05b3',
        '\u05b4',
        '\u05b5',
        '\u05b6',
        '\u05b7',
        '\u05b8',
        '\u05b9',
        '\ufffd',
        '\u05bb',
        '\u05bc',
        '\u05bd',
        '\u05be',
        '\u05bf',
        '\u05c0',
        '\u05c1',
        '\u05c2',
        '\u05c3',
        '\u05f0',
        '\u05f1',
        '\u05f2',
        '\u05f3',
        '\u05f4',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\u05d0',
        '\u05d1',
        '\u05d2',
        '\u05d3',
        '\u05d4',
        '\u05d5',
        '\u05d6',
        '\u05d7',
        '\u05d8',
        '\u05d9',
        '\u05da',
        '\u05db',
        '\u05dc',
        '\u05dd',
        '\u05de',
        '\u05df',
        '\u05e0',
        '\u05e1',
        '\u05e2',
        '\u05e3',
        '\u05e4',
        '\u05e5',
        '\u05e6',
        '\u05e7',
        '\u05e8',
        '\u05e9',
        '\u05ea',
        '\ufffd',
        '\ufffd',
        '\u200e',
        '\u200f',
        '\ufffd'
    };

    private static final String[] LABELS = {
        "cp1255",
        "windows-1255",
        "x-cp1255"
    };

    private static final String NAME = "windows-1255";

    static final Encoding INSTANCE = new Windows1255();

    private Windows1255() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new FallibleSingleByteDecoder(this, TABLE);
    }

}
@ -0,0 +1,183 @
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1256 extends Encoding {

    private static final char[] TABLE = {
        '\u20ac',
        '\u067e',
        '\u201a',
        '\u0192',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u02c6',
        '\u2030',
        '\u0679',
        '\u2039',
        '\u0152',
        '\u0686',
        '\u0698',
        '\u0688',
        '\u06af',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u06a9',
        '\u2122',
        '\u0691',
        '\u203a',
        '\u0153',
        '\u200c',
        '\u200d',
        '\u06ba',
        '\u00a0',
        '\u060c',
        '\u00a2',
        '\u00a3',
        '\u00a4',
        '\u00a5',
        '\u00a6',
        '\u00a7',
        '\u00a8',
        '\u00a9',
        '\u06be',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u00af',
        '\u00b0',
        '\u00b1',
        '\u00b2',
        '\u00b3',
        '\u00b4',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u00b8',
        '\u00b9',
        '\u061b',
        '\u00bb',
        '\u00bc',
        '\u00bd',
        '\u00be',
        '\u061f',
        '\u06c1',
        '\u0621',
        '\u0622',
        '\u0623',
        '\u0624',
        '\u0625',
        '\u0626',
        '\u0627',
        '\u0628',
        '\u0629',
        '\u062a',
        '\u062b',
        '\u062c',
        '\u062d',
        '\u062e',
        '\u062f',
        '\u0630',
        '\u0631',
        '\u0632',
        '\u0633',
        '\u0634',
        '\u0635',
        '\u0636',
        '\u00d7',
        '\u0637',
        '\u0638',
        '\u0639',
        '\u063a',
        '\u0640',
        '\u0641',
        '\u0642',
        '\u0643',
        '\u00e0',
        '\u0644',
        '\u00e2',
        '\u0645',
        '\u0646',
        '\u0647',
        '\u0648',
        '\u00e7',
        '\u00e8',
        '\u00e9',
        '\u00ea',
        '\u00eb',
        '\u0649',
        '\u064a',
        '\u00ee',
        '\u00ef',
        '\u064b',
        '\u064c',
        '\u064d',
        '\u064e',
        '\u00f4',
        '\u064f',
        '\u0650',
        '\u00f7',
        '\u0651',
        '\u00f9',
        '\u0652',
        '\u00fb',
        '\u00fc',
        '\u200e',
        '\u200f',
        '\u06d2'
    };

    private static final String[] LABELS = {
        "cp1256",
        "windows-1256",
        "x-cp1256"
    };

    private static final String NAME = "windows-1256";

    static final Encoding INSTANCE = new Windows1256();

    private Windows1256() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new InfallibleSingleByteDecoder(this, TABLE);
    }

}
@ -0,0 +1,183 @
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1257 extends Encoding {

    private static final char[] TABLE = {
|
||||
'\u20ac',
|
||||
'\u0081',
|
||||
'\u201a',
|
||||
'\u0083',
|
||||
'\u201e',
|
||||
'\u2026',
|
||||
'\u2020',
|
||||
'\u2021',
|
||||
'\u0088',
|
||||
'\u2030',
|
||||
'\u008a',
|
||||
'\u2039',
|
||||
'\u008c',
|
||||
'\u00a8',
|
||||
'\u02c7',
|
||||
'\u00b8',
|
||||
'\u0090',
|
||||
'\u2018',
|
||||
'\u2019',
|
||||
'\u201c',
|
||||
'\u201d',
|
||||
'\u2022',
|
||||
'\u2013',
|
||||
'\u2014',
|
||||
'\u0098',
|
||||
'\u2122',
|
||||
'\u009a',
|
||||
'\u203a',
|
||||
'\u009c',
|
||||
'\u00af',
|
||||
'\u02db',
|
||||
'\u009f',
|
||||
'\u00a0',
|
||||
'\ufffd',
|
||||
'\u00a2',
|
||||
'\u00a3',
|
||||
'\u00a4',
|
||||
'\ufffd',
|
||||
'\u00a6',
|
||||
'\u00a7',
|
||||
'\u00d8',
|
||||
'\u00a9',
|
||||
'\u0156',
|
||||
'\u00ab',
|
||||
'\u00ac',
|
||||
'\u00ad',
|
||||
'\u00ae',
|
||||
'\u00c6',
|
||||
'\u00b0',
|
||||
'\u00b1',
|
||||
'\u00b2',
|
||||
'\u00b3',
|
||||
'\u00b4',
|
||||
'\u00b5',
|
||||
'\u00b6',
|
||||
'\u00b7',
|
||||
'\u00f8',
|
||||
'\u00b9',
|
||||
'\u0157',
|
||||
'\u00bb',
|
||||
'\u00bc',
|
||||
'\u00bd',
|
||||
'\u00be',
|
||||
'\u00e6',
|
||||
'\u0104',
|
||||
'\u012e',
|
||||
'\u0100',
|
||||
'\u0106',
|
||||
'\u00c4',
|
||||
'\u00c5',
|
||||
'\u0118',
|
||||
'\u0112',
|
||||
'\u010c',
|
||||
'\u00c9',
|
||||
'\u0179',
|
||||
'\u0116',
|
||||
'\u0122',
|
||||
'\u0136',
|
||||
'\u012a',
|
||||
'\u013b',
|
||||
'\u0160',
|
||||
'\u0143',
|
||||
'\u0145',
|
||||
'\u00d3',
|
||||
'\u014c',
|
||||
'\u00d5',
|
||||
'\u00d6',
|
||||
'\u00d7',
|
||||
'\u0172',
|
||||
'\u0141',
|
||||
'\u015a',
|
||||
'\u016a',
|
||||
'\u00dc',
|
||||
'\u017b',
|
||||
'\u017d',
|
||||
'\u00df',
|
||||
'\u0105',
|
||||
'\u012f',
|
||||
'\u0101',
|
||||
'\u0107',
|
||||
'\u00e4',
|
||||
'\u00e5',
|
||||
'\u0119',
|
||||
'\u0113',
|
||||
'\u010d',
|
||||
'\u00e9',
|
||||
'\u017a',
|
||||
'\u0117',
|
||||
'\u0123',
|
||||
'\u0137',
|
||||
'\u012b',
|
||||
'\u013c',
|
||||
'\u0161',
|
||||
'\u0144',
|
||||
'\u0146',
|
||||
'\u00f3',
|
||||
'\u014d',
|
||||
'\u00f5',
|
||||
'\u00f6',
|
||||
'\u00f7',
|
||||
'\u0173',
|
||||
'\u0142',
|
||||
'\u015b',
|
||||
'\u016b',
|
||||
'\u00fc',
|
||||
'\u017c',
|
||||
'\u017e',
|
||||
'\u02d9'
|
||||
};
|
||||
|
||||
private static final String[] LABELS = {
|
||||
"cp1257",
|
||||
"windows-1257",
|
||||
"x-cp1257"
|
||||
};
|
||||
|
||||
private static final String NAME = "windows-1257";
|
||||
|
||||
static final Encoding INSTANCE = new Windows1257();
|
||||
|
||||
private Windows1257() {
|
||||
super(NAME, LABELS);
|
||||
}
|
||||
|
||||
@Override public CharsetDecoder newDecoder() {
|
||||
return new FallibleSingleByteDecoder(this, TABLE);
|
||||
}
|
||||
|
||||
}
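
Reviewer note: each generated `TABLE` in these encoding classes has 128 entries, which suggests one entry per byte value from 0x80 to 0xFF, with bytes below 0x80 passed through as ASCII. The decoder classes themselves (`FallibleSingleByteDecoder`, `InfallibleSingleByteDecoder`) are not shown in this part of the diff, so the following is only a minimal sketch of that table-lookup idea under those assumptions. The class name `SingleByteSketch`, its shortened table, and the `decode` helper are all hypothetical and are not part of the commit.

```java
// Hypothetical illustration only; this is not the decoder used by the files in this diff.
final class SingleByteSketch {

    // Shortened stand-in for a generated 128-entry table: index i holds the
    // character for byte value 0x80 + i. Only the first three windows-1257
    // entries are copied from the diff; every other slot is filled with U+FFFD.
    static final char[] TABLE = buildTable();

    private static char[] buildTable() {
        char[] table = new char[128];
        java.util.Arrays.fill(table, '\ufffd');
        table[0] = '\u20ac'; // byte 0x80
        table[1] = '\u0081'; // byte 0x81
        table[2] = '\u201a'; // byte 0x82
        return table;
    }

    // Decodes a single-byte encoding: ASCII bytes map to themselves,
    // bytes 0x80..0xFF are looked up in the table (assumed layout).
    static String decode(byte[] input, char[] table) {
        StringBuilder out = new StringBuilder(input.length);
        for (byte b : input) {
            int unsigned = b & 0xFF; // Java bytes are signed; mask to 0..255
            if (unsigned < 0x80) {
                out.append((char) unsigned);
            } else {
                out.append(table[unsigned - 0x80]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] input = { 0x41, (byte) 0x80, 0x42 };
        System.out.println(decode(input, TABLE).equals("A\u20acB"));
    }
}
```

The "fallible" and "infallible" decoder names plausibly differ in how they treat the `'\ufffd'` slots (windows-1257 and windows-874 contain such entries, windows-1258 does not), but that behaviour cannot be confirmed from this diff.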
@@ -0,0 +1,183 @@
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows1258 extends Encoding {

    private static final char[] TABLE = {
        '\u20ac',
        '\u0081',
        '\u201a',
        '\u0192',
        '\u201e',
        '\u2026',
        '\u2020',
        '\u2021',
        '\u02c6',
        '\u2030',
        '\u008a',
        '\u2039',
        '\u0152',
        '\u008d',
        '\u008e',
        '\u008f',
        '\u0090',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u02dc',
        '\u2122',
        '\u009a',
        '\u203a',
        '\u0153',
        '\u009d',
        '\u009e',
        '\u0178',
        '\u00a0',
        '\u00a1',
        '\u00a2',
        '\u00a3',
        '\u00a4',
        '\u00a5',
        '\u00a6',
        '\u00a7',
        '\u00a8',
        '\u00a9',
        '\u00aa',
        '\u00ab',
        '\u00ac',
        '\u00ad',
        '\u00ae',
        '\u00af',
        '\u00b0',
        '\u00b1',
        '\u00b2',
        '\u00b3',
        '\u00b4',
        '\u00b5',
        '\u00b6',
        '\u00b7',
        '\u00b8',
        '\u00b9',
        '\u00ba',
        '\u00bb',
        '\u00bc',
        '\u00bd',
        '\u00be',
        '\u00bf',
        '\u00c0',
        '\u00c1',
        '\u00c2',
        '\u0102',
        '\u00c4',
        '\u00c5',
        '\u00c6',
        '\u00c7',
        '\u00c8',
        '\u00c9',
        '\u00ca',
        '\u00cb',
        '\u0300',
        '\u00cd',
        '\u00ce',
        '\u00cf',
        '\u0110',
        '\u00d1',
        '\u0309',
        '\u00d3',
        '\u00d4',
        '\u01a0',
        '\u00d6',
        '\u00d7',
        '\u00d8',
        '\u00d9',
        '\u00da',
        '\u00db',
        '\u00dc',
        '\u01af',
        '\u0303',
        '\u00df',
        '\u00e0',
        '\u00e1',
        '\u00e2',
        '\u0103',
        '\u00e4',
        '\u00e5',
        '\u00e6',
        '\u00e7',
        '\u00e8',
        '\u00e9',
        '\u00ea',
        '\u00eb',
        '\u0301',
        '\u00ed',
        '\u00ee',
        '\u00ef',
        '\u0111',
        '\u00f1',
        '\u0323',
        '\u00f3',
        '\u00f4',
        '\u01a1',
        '\u00f6',
        '\u00f7',
        '\u00f8',
        '\u00f9',
        '\u00fa',
        '\u00fb',
        '\u00fc',
        '\u01b0',
        '\u20ab',
        '\u00ff'
    };

    private static final String[] LABELS = {
        "cp1258",
        "windows-1258",
        "x-cp1258"
    };

    private static final String NAME = "windows-1258";

    static final Encoding INSTANCE = new Windows1258();

    private Windows1258() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new InfallibleSingleByteDecoder(this, TABLE);
    }

}
@@ -0,0 +1,186 @@
/*
 * Copyright (c) 2013-2015 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

/*
 * THIS IS A GENERATED FILE. PLEASE DO NOT EDIT.
 * Instead, please regenerate using generate-encoding-data.py
 */

package nu.validator.encoding;

import java.nio.charset.CharsetDecoder;

class Windows874 extends Encoding {

    private static final char[] TABLE = {
        '\u20ac',
        '\u0081',
        '\u0082',
        '\u0083',
        '\u0084',
        '\u2026',
        '\u0086',
        '\u0087',
        '\u0088',
        '\u0089',
        '\u008a',
        '\u008b',
        '\u008c',
        '\u008d',
        '\u008e',
        '\u008f',
        '\u0090',
        '\u2018',
        '\u2019',
        '\u201c',
        '\u201d',
        '\u2022',
        '\u2013',
        '\u2014',
        '\u0098',
        '\u0099',
        '\u009a',
        '\u009b',
        '\u009c',
        '\u009d',
        '\u009e',
        '\u009f',
        '\u00a0',
        '\u0e01',
        '\u0e02',
        '\u0e03',
        '\u0e04',
        '\u0e05',
        '\u0e06',
        '\u0e07',
        '\u0e08',
        '\u0e09',
        '\u0e0a',
        '\u0e0b',
        '\u0e0c',
        '\u0e0d',
        '\u0e0e',
        '\u0e0f',
        '\u0e10',
        '\u0e11',
        '\u0e12',
        '\u0e13',
        '\u0e14',
        '\u0e15',
        '\u0e16',
        '\u0e17',
        '\u0e18',
        '\u0e19',
        '\u0e1a',
        '\u0e1b',
        '\u0e1c',
        '\u0e1d',
        '\u0e1e',
        '\u0e1f',
        '\u0e20',
        '\u0e21',
        '\u0e22',
        '\u0e23',
        '\u0e24',
        '\u0e25',
        '\u0e26',
        '\u0e27',
        '\u0e28',
        '\u0e29',
        '\u0e2a',
        '\u0e2b',
        '\u0e2c',
        '\u0e2d',
        '\u0e2e',
        '\u0e2f',
        '\u0e30',
        '\u0e31',
        '\u0e32',
        '\u0e33',
        '\u0e34',
        '\u0e35',
        '\u0e36',
        '\u0e37',
        '\u0e38',
        '\u0e39',
        '\u0e3a',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\u0e3f',
        '\u0e40',
        '\u0e41',
        '\u0e42',
        '\u0e43',
        '\u0e44',
        '\u0e45',
        '\u0e46',
        '\u0e47',
        '\u0e48',
        '\u0e49',
        '\u0e4a',
        '\u0e4b',
        '\u0e4c',
        '\u0e4d',
        '\u0e4e',
        '\u0e4f',
        '\u0e50',
        '\u0e51',
        '\u0e52',
        '\u0e53',
        '\u0e54',
        '\u0e55',
        '\u0e56',
        '\u0e57',
        '\u0e58',
        '\u0e59',
        '\u0e5a',
        '\u0e5b',
        '\ufffd',
        '\ufffd',
        '\ufffd',
        '\ufffd'
    };

    private static final String[] LABELS = {
        "dos-874",
        "iso-8859-11",
        "iso8859-11",
        "iso885911",
        "tis-620",
        "windows-874"
    };

    private static final String NAME = "windows-874";

    static final Encoding INSTANCE = new Windows874();

    private Windows874() {
        super(NAME, LABELS);
    }

    @Override public CharsetDecoder newDecoder() {
        return new FallibleSingleByteDecoder(this, TABLE);
    }

}
@@ -0,0 +1,27 @@
/*
 * Copyright (c) 2010 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

public @interface Auto {

}
@@ -0,0 +1,27 @@
/*
 * Copyright (c) 2010 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

public @interface CharacterName {

}
@@ -0,0 +1,34 @@
/*
 * Copyright (c) 2010 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

/**
 * Marker for translating into the C++ const keyword on the declaration in
 * question.
 *
 * @version $Id$
 * @author hsivonen
 */
public @interface Const {

}
@@ -0,0 +1,34 @@
/*
 * Copyright (c) 2008 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

/**
 * The type for attribute IDness. (In Java, an interned string
 * <code>"CDATA"</code> or <code>"ID"</code>.)
 *
 * @version $Id$
 * @author hsivonen
 */
public @interface IdType {

}
@@ -0,0 +1,33 @@
/*
 * Copyright (c) 2009-2010 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

/**
 * Translates into the C++ inline keyword.
 *
 * @version $Id$
 * @author hsivonen
 */
public @interface Inline {

}
@@ -0,0 +1,34 @@
/*
 * Copyright (c) 2009-2010 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

/**
 * Marks a string type as being the literal string type (typically const char*)
 * in C++.
 *
 * @version $Id$
 * @author hsivonen
 */
public @interface Literal {

}
@@ -0,0 +1,34 @@
/*
 * Copyright (c) 2008 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

/**
 * The local name of an element or attribute. Must be comparable with
 * <code>==</code> (interned <code>String</code> in Java).
 *
 * @version $Id$
 * @author hsivonen
 */
public @interface Local {

}
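
Reviewer note: the `Local` Javadoc requires that annotated names be comparable with `==`, which in Java only works reliably for interned strings. The snippet below is a small sketch of that constraint, not code from this commit. `LocalNameSketch` and `sameLocalName` are hypothetical names, and the annotation is redeclared as a nested type only so the example compiles on its own.

```java
// Hypothetical sketch; not part of the commit.
final class LocalNameSketch {

    // Local stand-in for nu.validator.htmlparser.annotation.Local so that this
    // example is self-contained.
    @interface Local {
    }

    // Reference comparison, which is what the annotation's contract permits.
    // It is only meaningful when both arguments are interned strings.
    static boolean sameLocalName(@Local String a, @Local String b) {
        return a == b;
    }

    public static void main(String[] args) {
        @Local String fromLiteral = "div";                   // literals are interned
        @Local String fromInput = new String("div").intern(); // interned explicitly

        // Interned strings with equal content share one reference.
        System.out.println(sameLocalName(fromLiteral, fromInput));

        // A fresh, non-interned String breaks the contract: equal content,
        // different reference.
        System.out.println(sameLocalName(fromLiteral, new String("div")));
    }
}
```

The second call shows why a value that has not been interned must not be passed where `@Local` is expected: the reference comparison fails even though the characters match.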
@@ -0,0 +1,34 @@
/*
 * Copyright (c) 2008 Mozilla Foundation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

package nu.validator.htmlparser.annotation;

/**
 * The array type marked with this annotation won't have its
 * <code>.length</code> read.
 *
 * @version $Id$
 * @author hsivonen
 */
public @interface NoLength {

}
Some files were not shown because too many files have changed in this diff.