593 Commits

Author SHA1 Message Date
Nick Treleaven
657c7e73be Parse D enum base type & refactor
Merge and extend fishman/ctags enum inheritance parsing for C++, D,
and add simple.d.t/input.d test.
2015-01-14 17:03:26 +00:00
Jiří Techet
c131466a00 Revert "Microoptimization in merge"
This reverts commit cb9e4bbf7446e45365cad2242087f2a766662f20.
2014-12-30 17:09:18 +01:00
Colomban Wendling
29cc8b4d28 d: size_t and wchar_t aren't keywords in D 2014-12-25 16:46:55 +01:00
Colomban Wendling
371301a84d c: Don't parse wchar_t as a keyword 2014-12-25 01:36:27 +01:00
Colomban Wendling
e091a56a18 c, c++: Don't parse size_t as a keyword
This fixes handling of typedefs defining this name.
2014-12-25 01:35:28 +01:00
Jiří Techet
7c22ceacf9 Update the go parser to the latest version from ctags 2014-12-07 22:25:13 +01:00
Colomban Wendling
f08af8046f Merge branch 'js-update'
Import back JavaScript parser changes from fishman/ctags.
2014-12-02 15:03:20 +01:00
Colomban Wendling
02bc3b3638 javascript: Improve string literals handling
1. Don't include the newline itself in a line continuation construct.
   This fixes generation of e.g. properties with embedded line
   continuations.
2. Don't continue parsing strings past an unescaped newline (as naked
   newlines are invalid inside strings).  This avoids parsing the whole
   remaining file as a string in case of broken input.  It is both
   useful to better support partly written files and to avoid loading a
   whole malformed file in memory while reading it as a string.

See section 7.8.4 "String Literals" of ECMA-262:
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf
2014-12-02 15:02:39 +01:00
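The two string-literal rules from commit 02bc3b3638 can be illustrated with a small sketch. This is not the parser's actual code: `read_js_string()` is a hypothetical helper, it works on an in-memory buffer, and it simplifies escape handling by keeping the escaped character verbatim and only treating `\n` as a line terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the real parser function.
 * Copies the body of a string literal that starts just after the opening
 * quote at src into out (capacity out_size, always NUL-terminated).
 * Returns the number of source bytes consumed, including the closing
 * quote when one is found. */
static size_t read_js_string(const char *src, char quote,
                             char *out, size_t out_size)
{
    size_t i = 0, n = 0;

    assert(src != NULL && out != NULL && out_size > 0);

    while (src[i] != '\0') {
        char c = src[i];

        if (c == quote) {       /* proper end of the literal */
            i++;
            break;
        }
        if (c == '\n')          /* rule 2: naked newline, stop without consuming it */
            break;
        if (c == '\\' && src[i + 1] != '\0') {
            char next = src[i + 1];

            i += 2;
            if (next == '\n')   /* rule 1: line continuation, drop both bytes */
                continue;
            c = next;           /* simplification: keep the escaped character */
        } else {
            i++;
        }
        if (n + 1 < out_size)   /* silently truncate when out is too small */
            out[n++] = c;
    }
    out[n] = '\0';
    return i;
}
```

Stopping at the naked newline bounds how much of a broken file can ever be swallowed as one string, which is the memory concern the commit message raises.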
Colomban Wendling
69a15cf2c1 javascript: Stop using longjmp() and friends
Fixes some memory leaks with malformed or partial files.
2014-12-02 15:02:13 +01:00
Colomban Wendling
94aa892c81 Merge pull request #373 from techee/go_ctags
Add a Go ctags parser.
2014-11-30 02:03:00 +01:00
Jiří Techet
ccb15a31be Add the go ctags parser
Make go one of the builtin filetypes, add the parser and update the related
source and config files. While there, remove Rust from [Groups] in
filetype_extensions.conf because it's already a builtin filetype as well.

The parser itself is stolen from the fishman/ctags repo.
2014-11-30 01:35:00 +01:00
Colomban Wendling
af7d63cdf2 Merge pull request #319 from b4n/better-txt2tags-parser
Better txt2tags parser
2014-11-29 23:40:58 +01:00
Colomban Wendling
5793694134 javascript: Add support for the let keyword
`let` is not yet part of the current ECMAScript standard but is part of
the ECMAScript 6 draft and is supported by Mozilla, and people already
use it in some contexts.

Also, the current ECMAScript standard marks `let` as a "considered
reserved word" (7.6.1.2), so it is already a reserved keyword in strict
mode.

See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/let
2014-11-24 03:00:27 +01:00
Colomban Wendling
ef8c40f1e4 javascript: Add support for the const keyword
`const` is not yet part of the current ECMAScript standard but is
part of the ECMAScript 6 draft and is supported by popular engines
including Mozilla and WebKit, and people already use it.

See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const
2014-11-24 02:59:08 +01:00
Colomban Wendling
b85d754610 javascript: Improve support for unterminated statements
Add support for implicit semicolons so that the ends of many
unterminated statements are properly recognized.

The implementation doesn't follow the ECMAScript standard because doing
so requires recognizing the precise grammar of all constructs, and the
parser doesn't currently work this way.  So instead it uses some
heuristics that should work most of the time and only considers implicit
semicolons where they would be explicitly relevant, to avoid most
false positives.  See the extensive comment in `readTokenFull()` for
details.

In practice, this mostly fixes handling of files using unterminated
variable assignments like the following:

    var v1 = 0
    var v2 = 1
    // ...
    function f1() {
        // ...
    }

In such situations the parser was previously unable to tell where
the variable assignment ended and would not recognize any
statement before the next semicolon or closing curly brace at the same
level.  In practice, it wouldn't have emitted any tag for this example,
not even `v1`, as it generates tags when reaching the statement's end.
2014-11-24 02:57:38 +01:00
Colomban Wendling
f2b368e2cc javascript: Report function signature 2014-11-24 02:55:44 +01:00
Colomban Wendling
f65dec49e7 javascript: Fix more handling of class-related unterminated statements 2014-11-24 02:48:55 +01:00
Colomban Wendling
9c84a91bb3 javascript: Fix scope inside nested blocks (if/else/while/for/etc.) 2014-11-24 02:45:22 +01:00
Colomban Wendling
e01ae923a1 javascript: Cleanup findCmdTerm() callers a bit
Move the check for unterminated statements inside `findCmdTerm()` itself
and return the result rather than having each caller do the check.
2014-11-24 02:43:44 +01:00
Colomban Wendling
7e6215661e javascript: Fix handling some class-related unterminated statements 2014-11-24 02:43:29 +01:00
Colomban Wendling
5a1a22d930 javascript: Properly handle nested unknown blocks
Properly match open curly braces when parsing a statement, so as not to
get fooled by unexpected nested blocks, e.g. after a `switch`'s `case`
or a label.

This mostly reverts c54c3ad5e815d16e3b48f3c477465627808aadee and
replaces it with a more correct and complete solution.
2014-11-24 02:41:57 +01:00
Colomban Wendling
f158f5d362 javascript: Fix scope after some constructs 2014-11-24 02:41:17 +01:00
dfishburn
f886f7084a Javascript parser: Removed warning of unused variable is_inside_class and cleanUp
git-svn-id: https://svn.code.sf.net/p/ctags/code/trunk@815 c5d04d22-be80-434c-894e-aa346cc9e8e8
2014-11-24 02:00:24 +01:00
dfishburn
c54c3ad5e8 Added new method findMatchingToken() to skip over blocks of code. Updated parseSwitch() to use the new method.
Updated tests/ctags/ui5.ui.controller.js with code that broke the switch statement.

git-svn-id: https://svn.code.sf.net/p/ctags/code/trunk@809 c5d04d22-be80-434c-894e-aa346cc9e8e8
2014-11-24 01:48:04 +01:00
dfishburn
52d2d73527 Updated jscript to be able to parse SAPUI5 controller and view files.
Added test case tests/ctags/ui5.controller.js

git-svn-id: https://svn.code.sf.net/p/ctags/code/trunk@808 c5d04d22-be80-434c-894e-aa346cc9e8e8
2014-11-24 01:39:36 +01:00
Colomban Wendling
f765463af0 Import new CSS parser from fishman-ctags
Some highlights:
* Fixes handling of comments
* Adds support for attribute and namespace selectors
* Adds support for @supports blocks
* Fixes tag type for many selectors
* Adds support for pseudo-classes with arguments
2014-11-11 02:01:41 +01:00
Colomban Wendling
6a0673f4ae TM: Don't allow passing NULL to tm_workspace API 2014-11-08 18:32:41 +01:00
Colomban Wendling
b38f1f99d5 TM: Use gsize everywhere for the memory buffer size 2014-11-08 18:13:13 +01:00
Jiří Techet
f441a121d3 Parse file from buffer only if the file isn't too big
By loading e.g. a huge DB dump into memory we could run out of memory.
Check the size of the file and determine whether to use file parsing
or buffer parsing.

Give up early if LANG_IGNORE set - there's no need to read the file in
this case.
2014-11-06 11:39:40 +01:00
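The decision described in commit f441a121d3 can be sketched as a small pure function. The names `ParseMode` and `choose_parse_mode()` and the caller-supplied size limit are assumptions made for illustration; the commit message does not state the actual threshold or how it is stored.

```c
#include <assert.h>

/* Hypothetical result type; the real code does not necessarily use an enum. */
typedef enum {
    PARSE_SKIP,         /* language is ignored: do not even read the file */
    PARSE_FROM_BUFFER,  /* small enough to load into memory */
    PARSE_FROM_FILE     /* too big: let the parser read from disk */
} ParseMode;

/* Decides how a source file should be parsed.
 * max_buffer_size is an assumed, caller-chosen limit in bytes. */
static ParseMode choose_parse_mode(int lang_ignored,
                                   unsigned long long file_size,
                                   unsigned long long max_buffer_size)
{
    assert(max_buffer_size > 0);

    if (lang_ignored)                  /* give up early, as the commit describes */
        return PARSE_SKIP;
    if (file_size > max_buffer_size)   /* e.g. a huge database dump */
        return PARSE_FROM_FILE;
    return PARSE_FROM_BUFFER;
}
```

Checking the ignore flag first means the file size never has to be queried for files that will not be parsed at all.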
Jiří Techet
0ad85aee04 Fix comment wording 2014-11-05 21:50:07 +01:00
Jiří Techet
a95fc1a994 Don't expose the source file update function to plugins 2014-11-05 21:50:07 +01:00
Jiří Techet
90944c77b0 Unify tag sorting and simplify tag comparison function
Use the same (or compatible) sorting criteria everywhere.

Add tm_tag_attr_line_t to sort options so even after merging file tags
into workspace tags, the same tags defined at different lines are preserved
and not removed as duplicates.

Sort type before scope because it's cheaper to compare (string vs int comparison).

For some reason, the above changes make the sorting performance worse.
Simplify the tag comparison function a bit and reorder the case statements
in the switch to match the sort order. This (again, not sure why) restores
the previous performance.
2014-11-05 21:50:07 +01:00
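The sorting criteria from commit 90944c77b0 can be sketched with a simplified comparison function. The `Tag` struct below is a hypothetical stand-in for the real tag structure, and placing the name first is an assumption; only "type before scope" and "line as a sort key" come from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified tag record; the real structure has more fields. */
typedef struct {
    const char *name;
    int type;            /* tag kind, compared as an integer */
    const char *scope;   /* "" when the tag has no scope */
    unsigned long line;
} Tag;

/* qsort()-compatible comparison: name, then type, then scope, then line. */
static int compare_tags(const void *a, const void *b)
{
    const Tag *ta = a, *tb = b;
    int r;

    assert(ta->name && tb->name && ta->scope && tb->scope);

    r = strcmp(ta->name, tb->name);
    if (r != 0)
        return r;
    if (ta->type != tb->type)           /* cheap integer check before scope */
        return ta->type < tb->type ? -1 : 1;
    r = strcmp(ta->scope, tb->scope);
    if (r != 0)
        return r;
    if (ta->line != tb->line)           /* same tag on another line is distinct */
        return ta->line < tb->line ? -1 : 1;
    return 0;
}

static void sort_tags(Tag *tags, size_t count)
{
    qsort(tags, count, sizeof *tags, compare_tags);
}
```

Because the line number takes part in the comparison, two otherwise identical tags on different lines never compare equal, so a later duplicate-removal pass based on the same function keeps both.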
Jiří Techet
29000cf104 Always parse buffer instead of file
This brings the linux kernel parsing time from about 20s to 12s on my machine.
2014-11-05 21:50:07 +01:00
Jiří Techet
448f1fd20e Make sure gboolean tm_source_file_parse() returns the right value
Previously, after finishing the while loop, TRUE was returned. This is
wrong because the loop only keeps running while parsing is unsuccessful.
Make it work the same way as in ctags (parser() always succeeds,
parser2() returns whether to retry or not).

(The return value actually isn't used, it's just to make sure we know
what we are doing.)
2014-11-05 21:50:07 +01:00
Jiří Techet
f35d0b9c0c Move tm_get_current_tag() from tm_workspace to tm_tag
This function has nothing to do with the workspace so it rather belongs
to tm_tag.
2014-11-02 11:40:05 +01:00
Jiří Techet
71cc1ecb20 Cleaner and safer TMWorkspace API
With the previous TMWorkspace API it was possible to make the workspace
inconsistent by e.g. removing source files and forgetting to update
workspace. This could lead to non-obvious and not immediately visible
crashes.

The new set of public (but also Geany-private) API calls always
updates the workspace accordingly, and none of the calls can lead
to an inconsistent state of the workspace.

In addition, perform some minor cleanups and simplifications - unify
parsing from buffer and from file, support "parsing" of 0-sized buffers
and improve documentation.
2014-11-02 11:39:57 +01:00
Colomban Wendling
48718f4b79 Remove an unused parameter 2014-10-31 20:25:59 +01:00
Colomban Wendling
013c47c6e3 TM: Fix various integer signedness issues 2014-10-31 20:25:13 +01:00
Colomban Wendling
61ee7de86e Return an unsigned tag count in tm_tags_find()
There is no reason to return a signed value and it helps unify
callers' types.
2014-10-31 20:20:25 +01:00
Colomban Wendling
42a9603f4a Use TMTagType everywhere to hold tag types 2014-10-31 20:07:27 +01:00
Jiří Techet
d7ed48f86b Fix a problem in tm_tags_remove_file_tags() when more tags of the same name exist 2014-10-31 02:03:13 +01:00
Jiří Techet
cb9e4bbf74 Microoptimization in merge
Improves performance by about 10%.
2014-10-30 22:08:17 +01:00
Jiří Techet
bdee1336aa Keep a separate list of typenames for Scintilla syntax highlighting
Manage the list the same way as workspace tags_array by the fast tag removal
and merge. Thanks to this, typename tags don't have to be extracted from
tags_array periodically, which speeds up editing.
2014-10-30 22:08:17 +01:00
Jiří Techet
32a3dfab7f Use binary search when removing file tags
Even though the binary search requires expensive string comparisons,
there are just log(n) of them to find the tag in the workspace array
and the result is much faster than scanning the array linearly. This
of course works only under the condition that

len(source_file->tags_array) << len(workspace_array)

This is however satisfied for big projects (and doesn't matter for small
projects).

Also make the tm_tags_find() function more user friendly by returning
tagCount 0 when no tags are found.
2014-10-30 22:08:17 +01:00
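The idea behind commit 32a3dfab7f can be sketched on plain sorted string arrays. This is an illustration under simplifying assumptions, not the actual implementation: tags are reduced to their names, and `lower_bound()` and `remove_file_tags()` are hypothetical helpers. Each file tag is located with a binary search, marked, and the workspace array is compacted once at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Index of the first element that is >= name in a sorted array. */
static size_t lower_bound(const char *const *tags, size_t count,
                          const char *name)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (strcmp(tags[mid], name) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Removes one occurrence of each file tag from the sorted workspace
 * array and returns the new element count. */
static size_t remove_file_tags(const char **workspace, size_t ws_count,
                               const char *const *file_tags, size_t file_count)
{
    bool *removed;
    size_t i, kept = 0;

    assert(workspace != NULL || ws_count == 0);
    if (ws_count == 0)
        return 0;
    removed = calloc(ws_count, sizeof *removed);
    if (removed == NULL)
        return ws_count;    /* out of memory: leave the array untouched */

    for (i = 0; i < file_count; i++) {
        /* about log2(ws_count) string comparisons per file tag */
        size_t pos = lower_bound(workspace, ws_count, file_tags[i]);

        /* several tags may share a name: take the first one still present */
        while (pos < ws_count && strcmp(workspace[pos], file_tags[i]) == 0) {
            if (!removed[pos]) {
                removed[pos] = true;
                break;
            }
            pos++;
        }
    }
    for (i = 0; i < ws_count; i++)   /* single compaction pass */
        if (!removed[i])
            workspace[kept++] = workspace[i];
    free(removed);
    return kept;
}
```

With m file tags and n workspace tags this costs roughly m * log(n) comparisons plus one linear compaction, which is why it pays off when m is much smaller than n.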
Jiří Techet
be131b00f9 Extend dedup() and merge() to unref the duplicate tag when needed 2014-10-30 22:08:17 +01:00
Jiří Techet
6ba3bb46a4 Don't pass arguments to search/sort functions using static variables
Instead of qsort() it's possible to use g_ptr_array_sort_with_data() with
similar performance. Unfortunately it seems there's no bsearch_with_data()
anywhere so the patch uses a modified bsearch() implementation from libc
(still probably cleaner than passing arguments using static variables).
2014-10-30 22:08:17 +01:00
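A binary search that takes a user-data pointer, as commit 6ba3bb46a4 describes, might look like the sketch below. It is written from scratch for illustration and is not the modified libc code used by the patch; `bsearch_with_data()`, `CompareDataFunc` and `compare_prefix()` are names assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Comparison callback that receives extra context instead of reading
 * it from static variables. */
typedef int (*CompareDataFunc)(const void *key, const void *elem,
                               void *user_data);

/* Like bsearch(), but forwards user_data to the comparison callback. */
static void *bsearch_with_data(const void *key, const void *base,
                               size_t nmemb, size_t size,
                               CompareDataFunc compare, void *user_data)
{
    const char *bytes = base;
    size_t lo = 0, hi = nmemb;

    assert(compare != NULL);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const void *elem = bytes + mid * size;
        int r = compare(key, elem, user_data);

        if (r == 0)
            return (void *) elem;
        if (r < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}

/* Example callback: matches when the first *user_data bytes are equal.
 * key is a C string, elem points to a C string pointer. */
static int compare_prefix(const void *key, const void *elem, void *user_data)
{
    const size_t *len = user_data;

    return strncmp(key, *(const char *const *) elem, *len);
}
```

Passing the prefix length through `user_data` keeps the comparison reentrant, which is the cleanliness argument the commit message makes against static variables.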
Jiří Techet
15c90b63c9 Get rid of lazy initialization in TM
Lazily initializing various member pointers doesn't bring any real
performance improvement but requires lots of additional NULL checks, so
get rid of it.

Make some more cleanups on the way.

In addition, remove success/failure return values from tm_workspace_add_source_file()
and tm_workspace_remove_source_file() which have no real use.
2014-10-30 22:08:17 +01:00
Jiří Techet
43b8ab8d23 Only keep the minimal set of parameters in the TM API calls
Avoid "utility" parameters like do_free for which we already have API calls
and which actually don't perform any free if the source file isn't
in TM. Clarify when to set the update_workspace parameter.
2014-10-30 22:08:17 +01:00
Jiří Techet
0285ec28a5 Move tm_source_file_update() to tm_workspace.c
The placement of this function in tm_source_file is not right - by moving
it to the workspace we can make the source file unaware of the existence
of the workspace (no inclusion of tm_workspace.h in tm_source_file any
more). Also change tm_source_file_new() so it doesn't offer the source file
update.

After this change
* TMWorkspace knows TMSourceFile and TMTag
* TMSourceFile knows TMTag
* TMTag knows TMSourceFile
2014-10-30 22:08:17 +01:00
Jiří Techet
a183d9cb97 Move the refcount TMTag member up in the structure and don't document it for plugins 2014-10-22 16:58:38 +02:00