149 Commits

Author SHA1 Message Date
Colomban Wendling
585b16b0da COBOL: Import new upstream candidate parser
This fixes support for COBOL symbols after the recent breakage of regex
parsers, as well as introducing additional features and bug fixes.

Also import some of the tests.

https://github.com/universal-ctags/ctags/pull/2076

Part of #2119.
2019-04-20 11:36:03 +02:00
Jiří Techet
a1cf475fcf Sync ctags with upstream so that most parsers can be copied from uctags (#2018)
* Use latest version of htable

* Use latest version of mio

* Use latest version of objpool

* Use latest version of ptrarray

* Use latest version of vstring

This also requires adding trashbox.c/h which is now used by vstring and
inline macros from inline.h.

* Rename fieldSpec to fieldDefinition

See b56bd065123d69087acd6f202499d71a86a7ea7a upstream.

* Rename kindOption to kindDefinition

See e112e8ab6e0933b5bd7922e0dfb969b1f28c60fa upstream

* Rename kinds field in parserDefinition to kindTable

See 09ae690face8b5cde940e2d7cf40f8860381067b upstream.

* Rename structure fields about field in parserDefinition

See a739fa5fb790bc349a66b2bee0bf42cf289994e8 upstream.

* Use kindIndex instead of kindDefinition

This patch replaces kindDefinition related entries from sTagEntryInfo
with kindIndex so kinds are referenced indirectly using the index. For
more info please refer to commits:

16a2541c0698bd8ee03c1be8172ef3191f6e695a
f92e6bf2aeb21fd6b04756487f98d0eefa16d9ce

Some other changes had to be made to make the sources compile (without
bringing all the diffs from upstream). At some places, which aren't used
by Geany, only stub implementations have been created.

In particular, the regex parser has been disabled (for now?) because its
current implementation doesn't allow accessing kindDefinitions using
index and allowing this would require big changes in its implementation.
The affected parsers are Cobol, ActionScript and HTML. For HTML we can
use the token-based parser from upstream, and we should consider
whether Cobol and ActionScript are worth the effort to maintain a separate
regex implementation using GRegex (IMO these languages are dead enough
not to justify the extra effort).

The patch also disables tests for languages using regex parsers.

* Rename roleDesc to roleDefinition

See 1345725842c196cc0523ff60231192bcd588961b upstream. Since we don't care
about roles in Geany, we don't have to do the additional stuff the upstream
patch does.

* Add XTAG_ANONYMOUS used by jscript

See 0e4c5d4a0461bc8d9616fe3b97d75b91d014246e upstream.

* Include stdint.h in entry.h

* Don't use hash value as an Anonymous field identifier

Instead of something like "Anonymous0ab283cd9402" use sequential integer
values like "Anonymous1".
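
A minimal sketch of the sequential naming idea, assuming a single global
counter that is reset between input files. The names `anon_counter`,
`anon_reset_sketch()` and `anon_generate_sketch()` are hypothetical and are
not the actual ctags functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter; not the actual ctags state. */
static unsigned int anon_counter = 0;

/* Restart the numbering, e.g. before parsing a new file. */
static void anon_reset_sketch(void)
{
    anon_counter = 0;
}

/* Write "<prefix><n>" into buf, with n counting up from 1. */
static void anon_generate_sketch(char *buf, size_t size, const char *prefix)
{
    assert(buf != NULL && size > 0 && prefix != NULL);
    snprintf(buf, size, "%s%u", prefix, ++anon_counter);
}
```

Unlike a hash-based suffix, the generated names stay stable and readable
across runs as long as the anonymous items appear in the same order.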

* Call anonReset in main part

See 3c91b1ea509df238feb86c9cbd552b621e462653 upstream.

* Use upstream javascript parser

* Use upstream css parser

* Create correctly sized MIO for 0 size

See https://github.com/universal-ctags/ctags/pull/1951

* Always enable promise API and subparsers for Geany

* Support subparsers in Geany and add HTML parser demonstrating this feature

This feature requires several changes:

1. Propagating language of the tag from ctags to Geany so we know whether
the tag comes from a master parser or a subparser.

2. We need to address the problem that tag types from subparsers can
clash with tag types from master parsers or other subparsers used by the
master parser. For instance, HTML and both its css and javascript
subparsers use tm_tag_class_t but HTML uses it for <h2> headings, and
css and javascript for classes. Representing all of them using
tm_tag_class_t would lead to a complete mess where all of these types would
for instance be listed in the same branch of the tree in the sidebar.

To avoid this problem, this patch adds another mapping for subparsers where
each tag type can be mapped to another tag type (which is used neither
by the master parser nor by other subparsers). To avoid unwanted clashes with other
parsers, only tags explicitly mentioned in such mappings are added to tag
manager; other subparser tags are discarded.

For HTML this patch introduces mapping only for tm_tag_function_t (which
in this case maps to the same type) to mimic the previous HTML parser
behavior but other javascript and css tag types can be added this way
in the future too.

3. Since in most of the code Geany and tag manager assume that tags from
one file use the same language, subparser's tags are modified to have the
same language as the master parser.

4. HTML parser itself was copied from upstream without any modifications.
Tests were fixed as the parser now correctly ignores comments.
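
The mapping described in point 2 could look roughly like the sketch below.
The `TagType` enum, `SubparserMapEntry` and `map_subparser_type()` are
hypothetical stand-ins for illustration only, not the tag manager's real
types or API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for tag manager tag types. */
typedef enum { TAG_UNDEF = 0, TAG_CLASS, TAG_FUNCTION, TAG_MEMBER } TagType;

typedef struct {
    TagType orig;    /* type reported by the subparser */
    TagType mapped;  /* type stored in the tag manager */
} SubparserMapEntry;

/* Only function tags are kept for this subparser, mapped to themselves. */
static const SubparserMapEntry html_js_map[] = {
    { TAG_FUNCTION, TAG_FUNCTION },
};

/* Return the mapped type, or TAG_UNDEF if the tag should be discarded. */
static TagType map_subparser_type(const SubparserMapEntry *map, size_t n,
                                  TagType orig)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (map[i].orig == orig)
            return map[i].mapped;
    return TAG_UNDEF;
}
```

With a table like this, a subparser class tag has no entry and is dropped,
so it cannot end up in the same sidebar branch as the HTML headings.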

* Rename truncateLine field of tagEntryInfo

See 0e70b22791877322598f03ecbe3eb26a6b661001 upstream. Needed for Fortran
parser.

* Add dummy mbcs.h and trace.h

Included by javascript parser.

* Introduce an accessor to `locate' field of `Option'

See fb5ef68859f71ff2949f1d9a7cab7515f523532f upstream. Needed for Fortran.

* Add numarray.c/h

Needed by various parsers.

* Add getLanguageForFilename() and getLanguageForCommand()

See

416c5e6b8807feaec318d7f8addbb4107370c187
334e072f9d6d9954ebd3eb89bbceb252c20ae9dd

upstream. Needed for Sh parser.

* txt2tags: Fix scope separator definition and re-enable tests

* Rename rest.c to rst.c to match upstream filename

* Use upstream asciidoc and rst parsers

* Add asciidoc and rst unit tests

* Rename conf.c to iniconf.c to match upstream filename

* Add tests of conf, diff, md parsers from universal ctags

* Add more ctags unit tests

This patch adds unit tests for: nsis, docbook, haskell, haxe, abaqus, vala,
abc.

The only missing unit tests are for GLSL and Ferite parsers which
however share the implementation with the C parser and should be
reasonably well covered by other C-like language tests.

The tests were put together from various tutorials and help of the
languages in order to cover the tags these parsers generate. No guarantee
they'd compile with real parsers.

* Rename latex.c to tex.c to match upstream filename

* Rename entry points of parsers to match upstream names

* Initialize trashbox

* Add newline to the end of file
2019-04-06 12:14:30 +10:00
Colomban Wendling
4452b365bf Merge pull request #1263 from techee/ctags_sync_main
First part of syncing with Universal-CTags.
2018-12-17 21:37:43 +01:00
Colomban Wendling
8b68c5a2ca Add a test for the processing order when generating a tags file 2018-11-12 11:47:25 +01:00
Jiří Techet
99e0f208b2 Merge branch 'master' into ctags_sync_main
# Conflicts:
#	ctags/main/lcpp.c
#	ctags/main/parse.c
2018-10-13 14:25:12 +02:00
kloun
adc22a453b bash may not be found on the system (#1574)
For example, on OpenBSD.
2017-08-08 14:40:58 +10:00
Colomban Wendling
6f692112e3 C: Fix line continuation handling (#1370)
Escaped newlines were properly handled inside preprocessor directives,
but not otherwise.

Seeing `continue` here suggests the code used to work a long time ago
but some loop refactoring broke it, as now it would not stay in the
loop unless in a preprocessor directive.  Or maybe it only ever worked
for preprocessor directives, and the `continue` was superfluous?

Fixes #1370.
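
For illustration, a simplified reader that joins escaped newlines everywhere,
not only inside preprocessor directives. This is a sketch over an in-memory
string; `next_char_joined()` is a hypothetical helper, not the code changed
by this commit:

```c
#include <assert.h>
#include <stddef.h>

/* Return the next character of src at *pos, skipping every
 * backslash-newline pair; returns 0 at the end of the string. */
static int next_char_joined(const char *src, size_t *pos)
{
    assert(src != NULL && pos != NULL);
    while (src[*pos] == '\\' && src[*pos + 1] == '\n')
        *pos += 2;
    if (src[*pos] == '\0')
        return 0;
    return (unsigned char) src[(*pos)++];
}
```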
2017-04-20 16:57:02 +10:00
Jiří Techet
8455f8e70d lcpp, c: Fix signature collection
First, make sure that when calling cppGetc() and cppUngetc() the signature
is properly updated.

Second, make sure that signature is cleared when preparing for new token
read.
2017-02-15 00:31:10 +01:00
Jiří Techet
63fbe2f6a2 c.c, lcpp.c: Avoid using File.mio
In uctags File is made private and mio gets inaccessible. At the moment
it's used by c.c and lcpp.c to get the parameter list. The C parser
"marks" the position where the argument list starts and once the right
")" is reached, string corresponding to this range is read from MIO,
filtered and used for parameter list.

For macro parameters the end of parameter list is handled in a slightly
obfuscated way - since the code from read.c reads the code by lines,
getInputFilePosition() returns the position of EOL so the parameter list
is read between '(' and EOL.

The code had to be modified to collect the potential parameter string
on the way - vString *signature has been added to lcpp.c and every
getcFromInputFile() and ungetcToInputFile() has been converted to
getcAndCollect() and ungetcAndCollect(), respectively, which in addition
perform the parameter collection when needed. Unfortunately this involves
many places in lcpp.c and we must be careful to always use these instead
of the standard ones from read.c.

We cannot rely on the implicit reading of whole lines and must add such
a code ourselves: just plain reading and collecting is enough. In addition
I added handling of multi-line signatures which was missing before.

In tests "bug1585745.cpp" and "cpp_destructor.cpp" the new code fixes
missing () in destructors when there's a space between tilde and name.
In "simple.d" test it fixes wrong function prototype.

The output of test "bug507864.c" seems to be worse than before but it was
already broken before and apparently the compiler is confused by it.
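
The collect-while-reading approach can be sketched as follows. This is a
much simplified model over an in-memory string; the `Reader` struct and the
snake_case function names are hypothetical, and the real getcAndCollect()
and ungetcAndCollect() in lcpp.c differ in detail:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    const char *src;   /* input text */
    size_t pos;        /* read position in src */
    char sig[256];     /* collected signature, NUL-terminated */
    size_t sig_len;
    bool collecting;   /* true while inside a parameter list */
} Reader;

/* Read one character and append it to the signature when collecting.
 * Characters that do not fit into sig are silently dropped. */
static int getc_and_collect(Reader *r)
{
    assert(r != NULL);
    int c = (unsigned char) r->src[r->pos];
    if (c == '\0')
        return EOF;
    r->pos++;
    if (r->collecting && r->sig_len + 1 < sizeof r->sig) {
        r->sig[r->sig_len++] = (char) c;
        r->sig[r->sig_len] = '\0';
    }
    return c;
}

/* Push the last character back and drop it from the signature again. */
static void ungetc_and_collect(Reader *r, int c)
{
    assert(r != NULL);
    if (c == EOF || r->pos == 0)
        return;
    r->pos--;
    if (r->collecting && r->sig_len > 0)
        r->sig[--r->sig_len] = '\0';
}
```

A caller would set `collecting` when it sees the opening parenthesis and
clear it after the matching ")", so no second pass over the input is needed.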
2016-10-11 00:03:18 +02:00
Colomban Wendling
60147a8c8d Merge pull request #857 from techee/cpp_h
Treat the "h" extension as a C++ file
2016-06-10 23:30:02 +02:00
Masatake YAMATO
9bc5857f89 make: fix a typo in parenthesis handling
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Colomban Wendling <ban@herbesfolles.org>
X-Universal-CTags-Commit-ID: 39c1236cc1a40aac6b93c60537d30489912bbbb2
X-Universal-CTags-Issue: universal-ctags/ctags#944
2016-05-31 15:35:49 +02:00
Colomban Wendling
83c2a0de69 Merge branch 'ruby/uctags-update'
Update Ruby parser from Universal-CTags.

Closes #587.
2016-03-19 23:22:38 +01:00
Colomban Wendling
116c749cd9 ruby: Report singleton type inside anonymous classes at a class level
New test case contributed by @masatake in universal-ctags/ctags#456.

Closes universal-ctags/ctags#455 and universal-ctags/ctags#456.
2016-03-14 19:27:22 +01:00
Colomban Wendling
17606d8af7 ruby: Fix scope after anonymous classes 2016-03-14 19:27:22 +01:00
Colomban Wendling
1ed29f1d7b ruby: Fix parsing qualified identifiers
The implementation is a bit hacky, but avoids the need for complex
logic to pop several scopes at once.

Closes universal-ctags/ctags#452.
2016-03-14 19:27:22 +01:00
Colomban Wendling
eaf6c82af8 ruby: Fix keyword matching
After an identifier there can be anything but an identifier character.
2016-03-14 19:27:22 +01:00
Colomban Wendling
e003da2bea ruby: Properly skip documentation contents 2016-03-14 19:27:22 +01:00
Masatake YAMATO
e9e9b9988d ruby: handle singleton method including ?!= in its name(sf.bug:364)
This patch is intended to fix a bug reported as sf.bug:364.
https://sourceforge.net/p/ctags/bugs/364/

Dmitry Gutov helped with writing the test case.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2016-03-14 19:27:22 +01:00
Jiří Techet
e513e5a099 Move filetypes.* to the filedefs directory 2016-03-13 17:35:08 +01:00
solawing
bef06691b5 objc: fix property parser won't exit bug
Based on 21e74e6a019975045a7975bc611ae63f0917f976 from universal-ctags,
and update the tests accordingly, thanks to @JX7P.

Closes #940.

X-Universal-CTags-Commit-ID: 21e74e6a019975045a7975bc611ae63f0917f976
2016-03-07 12:50:52 +01:00
Colomban Wendling
2c99c8827f c++: Fix a test result
748137bd1dfa648948d9d127aa3e27b6857db764 improved return types, but as
this test case was added in parallel it wasn't updated as needed for
the new, more correct, results.
2016-02-11 15:48:50 +01:00
Colomban Wendling
e38c7e3b67 Merge pull request #879 from b4n/c/cxx11-raw-strings
c++: Fix parsing of C++11 raw string literals.
2016-02-11 15:26:32 +01:00
Colomban Wendling
748137bd1d C, C++, C#, D: Improve return type and var type recognition
This is far from perfect and contains a lot of guessing.  It showed
good results based on our test cases, fixing several issues and not
introducing any more issues (admittedly, after working around a subtle
one regarding D static ifs).

Closes #845.
2016-01-26 16:18:11 +01:00
Colomban Wendling
5a279f0bf6 c++: Fix parsing of prefixed C++11 raw string literals
See http://en.cppreference.com/w/cpp/language/string_literal
2016-01-24 17:33:32 +01:00
Colomban Wendling
a14aa908c5 c++: Fix parsing of C++11 raw string literals
See http://en.cppreference.com/w/cpp/language/string_literal

Closes #877.

---

This contains a pretty ugly hack to fetch the previous character, in
order not to get fooled by string concatenation hidden behind a macro,
like in `FOUR"five"`, which is not a raw string literal but simply the
identifier `FOUR` followed by the string `"five"`.

While this may sound uncommon, it is not, and it led to complaints [2][3]
when Scintilla [1] broke this when they introduced C++11 raw string
literal support themselves.
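
The previous-character check can be illustrated with the sketch below. It
works on an in-memory buffer and only handles a bare `R`, not the encoding
prefixes; `opens_raw_string()` and `is_ident_char()` are hypothetical names,
not the parser's actual code:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

static bool is_ident_char(int c)
{
    return isalnum(c) || c == '_';
}

/* True if the '"' at src[quote] opens a raw string literal, i.e. it is
 * directly preceded by an `R` that is not the tail of an identifier. */
static bool opens_raw_string(const char *src, size_t quote)
{
    assert(src != NULL && src[quote] == '"');
    if (quote < 1 || src[quote - 1] != 'R')
        return false;
    return quote == 1 || !is_ident_char((unsigned char) src[quote - 2]);
}
```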

The implementation here still contains a bug with line continuations: a
raw literal of the form:

```c
const char *str = R\
"xxx(...)xxx";
```

is not properly recognized as such, although it's perfectly valid (yet
probably very uncommon).  For the record, Scintilla also suffers
from this but nobody complained about it yet.

[1] http://scintilla.org/
[2] https://sourceforge.net/p/scintilla/bugs/1207/
[3] https://sourceforge.net/p/scintilla/bugs/1454/
2016-01-23 21:52:40 +01:00
Colomban Wendling
1c4a9d8dd3 C++: Fix parsing of global scope qualifiers in base class lists
See also https://sourceforge.net/p/ctags/bugs/194/

I didn't use the exact upstream patch only altering the C++ code path,
because as far as I know no c.c language recognizes two consecutive
colons separated by whitespace as a single token, so there's no point
in carrying on mistakes from the past.
2016-01-17 04:03:24 +01:00
Colomban Wendling
440a736018 C++, C#: Properly set scope on namespaces
Closes #871.
2016-01-17 03:30:06 +01:00
Jiří Techet
c7bf89a464 Treat the "h" extension as a C++ file
The extension is used by both C and C++ and lexing/parsing C headers with
the C++ parser causes fewer problems (identifiers named like C++ keywords
get highlighted and tags aren't generated for them) than parsing C++
headers with the C parser (parsing and lexing completely broken).
2016-01-07 23:28:18 +01:00
Ben Wiederhake
2df9f83bf2 Typos overlooked by codespell 2016-01-03 18:44:00 +01:00
Ben Wiederhake
29a6b9c003 Fix typos
All of these typos were found by codespell, so credit goes to
the authors of this incredibly useful tool.

I manually confirmed and adapted all changes, which includes
reflowing over-long lines or filling up with spaces for alignment.

Some of these typos may need forwarding to their original authors.
codespell reported a lot of words where I am unsure; I have not
included those corrections.
2016-01-03 18:33:25 +01:00
Colomban Wendling
b2879e9fca tagmanager: Fix handling of scopes starting with a non-ASCII character
Fix handling of scopes starting with a non-ASCII character.

Actually, just drop the check on the first byte of the scope, as it
doesn't seem to serve any purpose as it only checks the first byte (so
isn't any kind of real validation; and as it predates Geany it's
impossible to know the real reason behind this check), and breaks
support for non-ASCII scopes.
2015-10-12 19:20:02 +02:00
Pavel Sountsov
568787bc2f Change Rust tests to be in line with the ones in the universal-ctags tests. 2015-08-21 21:21:18 -07:00
Pavel Sountsov
91ee437640 Parse 'where' bounds correctly. 2015-08-21 21:19:47 -07:00
Pavel Sountsov
6814fc1a62 Update Rust keywords. 2015-08-21 21:17:59 -07:00
Colomban Wendling
96d5eec50f Merge pull request #544 from b4n/cxx11-override
c++: Properly parse C++11 overrides, finals and noexcepts
2015-07-08 18:04:05 +02:00
Enrico Tröger
4017442f86 Merge pull request #477 from eht16/ctags_powershell
Add PowerShell tag parser
2015-07-04 12:52:46 +02:00
Colomban Wendling
4476ed9c4b c++: Add a small test combining various C++14 things 2015-07-01 12:55:52 +02:00
Colomban Wendling
f60b31385e c++: Handle C++11 noexcept 2015-07-01 12:55:52 +02:00
Colomban Wendling
95a0d4db7e c++: Properly parse C++11 override and final members
As `override` and `final` aren't real keywords, handle them manually
not to break identifiers of those names.
2015-07-01 12:55:29 +02:00
Colomban Wendling
641863c264 c++: Fix handling of the final contextual keyword
`final` is not a normal keyword, as it only has a special meaning in
some specific context.  So, use a special case instead of a keyword not
to break identifiers of that name.
2015-06-30 23:22:08 +02:00
Enrico Tröger
8a6fbd9786 Add PowerShell tag parser 2015-06-28 15:46:23 +02:00
Colomban Wendling
46a123d6fe python: Fix handling of inline comments
If there were two hashes (#) in an inline comment, only the content
between the two was considered a comment.
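
As an illustration of the intended behavior, a comment should run from the
first hash outside a string to the end of the line, however many hashes
follow. The sketch below assumes single-line input and simple quoting only;
`find_comment()` is a hypothetical helper, not the parser's code:

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the '#' that starts the comment on this line, or
 * NULL if there is none. Hashes inside '...' or "..." are skipped. */
static const char *find_comment(const char *line)
{
    assert(line != NULL);
    char quote = '\0';
    for (const char *p = line; *p != '\0'; p++) {
        if (quote != '\0') {
            if (*p == '\\' && p[1] != '\0')
                p++;                 /* skip the escaped character */
            else if (*p == quote)
                quote = '\0';        /* string ends */
        } else if (*p == '"' || *p == '\'') {
            quote = *p;              /* string starts */
        } else if (*p == '#') {
            return p;                /* rest of the line is comment */
        }
    }
    return NULL;
}
```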

X-Universal-CTags-Commit-ID: ee93f5b9f393e76a850cf8c894cc748a62981156
2015-06-25 22:47:32 +02:00
Colomban Wendling
dbbc042786 c family: Add support for digraphs
See http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf 6.4.6§3.
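
For reference, the two-character digraphs from that section map to single
punctuators as sketched below (`%:%:` for `##` is left out for brevity).
`digraph_to_char()` is a hypothetical helper, not the function added by
this commit:

```c
/* Return the punctuator a two-character digraph stands for,
 * or 0 if c1 c2 is not a digraph. */
static int digraph_to_char(int c1, int c2)
{
    if (c1 == '<' && c2 == ':') return '[';
    if (c1 == ':' && c2 == '>') return ']';
    if (c1 == '<' && c2 == '%') return '{';
    if (c1 == '%' && c2 == '>') return '}';
    if (c1 == '%' && c2 == ':') return '#';
    return 0;
}
```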

Note: This is not exactly the upstream Universal CTags commit because
it depends on another change for adding the `enter` label, which was
then included here.

X-Universal-CTags-Commit-ID: 3b3b60c7664a321a31ec87de336fc6bda90c405e
2015-06-14 18:25:38 +02:00
Colomban Wendling
b737f031ca c family: Fix trigraph handling
X-Universal-CTags-Commit-ID: d6d1a0f2b90a600bdec9cd6ba964ee69382743e4
2015-06-14 18:23:29 +02:00
Colomban Wendling
944bffb967 json: Fix handling of tags containing a dot
X-Universal-CTags-Commit-ID: 7ae28a3d8a7ad5f8a9d6399a4e357fcf19ad2b2e
2015-06-14 17:29:04 +02:00
Colomban Wendling
d75598cc48 python: Fix resetting scope on anonymous blocks
The previous fix, coming from [CTags bug #1988026], was incorrect if
the parent was not a root-level element, as it checked the level name
(unqualified) against the parent name (qualified).

However, there is no need to check the level name; all that counts is
the indentation level itself: if it's smaller than an existing level,
it ends it.
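
A sketch of the indentation-only rule, assuming each nesting level records
the indentation of the line that opened it, so that a later line at the same
or a smaller indentation closes it. `LevelStack`, `push_level()` and
`close_levels()` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 64

typedef struct {
    int indent[MAX_LEVELS];  /* indentation of the line opening each level */
    size_t depth;
} LevelStack;

static void push_level(LevelStack *s, int indent)
{
    assert(s != NULL && s->depth < MAX_LEVELS);
    s->indent[s->depth++] = indent;
}

/* Close every level that a line with this indentation terminates.
 * No names are compared; only the indentation matters. */
static void close_levels(LevelStack *s, int indent)
{
    assert(s != NULL);
    while (s->depth > 0 && s->indent[s->depth - 1] >= indent)
        s->depth--;
}
```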

This fixes [CTags bug #356].

[CTags bug #1988026]: https://sourceforge.net/p/ctags/bugs/227/
[CTags bug #356]: https://sourceforge.net/p/ctags/bugs/356/

X-Universal-CTags-Commit-ID: ab91e6e1ae84b80870a1e8712fc7f3133e4b5542
2015-06-14 17:13:46 +02:00
Jiří Techet
206379a272 Parse return value of go functions
Unfortunately varType is Geany-only so this patch cannot be ported to ctags.

The removal of the extra { read is not the most elegant thing but making
skipType() aware of the argList collection complicates things too much.
2015-05-28 16:27:23 +02:00
Jiří Techet
e433490672 Sync go parser with fishman-ctags
New features include:
* struct/interface detection
* struct member parsing
* function prototype parsing
2015-05-28 16:22:54 +02:00
Colomban Wendling
39f359b09a make: Add support for GNU make pattern rules 2015-04-20 19:59:06 +02:00
Colomban Wendling
a11d67bb0b make: Fix handling comments inside rules
A line consisting only of blanks or comments should not end a rule,
even if it doesn't start with a tabulation character.
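
The rule can be sketched as below, assuming a rule's recipe lines start with
a tab and that any other non-blank, non-comment line ends the rule.
`is_blank_or_comment()` and `line_ends_rule()` are hypothetical helpers, not
the parser's functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the line holds only blanks, optionally followed by a comment. */
static bool is_blank_or_comment(const char *line)
{
    assert(line != NULL);
    while (*line == ' ' || *line == '\t')
        line++;
    return *line == '\0' || *line == '\n' || *line == '#';
}

/* True if this line terminates the rule currently being parsed. */
static bool line_ends_rule(const char *line)
{
    assert(line != NULL);
    return line[0] != '\t' && !is_blank_or_comment(line);
}
```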
2015-04-20 19:53:28 +02:00