The implementation of this function is almost the same as the original
tm_workspace_find_scoped_members() and there's nothing interesting here
that we wouldn't be able to recreate trivially.
By comparing the file pointer in the loop we can speed it up a bit
because we can avoid the strcmp() (this function is the slowest part of
the scope completion based on profiling).
Also move the pointer array creation to this function and return it which
is a bit cleaner.
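
A minimal sketch of the pointer comparison idea, assuming simplified
stand-in types (Tag and SourceFile here are hypothetical, not the real
TagManager structs, and find_members_in_file() is an illustrative name):
tags from other files are rejected by comparing the file pointer, so
strcmp() only runs for tags of the file we care about.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the TagManager types. */
typedef struct SourceFile { const char *path; } SourceFile;
typedef struct Tag {
    const char *name;
    const char *scope;
    const SourceFile *file;
} Tag;

/* Collect pointers to the members of `scope` defined in `file`.
 * Writes at most `max_out` entries to `out` and returns the count. */
static size_t find_members_in_file(const Tag *tags, size_t n_tags,
                                   const SourceFile *file, const char *scope,
                                   const Tag **out, size_t max_out)
{
    size_t found = 0;

    assert(tags != NULL && scope != NULL && out != NULL);
    for (size_t i = 0; i < n_tags && found < max_out; i++) {
        /* Cheap pointer comparison first: tags from other files are
         * rejected without ever calling strcmp(). */
        if (tags[i].file != file)
            continue;
        if (tags[i].scope != NULL && strcmp(tags[i].scope, scope) == 0)
            out[found++] = &tags[i];
    }
    return found;
}
```

The pointer array is filled and returned by the same function, which
mirrors the "move the pointer array creation to this function" part.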
Disclaimer: I have absolutely no idea how the original function works.
After gazing into the code for one hour, I just gave up and wrote my own
version of it based on what I think the function should do
but maybe I'm just missing something that justifies the original
implementation's insanity.
The previous commit fixed the situation when e.g. anon_struct_0 was in the
current file by checking the current file first.
In the case the struct type definition isn't found in the current file,
at the moment we get all members from all anon_struct_0 which can be a
really long list. This list isn't very helpful to users because all the
members from all the structs are mixed. Moreover, not all possible members
are in the list because there are e.g. just members from anon_struct_0 but
not from anon_struct_1 etc. which from the point of view of this function
is a different type.
Instead, restrict the returned members to just a single file (anonymous
structs have a unique name per file, so there will be just one
from the file). Of course the picked file can be wrong and the returned
members might be from a different struct than the user wanted, but at least
the list will make more sense to users.
At the moment it can happen that even though a member is found in the
currently edited file, the search at the end of the function finds
the type inside another file. This typically happens for anonymous
structs so e.g. for anon_struct_0{...} from the current file we get
members from anon_struct_0{...} from all open documents plus global tags.
Search in an increasing "circle" - start with current file only (trying
all possible types of the variable), continue with workspace array and
finally, if not found, search in the global tags.
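
The increasing "circle" can be sketched roughly as follows. This is only
an illustration under assumptions: Tag, TagArray, find_in_array() and
find_scope_member() are hypothetical names, and for simplicity the sketch
tries every candidate type in each circle, not only in the current file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified tag record; not the real TMTag. */
typedef struct Tag { const char *name; const char *scope; } Tag;
typedef struct TagArray { const Tag *tags; size_t len; } TagArray;

/* Return the first member of `scope` in `arr`, or NULL. */
static const Tag *find_in_array(const TagArray *arr, const char *scope)
{
    for (size_t i = 0; i < arr->len; i++)
        if (strcmp(arr->tags[i].scope, scope) == 0)
            return &arr->tags[i];
    return NULL;
}

/* Search the current file first, then the workspace, then the global
 * tags; stop at the first circle in which any candidate type matches. */
static const Tag *find_scope_member(const TagArray *current_file,
                                    const TagArray *workspace,
                                    const TagArray *global,
                                    const char *const *types, size_t n_types)
{
    const TagArray *circles[] = { current_file, workspace, global };

    assert(types != NULL);
    for (size_t c = 0; c < sizeof circles / sizeof circles[0]; c++) {
        for (size_t t = 0; t < n_types; t++) {
            const Tag *tag = find_in_array(circles[c], types[t]);
            if (tag != NULL)
                return tag;
        }
    }
    return NULL;
}
```

A match in an inner circle always wins, so anon_struct_0 from the current
file is never mixed with anon_struct_0 from other documents.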
All of these typos were found by codespell, so credit goes to
the authors of this incredibly useful tool.
I manually confirmed and adapted all changes, which includes
reflowing over-long lines or filling up with spaces for alignment.
Some of these typos may need forwarding to their original authors.
codespell reported a lot of words where I am unsure; I have not
included those corrections.
These appear under 64-bit Windows. Only the sciwrappers.c warning is
potentially dangerous. For win32.c, the "handle" provides some useful
information, while "lStdHandle" does not.
Drop the loop in mem_read() in favor of a single memcpy() call.
This greatly improves performance when nmemb > 1, for a small loss
for some values of size when nmemb == 1. Gain can theoretically be
infinite since swapping nmemb and size parameters changes almost
nothing while it had a dramatic performance impact previously. Loss
is up to about 25% in the worst case for some values of size when
nmemb is 1.
Also, now the function always copies as much data as possible, not only
whole blocks. This follows the glibc implementation of fread() and
simplifies the code. Doing so also fixes the position after a partial
read to be at the last readable character rather than the end of the
last read block.
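
A minimal sketch of a loop-free read, assuming a hypothetical MemStream
struct (the field names and the overflow guard are assumptions, not the
actual implementation): everything available is copied with one memcpy(),
including a trailing partial item, and only whole items are counted in
the return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stream; field names are assumptions. */
typedef struct MemStream {
    const unsigned char *buf;
    size_t size;   /* total bytes available */
    size_t pos;    /* current read position, always <= size */
} MemStream;

/* fread()-like read: returns the number of complete items read. */
static size_t mem_read(MemStream *s, void *ptr, size_t size, size_t nmemb)
{
    size_t avail, want, copy;

    assert(s != NULL && ptr != NULL && s->pos <= s->size);
    if (size == 0 || nmemb == 0)
        return 0;

    avail = s->size - s->pos;
    /* Guard against overflow of size * nmemb. */
    want = (nmemb > (size_t) -1 / size) ? (size_t) -1 : size * nmemb;
    copy = want < avail ? want : avail;

    /* One memcpy() for everything, including a trailing partial item. */
    memcpy(ptr, s->buf + s->pos, copy);
    s->pos += copy;

    return copy / size;
}
```

With 10 bytes available, a request for 3 items of 4 bytes returns 2 but
leaves the position at 10, not 8, matching the partial read behaviour
described above.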
Fix handling of scopes starting with a non-ASCII character.
Actually, just drop the check on the first byte of the scope. It
doesn't seem to serve any purpose: it only checks the first byte, so it
isn't any kind of real validation, and as it predates Geany it's
impossible to know the real reason behind it. It also breaks
support for non-ASCII scopes.
This function won't work correctly on an unsorted array because the second
part of the function (after the tags search) expects the array to be sorted
by name. The only user of this is tm_source_file_set_tag_arglist() in which
we can go through the tags manually by ourselves (it needs only a single
value so the original behavior of tm_tags_find() wasn't a problem).
Eliminate the tags_search() function as it isn't needed any more.
Just cleanup, no functional change.
Do the same with struct/class/union... member tags as we do with
typenames - extract them from the edited file and merge them with
the array containing all of them so while editing, there should
be no slowdowns because one file usually doesn't contain so many
tags. This eliminates a freeze of about 2s when typing "." on a Linux
kernel project with 2300000 tags.
Extract typename and member tags also for global tags in case someone
creates a giant tags file - this needs to be done just once when
loading the tag files.
All the remaining tm_tags_extract() in Geany are called on
file tag array only so there shouldn't be any performance problems.
This patch contains a few too many things, which are however related.
It started with the part in editor.c, where we previously used only the
first type we found to perform the scoped search. Now we go through all the
possible variable types until the scoped search returns some result
(this is useful if variable foo is used once as int and once as struct:
if the int is the first type found, we won't get the struct's members).
This didn't work. After an hour of debugging, it turned out that
tm_workspace_find_scope_members() internally calls
tm_workspace_find(), which returns a static array. This
invalidates the array returned by the earlier tm_workspace_find() call
used to get all the possible variable types.
Since this is really dangerous and hard to notice, I tried to eliminate
the static returns from both tm_workspace_find() and
tm_workspace_find_scoped_members().
The tm_workspace_find_scoped_members() function is where I got
stuck because as I started to understand what it's doing, I found
many problems there. This patch does the following in this function:
1. Eliminates search_global and no_definitions parameters because
we always search the whole workspace and this simplifies the slightly
strange logic at the end of the function.
2. Returns members from global tags even when something is found in
workspace tags - previously global tags were skipped when something
was found from workspace tags but I don't see a reason why.
3. Adds the lang parameter to restrict tags by language (we do this
with normal search and the same should be done here).
4. Previously when searching for types with members the function
returned NULL when more than one such type was found (there should
have been >=1 instead of ==1 at line 906). This patch improves the
logic a bit and if multiple types are found, it tries to use the one
which is other than typedef because it probably has some members (the
typedef can resolve to e.g. int).
5. Previously the function prevented only direct typedef loops like
     typedef A B;
     typedef B A;
but a loop like A->B->C->A would lead to an infinite cycle. This patch
limits how many times a typedef can be resolved by using a for loop
with a bounded number of iterations and giving up when
nothing useful is resolved.
6. Finally the patch tries to simplify the function a bit, make it
easier to read and adds some comments to make it clearer what the
function does.
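
The bounded typedef resolution from point 5 can be sketched like this.
The TypedefEntry table, resolve_typedef() and the MAX_TYPEDEF_DEPTH value
are all hypothetical, picked only for illustration; the real function
works on tag arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical typedef table entry: `name` is an alias for `target`. */
typedef struct TypedefEntry {
    const char *name;
    const char *target;
} TypedefEntry;

enum { MAX_TYPEDEF_DEPTH = 8 };  /* arbitrary limit chosen for this sketch */

/* Follow typedefs starting at `name`. Returns the first name that is not
 * itself a typedef, or NULL if the limit is reached (e.g. on a cycle). */
static const char *resolve_typedef(const TypedefEntry *defs, size_t n_defs,
                                   const char *name)
{
    assert(defs != NULL && name != NULL);
    for (int depth = 0; depth < MAX_TYPEDEF_DEPTH; depth++) {
        const char *next = NULL;

        for (size_t i = 0; i < n_defs; i++) {
            if (strcmp(defs[i].name, name) == 0) {
                next = defs[i].target;
                break;
            }
        }
        if (next == NULL)
            return name;  /* not a typedef: resolution finished */
        name = next;
    }
    return NULL;  /* gave up: too many levels or a loop */
}
```

A fixed iteration limit handles loops of any length (A->B->C->A included)
without tracking visited names, at the cost of also giving up on
legitimately deep typedef chains.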
They are basically identical except:
1. _scoped() compares scope in addition
2. _scoped() is missing the C/CPP tag compatibility part
3. _scoped() allows returning just a single result (unused)
4. _scoped() allows not searching in global tags (unused)
Since we now always put lang also under tag->lang, the match_langs()
function is not necessary.
Extend add_filtered_tags() (and rename it to fill_find_tags_array()) to
perform the tm_tags_find() and compare the scope, add scope
as a parameter of tm_workspace_find(), and eliminate tm_workspace_find_scoped()
completely.
1. Factor out the part common to tags_array and global_tags
2. Get both C/CPP tags when either of the languages is specified (both
for global_tags and tags_array)
3. Remove unnecessary strcmp()s (tm_tags_find() should return only tags
with the specified name)
4. Various minor cleanups
`final` is not a normal keyword, as it only has a special meaning in
some specific contexts. So, use a special case instead of a keyword so as
not to break identifiers of that name.
If there were two hashes (#) in an inline comment, only the content
between the two was considered a comment.
X-Universal-CTags-Commit-ID: ee93f5b9f393e76a850cf8c894cc748a62981156
Most of the time there's no start of a string which means all the 10
strcmp()s are done for every character of the input. This is very expensive:
before this patch this function alone takes 55% of the parser time.
When comparing by character (and avoiding further comparison if the first
character doesn't match), this function takes only 11% of the parser time
so the performance of the parser nearly doubles.
In addition check for the "rb" prefix which is possible in Python 3.
Ported from universal-ctags.
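
A simplified sketch of the character-wise check, not the actual parser
code: string_start() and is_quote() are illustrative names, and the set
of accepted prefixes (r, u, b, ur, br, rb, case-insensitive) is an
assumption. The point is that a character that is neither a quote nor a
prefix letter is rejected after looking at a single byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static int is_quote(int c)
{
    return c == '\'' || c == '"';
}

/* If a Python string literal starts at `cp`, return a pointer to its
 * opening quote; otherwise return NULL. */
static const char *string_start(const char *cp)
{
    int first, second;

    assert(cp != NULL);
    if (is_quote(cp[0]))
        return cp;

    /* Fast rejection: most input characters are not a prefix letter,
     * so no further comparison is needed for them. */
    first = tolower((unsigned char) cp[0]);
    if (first != 'r' && first != 'u' && first != 'b')
        return NULL;

    /* One-letter prefix: r"...", u"...", b"..." */
    if (is_quote(cp[1]))
        return cp + 1;

    /* Two-letter prefix: ur"...", br"..." and rb"..." */
    second = tolower((unsigned char) cp[1]);
    if ((((first == 'u' || first == 'b') && second == 'r') ||
         (first == 'r' && second == 'b')) && is_quote(cp[2]))
        return cp + 2;

    return NULL;
}
```

The sketch does not check that the prefix letter starts a token, so the
caller is assumed to invoke it only at positions where a literal may
begin.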
The character following an '@' was dropped if it didn't start a string
literal.
This could lead to unexpected problems if '@' was valid in other
situations.
X-Universal-CTags-Commit-ID: 2e62f475af1db08850447de46f56db14ce99d2eb
See http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf 6.4.6§3.
Note: This is not exactly the upstream Universal CTags commit because
it depends on another change for adding the `enter` label, which was
then included here.
X-Universal-CTags-Commit-ID: 3b3b60c7664a321a31ec87de336fc6bda90c405e
When tested with a 200000 LOC Python file (created by making many copies
of scripts/create_py_tags.py), the tm_tags_remove_file_tags() function
takes about 50% of the CPU time when only this file is open. After adding
the linear path to tm_tags_remove_file_tags() it takes just about 2%. See
the comment in the patch for more details.
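
One plausible shape of such a linear path, as a sketch only (the message
above does not spell out the details, and Tag, SourceFile and
remove_file_tags_linear() are hypothetical names): a single pass compacts
the pointer array in place, which keeps the remaining tags in their
original order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the TagManager types. */
typedef struct SourceFile { const char *path; } SourceFile;
typedef struct Tag { const char *name; const SourceFile *file; } Tag;

/* Remove every tag owned by `file` from the pointer array in one pass,
 * preserving the order of the remaining entries. Returns the new length. */
static size_t remove_file_tags_linear(const Tag **tags, size_t len,
                                      const SourceFile *file)
{
    size_t kept = 0;

    assert(tags != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        /* Keep the tag only if it belongs to another file. */
        if (tags[i]->file != file)
            tags[kept++] = tags[i];
    }
    return kept;
}
```

Because the order is preserved, an array that was sorted before the
removal stays sorted afterwards and needs no re-sort.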
Add API to truncate a vString to a certain length. This doesn't support
growing the string, only shrinking it.
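
A shrink-only truncate could look roughly like the sketch below. StrBuf
and strbuf_truncate() are hypothetical stand-ins, not the real vString
API, and rejecting a larger length with assert() is an assumption about
how "doesn't support growing" is enforced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified string buffer (not the real ctags vString). */
typedef struct StrBuf {
    size_t length;   /* bytes in use, excluding the terminating NUL */
    size_t size;     /* allocated bytes */
    char *buffer;
} StrBuf;

/* Shrink `s` to `length` bytes. Growing is not supported. */
static void strbuf_truncate(StrBuf *s, size_t length)
{
    assert(s != NULL && s->buffer != NULL);
    assert(length <= s->length);

    s->length = length;
    s->buffer[length] = '\0';  /* keep the buffer NUL-terminated */
}
```

No reallocation is needed since the buffer only ever gets shorter.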
X-Universal-CTags-Commit-ID: 4e3d9edf2e7a8a476ff97bc678e71c3919b960f9
The previous fix, coming from [CTags bug #1988026], was incorrect if
the parent was not a root-level element, as it checked the level name
(unqualified) against the parent name (qualified).
However, there is no need to check the level name, all that counts is
the indentation level itself: if it's smaller than an existing level,
it ends it.
This fixes [CTags bug #356].
[CTags bug #1988026]: https://sourceforge.net/p/ctags/bugs/227/
[CTags bug #356]: https://sourceforge.net/p/ctags/bugs/356/
X-Universal-CTags-Commit-ID: ab91e6e1ae84b80870a1e8712fc7f3133e4b5542
Unfortunately varType is Geany-only so this patch cannot be ported to ctags.
The removal of the extra { read is not the most elegant thing but making
skipType() aware of the argList collection complicates things too much.
The tags_lang variable is set from the first tag in the found array but
the array may actually contain tags from several languages. This may
lead to two things:
1. Goto tag definition goes to a tag from a different filetype
2. Worse, the first tag is from a different language than the current file
and we get a message that no tag was found
Get lang for every tag in the array and rename tags_lang to tag_lang.
gtk_tree_store_set() becomes very slow when the tree gets bigger
because internally it calls gtk_tree_store_get_path() which counts
all the entries in a linked list of elements at the same tree level
to get the tree path.
Avoid the call of this function when not needed.
Fixing this is however only theoretically useful, as:
* no actual code paths can currently lead to it;
* even if the code actually ended up reading the uninitialized value,
it would still have a fully defined behavior as the result of the
check is irrelevant in the only case the uninitialized read can
happen.
Anyway, fix this to avoid any possible bad surprises in the future.
This makes it easier to define it consistently with what the compiler
and platform support, and avoids having to include a special header
everywhere, which is some kind of a problem for separate libraries
like TagManager and especially Scintilla.
As we only use these macros from the source and not the headers, it
is fine for it to be defined to a configure-time check from the build
system.
Warning: Although Waf and Windows makefiles are updated, they are not
tested and will probably require tuning.