Character sets and Unicode Byte-Order-Mark (BOM)

Using character sets

Geany supports detecting and converting character sets, so you can open and save files in different character sets and even convert a file from one character set to another. To do this, Geany uses the character conversion capabilities of GLib.
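As a rough illustration of what converting a file from one character set to another means, the following sketch decodes bytes from a source character set and re-encodes them in a target one. It uses Python's built-in codecs rather than GLib, and the function name convert_charset is a hypothetical helper chosen for this example, not part of Geany.

```python
def convert_charset(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one character set to another (illustrative only)."""
    # Decoding fails with UnicodeDecodeError if data is not valid in `source`.
    text = data.decode(source)
    # Encoding fails with UnicodeEncodeError if `target` cannot represent a character.
    return text.encode(target)


# "Gruesse" written with u-umlaut and sharp s, stored as ISO-8859-1.
latin1_bytes = "Gr\u00fc\u00dfe".encode("iso-8859-1")
utf8_bytes = convert_charset(latin1_bytes, "iso-8859-1", "utf-8")
print(len(latin1_bytes), len(utf8_bytes))
```

The same text occupies a different number of bytes in each character set, which is why a file must be read with the encoding it was written in.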

Only text files are supported, i.e. opening files which contain NUL-bytes may fail. Geany will try to open the file anyway, but the file will most likely be truncated because it can only be read up to the first NUL-byte. All characters after this position are lost and are not written when you save the file.
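The effect of this truncation can be sketched as follows. This is only a model of the behaviour described above, not Geany's code; truncate_at_nul is a hypothetical name used for illustration.

```python
def truncate_at_nul(data: bytes) -> bytes:
    """Return only the bytes before the first NUL-byte (illustrative only)."""
    # partition() splits at the first occurrence; without a NUL-byte the
    # whole input ends up in `head`.
    head, _separator, _rest = data.partition(b"\x00")
    return head


raw = b"first line\nsecond\x00 line\nthird line\n"
visible = truncate_at_nul(raw)
print(visible)
```

Saving the truncated buffer back to disk would write only the bytes in visible, so everything after the NUL-byte would be gone from the file.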

Geany tries to detect the encoding of a file while opening it. If the encoding cannot be detected correctly, you have to set it manually in order to display the file correctly. You can do this in the file open dialog by selecting an encoding in the drop-down box, or by reloading the file with the file menu item "Reload as". Auto detection works well for most encodings, but it is not easy and there are some encodings for which it is known to give wrong results.
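To see why auto detection can go wrong, consider a naive detector that simply tries a list of candidate encodings in order and accepts the first one that decodes without error. This is a minimal sketch under that assumption and is not the algorithm Geany uses; guess_encoding and its candidate list are hypothetical.

```python
from typing import Optional, Sequence


def guess_encoding(
    data: bytes, candidates: Sequence[str] = ("utf-8", "iso-8859-1")
) -> Optional[str]:
    """Return the first candidate that decodes `data` without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        return name
    return None  # no candidate matched


utf8_data = "Gr\u00fc\u00dfe".encode("utf-8")
print(guess_encoding(utf8_data))
# With the order reversed, the same bytes are accepted as ISO-8859-1,
# because every byte sequence is valid in that encoding.
print(guess_encoding(utf8_data, ("iso-8859-1", "utf-8")))
```

Many byte sequences are valid in more than one encoding, so a detector can only make an informed guess. When the guess is wrong, choosing the encoding by hand is the reliable fix.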

There are different ways to use different encodings in Geany:

  1. Using the file open dialog

    This opens the file with the encoding specified in the encoding drop-down box. If the encoding is set to "Detect from file", auto detection will be used. If the encoding is set to "Without encoding (None)", the file will be opened without any character conversion and Geany will not try to auto detect the encoding (see below for more information).

  2. Using the "Reload as" menu item

    This item reloads the current file with the specified encoding. It can help if you opened a file and found out that the wrong encoding was used.

  3. Using the "Set encoding" menu item

    In contrast to the above two options, this will not change or reload the current file unless you save it. It is useful when you want to change the encoding of the file.

Special encoding "None"

There is a special encoding "None" which is not actually an encoding. It is useful when you know that Geany cannot auto detect the encoding of a file and the file is not displayed correctly. This is especially useful for files that contain NUL-bytes: it skips auto detection and opens the file properly, at least up to the first NUL-byte. Using this encoding opens the file as it is, without any character conversion.

Unicode Byte-Order-Mark (BOM)

Furthermore, Geany detects a Unicode Byte Order Mark (see http://en.wikipedia.org/wiki/Byte_Order_Mark for details). This feature is only available if the opened file is in a Unicode encoding. The Byte Order Mark helps to detect the encoding of a file, e.g. whether it is UTF-16LE or UTF-16BE. On Unix-like systems a Byte Order Mark can cause problems: for example, gcc stops with stray errors, PHP does not parse a script containing a BOM, and script files starting with a shebang may fail to start. The status bar shows whether the file starts with a BOM. To add a BOM to a file or remove it, use the document menu and toggle the checkbox.
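A BOM is simply a fixed byte sequence at the very start of the file, so checking for one amounts to a prefix comparison. The following is a minimal sketch using the BOM constants from Python's codecs module. It is not Geany's implementation, and split_bom is a hypothetical helper name.

```python
import codecs
from typing import Optional, Tuple

# Check the 4-byte UTF-32 marks first: the UTF-32LE BOM begins with the
# same two bytes as the UTF-16LE BOM.
_BOMS = (
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
)


def split_bom(data: bytes) -> Tuple[Optional[str], bytes]:
    """Return (encoding name or None, data with any leading BOM removed)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data


script = codecs.BOM_UTF8 + b"#!/bin/sh\necho hello\n"
encoding, body = split_bom(script)
# With the BOM in place the file no longer begins with "#!".
print(encoding, script.startswith(b"#!"), body.startswith(b"#!"))
```

This also shows why a BOM can stop a script from starting: the shebang has to be the first two bytes of the file, and the BOM pushes it further in.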

Note

If you are unsure what a BOM is or if you do not understand where to use it, then it is not important for you and you can safely ignore it.