Columns++ for Notepad++ documentation

Introduction

Columns++ is a plugin for Notepad++ which offers features for working with text and data arranged in columns, including an implementation of elastic tabstops, enhanced searching and sorting, column alignment and numeric calculations. Like Notepad++, Columns++ is released under the GNU General Public License (either version 3 of the License, or, at your option, any later version). Columns++ was first released by Randall Joseph Fellmy in 2023; you can find the source code on GitHub.

Columns++ uses the C++ Mathematical Expression Toolkit Library (ExprTk) by Arash Partow, which is released under the MIT license; JSON for Modern C++ by Niels Lohmann, which is released under the MIT license; and the Boost.Regex library, which is released under the Boost Software License, Version 1.0.

Purpose and limitations

Columns++ is designed to provide some helpful functions for editing text or data that is lined up visually in columns, so that you can make a rectangular selection of the column(s) you want to process.

The integrated implementation of Elastic tabstops works to line up columns when tabs are used as logical separators, including tab-separated values data files as well as any ordinary text or code document containing sections in which you want to line up columns easily using tabs. You can use this feature on its own or with the other functions in Columns++.

Columns++ is optimized for use with Elastic tabstops. It also works with files that use traditional, fixed tabs for alignment, or no tabs at all; however, you should ordinarily select only one column at a time in files that don’t use Elastic tabstops.

Columns++ is generally not helpful when columns do not line up visually, such as in comma-separated values files. However, Columns++ can convert between delimiter-separated values and tabbed presentation; and there are some features, particularly Search using numeric formulas in regular expression replacement strings and Sorting with custom criteria, which may be useful in documents that are not column-oriented.

Elastic tabstops can cause loading and editing to be slow for large files. By default, Elastic tabstops is automatically turned off for files over 1000 KB or 5000 lines. You can change these limits.

Elastic tabstops

Columns++ includes a new implementation of Nick Gravgaard’s Elastic tabstops. (Please note that as of this writing I have not communicated with Mr. Gravgaard about my implementation of his proposal, and no endorsement on his part is implied. — RJF)

The first item of the Columns++ menu enables or disables Elastic tabstops. Elastic tabstops stretches tabs so that columns line up to fit their content, using only a single tab to separate one column from the next.
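
As a rough illustration of the idea (not the plugin's actual code), the following Python sketch computes column widths in the way the elastic tabstops proposal describes: cells that end with a tab and sit in the same column on adjacent lines share one width. The function name `elastic_widths`, the `padding` and `min_width` defaults, and measuring text by counting characters (that is, assuming a monospaced font) are all assumptions made for this example.

```python
def elastic_widths(lines, min_width=4, padding=2):
    """Return, for each line, the widths of its tab-terminated cells."""
    # Only text followed by a tab forms a cell; text after the last tab
    # on a line does not belong to any column.
    cells = [line.split("\t")[:-1] for line in lines]
    widths = [[0] * len(row) for row in cells]
    max_cols = max((len(row) for row in cells), default=0)

    for col in range(max_cols):
        start = 0
        while start < len(cells):
            if col >= len(cells[start]):
                start += 1          # this line has no cell in this column
                continue
            # Extend the block over adjacent lines that have this column.
            end = start
            while end < len(cells) and col < len(cells[end]):
                end += 1
            widest = max(len(cells[i][col]) for i in range(start, end))
            width = max(widest + padding, min_width)
            for i in range(start, end):
                widths[i][col] = width
            start = end
    return widths
```

A line with fewer tabs interrupts the column, so the lines above and below it are sized independently of each other.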

This implementation of Elastic tabstops includes some options that were not part of the original proposal. These options can be accessed by using the Profile... menu option. There are three “built-in” profiles:

Classic endeavors to reproduce precisely the behavior described in the proposal linked above.
General ensures that leading tabs are always used for indentation, and are not lined up with elastic tabstops.
Tabular is suitable for tab-separated values files, in which the entire file is a single table with the values in each row separated by single tabs.

You can select a profile from the drop-down box in the Elastic tabstops profile dialog. You can also change individual settings; choose some options to automatically enable a profile or disable elastic tabstops for different types of files; and save, rename or delete profiles.

Settings in an elastic tabstops profile

Along with the enabled or disabled status of elastic tabstops, the settings in an elastic tabstops profile are kept independently for each document you have open. These settings, which are available in the Elastic tabstops profile dialog, are:

Use leading tabs for indentation only; don't make them elastic. When checked, this option treats tabs which occur at the beginning of a line, before the first non-tab character, as ordinary fixed-width tabs instead of elastic tabs. Without this option, a line with a tab used to line up a column of data cannot be followed by a line that uses tabs for indentation without an intervening blank line; otherwise, the first leading tab will expand to line up with the tab on the previous line. The disadvantage is that if you want an empty column at the beginning of a line, you must place a space before the first tab to make it line up with the next column.
Line up elastic tabstops throughout the entire document. Normally elastic tabstops are positioned independently whenever a column is interrupted; that is, tabstops created by tabs that appear on adjacent lines are lined up, but they don’t “project through” lines with fewer (or no) tabs. This option indicates that a single set of tabstops is to be used for the entire document, so that columns line up even when intervening lines have fewer columns.
Do not allow text following the last tab on a line to span columns. Normally, text following the last tab on a line is not treated as belonging to a “column” at all. This makes sense for documents that mix text and tables. However, for documents that are entirely tabular but have omitted tabs at the end of lines where the final columns are blank, this option (along with the one above) is needed to keep things lined up properly.
Override default/language tab size (used for indent or minimum): Elastic tabstops uses the “tab size” in different ways depending on whether Use leading tabs for indentation only is checked: when checked, the tab size represents the number of spaces each leading tab indents, and it is otherwise ignored; when unchecked, it is the minimum space between any two tabstops (that is, the width of the intervening column plus the space between columns). When the Override tab size box is unchecked, Columns++ uses the tab size set in Notepad++; when checked, the spin box to the right specifies the size (in spaces) to be used.
Minimum space between elastic columns: This spin box specifies the size (in spaces) occupied by the tab following the longest span of text in a column.
Apply monospaced font optimizations:

Responsiveness with elastic tabstops enabled is greatly improved if it is possible to calculate the width of text by counting characters rather than by measuring. This only works if the fonts in use are monospaced (typewriter-like fonts in which every character has the same width; also called fixed pitch fonts), and all the fonts used by the styles in the current language have the same width.

Yes: Monospaced font optimizations are applied unconditionally.
No: Monospaced font optimizations are not applied.
Best estimate: Monospaced font optimizations are applied if they appear to be appropriate. The text (yes) or (no) following Best estimate indicates whether Columns++ has determined that display of the current document is monospaced.
Don't show expanded mnemonics for non-printing characters when monospaced. Notepad++ normally shows mnemonics, like ESC or NBSP, for control characters, invalid characters and (at user option) other non-printing characters. While this is usually helpful, it breaks the assumption of monospacing, which can dramatically slow some operations on larger files when monospaced fonts are used. This option (checked by default) displays a simple ! indicator instead of the usual multi-character mnemonics when elastic tabstops are enabled and monospaced font optimizations are in effect.

Usually it’s best to let Columns++ determine whether to use monospaced font optimizations, but there can be exceptional cases. Columns++ checks the width of a space and a capital letter W in each font assigned to a style in the current language; if these are all the same, it uses monospace optimizations. In some cases, a language might define styles which inhibit optimization but are never applied in a particular file; for large files, the performance gain from forcing monospaced font optimizations may be considerable. Conversely, a font might use monospaced characters in the ASCII range but wider characters outside that range; in this case, monospaced font optimizations can cause processing to be much slower than necessary, since each line in which text overflows the expected width in any column forces additional measurement and layout of text. If you want to use elastic tabstops with a large file, but response is sluggish and the best estimate chosen by Columns++ seems wrong, it’s worth trying the opposite setting.

These settings are only applied when you click the OK button near the bottom right of the dialog.

Saving, renaming and deleting profiles

You can save the settings in a profile by clicking the Save As... button to the right of the profile selection drop-down box. You can give the profile any name that does not begin with an asterisk or an open parenthesis and is not one of the three built-in profiles (“Classic,” “General” and “Tabular”). You can use the additional options from the drop-down menu at the right of the Save As... button to rename or delete a profile. If you have made changes to an existing profile that is not a built-in profile, you can save the changes without having to type the profile name again by using the Save option.

Automatically enabling or disabling elastic tabstops

By default, Columns++ uses whatever settings were in effect for the last active tab when you open a file or a new tab. You can change this behavior with the remaining options on the Elastic tabstops profile dialog.

The checkbox under the profile selection dropdown labeled Automatically enable this profile when opening type files. is available when a built-in or saved profile is selected (and Disable... when opening type files. in the bottom section of the dialog, which will be explained later, is not checked). Checking this box assigns the selected built-in or saved profile to be enabled whenever you open a file of the same type as the one you are currently viewing. The Type can be existing files with the same extension, existing files with no extension, or new files. This option is only applied when you click the OK button near the bottom right of the dialog.

The options in the box labeled When opening an existing file without an explicit rule for its extension allow you to choose what happens when opening existing files for which you haven’t set either Automatically enable this profile... or Disable... when opening...:

keep the same settings as the last viewed tab. This is the default behavior: each existing file you open begins with the same elastic tabstops settings you had previously. Note that this setting does not affect the default for new files; if you want a profile enabled, or elastic tabstops disabled, whenever you open a new tab with File|New, you must set that behavior specifically using one of the when opening new files options in the Elastic tabstops profile dialog opened when viewing a new file.
disable elastic tabstops. Elastic tabstops will be turned off when opening any existing file unless you’ve specifically set a rule to turn it on for that file’s extension.
enable this profile: You can select any built-in or saved profile, which will be enabled when opening any existing file unless you’ve set a different rule for that file’s extension.

The options in the box labeled Disable elastic tabstops (applies to all profiles) allow you to choose specific conditions under which elastic tabstops should always be disabled:

when opening type files. If you always want elastic tabstops disabled when you open the type of file in the current tab, check this box.
when opening files over ____ KB.
when opening files over ____ lines.
Elastic tabstops can cause loading and editing to be slow for large files. These options disable elastic tabstops when loading files over the specified limits, regardless of any other settings. The default values disable elastic tabstops for files over 1000 KB or 5000 lines.

Note that although the options for automatically enabling or disabling elastic tabstops do not affect the tab you have open, they are only applied when you click the OK button near the bottom right of the dialog.
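
The way these rules combine when an existing file is opened can be summarized in a short Python sketch. This is only a reading of the description above, not the plugin's code: the names `Rules` and `profile_on_open` are hypothetical, the order of the checks is inferred from the text, and treating 1 KB as 1024 bytes is an assumption. New files, which have their own rule, are not covered.

```python
from dataclasses import dataclass, field

KEEP = object()  # marker: keep the settings of the last viewed tab


@dataclass
class Rules:
    max_kb: int = 1000                                    # default size limit
    max_lines: int = 5000                                 # default line limit
    disabled_types: set = field(default_factory=set)      # "Disable... when opening type files"
    profile_for_type: dict = field(default_factory=dict)  # "Automatically enable this profile..."
    fallback: object = KEEP                               # KEEP, None (disable) or a profile name


def profile_on_open(rules, file_type, size_bytes, line_count, previous):
    """Return the profile to enable, or None to disable elastic tabstops."""
    # Size limits apply regardless of any other settings.
    if size_bytes > rules.max_kb * 1024 or line_count > rules.max_lines:
        return None
    if file_type in rules.disabled_types:
        return None
    if file_type in rules.profile_for_type:
        return rules.profile_for_type[file_type]
    # No explicit rule for this type: use the chosen fallback behavior.
    if rules.fallback is KEEP:
        return previous
    return rules.fallback
```

Here `previous` stands for whatever was in effect in the last active tab (a profile name, or `None` if elastic tabstops was off).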

Rectangular selections

Most of the commands from the Columns++ menu operate on rectangular selections.

You can select a single column or multiple columns separated by tabs. Since each tab is interpreted as a column separator, this works as expected when elastic tabstops are used. The results with traditional fixed tabs are not likely to be obvious or expected when sequences of multiple fixed tabs are included in the selection, since Columns++ interprets each tab as starting a new “logical” column without regard to physical placement.

When selecting one or more columns in a document using tabs, you should generally include the tab that ends the rightmost selected column in your selection. Unless all the entries in the last column are the same width, it is often difficult or impossible to get a complete selection without including the final tabs; in any case, Columns++ will process the trailing tabs intelligently.

When you invoke a command that requires a rectangular selection and the current selection is not a non-zero-width rectangular selection, Columns++ will inform you of this and, if possible, offer reasonable options to create a rectangular selection based on the current selection or cursor position.

You can enable specific “implicit” rectangular selections in the Options dialog if you would prefer that Columns++ make those selections without prompting you.

Regular expressions

Several commands in Columns++ can use regular expressions for matching character strings. Columns++ uses the same regular expression engine, Boost.Regex, used in Notepad++, so the syntax and behavior are mostly the same. Some considerations unique to Columns++ are described below.

Within a rectangular selection, the selection in each row is matched independently of the surrounding text. The ^ assertion matches the beginning of the selection within a row, the $ assertion matches the end of the selection, and lookahead and lookbehind assertions cannot examine text past the boundaries of the selection. (Lookbehind assertions in Notepad++ can examine all text back to the beginning of the document, even when counting or replacing in a selection.) When using the Search in indicated region dialog, the region to be searched can be made up of one or more separate segments, each of which is searched independently. When a rectangular selection initializes the search region, each row of the selection becomes a separate segment of the search region.
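
The effect of matching each row independently can be imitated in Python by slicing every line to the selected columns before searching. This is only a sketch: it uses Python's `re` module rather than Boost.Regex, the function name `search_rows` is hypothetical, and it treats the selection bounds as plain character offsets (ignoring tabs and virtual space).

```python
import re


def search_rows(lines, left, right, pattern):
    """Search the selected part of each line; return the matched text or None."""
    regex = re.compile(pattern)
    found = []
    for line in lines:
        row = line[left:right]      # only the selected part of the line is visible
        match = regex.search(row)   # ^ and $ refer to the ends of this row
        found.append(match.group(0) if match else None)
    return found
```

With the lines `id=10 total=25` and `id=7  total=310` and a selection covering offsets 3 to 6, the pattern `^\d+` finds the number at the start of each row, while `(?<==)\d+` finds nothing, because the `=` lies outside the selection and the lookbehind cannot see it.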

A rectangular selection enclosing a series of lines, or the entire document, is not the same as an ordinary selection encompassing the same series of lines, or the entire document. Rectangular selections do not include line endings, and each line is a separate selection when matching. This applies to search regions created from such selections for use in the Search in indicated region dialog, where it can be important to distinguish these two cases (which, unfortunately, appear the same visually).

Matches for regular expressions using the \K directive are never replaced when performing stepwise Find and Replace in Notepad++. In the Search in indicated region dialog in Columns++, such matches can be replaced if you do not click outside the dialog between finding the match and replacing it.

Regular expressions are matched as UTF-16 sequences for Unicode documents, and as byte sequences for other documents. (This is the same as in Notepad++.) Scintilla (the display control used in Notepad++) handles Unicode as UTF-8. When displaying Unicode documents that contain invalid UTF-8, Scintilla displays each byte that cannot be decoded as a hexadecimal code in reversed colors. When matching a regular expression, Columns++ processes each of these bytes as if it were the Unicode replacement character, U+FFFD (). Notepad++ ignores errors in Unicode text when matching regular expressions.
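
To get a feel for what "each undecodable byte becomes U+FFFD" means, here is a small Python sketch that decodes UTF-8 that way. It is an approximation for illustration only: the handler name `fffd-per-byte` and the function `decode_for_matching` are made up for this example, and Python's idea of which bytes are undecodable may not agree with Scintilla's in every case.

```python
import codecs


def _one_replacement_per_byte(error):
    """Error handler: substitute one U+FFFD for every byte that failed to decode."""
    if not isinstance(error, UnicodeDecodeError):
        raise error
    bad_bytes = error.end - error.start
    return ("\ufffd" * bad_bytes, error.end)


codecs.register_error("fffd-per-byte", _one_replacement_per_byte)


def decode_for_matching(raw):
    """Decode UTF-8 bytes, turning each invalid byte into U+FFFD."""
    return raw.decode("utf-8", errors="fffd-per-byte")
```

A regular expression such as `\x{FFFD}` (or a literal replacement character) would then match at the position of each invalid byte.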

Calculation

Columns++ can add or average columns of numbers or perform calculations across rows; these are explained below. See Number formats for details about how Columns++ recognizes numbers.

Calculating in columns

The Add numbers... and Average numbers... items on the Columns++ menu perform calculations on a rectangular selection of a column of numbers, or of multiple columns separated by elastic tabstops. (These commands can be used on selections that include traditional fixed tabs; but the results may not be as expected, since they treat tabs as logical separators, ignoring physical positioning.)

Columns++ shows a dialog to present the results of the calculation, which offers the following options:

Thousands separator: Select a Thousands separator option (None, Comma/Period, Apostrophe or Blank) to control how numeric results are formatted.
Decimal places

If Automatic is checked, Columns++ chooses the number of decimal places in the result based on the data. For Add numbers, the result uses the fewest decimal places needed to avoid losing precision; for Average numbers, the result shows three more than the greatest number of decimal places in any input values.

If Automatic is not checked, choose the number of decimal places (0-16) to which to round results. If Suppress trailing zeros is checked, zeros at the end of the decimal portion of numbers are omitted; if this box is not checked, exactly the number of decimal places selected are included.

Format as time

If Automatic is checked, Columns++ uses the time formatting rules set in the Time formats dialog if at least one of the input numbers is expressed as a time; otherwise, the result is formatted as a simple number.

If Automatic is not checked, choose the format (1-4 segments) to be used for the results.

Insert... Check to insert the results into the document, at the end of the rectangular selection, when you close the dialog. If the last line of the rectangular selection is empty (spaces, tabs and/or virtual space), the option will be Insert these results in the last line of the selection; otherwise, it will be Insert a line containing these results following the last line of the selection.
Copy these results to the clipboard? Close the dialog with the Yes button to copy the results to the clipboard, or use No to leave the clipboard unchanged.

Calculating across rows

The Calculate... command from the Columns++ menu inserts the results of a calculation into each line of a rectangular selection. The command opens a dialog which lets you supply the formula for the calculation. Formulas are described in a separate section; they are mostly ordinary mathematical expressions, with special variables and functions to represent numbers found in the selection; for example, if a single column of numbers is selected, this + 20 would add a column with each of the original numbers increased by 20. Here are the options on the Calculate dialog:

Formula: Enter the formula for the calculation, as described in the Formulas section.
Regex: If you enter a regular expression, the first occurrence of the expression within the selection in each row of the rectangular selection is matched. Use the match case box to indicate whether to use case-sensitive matching. If Skip unmatched lines is checked and the regular expression box is not empty, rows in which the regular expression does not match are ignored; the formula is not evaluated, nor is a tab or any space padding added to the selection in the row to account for the new column.
Thousands separator: Select a Thousands separator option (None, Comma/Period, Apostrophe or Blank) to control how numeric results are formatted.
Decimal places: Choose the number of decimal places (0-16) to which to round results. If Suppress trailing zeros is checked, zeros at the end of the decimal portion of numbers are omitted; if this box is not checked, exactly the number of decimal places selected are included.
Format as time: Check the Enabled box to use the formats enabled in the bottom section of the Time formats dialog to show the results. The Formats... button lets you open that dialog without closing this one.
New column
Tabbed: Check to use a tab to separate the new column from the existing selection; uncheck to pad with spaces so as to leave one space between the edge of the existing selection and the new column.
Numeric aligned: Check to align results as numbers, following the same rules as the Align numeric command; uncheck to leave results left-justified.
Insert at left: Check to insert the new column at the left of the selection; uncheck to insert on the right.

Formulas

Formulas are representations of mathematical computations. Columns++ uses the ExprTk (Expression Toolkit) package to implement formulas used by the Calculate... command and formulas in regular expression replacements for the Search functions. Following are descriptions of the variables and functions defined by Columns++, along with some general features of the syntax of ExprTk expressions.

Numeric values in formulas

Numeric values are represented internally as double precision floating point numbers. Any number up to 9,007,199,254,740,992 without a fraction or decimal, positive or negative, is represented exactly. Most fractions and decimals cannot be represented exactly, but in ordinary use, rounding to a reasonable number of decimal places (so that the total number of digits before and after the decimal is under 15) will make discrepancies irrelevant.
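
Python's `float` is also a double precision floating point number, so the limits described above can be checked with a few lines of Python. This only demonstrates the number format; it does not involve ExprTk or Columns++.

```python
LIMIT = 2 ** 53  # 9,007,199,254,740,992

# Whole numbers up to the limit convert to floating point without change.
exact_at_limit = float(LIMIT) == LIMIT

# One past the limit cannot be distinguished from the limit itself.
collides_past_limit = float(LIMIT + 1) == float(LIMIT)

# Most decimals are only approximated, so tiny discrepancies appear...
sum_is_inexact = (0.1 + 0.2) != 0.3

# ...which rounding to a reasonable number of places makes irrelevant.
equal_after_rounding = round(0.1 + 0.2, 10) == round(0.3, 10)

print(exact_at_limit, collides_past_limit, sum_is_inexact, equal_after_rounding)
```

All four comparisons evaluate to true.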

Wherever a numeric value is used, it is also possible for the value to be Not-a-Number, an indication that something which was expected to produce a number failed to do so. This can happen because you tried to get a number from the document, but the associated text could not be unambiguously interpreted as a number. It can also be the result of an undefined mathematical operation, such as dividing by zero. In most cases, if any of the inputs to an operation or function are Not-a-Number, the result is also Not-a-Number. When the result of a formula is Not-a-Number, Columns++ does not insert any text (aside from a tab and/or spaces needed to keep columns aligned).
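
The way Not-a-Number spreads through a calculation follows from the floating point format itself and can be seen in Python, used here purely as a stand-in; how individual ExprTk functions treat Not-a-Number is not shown.

```python
import math

nan = math.nan

# Arithmetic with Not-a-Number as an input yields Not-a-Number.
propagates = math.isnan(nan + 1) and math.isnan(nan * 0)

# Some operations on ordinary values are undefined and also give Not-a-Number.
undefined_result = math.isnan(math.inf - math.inf)

# Not-a-Number never compares equal to anything, including itself,
# so a dedicated test such as math.isnan is needed to detect it.
never_equal = nan != nan

print(propagates, undefined_result, never_equal)
```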

Variables and functions implemented by Columns++

When Calculate is applied to a rectangular selection, the formula given is evaluated once for each row of the selection (or, if a regular expression is given and Skip unmatched lines is checked, for each row in which the regular expression matches). The term row refers to the part of a single line in the document which is included in the rectangular selection, and the term current row is the row for which the formula is being evaluated, and into which the results of the formula will be inserted. Rows are processed in sequence, from top to bottom.

Variables and functions in Calculate command formulas
count: the total number of rows (lines) in the selection
index: counting from one, the row number within the selection of the current row
match: zero if no regular expression was given or if the regular expression did not match in the current row; otherwise, counting from one and including the current row, the number of rows on which the regular expression has matched
line: the line number of the line containing the current row (within the entire document, counting from one; that is, the same as the line number shown in the left margin if line numbers are enabled in Notepad++)
this: if a regular expression was given, the numeric value of the text matched by the regular expression; otherwise, the numeric value of the current row
col(n)
col(n,p)
col(n,p,v)

tab(n)
tab(n,p)
tab(n,p,v)

reg(n)
reg(n,p)
reg(n,p,v)
These functions retrieve the numeric value in the specified segment within a row (the selected part of a line). The col function divides the segments by white space (any run of blanks and/or tabs). The tab function divides by tab characters. The reg function retrieves regular expression capture groups.
n: the column, tab or capture group to retrieve, numbering from one; if zero, col and tab return the entire row (the selected part of the line), while reg returns the portion of the row matched by the regular expression
p: if given, the number of rows previous to the current one from which the value is to be retrieved; if omitted or zero, the current row is accessed. If a regular expression is specified and Skip unmatched lines is checked, unmatched lines will be ignored when counting backward to previous lines.
v: a numeric value to use instead of Not-a-Number if there are not p previous rows, if there are not n columns, tabs or capture groups in the indicated row, or if the indicated text cannot be unambiguously interpreted as a number
last
last(p)
last(p,v)

The last function with no arguments represents the last result of a calculation that was not Not-a-Number; if the current row is the first row, or if no previous calculations have resulted in anything other than Not-a-Number, last is zero.

When p or p and v are specified, they are interpreted as for col/tab/reg, except that p cannot be zero; the function retrieves the result of the calculation on the row indicated, substituting v, if specified, for Not-a-Number.
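
The following Python sketch imitates a few of these variables and functions to show how they relate; it is not how Columns++ implements them. In particular, `to_number` uses Python's own `float` parsing instead of the rules in Number formats, the `p` argument is left out, and `running_total` is a hypothetical helper that evaluates the formula this + last for each row.

```python
import math

NAN = math.nan


def to_number(text):
    """Parse text as a number, or return Not-a-Number (simplified rules)."""
    try:
        return float(text.strip())
    except ValueError:
        return NAN


def _pick(segments, row, n, v):
    """Numeric value of segment n (from one); n == 0 means the whole row."""
    if n == 0:
        value = to_number(row)
    elif 0 < n <= len(segments):
        value = to_number(segments[n - 1])
    else:
        value = NAN                 # no such column in this row
    return v if math.isnan(value) else value


def col(row, n, v=NAN):
    """Like col(n) and col(n,0,v): segments divided by runs of blanks and tabs."""
    return _pick(row.split(), row, n, v)


def tab(row, n, v=NAN):
    """Like tab(n) and tab(n,0,v): segments divided by tab characters."""
    return _pick(row.split("\t"), row, n, v)


def running_total(rows):
    """Evaluate `this + last` for each row, from top to bottom."""
    results = []
    last = 0.0  # zero until some row has produced a real number
    for row in rows:
        result = to_number(row) + last
        results.append(result)
        if not math.isnan(result):
            last = result           # Not-a-Number results are skipped by `last`
    return results
```

For the row `12  apples<tab>3.5`, col(3) and tab(2) both find 3.5, while col(2) and tab(1) are Not-a-Number because `apples` and `12  apples` are not numbers.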

When formulas are used in regular expression replacements for the Search dialog, the formula given is evaluated once for each match. Formulas in regular expression replacements are specified as:
(?=formula) or (?=format:formula)

Variables and functions in Search regular expression replacement formulas
match: counting from one and including the current match, the number of times the regular expression has been matched and replaced (When doing stepwise find and replace, matches which were found but not replaced are not counted.)
line: the line number in which the current match begins (within the entire document, counting from one; that is, the same as the line number shown in the left margin if line numbers are enabled in Notepad++)
this: the numeric value of the text matched by the regular expression
reg(n)
reg(n,p)
reg(n,p,v)
These functions retrieve the numeric value of a regular expression capture group.
n: the capture group the value of which is to be retrieved, numbering from one; zero returns the value of the text matched by the regular expression
p: if given, the number of matches previous to the current one from which the value is to be retrieved; if omitted or zero, the current match is accessed
v: a numeric value to use instead of Not-a-Number if there are not p previous matches, if there are not n capture groups in the indicated match, or if the indicated text cannot be unambiguously interpreted as a number
sub(n)
sub(n,p)
sub(n,p,v)
These functions retrieve the numeric value previously calculated for a substitution.
n: the formula substitution the value of which is to be retrieved, counting from the left and numbering from one; zero refers to the formula in which the expression appears
p

If given, p specifies the number of matches previous to the current one from which the value is to be retrieved; zero retrieves the value from the current match if n is greater than zero and less than the number of the current formula, otherwise the result is Not-a-Number.

If omitted, indicates the last result for the nth formula substitution which was not Not-a-Number; if the current match is the first match, or if no previous matches have resulted in anything other than Not-a-Number, the result is zero.

v: a numeric value to use instead of Not-a-Number if the result would otherwise be Not-a-Number
last
last(p)
last(p,v)

equivalent to sub(0), sub(0,p) and sub(0,p,v)
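
A replacement such as (?=this*2) or (?=match) can be approximated in Python with `re.sub` and a callback, which may help in picturing what this and match stand for. This sketch assumes Python's `re` module in place of Boost.Regex, a hypothetical helper named `replace_with_formula`, a pattern that only matches text `float` can parse, and output in the spirit of the default format (up to six decimals, trailing zeros suppressed).

```python
import re


def replace_with_formula(text, pattern, formula):
    """Replace each match with formula(this, match).

    this  -- numeric value of the matched text
    match -- number of replacements made so far, counting from one
    """
    count = 0

    def substitute(found):
        nonlocal count
        count += 1
        this = float(found.group(0))
        result = formula(this, count)
        # Up to six decimals, without trailing zeros or a bare separator.
        return f"{result:.6f}".rstrip("0").rstrip(".")

    return re.sub(pattern, substitute, text)
```

For example, doubling every number in `a=5 b=2.5` corresponds to the formula this*2, and replacing each `0` in `item 0, item 0, item 0` with the formula match numbers the items 1, 2, 3.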

Format specifications for Search regular expression replacement formulas
When a format is not specified, the default is 1.-6 (up to six decimals, suppress trailing zeros, suppress decimal separator if nothing follows, no leading zeros except that a digit is required before the decimal point).
n One or two digits specify the minimum number of integer digits to be shown (i.e., shorter values will be left-padded with zeros); 0 indicates that a leading zero is not required for decimals. If omitted, the default is 1.
t The letter t (or T) specifies that the result of the formula will be shown in time format. If n is used it must appear before t; n then applies to the leftmost time segment in each result, regardless of what time unit that represents.
.
,
A period or a comma indicates that decimal places can be shown, using the current decimal separator (see Options — it doesn’t matter whether you use a period or a comma in the format). When a format is specified without a decimal indicator, decimals are rounded and not shown. If a decimal indicator is present but no additional specification follows, the default is .-6 (up to six decimal places, suppress trailing zeros, suppress decimal separator if nothing follows).
One of the following can follow the decimal indicator if it is specified:
d one or two digits specifying the exact number of decimal places to be shown
-d one or two digits specifying the maximum number of decimal places to be shown, omitting any trailing zeros and decimal separator
m-d Up to d decimal places will be shown, but no fewer than m. If m is 0, all decimal places will be omitted if they are zeros, but the decimal separator will still be shown.
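
One possible reading of these format rules is sketched below in Python. It is an interpretation for illustration, not the plugin's parser: the names `format_value` and `_SPEC` are hypothetical, the time indicator t is not supported, and showing no separator for an exact count of zero decimals is an assumption.

```python
import re

# n, then an optional decimal indicator followed by "d", "-d" or "m-d"
_SPEC = re.compile(r"(\d{1,2})?(?:([.,])(?:(\d{1,2})?-(\d{1,2})|(\d{1,2}))?)?")


def format_value(value, spec="1.-6", separator="."):
    """Format value according to spec; separator is the current decimal separator."""
    parsed = _SPEC.fullmatch(spec)
    if parsed is None:
        raise ValueError(f"unsupported format: {spec!r}")
    n, indicator, low, high, exact = parsed.groups()

    int_digits = int(n) if n else 1
    keep_separator = False
    if indicator is None:            # no decimal indicator: round, show no decimals
        min_dec = max_dec = 0
    elif exact is not None:          # .d   exactly d decimal places
        min_dec = max_dec = int(exact)
    elif high is not None:           # .-d or .m-d
        min_dec, max_dec = int(low or 0), int(high)
        keep_separator = low is not None
    else:                            # bare indicator means .-6
        min_dec, max_dec = 0, 6

    digits = f"{abs(value):.{max_dec}f}"
    whole, _, frac = digits.partition(".")
    while len(frac) > min_dec and frac.endswith("0"):
        frac = frac[:-1]             # drop trailing zeros beyond the minimum
    if int_digits == 0 and whole == "0" and frac:
        whole = ""                   # n = 0: no leading zero before decimals
    else:
        whole = whole.zfill(int_digits)
    sign = "-" if value < 0 and float(digits) != 0 else ""
    tail = separator + frac if frac or keep_separator else ""
    return sign + whole + tail
```

Under this reading, 2.5 with the format 3.2 gives 002.50, 0.5 with 0.-2 gives .5, and 7 with .0-3 gives 7 followed by the decimal separator.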

Syntax of formulas

Formulas are written using most of the common conventions for writing mathematical expressions in typical programming languages: numbers are written with an optional minus sign, digits and an optional decimal point (no commas); +, -, *, /, % and ^ indicate addition, subtraction, multiplication, division, remainder and exponentiation; parentheses are used to indicate order of operations. You can also use logical expressions built from common operators, including = or ==, != or <>, <, <=, >, >=, & and |, in a conditional expression:

test ? option1 : option2 yields option1 if test is true, option2 if test is false

so col(1)>10?col(2):col(3) gives the content of column 2 if column 1 is greater than 10, otherwise the content of column 3.

Formulas can use the many functions built into ExprTk, including these common mathematical functions:

abs: absolute value
avg: average of any number of values
ceil: smallest integer greater than or equal to
erf: error function
erfc: complementary error function
exp: e to the power of the given value
floor: largest integer less than or equal to
frac: fractional (decimal) part
hypot: hypotenuse of a right triangle from two sides (eg: hypot(x,y) = sqrt(x*x + y*y))
log: natural logarithm
log10: base 10 logarithm
log2: base 2 logarithm
max: largest of any number of values
min: smallest of any number of values
ncdf: normal cumulative distribution function
round: round to the nearest integer
roundn: round the first argument to the number of decimal places specified by the second argument
sqrt: square root
trunc: integer part (round toward zero)

and trigonometric functions (in all cases, angles are expressed in radians):

acos: arc cosine; argument in the interval [-1,+1]
acosh: inverse hyperbolic cosine
asin: arc sine; argument in the interval [-1,+1]
asinh: inverse hyperbolic sine
atan: arc tangent
atan2: two-argument arc tangent; result in the interval [-pi,+pi]
atanh: inverse hyperbolic tangent
cos: cosine
cosh: hyperbolic cosine
cot: cotangent
csc: cosecant
deg2grad: convert from degrees to gradians
deg2rad: convert from degrees to radians
grad2deg: convert from gradians to degrees
rad2deg: convert from radians to degrees
sec: secant
sin: sine
sinc: sine cardinal
sinh: hyperbolic sine
tan: tangent
tanh: hyperbolic tangent

ExprTk expressions have many more features which are described in Sections 8, 12, 13 and 20 of the documentation for ExprTk. Columns++ supports the return call (section 20 of the ExprTk documentation). The returned strings and scalar values will be concatenated and inserted in the new column; if two or more scalar values are specified without intervening strings they will be separated by a tab character (if Tabbed is checked) or a single space. The concatenated string will be left aligned, regardless of whether Numeric aligned is checked.

Alignment

Align left, Align right, Align numeric and Align... process rectangular selections. The selection can be a single column, or multiple columns separated by elastic tabs. (These commands can be used on selections that include traditional fixed tabs; but the results may not be as expected, since they treat tabs as logical separators, ignoring physical positioning.) Alignment is accomplished by adding and/or removing ASCII spaces at the beginning and/or end of a column; therefore, precise alignment is not always possible when using proportionally-spaced fonts.

Align numeric

Details about how numbers are recognized and interpreted are given in the section on Number formats. The alignment of items which are not recognized as numbers is unchanged. The Decimal separator is comma item near the bottom of the Columns++ menu determines whether the comma or the period is the decimal separator.

The settings in the Time formats dialog determine how numeric alignment proceeds when there are numbers with colons. The Numbers with one or two colons represent setting identifies which colon (days:hours, hours:minutes or minutes:seconds) is present in all numbers with colons; that colon is aligned across all lines. Numbers without colons are aligned according to the Time units: numbers with no colons represent setting, such that they line up with the position that same unit would occupy in a time-formatted number with four segments, all of which except days have two integer digits (e.g., “1:00:00:00”).
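
The basic idea of numeric alignment, padding with spaces so that decimal separators share one position, can be sketched in Python. This is a simplified illustration, not the plugin's algorithm: the function name is hypothetical, every item is assumed to be a plain number, and thousands separators, times and unrecognized items are not handled.

```python
def align_numeric(items, decimal="."):
    """Pad plain numbers with spaces so their decimal separators line up."""
    split = []
    for item in items:
        whole, sep, fraction = item.strip().partition(decimal)
        split.append((whole, sep + fraction))
    left = max((len(whole) for whole, _ in split), default=0)
    right = max((len(rest) for _, rest in split), default=0)
    # Integer parts are right-justified, decimal parts left-justified.
    return [whole.rjust(left) + rest.ljust(right) for whole, rest in split]
```

Because the padding is made of whole space characters, the result lines up exactly only in a monospaced font, as noted in the Alignment section.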

Custom alignment

The Align... command opens a dialog which allows you to specify a string of one or more characters, or a regular expression, to be aligned in the column or columns within a rectangular selection:

Align by

Specify a character, a string of characters, or a regular expression to be aligned in each column.

Items in which this character string or regular expression does not match are not changed.

Regular expression matches are aligned by the start (leftmost position) of the match. To align by some other part of the match, rewrite it using a lookbehind assertion or the \K directive. It does not matter what characters are included in the match, only the position at which it starts.

First
Last
Regular expression
Choose whether to align using the first or the last occurrence of the Align by character or string in each row, or whether to interpret the Align by specification as a regular expression.
Match case: Check to distinguish upper and lower case when the string to be matched includes letters.
The following settings can be useful when the column to be processed includes some lines which will not be aligned (because the Align by character, string or expression does not match). Since unmatched items are not changed, you can use the margin settings to control how the set of aligned items, taken together, is placed relative to the unmatched lines. In other situations, the default of 0 Left is usually best.
Margin: Specify the number of space characters, if any, to be used as a margin between the edge of the column and the aligned items.
Left
Right
Choose whether aligned items should be positioned relative to the left side or the right side of the column in which they occur. The margin is relative to this side of the column.
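
A minimal Python sketch of the Align by idea follows. It is an illustration under simplifying assumptions, not the plugin's implementation: the function name is hypothetical, items are plain strings from a single column, and the margin settings are not modeled. Python's re module has no \K directive, so a lookbehind assertion would be used to shift the alignment point.

```python
import re

def align_by(items, pattern, use_last=False):
    """Pad items on the left so the chosen match starts at the same position.

    pattern is a regular expression; pass re.escape(text) for a literal string.
    Items in which the pattern does not match are returned unchanged.
    """
    regex = re.compile(pattern)
    stripped = [item.strip(" ") for item in items]
    starts = []
    for text in stripped:
        matches = list(regex.finditer(text))
        if matches:
            starts.append(matches[-1 if use_last else 0].start())
        else:
            starts.append(None)
    target = max((s for s in starts if s is not None), default=0)
    return [
        original if start is None else " " * (target - start) + text
        for original, text, start in zip(items, stripped, starts)
    ]
```

Only the start of each match is used, which matches the rule that the characters included in the match do not matter.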

Sorting

Notepad++ supports sorting lines using a rectangular selection to define the sort keys, but this does not work as expected when tabs (whether elastic or traditional fixed) are used. The sort commands in Columns++ use a rectangular selection to identify the sort keys and work as expected when tabs are present. These are “stable” sorts, meaning the order of lines with equal sort keys is unchanged. There are three variants of ascending and descending sorts:

binary: The raw byte values of the internal representations of the selected sort strings are used as sort keys. For most purposes, this matches what you would expect from a “case sensitive” sort, with the sort order dependent on the active code page. Unicode files sort by code point.
locale: The sort order is defined by the current Windows locale. For most purposes, this matches what you would expect from a “case insensitive” sort.
numeric: The selections on each line are interpreted as tab-separated numbers. The Number formats section describes in detail how Columns++ recognizes numbers. Items which can’t be interpreted as numbers sort first (whether the sort is ascending or descending).
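
The two properties described for numeric sorts (stability, and non-numbers sorting first in either direction) can be sketched in Python, whose built-in sorted is itself stable. This is an illustration only: the function name is hypothetical, the key is taken from fixed character positions rather than a real rectangular selection, and float() stands in for the number recognition rules in the Number formats section.

```python
def numeric_sort(lines, start, end, descending=False):
    """Stable numeric sort of lines, keyed on the characters in [start, end)."""
    def key(line):
        text = line[start:end].strip()
        try:
            value = float(text)
        except ValueError:
            return (0, 0.0)          # non-numbers sort first in both directions
        return (1, -value if descending else value)
    return sorted(lines, key=key)
```

Negating the value for a descending sort, instead of reversing the result, keeps lines with equal keys in their original order and keeps the non-numeric lines at the top.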

Custom sorts

In addition to the six immediate sort commands on the Columns++ menu, you can use the Sort... command to open a dialog giving you more control over the details of the sort:

What to sort
Whole lines: Individual lines remain intact and are sorted using the column selection to define the sort keys.
Selected text only

Only the selected portions of lines are sorted; the surrounding text on each line remains in place.

Note: This will result in blank-padding lines in the selection which do not extend to or past the right boundary of the column selection. If elastic tabstops are enabled and the number of tabs included in the column selection is different on different lines (for example, because some lines are short), results using Selected text only are unlikely to be as expected.

Sort direction
Ascending: Smaller numbers, narrower text, or characters earlier in the collating sequence, come first.
Descending: Larger numbers, wider text, or characters later in the collating sequence, come first.
Sort type
Binary: The raw byte values of the internal representations of the selected sort strings are used as sort keys. For most purposes, this matches what you would expect from a “case sensitive” sort, with the sort order dependent on the active code page. Unicode files sort by code point.
Locale: The sort order is defined by a Windows locale, as specified in the Locale sort details section.
Numeric: Sort strings are interpreted as numbers, as described in the Number formats section. Strings which can’t be interpreted as numbers sort first (whether the sort is ascending or descending). When Regular expression is selected, the regular expression is used to parse the selected text on each line; in all other cases, the text is interpreted as a sequence of tab-separated values.
Width: The visible widths of the selected sort strings are used as keys.
Sort key
Entire column: The selected text on each line is used as the sort key.
Ignore surrounding blanks/tabs: For Binary and Locale sorts, leading and trailing blanks and tabs in the text selected on each line are ignored, and the remaining text is used as the sort key. For Numeric sorts, this option behaves the same as Entire column (the text is treated as tab-separated values regardless of which option is selected, which is the same as the immediate numeric sorts on the Columns++ menu).
Tabbed: The selected text is tab-separated; sort keys must be specified in the Keys box.
Regular expression: A regular expression is used to parse the selected text on each line.
Find what: Specifies a regular expression. The first match of the expression within the selected text in each line will be used to determine the sort key.
Match case: When checked, the regular expression match is case sensitive; otherwise, the case of the text is ignored.
Specify keys using capture groups: When checked, the Keys box specifies the sort sequence in terms of capture groups. When unchecked, the text matched by the regular expression is used as the sort key.
Keys

A list of keys, separated by spaces, commas and/or semicolons, to be used for sorting. The major sort key is listed first, with subsequent keys having lower precedence. Each key is designated with a number. If Tabbed is selected, the number indicates a tab-separated field, numbered left to right counting from 1; 0 represents the entire selected text in the line. If Regular expression is selected, the number is the number of a capture group; 0 represents the entire match.

Each sort key number may be followed (without intervening spaces) by one of the letters a or d, and/or one of the letters b, l, n or w. These specify ascending, descending, binary, locale, numeric and width, overriding the selections in the Sort direction and/or Sort type boxes for the capture group or tab field to which they are appended.
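
The key syntax can be illustrated with a small Python parser. This is a sketch of the syntax as described above, not the plugin's parser: the function name and the tuple layout are hypothetical, and accepting the two optional letters in either order is an assumption.

```python
import re

DIRECTIONS = "ad"   # ascending, descending
TYPES = "blnw"      # binary, locale, numeric, width

def parse_keys(spec, direction="a", sort_type="b"):
    """Parse a Keys string into (number, direction, type) tuples.

    direction and sort_type are the defaults taken from the
    Sort direction and Sort type boxes.
    """
    keys = []
    for token in re.split(r"[ ,;]+", spec.strip()):
        if not token:
            continue
        match = re.fullmatch(r"(\d+)([adblnw]{0,2})", token)
        if match is None:
            raise ValueError(f"unrecognized key: {token!r}")
        suffix = match.group(2)
        if (sum(c in DIRECTIONS for c in suffix) > 1
                or sum(c in TYPES for c in suffix) > 1):
            raise ValueError(f"conflicting letters in key: {token!r}")
        key_direction, key_type = direction, sort_type
        for letter in suffix:
            if letter in DIRECTIONS:
                key_direction = letter
            else:
                key_type = letter
        keys.append((int(match.group(1)), key_direction, key_type))
    return keys
```

For example, "2n, 1d;0" describes three keys: field 2 sorted numerically, then field 1 descending, then the entire selected text (or entire match) using the dialog defaults.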

Locale sort details
Locale sorting makes use of the Windows API function LCMapStringEx. The exact behavior of the sort is dependent on the exact behavior of this function; the following attempts to describe the most important points.
Language: Selects the language for which a selection of locales will be offered.
Locale: Selects a Windows locale from those available for the selected language.
Case sensitive: Case sensitivity in a linguistic sort is not applied character by character; instead, when and only when two strings match completely except for case, case is applied to further sort them. Consequently, when this box is checked, the result will still not resemble what most users expect of a “case sensitive” sort. When this box is unchecked, the LINGUISTIC_IGNORECASE flag is passed to LCMapStringEx.
Sort digits as numbers: This causes the sort to attempt to recognize strings including digits (like “data5” and “data10”) in such a way that “data5” will sort before “data10” in an ascending sort instead of after. The same algorithm is used to sort file names in Windows File Explorer. When this box is checked, the SORT_DIGITSASNUMBERS flag is passed to LCMapStringEx.
Ignore diacritics: Windows API documentation says: Ignore nonspacing characters, as linguistically appropriate. Note: This flag does not always produce predictable results when used with decomposed characters, that is, characters in which a base character and one or more nonspacing characters each have distinct code point values. When this box is checked, the LINGUISTIC_IGNOREDIACRITIC flag is passed to LCMapStringEx.
Ignore symbols and punctuation: This causes spaces, punctuation and “symbols” (the documentation is not more specific) to be ignored. Strings are sorted as if all the letters and numbers were run together, ignoring spaces, hyphens, periods and so on. When this box is checked, the NORM_IGNORESYMBOLS flag is passed to LCMapStringEx.
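
The effect of Sort digits as numbers can be approximated in Python with a common "natural sort" key. This is only an analogy for the “data5” before “data10” behavior: it is not the Windows algorithm, it does not call LCMapStringEx, and the function name is hypothetical.

```python
import re

def natural_key(text):
    """Split text into alternating non-digit and digit runs.

    Digit runs (the odd positions after splitting) are compared as integers.
    """
    parts = re.split(r"(\d+)", text)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]

names = ["data10", "data5", "data1"]
plain = sorted(names)                     # character-by-character comparison
natural = sorted(names, key=natural_key)  # digit runs compared as numbers
```

Here plain places "data10" before "data5", while natural places "data5" before "data10".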

Conversion

Convert tabs to spaces

Use Convert tabs to spaces on any selection to replace tabs in the selection with equivalent spaces, taking elastic tabstops into account if enabled. If nothing is selected, the entire file is converted.

Convert separated values to tabs...
Convert tabs to separated values...

These commands convert the selection, or the entire file if nothing is selected, between delimiter-separated values (typically *.csv, comma-separated values) and tabbed presentation (typically *.tsv or tab-separated values).

Both delimiter-separated values and tab-separated values use a structure composed of records (rows) containing fields (which are interpreted as being arranged in columns). In tabbed documents, each line of the file is a record, and fields within a record are separated by tabs. Fields cannot contain tabs or line-ending characters as such, but these can be encoded, typically using backslash notation (\t, \n, \r for tab, new line and return). Consistency requires that the encoding character must also be encoded (e.g., two backslashes in the file to represent a single backslash in the field’s value).

In delimiter-separated files, records are divided by line breaks and fields are divided by a separator character, typically a comma. However, when a field contains the separator character or line-ending characters, the problematic characters are escaped rather than encoded, meaning that the original character is still used in the file, but context indicates that it is not to be interpreted as a field or record separator. Typically, quote marks surround a field which contains line-ending or separator characters, and quotes within the field are doubled.
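
The difference between encoding (tabbed files) and escaping by quoting (delimiter-separated files) can be sketched in Python. These helpers illustrate the typical conventions described above; they are not the conversion code in Columns++, and the function names are hypothetical.

```python
def encode_tabbed(value, marker="\\"):
    """Encode a field for a tabbed file: the marker is doubled first,
    then tab, new line and return become marker + t, n, r."""
    value = value.replace(marker, marker * 2)
    return (value.replace("\t", marker + "t")
                 .replace("\n", marker + "n")
                 .replace("\r", marker + "r"))

def decode_tabbed(text, marker="\\"):
    """Reverse encode_tabbed, scanning left to right."""
    decoded = {"t": "\t", "n": "\n", "r": "\r", marker: marker}
    out, i = [], 0
    while i < len(text):
        if text[i] == marker and i + 1 < len(text):
            following = text[i + 1]
            out.append(decoded.get(following, marker + following))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def quote_separated(value, separator=","):
    """Quote a field for a delimiter-separated file only when it needs it;
    quotes inside a quoted field are doubled."""
    if any(ch in value for ch in (separator, '"', "\n", "\r")):
        return '"' + value.replace('"', '""') + '"'
    return value
```

In the tabbed form the tab character disappears from the field and is represented by two other characters; in the quoted form the comma or line break stays in the file and the surrounding quotes change how it is read.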

There are many variations in the details of data representation in delimiter-separated and tab-separated values files. When you select Convert separated values to tabs... or Convert tabs to separated values..., Columns++ displays a dialog in which you can adjust the conversion accordingly:

Column separator
Comma, Semicolon or Vertical line selects the column separator for the separated values.
Other specifies the column separator as any single character within the Unicode Basic Multilingual Plane except for null, line feed or carriage return.
Separated values syntax
Quote and Apostrophe recognize quotes and/or apostrophes at the beginning of a field as the start of a quoted field, in which line-ending and separator characters are part of the field value.
Escape character defines an escape character for separated values. The character following an escape character is used unchanged as a part of the field value, without any special meaning (that is, it doesn’t separate fields or records or begin or end a quoted field).
Preserve quotes, escapes and blanks when converting to tabbed

indicates that quotation marks, apostrophes, escape characters and leading and trailing blanks within separated values fields are copied as is to the tabbed presentation.

This tends to “clutter” the appearance of the tabbed document; however, it makes it possible to “round-trip” to tabs and back to separated values without any change in fields that were not edited. If you intend to convert a separated values file to tabbed presentation for ease of editing and there are non-standard details in the way quotes, escapes or blanks are used in the separated values file which must be preserved when converting back, keep this box checked for both conversions.

When this box is unchecked, Columns++ quotes or escapes fields containing leading blanks or quotes anywhere in the field when converting from tabbed presentation to separated values. When checked, so long as the field will not cause a parsing failure — such as by containing an unquoted and unescaped separator character, or by beginning with a quote but not being a properly quoted field when taken in its entirety — Columns++ will preserve the field as is.

Tab, new line and return characters in tabbed documents

Fields in tabbed presentation cannot contain tabs or line-ending characters; if there are any of these characters in separated values fields, they must be encoded or replaced when converting to tabs.

Backslash-style encoding

The specified character (\ by default) followed by t, n or r encodes a tab, new line or return; the encoding character is doubled to indicate a single occurrence in the data. Encoding is applied when converting from separated values to tabbed presentation and reversed when converting from tabbed presentation to separated values.

This encoding method, using the backslash as the encoding character, is probably the most commonly-understood way to represent tabs and line-ending characters in tabbed presentation; however, it is inconvenient for reading and editing if the data includes Windows file paths, since all backslash characters in the data are doubled.

URL-style encoding

The specified character (% by default) followed by two hexadecimal digits (numeric digits or the letters A-F in either case) encodes a byte value; %09, %0A and %0D encode tab, line feed and return. When converting from separated values to tabbed presentation, these three are encoded; the per cent symbol or other specified character is encoded only if it is followed by two hexadecimal digits. When converting from tabbed presentation to separated values, any occurrence of the specified character followed by two hexadecimal digits is decoded to a byte in the code page active for the file.

Replace when converting to tabbed

indicates that the disallowed characters are replaced with the text specified when converting to tabbed presentation; no attempt is made to restore the original characters when converting from tabs to separated values.
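
The URL-style rules can be sketched in Python as follows. This is an illustration, not the plugin's code: the function names are hypothetical, and decoding maps each two-digit value directly to the character with that code, which ignores the code-page handling described above and is therefore only faithful for ASCII values.

```python
import re

HEX_PAIR = r"[0-9A-Fa-f]{2}"

def url_style_encode(value, marker="%"):
    """Encode tab, line feed and return; encode the marker itself only
    where it is followed by two hexadecimal digits."""
    marker_code = marker + format(ord(marker), "02X")
    value = re.sub(re.escape(marker) + "(?=" + HEX_PAIR + ")",
                   lambda _: marker_code, value)
    return (value.replace("\t", marker + "09")
                 .replace("\n", marker + "0A")
                 .replace("\r", marker + "0D"))

def url_style_decode(text, marker="%"):
    """Decode every marker followed by two hexadecimal digits.
    Simplification: the value is used as a character code, not a code-page byte."""
    return re.sub(re.escape(marker) + "(" + HEX_PAIR + ")",
                  lambda m: chr(int(m.group(1), 16)), text)
```

Because the marker is left alone unless two hexadecimal digits follow it, text such as "50% off" passes through unchanged, which is the readability advantage over doubling every backslash.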

Number formats

Columns++ interprets characters in a document as numbers in many contexts, including the Calculation commands, the Align numeric command, numeric sort fields and formula substitutions in regular expression searches.

Numbers can include thousands separators and decimals. The Decimal separator is comma item near the bottom of the Columns++ menu determines whether the comma or the period is the decimal separator; thousands separators may be a space, an apostrophe, or whichever of comma or period is not the decimal separator. Numbers can also be times, using colons to separate days, hours, minutes and seconds.

There is some flexibility in what can be included along with a number in a column or a regular expression match. Common currency signs can precede the number with or without a space, and a minus sign can precede or follow a currency sign. Non-numeric characters (such as units, like “mg” or “ft”) can follow the number. (These are not interpreted, though; Columns++ will add 5 yards and 5 inches to get 10 without complaint.) Non-numeric characters can precede the number if they are separated from the number by at least one space.

  • Add numbers and Average numbers skip items that have no digits; but if an item which includes one or more digits cannot be unambiguously interpreted as a number, Columns++ will select the item and will not perform the calculation.
  • In formulas, variables and functions which represent numbers in the document are set to Not-a-Number if the associated document text cannot be unambiguously interpreted as a number.
  • When sorting numerically, fields which cannot be unambiguously interpreted as numbers sort to the beginning.
  • Align numeric uses slightly more lenient rules for recognizing numbers; the alignment of items which are not recognized as numbers is unchanged.
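
A much-simplified Python sketch of the recognition rules described in this section follows. It is an assumption-laden illustration, not the rules Columns++ applies: the function name and the set of currency signs are hypothetical, times with colons are not handled, and text before the number is not allowed.

```python
import re

def parse_number(text, decimal_is_comma=False):
    """Return the numeric value at the start of text, or None if none is found."""
    decimal, thousands = (",", ".") if decimal_is_comma else (".", ",")
    pattern = re.compile(
        r"\s*(-?)\s*[$€£¥]?\s*(-?)"                           # minus before or after currency
        r"(\d+(?:[ '" + re.escape(thousands) + r"]\d{3})*)"   # digits, optional thousands groups
        r"(?:" + re.escape(decimal) + r"(\d+))?"              # optional decimal part
    )
    match = pattern.match(text)
    if match is None:
        return None
    minus_before, minus_after, whole, fraction = match.groups()
    digits = re.sub(r"\D", "", whole)      # drop the thousands separators
    value = float(digits + "." + (fraction or "0"))
    return -value if (minus_before or minus_after) else value
```

As in the document's own example, trailing units are ignored rather than interpreted, so parse_number("5 yards") + parse_number("5 inches") is 10.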

Decimal separator is comma

Decimal separator is comma may be checked or unchecked to control how Columns++ interprets numbers. This setting is maintained per document (while the document remains open), so it can be different in different tabs.

Time formats

Time formats... opens a dialog that allows you to control how Columns++ interprets and shows numbers represented as times.

Time units: numbers with no colons represent
days
hours
minutes
seconds
Select the unit for calculations involving times. In calculations involving times, times specified without colons are interpreted as being in this unit, and times specified with colons are converted to this unit.
Numbers with one or two colons represent
days:hours and days:hours:minutes
hours:minutes and days:hours:minutes
hours:minutes and hours:minutes:seconds
minutes:seconds and hours:minutes:seconds
Select the way times with one or two colons will be interpreted. (Times with three colons are always days:hours:minutes:seconds.)
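
The way these two settings combine can be sketched in Python. This is an illustration of the arithmetic only: the function and parameter names are hypothetical, negative times are not considered, and the input is assumed to contain plain numbers separated by colons.

```python
SECONDS_PER_UNIT = {"days": 86400, "hours": 3600, "minutes": 60, "seconds": 1}

def time_to_unit(text, unit, one_colon, two_colons):
    """Convert a time such as "1:30" to a single number expressed in unit.

    one_colon and two_colons name the segments for times with one or two
    colons, e.g. ("hours", "minutes") and ("hours", "minutes", "seconds").
    """
    parts = [float(part) for part in text.split(":")]
    if len(parts) == 1:
        return parts[0]                     # no colons: already in the chosen unit
    layouts = {
        2: one_colon,
        3: two_colons,
        4: ("days", "hours", "minutes", "seconds"),
    }
    seconds = sum(value * SECONDS_PER_UNIT[name]
                  for value, name in zip(parts, layouts[len(parts)]))
    return seconds / SECONDS_PER_UNIT[unit]
```

With the unit set to hours and one colon meaning hours:minutes, "1:30" becomes 1.5, so it can be added to a plain 2 (taken as 2 hours) to give 3.5.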
Results of calculations can use these formats for times
1 segment   unit
2 segments  unit1:unit2
3 segments  unit3:unit4:unit5
4 segments  days:hours:minutes:seconds
Check the box before each format to enable results to be shown in that format when times are displayed. The Calculation commands and formula substitutions in regular expression search replacements can show their results as times; when time display is enabled, these settings determine which formats can be used for results.

Options

Options... opens a dialog that allows you to control some aspects of Columns++:

Show Columns++ on the main menu bar.
Lets you choose whether to add an entry for Columns++ to the main menu bar, just to the left of the Plugins menu, or leave it as an entry on the Plugins menu.
Replace: Don't move to the following occurrence.
Has the same effect as the option of the same name on the Searching panel of the Preferences dialog in Notepad++, but for the Search in indicated region dialog in Columns++. When checked, the Replace button in the search dialog does not immediately perform another find after replacing text; in effect, the button alternates between finding and replacing, giving you a chance to see the effect of the replace before moving to the next occurrence of the search string.
Show Elastic tabstops progress dialog when seconds remaining exceeds about:
Lets you choose the minimum estimated remaining time, from 1 to 20 seconds, that will cause Columns++ to display a progress dialog during a long-running Elastic tabstops operation. The default is 2.
Automatically extend selections to form rectangles
You can enable “implicit” selections for Columns++ commands that require rectangular selections, bypassing the dialogs that ask you if you want to make a rectangular selection.
Selections on one line extend downward to the last line. A selection of one or more characters on a single line is “projected” downward to the last line of the file. This allows you to select full columns (skipping headers, if desired) without scrolling all the way to the end of the file. If the last line of the file is completely empty (that is, the file ends with an end-of-line sequence) that line will not be included in the selection.
Full row selections are replaced by the enclosing rectangle. A single selection of complete lines is replaced by a rectangular selection wide enough to include all the text on all lines in the selection. Usually you get this kind of selection by dragging in the left margin. If the selection ends at the beginning of a line (as when dragging downward in the left margin), that line is not included in the rectangular selection. The selection made by Edit|Select All will be converted to a rectangular selection that encompasses the entire file (excluding the last line, if it consists of only a line ending).
Zero-width selections extend to the right to the end of the longest line. A “thin” selection, or a rectangular selection containing no characters or virtual space, is extended to the right far enough to enclose the end of the longest line in the selection. When selecting a rectangular region meant to extend from some column far enough to the right to include the ends of all lines, this avoids the need to scroll through and figure out how wide the selection needs to be.
Custom style for Search in indicated region

By default, Columns++ defines a custom style to indicate the area to be searched by shading the background. This uses a Scintilla resource called an indicator; in some cases this might conflict with other plugins, so some options to control it are available here. You can also choose the color and transparency of the background.

These options are disabled if the Search in indicated region dialog is open when the Options dialog is opened.

Enable custom style: When checked, a custom style will be available.
Alpha, Red, Green, Blue: Specify the transparency and color of the background for the custom style.
Override Notepad++ indicator allocation: Notepad++ version 8.5.6 introduced a mechanism to avoid conflicts in indicator numbers used by plugins. If this mechanism is available, Columns++ will set the indicator accordingly unless this box is checked. If you have plugins installed which have not been updated to use the new mechanism, it is possible that there will still be a conflict with the indicator number Notepad++ assigns; in that case, you can check this box to choose the indicator number manually. If this box is disabled, either the version of Notepad++ you are running does not support the indicator allocation mechanism, or there were no available indicators remaining when this plugin was loaded.
Indicator number: When Override Notepad++ indicator allocation is unchecked and a Notepad++ indicator allocation is available, this box shows the Scintilla indicator number allocated to Columns++. Otherwise it allows you to choose the indicator number Columns++ will use.
Check for updates and show a notification at the bottom of the Columns++ menu

Columns++ can check for updates. It does this by connecting to the Internet and requesting release information from GitHub. No information about your installation is sent to GitHub. This check is done when Notepad++ loads Columns++, no more often than once every twelve hours. If Columns++ finds a release newer than the one currently installed, it adds the notice Update Available to the Help/About... entry at the bottom of the Columns++ menu.

Show for any new release: Show a notice if a release newer than the one you have installed is found.
Show for stable releases only: Show a notice only if a stable release newer than the one you have installed is found. Releases are generally marked stable (production-ready) after they have been released for long enough that the author believes any serious problems would have been reported.
Do not check: Select this option if you do not want Columns++ to connect to GitHub to check for new releases.

Help/About

Help/About... provides access to release/version identification, this help file, and changelog, license and source information. It also allows you to check GitHub for the newest release and the latest stable release of Columns++.