Little Square Boxes (LSB*) in Foldit Recipes
In the Foldit recipe editor, many shared recipes contain little square boxes (LSB). The boxes result from how the original recipe was prepared and added to Foldit.
Two sources account for most of the LSB.
LSB at the end of a line have one source. LSB at the beginning or in the middle of a line have another source.
Both types of LSB can be cleaned up easily in many text editors. This article explores how to do this in Vim, Notepad++, and SciTE. Other editors probably have similar settings and procedures.
Before looking at how to fix the LSB, some background on where they came from may be helpful.
Little square boxes at the end of a line are leftover characters from the Windows environment. In Windows, files like the ones used for recipes are written out with two normally invisible characters at the end of each line: the carriage return and the line feed. These "control characters" are added so that editors like Windows Notepad can display the lines properly.
Other operating systems, like Linux and Mac OS, use only the line feed character, usually called the newline, as the line delimiter.
Foldit also uses only the newline character. When it sees an extra carriage return, it draws a little square box.
End-of-line little square boxes are most likely caused when a recipe has been pasted into the Foldit recipe editor. (Recipes imported from a file do not have little square boxes.) When a recipe with end-of-line LSB is exported to a file using Save As -> Export in the Foldit recipe editor, the LSB cause various problems in external editors.
Little square boxes not at the end of a line are mostly caused by the tab character, another "control character". In many editors, hitting the "tab" key on the keyboard moves the cursor over a set number of characters. Some editors write out the corresponding number of blank characters when the file is saved. Other editors write out the tab character itself. When a recipe containing tab characters is imported or pasted into the Foldit recipe editor, the tabs become LSB.
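To check a recipe for both kinds of LSB before touching an editor, a short script can count the characters involved. The following is a minimal Python sketch, not part of Foldit; the function name `count_lsb_sources` and the sample recipe text are made up for illustration.

```python
def count_lsb_sources(text: str) -> dict:
    """Count the control characters that this article says appear as LSB."""
    lines = text.split("\n")
    return {
        # A carriage return left at the end of a line gives an end-of-line LSB.
        "end_of_line_cr": sum(1 for line in lines if line.endswith("\r")),
        # Each tab character gives an LSB at the beginning or middle of a line.
        "tabs": text.count("\t"),
    }


# Hypothetical recipe fragment with Windows line endings and one tab.
sample = 'function main()\r\n\tprint("hi")\r\nend\n'
print(count_lsb_sources(sample))  # {'end_of_line_cr': 2, 'tabs': 1}
```

If both counts are zero, the recipe text should not produce LSB from either of the two sources described above.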
Vim, Notepad++, and SciTE can correct the problem of LSB. These editors also have configuration settings that help prevent LSB when you update the recipe.
Fixing End-of-line LSB
The easiest way to eliminate LSB in all three editors is to start with a blank document. Open the recipe in the Foldit recipe editor. Instead of exporting the recipe, copy the recipe (using cut-and-paste techniques), and paste it into your editor. For Vim, Notepad++, and SciTE, this eliminates the end-of-line LSB.
If a file has already been exported, SciTE displays it with a blank line between every original line. There doesn't seem to be an easy fix for the extra blank lines in SciTE. In this scenario, simply import the file back into Foldit, then use the copy-and-paste method.
In Notepad++, to remove extra blank lines, go to Edit > Line Operations > Remove Empty Lines. Note that this removes all blank lines, including blank lines used to format code. There are two ways to remove carriage returns. The first option is to navigate to Edit > EOL Conversion > Unix (LF). The second option is to open the replace dialog by navigating to Search > Replace or pressing Ctrl+H. In the bottom left of the dialog, select the "Extended..." radio button, type "\r\n" in the "Find what:" box, and type "\n" in the "Replace with:" box. Click "Replace All" on the right, then copy and paste the result back into the Foldit editor. To prevent carriage returns in new documents, change the new document settings: go to Settings > Preferences > New Documents and select the "Unix (LF)" option in the "Format (Line ending)" pane.
Vim, on the other hand, displays the end-of-line LSB as "^M" when opening an already exported file. The fix is simply to copy a "^M" and paste it into a command:
at a Vim command prompt. (Just typing "^" and then "M" won't work; you really have to copy and paste.)
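The same carriage-return cleanup can also be done outside an editor. The following is a minimal Python sketch that mirrors the Notepad++ replacement of "\r\n" with "\n"; the function name `strip_carriage_returns` is an illustrative choice, and the sample text is made up.

```python
def strip_carriage_returns(text: str) -> str:
    """Turn Windows CRLF line endings into the plain LF that Foldit expects."""
    return text.replace("\r\n", "\n")


# Hypothetical two-line recipe fragment with Windows line endings.
windows_text = "x = 1\r\ny = 2\r\n"
print(repr(strip_carriage_returns(windows_text)))  # 'x = 1\ny = 2\n'
```

Only carriage returns that are followed by a line feed are removed, so a carriage return standing alone is left untouched.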
Fixing LSB from tabs
Most LSB at the beginning or in the middle of a line come from tab characters. Fixing LSB caused by tab characters is similar in all three editors discussed here. The first step is to set options that determine how many blank spaces a tab is worth. These options generally apply to any new tab characters you type. The next step is to convert any existing tab characters in your recipe. All three editors have a built-in function to convert tabs to spaces.
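For comparison, the tab-to-space conversion itself can be sketched in a few lines of Python using the standard `str.expandtabs` method. This is only an illustration of the idea, assuming a tab size of 4; the function name `tabs_to_spaces` and the sample text are made up.

```python
def tabs_to_spaces(text: str, tabsize: int = 4) -> str:
    """Replace each tab with enough spaces to reach the next tab stop."""
    # str.expandtabs tracks the column and resets it at every line break,
    # so a tab in the middle of a line pads to the next multiple of tabsize.
    return text.expandtabs(tabsize)


# Hypothetical fragment: one leading tab and one tab in the middle of a line.
sample = "\tprint(1)\nx\t= 2\n"
print(repr(tabs_to_spaces(sample)))  # '    print(1)\nx   = 2\n'
```

Note that the tab after "x" becomes three spaces rather than four, because it only needs to pad out to column 4.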
Fixing LSB from tabs in Vim
In Vim, there are several "set" commands that affect how tabs are handled:
set expandtab
set shiftwidth=4
set softtabstop=4
set tabstop=4
These commands can be condensed to one line that can be added to your _vimrc file or entered at a command prompt:
set ts=4 sw=4 softtabstop=4 expandtab
With these settings in place, the built-in "retab" command can be run from a command prompt to convert all tabs to spaces in the current file.
Fixing LSB from tabs in Notepad++
In Notepad++, go to Settings -> Preferences -> Language. Under "Tab Settings", highlight "lua", then uncheck "Use default value", make sure the tab size is 4, and check "Replace by space". The next step is to convert existing tabs. Go to Edit -> Blank Operations, then select "TAB to Space". Save the updated recipe file.
Both the Tab Settings and TAB to Space operation are found in Notepad++ 7.1. Older versions of Notepad++ may work differently.
Fixing LSB from tabs in SciTE
In SciTE, many settings are controlled through options files. There are various levels of options files. For example, there's a global options file, which can be overridden by settings in a user options file.
The following settings control tab behavior in SciTE:
# Indentation
tabsize=4
indent.size=4
use.tabs=1
#indent.auto=1
indent.automatic=1
indent.opening=0
indent.closing=0
#tab.indents=0
#backspace.unindents=0
In particular, the use.tabs=1 setting causes tab characters to be written to the output file. The indent.auto setting is commented out (#) in this example, but if uncommented, it affects use.tabs and other settings.
The easiest way to change this setting is to create or edit a user options file via Options -> Open User Options File. Copy the lines shown above and paste them into the user options file. Change use.tabs=1 to use.tabs=0. Save the new or modified options file.
The next step is to convert existing tabs to spaces. In SciTE, select Options -> Change Indentation Settings. Then, in the Indentation Settings dialog, make sure the tab size and indent size are 4, and uncheck "Use tabs". Click on "Convert" to convert tabs to spaces. Save the updated recipe file.
These instructions are based on SciTE Version 1.75, which dates to 2009. This version of SciTE is included in the Lua for Windows distribution.
The steps listed above should eliminate extra carriage returns and convert any existing tab characters to spaces.
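For an already exported file, both fixes can be applied in one pass without opening an editor. The following is a minimal Python sketch under stated assumptions: the recipe file is assumed to be UTF-8 text, the tab size is assumed to be 4, and the function name `clean_recipe_file` is hypothetical. The file is read and written as raw bytes so that Python does not translate line endings on its own.

```python
from pathlib import Path


def clean_recipe_file(src: str, dst: str, tabsize: int = 4) -> None:
    """Write a copy of src with CRLF turned into LF and tabs turned into spaces."""
    # Read bytes, not text, so no automatic newline translation takes place.
    text = Path(src).read_bytes().decode("utf-8")  # assumes UTF-8 content
    text = text.replace("\r\n", "\n")   # drop the extra carriage returns
    text = text.expandtabs(tabsize)     # convert tabs to spaces
    # Write bytes for the same reason: on Windows, text mode would add CRs back.
    Path(dst).write_bytes(text.encode("utf-8"))
```

Writing to a separate destination file keeps the original export intact in case the result needs to be compared or the cleanup repeated with a different tab size.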
The key to preventing LSB from tab characters is to make sure your editor is set to replace tabs with spaces. The suggested four spaces per tab character is a widely used default.
The other part of the picture is to avoid end-of-line LSB. The best advice here is to avoid pasting the recipe into the Foldit editor on Windows. (More or less exactly the opposite of how you fix end-of-line LSB.)
Instead of pasting, use Load -> Import in the Foldit recipe editor to copy the recipe in from the file system on your computer. When you use this option, Foldit automatically converts the carriage return/line feed used by Windows to just plain line feed. A recipe imported this way can then be exported again via Save As -> Export, without creating any spurious carriage returns.
See Editing Foldit Recipes for a complete discussion of using the export and import functions of the Foldit recipe editor. Once you've eliminated the LSB, using import and export will help to keep them from coming back.
So far, we've kept this high-level, but mucking about with Foldit recipes is a nerdy business, so here goes.
The carriage return/line feed is of course also called a CRLF or maybe CR/LF. Sometimes you'll see the hexadecimal character codes as in 0x0D0A. So a carriage return is hexadecimal 0D or decimal 13, and a line feed is hex 0A or decimal 10.
Just to make things more confusing, the line feed is usually called a newline in the Unix world.
The tab character is 0x09, that is, hexadecimal 09, which is also decimal 9.
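These character codes are easy to confirm. The short Python sketch below prints the decimal and hexadecimal code of each control character mentioned above; the dictionary name `CONTROL_CHARS` is just an illustrative choice.

```python
# The three control characters discussed in this article.
CONTROL_CHARS = {"carriage return": "\r", "line feed": "\n", "tab": "\t"}

for name, ch in CONTROL_CHARS.items():
    print(f"{name}: decimal {ord(ch)}, hex 0x{ord(ch):02X}")
# carriage return: decimal 13, hex 0x0D
# line feed: decimal 10, hex 0x0A
# tab: decimal 9, hex 0x09

# The CRLF pair written as one hexadecimal string.
print("0x" + "\r\n".encode("ascii").hex().upper())  # 0x0D0A
```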
All this dates back to the teletype machine, which was one of the first electric typewriters, and an early text messaging/email system. The PC keyboard is derived from the teletype keyboard, which is where things like the escape and control keys originated. Holding down the control key meant that the next character would control the remote teletype machine, resulting in the paper moving up a line, or the bell ringing.
Check out "The Victorian Internet" by Tom Standage for the real first online text messaging system.
* LSB should not be confused with LFB