Step detection (membrane tether analysis) in AFM

"afmtether" is a Matlab, GUI-controlled application for the analysis of AFM force curves for discrete steps corresponding to disruption of single membrane tethers.

The program was written by Peter Nagy (email: peter.v.nagy@gmail.com, https://peternagyweb.hu)

The program can be started by typing "afmtether" at the Matlab command prompt. In addition to manual identification of discrete steps, two automatic step-detection algorithms are implemented in the program: moving step fit and multi-step fit.

A comprehensive review about different step-detection algorithms has been published in Biophys. J.

The main panel of the program is shown below:

In this brief help you can find a step-by-step introduction to the major functionalities of the program.

You can select which parameters are plotted with the "X axis" and "Y axis" drop-down menus. The extension and retraction parts of the curves are plotted simultaneously, but only one of them is analyzed. The direction to be analyzed is selected in the "Direction" drop-down menu.

Set spring constant

The spring constant must be entered. The program saves the entered spring constant after pressing "Set". The saved spring constant will be displayed in the blue text field at the top of the main panel.

Read IBW file

The program can read Igor Binary Wave (IBW) files using a reader based on the submission of Jakub Bialek to Matlab Central. Click on the "Select an IBW file" button to choose the IBW file, then click on "Read". If you enter a variable name, the data will also be saved into a Matlab structure variable and exported to the Matlab base workspace. This structure variable can be read the next time by clicking on "Read AFM data". Reading this Matlab variable is faster than importing an IBW file, since the variable already contains the contact point, the baseline force and the spring constant. Specifying the variable name is optional, since the program stores the data internally.

The spring constant must be entered before reading an IBW file. After reading the IBW file the "X axis" and "Y axis" drop-down menus will be populated and the contact point will be estimated based on the extension (approach) curve. If the Global Optimization Toolbox is installed in Matlab, the genetic algorithm will be used for detecting the contact point as a steep change in the slope of the curve shown by the red circle in the image below. If the Global Optimization Toolbox is not installed, the moving step fit algorithm will be applied. The moving step fit algorithm, also used for detecting severing of membrane tethers, is described below. After identifying the contact point the baseline force, corresponding to the mean force in the horizontal line distant to the contact point, is also determined.
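As a rough illustration of the baseline force mentioned above, the following Matlab sketch converts deflection to force with Hooke's law and averages the force far from the contact point. The variable names, the synthetic curve, the hard-coded contact index and the choice of the first half of the pre-contact region as the "distant" part are all assumptions made for this example; they are not taken from the afmtether code.

```matlab
% Synthetic approach curve; every name and number here is hypothetical.
k = 0.02;                          % spring constant (N/m)
rawZ = linspace(0, 2e-6, 400)';    % piezo position (m)
contactIdx = 300;                  % assumed contact-point index
defl = 5e-9 * ones(400, 1);        % constant deflection offset before contact (m)

% After contact the cantilever bends linearly with the piezo travel
defl(contactIdx:end) = defl(contactIdx:end) + ...
    0.8 * (rawZ(contactIdx:end) - rawZ(contactIdx));

force = k * defl;                  % Hooke's law: force = spring constant * deflection (N)

% Baseline force: mean force in the flat region far from the contact point.
% Using the first half of the pre-contact region is an assumption.
farIdx = 1:round(contactIdx / 2);
baselineForce = mean(force(farIdx));
```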

Read AFM data

Data saved during reading an IBW file or exported by clicking on the button "Export and save" can be read from the Matlab base workspace. You have to specify the variable name and click on the "Read" button.

Read 2D array

2D arrays can be imported into the program from the Matlab base workspace. The imported data can be

- a table variable (if available): the tracks must be stored in columns of the table, and each column (a variable) must be named according to the requirements described below
- a 2D numeric array AND a cell array of strings (the descriptor) containing the variable names, which must follow the requirements described below. The tracks must be arranged columnwise in the 2D numeric array. The 2D numeric array must contain the same number of columns as the number of strings (variable names) in the descriptor.

There are two different ways of arranging the data:

- two columns: in this case the variables must be named "rawZ" and "defl", or "sep" and "force". After reading the data the program will split the data tracks into two parts corresponding to the extension and retraction phases. Splitting is performed by searching for the highest rawZ or sep value.
- four columns: in this case the variables must be named "rawZ_ext", "rawZ_ret", "defl_ext" and "defl_ret" or "sep_ext", "sep_ret", "force_ext" and "force_ret".
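The two-column layout can be prepared in the base workspace along the following lines. This is a minimal sketch with synthetic data; the variable names `afmTable`, `afmArray` and `afmDescriptor` are hypothetical, and the last lines only illustrate the idea of splitting at the highest rawZ value, not the program's own implementation.

```matlab
% Synthetic two-column data set; the numbers are made up for illustration.
n = 500;
rawZ = [linspace(0, 1e-6, n), linspace(1e-6, 0, n)]';   % extension, then retraction (m)
defl = 1e-9 * sin(linspace(0, 20, 2 * n))';              % placeholder deflection (m)

% Option 1: a table whose columns carry the required names
afmTable = table(rawZ, defl, 'VariableNames', {'rawZ', 'defl'});

% Option 2: a 2D numeric array plus a descriptor (cell array of strings)
afmArray = [rawZ, defl];             % tracks arranged columnwise
afmDescriptor = {'rawZ', 'defl'};    % one name per column

% One way to split a track at the highest rawZ value (an assumption about
% the details; the program performs its own splitting after import)
[~, iMax] = max(rawZ);
extIdx = 1:iMax;                     % extension part
retIdx = (iMax + 1):numel(rawZ);     % retraction part
```

Either `afmTable` alone, or `afmArray` together with `afmDescriptor`, would then be the variables to name when importing.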

Compensation

This part of the program can modify the contact point and make the baseline horizontal.

**Find contact point**

The genetic algorithm (if the Global Optimization Toolbox is installed in Matlab), the moving step fit algorithm and a manual approach are available. If the manual method is selected, the user has to move the red marker to the contact point in the graph below:

**Make baseline horizontal**

An automatic and a manual approach are available. The automatic approach makes the baseline horizontal by fitting a line to it. The baseline is defined as the section between the contact point and the beginning of extension in the deflection vs. rawZ plot or in the force vs. separation plot. Therefore, the contact point must be accurately defined for the automatic approach to work. In the manual approach the user must define the baseline by dragging the red and green markers in the graph below. The segment between the two markers will be made horizontal by fitting a line to it. The data point corresponding to the red marker will not be shifted, i.e. it is the reference point. If both variable pairs exist, the user is asked to select whether the baseline correction is performed in the deflection-rawZ or the force-separation plot.
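The idea behind the baseline correction can be sketched as follows: fit a line to the baseline segment, then subtract its slope so that the reference point stays where it was. The synthetic data and the names `iRed`, `iGreen` and `yCorr` are assumptions made for this example and do not come from the program.

```matlab
x = linspace(0, 1, 200)';                 % e.g. rawZ or separation
y = 0.5 * x + 2 + 0.01 * sin(37 * x);     % tilted baseline with a small ripple
iRed = 20;  iGreen = 180;                 % hypothetical marker positions (indices)

seg = iRed:iGreen;
p = polyfit(x(seg), y(seg), 1);           % p(1) = slope, p(2) = intercept

% Remove the fitted slope while leaving the red-marker point unshifted
yCorr = y - p(1) * (x - x(iRed));

% Refitting the corrected segment should give a slope close to zero
pCheck = polyfit(x(seg), yCorr(seg), 1);
```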

Smoothing

Three kinds of smoothing algorithm are implemented in the program:

- Median filter: Each data point is replaced by the median of its neighborhood. The neighborhood is determined by the filter size.
- Mean filter: Each data point is replaced by the mean of its neighborhood, whose size is determined by the number entered into the "Filter size" text box.
- Gaussian filter: Each data point is replaced by the weighted mean of its neighborhood. The weights follow a normal distribution with the central data point given the largest weight. More rigorously, the curve is convolved with a Gaussian kernel whose SD can be adjusted by the user.

The larger the filter size or filter SD, the more pronounced the smoothing effect is.

The unit of the numbers entered into the "Filter size" or "Filter SD" text boxes is "data point", i.e. if "2" is entered into the "Filter SD" box, the SD of the Gaussian filter will be two measured data points in the curve.

Smoothing will be carried out upon clicking the "Smooth" button and the smoothed curves will be added to the "Y axis" drop-down menu. The smoothed data will be stored in the structure variable which can be saved in the "Export and save" panel.
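The three filters can be approximated in plain Matlab as shown below. This is only a sketch: how the program turns "Filter size" into a neighborhood, how it treats the curve edges and how far it extends the Gaussian kernel are not documented here, so the window definition, the zero padding of `conv` and the kernel half-width of 4 SD are assumptions. `movmedian` and `movmean` require Matlab R2016a or later.

```matlab
x = (1:200)';
y = double(x > 100) + 0.1 * sin(2.3 * x);   % step with a small ripple
y(50) = 5;                                   % single outlier

filterSize = 5;    % neighborhood width in data points (assumed definition)
filterSD   = 2;    % SD of the Gaussian kernel in data points

yMedian = movmedian(y, filterSize);          % median filter
yMean   = movmean(y, filterSize);            % mean filter

% Gaussian filter: convolution with a normalized Gaussian kernel
halfWidth = ceil(4 * filterSD);
t = (-halfWidth:halfWidth)';
kernel = exp(-t.^2 / (2 * filterSD^2));
kernel = kernel / sum(kernel);               % weights sum to 1
yGauss = conv(y, kernel, 'same');            % zero-padded at the edges
```

In this example the median filter removes the outlier at point 50 almost completely, whereas the mean filter spreads it over the neighboring points.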

Manual analysis

By clicking on "Do it" the parameters selected in the "X axis" and "Y axis" drop-down menus will be plotted. Both the extension and retraction parts will be displayed, but only one of them, selected in the "Direction" drop-down menu, will be clickable and analyzable. This curve will be displayed in blue, whereas the other curve will be shown in black. The graph window is shown below:

**Interaction with the graph:**

- Double-clicking on the blue curve: The first double-click on the blue curve will set the beginning of a step which must be followed by a second double-click on the opposite end of the step. A step is displayed by a pair of red and green lines and the Y-coordinates of the beginning and end of a step, used for measuring step heights, are shown by circles.
- Left-clicking on a step will move the step while you drag the mouse.
- Right-clicking anywhere in the graph will bring up the pop-up menu shown below. You can
  - unzoom the display (if it has already been zoomed)
  - copy the figure to the clipboard
  - change the X offset, Y offset or both of them: the offsets will be shifted to the X or Y position of the right-click.

- Right-clicking on a step: you can delete the step on which you right-clicked or you can choose to delete all steps.
- Single left clicking and dragging: depending on your choice made by selecting one of the radio buttons in the lower left corner you can either zoom the graph to the selected area or pan the plot.
- Right clicking and dragging: you will be asked whether you want to delete all the steps in the selected range. The steps to be deleted will be shown in yellow.

Function of the buttons on the top of the graph window:

- View -1: The zoom and position of the graph window are changed to the previous values in the zoom history.
- Scroll left, scroll right: The graph window is moved to the left or to the right.
- Step left, step right: The graph window is moved to the nearest previously identified step to the left or right.

Moving step fit

The moving step fit algorithm finds a step where fitting two lines with a common slope to a window of the data set produces a significantly better result than fitting a single line. The principle of the procedure is summarized in the figure below:

The difference between the quality of fits is expressed by θ according to the following equation:

where *RSS* and *h* stand for the residual sum of squares and the step height, respectively.
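The comparison of the two fits in a moving window can be sketched in Matlab as follows. The example computes the residual sum of squares for a single line and for two lines with a common slope, and records their difference together with the fitted step height. It does not reproduce the program's definition of θ; the synthetic data and all variable names are assumptions made for this illustration.

```matlab
N = 200;  w = 20;                       % w: half window (the window spans 2*w points)
x = (1:N)';
y = 0.002 * x + double(x > 120) + 0.05 * sin(3 * x);   % slope + unit step + ripple

stepCol  = [zeros(w, 1); ones(w, 1)];   % 0 before the candidate step, 1 after it
deltaRSS = nan(N, 1);
height   = nan(N, 1);
for i = w:(N - w)
    idx = (i - w + 1):(i + w);          % window centered between points i and i+1
    A1 = [x(idx), ones(2 * w, 1)];      % single line: slope + intercept
    A2 = [A1, stepCol];                 % two lines with a common slope: + step height
    c1 = A1 \ y(idx);                   % least-squares fit of the single line
    c2 = A2 \ y(idx);                   % least-squares fit of the step model
    rssLine = sum((y(idx) - A1 * c1).^2);
    rssStep = sum((y(idx) - A2 * c2).^2);
    deltaRSS(i) = rssLine - rssStep;    % improvement gained by allowing a step
    height(i)   = c2(3);                % fitted step height h
end
[~, iBest] = max(deltaRSS);             % NaN entries are ignored by max
```

For this synthetic curve the largest improvement is expected at the position of the simulated step (between points 120 and 121), with a fitted height close to 1.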

The panel from which moving step fit can be initiated is shown below:

- θ (ΔRSS) thresholding: you can specify how large θ must be for a step to be considered a real step.
  - "Absolute": you can specify the absolute threshold in the "Absolute threshold" text box. At the end of the fitting the "Percentile threshold (%)" text box will show which percentile the chosen absolute threshold corresponds to.
  - "Percentile": you can specify the percentile above which a step is considered to be a real step, e.g. if 99 is entered into the "Percentile threshold (%)" text box, only the top 1% of the steps (regarding their θ value) will be included in the results. At the end of the fitting the "Absolute threshold" text box will show the θ value the chosen percentile corresponds to.
  - "None": no thresholding on θ is performed.

- Min step height thresholding: If turned on, only steps whose height is over the specified minimum will be included in the results.
- Max step height thresholding: If turned on, only steps whose height is below the specified maximum will be included in the results.
- Window size: moving step fit is carried out in a moving window whose size is 2x the window size specified.
- Merge distance: if the distance between two steps is not larger than the merge distance, they will be unified.
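The relationship between the percentile and absolute thresholds, and the effect of the merge distance, can be illustrated with a few made-up numbers. The nearest-rank percentile used here and the replacement of merged steps by their mean position are assumptions; the program may compute percentiles and unify steps differently.

```matlab
% Percentile threshold -> absolute threshold (nearest-rank method, assumed)
theta = [0.2 1.5 0.1 7.9 0.4 0.3 12.0 0.6 0.2 0.5];   % made-up theta values
pct = 80;                                               % percentile threshold (%)
sortedTheta  = sort(theta);
k            = ceil(pct * numel(theta) / 100);
absThreshold = sortedTheta(k);          % theta value at the chosen percentile
isRealStep   = theta > absThreshold;    % keep only steps above the threshold

% Merging: steps closer than or equal to the merge distance form one group
stepPos       = [10 12 40 41 43 90];    % made-up step positions (data points)
mergeDistance = 3;
groupId   = cumsum([1, diff(stepPos) > mergeDistance]);   % new group after each large gap
mergedPos = accumarray(groupId', stepPos', [], @mean)';    % one position per group
```

With these numbers the 80th percentile corresponds to an absolute threshold of 1.5, two steps pass the threshold, and the six step positions collapse into three groups.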

If you press the "Do it" button, you will be asked to select a range to be analyzed:

You can select a range by clicking and dragging the mouse followed by pressing 'OK', or you can select the whole range by pressing 'Select whole range'.

Fitting will be performed in the selected range, and the identified steps will be shown in a plot whose appearance is similar to the one shown during manual analysis. The results of the automatic moving step fit are saved in the "movingstep" field of the structure variable. Any modification of the steps while interacting with the graph will not affect data saved in this field, only the data saved in the "steps" field.

**Interaction with the graph:**

- Right-clicking on a step: you can delete the step on which you right-clicked or you can choose to delete all steps.
- Right-clicking anywhere in the graph will bring up the pop-up menu shown below. You can
  - unzoom the display (if it has already been zoomed)
  - copy the figure to the clipboard
  - change the X offset, Y offset or both of them: the offsets will be shifted to the X or Y position of the right-click.

- Single left clicking and dragging will either zoom the graph to the selected area or pan the plot depending on which radio button is selected in the lower left corner.

The functions of the buttons shown above the graph are the same as those described for manual analysis.

Multi-step fit

The multi-step fitting algorithm incorporated into this program was written by Jacob Kerssemakers (j.w.j.kerssemakers@tudelft.nl).

The principle of the fitting algorithm is explained briefly below.

The algorithm begins by fitting very few steps to the data set and then increases the number of fitted steps. If the number of fitted steps is smaller than the actual number of steps, the data are underfitted. Conversely, in the case of overfitting the number of fitted steps is higher than optimal. For each fit a quality factor is calculated according to the following equation:

$$\text{quality factor}(n) = \frac{\chi^{2}_{\text{counter fit},n}}{\chi^{2}_{\text{best fit},n}}$$

where χ^{2}_{counter fit,n} is the χ^{2} deviation between the data set and an intentionally wrong fit with n steps and χ^{2}_{best fit,n} is the χ^{2} deviation between the data set and the best fit with n steps. The quality factor will have a maximum at the optimal number of steps.

After clicking on the "Do it" button

- the user will have to select a range to be fitted as described for moving step fit
- the user will have to select the noise level:

  The graph shows the RMS noise as a function of the distance between measurement points. Pick the most likely estimate for the noise level. Only the Y-value will be stored. The chosen noise level will have negligible effect on the rest of the evaluation.
- The quality factor is calculated for different numbers of steps, and a graph showing the quality factor as a function of the number of fitted steps will be displayed:

  The optimal number of steps corresponds to the peak of the curve. Picking a value slightly higher than the peak is advisable so that each step is found. In this case there will be a couple of extra steps which do not exist in the data set, but these can be deleted afterwards.
- The found steps will be displayed in the same kind of graph as for moving step fit.