A skeleton of processing steps is given by numbered boldface statements. Explanations, options, and possible problems are described in regular face. Commands to be typed are in italic.
On site:
There are currently 5 Silicon Graphics computers you can access in the computer room adjacent to the faculty office suite on the 5th floor of the Allied Health building. These are the ones with the blue boxes. You can access the same programs and data directories from whichever of these computers is available. There is also one in the support area outside the faculty offices. The network that allows cross operability of these computers is referred to as the NIS network.
For information about how to do remote access from outside the structure center, ask Dr. Hardies.
It is no longer true that the NMR computers are integrated into the NIS system such that the same username and password automatically apply to all. It is possible to telnet from the NIS computers to the avance500 or the avance600. These are the computers running the avance500 and avance600 NMR spectrometers, respectively.
Drives attached to the avance500 and the avance600 are mounted in such a way that they appear as subdirectories in the root directories of those computers. The same drives are mounted in the NIS system such that they appear as subdirectories of the root directory of inert, but not necessarily by the same name. When I say that a drive is mounted as /x, that means its name in that system is x and it is referenced and listed as a subdirectory of the root.
Data from the Avance 600
The computer you log into to run the Avance600 spectrometer
is known as avance600. Two data drives used by the computer avance600
are known to it as /avance6001 (until recently was avance600)
and /u. These same two drives are also mounted in the NIS
system, but the second is known by the name /avance600_u rather
than /u, because there is a different drive mounted in the NIS system
already known as /u.
User data sets are deposited by XwinNMR in /avance6001/data/<username>/nmr/,
and can be retrieved from NIS by specifying the same path. Various
of the files used by XwinNMR are stored in directories on the /u
drive. For example the pulse programs are in /u/exp/stan/nmr/lists/pp/.
They can be accessed from the NIS system as /avance600_u/exp/stan/nmr/lists/pp/.
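The difference in drive names between the two systems can be summarized as a small lookup. The following is an illustrative Python sketch only, not part of the lab software; the function name to_nis_path is hypothetical, and the table covers only the two avance600 drives named above.

```python
# Hypothetical helper: translate a path as seen on the avance600 computer
# into the path for the same file as seen from the NIS system.
# Only the two drives described in the text are covered.
AVANCE600_TO_NIS = {
    "/avance6001": "/avance6001",  # same name on both systems
    "/u": "/avance600_u",          # renamed because NIS already has its own /u
}


def to_nis_path(path):
    """Return the NIS path for an avance600 path, or None if the drive is not listed."""
    for local_root, nis_root in AVANCE600_TO_NIS.items():
        if path == local_root or path.startswith(local_root + "/"):
            return nis_root + path[len(local_root):]
    return None  # the path is not on one of the two shared drives


print(to_nis_path("/u/exp/stan/nmr/lists/pp/"))
```

A path that is not on one of the two shared drives returns None, which is a reminder that such files have to be copied across by other means.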
Your user root directory on the avance600 computer is at /user/people/<username>/.
If you took notes with a text editor during your session, they will be in
that directory. That directory is not on a drive mounted in the NIS
system. If you want to retrieve such a file to the NIS system, you
will have to retrieve it by scp. Note: you should back up any records
of significance placed in the avance user root directories into your space
in the NIS system. Otherwise, these directories are particularly
vulnerable to being lost during system upgrades.
Data from the Avance 500
The avance 500 is configured analogously. The drives are /avance500_u, /avance5001, /avance5002.
Note: If you access the /avance drives from NIS, they are read only.
See the spectrometer status pages for more up-to-date information about system configuration changes (i.e., computer and disk names).
1. Log on to the NIS system:
At this time, usernames/passwords for NIS, avance500, and avance600 are not necessarily the same. You will have to make sure you have an account on the NIS system in addition to the spectrometer, and you will have to enter the NIS username and password to initiate the session described here. When you interrupt the screen saver with a mouse movement, a box will be displayed for your username and password. Upon entering these, you will get a desktop displaying the unix tool chest.
2. Select <desktop><open unix shell> from the unix tool chest to get a winterm window to work in.
This will start you in /u/people/<your username>, which
is your home directory in the NIS system.
You can return to this directory by cd or cd ~.
This directory will not be large enough for processing nmr data.
It is used to keep environment files and files from other miscellaneous
operations you may choose to do.
You will be assigned a larger quota of space on another
disk drive to do the actual data processing. For example my data
directory is /instinct2/hardies. Both your home directory and data
directory have size limits set by the administrator. To see how much
space is left in both of your directories, you have to know which computer
your drives are attached to. In my case they are both attached to
instinct.
Type telnet <computer hosting drive> and log on with the same
username and password. Then type quota -v.
[7/18/02 - quotas not currently in force; disks are filling
up and the quota command doesn't really tell you how much space is left.]
Directory pseudonyms.
3. If you have not already done so, create pseudonyms for your NIS data directory, and spectrometer data directory.
These instructions assume you have created pseudonyms for your NIS data directory, your avance data directories, and the avance pulse program directories. This is done by adding lines to the end of your .cshrc file (a hidden file in your home directory; type ls -a to see hidden files). For example, I have used vi to add the following lines to my .cshrc file:
setenv data /instinct2/hardies
setenv nmrdata500 /avance5002/data/hardies/nmr
setenv nmrdata600 /avance6001/data/hardies/nmr
setenv pp500 /avance500_u/exp/stan/nmr/lists/pp
setenv pp600 /avance600_u/exp/stan/nmr/lists/pp
After opening a shell with this .cshrc file in effect, these directories can now be abbreviated as $data, etc.
So wherever it says $nmrdata below, understand that you
must substitute a pseudonym specific for your directory on the particular
machine that collected the data (in my case either $nmrdata500 or $nmrdata600).
[See spectrometer info page for where the old amx500
data directories have been placed]
There is detailed documentation on the nmrPipe programs on line at: http://instinct.v24.uthscsa.edu/~hincklab
Select <software packages> and <nmrPipe>
Summaries of individual nmrPipe functions can be obtained by typing
nmrPipe -fn <function name> -help at the unix command line.
4. Copy the dataset to your NIS data directory:
cd $data
ls $nmrdata (to id the dataset directory name)
cp -r $nmrdata/<dataset name> . (copies entire
subdirectory structure of dataset; the "period" for a destination means
to copy to the default directory, which is $data in the sequence above)
ls (confirm that your dataset is present; this
is a subdirectory)
cd <dataset name>
ls (directories 1 2 3 etc. represent different
experiment numbers you assigned. These are subdirectories. You must
determine from your records which experiment number you intend to process.
If you created a title file within XWIN-NMR, it can be found in pdata/1/title
and examined with vi.)
cd <experiment number>
ls -l (letter l, not number 1; shows directory
including file sizes)
You should observe several files. The acqu* files are parameter files. ser is the actual data. The pulse program is saved by the name pulseprogram and pulseprogram.P. These are somewhat processed by XWIN-NMR before saving, so you may wish to compare to the original in the $pp directory when seeking some information about the pulse program. However, remember that you may have subsequently edited the version in the $pp directory in conjunction with a different experiment.
5. Write down the size of the ser file. It should be 2,867,200 bytes if the data were collected in the usual way by the hsqc_fb pulse program (1024 complex points in the direct dimension by 175 complex points in the indirect dimension).
6. Delete XWIN-NMR processed files (if you have processed with XWIN-NMR):
If you have processed the data with XwinNMR at the spectrometer,
there will be large data files embedded further in the subdirectories that
you should delete.
cd pdata
You will see one or more numbered directories.
Each one is a separate time that you processed the data (and gave a different
process number). For each one, type cd <process number>, ls, rm 2*, and
rm dsp*. This deletes the large processed files.
Leave other files; they may contain documentation that will be useful for
you later. Note: due to space limitations on the spectrometer drives,
you should also delete the same processed data files from that system when
they become obsolete. You must log on to avance500 or avance600 to do
that. You can do that from the NIS system by telnet avance500 (or
telnet avance600): log on, go to your data directory, and
delete the same files. Current practice is to leave the rest of the
dataset intact on the spectrometer drives as a backup (specifically including
the parameter files and the ser files).
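Before deleting anything, it can be reassuring to list exactly which files the rm 2* and rm dsp* patterns would match. The following Python sketch does only that (it deletes nothing). The function name processed_files is hypothetical, and the file names in the demonstration are examples only.

```python
import glob
import os
import tempfile


def processed_files(expt_dir):
    """List files matching '2*' and 'dsp*' in every pdata/<process number> directory."""
    matches = []
    pdata = os.path.join(expt_dir, "pdata")
    if not os.path.isdir(pdata):
        return matches
    for procno in sorted(os.listdir(pdata)):
        procdir = os.path.join(pdata, procno)
        if not os.path.isdir(procdir):
            continue
        for pattern in ("2*", "dsp*"):
            # glob.escape protects against special characters in the directory name
            found = glob.glob(os.path.join(glob.escape(procdir), pattern))
            matches.extend(sorted(found))
    return matches


# Demonstration on a throwaway directory (file names are examples only).
with tempfile.TemporaryDirectory() as expt_dir:
    procdir = os.path.join(expt_dir, "pdata", "1")
    os.makedirs(procdir)
    for name in ("2rr", "2ii", "dsp.hdr", "title", "proc"):
        open(os.path.join(procdir, name), "w").close()
    demo_names = [os.path.basename(p) for p in processed_files(expt_dir)]
    print(demo_names)
```

On real data, call processed_files with the experiment-numbered directory and review the list before running the rm commands by hand.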
7. Set your directory to the relevant experiment-numbered subdirectory with the ser file you wish to process.
eg. cd $data/hsqc_fb_l1.sch/1
8. Run bruker
bruker
You will see two new windows. One contains a table
of parameters for your review, and the other the conversion script that
bruker will build from them. The spectrometer reading box should say ./ser
Change the name of the converted output file if you like
(The default name, test.fid, is assumed in the instructions that follow).
The arrows on many of the boxes give drop down menus. You may select
an option or directly type something in the box.
The table will be updated with information extracted from the parameter files and the pulse program files saved in the directory with the ser file. The column labeled "x-axis" is also called the direct dimension or the proton dimension. The y-axis is also called the indirect dimension, or in this case the nitrogen dimension. The values highlighted in yellow are particularly likely to require correction; however, you should check them all.
Note: for a 3 dimensional experiment, the identities of the 2nd (y) and 3rd (z) columns are less intuitive, but must be correctly identified. The isotope whose delay times are changed in the inner loop of the pulse program is the y-axis. The one changed in the outer loop is the z-axis.
This is an example of a properly set up table for a 1H/15N HSQC experiment:
                              x-axis      y-axis
Total points                  2048        350
Total valid complex points    1024        175
Mode                          complex     complex
Spectral width                6024.096    2000.00
Observe frequency             500.134     50.684
Center position               4.757       118.100
Axis label                    HN          N
#!/bin/csh
bruk2pipe -in ./ser -bad 0.0 -noswap \
  -xN 2048 -yN 350 \
  -xT 1024 -yT 175 \
  -xMODE Complex -yMODE Complex \
  -xSW 6024.096 -ySW 2000.000 \
  -xOBS 500.134 -yOBS 50.684 \
  -xCAR 4.757 -yCAR 118.100 \
  -xLAB HN -yLAB N \
  -ndim 2 -aq2D States \
  -out ./test.fid -verb -ov
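The correspondence between the rows of the parameter table and the per-axis switches in the conversion script can be made explicit. The following Python sketch is only an illustration of that mapping; it is not how bruker itself builds the script. The names ROW_TO_FLAG and axis_flags are hypothetical, and only the per-axis switches that appear in the example are covered.

```python
# Each table row corresponds to a flag suffix; the axis letter (x or y) is the prefix.
ROW_TO_FLAG = [
    ("Total points", "N"),
    ("Total valid complex points", "T"),
    ("Mode", "MODE"),
    ("Spectral width", "SW"),
    ("Observe frequency", "OBS"),
    ("Center position", "CAR"),
    ("Axis label", "LAB"),
]


def axis_flags(table):
    """table maps row name -> {"x": value, "y": value}; returns a flat argument list."""
    args = []
    for row, suffix in ROW_TO_FLAG:
        for axis in ("x", "y"):
            args += ["-" + axis + suffix, str(table[row][axis])]
    return args


# Values copied from the example table; kept as strings to preserve formatting.
example = {
    "Total points": {"x": "2048", "y": "350"},
    "Total valid complex points": {"x": "1024", "y": "175"},
    "Mode": {"x": "Complex", "y": "Complex"},
    "Spectral width": {"x": "6024.096", "y": "2000.000"},
    "Observe frequency": {"x": "500.134", "y": "50.684"},
    "Center position": {"x": "4.757", "y": "118.100"},
    "Axis label": {"x": "HN", "y": "N"},
}

flags = axis_flags(example)
print(" ".join(flags))
```

Reading the printed list against the script is a quick way to see which table entry ends up behind which switch.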
The entries most likely to need changed are:
This table corresponds to your experiment as follows:
x-axis. The protons in the protein are excited by a pulse sequence that involves an interaction with adjacent nitrogens. The two detectors then take a series of paired readings at successive time points to define the RF signal that is emitted by those protons. This is the data that would define one complete 1-dimensional nitrogen-edited proton spectrum. The parameters describing one such series are listed in the x-axis column.
y-axis. There are a series of nitrogen-edited proton spectra recorded in the ser file, each taken with a different delay time in the pulse sequence that affects how the nitrogens influence the intensity of the bound protons. The y axis column has parameters related to retrieving the chemical shifts of the bound nitrogens by processing this series.
MODE: In the ser file, the signals from the two detectors are recorded as ordered pairs in the format of a complex number. Therefore the mode row should be set to "complex" in the x column. The series of proton fids is taken in pairs so that the nitrogen dimension will also be composed of complex numbers. So the mode in the y column should also be "complex".
Total points and Valid complex points: The number of complex numbers in each complete proton spectrum is listed as the number of valid complex points. Total points should be twice this number, since each complex number is stored as two points. The subroutine called by the pulse program to collect the data retrieves 1024 complex points per call. The proton scans are stacked up as pairs used to construct a complex number for each point in the indirect dimension. The number of such pairs is controlled by parameter L3=175. You can check on the value of L3 by looking at the messages given in the winterm window as bruker runs. You can also use a specialized version of grep to retrieve it from the parameter files: open a new shell, set the directory to this data directory, and type uxgrep l (lower case L). uxgrep <string> can be used to search the parameter files for other parameters.
[Note: when you change the acquisition time by altering parameter NS, you do not change these parameters. You just change the number of replicate readings that get averaged into a single recorded reading. NS doesn't appear in the pulse program. It is used by the subroutine called by the pulse program.]
You can check that the number of points inferred by bruker is correct by noting that total points in x * total points in y * 4 should equal the size of the ser file in bytes (which you wrote down earlier). In this case, 2048 * 350 * 4 = 2,867,200.
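The size check above is simple arithmetic and can be sketched in Python as follows. The function names are hypothetical, and the factor of 4 bytes per stored point is taken from the check just described.

```python
import os

BYTES_PER_POINT = 4  # per the check described above


def total_points(valid_complex_points):
    """Each complex point is stored as two points."""
    return 2 * valid_complex_points


def expected_ser_bytes(total_x, total_y):
    """Expected size of a 2D ser file, in bytes."""
    return total_x * total_y * BYTES_PER_POINT


def ser_size_matches(ser_path, total_x, total_y):
    """Compare the actual file size with the expected one."""
    return os.path.getsize(ser_path) == expected_ser_bytes(total_x, total_y)


print(expected_ser_bytes(total_points(1024), total_points(175)))  # 2867200
```

If ser_size_matches returns False for your file, recheck the Total points entries before converting.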
NB! The above parameters must be correctly set in order to process the data. The parameters below influence the accuracy with which the axes are labeled on the 2D plot that results.
Observe frequency. - the carrier frequency for each of the nuclei observed. The drop-down menu lists appropriate values for each of the following nuclei (H = 500.134, N=50.684, C = 125.764). More accurate values should be used for calculating center position values below. To get more accurate values look at the pulse program to see which channel was used as the carrier for each kind of nucleus and use uxgrep to see the corresponding SFOx parameter. For example, in hsqc_fb, nitrogen was stimulated on channel f3, and SFO3 reveals the carrier frequency to have been 50.683840. For 13C, there may be several frequencies used in the pulse program, specified by a parameter F2LIST. In this case, only one of the frequencies is the relevant observe frequency for 13C, and a comment in the pulse program should reveal this frequency.
Spectral width - the maximum frequency that can be correctly measured given the time intervals between the digital RF measurements. The pulse program specifies the sweep width for the proton axis as parameter SW_h. The sweep width for the indirect dimension is set up in the pulse program as 1/(2*IN0). For other pulse programs, the correct formulation for spectral width should similarly be revealed by a comment in the pulse program. This formula can be picked out of the drop-down menu. Note that signals outside the range appear at false positions within the range (said to be "folded"). These are identifiable by having an inconsistent phase with the other peaks in the final 2D plot.
Center position - The chemical shift of the point in the middle of the spectrum (N/2 + 1). As of this writing, it is 4.7396 for 1H and 118.05 for 15N at 300 K. The temperature controller currently reads 1.6 degrees higher than the true temperature. See temperature calibration and chemical referencing.
Axis label - is whatever you want as a label for this axis on the graph. "NH" for example, means amide hydrogens.
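Two of the relationships above lend themselves to a quick calculation: the indirect spectral width 1/(2*IN0), and the approximate ppm range implied by a spectral width, observe frequency, and center position. The following Python sketch is illustrative only. The function names are hypothetical, the IN0 value of 250 microseconds is an assumed example chosen to give 2000 Hz, and the ppm range treats the center position as the exact middle of the spectrum, which is a slight simplification.

```python
def sw_from_in0(in0_seconds):
    """Indirect-dimension spectral width in Hz, using SW = 1/(2*IN0)."""
    return 1.0 / (2.0 * in0_seconds)


def sw_in_ppm(sw_hz, obs_mhz):
    """Spectral width in ppm: Hz divided by the observe frequency in MHz."""
    return sw_hz / obs_mhz


def approx_ppm_range(sw_hz, obs_mhz, center_ppm):
    """Approximate (high, low) ppm limits, assuming the center is mid-spectrum."""
    half = sw_in_ppm(sw_hz, obs_mhz) / 2.0
    return center_ppm + half, center_ppm - half


print(round(sw_from_in0(250e-6), 3))                # assumed IN0 of 250 microseconds
print(approx_ppm_range(6024.096, 500.134, 4.757))   # proton axis from the example table
```

With the example values, the proton axis spans roughly 12 ppm around the center position.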
Alternatively, you may click on <save script> and <execute script> to execute the script from within the bruker program. Check that the output file (test.fid, or otherwise if you renamed it in bruker) has been created.
The processing is also done by a script executed by the program nmrPipe, and there is an interactive graphical program named nmrDraw to help you set up the processing script and to view your data.
An example of a script that I have used on 1H/15N HSQC data is as follows:
#!/bin/csh
#
# Basic 2D Phase-Sensitive Processing
# Cosine-Bells are used in both dimensions.
# Use of "ZF -auto" doubles size, then rounds to power of 2.
# Use of "FT -auto" chooses correct Transform mode.
# Imaginaries are deleted with "-di" in each dimension.
# Phase corrections should be inserted by hand.
nmrPipe -in test.fid \
| nmrPipe -fn SOL \
| nmrPipe -fn SP -off 0.30 -end 1.00 -pow 1 -c 0.5 \
| nmrPipe -fn ZF -auto \
| nmrPipe -fn FT -auto \
| nmrPipe -fn PS -p0 167.0 -p1 0.00 -di -verb \
| nmrPipe -fn EXT -left -sw \
| nmrPipe -fn POLY -ord 1 -nl 20 40 60 80 100 120 140 180 750 800 850 900 950 \
| nmrPipe -fn TP \
| nmrPipe -fn SP -off 0.35 -end 1.00 -pow 1 -c 1.0 \
| nmrPipe -fn ZF -auto \
| nmrPipe -fn FT -auto \
| nmrPipe -fn PS -p0 -90.00 -p1 180.00 -di \
   -ov -out test.ft2
The general form of this script is that the first line reads in the raw data file (-in test.fid), each subsequent line performs a processing function (like -fn SOL) with a variety of parameters set by switches (like -auto), and the last line includes a specification to write an output file (-out test.ft2). You will have to customize several parts of the script for your data.
The \ at the end of each line is a continuation mark.
Do not put any characters (including blanks) past the \.
The | at the beginning of each line is a unix "pipe"
operator. The output of one function is passed as input to the next
without writing a file. You must put a -out <filename>
specification to get any output. You may also put this specification
elsewhere in the option list to save intermediate results. The -ov
specification means to overwrite preexisting files of the same name.
If you put a -out specification at an intermediate position, you
should comment out the remainder of the script (with a leading #), or follow
the statement with the -out with a nmrPipe -in <intermediate
filename> statement to restart the pipe.
Meanings of the functions in the example script are listed below, as well as whether they usually need customizing.
The script will be kept in the same directory with
the input data by the name nmrproc.com.
You could copy a script like the one above to this directory
and name it nmrproc.com if you wanted to use it as a template. Otherwise,
you can set up the script from a template provided by nmrDraw.
One explores the steps necessary to customize the script by running the graphical program nmrDraw. The program nmrDraw allows you to execute a whole script on the full data, or only certain steps in the script on selected fids. There are two ways to execute the functions from within nmrDraw. 1) You can edit a script to comment out steps you don't want to do (by adding a # to the beginning of the line), and add a -ov -out <filename> to steps for which you want to recall and review partially processed data. Or 2) you can directly load the unprocessed file, a partially processed file, or the fully processed file, and then pick a particular fid or frequency domain slice and transiently apply functions to it. For first-pass adjustments, one partially processes to the step prior to which customization is needed, and then uses transient applications of the next function to settle on the desired parameters by trial and error. One then edits these parameters into the script and uncomments down to the next stopping point (remembering to move the -ov -out <filename> specification to the new point). Once one gets to a fully processed spectrum, additional adjustments may be made by editing the script and fully processing the data to see the end result of the modification.
The program is very general, and different users will develop their own strategies for working through the data. However, a step-by-step example is given below to help new users get started.
9. Run nmrDraw.
With the directory set to contain your raw data as test.fid, type nmrDraw
10. Load a template script by <file><Macro edit><process 2D><Basic 2D>.
nmrDraw menus are expanded by right mouse clicks, but functions in the menus are activated by left mouse clicks.
An editor window will come up with a basic script with most of the processing steps you can expect to perform already filled in. If you want to use a script that you've used before, copy it to the directory and rename it nmrproc.com prior to running nmrDraw. It will then appear in the macro edit window after <file><Macro edit>.
11. Explore the solvent distortion correction.
1. Load your raw data by <file><select file> and pick test.fid out of the menu. Click <done>.
2. Press "d" on the keyboard and then "h".
3. Left click near the bottom of the drawing area and drag the mouse to the bottom of the drawing area. The box in the upper left display area should say y=1.
"d" (or <draw><contour>) shows a view from the top of the rows of fid oscillations. "h" (or <mouse><1 D horizontal>) shows one of the fids; the row on the bottom (y=1) is the first serial fid with no time evolution from nitrogen. Until the 2nd dimension is Fourier transformed, you should always look at this slice when judging the operation of a function to avoid being confused by the effects of nitrogen evolution.
On a slow connection, skip the 2D display by loading with <read> <done>. Then set the y coordinate box to 1 followed by a <carriage return> and choose the horizontal 1D display with "h".
Available mouse operations are indicated on the upper window bar. When the mouse is inside the drawing area, the left, middle, right buttons select an fid, horizontal pan, and horizontal zoom, respectively. When the mouse is over the purple borders the buttons set the phase pivot, vertical scale, and vertical offset, respectively, of the chosen fid.
Examine this fid (y=1). It should be a complex of interfering high frequency sine waves which decay in intensity as they proceed from left to right. The axis that the waves oscillate around should be a straight line. Solvent distortion will cause the axis to slowly undulate itself as it progresses from left to right. This is an artifact due to imprecision in water suppression, and the first issue requiring a processing step. If you do not remove this effect, the right side of your 2D plot will be overcome with a residual water signal centered on the water (carrier) frequency.
12. Edit the script to process through the first Fourier transform.
The recommended correction function is named SOL. It takes an average of 30 points around each point to estimate the offset of the undulating axis from zero. It then subtracts that offset from each point. To see if SOL improves the fid try it out as follows.
4. Right click <proc> <function> <SOL solvent correction>. SOL should appear as the proposed function to transiently execute. Then left click <execute>. Make the pop-up window go away with <done>.
Your fid should become straightened out. If not, then 1) make a resolution to set up your solvent suppression better the next time you do an acquisition. 2) You can remove the effect of the last processing by pressing "h" on the keyboard, and then repeat step 4 above using some of the options available for SOL or the alternative function POLY, solvent correction. Information about how to use these options can be found by consulting the nmrPipe web page cited at the top of this document. A listing of options can be found for any nmrPipe function by typing (in a separate shell window) nmrPipe -fn <function name> -help. If the solvent can not be completely suppressed, then process through to the Fourier transformed data and see if the noise is confined to a region away from your peaks. If so, you may use something like -fn EXT -x1 10.5PPM -xn 6.0PPM -sw to cut the water region out of the dataset.
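The idea behind the SOL correction, as described above, is to estimate the slowly undulating axis by a local average and subtract it. The following Python sketch illustrates that idea only; it is not nmrPipe's implementation, whose filter shape and edge handling may differ. The function name and the handling of the ends of the fid are assumptions.

```python
def subtract_local_average(fid, window=30):
    """Subtract from each point the mean of its neighbours.

    Uses window // 2 points on each side of the point (fewer near the ends).
    Works for real or complex values.
    """
    half = window // 2
    n = len(fid)
    corrected = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        local_mean = sum(fid[lo:hi]) / (hi - lo)
        corrected.append(fid[i] - local_mean)
    return corrected


# A fast oscillation riding on a constant offset of 5: the offset is removed,
# while the fast oscillation is largely preserved.
raw = [5.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(200)]
flat = subtract_local_average(raw)
print(raw[100], round(flat[100], 3))
```

A slowly varying component is nearly constant over the averaging window, so it is removed, while the fast oscillations average out to nearly zero and survive the subtraction.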
You could go on to explore how the SP, ZF and FT functions affect the look of the 1st direct slice by transiently applying them in turn. However, these don't need adjustment at this time.
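For orientation on the SP switches that appear in the example script (-off, -end, -pow, -c), the following Python sketch computes a generic sine-bell window. It is a sketch under stated assumptions, not nmrPipe's code: it assumes that the offset and end values are in units of pi, that the last point of the fid corresponds to the end value, and that -c scales only the first point. Consult nmrPipe -fn SP -help for the authoritative definitions.

```python
import math


def sine_bell(n, off=0.5, end=1.0, power=1, c=1.0):
    """Generic sine-bell window of n >= 1 points (assumed form, see text above)."""
    window = []
    for i in range(n):
        frac = i / (n - 1) if n > 1 else 0.0
        angle = math.pi * (off + (end - off) * frac)
        window.append(math.sin(angle) ** power)
    window[0] *= c  # first-point scaling
    return window


w = sine_bell(5, off=0.5, end=1.0, power=1, c=0.5)
print([round(v, 3) for v in w])
```

With off = 0.5 and end = 1.0 the window is a cosine-like taper that starts at its maximum and falls to zero at the end of the fid; lowering off moves the maximum later in the fid.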
Your script should look something like this:
nmrPipe -in test.fid \
| nmrPipe -fn SOL \
| nmrPipe -fn SP -off 0.50 -end 1.00 -pow 1 -c 0.5 \
| nmrPipe -fn ZF -auto \
| nmrPipe -fn FT -auto -ov -out test.ft1 \
#| nmrPipe .... rest of lines commented out
Note that you will have added the line with -fn SOL (or whatever alternative you settled upon).
You also should add the -c 0.5 switch to the SP function. This prevents a mathematical artifact in the Fourier transform that will shift your baseline for the transformed spectra up from zero. You will probably later modify the -off parameter and maybe the -pow parameter, but leave them alone for now.
Also, don't forget to add the -ov -out <filename> specification after the 1st Fourier transform.
13. <save> and <execute> the script.
<save> puts the edited script in nmrproc.com. <execute> executes the script in nmrproc.com.
If you do not <save> before <execute> you will inadvertently execute the previous version of the script (if there was one).
14. Load the partially processed file by <file><select><test.ft1><done>. Use "d" and "h" and pick the 1st slice (y=1) as before. "c" followed by "h" will remove the contour display.
15. Phase the spectrum.
Phases are modulo 360. -200 and +160 are the same thing.
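The modulo-360 equivalence can be checked with a couple of lines of Python. The function names are hypothetical.

```python
def normalize_phase(degrees):
    """Map a phase in degrees onto the equivalent value in the range (-180, 180]."""
    wrapped = degrees % 360.0
    if wrapped > 180.0:
        wrapped -= 360.0
    return wrapped


def same_phase(a, b):
    """True if two phase values differ by a whole number of turns."""
    return (a - b) % 360.0 == 0.0


print(normalize_phase(-200.0))    # 160.0
print(same_phase(-200.0, 160.0))  # True
```

This is convenient when comparing phase values written in scripts by different people, who may have arrived at the same correction from opposite directions.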
It will be possible to adjust phase more precisely after baseline correction is made and you separate the peaks more cleanly in the 2D display.
1. Edit the script to uncomment the PS step and the EXT step. Move the output statement to after the EXT step.
Notice that we will process from the beginning to
the new stopping point. Instead we could input the first intermediate
file and just do the new processing steps. But in the way illustrated
above, the script always corresponds to exactly what processing was done
to make the latest intermediate file.
The -verb switch can be put on any function.
It displays a popup window showing how the process is going during execution.
If the process fails, unfortunately, the pop up tends to disappear before
you can read the messages. If a process fails, minimize nmrDraw
to an icon. There will be error messages in the winterm window.
POLY does a polynomial fit of points judged to be on the
baseline, and then subtracts this function from the spectrum to try to
flatten and zero the baseline. The -auto switch allows the
program to automatically choose the points that it considers to be on the
baseline for the fitting. -ord specifies the order of the
polynomial, i.e. -ord 1 fits a straight line. The default is
-ord 4. Because there is a big region in the middle of the spectrum
that does not come down to baseline, lower order polynomials may be a better
choice.
The script should look something like this:
You have to redraw the screen ("d") to visualize each + or - adjustment. A more quantitative approach is to set the first contour level box relative to the level of noise. <peak><estimate noise> gives a pop-up window with an estimate of rms noise. By default, the first contour is set to 6 times the noise level. This tends to leave out smaller peaks. 4 times the noise is good for looking for faint peaks. 3 times the noise is good for displaying patterns in the noise, including phase errors.
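The value to type into the first contour level box follows directly from the noise estimate and the multipliers just given. A minimal Python sketch, with hypothetical names and an assumed example noise value:

```python
# Multipliers of the rms noise, as suggested in the text above.
NOISE_MULTIPLIERS = {
    "default": 6.0,
    "faint peaks": 4.0,
    "noise patterns": 3.0,
}


def first_contour_level(rms_noise, purpose="default"):
    """First contour level as a multiple of the estimated rms noise."""
    return NOISE_MULTIPLIERS[purpose] * rms_noise


rms = 1000.0  # assumed example value; use the number from <peak><estimate noise>
for purpose in NOISE_MULTIPLIERS:
    print(purpose, first_contour_level(rms, purpose))
```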
On a slow connect, read the 2D file by <read><done>, and use <peak> <estimate noise> to set the first contour box as desired before issuing the draw command. This will avoid waiting for an unnecessary display to be drawn.
First order phase correction (P1) means that a different phase correction is applied to each peak as a linear function of its frequency. Ideally there would be no need for a first order phase correction. It appears because a delay parameter "d7" in the pulse program hasn't been fine tuned yet. Right now I get about -45 degrees. It may be considerably reduced in future releases of the hsqc pulse program. The first order phase correction may interact with the -c switch in the SP window function. Ideally it would be small and -c 0.5 would then be correct for the first dimension SP function. High first order phase correction (as in the 2nd dimension) should use -c 1.0 (which is the default). The slightly high P1 correction in the first dimension may cause a baseline shift, which may be complicating the baseline correction. For example, it may be why POLY -auto tends to fail.
19. Evaluate the signal to noise and the resolution.
The truncation problem mainly affects the 15N
dimension. Peaks differ in their vulnerability to this problem based
on their individual relaxation times. Peaks that have long relaxation
times also tend to be intense, thus increasing the visibility of the artifact.
You may choose to tolerate this artifact on certain well separated peaks,
in exchange for better resolution in other areas of the plot.
In the 2D plot, look at the baselines of several of the transformed slices. If there is a consistent curvature to them, then you may return to the POLY baseline correction and try harder to remove that trend from the data. It may be helpful to return SP -off to 0.5 to suppress noise while reworking the baseline.
21. Plot the final processed data.
To create a plot, select <print hard copy> from the <file> menu. In the window that appears, supply an appropriate name for a postscript file to contain the plot. To get hard copy at this time, replace the word "echo" with lp -dlaser1 (that's "el" pee - dlaser "one"). This directs the plot to the printer in the back of the structure center computer lab. [laser2 is in the 700 MHz room].
As currently configured, neither the title specified in the <print hard copy> window, nor labels assigned to peaks through <peak detection> will display on the printed copy.
When creating a .PS file from a remote connection, blank out the field that contains the word "echo".
There is a program named showps that can be used to print postscript files at a later time.
22. Record your final noise and signal to noise ratio.