Processing 3D HNCACB data.
As I thus far understand it:
-
The data is processed through nmrPipe.
-
Planes from the data (carbon by hydrogen) will be examined in nmrDraw
to see whether spots from the HSQC are represented by columns with
2 positive and 2 negative signals in the carbon dimension. I guess
particular attention should be paid to whether the HSQC signals with
known marginal T2 times are making it through. If not, we need to fall
back to separate HNCA and HN(CA)CB determinations. If it looks OK, the
CBCACONH should be started. If CBCACONH is too insensitive, it is
possible to split it into an HN(CO)CA and an HN(COCA)CB.
-
Then a program, plotpseq, is used to convert the 3D data to strips (essentially
a 2D array of columns centered on HSQC signals). The input appears
to be derived from a table of picked peak coordinates from the HSQC.
There is a web page with some instructions linked from Andy's unix software
page.
-
The strips can be viewed with nmrDraw or PIPP.
-
Each 4-spot series gives C alpha and C beta values for this residue and the
preceding one. Strips likely to be adjacent to each other in the
sequence are therefore identified by matching these values. There is
some reciprocal intensity relationship that is also used.
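-
The linking step above might be sketched in Python roughly as follows.
This is only an illustration: the Strip record, the function names, the
0.2 ppm tolerance and the shift values are all assumptions made up for
the example, not anything that plotpseq or nmrDraw produces. It assumes
the 4 peaks in each strip have already been sorted into "own" and
"preceding residue" C alpha and C beta values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Strip:
    name: str                  # label of the HSQC peak the strip is centered on
    ca_own: float              # C alpha of this residue (ppm)
    cb_own: Optional[float]    # C beta of this residue; None if absent (glycine)
    ca_prev: float             # C alpha of the preceding residue (ppm)
    cb_prev: Optional[float]   # C beta of the preceding residue; None if absent


def close(a: Optional[float], b: Optional[float], tol: float) -> bool:
    """True if both shifts are absent, or both present and within tol ppm."""
    if a is None or b is None:
        return a is None and b is None
    return abs(a - b) <= tol


def predecessors(strip: Strip, strips: List[Strip], tol: float = 0.2) -> List[Strip]:
    """Strips whose own CA/CB match the preceding-residue CA/CB of `strip`."""
    return [
        s for s in strips
        if s is not strip
        and close(s.ca_own, strip.ca_prev, tol)
        and close(s.cb_own, strip.cb_prev, tol)
    ]


# Made-up shifts, for illustration only.
strips = [
    Strip("pk1", 58.4, 63.9, 53.0, 19.1),
    Strip("pk2", 53.1, 19.0, 45.2, None),
    Strip("pk3", 45.3, None, 62.0, 69.8),
]
for s in strips:
    print(s.name, "<-", [p.name for p in predecessors(s, strips)])
```

A strip with more than one candidate predecessor stays ambiguous at this
stage; the tolerance would need tuning against the real digital resolution
in the carbon dimension.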
-
Likely identity of residues is inferred from a table of the distribution
of C alpha and C beta resonance frequencies of the various residues.
The table of standard values is obtained from BioMagResBank. It's
retrieved by clicking the <retrieve> button on the upper left of the
page and looking for the relevant tabulation in that menu. Glycine
of course will have no C beta signals.
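-
The table lookup above could be sketched in Python along these lines.
The four entries are rough placeholder averages typed in for
illustration and should be replaced by the actual BioMagResBank
tabulation; the function name and the single tolerance applied to both
nuclei are assumptions as well (the real table gives a separate spread
for each nucleus and residue).

```python
from typing import Dict, List, Optional, Tuple

# Rough placeholder averages in ppm as (C alpha, C beta); replace with the
# real table. C beta is None for glycine, which has no C beta signal.
REFERENCE: Dict[str, Tuple[float, Optional[float]]] = {
    "A": (53.1, 19.0),
    "G": (45.3, None),
    "S": (58.7, 63.8),
    "T": (62.2, 69.7),
}


def candidate_residues(ca: float, cb: Optional[float], tol: float = 2.0) -> List[str]:
    """One-letter codes whose reference CA/CB lie within tol ppm of the observed pair."""
    hits = []
    for code, (ref_ca, ref_cb) in REFERENCE.items():
        if abs(ca - ref_ca) > tol:
            continue
        # A missing C beta only matches a residue type that has none.
        if (cb is None) != (ref_cb is None):
            continue
        if cb is not None and abs(cb - ref_cb) > tol:
            continue
        hits.append(code)
    return hits


print(candidate_residues(53.0, 19.1))        # one alanine-like pair
print(candidate_residues(45.2, None))        # no C beta observed
print(candidate_residues(60.5, 66.5, 3.5))   # looser tolerance, more candidates
```

The output for each strip is a set of possible residue types rather than
a single answer, which is what makes the subsequences ambiguous.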
-
Then the trick is to match the ambiguous subsequences to the real sequence.
Linkage won't be possible through proline, right?
-
Andy says that there are a variety of different strategies for organizing
the task, including using some programs that may or may not be trustworthy.
I'm thinking of letting Perl's regular expression matching do it.
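-
A minimal sketch of the regular expression idea, written here with
Python's re module rather than Perl (the pattern syntax used is the same
in both). Each position in a linked stretch of strips is given as a
string of allowed one-letter codes, which becomes one character class.
The function names and the test sequence are made up for the example.

```python
import re
from typing import List, Tuple


def build_pattern(candidates: List[str]) -> str:
    """Turn per-position candidate codes, e.g. ["G", "ST", "A"], into "[G][ST][A]"."""
    return "".join("[" + codes + "]" for codes in candidates)


def find_placements(candidates: List[str], sequence: str) -> List[Tuple[int, str]]:
    """All (1-based start position, matched segment) pairs, overlaps included."""
    # The lookahead lets the search restart at every position, so
    # overlapping placements are not skipped.
    pattern = re.compile("(?=(" + build_pattern(candidates) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical sequence and candidate sets, for illustration only.
sequence = "MGSTAPGSAGTA"
for start, segment in find_placements(["G", "ST", "A"], sequence):
    print(start, segment)
```

A stretch that matches at exactly one place is assigned; one that matches
at several places needs to be extended by another strip or two before it
becomes unique.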
-
I'll have a substantial segment that's disordered. Its signals will be
obvious because they are so bright. Andy suggests using the same table of
characteristic resonances as for the structured regions. I also notice
a table at BioMagResBank with values from pentapeptides.
-
I should also have a minority signal from an unfolded portion of the folded
part of the protein. These signals may be identifiable due to being
narrower. Actually they probably won't be a minority signal after
the T2 effect beats down the signal from the folded portion of the protein.
In principle, I ought to be able to assign a random coil all the way through
the sequence, and then tell which of the residues are also in the folded
portion by the much diminished intensity of the random coil signal in the
HSQC at those positions. The center part of the proton spectrum where
all this happens might be pretty crowded though.