Overview
CONNJUR-ST uses a combination of vendor documentation and file exploration to determine the contents of files. Data about data is called metadata. ST reads both NMR data and a subset of metadata for translation from one format to another. This document outlines the assumptions made in developing CONNJUR-ST; that is, where the metadata comes from.
The design philosophy of the spectrum translator is that if a translation cannot be performed with a high degree of confidence, it should not be performed. In practice, if ST encounters what it considers illogical or inconsistent metadata, or the size of the NMR data is not what ST expects, it outputs an error message and stops the conversion. If more NMR data is present than expected, a configuration switch allows the user to direct ST to ignore the excess.
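The stop-on-doubt policy above can be sketched as a simple size check. This is a minimal illustration, not ST's actual code: `TranslationError`, `check_data_size`, and the `ignore_excess` flag are hypothetical names standing in for the error path and the configuration switch described above.

```python
class TranslationError(Exception):
    """Raised when a translation cannot be performed with confidence."""


def check_data_size(expected_bytes, actual_bytes, ignore_excess=False):
    """Return the number of bytes to translate, or raise TranslationError."""
    if actual_bytes < expected_bytes:
        # Too little data is never recoverable.
        raise TranslationError(
            f"expected {expected_bytes} bytes of NMR data, found {actual_bytes}"
        )
    if actual_bytes > expected_bytes and not ignore_excess:
        # Excess data is an error unless the user opted to ignore it.
        raise TranslationError(
            f"{actual_bytes - expected_bytes} unexpected extra bytes of NMR data"
        )
    return expected_bytes
```

With `ignore_excess=True` the function returns the expected size and the surplus is simply not read; in every other mismatch it refuses to continue.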
NMRPipe
NMRPipe files contain a header of float-encoded metadata followed by a block of NMR spectrum data. The files fdatap.html and fdatap.h in the NMRPipe distribution document the header. The table below documents how Spectrum Translator uses the header. Usage codes are as follows:
- L indicates the metadata is used to determine the layout of NMR data.
- M indicates metadata read and written by ST as part of the translation that does not affect the NMR data layout.
- MW indicates metadata written by ST as part of the translation that does not affect the NMR data layout.
ST name | Offset | NMRPipe name | Dimension | Description | Usage |
---|---|---|---|---|---|
ENDIAN_CONST_INDEX | 2 | na | | Constant value used to determine endianness | L |
DIMENSIONS | 9 | FDDIMCOUNT | | Number of dimensions present | L |
DIMORDER1 | 24 | FDDIMORDER1 | First | x/y/z/a dimension designation | M |
DIMORDER2 | 25 | FDDIMORDER2 | Second | x/y/z/a dimension designation | M |
DIMORDER3 | 26 | FDDIMORDER3 | Third | x/y/z/a dimension designation | M |
DIMORDER4 | 27 | FDDIMORDER4 | Fourth | x/y/z/a dimension designation | M |
D1SIZE | 99 | FDSIZE | First | dimension size | L |
D2SIZE | 219 | FDSPECNUM | Second | dimension size | L |
D3SIZE | 15 | FDF3SIZE | Third | dimension size | L |
D4SIZE | 32 | FDF4SIZE | Fourth | dimension size | L |
D2TYPE | 56 | FDF2QUADFLAG | First | Complex or real data | L |
D1TYPE | 55 | FDF1QUADFLAG | Second | Complex or real data | L |
D3TYPE | 51 | FDF3QUADFLAG | Third | Complex or real data | L |
D4TYPE | 54 | FDF4QUADFLAG | Fourth | Complex or real data | L |
D1SWEEPWIDTH | 229 | FDF1SW | First | sweep width | M |
D2SWEEPWIDTH | 100 | FDF2SW | Second | sweep width | M |
D3SWEEPWIDTH | 11 | FDF3SW | Third | sweep width | M |
D4SWEEPWIDTH | 29 | FDF4SW | Fourth | sweep width | M |
FDF1AQSIGN | 475 | FDF1AQSIGN | First | combined alternating / negate imaginaries flag | M |
FDF2AQSIGN | 64 | FDF2AQSIGN | Second | combined alternating / negate imaginaries flag | M |
FDF3AQSIGN | 476 | FDF3AQSIGN | Third | combined alternating / negate imaginaries flag | M |
FDF4AQSIGN | 477 | FDF4AQSIGN | Fourth | combined alternating / negate imaginaries flag | M |
FDF1FTFLAG | 222 | FDF1FTFLAG | First | Time or Frequency domain info [1] | M |
FDF2FTFLAG | 220 | FDF2FTFLAG | Second | Time or Frequency domain info | M |
FDF3FTFLAG | 13 | FDF3FTFLAG | Third | Time or Frequency domain info | M |
FDF4FTFLAG | 31 | FDF4FTFLAG | Fourth | Time or Frequency domain info | M |
FDF2OBS | 119 | FDF2OBS | First | "Observed MHz, or spectral frequency" | M |
FDF1OBS | 218 | FDF1OBS | Second | "Observed MHz, or spectral frequency" | M |
FDF3OBS | 10 | FDF3OBS | Third | "Observed MHz, or spectral frequency" | M |
FDF4OBS | 28 | FDF4OBS | Fourth | "Observed MHz, or spectral frequency" | M |
FDF1CAR | 67 | FDF1CAR | First | Carrier ppm | M |
FDF2CAR | 66 | FDF2CAR | Second | Carrier ppm | M |
FDF3CAR | 68 | FDF3CAR | Third | Carrier ppm | M |
FDF4CAR | 69 | FDF4CAR | Fourth | Carrier ppm | M |
FDF1P0 | 245 | FDF1P0 | First | Zero order phase correction | M |
FDF2P0 | 109 | FDF2P0 | Second | Zero order phase correction | M |
FDF3P0 | 60 | FDF3P0 | Third | Zero order phase correction | M |
FDF4P0 | 62 | FDF4P0 | Fourth | Zero order phase correction | M |
FDF1P1 | 246 | FDF1P1 | First | First order phase correction | M |
FDF2P1 | 110 | FDF2P1 | Second | First order phase correction | M |
FDF3P1 | 61 | FDF3P1 | Third | First order phase correction | M |
FDF4P1 | 63 | FDF4P1 | Fourth | First order phase correction | M |
FDCOMMENT | 312 | FDCOMMENT | | Comment field, 160 characters | M |
FDF1LABEL | 18 | FDF1LABEL | First | Nucleus label | M |
FDF2LABEL | 16 | FDF2LABEL | Second | Nucleus label | M |
FDF3LABEL | 20 | FDF3LABEL | Third | Nucleus label | M |
FDF4LABEL | 22 | FDF4LABEL | Fourth | Nucleus label | M |
TRANSPOSED | 221 | FDTRANSPOSED | | Dimensions are transposed flag | M |
FD2DPHASE | 256 | FD2DPHASE | Second | States/TPPI designation | MW |
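How the layout entries might be read can be sketched as follows. The offsets `ENDIAN_CONST_INDEX` and `DIMENSIONS` come from the table above; the header length of 512 floats and the constant value 2.345 are assumptions based on the NMRPipe header documentation and should be checked against fdatap.h. The function names are hypothetical and this is not ST's implementation.

```python
import struct

HEADER_FLOATS = 512        # assumed header length, in 4-byte floats
ENDIAN_CONST_INDEX = 2     # offset from the table above
ENDIAN_CONST = 2.345       # assumed value of the byte-order constant
DIMENSIONS = 9             # offset from the table above


def read_header(raw):
    """Decode the header, trying both byte orders; return (order, values)."""
    size = HEADER_FLOATS * 4
    if len(raw) < size:
        raise ValueError("file too short to hold an NMRPipe header")
    for order in ("<", ">"):
        values = struct.unpack(f"{order}{HEADER_FLOATS}f", raw[:size])
        # The constant only decodes correctly under the right byte order.
        if abs(values[ENDIAN_CONST_INDEX] - ENDIAN_CONST) < 1e-4:
            return order, values
    raise ValueError("byte-order constant not found; refusing to translate")


def dimension_count(values):
    """Number of dimensions, stored as a float in the header."""
    return int(values[DIMENSIONS])
```

Because every header entry is a float, integer quantities such as the dimension count are converted after decoding.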
Rowland Toolkit
The Rowland Toolkit format is documented in the online manual. It consists of an ASCII parameter ("par") file and a separate binary NMR data file.
The following lines are used to determine the file layout. The number of dimensions is implicitly determined based on the number of columns present.
- Dom indicates the domain (time/frequency) of the data [1].
- Format indicates endianness and datatype.
- N indicates the number of points in each dimension and whether the data is real/complex.
- Layout indicates ordering of the data in the binary file and provides a secondary description of the number of points in a dimension.
The following lines are read and/or written for metadata support.
- Cphase indicates zero order phase correction term.
- Lphase indicates the first order phase correction term.
- Sf indicates spectral frequency.
- Ppm indicates ppm of the carrier frequency.
- Nacq indicates the number of points. ST only writes this, as it is redundant with N.
- Quad indicates quadrature; that is, States, TPPI, etc.
- Comment is the data set comment.
Varian
Varian information is stored in a binary file, fid, and an ASCII file, procpar. Metadata exists in both files. Documentation of the binary fid file is found in VNMR User Programming VNMR 6.1C Software [2], and documentation of the ASCII procpar file is found in VNMR Command and Parameter Reference Varian NMR Spectrometer Systems With VNMR 6.1C Software [3].
The binary file is composed of multiple blocks separated by a block header. From the header, ST reads the number of blocks, the type of data (float, 16-bit integer, or 32-bit integer), whether the data is time or frequency domain [1], and a valid data flag.
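The header fields listed above can be decoded along these lines. The sketch reads the header at the start of the fid file; the 32-byte big-endian layout and the status bit values are assumptions recalled from the VNMR programming documentation, and should be verified against that manual before use. `describe_fid_header` is a hypothetical name, not an ST function.

```python
import struct

# Assumed layout: six 32-bit integers, two 16-bit integers, one 32-bit
# integer, all big-endian (32 bytes in total).
FILE_HEADER = struct.Struct(">6l2hl")

# Assumed status bits: valid data, frequency domain, 32-bit int, float.
S_DATA, S_SPEC, S_32, S_FLOAT = 0x1, 0x2, 0x4, 0x8


def describe_fid_header(raw):
    """Summarize the layout-relevant fields of a fid file header."""
    (nblocks, _ntraces, _np, _ebytes, _tbytes, _bbytes,
     _vers_id, status, _nbheaders) = FILE_HEADER.unpack(raw[:FILE_HEADER.size])
    if status & S_FLOAT:
        data_type = "float"
    elif status & S_32:
        data_type = "int32"
    else:
        data_type = "int16"
    return {
        "blocks": nblocks,
        "data_type": data_type,
        "domain": "frequency" if status & S_SPEC else "time",
        "valid": bool(status & S_DATA),
    }
```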
The procpar file is parsed and a subset of its parameters is used. Currently only uniformly sampled data is supported. The table below documents usage by Spectrum Translator. Usage codes are as follows:
- L indicates the metadata is used to determine the layout of NMR data.
- M indicates metadata read and written by ST as part of the translation that does not affect the NMR data layout.
Some metadata depends on the channel assignment. This can be specified via the channel_assignment configuration option. (By default channel one is assigned dimension one, etc.) The Channel or Dimension column indicates whether data is assigned by channel (C) or dimension (D).
Procpar parameter | Dimension or Channel number | Description | Channel or Dimension | Usage |
---|---|---|---|---|
tn | 1 | Name of nucleus | C | M |
dn | 2 | Name of nucleus | C | M |
dn2 | 3 | Name of nucleus | C | M |
dn3 | 4 | Name of nucleus | C | M |
dn4 | 5 | Name of nucleus | C | M |
np | 1 | Number points | D | L |
ni | 2 | Number points | D | L |
ni2 | 3 | Number points | D | L |
ni3 | 4 | Number points | D | L |
sfrq | 1 | Spectral frequency | C | M |
dfrq | 2 | Spectral frequency | C | M |
dfrq2 | 3 | Spectral frequency | C | M |
dfrq3 | 4 | Spectral frequency | C | M |
dfrq4 | 5 | Spectral frequency | C | M |
sw | 1 | Sweep width | D | M |
sw1 | 2 | Sweep width | D | M |
sw2 | 3 | Sweep width | D | M |
sw3 | 4 | Sweep width | D | M |
rfl | 1 | Reference Peak Position | C | M |
rfl1 | 2 | Reference Peak Position | C | M |
rfl2 | 3 | Reference Peak Position | C | M |
rfl3 | 4 | Reference Peak Position | C | M |
rfl4 | 5 | Reference Peak Position | C | M |
rfp | 1 | Reference Peak Frequency | C | M |
rfp1 | 2 | Reference Peak Frequency | C | M |
rfp2 | 3 | Reference Peak Frequency | C | M |
rfp3 | 4 | Reference Peak Frequency | C | M |
rfp4 | 5 | Reference Peak Frequency | C | M |
rp | 1 | Zero order phase correction | D | M |
rp1 | 2 | Zero order phase correction | D | M |
rp2 | 3 | Zero order phase correction | D | M |
rp3 | 4 | Zero order phase correction | D | M |
rp4 | 5 | Zero order phase correction | D | M |
lp | 1 | First order phase correction | D | M |
lp1 | 2 | First order phase correction | D | M |
lp2 | 3 | First order phase correction | D | M |
lp3 | 4 | First order phase correction | D | M |
lp4 | 5 | First order phase correction | D | M |
When translating from Varian format, Carrier PPM is calculated from the above values using the equation:
Carrier PPM = (Sweep Width/2 - Reference Peak Position + Reference Peak Frequency) / Spectral Frequency
When translating to Varian format, the Reference Peak Frequency is set to zero and the inverse of the above equation is used to calculate the Reference Peak Position.
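The equation and its inverse can be written out as a short sketch. The function names are hypothetical, and the units in the usage note (sweep width and reference values in Hz, spectral frequency in MHz) are an assumption chosen so that the quotient comes out in ppm.

```python
def carrier_ppm(sweep_width, ref_peak_position, ref_peak_frequency,
                spectral_frequency):
    """Carrier PPM when translating from Varian (sw, rfl, rfp, sfrq)."""
    return (sweep_width / 2 - ref_peak_position + ref_peak_frequency) \
        / spectral_frequency


def reference_peak_position(sweep_width, carrier, spectral_frequency):
    """Inverse used when translating to Varian, with rfp fixed at zero."""
    return sweep_width / 2 - carrier * spectral_frequency
```

For example, a sweep width of 8000, a reference peak position of 1180, a reference peak frequency of 0, and a spectral frequency of 600 give (4000 - 1180) / 600 = 4.7 ppm; feeding 4.7 ppm back through the inverse recovers a reference peak position of 1180.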
The layout of real and imaginary numbers, and whether a dimension is complex, is inferred from the value of the array procpar parameter, as outlined in the table below. Varian data that does not follow this convention, e.g. data from custom pulse programs, cannot currently be translated by the Spectrum Translator.
array parameter value | Dimension or Channel number | Description | Channel or Dimension | Usage |
---|---|---|---|---|
phase | 2 | Data complex | D | L |
phase2 | 3 | Data complex | D | L |
phase3 | 4 | Data complex | D | L |
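The inference in the table above might look like the following sketch. It assumes the array parameter value is a comma-separated list of parameter names (for example `phase,phase2`); the function name is hypothetical, and anything outside the table is rejected rather than guessed at, in keeping with ST's design philosophy.

```python
# Mapping taken from the table above: array entry -> complex dimension.
PHASE_TO_DIMENSION = {"phase": 2, "phase2": 3, "phase3": 4}


def complex_indirect_dimensions(array_value):
    """Return the complex indirect dimensions, in the order listed."""
    names = [name.strip() for name in array_value.split(",") if name.strip()]
    unknown = [name for name in names if name not in PHASE_TO_DIMENSION]
    if unknown:
        # Non-conventional arrays (e.g. custom pulse programs) are refused.
        raise ValueError(f"unsupported array entries: {unknown}")
    return [PHASE_TO_DIMENSION[name] for name in names]
```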
The procpar file is also used to support configuration of Varian's VnmrJ software. The ST Varian translation output does not directly support VnmrJ; however, the procpar_template option may be used to modify an existing procpar file with metadata translated from another data set.
Bruker
Bruker file formats are documented in the TopSpin Acquisition Reference Guide [4]. Two formats are used: one for raw time domain data coming from the spectrometer and another for data that has been processed, including conversion to the frequency domain. Code for processed data is present in the Spectrum Translator but has not completed quality assurance testing.
Bruker data consists of a file system hierarchy. Time domain data is stored in numbered directories beneath a named root directory; each time domain directory contains a subdirectory named pdata, which holds the numbered directories in which processed data is stored.
Both formats store metadata in dimension specific files.
Dimension | Time domain data | Processed data |
---|---|---|
1 | acqus | procs |
2 | acqu2s | proc2s |
3 | acqu3s | proc3s |
4 | acqu4s | proc4s |
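The naming pattern in the table, combined with the directory hierarchy described above, can be captured in a small helper. This is an illustrative sketch: `metadata_path` and its parameters are hypothetical names, and the default processed-data directory number of 1 is an assumption.

```python
from pathlib import Path


def metadata_path(experiment_dir, dimension, processed=False, procno=1):
    """Locate the metadata file for one dimension of a Bruker data set."""
    if dimension not in (1, 2, 3, 4):
        raise ValueError("dimension must be between 1 and 4")
    # Dimension 1 has no digit in its file name (acqus, procs).
    suffix = "" if dimension == 1 else str(dimension)
    experiment_dir = Path(experiment_dir)
    if processed:
        return experiment_dir / "pdata" / str(procno) / f"proc{suffix}s"
    return experiment_dir / f"acqu{suffix}s"
```

Here `experiment_dir` is one of the numbered time domain directories beneath the named root directory.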
The tables below document usage by Spectrum Translator. Usage codes are as follows:
- L indicates the metadata is used to determine the layout of NMR data.
- M indicates metadata read and written by ST as part of the translation that does not affect the NMR data layout.
- MW indicates metadata written by ST as part of the translation that does not affect the NMR data layout.
The following parameters are read for the dimension 1 (direct) time domain data.
Parameter | Description | Usage |
---|---|---|
BYTORDP | whether data is big or little endian | L |
TD | number of data points | L |
AQ_mod | real or complex data | L |
PARMODE | the number of dimensions | L |
SFO1 | spectral frequency | M |
SW | sweep width (spectral window) in ppm | M |
SW_h | sweep width (spectral window) in Hertz | MW |
AQSEQ | acquisition sequence (3D sets only) | MW |
NUC1 | nucleus name | M |
The following parameters are read for the dimension 2 and above (indirect) time domain data.
Parameter | Description | Usage |
---|---|---|
FnMODE | real or complex data and sign alternation | L |
TD | number of data points | L |
SFO1 | spectral frequency | M |
SW | sweep width (spectral window) | M |
NUC1 | nucleus name | M |
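Reading the layout parameters from these files could be sketched as below. The sketch assumes the usual Bruker parameter syntax of `##$NAME= value` lines and that BYTORDP is 0 for little endian and 1 for big endian; both should be confirmed against the Bruker documentation. The function names are hypothetical, and missing or unrecognized values raise an error rather than falling back to a default.

```python
import re

# Assumed parameter line syntax: "##$NAME= value".
PARAM_LINE = re.compile(r"^##\$(\w+)=\s*(.*)$")


def parse_parameters(text):
    """Collect the ##$NAME= value lines of an acqus-style file."""
    params = {}
    for line in text.splitlines():
        match = PARAM_LINE.match(line)
        if match:
            params[match.group(1)] = match.group(2).strip()
    return params


def direct_dimension_layout(params):
    """Return (byte order, number of points) or refuse to continue."""
    missing = [key for key in ("BYTORDP", "TD") if key not in params]
    if missing:
        raise ValueError(f"missing layout parameters: {missing}")
    byte_order = {"0": "little", "1": "big"}.get(params["BYTORDP"])
    if byte_order is None:
        raise ValueError(f"unrecognized BYTORDP: {params['BYTORDP']}")
    return byte_order, int(params["TD"])
```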
[1] Much of the Spectrum Translator has been written to support frequency domain data; however, this code has not undergone quality assurance testing and is considered experimental.
[2] VNMR User Programming, VNMR 6.1C Software, Pub. No. 01-999165-00, Rev. A1200, (C) 2000, Varian, Inc.
[3] VNMR Command and Parameter Reference, Varian NMR Spectrometer Systems With VNMR 6.1C Software, Pub. No. 01-999164-00, Rev. B0801, (C) 2001, Varian, Inc.
[4] TOPSPIN Acquisition Reference Guide, Part Number H9775SA1 V2, February 3rd 2005, (C) 2005 Bruker BioSpin GmbH.