Spectrum Translator Configuration

Overview

The Connjur file converter may be configured via command line arguments or an XML (eXtensible Markup Language) file. If an option is specified in both the XML file and the command line, the command line setting takes precedence.
Command line arguments typically may be specified with a short option, e.g. -h, or a long option, e.g. --help. Either may be used; they are functionally equivalent.
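
The precedence rule can be pictured as a simple dictionary merge. The following Python sketch is only an illustration, not the converter's implementation; the function name effective_settings and the setting values are hypothetical.

```python
def effective_settings(xml_settings, cli_settings):
    """Merge settings; a value given on the command line wins over the XML file."""
    merged = dict(xml_settings)   # start from the XML file
    merged.update(cli_settings)   # command line overrides on conflict
    return merged


# Hypothetical values, for illustration only.
xml_settings = {"loglevel": "INFO", "endian": "big"}
cli_settings = {"loglevel": "DEBUG"}
settings = effective_settings(xml_settings, cli_settings)
print(settings)
```

Here loglevel comes from the command line, while endian, which appears only in the XML file, is kept unchanged.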

Two types of control constructs are supported: flags, which are present or not, and options, which take a value as an argument. Flags and options are contained within section XML elements. Depending on the particular element, options are specified as XML attributes, e.g. <option name="Magnetic" />, or as XML values, e.g. <format>varian</format>.

To prevent file converter misbehavior due to ambiguous directives, the XML file must be both well-formed and valid with respect to the file converter XML Schema packaged within the application. A validating editor, such as the free Serna Free or a commercial product such as XMLSpy, may be helpful for writing the configuration file properly, but it is not required. The file may be validated by the file converter using the check XML file option (-cx, --checkxmlfile).

XML Document Header

Each XML document should begin with the following lines:

<?xml version="1.0" encoding="UTF-8"?>
<convert
xmlns="http://connjur.uchc.edu"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://connjur.uchc.edu jar:converterOptions.xsd"> 

Convert Element

The top level XML element is convert. Within convert are five possible sub-elements (control, metadata, source, destination, and semantic). Each element is optional.
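
As a rough illustration of this layout, the Python sketch below lists which sections are present in a small configuration. It uses only the standard library, checks well-formedness but does not validate against the converter's schema, and the helper name present_sections is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://connjur.uchc.edu"  # default namespace from the document header

# A deliberately tiny configuration; every section is optional.
CONFIG = """<?xml version="1.0" encoding="UTF-8"?>
<convert xmlns="http://connjur.uchc.edu">
  <control><overwrite /></control>
  <source><format>varian</format></source>
  <destination><format>nmrpipe</format></destination>
</convert>"""


def present_sections(xml_text):
    """Return the names of the sections found directly under <convert>."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    prefix = "{" + NS + "}"
    if root.tag != prefix + "convert":
        raise ValueError("top level element must be <convert>")
    # Strip the namespace prefix from each child tag.
    return [child.tag[len(prefix):] for child in root]


sections = present_sections(CONFIG)
print(sections)
```

Because the document declares a default namespace, every element name is qualified by it when parsed, which is why the prefix is stripped before the names are reported.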

Control Element

The control section contains three possible settings (two flags and one option), as outlined below.

Ignoreextradata

<ignoreextradata />. Flag. By default the file converter considers extra data in the vendor specific binary file as an error. Extra data may represent a misreading of metadata, indicating an increased risk of incorrect file conversion. This option suppresses the extra data error.
Command line equivalent: -ie, --ignoreextradata

Loglevel

<loglevel>INFO</loglevel> Option. The file converter can display processing information. By default only fatal errors are logged; more verbose levels may be useful in diagnosing errors. The allowed levels are OFF, ALL, DEBUG, INFO, WARN, ERROR, FATAL.
Command line equivalent: -ll, --loglevel. Finer grained control over logging is provided by using a log4j.properties file. Note that the DEBUG and INFO levels produce significant output and substantially slow down file conversion.
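
The effect of a log level threshold can be sketched as follows. This Python fragment assumes the usual log4j ordering of the levels listed above (ALL lowest, OFF highest); the function name is_logged is hypothetical and the code is not part of the converter.

```python
# Thresholds ordered from most verbose to least verbose (log4j ordering assumed).
LEVELS = ["ALL", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "OFF"]


def is_logged(threshold, message_level):
    """True if a message at message_level is shown when loglevel is threshold."""
    threshold = threshold.upper()
    message_level = message_level.upper()
    if message_level in ("ALL", "OFF"):
        raise ValueError("ALL and OFF are thresholds, not message levels")
    return LEVELS.index(message_level) >= LEVELS.index(threshold)


print(is_logged("INFO", "WARN"))   # a warning passes an INFO threshold
print(is_logged("INFO", "DEBUG"))  # a debug message does not
```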

Overwrite

<overwrite />. Flag. By default the file converter will not overwrite existing binary or metadata files. This option instructs the file converter to overwrite existing files.
Command line equivalent: -f, --force

Metadata Element

The metadata section allows specification of additional data not determinable from the specified input files. Not all metadata is used by all vendor formats; metadata not supported by the particular output format will not be utilized.

Comment Element

<comment value="Comment about file" /> Option. Specifies a note about the converted file.

Dimname Element

<dimname dim="1" name="H1" /> Option. Specifies the name (label) for a dimension.

Source and Destination Elements

The source and destination elements specify the file converter input and output. They share common subelements.

Format

<format>nmrpipe</format> Option. The vendor specific file format.
Command line equivalents: -st, --srctype (input), -dt, --desttype (output).

Name

<name type="file">input.dat</name> Option. Specify name and type of input or output file. type must be "file" or "directory" and depends on the vendor format selected. Some formats require use of pattern instead. name and pattern should not both be specified.
Command line equivalents for files: -sf, --srcfile (input) or -df, --destfile (output).
Command line equivalents for directories: -sd, --srcdir (input) or -dd, --destdir (output).

Pattern

<pattern>input%02d.dat</pattern> Option. Specify pattern for multiple input or output file names. A pattern requires one or more specifications of the form %02d; these are replaced with a sequence of integers upon processing. The number, in this example "2," indicates the width of the integer field. The example would expand to input01.dat, input02.dat, etc.
Currently pattern is used only for NMRPipe 3D and 4D data sets.
Command line equivalents: -sp, --srcpattern (input) or -dp, --destpattern (output).
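
The expansion of a pattern can be illustrated with Python's printf-style formatting, which uses the same %02d notation. This is a sketch only: the function name expand_pattern is hypothetical, numbering is assumed to start at 1 (as in the example above), and only a pattern with a single specification is handled.

```python
def expand_pattern(pattern, count, start=1):
    """Expand a single %0Nd specification into `count` sequential file names."""
    return [pattern % index for index in range(start, start + count)]


names = expand_pattern("input%02d.dat", 3)
print(names)  # ['input01.dat', 'input02.dat', 'input03.dat']
```

The width is a minimum: an index wider than the field, such as 100 with %02d, is written in full rather than truncated.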

Option

<option name="some_option" value="10" /> Option.
<option name="some_flag" /> Flag.
(Note: the above are examples of syntax, not actual options.) Vendor format specific option. The help command line option provides a list of these.
Command line equivalent: -some_option 10 or -some_flag.

Destination Element

In addition to the common elements listed above, the following elements may be present only in the destination section.

Dimseq

<dimseq>312</dimseq> Option. Specify output sequence of NMR dimensions. In the example given, input dimension 3 would be output as the first dimension, 1 as the second, and 2 as the third.
Command line equivalent: -ds, --dimseq.
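
The reordering described above can be sketched in Python as a permutation of dimension labels. The function name reorder_dimensions, the labels, and the validity check are assumptions made for illustration; the sketch also assumes single digit dimension numbers.

```python
def reorder_dimensions(dimseq, input_dims):
    """Return the input dimensions in output order.

    The k-th digit of dimseq names the (1-based) input dimension that is
    written as output dimension k.
    """
    order = [int(ch) for ch in dimseq]
    # Assumed sanity check: each input dimension is used exactly once.
    if sorted(order) != list(range(1, len(input_dims) + 1)):
        raise ValueError("dimseq must use each input dimension exactly once")
    return [input_dims[i - 1] for i in order]


labels = reorder_dimensions("312", ["1H", "13C", "15N"])
print(labels)  # ['15N', '1H', '13C']
```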

Endian

<endian>big</endian> Option. Specify the endianness of the binary output data file(s). Valid values are big and little.
Command line equivalent: -e, --endian
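
To make the difference concrete, the Python sketch below packs the same IEEE 32-bit float in both byte orders using the standard struct module. It only illustrates endianness in general and says nothing about how the converter writes its files.

```python
import struct

value = 1.0
big = struct.pack(">f", value)     # big endian IEEE float32
little = struct.pack("<f", value)  # little endian IEEE float32
print(big.hex(), little.hex())     # 3f800000 0000803f
```

The two encodings contain the same four bytes in reverse order, so reading data with the wrong endianness yields incorrect values rather than an error.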

Datatype

<datatype>ieee_float32</datatype> Option. Specify the numeric output type. Not all data types are valid for all vendor formats, and the Spectrum Translator will not perform conversions which truncate data (e.g. converting floating point numbers to integers). Valid values are ieee_float32, int32, short_int.
Command line equivalent: none

Semantic Element

Semantic operations may be specified, in order, by one or more op elements.

OP

<op name="negateimaginaries"/> Flag. Semantic operation which requires no parameters.

<op name="rancekay">1</op> Option. Semantic operation which requires a parameter.
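
Because operations are applied in the order listed, their sequence matters. The Python sketch below shows ordered application only. It assumes that negating imaginaries means negating the imaginary component of each complex point, and it uses a hypothetical scale operation as a stand-in for an operation that takes a parameter; it does not implement the Rance-Kay correction.

```python
def negate_imaginaries(points):
    """Negate the imaginary part of every complex point."""
    return [p.conjugate() for p in points]


def scale(points, factor):
    """Hypothetical parameterized operation, used only to show ordering."""
    return [p * factor for p in points]


def apply_ops(points, ops):
    """Apply (function, args) pairs in the order listed."""
    for func, args in ops:
        points = func(points, *args)
    return points


data = [complex(1, 2), complex(3, -4)]
result = apply_ops(data, [(negate_imaginaries, ()), (scale, (2,))])
print(result)  # [(2-4j), (6+8j)]
```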

Complete Sample XML File

Below is a complete sample of an XML file. Individual elements and sections are optional and may be omitted in actual configuration files:

<?xml version="1.0" encoding="utf-8"?>
<convert xmlns="http://connjur.uchc.edu"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://connjur.uchc.edu jar:converterOptions.xsd">

  <control>
    <loglevel>info</loglevel> <!-- Display logging messages at info level and above. -->
    <ignoreextradata /> <!-- Ignore leftover data in binary file. -->
    <overwrite /> <!-- Overwrite existing "nhsqc.sec" and "nhsqc.par" files. -->
  </control>
  <metadata>
    <comment value="nmrpipe to rowland conversion" /> <!-- Add this comment to output if supported by writer. -->
    <dimname dim="1" name="15N" /> <!-- Label the first converted dimension as "15N", if supported by writer. -->
    <dimname dim="2" name="1H" /> <!-- Label the second converted dimension as "1H", if supported by writer. -->
  </metadata>
  <source>
    <format>nmrpipe</format> <!-- Input format is NMRPipe. -->
    <name type="file">nhsqc.dat</name> <!-- Input is file named "nhsqc.dat". -->
    <option name="city" value="farmington" /> <!-- Not a real option. Demonstration of setting option with value. -->
    <option name="uchc" /> <!-- Not a real option. Demonstration of setting flag. -->
  </source>
  <destination>
    <format>rowland</format> <!-- Output format is Rowland Toolkit. -->
    <name type="directory">mydata</name> <!-- Output is directory named "mydata". -->
    <dimseq>21</dimseq> <!-- Write input dimension 2 as output dimension 1, and input dimension 1 as output dimension 2. -->
    <endian>big</endian> <!-- Output binary data in big endian. -->
    <datatype>ieee_float32</datatype> <!-- Output data in IEEE floating point format. -->
  </destination>
  <semantic>
    <op name="negateimaginaries"/> <!-- After reading data, first negate imaginary values. -->
    <op name="rancekay">1</op> <!-- After negating imaginary values, perform Rance Kay correction on the first dimension. -->
  </semantic>
</convert>
     

Command Line Only Options

The following options are only available via the command line:

  • help: -h, --help. Show command line help.
  • xmlfile: -x, --xmlfile. Specify XML file name.
  • xml help: -xh, --xmlhelp. Show XML specific options.
  • validate xml: -cx, --checkxmlfile. Validate XML file for correct formatting without executing file conversion.
  • extract schema: -dx, --dumpschema. Extract copy of W3C Schema from Java jar file.

Non Uniform Sample Options

The following nonuniform processing options have been added to the command line. Bruker non-uniform conversion is not currently supported.

Non-uniform Input

  • -scd, --schedulein: nonuniform data schedule file (required for non-uniform samples)
  • -avg, --averaged: output data has transient values averaged
  • -ctdin, --counted: input counted schedule file
  • -sparse: output nonuniform data in uniform format, with missing values zero filled
  • -tran, --transients: nonuniform input data has transients (duplicate lines in schedule file)

Uniform Input

  • -compress: compress sparse uniform data (data containing FIDs with all zeros) to nonuniform

Command Line Examples

java -jar ConnjurST.jar -st varian -sd vdata -dt nmrpipe -df mydata.pip
Convert Varian data in directory named "vdata" into an NMRPipe file named "mydata.pip".

java -jar ConnjurST.jar -st rowland -sf data.dat -dt nmrpipe -dp mydata%02d.pip
Convert Rowland Toolkit file named "data.dat" into a set of NMRPipe files named "mydata01.pip", "mydata02.pip", ...

connjurst -nonuniform -st varian -sd 2dnus -schedulein varian.scd -dt rowland -df r2d.sec
Convert nonuniform Varian sample in directory "2dnus" with schedule file "varian.scd" to nonuniform Rowland file "r2d.sec" (using default output schedule file name).

Summary Table

The following table summarizes the available options:

Item | Type | Command line short | Command line long | XML section | XML element | XML attribute(s) | Example value
--- | --- | --- | --- | --- | --- | --- | ---
Source type | Option | -st | --srctype | source | format | none | nmrpipe
Source file name | Option | -sf | --srcfile | source | name | type="file" | 2Dspectrum.dat
Source directory name | Option | -sd | --srcdir | source | name | type="directory" | nhsqc
Source name pattern | Option | -sp | --srcpattern | source | pattern | none | 3D%02d.dat
Generic source option | Option | none | --option_name | source | option | name="option_name", value="value" | none
Generic source flag | Flag | none | --flag_name | source | option | name="flag_name" | none
Destination type | Option | -dt | --desttype | destination | format | none | varian
Destination file name | Option | -df | --destfile | destination | name | type="file" | spectrum.dat
Destination directory name | Option | -dd | --destdir | destination | name | type="directory" | converted
Destination name pattern | Option | -dp | --destpattern | destination | pattern | none | out%02d-%03d.dat
Output dimension order | Option | -ds | --dimseq | destination | dimseq | none | 132
Output endianness | Option | -e | --endian | destination | endian | none | little
Output datatype | Option | none | none | destination | datatype | none | ieee_float32
Generic destination option | Option | none | --option_name | destination | option | name="option_name", value="value" | none
Generic destination flag | Flag | none | --flag_name | destination | option | name="flag_name" | none
Data set comment | Option | none | none | metadata | comment | value="comment" | none
Dimension name | Option | none | none | metadata | dimname | dim="1" name="H1" | none
Log level | Option | -ll | --loglevel | control | loglevel | none | INFO
Extra data | Flag | -ie | --ignoreextradata | control | ignoreextradata | none | none
Overwrite existing files | Flag | -f | --force | control | overwrite | none | none
Semantic Operation | Flag | -op | --operation | semantic | op | name="operation_name" | none
Semantic Operation | Option | -op | --operation | semantic | op | name="operation_name" | 1