[FRPythoneers] MEAS Data file format
jvickroy at sec.noaa.gov
Fri Aug 11 12:15:05 MDT 2000
This appears to meet your criteria and seems straightforward enough.
Out of curiosity, what are your reasons for steering away from XML or
relational databases?
From: Wayde Allen [mailto:wallen at boulder.nist.gov]
Sent: Friday, August 11, 2000 11:43 AM
To: List: Boulder Linux User's Group
Cc: List: Front Range Pythoneers; Cleland, Bob; Hough, Cristopher
Subject: [FRPythoneers] MEAS Data file format
A typical problem here at NIST is the exchange of measurement data between
various measurement systems. Typically, a large number of researchers
generate data, and most of them want to spend as little time as possible
coding. This usually means that each researcher creates "flat" row and
column formatted data and simply assigns some filename that is supposed to
have some meaning. Unfortunately, filenames convey very little information
about the measurement, its conditions, or purpose; and, to make things
worse, they can easily get changed. This results in collections of files
simply containing sets of numbers without any memory of how or why these
data were gathered.
As if we really need yet another file format, I've created the following
file description loosely based on Hewlett Packard's CITIFILE format, and
have tried to make this simple so that it has a chance of being used by
someone who would normally just create a row and column set of data. The
idea is that the file can be simply used by most graphing programs such as
gnuplot without modification, but can also be used to convey information
needed for more careful analysis or report generation. I figured I'd post
this here to see if anyone has comments, good or bad. What do you think?
(wallen at boulder.nist.gov)
Measurement Exchange and Storage (MEAS) File Format
The Measurement Exchange and Storage (MEAS) format is designed to be a
simple data exchange and storage mechanism based on the following principles:
- the data file should be self-documenting so that you can figure out
the file's contents without needing external documentation,
- the structure needs to be highly extensible to deal with changing
data and contents,
- the resulting file and code needed to read it should not be
"brittle". In other words, if someone creates an extension to the
file this should NOT invalidate old legacy code,
- and finally, the structure should be as simple to implement as
possible.
In order to achieve these goals, there are two key observations. The
first is that the most common, and hence "simplest", representation of
file data is the so-called row and column format. The second is that it
is common practice to write data parsing programs to ignore lines that
start with a special character. A typical character to use for these
so-called "comment" lines is the '#' symbol. This structure alone allows
us to write data files in a "natural" form, and to embed extra information
in the form of comment lines.
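The comment-skipping convention described above can be sketched in a few
lines of Python. This is only an illustration, not part of the format
description: the function name read_rows and the three-column sample
header are made up for the example.

```python
def read_rows(lines, comment_char="#"):
    """Return the numeric rows of a row-and-column file, skipping comments."""
    rows = []
    for line in lines:
        stripped = line.strip()
        # Blank lines and lines starting with the comment character hold no data.
        if not stripped or stripped.startswith(comment_char):
            continue
        rows.append([float(field) for field in stripped.split()])
    return rows


# Illustrative input: a comment line, two data rows, and a blank line.
sample = """\
# Freq(GHz)  Magnitude  Phase
0.010  0.9996  179.89

0.020  1.0001  179.75
"""
rows = read_rows(sample.splitlines())
print(rows)
```

A reader like this never looks inside the comment lines, which is what
leaves room for embedding extra information in them.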
However, we often want to be able to extract some of the information in
the comment lines for use in processing these data. This is easily done
by introducing keywords that begin with the # symbol. For instance, using
a keyword such as '#DATE:' allows us to enter a line in this file that
identifies the location of the date information while still complying with
the comment line idea. This means that programs written to specifically
look for the #keyword data can do something useful with this information.
Any other program simply treats such lines as comments and ignores them.
This has the added advantage that keywords for additional data can be
added at any time without causing programs not specifically looking for
these keywords to fail. The current list of established keywords is:
#               - Any line beginning with # followed by whitespace is
                  an undefined keyword. This allows for comments
                  internal to the measurement file itself.
#BEGIN_TEST     - Marks the beginning of a test set. This allows more
                  than one test to be included in a single file if
                  desired.
#END_TEST       - Marks the end of a test block
#BEGIN_DATA     - Marks the beginning of a data block so more than one
                  set of data can be included for a given test.
#END_DATA       - Marks the end of a data block
#COMMENT:       - Comments used to describe the measurement. These
                  are cumulative, and subsequent comment lines are
                  appended to those that came before in the file.
#CUSTOMER:      - Who these data belong to
#DATATYPE:      - Are the numbers represented as COMPLEX or MAGPHASE
#DATE:          - Date the data were taken
#DEVICE:        - The relevant device ID(s)
#FILENAME:      - The name of this data file
#FOLDER:        - The ID number of the measurement folder
#FREQSCALE:     - The frequency units (Hz, kHz, MHz, GHz, etc.)
#MANUFACTURER:  - The device manufacturer
#OPERATOR:      - Who was running the equipment that took these data
#STANDARDS:     - Space-separated list of calibration standard IDs used
#SYSTEM:        - What measurement system was used to obtain these data
#VERSION:       - This describes the file revision number
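As a rough sketch of how a program might pick up the "#KEYWORD: value"
lines, the Python below collects them into a dictionary. The names
KEYWORD_LINE and parse_keywords are hypothetical, and two choices are
assumptions of this sketch rather than rules from the description:
cumulative #COMMENT: lines are joined with a single space, and a repeated
keyword other than #COMMENT: simply overwrites the earlier value.

```python
import re

# Matches lines such as "#DATE: Tuesday, April 18, 2000".
KEYWORD_LINE = re.compile(r"^#([A-Z_]+):\s*(.*)$")


def parse_keywords(lines):
    """Collect '#KEYWORD: value' lines into a dict; ignore everything else."""
    info = {}
    for line in lines:
        match = KEYWORD_LINE.match(line.strip())
        if match is None:
            continue  # data row, blank line, or plain "# ..." comment
        keyword, value = match.group(1), match.group(2).strip()
        if keyword == "COMMENT" and "COMMENT" in info:
            # #COMMENT: lines are cumulative, so append to what came before.
            info["COMMENT"] += " " + value
        else:
            info[keyword] = value  # assumption: the last occurrence wins
    return info


lines = [
    "#DATE: Tuesday, April 18, 2000",
    "#OPERATOR: Wayde Allen",
    "#COMMENT: This file contains measurement data for the gamma_g",
    "#COMMENT: program.",
    "# Freq(GHz) Short(Magnitude, Phase) Open(Magnitude,Phase)",
    "0.010 0.9996 179.89 0.9998 -0.13 0.0007 123.48",
]
info = parse_keywords(lines)
print(info["DATE"])
print(info["COMMENT"])
```

The block markers (#BEGIN_TEST, #END_TEST, #BEGIN_DATA, #END_DATA) carry
no colon, so this particular pattern skips them; a fuller reader would
need to handle them separately.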
The following is a sample meas datafile:
#VERSION: HighPower 1.0.0
#DATE: Tuesday, April 18, 2000
#STANDARDS: 814211, 814212, 814214
#MANUFACTURER: Hewlett Packard
#OPERATOR: Wayde Allen
#COMMENT: This file contains measurement data for the gamma_g
#COMMENT: program. These data are the result of a compilation
#COMMENT: of measurements done on the devices by both the 6-port
#COMMENT: and low frequency impedance labs.
# Freq(GHz) Short(Magnitude, Phase) Open(Magnitude,Phase)
0.010 0.9996 179.89 0.9998 -0.13 0.0007 123.48
0.020 1.0001 179.75 1.0000 -0.31 0.0006 125.05
0.030 1.0000 179.61 1.0000 -0.46 0.0005 88.07
0.040 1.0000 179.48 1.0000 -0.62 0.0007 62.40
0.050 1.0000 179.36 0.9999 -0.78 0.0009 52.96
0.060 0.9996 179.24 0.9999 -0.93 0.0011 45.63
0.070 0.9994 179.10 0.9998 -1.09 0.0011 47.02
As can be seen, this structure is human-readable, but can also be
simply parsed by a computer program. The keywords provide standardized
documentation for both the subsequent computer program and the user. Also, the
computer code needed to read this only needs to look for known keywords,
and to discard blank lines or lines beginning with '#'. Anything else
will be treated as data. This means that you can add any number of
arbitrary lines or additional keywords without affecting the code used to
read the file. Only code written to use any new keywords will care.
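To illustrate the claim that new keywords do not break existing readers,
here is a minimal Python sketch of such a reader. It only knows about
#DATE: and #OPERATOR:; the function name read_meas, the KNOWN set, and
the #HUMIDITY: keyword are all invented for the example and are not part
of the keyword list above.

```python
# The only keywords this hypothetical "legacy" reader looks for.
KNOWN = {"DATE", "OPERATOR"}


def read_meas(text):
    """Return (header dict, list of numeric rows) from MEAS-style text."""
    header, data = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # discard blank lines
        if line.startswith("#"):
            keyword, sep, value = line[1:].partition(":")
            if sep and keyword in KNOWN:
                header[keyword] = value.strip()
            continue  # any other '#' line is treated as a comment
        # Anything else is treated as data.
        data.append([float(field) for field in line.split()])
    return header, data


legacy = """\
#DATE: Tuesday, April 18, 2000
#OPERATOR: Wayde Allen
# Freq(GHz) Short(Magnitude, Phase) Open(Magnitude,Phase)
0.010 0.9996 179.89 0.9998 -0.13 0.0007 123.48
0.020 1.0001 179.75 1.0000 -0.31 0.0006 125.05
"""

# Same file with an invented keyword added; the reader above is unchanged.
extended = "#HUMIDITY: 40\n" + legacy

header, data = read_meas(legacy)
print(header)
print(len(data), "rows of", len(data[0]), "columns")
print(read_meas(extended) == (header, data))
```

The last line compares the results for the two files; since the reader
never looks for the added keyword, it should treat that line as just
another comment.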