pqact.conf
Introduction and General Syntax
A pqact configuration-file (typically pqact.conf) tells the
pqact process that reads it how to dispose of certain classes of
data-products. This file normally resides in the etc/ subdirectory of the
LDM installation.
The general syntax of an entry in the configuration-file is:
feedtype TAB prodIdPat TAB action TAB [arg ...]
where:
- feedtype
- A feedtype (e.g., WMO, IDS|DDPLUS, 3).
- prodIdPat
- An extended regular expression (ERE) for matching
data-product identifiers or the string ^_ELSE_$. If prodIdPat
is ^_ELSE_$, then the specified action is performed if nothing has been done with the data-product yet
and the first character of the data-product identifier is not an underscore (_).
- action
- The action to take with data-products that match
feedtype and prodIdPat. Possible actions are
- NOOP
- Don't do anything with the data-product. This might be useful to
prevent data-products from being acted-upon by a subsequent entry whose product-ID pattern is
^_ELSE_$.
- FILE
- Write the data-product to a file using the
write() function.
- STDIOFILE
- Write the data-product to a file using the (buffered)
fwrite() function.
- DBFILE
- Write the data-product to a database.
- EXEC
- Execute a program.
- PIPE
- Write the data-product to a program's standard input.
- [arg ...]
- Optional arguments for action. See String Substitution in Action-Arguments
below.
- TAB
- Is either a tab character or a newline character followed by a tab character.
Comments have a hash character (#) in column one.
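The entry layout described above can be illustrated with a short Python sketch. It is not the parser that pqact uses: the function name parse_entries is hypothetical, and treating a run of tabs as one separator and splitting arguments on whitespace are assumptions made for the example.

```python
import re

def parse_entries(text):
    """Split pqact-style configuration text into (feedtype, pattern, action, args) tuples.

    Illustrative only: assumes fields are separated by one or more tabs and
    that a line starting with a tab continues the previous entry.
    """
    logical = []  # logical lines after joining continuation lines
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # comment (hash in column one) or blank line
        if line.startswith("\t") and logical:
            logical[-1] += line  # newline + tab acts as a field separator
        else:
            logical.append(line)

    entries = []
    for line in logical:
        fields = re.split(r"\t+", line.strip())
        if len(fields) < 3:
            raise ValueError(f"incomplete entry: {line!r}")
        feedtype, pattern, action = fields[:3]
        # Assumption: action arguments are separated by whitespace.
        args = " ".join(fields[3:]).split()
        entries.append((feedtype, pattern, action, args))
    return entries

sample = "# hourly surface reports\nDDS\t^SAUS.. .... (..)(..)\n\tFILE\tsaus_\\1\\2.wmo\n"
print(parse_entries(sample))
```

Because the fields are split on tabs only, the spaces inside the product-ID pattern survive intact.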
String Substitution in Action-Arguments
In constructing arguments for an action, certain character sequences have special meaning. These sequences serve
as templates for replacement strings derived from either the combination of the
prodIdPat and the data-product identifier,
from the data-product creation-time, or from the
sequence number.
Simple Subexpression Replacement
The character sequence \x in the argument field is replaced by the substring of the
data-product identifier that matches the corresponding subexpression of
prodIdPat. x is either a single digit greater than 0 and less than or equal
to 9 (e.g., \3) or two digits surrounded by parentheses (e.g., \(12)). Thus, for example, the entry
DDS ^SAUS.. .... (..)(..)
FILE saus_\1\2.wmo
would append all products with a DDS feedtype that have
data-product identifiers (in this case WMO headers) beginning with the
characters "SAUS" to hourly files named "saus_ddhh.wmo", where dd and hh are the two-digit
day and hour from the data-product identifier, respectively.
ASIDE: Information on the format of WMO headers can be found at
https://www.weather.gov/tg/headef
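The replacement rule can be mimicked in a few lines of Python. This is a sketch under stated assumptions: Python's re module stands in for the POSIX ERE matching that pqact performs (the two agree for simple patterns like this one), and the function name substitute_groups and the sample identifier are made up for the illustration.

```python
import re

def substitute_groups(pattern, prod_id, template):
    """Replace \\N and \\(NN) in template with the groups matched in prod_id.

    Returns None when the identifier does not match the pattern.
    """
    match = re.search(pattern, prod_id)
    if match is None:
        return None

    def group_text(ref):
        # ref.group(1) is set for \(NN); ref.group(2) for a single digit \N
        index = int(ref.group(1) or ref.group(2))
        return match.group(index) or ""

    return re.sub(r"\\\((\d{2})\)|\\([1-9])", group_text, template)

# Hypothetical identifier: day 16, hour 15
print(substitute_groups(r"^SAUS.. .... (..)(..)", "SAUS44 KAMA 161540", r"saus_\1\2.wmo"))
# saus_1615.wmo
```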
Temporal Subexpression Replacement
If a parenthetical subexpression of prodIdPat delimits the day of the month field
in the data-product identifier, then that subexpression can be used to
obtain certain temporal strings in the argument field. If \x is the matching subexpression, then the
following character sequences in the argument field are replaced with the indicated strings (the parentheses are
mandatory):
- (\x:yyyy)
- is replaced with the 4-digit year.
- (\x:yy)
- is replaced with the 2-digit year of the century.
- (\x:mmm)
- is replaced with the 3-character abbreviation for the month.
- (\x:mm)
- is replaced with the 2-digit index of the month (Jan = 01).
- (\x:ddd)
- is replaced with the 3-digit day of the year.
- (\x:dd)
- is replaced with the 2-digit day of the month. This will differ from the original \x string if, for
example, the day-of-the-month field is 31 but the data-product arrives on September 30th.
where x is as described under "Simple Subexpression Replacement", above (i.e., either a single digit or
two digits surrounded by parentheses).
The interpretation of the day of the month subexpression is aided by the current clock-time in the (hopefully)
obvious way.
Thus, for example, the following entry
WMO ^...... .... ([0-3][0-9])([0-2][0-9]).*/pAGO
FILE data/gempak/nwx/obs/ago/(\1:yyyy)(\1:mm)\1\2.ago
would append matching data-products such as
SHUS44 KAMA 161540 /pAGOAMA
to files whose pathnames were based on the year, month, day, and hour of the
data-product as indicated or implied by the
data-product identifier.
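The text does not spell out how the clock-time resolves the day-of-month field. One plausible reading, sketched below in Python, is to pick the date nearest to the current time that has the given day of the month. The function name resolve_day, the nearest-date rule, and the sample clock-time are all assumptions for illustration, not a description of the pqact source.

```python
from datetime import datetime, timezone

def resolve_day(day, now):
    """Return the date nearest to `now` whose day-of-month equals `day`.

    Candidates come from the previous, current, and next month; months that
    lack the requested day (e.g. day 31 in September) are skipped.
    """
    candidates = []
    for offset in (-1, 0, 1):
        month_index = now.year * 12 + (now.month - 1) + offset
        year, month = divmod(month_index, 12)
        try:
            candidates.append(now.replace(year=year, month=month + 1, day=day))
        except ValueError:
            continue  # this month has no such day
    return min(candidates, key=lambda c: abs(c - now))

now = datetime(2024, 3, 17, 0, 5, tzinfo=timezone.utc)  # hypothetical clock-time
date = resolve_day(16, now)                             # day field "16" from the identifier
print(f"data/gempak/nwx/obs/ago/{date:%Y}{date:%m}1615.ago")
# data/gempak/nwx/obs/ago/2024031615.ago
```

Under this rule a product stamped with day 31 that arrives just after midnight on 1 January is filed under December of the previous year.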
The following character sequences in the argument field are replaced with the indicated strings based on the
data-product creation-time. The replacement is done using the function strftime(), which might support more
conversions on your system than those listed here (e.g., "%N" might obtain nanoseconds on some systems).
- %a
- is replaced by the locale's abbreviated weekday name.
- %A
- is replaced by the locale's full weekday name.
- %b
- is replaced by the locale's abbreviated month name.
- %B
- is replaced by the locale's full month name.
- %c
- is replaced by the locale's appropriate date and time representation.
- %C
- is replaced by the century number (the year divided by 100 and truncated to an integer) as a decimal number
[00,99].
- %d
- is replaced by the day of the month as a decimal number [01,31].
- %D
- same as %m/%d/%y.
- %e
- is replaced by the day of the month as a decimal number [1,31]; a single digit is preceded by a space.
- %F
- is equivalent to %Y-%m-%d (the ISO 8601:2000 standard date format).
- %h
- same as %b.
- %H
- is replaced by the hour (24-hour clock) as a decimal number [00,23].
- %I
- is replaced by the hour (12-hour clock) as a decimal number [01,12].
- %j
- is replaced by the day of the year as a decimal number [001,366].
- %m
- is replaced by the month as a decimal number [01,12].
- %M
- is replaced by the minute as a decimal number [00,59].
- %n
- is replaced by a newline character.
- %p
- is replaced by the locale's equivalent of either a.m. or p.m.
- %r
- is replaced by the time in a.m. and p.m. notation; in the POSIX locale this is equivalent to %I:%M:%S %p.
- %R
- is replaced by the time in 24 hour notation (%H:%M).
- %S
- is replaced by the second as a decimal number [00,61].
- %t
- is replaced by a tab character.
- %T
- is replaced by the time (%H:%M:%S).
- %u
- is replaced by the weekday as a decimal number [1,7], with 1 representing Monday.
- %U
- is replaced by the week number of the year (Sunday as the first day of the week) as a decimal number
[00,53].
- %V
- is replaced by the week number of the year (Monday as the first day of the week) as a decimal number [01,53].
If the week containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise,
it is the last week of the previous year, and the next week is week 1.
- %w
- is replaced by the weekday as a decimal number [0,6], with 0 representing Sunday.
- %W
- is replaced by the week number of the year (Monday as the first day of the week) as a decimal number [00,53].
All days in a new year preceding the first Monday are considered to be in week 0.
- %x
- is replaced by the locale's appropriate date representation.
- %X
- is replaced by the locale's appropriate time representation.
- %y
- is replaced by the year without century as a decimal number [00,99].
- %Y
- is replaced by the year with century as a decimal number.
- %Z
- is replaced by the timezone name or abbreviation, or by no bytes if no timezone information exists.
- %%
- is replaced by %.
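Python's strftime() accepts the same conversion specifications, so the effect of a few of them can be previewed with a short sketch. The creation-time, the pathname template, and the use of UTC are assumptions chosen for the example.

```python
from datetime import datetime, timezone

# Hypothetical creation-time of a data-product; UTC is an assumption here.
created = datetime(2024, 3, 16, 15, 40, 0, tzinfo=timezone.utc)

# A hypothetical action-argument containing strftime() conversions.
template = "data/surface/%Y%m%d/%H.wmo"
path = created.strftime(template)
day_of_year = created.strftime("%j")

print(path)         # data/surface/20240316/15.wmo
print(day_of_year)  # 076
```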
Replacement-String Derived from the Sequence Number
The string "(seq)" (without the quotes) in the argument field will be replaced by the
sequence number of the
data-product. This might be useful to differentiate two different products
that, nevertheless, have the same identifiers.
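As a rough illustration, the substitution amounts to a plain string replacement. In this Python sketch the function name substitute_seq, the pathname, and the decimal formatting of the sequence number are assumptions.

```python
def substitute_seq(template, sequence_number):
    """Replace every "(seq)" in template with the sequence number (decimal assumed)."""
    return template.replace("(seq)", str(sequence_number))

# Two products with the same identifier end up in different files.
first = substitute_seq("data/saus_(seq).wmo", 1234)
second = substitute_seq("data/saus_(seq).wmo", 1235)
print(first, second)  # data/saus_1234.wmo data/saus_1235.wmo
```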
NOOP Action
The NOOP action tells the pqact process to do nothing to the
data-product. This might be useful to prevent data-products from being
acted-upon by a subsequent entry whose product-ID pattern is ^_ELSE_$.
FILE Action
The FILE action tells the pqact process to write the
data-product to a file using the (unbuffered)
write() function.
The syntax of a FILE action is
FILE TAB [-overwrite] [-flush|-close] [-strip] [-log] [-metadata] pathname
where:
- -overwrite
- Causes the file to be completely rewritten every time it is opened; consequently, you should probably always
use the -close option in conjunction with this option.
- -flush
- Causes the fsync() function to be
called after a data-product is written.
- -close
- Causes the file to be closed after a data-product is written. The
default is to keep the file open.
- -strip
- Causes control characters other than newline (see
iscntrl()) to be removed from the
data-product before it is written to the file.
- -log
- Causes the pqact process to log the fact that it filed the
data-product.
- -metadata
- Causes the metadata of the data-product to be written to the file before any data. The metadata is written in
the following order using the indicated binary data-types of the C language:
- Metadata-length in bytes (uint32_t)
- Data-product signature (MD5 checksum) (uchar[16])
- Data-product size in bytes (uint32_t)
- Product creation-time in seconds since the epoch:
- Integer portion (uint64_t)
- Microseconds portion (int32_t)
- Data-product feedtype (uint32_t)
- Data-product sequence number (uint32_t)
- Product-identifier:
- Length in bytes (excluding NUL) (uint32_t)
- Non-NUL-terminated string (char[])
- Product-origin:
- Length in bytes (excluding NUL) (uint32_t)
- Non-NUL-terminated string (char[])
- The endianness of the multi-byte primitive types is that of the local host.
- pathname
- Is the pathname of the file to which the data-product will be
written.
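A reader for the -metadata layout listed above might look like the following Python sketch. It assumes that the fields are written back to back with no padding and in the byte order of the machine running the script, and it does not rely on whether the metadata-length counts its own four bytes, because the text does not say. The function name parse_metadata and the sample values are hypothetical.

```python
import struct

def parse_metadata(buf):
    """Decode the metadata block that precedes the data when -metadata is used."""
    offset = 0

    def take(fmt):
        nonlocal offset
        values = struct.unpack_from(fmt, buf, offset)
        offset += struct.calcsize(fmt)
        return values

    # "=" selects native byte order with standard sizes and no padding.
    (meta_len,) = take("=I")
    (signature,) = take("=16s")
    (size,) = take("=I")
    seconds, microseconds = take("=Qi")
    feedtype, seqno = take("=II")
    (id_len,) = take("=I")
    (prod_id,) = take(f"={id_len}s")
    (origin_len,) = take("=I")
    (origin,) = take(f"={origin_len}s")
    return {
        "metadata_length": meta_len,
        "signature": signature.hex(),
        "size": size,
        "created": seconds + microseconds / 1e6,
        "feedtype": feedtype,
        "sequence_number": seqno,
        "product_id": prod_id.decode("ascii", "replace"),
        "origin": origin.decode("ascii", "replace"),
        "data_offset": offset,  # where the data-product itself begins
    }

# Build a made-up header followed by four bytes of data, then decode it.
pid, org = b"SAUS44 KAMA 161540", b"host.example"
body = (struct.pack("=16sIQiII", bytes(16), 4, 1710603600, 250000, 1, 42)
        + struct.pack("=I", len(pid)) + pid
        + struct.pack("=I", len(org)) + org)
buf = struct.pack("=I", len(body)) + body + b"DATA"
meta = parse_metadata(buf)
print(meta["product_id"], meta["sequence_number"])  # SAUS44 KAMA 161540 42
```

Since the endianness is that of the writing host, a file copied to a machine with a different byte order would need the "=" prefix replaced by "<" or ">".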
STDIOFILE Action
The STDIOFILE action tells the pqact process to write the
data-product to a file using the (buffered)
fwrite() function. In general, this
is more efficient than the FILE action but risks losing data if the computer crashes.
The syntax of an STDIOFILE action is
STDIOFILE TAB [-overwrite] [-flush|-close] [-strip] [-log] pathname
where the options and argument are the same as for the FILE action, except that the -flush
option calls the fflush()
function.
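The risk of buffered output can be seen in a small Python experiment, offered here as an analogy rather than as pqact code: Python's buffered file object plays the role of the stdio stream, and its flush() method plays the role of fflush(). The file name and buffer size are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "product.dat")
    with open(path, "wb", buffering=8192) as stream:
        stream.write(b"data-product")        # held in the user-space buffer
        size_before = os.path.getsize(path)  # nothing has reached the file yet
        stream.flush()                       # comparable to fflush()
        size_after = os.path.getsize(path)

print(size_before, size_after)  # 0 12
```

Anything still in the buffer when the process or the computer dies is lost, which is the trade-off against the lower system-call overhead.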
DBFILE Action
The DBFILE action tells the pqact process to store the
data-product in a gdbm database.
The syntax of a DBFILE action is
DBFILE TAB pathname [key]
where:
- pathname
- Is the pathname of the gdbm database into which the
data-product will be put.
- [key]
- Is the optional key under which to put the data-product.
EXEC Action
The EXEC action tells the pqact process to execute a program as a child
process.
The syntax of an EXEC action is
EXEC TAB [-wait] pathname [arg ...]
where:
- -wait
- Causes the pqact process to suspend itself until the child-process has
terminated before continuing. This should only be done if it is known that the child-process will terminate
quickly.
- pathname
- Is the pathname of the program to be executed.
- [arg ...]
- Are optional arguments to pathname.
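The difference that -wait makes can be sketched with Python's subprocess module. This is only an analogy for the behavior described above; the function name run_action is hypothetical, and the child program is a trivial Python command chosen so that the example runs anywhere.

```python
import subprocess
import sys

def run_action(argv, wait=False):
    """Start argv as a child process; block until it exits only if wait is true."""
    child = subprocess.Popen(argv)
    if wait:
        child.wait()  # the parent is suspended until the child terminates
    return child

child = run_action([sys.executable, "-c", "pass"], wait=True)
print(child.returncode)  # 0
```

Without wait=True the caller continues immediately, so a slow child does not hold up the processing of later data-products.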
PIPE Action
The PIPE action tells the pqact process to execute a program as a child process
and to write the data-product to the standard input of the child
process.
The syntax of a PIPE action is
PIPE TAB [-strip] [-flush|-close] [-metadata] pathname [arg ...]
where:
- -strip
- Causes control characters other than newline (see
iscntrl()) to be removed from the
data-product before it is written to the pipe.
- -flush
- Causes the pqact process to flush its internal buffer to the pipe at the
end of each data-product.
- -close
- Causes the pqact process to close the pipe to the child process after writing
the data-product. The default is to keep the pipe open.
- -metadata
- Causes the metadata of the data-product to be written to the pipe before any data. The metadata is written in
the following order using the indicated binary data-types of the C language:
- Metadata-length in bytes (uint32_t)
- Data-product signature (MD5 checksum) (uchar[16])
- Data-product size in bytes (uint32_t)
- Product creation-time in seconds since the epoch:
- Integer portion (uint64_t)
- Microseconds portion (int32_t)
- Data-product feedtype (uint32_t)
- Data-product sequence number (uint32_t)
- Product-identifier:
- Length in bytes (excluding NUL) (uint32_t)
- Non-NUL-terminated string (char[])
- Product-origin:
- Length in bytes (excluding NUL) (uint32_t)
- Non-NUL-terminated string (char[])
- The endianness of the multi-byte primitive types is that of the local host.
- pathname
- Is the pathname of the program to be executed.
- [arg ...]
- Are optional arguments to pathname.
The program pathname should be written so that it times out and terminates after some interval
(e.g., ten minutes).
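One way for a decoder to honor this advice is to stop reading once its input has been idle for too long. The Python sketch below shows the idea for Unix-like systems, where select() works on pipes. The function name read_until_idle is hypothetical, and the demonstration uses a local pipe with a one-second timeout instead of standard input and ten minutes.

```python
import os
import select

def read_until_idle(fd, idle_timeout):
    """Read from fd until end-of-file or until no data arrives for idle_timeout seconds.

    Returns (data, timed_out).
    """
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            return b"".join(chunks), True   # idle for too long: give up
        chunk = os.read(fd, 65536)
        if not chunk:
            return b"".join(chunks), False  # writer closed the pipe
        chunks.append(chunk)

# Demonstration with a local pipe standing in for the decoder's standard input.
read_end, write_end = os.pipe()
os.write(write_end, b"SAUS44 KAMA 161540")
os.close(write_end)
result = read_until_idle(read_end, 1.0)
os.close(read_end)
print(result)  # (b'SAUS44 KAMA 161540', False)
```

A real decoder would pass sys.stdin.fileno() and a timeout such as 600 seconds, then exit when the function returns.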
Checking Modifications
Modifications to a configuration-file should be checked for correct syntax before being made operational. The
syntax of all pqact configuration-files that are associated with active
EXEC pqact entries in the LDM configuration-file can be checked via the
command
ldmadmin pqactcheck
Otherwise, the syntax of the single pqact configuration-file, pathname,
can be checked via the command
ldmadmin pqactcheck -p pathname
Limit on the Number of Open Output-Files
The pqact utility has a limit on the number of open output-files. An output-file
is opened for every unique instance of the following actions:
FILE
STDIOFILE
DBFILE
PIPE
when taken together with their action-arguments after string substitution. Thus,
for example, the (somewhat contrived) entry
ANY (.*) FILE \1
would cause an output-file to be opened for every unique
data-product identifier that the
pqact process encountered!
The limit on the number of open output-files is equal to the output of the command
getconf OPEN_MAX | awk '{print $1-3}'
This limit has ramifications for decoding data. When the pqact program
wants to open a new output-file after having reached its limit, it first closes the least recently used
output-file. If the action associated with that output-file is PIPE, then the
decoder process reading from the pipe will encounter an end-of-file condition
and will terminate. As a consequence, decoders must be written so that they can "start-up" in the middle of
decoding a product.
If this is not possible, then more than one pqact(1) utility can be started (each with its own
configuration-file, of course) and the work shared between them so that no one pqact(1) utility reaches the
limit.
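The closing policy described above is a least-recently-used cache. The Python sketch below illustrates that policy only. The class name OutputCache, the opener callback, and the use of in-memory streams in place of files and pipes are assumptions for the example, not details of the pqact implementation.

```python
import io
from collections import OrderedDict

class OutputCache:
    """Keep at most max_open outputs open, closing the least recently used."""

    def __init__(self, max_open, opener):
        self.max_open = max_open
        self.opener = opener              # callable: key -> object with close()
        self.open_outputs = OrderedDict() # oldest entry first

    def get(self, key):
        if key in self.open_outputs:
            self.open_outputs.move_to_end(key)  # mark as most recently used
            return self.open_outputs[key]
        if len(self.open_outputs) >= self.max_open:
            _, oldest = self.open_outputs.popitem(last=False)
            oldest.close()  # a PIPE decoder would now see end-of-file
        output = self.opener(key)
        self.open_outputs[key] = output
        return output

# With a limit of two, opening a third output closes the least recently used one.
cache = OutputCache(2, lambda key: io.BytesIO())
a = cache.get("a.dat")
b = cache.get("b.dat")
cache.get("a.dat")       # "a.dat" becomes the most recently used
c = cache.get("c.dat")   # evicts "b.dat"
print(a.closed, b.closed, c.closed)  # False True False
```

The key here corresponds to an action together with its arguments after string substitution, which is why an entry that produces many distinct pathnames fills the cache quickly.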