Package dap4.core.ce

Class CEConstraint

  • Direct Known Subclasses:
    Universal

    public class CEConstraint
    extends Object
    A Constraint is a structure containing a parsed representation of a constraint expression. Its purpose is to define a subset of interest of a dataset.

    The constraint object is defined with respect to some underlying DMR. It defines a subset view over that DMR in that it specifies a set of declarations (variables, enums, dimensions, groups) to be included. For each such variable, the constraint specifies any overriding of the dimensions of the variables. Additionally, each variable (if appropriate) may have a filter expr.

    Thus, there are three 'sub' constraints within a full constraint.

    1. Referencing - whether a variable from the underlying dataset is included in the constraint, either directly or indirectly (e.g. fields of a structure when only the structure is referenced).
    2. Projection - the actual values of a variable to be included in the constraint; this is specified by a triple [start:stride:stop] for each dimension of a variable.
    3. Selection (aka filters) - a predicate over the scalar fields of a row of a Sequence; if it evaluates to true, then that row matches the constraint.
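The projection triple can be illustrated with a small sketch. SliceSketch below is a hypothetical stand-in written for this example, not the dap4 Slice class, and it assumes that stop is an inclusive last index; check the DAP4 specification before relying on that.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one [start:stride:stop] triple; not part of dap4.
final class SliceSketch {
    final long start;
    final long stride;
    final long stop; // ASSUMPTION: stop is an inclusive last index

    SliceSketch(long start, long stride, long stop) {
        if (start < 0 || stride <= 0 || stop < start) {
            throw new IllegalArgumentException("invalid slice");
        }
        this.start = start;
        this.stride = stride;
        this.stop = stop;
    }

    // Enumerate the indices along one dimension that the triple selects.
    List<Long> indices() {
        List<Long> out = new ArrayList<>();
        for (long i = start; i <= stop; i += stride) {
            out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new SliceSketch(0, 2, 6).indices()); // prints [0, 2, 4, 6]
    }
}
```

A variable with several dimensions would carry one such triple per dimension.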

    There are multiple ways to effect a constraint.

    1. Generate and test mode: the constraint is asked if a given element matches the constraint. E.g.
      a. For referencing, one might ask the constraint if a given variable or field is in the constraint.
      b. For a projection filter, one might ask the constraint if a given set of dimension indices match the projection.
      c. For a selection filter, one might ask the constraint if a given sequence row matches the filter predicate.
    2. Iteration mode: the constraint provides an iterator that returns the elements matching the constraint. E.g.
      a. For referencing, the iterator would return all the variables and fields referenced in the constraint.
      b. For a projection filter, the iterator would return the successive sets of indices of the projection, or it could return the actual matching value.
      c. For a selection filter, the iterator would return either the row indices or the actual rows of a sequence that matched the filter predicate.
    3. Read mode: sometimes, it may be more efficient to let the DataVariable object handle the constraint more directly. E.g.
      a. If the data variable was backed by a netcdf file, then passing in the complete projection might be more efficient than pulling values 1 by 1.
      b. Similarly, if the sequence object had an associated btree, then it would be more efficient to allow the sequence object to evaluate the filter using the btree. Note that this requires analysis of the filter expression to see if the btree is usable.

    Ideally, we would allow all three modes, but for now, only generate-and-test is implemented, and only a subset of that. Generate-and-test is provided for referencing and selection. It is not provided for projection (for now) because it essentially requires the inverse of iteration and that is fairly tricky.
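To make "the inverse of iteration" concrete, here is a minimal sketch of a generate-and-test check for projection in the simplest possible case: exactly one [start:stride:stop] triple per dimension. This is not how CEConstraint works (it does not provide this mode). SliceMembership and its array layout are hypothetical, and stop is assumed to be inclusive. Whatever makes the real case tricky is not modelled here.

```java
// Hypothetical sketch; CEConstraint does not offer this operation.
final class SliceMembership {

    // True if index is one of start, start+stride, ... up to stop (assumed inclusive).
    static boolean matches(long index, long start, long stride, long stop) {
        return index >= start && index <= stop && (index - start) % stride == 0;
    }

    // Each slices[d] holds {start, stride, stop} for dimension d.
    static boolean matches(long[] indices, long[][] slices) {
        if (indices.length != slices.length) {
            throw new IllegalArgumentException("rank mismatch");
        }
        for (int d = 0; d < indices.length; d++) {
            long[] s = slices[d];
            if (!matches(indices[d], s[0], s[1], s[2])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[][] slices = {{0, 2, 6}, {5, 1, 9}};
        System.out.println(matches(new long[] {2, 5}, slices)); // prints true
        System.out.println(matches(new long[] {3, 5}, slices)); // prints false
    }
}
```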

    • Constructor Detail

      • CEConstraint

        public CEConstraint()
      • CEConstraint

        public CEConstraint​(DapDataset dmr)
    • Method Detail

      • release

        public static void release​(DapDataset dmr)
      • eval

        protected Object eval​(DapVariable var,
                              DapSequence seq,
                              D4Array record,
                              CEAST expr)
                       throws DapException
        Evaluate a filter with respect to a Sequence record. Assumes the filter has been canonicalized so that the lhs is a variable.
        Parameters:
        seq - the template
        record - the record to evaluate
        expr - the filter
        Returns:
        the value of the expression (usually a Boolean)
        Throws:
        DapException
      • isUniversal

        public boolean isUniversal()
      • toConstraintString

        public String toConstraintString()
        Convert the view to a constraint string suitable for use in a URL, except not URL encoded.
        Returns:
        constraint string
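Because the returned string is not URL encoded, a caller has to encode it before placing it in a query. The sketch below uses the standard java.net.URLEncoder, which applies HTML form encoding; whether a given server expects exactly that encoding is an assumption. The constraint text "/a[0:2:10]" is only an illustrative value, not output taken from this class.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; not part of dap4.
final class EncodeConstraintSketch {

    // Percent-encode a constraint string using UTF-8 form encoding.
    static String encode(String constraint) {
        try {
            return URLEncoder.encode(constraint, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java runtime.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("/a[0:2:10]")); // prints %2Fa%5B0%3A2%3A10%5D
    }
}
```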
      • dumpvar

        protected void dumpvar​(CEConstraint.Segment seg,
                               StringBuilder buf,
                               boolean forconstraint)
        Recursive helper for toString/toConstraintString.
        Parameters:
        seg -
        buf -
        forconstraint -
      • references

        public boolean references​(DapNode node)
        Reference X match
        Parameters:
        node - to test
        Returns:
        true if node is referenced by this constraint
      • match

        public boolean match​(DapVariable sqvar,
                             DapSequence seq,
                             D4Array rec)
                      throws DapException
        Selection X match

        Evaluate a filter with respect to a Sequence record. Assumes the filter has been canonicalized.

        Parameters:
        seq - the template
        rec - the record to evaluate
        Returns:
        true if the filter matches the record
        Throws:
        DapException
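As a rough illustration of selection, the sketch below tests one row against one comparison. It takes "canonicalized" to mean that the variable is on the left-hand side and the constant on the right. The row is modelled as a plain map from field name to value; FilterSketch, Op and Comparison are hypothetical names and do not correspond to CEAST or D4Array.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of a selection filter; not the dap4 implementation.
final class FilterSketch {
    enum Op { LT, LE, EQ, NE, GE, GT }

    // Canonical form: field name on the left, constant on the right.
    static final class Comparison {
        final String field;
        final Op op;
        final double constant;

        Comparison(String field, Op op, double constant) {
            this.field = field;
            this.op = op;
            this.constant = constant;
        }
    }

    // True if the row satisfies the comparison.
    static boolean match(Map<String, Double> row, Comparison c) {
        Double value = row.get(c.field);
        if (value == null) {
            throw new IllegalArgumentException("no such field: " + c.field);
        }
        double v = value.doubleValue();
        switch (c.op) {
            case LT: return v < c.constant;
            case LE: return v <= c.constant;
            case EQ: return v == c.constant;
            case NE: return v != c.constant;
            case GE: return v >= c.constant;
            case GT: return v > c.constant;
            default: throw new AssertionError(c.op);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> row = Collections.singletonMap("temp", 25.0);
        System.out.println(match(row, new Comparison("temp", Op.GT, 20.0))); // prints true
    }
}
```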
      • matches

        protected boolean matches​(DapVariable var,
                                  DapSequence seq,
                                  D4Array rec,
                                  CEAST filter)
                           throws DapException
        Evaluate a filter with respect to a Sequence record.
        Parameters:
        seq - the template
        rec - the record to evaluate
        filter - the filter
        Returns:
        true if a match
        Throws:
        DapException
      • findVariableIndex

        protected int findVariableIndex​(DapVariable var)
      • expand

        public void expand()
        Locate each unexpanded Structure|Sequence and:

        1. check that none of its fields is referenced; if any field is referenced, do not expand
        2. add all of its fields as leaves

        Note that #2 may end up adding additional leaf structs and/or seqs.
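The expansion steps can be sketched on a toy model. This reads step 1 as "skip any compound that already has a referenced field", which is an interpretation of the description. ExpandSketch, Node and the view set are hypothetical and much simpler than the real DapStructure and constraint state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of expand(); not the dap4 implementation.
final class ExpandSketch {
    static final class Node {
        final String name;
        final List<Node> fields = new ArrayList<>(); // empty for atomic variables

        Node(String name, Node... fields) {
            this.name = name;
            this.fields.addAll(Arrays.asList(fields));
        }

        boolean isCompound() {
            return !fields.isEmpty();
        }
    }

    // Add the fields of every referenced compound that has no referenced field yet.
    static void expand(Set<Node> view) {
        Deque<Node> work = new ArrayDeque<>(view);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (!n.isCompound()) {
                continue;
            }
            boolean anyReferenced = false;
            for (Node f : n.fields) {
                if (view.contains(f)) {
                    anyReferenced = true;
                    break;
                }
            }
            if (anyReferenced) {
                continue; // some field is already referenced: do not expand
            }
            for (Node f : n.fields) {
                view.add(f);
                work.push(f); // a newly added field may itself be a compound
            }
        }
    }

    public static void main(String[] args) {
        Node inner = new Node("inner", new Node("x"));
        Node outer = new Node("outer", inner, new Node("y"));
        Set<Node> view = new HashSet<>();
        view.add(outer);
        expand(view);
        System.out.println(view.size()); // prints 4
    }
}
```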
      • contract

        public void contract()
        Locate each Structure|Sequence and:

        1. check that all of its fields are referenced recursively and not constrained, otherwise ignore
        2. contract by removing all of the fields of the Structure or Sequence.

        This is intended to be (not quite) the dual of expand().
      • contractR

        protected boolean contractR​(DapStructure dstruct,
                                    Set<DapStructure> contracted)
        Recursive helper
        Parameters:
        dstruct - to contract
        contracted - set of already contracted compounds
        Returns:
        true if this structure was contracted, false otherwise
      • expansionCount

        protected int expansionCount​(DapStructure struct)
        Count the number of fields of a structure that are already in this view.
        Parameters:
        struct - the dapstructure to check
        Returns:
        # of fields in this view
        Throws:
        DapException
      • isWholeCompound

        protected boolean isWholeCompound​(DapStructure dstruct)
        See if a structure is "whole", which means that none of its fields is missing from the constraint, all of its fields use default (non-constrained) dimensions, and all of its fields are also whole. This must be done recursively.
        Parameters:
        dstruct - to test
        Returns:
        true if this structure is whole.
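The recursive wholeness test might look like the following on a toy model. IsWholeSketch, Node, and the two sets (referenced nodes and nodes with overridden dimensions) are hypothetical simplifications, not the fields CEConstraint actually keeps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of the wholeness check; not the dap4 implementation.
final class IsWholeSketch {
    static final class Node {
        final String name;
        final List<Node> fields = new ArrayList<>(); // empty for atomic variables

        Node(String name, Node... fields) {
            this.name = name;
            this.fields.addAll(Arrays.asList(fields));
        }

        boolean isCompound() {
            return !fields.isEmpty();
        }
    }

    // view: nodes referenced by the constraint.
    // constrained: nodes whose dimensions are overridden by the constraint.
    static boolean isWhole(Node struct, Set<Node> view, Set<Node> constrained) {
        for (Node f : struct.fields) {
            if (!view.contains(f)) {
                return false; // a field is missing from the constraint
            }
            if (constrained.contains(f)) {
                return false; // a field does not use its default dimensions
            }
            if (f.isCompound() && !isWhole(f, view, constrained)) {
                return false; // a nested compound is not whole
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node x = new Node("x");
        Node y = new Node("y");
        Node s = new Node("s", x, y);
        Set<Node> view = new HashSet<>(Arrays.asList(s, x, y));
        System.out.println(isWhole(s, view, new HashSet<>())); // prints true
    }
}
```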
      • computedimensions

        protected void computedimensions()
                                  throws DapException
        Compute dimension-related information using slicing and redef info. In effect, this is where projection constraints are applied.

        Assume that the constraint compiler has given us the following info:

        1. A list of the variables to include.
        2. A pair (DapDimension,Slice) for each redef
        3. For each variable in #1, a list of slices taken from the constraint expression

        Two products will be produced.

        1. The variables map will be modified so that the slices properly reflect any original or redef dimensions.
        2. A set, dimrefs, of all referenced original dimensions.

        The processing is as follows:

        1. For each redef create a new redef dimension
        2. For each variable:
          1. if the variable is scalar, do nothing.
          2. if the variable has no associated slices, then make its new dimensions be the original dimensions.
          3. otherwise, walk the slices and create new dimensions from them; use redefs where indicated
        Throws:
        DapException
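The per-variable part of this processing can be sketched as a size computation. The sketch ignores redefs entirely and assumes that each slice is a {start, stride, stop} triple with an inclusive stop. DimensionSizeSketch and effectiveSizes are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch of deriving new dimension sizes from slices; redefs are not modelled.
final class DimensionSizeSketch {

    // originalSizes: sizes of the variable's declared dimensions (empty for a scalar).
    // slices: one {start, stride, stop} per dimension, or null if the constraint gave none.
    static long[] effectiveSizes(long[] originalSizes, long[][] slices) {
        if (originalSizes.length == 0) {
            return new long[0]; // scalar: nothing to do
        }
        if (slices == null) {
            return originalSizes.clone(); // no slices: keep the original dimensions
        }
        if (slices.length != originalSizes.length) {
            throw new IllegalArgumentException("rank mismatch");
        }
        long[] out = new long[slices.length];
        for (int d = 0; d < slices.length; d++) {
            long start = slices[d][0];
            long stride = slices[d][1];
            long stop = slices[d][2]; // ASSUMPTION: inclusive
            out[d] = (stop - start) / stride + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] slices = {{0, 2, 9}, {5, 1, 9}};
        System.out.println(Arrays.toString(effectiveSizes(new long[] {10, 20}, slices))); // prints [5, 5]
    }
}
```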
      • computeenums

        protected void computeenums()
        Walk all the included variables and accumulate the referenced enums
      • computegroups

        protected void computegroups()
        Walk all the included declarations and accumulate the set of referenced groups