StructuredFile/Parse Task

Overview

The StructuredFile/Parse@1 task converts position-based, fixed-width flat files into structured JSON. It is intended for the legacy file formats common in warehouse management, transportation, and EDI integrations.

Task Definition

task: "StructuredFile/Parse@1"
name: parseStructuredFile
inputs:
  fileData: "AASNTBCHUSPSWAREHOUSE123    SHR123456 MBL1234567890..."
  config: { ... } # Detailed configuration object
  parseMode: "structured" # Optional: "flat" or "structured" (default: "flat")
  encoding: "UTF-8" # Optional: Character encoding (default: "UTF-8")
  skipEmptyLines: true # Optional: Skip empty lines (default: true)
  trimFields: true # Optional: Trim whitespace from fields (default: true)
outputs:
  - name: "parsedData"
    mapping: "result"
  - name: "recordCount"
    mapping: "recordCount"
  - name: "recordTypes"
    mapping: "recordTypes"
  - name: "parseErrors"
    mapping: "errors"

Input Parameters

Parameter	Type	Required	Default	Description
`fileData`	string	Yes	-	Raw flat file content to parse
`config`	object	Yes	-	Configuration defining file structure and parsing rules
`parseMode`	string	No	"flat"	Output format: "flat" returns array of records, "structured" returns hierarchical object
`encoding`	string	No	"UTF-8"	Character encoding of the input file
`skipEmptyLines`	boolean	No	true	Whether to skip empty lines during parsing
`trimFields`	boolean	No	true	Whether to trim whitespace from extracted field values

Configuration Object

The configuration object is the core of the parser, defining how to interpret the flat file structure.

Basic Structure

config:
  # Define output structure (for structured mode)
  structured: true
  structure: [...]

  # Define record formats
  records: [...]

  # Global options
  options:
    includeFillers: false
    includeRecordType: true
    dateFormat: "YYYYMMDD"

Records Configuration

Each record type in the flat file must be defined in the records array:

records:
  - id: "A" # Single character that identifies this record type
    name: "header" # Logical name for the record type
    description: "File header record" # Optional description
    required: true # Whether this record must appear in valid files
    minOccurrences: 1 # Minimum times this record must appear
    maxOccurrences: 1 # Maximum times this record can appear
    fields: [...] # Field definitions for this record
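
The occurrence rules can be illustrated with a short Python sketch. It is not the task's implementation: the `check_occurrences` helper and the `RECORDS` list are hypothetical, and it assumes the record `id` is the first character of each line, as in the complete example later in this document.

```python
from collections import Counter

# Hypothetical in-memory form of two entries from the `records` array.
RECORDS = [
    {"id": "A", "name": "header", "minOccurrences": 1, "maxOccurrences": 1},
    {"id": "T", "name": "load", "minOccurrences": 1},
]


def check_occurrences(lines, records):
    """Count lines per record type and report min/max occurrence violations."""
    names_by_id = {rec["id"]: rec["name"] for rec in records}
    # Assumption: the record id is the first character of each line.
    counts = Counter(names_by_id[line[0]] for line in lines if line[:1] in names_by_id)
    problems = []
    for rec in records:
        found = counts[rec["name"]]
        minimum = rec.get("minOccurrences", 0)
        maximum = rec.get("maxOccurrences")
        if found < minimum:
            problems.append(f"{rec['name']}: expected at least {minimum}, found {found}")
        if maximum is not None and found > maximum:
            problems.append(f"{rec['name']}: expected at most {maximum}, found {found}")
    return problems


# Two header lines and no load line violate both rules.
print(check_occurrences(["AASNTBCHUSPS", "AASNTBCHUSPS"], RECORDS))
```

A check like this can be run on a sample file before the task is wired into a workflow, to catch files that are missing a required record type.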

Field Configuration

Each field within a record is defined with precise positioning and optional transformations:

fields:
  - name: "carrier_code" # Field name in output
    description: "SCAC code" # Optional field description
    start: 9 # Starting position (1-based)
    length: 4 # Number of characters
    type: "string" # Data type (see Field Types section)
    required: true # Whether field must have a value
    trim: true # Override global trim setting
    padCharacter: " " # Character used for padding
    padDirection: "left" # Padding direction: "left" or "right"
    defaultValue: "" # Default if field is empty
    skip: false # Whether to exclude from output
    validation: # Optional validation rules
      pattern: "^[A-Z]{4}$"
      minLength: 4
      maxLength: 4
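
To make the positioning rule concrete, here is a minimal Python sketch of how a 1-based `start` and a `length` translate into a zero-based string slice. The `extract_field` helper is hypothetical and only illustrates the arithmetic; the sample line is the `fileData` value from the Task Definition, without its trailing ellipsis.

```python
def extract_field(line: str, start: int, length: int, trim: bool = True) -> str:
    """Return `length` characters beginning at the 1-based position `start`."""
    if start < 1 or length < 1:
        raise ValueError("start and length must be positive")
    begin = start - 1  # convert the 1-based position to a 0-based index
    value = line[begin:begin + length]
    return value.strip() if trim else value


line = "AASNTBCHUSPSWAREHOUSE123    SHR123456 MBL1234567890"
carrier_code = extract_field(line, start=9, length=4)
print(carrier_code)  # USPS
```

The last position a field occupies is `start + length - 1`, so `carrier_code` above covers positions 9 through 12.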

Field Types

The parser supports several field types with automatic conversion:

String (default)

- name: "description"
  start: 10
  length: 30
  type: "string"
  transform: "uppercase" # Optional: "uppercase", "lowercase", "capitalize"

Number

- name: "quantity"
  start: 40
  length: 6
  type: "number"
  divisor: 100 # Divide extracted value by this amount
  decimals: 2 # Number of decimal places
  thousandSeparator: "," # Optional thousand separator
  defaultValue: 0

Date

- name: "ship_date"
  start: 10
  length: 8
  type: "date"
  inputFormat: "YYYYMMDD" # Format in the file
  outputFormat: "YYYY-MM-DD" # Desired output format
  timezone: "UTC" # Optional timezone
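
As an illustration of the date conversion, the following Python sketch maps the `YYYY`, `MM`, and `DD` tokens onto `strptime` directives. The token table and the `convert_date` helper are assumptions made for this example; the task may support more tokens and handles `timezone` itself.

```python
from datetime import datetime

# Assumed mapping from the tokens used in this document to strptime directives.
TOKEN_MAP = {"YYYY": "%Y", "MM": "%m", "DD": "%d"}


def to_strptime(fmt: str) -> str:
    """Translate a token pattern such as YYYYMMDD into %Y%m%d."""
    for token, directive in TOKEN_MAP.items():
        fmt = fmt.replace(token, directive)
    return fmt


def convert_date(raw: str, input_format: str, output_format: str) -> str:
    """Parse `raw` using input_format and re-render it using output_format."""
    parsed = datetime.strptime(raw, to_strptime(input_format))
    return parsed.strftime(to_strptime(output_format))


print(convert_date("20240115", "YYYYMMDD", "YYYY-MM-DD"))  # 2024-01-15
```

`datetime.strptime` raises `ValueError` for values such as `20241340`, which is the kind of input that would surface as a parse error.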

Boolean

- name: "is_hazmat"
  start: 50
  length: 1
  type: "boolean"
  trueValues: ["Y", "1", "T"] # Values that represent true
  falseValues: ["N", "0", "F"] # Values that represent false
  defaultValue: false
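
The lookup implied by `trueValues`, `falseValues`, and `defaultValue` can be sketched in Python as follows. The `parse_boolean` helper is hypothetical, and comparing the raw value first and the trimmed value second is an assumption about matching order, not documented behavior.

```python
def parse_boolean(raw, true_values, false_values, default=False):
    """Map a raw flag character to a boolean, falling back to `default`."""
    # Try the raw value first so that a configured " " entry can match,
    # then the trimmed value (assumed order).
    for candidate in (raw, raw.strip()):
        if candidate in true_values:
            return True
        if candidate in false_values:
            return False
    return default


TRUE_VALUES = ["Y", "1", "T"]
FALSE_VALUES = ["N", "0", "F"]
print(parse_boolean("Y", TRUE_VALUES, FALSE_VALUES))  # True
print(parse_boolean("?", TRUE_VALUES, FALSE_VALUES))  # False (default)
```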

Decimal/Currency

- name: "unit_price"
  start: 60
  length: 9
  type: "decimal"
  divisor: 1000 # Common for implied decimal places
  precision: 2 # Decimal precision
  currencySymbol: "$" # Optional currency symbol in output
  format: "0,0.00" # Number format pattern
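
The `divisor` and `precision` arithmetic can be checked with a small Python sketch. The `parse_implied_decimal` helper is hypothetical, and both the half-up rounding mode and the treatment of a blank field as zero are assumptions made for this example.

```python
from decimal import Decimal, ROUND_HALF_UP


def parse_implied_decimal(raw: str, divisor: int, precision: int) -> Decimal:
    """Divide the digits by `divisor`, then round to `precision` decimal places."""
    digits = raw.strip() or "0"  # assumption: a blank field counts as zero
    value = Decimal(digits) / Decimal(divisor)
    quantum = Decimal(1).scaleb(-precision)  # precision=2 gives a step of 0.01
    return value.quantize(quantum, rounding=ROUND_HALF_UP)


# Nine digits with three implied decimal places, rounded to two.
print(parse_implied_decimal("000012345", divisor=1000, precision=2))  # 12.35
```

Using `Decimal` rather than `float` keeps the digits in the file exact, which matters for currency values.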

Structure Configuration (Hierarchical Output)

For parseMode: "structured", define how records relate to each other:

structure:
  - type: "header" # Record type name
    level: 0 # Nesting level (0 = root)
    key: "header" # Property name in output
    singleton: true # Only one instance expected

  - type: "load"
    level: 0
    key: "loads"
    collection: true # Multiple instances form an array

  - type: "store"
    level: 1
    parent: "load" # Parent record type
    collection: "stores" # Collection name within parent

  - type: "carton"
    level: 2
    parent: "store"
    collection: "cartons"

  - type: "carton_content"
    level: 3
    parent: "carton"
    collection: "items"
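
One way to picture how the `parent` and `collection` settings turn an ordered list of records into a tree is the Python sketch below. It is a simplified illustration under stated assumptions, not the parser's algorithm: `STRUCTURE` and `nest_records` are hypothetical, each child is attached to the most recently seen record of its parent type, and every root type is treated as a list.

```python
from typing import Any

# Subset of the structure rules above, keyed by record type (hypothetical form).
STRUCTURE = {
    "load": {"level": 0, "key": "loads"},
    "store": {"level": 1, "parent": "load", "collection": "stores"},
    "carton": {"level": 2, "parent": "store", "collection": "cartons"},
}


def nest_records(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Attach each record to the most recent record of its parent type."""
    root: dict[str, Any] = {}
    latest: dict[str, dict[str, Any]] = {}  # most recent node per record type
    for record in records:
        rule = STRUCTURE[record["_type"]]
        node = dict(record)
        parent_type = rule.get("parent")
        if parent_type is None:
            root.setdefault(rule["key"], []).append(node)
        else:
            # Raises KeyError if a child appears before any parent record.
            latest[parent_type].setdefault(rule["collection"], []).append(node)
        latest[record["_type"]] = node
    return root


flat = [
    {"_type": "load", "trailer_number": "TRL123456"},
    {"_type": "store", "invoice_number": "INV123"},
    {"_type": "carton", "weight": 12.34},
    {"_type": "carton", "weight": 5.0},
    {"_type": "store", "invoice_number": "INV124"},
    {"_type": "carton", "weight": 1.5},
]
tree = nest_records(flat)
print(len(tree["loads"][0]["stores"]))  # 2
```

Because attachment depends on record order, a carton line that appears before any store line cannot be placed; the sketch surfaces that as a `KeyError`.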

Complete Configuration Example

task: "StructuredFile/Parse@1"
name: parseShipmentManifest
inputs:
  fileData: "{{ workflow.input.manifestFile }}"
  parseMode: "structured"
  config:
    structured: true
    structure:
      - type: "header"
        level: 0
        key: "header"
        singleton: true
      - type: "load"
        level: 0
        key: "currentLoad"
        singleton: true
      - type: "store"
        level: 1
        parent: "load"
        collection: "stores"
      - type: "carton"
        level: 2
        parent: "store"
        collection: "cartons"
      - type: "carton_content"
        level: 3
        parent: "carton"
        collection: "items"
      - type: "store_totals"
        level: 2
        parent: "store"
        key: "totals"
      - type: "trailer_totals"
        level: 1
        parent: "load"
        key: "totals"

    records:
      # Header Record
      - id: "A"
        name: "header"
        description: "ASN file header"
        required: true
        maxOccurrences: 1
        fields:
          - name: "record_type"
            start: 1
            length: 1
            type: "string"
            defaultValue: "A"
            skip: true
          - name: "file_type"
            start: 2
            length: 3
            type: "string"
            validation:
              pattern: "ASN"
          - name: "batch_code"
            start: 5
            length: 4
            type: "string"
          - name: "carrier_code"
            start: 9
            length: 4
            type: "string"
            required: true
            transform: "uppercase"
            validation:
              pattern: "^[A-Z]{4}$"
              message: "Carrier code must be 4 uppercase letters"
          - name: "origin_facility"
            start: 13
            length: 15
            type: "string"
            trim: true
          - name: "shipper_reference"
            start: 28
            length: 10
            type: "string"
          - name: "master_bol"
            start: 38
            length: 30
            type: "string"
            trim: true

      # Load/Trailer Record
      - id: "T"
        name: "load"
        description: "Trailer/Load information"
        required: true
        fields:
          - name: "record_type"
            start: 1
            length: 1
            skip: true
          - name: "load_date"
            start: 2
            length: 8
            type: "date"
            inputFormat: "YYYYMMDD"
            outputFormat: "YYYY-MM-DD"
            required: true
          - name: "trailer_number"
            start: 14
            length: 12
            type: "string"
            trim: true
            required: true
          - name: "seal_number"
            start: 26
            length: 10
            type: "string"
            trim: true
          - name: "ship_date"
            start: 36
            length: 8
            type: "date"
            inputFormat: "YYYYMMDD"
            outputFormat: "YYYY-MM-DD"

      # Store/Destination Record
      - id: "B"
        name: "store"
        description: "Store destination information"
        fields:
          - name: "record_type"
            start: 1
            length: 1
            skip: true
          - name: "carrier_code"
            start: 2
            length: 4
            type: "string"
            transform: "uppercase"
          - name: "pool_location"
            start: 6
            length: 9
            type: "string"
            trim: true
          - name: "invoice_number"
            start: 17
            length: 6
            type: "string"
          - name: "bol_number"
            start: 23
            length: 12
            type: "string"
            trim: true

      # Carton Record
      - id: "C"
        name: "carton"
        description: "Individual carton/package"
        fields:
          - name: "record_type"
            start: 1
            length: 1
            skip: true
          - name: "tracking_number"
            start: 2
            length: 28
            type: "string"
            trim: true
            required: true
            validation:
              minLength: 10
              message: "Tracking number must be at least 10 characters"
          - name: "weight"
            start: 30
            length: 7
            type: "decimal"
            divisor: 100
            precision: 2
            defaultValue: 0
          - name: "is_signature_required"
            start: 44
            length: 1
            type: "boolean"
            trueValues: ["Y", "1"]
            falseValues: ["N", "0", " "]
            defaultValue: false
          - name: "declared_value"
            start: 68
            length: 8
            type: "decimal"
            divisor: 1000
            precision: 2
            currencySymbol: "$"

      # Carton Content Record
      - id: "P"
        name: "carton_content"
        description: "Items within a carton"
        fields:
          - name: "record_type"
            start: 1
            length: 1
            skip: true
          - name: "style_number"
            start: 2
            length: 15
            type: "string"
            trim: true
          - name: "sku"
            start: 27
            length: 13
            type: "string"
            trim: true
            required: true
          - name: "quantity"
            start: 40
            length: 6
            type: "number"
            divisor: 100000
            decimals: 0
          - name: "color"
            start: 46
            length: 30
            type: "string"
            trim: true
          - name: "size"
            start: 76
            length: 6
            type: "string"
            trim: true
          - name: "retail_price"
            start: 82
            length: 9
            type: "decimal"
            divisor: 1000
            precision: 2
            currencySymbol: "$"
          - name: "item_type"
            start: 100
            length: 8
            type: "string"
            trim: true
          - name: "description"
            start: 108
            length: 30
            type: "string"
            trim: true

      # Store Totals Record
      - id: "D"
        name: "store_totals"
        description: "Store-level summary"
        fields:
          - name: "record_type"
            start: 1
            length: 1
            skip: true
          - name: "total_units"
            start: 2
            length: 7
            type: "number"
            divisor: 100000
          - name: "total_weight"
            start: 9
            length: 9
            type: "decimal"
            divisor: 1000
            precision: 3

      # Trailer Totals Record
      - id: "E"
        name: "trailer_totals"
        description: "Trailer-level summary"
        fields:
          - name: "record_type"
            start: 1
            length: 1
            skip: true
          - name: "total_units"
            start: 2
            length: 7
            type: "number"
            divisor: 100000
          - name: "total_weight"
            start: 9
            length: 9
            type: "decimal"
            divisor: 1000
            precision: 3

    options:
      includeFillers: false # Don't include FILLER fields
      includeRecordType: true # Include record type in output (shown as "_type" in the output examples)
      strictValidation: true # Enforce all validation rules
      continueOnError: false # Stop on first error

outputs:
  - name: "manifest"
    mapping: "result"
  - name: "recordCounts"
    mapping: "recordTypes"
  - name: "hasErrors"
    mapping: "errors[0] != null"

Output Examples

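Flat Mode Output

The result array below is abbreviated to the first three records; recordCount and recordTypes cover all seven records in the file.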

{
  "result": [
    {
      "_type": "header",
      "file_type": "ASN",
      "batch_code": "TBCH",
      "carrier_code": "USPS",
      "origin_facility": "WAREHOUSE123",
      "shipper_reference": "SHR123456",
      "master_bol": "MBL1234567890"
    },
    {
      "_type": "load",
      "load_date": "2024-01-15",
      "trailer_number": "TRL123456",
      "seal_number": "SEAL789",
      "ship_date": "2024-01-15"
    },
    {
      "_type": "store",
      "carrier_code": "USPS",
      "pool_location": "TXPOOL",
      "invoice_number": "INV123",
      "bol_number": "BOL12345678"
    }
  ],
  "recordCount": 7,
  "recordTypes": {
    "header": 1,
    "load": 1,
    "store": 1,
    "carton": 1,
    "carton_content": 1,
    "store_totals": 1,
    "trailer_totals": 1
  },
  "errors": []
}

Structured Mode Output

{
  "result": {
    "header": {
      "_type": "header",
      "file_type": "ASN",
      "batch_code": "TBCH",
      "carrier_code": "USPS",
      "origin_facility": "WAREHOUSE123",
      "shipper_reference": "SHR123456",
      "master_bol": "MBL1234567890"
    },
    "currentLoad": {
      "_type": "load",
      "load_date": "2024-01-15",
      "trailer_number": "TRL123456",
      "seal_number": "SEAL789",
      "ship_date": "2024-01-15",
      "stores": [
        {
          "_type": "store",
          "carrier_code": "USPS",
          "pool_location": "TXPOOL",
          "invoice_number": "INV123",
          "bol_number": "BOL12345678",
          "cartons": [
            {
              "_type": "carton",
              "tracking_number": "0012345678901234567890123456",
              "weight": 12.34,
              "is_signature_required": true,
              "declared_value": 12.35,
              "items": [
                {
                  "_type": "carton_content",
                  "style_number": "STYLE123456789",
                  "sku": "MSK1234567890",
                  "quantity": 1,
                  "color": "BLACK",
                  "size": "M",
                  "retail_price": 12.35,
                  "item_type": "CTNBOX",
                  "description": "Men's T-Shirt Black"
                }
              ]
            }
          ],
          "totals": {
            "_type": "store_totals",
            "total_units": 1,
            "total_weight": 9.876
          }
        }
      ],
      "totals": {
        "_type": "trailer_totals",
        "total_units": 10,
        "total_weight": 98.765
      }
    }
  },
  "recordCount": 7,
  "recordTypes": {
    "header": 1,
    "load": 1,
    "store": 1,
    "carton": 1,
    "carton_content": 1,
    "store_totals": 1,
    "trailer_totals": 1
  },
  "errors": []
}

Error Handling

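When parsing or validation fails, the errors output lists each problem with its line number, record type, field, message, character positions, and, where available, the offending value: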

{
  "errors": [
    {
      "line": 3,
      "record": "store",
      "field": "carrier_code",
      "message": "Carrier code must be 4 uppercase letters",
      "value": "ups",
      "position": {
        "start": 2,
        "end": 5
      }
    },
    {
      "line": 5,
      "record": "carton",
      "field": "tracking_number",
      "message": "Required field is empty",
      "position": {
        "start": 2,
        "end": 29
      }
    }
  ]
}
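
A downstream step will often want these errors grouped by line for reporting. The Python sketch below shows one way to do that with the sample payload above; `group_errors_by_line` is a hypothetical helper, not part of the task.

```python
from collections import defaultdict


def group_errors_by_line(errors):
    """Return {line number: ["record.field: message", ...]}."""
    grouped = defaultdict(list)
    for error in errors:
        label = f"{error['record']}.{error['field']}: {error['message']}"
        grouped[error["line"]].append(label)
    return dict(grouped)


errors = [
    {"line": 3, "record": "store", "field": "carrier_code",
     "message": "Carrier code must be 4 uppercase letters", "value": "ups",
     "position": {"start": 2, "end": 5}},
    {"line": 5, "record": "carton", "field": "tracking_number",
     "message": "Required field is empty",
     "position": {"start": 2, "end": 29}},
]
for line_number, messages in group_errors_by_line(errors).items():
    print(line_number, messages)
```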

Advanced Configuration Options

Conditional Fields

Define fields that are only parsed based on conditions:

fields:
  - name: "hazmat_code"
    start: 100
    length: 4
    type: "string"
    condition:
      field: "is_hazmat"
      operator: "equals"
      value: true
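
The condition check can be pictured as a comparison against fields already parsed from the same record. This Python sketch covers only the `equals` operator shown above; `condition_met` is a hypothetical helper, and no other operator names are assumed.

```python
def condition_met(condition: dict, parsed: dict) -> bool:
    """Evaluate an `equals` condition against already-parsed field values."""
    if condition["operator"] != "equals":
        raise ValueError(f"unsupported operator: {condition['operator']}")
    return parsed.get(condition["field"]) == condition["value"]


condition = {"field": "is_hazmat", "operator": "equals", "value": True}
print(condition_met(condition, {"is_hazmat": True}))   # True
print(condition_met(condition, {"is_hazmat": False}))  # False
```

This implies the referenced field (`is_hazmat` here) has to be parsed before the conditional field is evaluated.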

Computed Fields

Add fields calculated from other fields:

fields:
  - name: "total_value"
    computed: true
    expression: "quantity * retail_price"
    type: "decimal"
    precision: 2

Field Groups

Group related fields for cleaner output:

fields:
  - name: "dimensions"
    type: "group"
    fields:
      - name: "length"
        start: 50
        length: 5
        type: "number"
        divisor: 10
      - name: "width"
        start: 55
        length: 5
        type: "number"
        divisor: 10
      - name: "height"
        start: 60
        length: 5
        type: "number"
        divisor: 10

Custom Transformations

Apply custom transformations, such as mapping raw codes to readable values:

fields:
  - name: "status_code"
    start: 80
    length: 2
    type: "string"
    transform:
      type: "map"
      mapping:
        "01": "pending"
        "02": "in_transit"
        "03": "delivered"
        "04": "exception"
      default: "unknown"
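
In effect, a `map` transform is a dictionary lookup with a fallback. The Python sketch below mirrors the configuration above; `apply_map` is a hypothetical helper used only for illustration.

```python
def apply_map(raw: str, transform: dict):
    """Look up `raw` in the mapping, falling back to the configured default."""
    return transform["mapping"].get(raw, transform.get("default"))


transform = {
    "type": "map",
    "mapping": {
        "01": "pending",
        "02": "in_transit",
        "03": "delivered",
        "04": "exception",
    },
    "default": "unknown",
}
print(apply_map("02", transform))  # in_transit
print(apply_map("99", transform))  # unknown
```

Note that the keys are strings, so a code such as `01` keeps its leading zero; treating the field as a number first would break the lookup.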

Best Practices

Start Position Accuracy: Use 1-based positions exactly as given in the file format specification.
Field Length Validation: Make sure each field length matches the specification exactly, so that characters from an adjacent field do not bleed into the value.
Type Safety: Use the appropriate field type so that values are converted and validated automatically.
Error Handling: Check the errors output and decide how malformed records should be handled (see strictValidation and continueOnError).
Testing: Test with varied sample files, including edge cases such as empty fields.
Performance: For large files, consider streaming parse options.
Documentation: Document custom formats and keep sample files alongside them.

Common Issues and Solutions

Issue: Overlapping Fields

# Wrong - Fields overlap
- name: "field1"
  start: 10
  length: 5 # Ends at position 14
- name: "field2"
  start: 14 # Starts at position 14 - overlap!
  length: 3

# Correct
- name: "field1"
  start: 10
  length: 5 # Ends at position 14
- name: "field2"
  start: 15 # Starts at position 15
  length: 3
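
Overlaps like this can be detected mechanically before a configuration is deployed. The Python sketch below sorts fields by `start` and compares each one against the furthest end position seen so far; `find_overlaps` is a hypothetical helper and does not descend into nested field groups.

```python
def find_overlaps(fields):
    """Return (earlier, later) name pairs whose character ranges overlap."""
    overlaps = []
    furthest_end, furthest_name = 0, None
    for field in sorted(fields, key=lambda f: f["start"]):
        if field["start"] <= furthest_end:
            overlaps.append((furthest_name, field["name"]))
        end = field["start"] + field["length"] - 1  # last position occupied
        if end > furthest_end:
            furthest_end, furthest_name = end, field["name"]
    return overlaps


wrong = [
    {"name": "field1", "start": 10, "length": 5},
    {"name": "field2", "start": 14, "length": 3},
]
correct = [
    {"name": "field1", "start": 10, "length": 5},
    {"name": "field2", "start": 15, "length": 3},
]
print(find_overlaps(wrong))    # [('field1', 'field2')]
print(find_overlaps(correct))  # []
```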

Issue: Incorrect Divisor for Implied Decimals

# File contains: "000123" representing $1.23

# Wrong
- name: "amount"
  type: "decimal"
  divisor: 10 # Results in 12.3

# Correct
- name: "amount"
  type: "decimal"
  divisor: 100 # Results in 1.23

Issue: Date Parsing Failures

# Tolerate empty or invalid dates instead of failing the parse
- name: "ship_date"
  start: 10
  length: 8
  type: "date"
  inputFormat: "YYYYMMDD"
  outputFormat: "YYYY-MM-DD"
  defaultValue: null # Return null for invalid dates
  validation:
    allowEmpty: true # Allow empty date fields
