
wkflw_validate_data_infile_wip

1. Source file checks

File-name parsing:
  Data file:  <market>-<ss prefix>-<file type index>-<ingestion type code>-<storage area>-<layout type code>-<created datetime>
  Count file: <market>-<ss prefix>-<file type index>-<ingestion type code>-<storage area>-<layout type code>-<created datetime>-cnt

Sample data files:
  RO-MSD-2-DBBTCH-LZ-1-20220127174355.txt
  RO-MSD-2-DBBTCH-LZ-1-20220127174355-cnt.txt

Reference lookups for the sample:
  RO, MSD => Ref-table-1 (market, ss-prefix)
  2       => Ref-table-5 (File-Type-code)
  DBBTCH  => Ref-table-2 (ingestion-Type-code)
  LZ      => Ref-table-1 (Storage-area)
  1       => Ref-table-3 (Layout-Type-code)

Acceptance criteria: All the parts of the file name

2. Data Dictionary checks

if (Ref-table-3 (Layout-Type-code)) == 1 {
  CSV file:
  * first line = header => read the field names from row 1
  * Acceptance criteria for field-name: field names from the header should match field names from the Ref-table-5
  * get the length of each field
  * Acceptance criteria for field-length: field length computed should not exceed field length in the Ref-table-5
  <To be completed>
}

3. File Balance check

  1. Read the data file using:
     a) CSV reader for delimited files
     b) Line/File Reader for fixed-length files
  2. Read the number of records using the Extract Table Dimensions node.
  3. Read the count file corresponding to the data file in Step 1.
  4. Compare the numbers in Steps 2 & 3.

  Acceptance criteria:
     a) CSV/Delimited: Record count (data file) - 1 = Record count (count file)
     b) Fixed-Width: Record count (data file) = Record count (count file)

Workflow sections: Filename Validation, DD Check (WIP), File Balance Check, Fetch data & references

Split file-name array:
  filename_s3_bkt_Arr[0] => Market
  filename_s3_bkt_Arr[1] => SS Prefix
  filename_s3_bkt_Arr[2] => File Type Code
  filename_s3_bkt_Arr[3] => Ingestion Type Code
  filename_s3_bkt_Arr[4] => Storage Area
  filename_s3_bkt_Arr[5] => Layout Type Code
  filename_s3_bkt_Arr[6] => Created Datetime

Node labels:
  S3 Bucket Path (Data File)
  File name pattern (Data File)
  LZ incoming data file source connection
  Temporary Local Path to store downloads (Data Files)
  S3 Bucket Path (Reference File)
  File name pattern (Ref-Table-*)
  Passing ref_table_3 for DD checks

Workflow nodes and metanodes:
  Connect to AWS
  Transfer files from Source File System
  String Widget
  Rule Engine Variable
  meta_load_ref_data_to_knime_lz
  meta_extract_split_filename_lz
  meta_infile_name_validation
  meta_check_for_delimited_vs_fixedwidth
  meta_get_counts_cnt_n_data_files
  meta_read_ref_table_3 (layout type code)
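
The file-name parsing and file balance rules described above can be sketched outside KNIME in a few lines of Python. This is only an illustration and not how the workflow itself is implemented. The function names (split_filename, invalid_parts, file_is_balanced) and the EXAMPLE_REF_VALUES lookup are hypothetical; the lookup holds only the values visible in the sample file name, since the actual contents of the reference tables are not shown here.

```python
from typing import Dict, List, Set

# Order follows the filename_s3_bkt_Arr[0..6] mapping in the workflow annotation.
PART_NAMES = [
    "market", "ss_prefix", "file_type_code", "ingestion_type_code",
    "storage_area", "layout_type_code", "created_datetime",
]

# Hypothetical stand-in for the reference tables; filled only from the sample file name.
EXAMPLE_REF_VALUES: Dict[str, Set[str]] = {
    "market": {"RO"},
    "ss_prefix": {"MSD"},
    "file_type_code": {"2"},
    "ingestion_type_code": {"DBBTCH"},
    "storage_area": {"LZ"},
    "layout_type_code": {"1"},
}


def split_filename(name: str) -> Dict[str, object]:
    """Split a data or count file name into its named parts."""
    stem = name.rsplit(".", 1)[0]          # drop the extension, if any
    parts = stem.split("-")
    is_count_file = len(parts) == 8 and parts[-1] == "cnt"
    if is_count_file:
        parts = parts[:-1]                 # remove the trailing "cnt" marker
    if len(parts) != len(PART_NAMES):
        raise ValueError(f"expected {len(PART_NAMES)} parts, got {len(parts)}: {name}")
    parsed: Dict[str, object] = dict(zip(PART_NAMES, parts))
    parsed["is_count_file"] = is_count_file
    return parsed


def invalid_parts(parsed: Dict[str, object], ref_values: Dict[str, Set[str]]) -> List[str]:
    """Return the names of the parts whose value is not in the reference values."""
    return [key for key, allowed in ref_values.items() if parsed[key] not in allowed]


def file_is_balanced(data_record_count: int, count_file_value: int, delimited: bool) -> bool:
    """Apply the balance rule: delimited files carry one header row, fixed-width files none."""
    expected = data_record_count - 1 if delimited else data_record_count
    return expected == count_file_value


parsed = split_filename("RO-MSD-2-DBBTCH-LZ-1-20220127174355.txt")
print(parsed["market"], parsed["created_datetime"])
print(invalid_parts(parsed, EXAMPLE_REF_VALUES))
```

An empty list from invalid_parts corresponds to a file name that passes validation. For the balance check, the caller supplies the raw record count of the data file and the number read from the matching -cnt file.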
