Saturday, 25 January 2025

Tar command

 The tar command in Linux/Unix is used to create, extract, and manage archive files. It’s commonly used to combine multiple files or directories into a single file, often for backup or distribution purposes.


Key Concepts:

Tar archives are often compressed to save space (e.g., .tar.gz or .tgz files).

Think of it like “zipping” and “unzipping” files on Unix/Linux, with one caveat: tar itself only bundles files, and the compression comes from a separate tool such as gzip.


Basic Syntax:


tar [options] [archive_name] [file_or_directory]


options: What you want to do (e.g., create, extract, list, etc.).

archive_name: The name of the archive file (e.g., backup.tar).

file_or_directory: The files or directories to include in the archive.


Common Examples:

1. Create a tar archive:


tar -cvf archive.tar file1 file2 directory/


-c: Create an archive.

-v: Verbose (show what’s happening).

-f: Specify the file name of the archive (archive.tar).


2. Extract a tar archive:


tar -xvf archive.tar


-x: Extract files from an archive.

-v: Verbose (list files while extracting).

-f: Specify the archive file name.


3. Compress a tar archive (gzip):


tar -cvzf archive.tar.gz file1 file2 directory/


-z: Compress the archive using gzip.

Creates a .tar.gz file.


4. Extract a compressed tar archive:


tar -xvzf archive.tar.gz


-z: Decompress the gzip-compressed archive.


5. List contents of a tar archive:


tar -tvf archive.tar


-t: List the files in the archive.


6. Add files to an existing tar archive:


tar -rvf archive.tar newfile


-r: Append files to an archive.


Real-World Examples:

Backup a directory:


tar -cvzf backup.tar.gz /home/user/documents


This creates a compressed backup of the documents directory.


Extract a downloaded file:


tar -xvzf software.tar.gz


Extracts the contents of a compressed .tar.gz file.


Transfer files as one package:

Combine files into a single tar file, send it over, and extract it on another machine.
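
For example, a minimal sketch of that workflow (the host name and paths are placeholders):


# On the source machine: bundle and compress the files
tar -cvzf package.tar.gz file1 file2 directory/

# Copy the archive to the other machine (hypothetical host and path)
scp package.tar.gz user@remote-host:/tmp/

# Extract it on the destination
ssh user@remote-host 'cd /tmp && tar -xvzf package.tar.gz'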


Quick Reference for Options:


Option   Description
-c       Create an archive
-x       Extract files from an archive
-v       Verbose (show progress)
-f       Specify archive file name
-z       Compress or decompress using gzip
-t       List contents of an archive
-r       Append files to an existing archive


In short, tar is a versatile tool for packing, unpacking, and compressing files!


Friday, 24 January 2025

egrep (Extended grep) in Unix

 You use egrep (Extended grep) in Unix when you need to search for patterns using extended regular expressions (ERE). It supports additional features like +, ?, |, and {} directly without escaping.

Example Use Cases:

  1. Multiple Patterns:

    egrep 'apple|orange|banana' file.txt
    

    Matches any line containing "apple", "orange", or "banana".

  2. Repetition:

    egrep 'a{2,}' file.txt
    

    Matches lines with "a" repeated 2 or more times.

  3. Optional Characters:

    egrep 'colou?r' file.txt
    

    Matches "color" or "colour".

  4. Find Lines with a Specific Range of Characters:

    egrep '[0-9]{3,5}' file.txt

    Matches lines containing a number with 3 to 5 digits (e.g., 123, 12345).
-------------------------------------------------------------------------------------------------------------

Find Lines with Repeated Patterns:
egrep '(ab)+c' file.txt

Matches lines containing "ab" repeated one or more times, followed by "c" (e.g., "abc", "abababc").

-------------------------------------------------------------------------------------------------------------

Match Alternate Spellings (grouping with |):

egrep 'gr(e|a)y' file.txt

Matches "grey" or "gray".

-------------------------------------------------------------------------------------------------------------

Search for Lines with a Specific Word or Another (Logical OR):

egrep 'cat|dog' file.txt
or
grep -E 'cat|dog' file.txt

Matches lines containing either "cat" or "dog".

Why Use egrep?

egrep is simply shorthand for grep -E, so it is a convenient way to use extended regular expressions. However, prefer grep -E on newer systems, because egrep is considered deprecated in some environments.

Use egrep when complex pattern matching is needed!
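
A quick way to see the equivalence for yourself (the sample file is invented for the demo):

    # Create a small sample file
    printf 'I have a cat\nI have a fish\nI have a dog\n' > pets.txt

    # Both commands print the "cat" and "dog" lines
    egrep 'cat|dog' pets.txt
    grep -E 'cat|dog' pets.txt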

Tuesday, 21 January 2025

DROP, DELETE, and TRUNCATE

Here’s a comparison of DROP, DELETE, and TRUNCATE in SQL:


Aspect               DROP                                    DELETE                                        TRUNCATE
Purpose              Removes an entire table or database.    Removes specific rows from a table.          Removes all rows from a table.
DML/DDL              DDL (Data Definition Language).         DML (Data Manipulation Language).             DDL (Data Definition Language).
Deletes data only?   Deletes both structure and data.        Deletes data, keeps table structure.          Deletes data, keeps table structure.
WHERE clause         Not applicable.                         Can use WHERE to filter rows.                 Cannot use WHERE.
Performance          Fast; drops the whole object.           Slower for large data (row-by-row, logged).   Faster (minimal logging).
Rollback support     Cannot be rolled back.                  Can be rolled back inside a transaction.      Usually cannot be rolled back (DB-dependent).
Resets identity?     Yes, as the table is removed.           No, retains the identity sequence.            Yes, resets the identity sequence.

Examples:

  1. DROP:

    DROP TABLE employees;
    
  2. DELETE:

    DELETE FROM employees WHERE department_id = 101;
    
  3. TRUNCATE:

    TRUNCATE TABLE employees;
    


**SQL query performance optimization techniques**

 1. **Use Indexing**:

   - Add indexes on frequently filtered or joined columns.


2. **Avoid `SELECT *`**:

   - Retrieve only required columns.


3. **Use Joins Efficiently**:

   - Prefer inner joins over outer joins when possible.

   - Join on indexed columns.


4. **Optimize WHERE Clauses**:

   - Avoid functions on columns in `WHERE` (e.g., `WHERE YEAR(date) = 2023` → `WHERE date >= '2023-01-01'`).


5. **Use EXISTS Instead of IN**:

   - `EXISTS` is often faster for subqueries.


6. **Limit the Rows**:

   - Use `LIMIT` or `TOP` to fetch only needed rows.


7. **Partition Large Tables**:

   - Use table partitioning for large datasets.


8. **Avoid Correlated Subqueries**:

   - Replace them with joins or CTEs.


9. **Use Query Execution Plans**:

   - Analyze and tune queries based on execution plans.


10. **Batch Updates and Inserts**:

    - Split large operations into smaller batches.


These techniques improve query performance in real-world scenarios.


sed, grep, find, and awk Command Questions

  1. Replace a word only in specific lines:

    • Replace "ERROR" with "WARNING" only in lines 10 to 20.
    • Answer:
      sed '10,20s/ERROR/WARNING/g' file.txt
      
  2. Delete lines matching a pattern:

    • Delete all lines containing the word "DEBUG".
    • Answer:
      sed '/DEBUG/d' file.txt
      
  3. Insert a line after a specific pattern:

    • Insert "Processing complete" after every line containing "Job Finished".
    • Answer:
      sed '/Job Finished/a\Processing complete' file.txt
      
  4. Replace nth occurrence of a word:

    • Replace the 3rd occurrence of "Job" in each line with "Task".
    • Answer:
      sed 's/Job/Task/3' file.txt
      
  5. Extract lines between two patterns:

    • Extract lines between "START" and "END".
    • Answer:
      sed -n '/START/,/END/p' file.txt
      
  6. Replace tabs with spaces:

    • Replace all tabs in a file with 4 spaces.
    • Answer:
      sed 's/\t/    /g' file.txt
      
  7. Modify a file in place:

    • Change "FAILED" to "SUCCESS" directly in the file.
    • Answer:
      sed -i 's/FAILED/SUCCESS/g' file.txt
      
  8. Print line numbers of matching patterns:

    • Print line numbers of lines containing "ERROR".
    • Answer:
      sed -n '/ERROR/=' file.txt
      
  9. Remove trailing spaces:

    • Remove trailing spaces from all lines in a file.
    • Answer:
      sed 's/[[:space:]]*$//' file.txt
      
  10. Replace using a variable:

    • Replace "PLACEHOLDER" with the value of a variable $var.
    • Answer:
      sed "s/PLACEHOLDER/$var/g" file.txt
      
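A self-contained sketch that exercises a few of the answers above (the sample log is invented for illustration):

    # Build a throwaway sample log
    printf 'step 1 ERROR disk full\nstep 2 DEBUG checking cache\nstep 3 Job Finished\n' > sample.log

    sed '/DEBUG/d' sample.log                             # delete the DEBUG line (example 2)
    sed '/Job Finished/a\Processing complete' sample.log  # append after the match (example 3)
    sed -n '/ERROR/=' sample.log                          # line numbers of ERROR lines (example 8)
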

grep (Global Regular Expression Print)

  1. Search recursively in subdirectories:

    • Find the word "FAILED" in all files in /logs.
    • Answer:
      grep -r 'FAILED' /logs
      
  2. Match exact word:

    • Find lines with the exact word "Job".
    • Answer:
      grep -w 'Job' file.txt
      
  3. Display matching lines with line numbers:

    • Find "ERROR" and show line numbers.
    • Answer:
      grep -n 'ERROR' file.txt
      
  4. Exclude specific files:

    • Search for "SUCCESS" in all .log files but exclude files containing "debug".
    • Answer:
      grep --exclude='*debug*.log' 'SUCCESS' *.log
      
  5. Find lines starting with a specific word:

    • Find lines that start with "INFO".
    • Answer:
      grep '^INFO' file.txt
      
  6. Search case-insensitively:

    • Search for "error" without case sensitivity.
    • Answer:
      grep -i 'error' file.txt
      
  7. Search and count occurrences:

    • Count occurrences of "Job Finished".
    • Answer:
      grep -c 'Job Finished' file.txt
      
  8. Search in multiple files:

    • Find "ERROR" in file1.log and file2.log.
    • Answer:
      grep 'ERROR' file1.log file2.log
      
  9. Context around matches:

    • Show 3 lines before and after each match for "ERROR".
    • Answer:
      grep -C 3 'ERROR' file.txt
      
  10. Inverse search:

    • Show all lines that do not contain "DEBUG".
    • Answer:
      grep -v 'DEBUG' file.txt
      
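Reusing the invented sample.log from the sed sketch above:

    grep -n 'ERROR' sample.log         # matching lines with line numbers (example 3)
    grep -c 'Job Finished' sample.log  # count of matching lines (example 7)
    grep -v 'DEBUG' sample.log         # everything except DEBUG lines (example 10)
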

find (File Locator)

  1. Find files by name:

    • Find all .log files in /var/log.
    • Answer:
      find /var/log -name "*.log"
      
  2. Find files by size:

    • Find files larger than 1GB.
    • Answer:
      find / -type f -size +1G
      
  3. Find files modified recently:

    • Find files modified in the last 7 days.
    • Answer:
      find /path/to/dir -type f -mtime -7
      
  4. Find empty files:

    • Find all empty files in /data.
    • Answer:
      find /data -type f -empty
      
  5. Find files with specific permissions:

    • Find files with 777 permissions.
    • Answer:
      find / -type f -perm 777
      
  6. Delete specific files:

    • Delete all .tmp files older than 30 days.
    • Answer:
      find /path/to/dir -name "*.tmp" -type f -mtime +30 -exec rm {} \;
      
  7. Find and execute commands:

    • Find .log files and compress them.
    • Answer:
      find /logs -name "*.log" -exec gzip {} \;
      
  8. Search by file type:

    • Find all directories in /var.
    • Answer:
      find /var -type d
      
  9. Find recently accessed files:

    • Find files accessed in the last 2 days.
    • Answer:
      find /path/to/dir -atime -2
      
  10. Find files with partial names:

    • Find files containing "log" in their names.
    • Answer:
      find / -name "*log*"
      
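Before running a destructive command like the delete in example 6, it is worth previewing the matches first; a minimal sketch (the directory path is a placeholder):

    # Dry run: print what would be removed
    find /path/to/dir -name "*.tmp" -type f -mtime +30 -print

    # Then delete for real; -exec ... + batches many files per rm call
    find /path/to/dir -name "*.tmp" -type f -mtime +30 -exec rm {} +
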

awk (Text Processing)

  1. Print specific columns:

    • Print the 2nd and 4th columns from a file.
    • Answer:
      awk '{print $2, $4}' file.txt
      
  2. Filter lines by a condition:

    • Print lines where the value in column 3 is greater than 100.
    • Answer:
      awk '$3 > 100' file.txt
      
  3. Count occurrences of a word:

    • Count occurrences of "ERROR" in a file.
    • Answer:
      awk '/ERROR/ {count++} END {print count}' file.txt
      
  4. Sum values in a column:

    • Calculate the sum of column 5.
    • Answer:
      awk '{sum += $5} END {print sum}' file.txt
      
  5. Find the maximum value in a column:

    • Find the maximum value in column 3.
    • Answer:
      awk 'BEGIN {max=0} $3 > max {max=$3} END {print max}' file.txt
      
  6. Add a header to output:

    • Add "ID Name Salary" as a header.
    • Answer:
      awk 'BEGIN {print "ID Name Salary"} {print $1, $2, $3}' file.txt
      
  7. Format output:

    • Align columns with custom spacing.
    • Answer:
      awk '{printf "%-10s %-20s %-10s\n", $1, $2, $3}' file.txt
      
  8. Replace column values:

    • Replace "ERROR" with "OK" in column 4.
    • Answer:
      awk '$4 == "ERROR" {$4 = "OK"} 1' file.txt
      
  9. Process CSV files:

    • Print the second field of a CSV file.
    • Answer:
      awk -F, '{print $2}' file.csv
      
  10. Conditional processing:

    • Print "High" if column 3 > 100, otherwise "Low".
    • Answer:
      awk '{if ($3 > 100) print "High"; else print "Low"}' file.txt
      
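A runnable sketch combining a few of the answers above (the data file is invented: id, name, salary):

    printf '1 alice 120\n2 bob 90\n3 carol 150\n' > emp.txt

    awk 'BEGIN {print "ID Name Salary"} {print $1, $2, $3}' emp.txt       # header plus columns (example 6)
    awk '{sum += $3} END {print "total:", sum}' emp.txt                   # sum of column 3 (example 4)
    awk '{if ($3 > 100) print $2, "High"; else print $2, "Low"}' emp.txt  # conditional (example 10)
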

How NDM (Connect:Direct) can be used within an Ab Initio graph for file transfer.

Scenario:

  • Source System: Mainframe
  • Target System: UNIX server
  • Use Case: Transfer a file from the mainframe to the UNIX system after data processing in Ab Initio.

Steps in Ab Initio Graph with NDM Integration

1. Ab Initio Graph Design:

  • Components Used:
    • Input File Component: Reads the file to process.
    • Transform Component: Applies necessary transformations.
    • Output File Component: Writes the processed data to a local file.
    • Run Program Component: Executes the NDM script to transfer the file.

2. Example Graph Flow:

[Input File] -> [Transform] -> [Output File] -> [Run Program (NDM)]

3. Sample NDM Script:

This script is stored as a .ndm file and is executed via the Run Program Component.

NDM Script Example:

NDM_TRANSFER PROCESS
  COPY FROM ( DSN=source_file_name DISP=SHR NODE=MAINFRAME_NODE )
       TO   ( DSN=/data/target_directory/target_file_name DISP=RPL NODE=UNIX_NODE )
  RUN TASK AFTER SUCCESS
PEND

4. Run Program Component in Ab Initio:

This component is used to trigger the NDM script.

Run Program Configuration:

  • Command: Call the NDM script using the shell.
    c:DIRECT/bin/directcli -f /scripts/ndm_transfer.ndm
  • Parameters:
    • Path to the NDM script (/scripts/ndm_transfer.ndm).
    • Additional arguments if needed.

5. Error Handling in the Graph:

  • Use Conditional Components or check NDM logs to validate the transfer.
  • Example: Monitor the output log from the NDM process to confirm success or trigger alerts on failure.
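
For example, the Run Program command could point at a small wrapper script that checks the CLI exit status before the graph proceeds (a sketch only; the log path and messages are assumptions, not Connect:Direct specifics):

#!/bin/sh
# Hypothetical wrapper around the CLI call shown above
c:DIRECT/bin/directcli -f /scripts/ndm_transfer.ndm > /tmp/ndm_transfer.out 2>&1
rc=$?

if [ $rc -ne 0 ]; then
    echo "NDM transfer failed (exit $rc); see /tmp/ndm_transfer.out" >&2
    exit $rc
fi
echo "NDM transfer submitted successfully"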

Sample Workflow Execution:

  1. Input File: The source file is read from the mainframe.
  2. Processing: The data is transformed in Ab Initio.
  3. Output File: Processed data is written to a local directory on the ETL server.
  4. NDM Transfer:
    • The Run Program Component triggers the NDM script.
    • The file is securely transferred to the target system (UNIX server).
  5. Post-Transfer: Logs are checked to verify successful completion.

Benefits of This Integration:

  1. Automation: File transfer is seamlessly part of the ETL process.
  2. Error Recovery: NDM supports checkpoint/restart for interrupted transfers.
  3. Scalability: Suitable for large file transfers and cross-platform integration.
  4. Security: Uses encrypted protocols for secure file movement.

How are the SQL statements executed by UPDATE TABLE?

API mode execution


The statements are applied to the incoming records as follows. For each record:

1. The statement referenced by updateSqlFile is tried first. If the statement can be successfully applied to the current record, it is executed, and the statement referenced by insertSqlFile is skipped.
2. If the updateSqlFile statement cannot be applied to the current record, the component attempts to execute the statement referenced by insertSqlFile instead.

How to create DML dynamically in Ab Initio

 $[ begin let int i = 0; let string(int) complete_list = "emp_nm,billable,age,designation,location"; let string(int) file_content ...