Saturday, May 15, 2021

How to Split a Large File into Multiple Smaller Files in BIP Report Bursting

 There are different delivery methods for BIP report bursting like :

  • EMAIL

  • FAX

  • FTP

    Let us take example where we need to  Burst a BIP report to email. The report size is around 40MB.

    Many of the email exchange servers support file sizes like 10MB, 25MB, etc. If the file exceeds the exchange server threshold value then the report will not be delivered.

    To overcome this situation, we can split the large file into multiple small files.

    There are different ways of achieving this, two of them are:

    1. Using Row number
    2. Using oracle Hash function

    In this post we will discuss how we can use the "ORA_HASH" function for our requirement.

    The Ora Hash Function splits records into multiple data sets.

     Below are the two options we would need to split the file:

    1. Having a maximum limit of records in each file
    2. Passing number of files count as parameter to split into multiple files.

     

    Let us see option 2 first:

    Format and example:

    ORA_HASH(column, filesCount)
    Eg. ORA_HASH(cust_account_id,1)
    This will split the records into two sets (0,1) and hence 2 files will be burst.

    Below steps to be followed to achieve the requirement:

    • Create a data model having  ORA_HASH function value as key 
    • Create a parameter-count, which will take input as the number of files to be split.

    Note:  if we give count as 1 then it will split into 2 data sets having hash value as 0 and one which will serve as our bursting.

    • Add bursting query having logic to fetch distinct hash value for the above main data set.
    • View the data and save as sample data
    • Save the data model
    • Create a new report and link the data model which we created above
    • Enable bursting for the report
    • Schedule report , the report gets burst into count+1 files 
    • Verify the same in mail.

    Below screenshots explains the steps to be followed:

     

    Create Data set with hash vale as key for bursting

     

     

    Sample Query

     

     

    Create count parameter which we will pass to ORA_HASH function to split  into multiple files .

     

     

     Add Bursting query:



     View data and save as sample data:


     

     Save the data model:

     

     

    Create a new report, link the above data model, use generate csv option, default the output to  data csv and save the report.

     

     

    Go to properties and Enable bursting:

     

     

     

     

    Save the report.

     

     

    Schedule the report.

     

    We want to split the large file into 3 files so we will pass count value as 2. It will create 3 sets with key as 0,1 and 2

     

     

    Click on submit and provide report job name in the next window that pops up.

     

     

     


     

    Open the Report job history to see the status.


     

    Verify in mail if we have got 3 files.

       


-------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------------------------------------

 Now let us see how we can fix maximum number of rows per file:

We will still use ORA_HASH. Instead of taking number of files as an input we will either take maximum number of records per file as input from user or we can hard code the limit.

Then we will use this limit to divide the total records to generate count of files dynamically.

-> Optionally we can use ceil function.

Sample:

select ceil(count(*)/:count) count_val from hz_cust_accounts

or

select ceil(count(*)/5000) count_val from hz_cust_accounts 

Note:-  We do not need to write the whole query to get total count, we just need to pick the lowermost table which accounts for the total count. For eg. In case of account and sites, count of sites>= count of account. So we take count of sites as total count.

In the above we are fetching count value dynamically also this assures the files won't have more than 5k records or the  number of records which user provides in parameter.

We need to use this count value in the same ORA_HASH function, as we had done in previous case.

Accordingly We also need to tweak bursting query as well.

This will assure that the smaller files don't exceed the maximum allowed limit my mail exchange servers.

Also this will be helpful if we are unsure about the volume of data.

Example->  if we have total records as 2 million and if we split in 2 files, we will have 1 millions in each file. This might exceed the max allowed limit by mail exchange servers.


 

No comments:

Post a Comment