Welcome to the Beyond Blog

As you'd expect from the winners of the Specialized Partner of the Year: Business Analytics award at the Oracle UKI Specialized Partner Awards 2014, Beyond work with leading-edge BI Applications, primarily within the UK Public Sector. We intend to share some of our ideas and discoveries via our blog and hopefully enrich the wider discussion around Oracle Business Intelligence and how to drive improved insight for customers.

Blog posts tagged in ETL

Unless you've been hiding under a stone for the past few years you'll know that the cloud is the big thing at Oracle. There are fewer and fewer on-premise installations for greenfield projects, and with the new pricing structure it is easy to see why more and more organisations are considering cloud services for their new developments. An easy first venture for a client new to the cloud might be a reporting suite, developed in APEX, utilising data from their source ERP system. The big question then, of course, is how do you transfer your data to the cloud securely? There are many products out there to facilitate this, such as Oracle Data Integrator (ODI), Oracle DataSync, or custom processes with file transfers over SFTP. However, I want to show a really easy way to do this via an SSH tunnel.

There are a number of steps that need to be done - some are optional (such as TNS names entries) and you can work without them; however, I've written the post as I would prefer to set it up, and you may choose to do things differently. I am using E-Business Suite R12.1.3 Vision as the source system, however the principle applies equally to others.

Source System Configuration

First we create a read-only user on the source system and grant it access to the objects we wish to expose. We then create synonyms as that user to make querying easier (and to protect against change in the future).
As SYS

VIS121 r121@ebs121-vm ~ $ sqlplus / as sysdba

SQL*Plus: Release 11.1.0.7.0 - Production on Tue Dec 12 16:00:40 2017

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create user apps_cl_query identified by apps_cl_query;

User created.

SQL> grant connect, resource to apps_cl_query;

Grant succeeded.

SQL> conn apps/apps
Connected.
SQL> grant select on per_all_people_f to apps_cl_query;

Grant succeeded.

SQL> conn apps_cl_query/apps_cl_query
Connected.
SQL> create synonym per_all_people_f for apps.per_all_people_f;

Synonym created.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
VIS121 r121@ebs121-vm ~ $
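
The excerpt above stops at the source configuration, but to make the end-to-end idea concrete: once an SSH tunnel forwards a local port on the cloud database server to the E-Business Suite listener, the exposed objects can be queried through a database link aimed at that forwarded port. The port number (1529), service name and object names below are assumptions for the sketch rather than details taken from the post - the connect descriptor could equally live in a TNS names entry, as mentioned earlier.

-- Sketch only: assumes an SSH tunnel is already forwarding localhost:1529 on
-- the cloud database server to the on-premise EBS listener (typically 1521).
Create Database Link ebs_src
  Connect To apps_cl_query Identified By apps_cl_query
  Using '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1529))(CONNECT_DATA=(SERVICE_NAME=VIS121)))';

-- Quick sanity check: the read-only account should see the granted synonym...
Select Count(*) From per_all_people_f@ebs_src;

-- ...which can then feed a cloud-side staging table for the APEX reporting suite.
Create Table stg_people As
  Select * From per_all_people_f@ebs_src;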

Continue reading
Tagged in: Cloud, DBaaS, ETL
Posted in: Technical

It is quite common for warehouse developers to customise standard processes to add additional fields. Ideally this is done by modifying the appropriate mapping(s) to pass the field through, regardless of the technology used (Informatica, ODI, etc.). However, in some instances, generally due to the complexity of a particular mapping process, developers implement these updates using a Post Load Process (PLP). From a technical perspective this can be done in a number of ways, e.g. using a mapping, SQL statement updates and so on. The focus of this blog is the latter; specifically, one of the traps that a developer can fall into when doing so.

First let's assume we have a simple mapping process which populates a table WC_DEMO_F. For demo purposes I can create that table as follows.

Create Table wc_demo_f As 
  Select level row_wid,
         'Row ' || To_Char(level,'FM999999') demo_value,
         'N' x_two_multiple
    From Dual
  Connect By Level <= 1000000;
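
For illustration (my own example rather than anything from the original post), a PLP implemented as a direct SQL update against that table might look like the statement below - it simply flags the rows whose ROW_WID is a multiple of two, which is the sort of post-load correction the rest of the post goes on to examine.

-- Illustrative PLP-style update: flag every second row after the main load.
Update wc_demo_f
   Set x_two_multiple = 'Y'
 Where Mod(row_wid, 2) = 0;

Commit;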
Continue reading
Posted in: Technical

Whilst using the standard Oracle tools for load plan execution monitoring and diagnosis, I can't help but feel sometimes that it's a little "clunky". An overview of the execution can be seen from BIACM and progress can be seen through ODI Studio Operator; however, I either have to look in multiple places to see the current progress and state of a load plan, it takes many clicks to get anywhere, or the information isn't quite as I want it (and sometimes there is far too much information).

BIACM Control

ODI Load Plan Monitoring

In reality, most of the time the things I am looking for are a rough idea of how far through the load plan we are, and whether there have been any failures. So here is a quick sneak preview of something I've spent a little bit of my free time playing with recently.

APEX Monitor

I developed this in Oracle Application Express (APEX), a free web-based rapid application development tool from Oracle. It's a very basic application that I put together, however it lets me see instantly the progress of my load plans (based purely on the number of executed scenarios as opposed to a time-based algorithm), with the ability to drill into any failures which may have occurred. Over and above all that, it has provided an excellent opportunity to understand the underlying metadata structures used within ODI, which has given me some nice ideas for future enhancements and other process automations. It is read-only, i.e. it doesn't support any options for interacting with ODI/BIACM (starting load plans etc.); however, that isn't really something I'll be looking at due to the risks involved and the potential for things going wrong. Anything along those lines could of course affect your support agreement with Oracle should you need assistance on a related issue.
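
For illustration, a progress figure of that kind boils down to a simple ratio over the load plan's step executions. The query below is only a sketch: the table and column names are placeholders standing in for the relevant ODI work repository objects, not the actual repository tables or the queries used by the application.

-- Sketch only: placeholder object names, not real ODI repository tables.
-- Progress = completed scenario steps / total scenario steps in the run,
-- plus a count of failures to drill into.
Select run.load_plan_name,
       Round(100 * Sum(Case When stp.step_status = 'DONE' Then 1 Else 0 End)
                 / Count(*), 1) As pct_complete,
       Sum(Case When stp.step_status = 'ERROR' Then 1 Else 0 End) As failed_steps
  From lp_run run
  Join lp_step_run stp On stp.run_id = run.run_id
 Group By run.load_plan_name;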

Drill to Failures

This is by no means a finished article; however, it's the first few tiny steps towards what will hopefully be a useful tool for us internally. The interface definitely needs a tidy-up and there are many more things I'd like to use this for. Ideas at the moment (amongst others) are:

  • Auto-refresh
  • Better failure alerting (although I do have separate scripts to do this)
  • Export log information
  • View load plan in a tree structure
  • Better error diagnosis
  • Slicker interface
  • Proactive monitoring - have load times for a particular step jumped significantly? Do certain parts of the plan have too many sessions running?
  • Step trends, row count rate of increase, warehouse statistics, etc.

Feel free to comment with any more!

Continue reading
Posted in: Strategic Views

Prune days... often misunderstood, often just left "as default". Possibly not helped by the fact that entering the search term into Google yields the following as the top hit:

The LAST_UPD column in Siebel transactional database tables is used for incremental change capture. This timestamp reflects the actual event time. It is therefore possible for a data row to be committed to the transactional database with a LAST_UPD date that is older than the date on which the last refresh was executed. This will cause the data row to be missed in the subsequent extract (if based purely on LAST_UPD date). However, the LAST_UPD date column still provides an opportunity to improve the change capture process by overlapping the extraction date window by the number of days set in this parameter. The records extracted in the overlapped window are filtered by comparing this information with information in the Image table. The Prune Days setting ensures that the rows that had values in LAST_UPD older than values in LAST_REFRESH_DATE are not missed. This is a parameter that can be set based on experience with processes, such as remote sync, that potentially can cause records to be missed. This parameter cannot be less than 1. For example: Assume the table W_PERSON_D was refreshed on January 15th by querying the table S_CONTACT. And, the Prune Days setting was set to 5. The next time S_CONTACT is queried to load W_PERSON_D, the records that have a LAST_UPD value since January 10 are compared with the ROW_ID of the Image table to cover for any missing records between January 15 and January 10 (the overlap period).

Source: http://docs.oracle.com/cd/E12102_01/books/AnyInstAdm/AnyInstAdmWinInstall43.html

First it's probably worth explaining what this parameter actually does. PRUNE_DAYS is a number of days which is subtracted from the LAST_EXTRACT_DATE parameter before it is passed by the DAC to Informatica. So if we have a LAST_EXTRACT_DATE of 02-JAN-2015 21:00:00 and a PRUNE_DAYS of 1, the value actually used will be 01-JAN-2015 21:00:00.

So do you need to use this on an E-Business Suite system? The answer is... yes. But why?
Consider a scenario where you have an incremental ETL process scheduled to begin at 9pm each evening to extract records from table X. At 20:55 an automated process (or even a user manually inputting records) starts inserting a batch of records into that table. The process takes 20 minutes to complete and (correctly) commits at the end of processing. It uses a value of SYSDATE for the last_update_date field, which is calculated at the point of insert.
In the meantime, however, the ETL process has completed its load of data by 21:10 and has reset the LAST_EXTRACT_DATE parameter to 21:00 on the current evening in preparation for the next incremental ETL run. It has no visibility of the records inserted by the batch process, as these were not committed at the point of select.

Without prune days, some or all of the data inserted by your batch process will be missed, as the last_update_date for those records falls before the current LAST_EXTRACT_DATE value. What the PRUNE_DAYS parameter does is move the start of the extraction window back by a day (or whatever value you have set) so that such records are still picked up.
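
In predicate terms, the effect is simply that the lower bound of the incremental extract moves back by the prune days value. A sketch (the bind variable names are illustrative, not the actual DAC/Informatica parameter names):

-- Illustrative only: the extract window starts PRUNE_DAYS before the stored
-- LAST_EXTRACT_DATE, so rows committed late (with a last_update_date just
-- before the stored date) are still picked up by the next incremental run.
Select *
  From source_table
 Where last_update_date >= :last_extract_date - :prune_days;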

Prune Days

There is of course a trade-off with using a higher value for prune days, in that the incremental ETL process will reconsider more data that it has already processed (and will therefore disregard).

Continue reading
Posted in: Techniques