Monday, January 31, 2011

SQL to Identify Siebel MVG duplicate records that are not the PrimaryMVG Record

This article shows how to build a SQL to identify Siebel MVG duplicate records specifically identifying the MVG duplicate that is not the primary. This allows us to easily identify the duplicate MVG record we wish to delete without risk of deleting the primary MVG record. We do not want risk of deleting the primary MVG record because any MVF based on this MVG would display as blank if the MVG has no primary record.



If I want to identify the Contact MVG (S_CONTACT_XM) duplicate records in an Oracle DB I can run SQL:



SELECT *

FROM

(

SELECT

xm.PAR_ROW_ID

,xm.NAME

,ROW_NUMBER() OVER (PARTITION BY xm.PAR_ROW_ID || xm.NAME || xm.TYPE ORDER BY NULL) RID

,xm.ROW_ID

,xm.TYPE

FROM    S_CONTACT_XM xm

)

WHERE RID > 1;



This SQL will perform a ROW_NUMBER partition based on the user key of S_CONTACT_XM (PAR_ROW_ID, NAME, TYPE). Therefore any record with RID > 1 is a duplicate based on this partition.



Lets say for a particular S_CONTACT_XM.TYPE = 'Type1' the primary id column on the parent contact is S_CONTACT.X_PR_TYPE1_ID. The SQL above cannot guarantee that the duplicate records selected are not the primary. Therefore deleting them may result in no primary for that MVG any longer. We can update our SQL to query for the duplicates first, then requery the entire duplicate set by filtering based on the initial query set and assign a value to a new column (ISPRIMARY) of value 1 if the MVG record is a primary and 0 if it is not. Now we have a list of each duplicate pair with ISPRIMARY = 1 if it is a primary and 0 if it is not:



select (
CASE
WHEN (xm.ROW_ID = con.X_PR_TYPE1_ID)
THEN (1)
ELSE (0)
END
) ISPRIMARY,
xm.*,
con.X_PR_TYPE1_ID
from   s_contact_xm xm, s_contact con
where con.ROW_ID = xm.PAR_ROW_ID
and   xm.NAME || xm.PAR_ROW_ID || xm.TYPE
IN
(
SELECT rec.NAME || rec.PAR_ROW_ID || rec.TYPE
from
(
SELECT  xm.PAR_ROW_ID
,xm.NAME
,ROW_NUMBER() OVER (PARTITION BY xm.PAR_ROW_ID || xm.NAME || xm.TYPE ORDER BY NULL) RID
,xm.TYPE
FROM    S_CONTACT_XM xm
where xm.type='Type1'
) rec
where rec.RID > 1
)



Now to out of the above data set we want to only select one record out of the duplicate pair, the one that is not a primary. So we use the above query as our data set, query this data set adding another partition based on the user key PAR_ROW_ID, NAME, TYPE and order this partition by the ISPRIMARY DESC, this way the primary duplicate MVG record will appear first. Now if we only select filter only records were the partition RID > 1 then we will never select the first (primary) MVG duplicate record:



SELECT ROW_ID
from
(
select ROW_NUMBER() OVER (PARTITION BY r.PAR_ROW_ID || r.NAME || r.TYPE ORDER BY r.ISPRIMARY DESC) RID,
r.ROW_ID
FROM
(

select (
CASE
WHEN (xm.ROW_ID = con.X_PR_TYPE1_ID)
THEN (1)
ELSE (0)
END
) ISPRIMARY,
xm.*,
con.X_PR_TYPE1_ID
from s_contact_xm xm, s_contact con
where con.ROW_ID = xm.PAR_ROW_ID
and   xm.NAME || xm.PAR_ROW_ID || xm.TYPE
IN
(
SELECT rec.NAME || rec.PAR_ROW_ID || rec.TYPE
from
(
SELECT  xm.PAR_ROW_ID
,xm.NAME
,ROW_NUMBER() OVER (PARTITION BY xm.PAR_ROW_ID || xm.NAME || xm.TYPE ORDER BY NULL) RID
,xm.TYPE
FROM    S_CONTACT_XM xm
where xm.type='Type1'
) rec
where rec.RID > 1
)

) r
)
WHERE RID > 1;

No comments :

Post a Comment