Skip to main content
Submitted

Logging on data cleansing procedure

  • January 12, 2021
  • 3 replies
  • 1 view

Forum|alt.badge.img
 

Already asked by Peter Jensen with 9 upvotes in 2017.

No comments from any TX Employees so far.
So even today I am forced to modify the cleansing procedure and add my own logging...
It would nice to be able to have an option per table by which you can enable extended data cleansing logging.
This would log the start and end date of each step in this procedure to a dedicated log table. (Have to do this when a proc starts throwing timeouts or takes a long time all of sudden)

I think this takes about 15 minutes to implement into your code base.
At the start of the proc you check if the logging table exists else you create one.
Before each step in the proc you store the datetime in a variable.
At the end of each step you write a record to the log table and reset the start time again as such:


IF OBJECT_ID(N'etl.HAWEmployeesPlanningMT_DataCleansingLog', N'U') IS NULL AND @enableExtendedLogging = 1
BEGIN
CREATE TABLE etl.HAWEmployeesPlanningMT_DataCleansingLog (
Id BIGINT IDENTITY(1,1) NOT NULL,
Version BIGINT NOT NULL,
Step NVARCHAR(1000) NOT NULL,
DateTimeStart DateTime,
DateTimeStop DateTime,

CONSTRAINT [PK_bb573581_2114_4b3e_b1bd_042488deac10] PRIMARY KEY CLUSTERED
(
Id ASC
)
)
END


SET @DateTimeStart = GETDATE()
SET @step = 'Keep field values up-to-date: detect lookup value changes'
-- Keep field values up-to-date: detect lookup value changes
/* TX generated SQL */

IF @enableExtendedLogging = 1
INSERT INTO etl.HAWEmployeesPlanningMT_DataCleansingLog(version, Step, DateTimeStart, DateTimeStop) VALUES(@version, @step, @DateTimeStart , GETDATE())


SET @DateTimeStart = GETDATE()
SET @step = 'Update conditional lookup fields (Many lookups, Take the first value):''CustomerHQId'',''CustomerCode1'',''IsTardyCancellation'',''IsExaminationsFinished'',''DateExamination'',''ExaminationPart1ResearchCode'',''MedicalExaminationCode'',''ExaminationTypeCode'',''MedicalExamination_IsPeriodical'',''CategoryCode'',''SubmissionCategory'',''DateInFunction'',''DateOutService'',''PlanningTimeCell_Time'',''PlanningEntityDate'''
-- Update conditional lookup fields (Many lookups, Take the first value):
/* TX generated SQL */

IF @enableExtendedLogging = 1
INSERT INTO etl.HAWEmployeesPlanningMT_DataCleansingLog(version, Step, DateTimeStart, DateTimeStop) VALUES(@version, @step, @DateTimeStart , GETDATE())


https://support.timextender.com/hc/en-us/community/posts/115011859306-Logging-during-Cleansing-Stored-Procedure 

 

3 replies

Forum|alt.badge.img
  • Participating Frequently
  • January 13, 2021

Whenever we have a proc with a longer then expected duration, we go the querystore and compare the timings.
It's also possible to query the querystore directly to get the info you want and loads of more info like waits, reads etc.

Example of such query:
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
select ss.name, so.name,(qsrs.last_duration/60000000.00) last_duration_minutes,qsws.total_query_wait_time_ms/60000.00 total_query_wait_time_minutes,qsws.total_query_wait_time_ms_allwaittyppes/60000.00 total_query_wait_time_minutes_allwaittyppes ,qsws.wait_category_desc,qsqt.query_sql_text,qsrs.last_clr_time, (qsrs.last_cpu_time/60000000.00) last_cpu_time, qsrs.last_dop, qsrs.last_log_bytes_used, qsrs.last_logical_io_reads, qsrs.last_logical_io_writes, qsrs.last_num_physical_io_reads, qsrs.last_page_server_io_reads, qsrs.last_physical_io_reads, qsrs.last_query_max_used_memory, qsrs.last_rowcount, qsrs.last_tempdb_space_used, qsrs.count_executions ,
try_cast(qsp.query_plan as xml) as query_plan,
qsp_statementcost.statementcost
from sys.query_store_query qsq
inner join
sys.objects so
on
qsq.object_id = so.object_id
inner join
sys.schemas ss
on
ss.schema_id = so.schema_id
/* the last query plan */
cross apply
(select top 1 * from sys.query_store_plan sub_qsp where
sub_qsp.query_id = qsq.query_id order by sub_qsp.last_execution_time desc) qsp
/* the last execution */
cross apply
(select top 1 * from sys.query_store_runtime_stats sub_qsrs where
sub_qsrs.plan_id = qsp.plan_id order by sub_qsrs.last_execution_time desc) qsrs
inner join
sys.query_store_query_text qsqt
on qsqt.query_text_id = qsq.query_text_id
/* the largest waittime */
cross apply
(select top 1 total_query_wait_time_ms,wait_category_desc,sum(total_query_wait_time_ms) over (partition by plan_id,runtime_stats_interval_id ) total_query_wait_time_ms_allwaittyppes from sys.query_store_wait_stats sub_qsws where
sub_qsws.plan_id = qsp.plan_id and sub_qsws.runtime_stats_interval_id = qsrs.runtime_stats_interval_id order by sub_qsws.total_query_wait_time_ms desc
) qsws
cross apply (
select
CHARINDEX('StatementSubTreeCost',qsp.query_plan) pos1
) qsp_subs_1
cross apply (
select
CHARINDEX('"',qsp.query_plan,qsp_subs_1.pos1) pos2
) qsp_subs_2
cross apply (
select
CHARINDEX('"',qsp.query_plan,qsp_subs_2.pos2+1) pos3
) qsp_subs_3
cross apply (
select
SUBSTRING(qsp.query_plan,qsp_subs_2.pos2+1,qsp_subs_3.pos3-qsp_subs_2.pos2-1) as statementcost
) qsp_statementcost
where
1=1
--this object, replace the usp_######_Clean with your own proc name
and so.name = 'usp_######_Clean'
--only todays execution
and cast(qsq.last_execution_time as date) = CAST(getdate() as date)
order by 3 desc

 


Forum|alt.badge.img
  • Author
  • Known Participant
  • January 13, 2021
 

Thanks for pointing out the query store.
Will also enable this for additional logging.


  • Participating Frequently
  • February 12, 2021

Thanks for your product feedback.

We have passed your input to our R&D department for further evaluation.

The roadmap of our product is focused on a split in ODX, MSW, Semantic Layer and Customer Portal.

With that split we will see new features and functions appear that might exceed your expectations or make the request obsolete.