Thursday, October 29, 2015

SSIS Information





Looks fine. There are options if you start to run into performance issues, but if this is stable (it finishes within the data-loading time window, source systems aren't being drained of resources, etc.), then I see no reason to change.

Some potential issues to keep an eye on...
 1. Having 20+ full-cache lookup transforms may pose a problem if your dimensions increase in size, due to memory constraints on the SSIS system, but since they are Type 1, I wouldn't worry.
 2. Full-cache lookups "hydrate" pre-execution; having 20+ of them may slow you down.

A common alternative (to what you have above) is to extract the fact table data from the source system and land it in a staging area before doing the dimension key lookups via a single SQL statement. Some even keep a set of dimension key mapping tables in the staging area specifically for this purpose. This reduces locking/blocking on the source system if you have a lot of data each load and have to block the source system while you suck the data out and run it through those 20+ lookup transforms.

Having a good staging area strategy becomes more important when you have a large amount of data, large dimensions, complex key mappings (usually due to multiple source systems), and short data-loading time windows.
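The single-SQL-statement key lookup described above can be sketched as follows; the staging and dimension table and column names here are hypothetical, not from the original post:

```sql
-- Resolve a dimension surrogate key with one set-based statement against the
-- staging area, instead of a per-row in-memory lookup transform.
UPDATE stg
SET    stg.CustomerKey = dim.CustomerKey
FROM   staging.FactSales AS stg
JOIN   dbo.DimCustomer   AS dim
       ON dim.CustomerBusinessKey = stg.CustomerBusinessKey;
```

One such UPDATE per dimension (or a single SELECT joining all dimensions) replaces the 20+ lookup transforms, and it runs inside the staging database rather than against the source system.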



Lookup tables for dimensions denote natural hierarchies in a snowflake schema.
The Lookup transformation can route unmatched data to an error flow.


Incorrect: Try to use fully cached lookups when loading fact tables to get the needed surrogate keys from dimension tables.
Incorrect: You can only change the SQL command of the Lookup task when you are using property expressions for the data flow task that has a Lookup task inside it.


Incorrect: Lookup is a synchronous transformation.

Check whether the Lookup transformation is fully cached and that the query contains only the needed columns. The last step is to optimize the buffer settings of the data flow.





We are currently working on styling SSRS reports using CSS style sheets or templates. We need these styling changes to be reflected in all the reports developed from this template, by changing the source template without modifying each and every report.
In this regard, we would like to find out if anyone has expertise in these functionalities. We would appreciate it if the respective resources could get in touch with us at the earliest; any pointers are also fine with us.


MERGE JOIN: This transformation has two inputs and one output. It does not support an error output.
 •Specify whether the join is a FULL, LEFT, or INNER join.
 •Specify the columns the join uses.
 •Specify whether the transformation handles null values as equal to other nulls.

Note
If null values are not treated as equal values, the transformation handles null values like the SQL Server Database Engine does.


NULL Handling
We can specify whether the transformation handles null values as equal to other nulls by setting the TreatNullsAsEqual property to True. If null values are not treated as equal values, the transformation handles null values like any ANSI database does.
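The engine behavior referred to above can be seen directly in T-SQL; this is a standalone sketch, not tied to any table in the notes:

```sql
-- Under ANSI semantics, NULL = NULL evaluates to UNKNOWN, so the CASE falls
-- through to the ELSE branch; TreatNullsAsEqual = True overrides this inside
-- the Merge Join transformation and matches NULL keys to each other.
SELECT CASE WHEN NULL = NULL THEN 'equal' ELSE 'not equal' END;  -- 'not equal'
```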



SSIS Merge Join - Replace Null Columns
I normally use a Conditional Split after the Merge Join, then use the ISNULL function to direct the rows to a Derived Column transformation, then simply replace the column with the value you want.
Then, after that, put a Union All transformation in and connect it to both the Conditional Split and the Derived Column.





Union All
This transformation has multiple inputs and one output. It does not support an error output.







By including the Merge transformation in a data flow, you can perform the following tasks:
 •Merge data from two data sources, such as tables and files.
 •Create complex datasets by nesting Merge transformations.
 •Remerge rows after correcting errors in the data.

The Merge transformation is similar to the Union All transformation. Use the Union All transformation instead of the Merge transformation in the following situations:
 •The transformation inputs are not sorted.
 •The combined output does not need to be sorted.
 •The transformation has more than two inputs.
 ==================================================================

--Monitor Data Flow
--Integration Services includes features and tools that you can use to troubleshoot the data flow in a package during execution.
--Tapping the data flow during execution:

DECLARE @execid BIGINT;

--1. Create an execution instance of a package
--   (SSMS: right-click pkg.dtsx and click Execute, then click Script, and now add the data taps)
EXEC [SSISDB].[catalog].[create_execution]
    @folder_name  = N'ETL Folder',
    @project_name = N'ETL Project',
    @package_name = N'Package.dtsx',
    @execution_id = @execid OUTPUT;

--2. Add a data tap
EXEC [SSISDB].[catalog].add_data_tap
    @execution_id            = @execid,
    @task_package_path       = '\Package\Data Flow Task',  /* SSMS: DFT properties, PackagePath value */
    @dataflow_path_id_string = 'Paths[Flat File Source.Flat File Source Output]',  /* the data path (the arrow between source and destination); see the arrow's properties, IdentificationString */
    @data_filename           = 'output.txt';  /* written to <Program Files>\Microsoft SQL Server\110\DTS\DataDumps */

--3. Start the execution instance of the package
EXEC [SSISDB].[catalog].[start_execution] @execid;


==================================================================

--So you should handle the rows that cause errors; this is when the error output comes to help.
--Ignore Failure: will ignore any errors during the insertion of records into the destination
--Redirect Row: will redirect any rows that cause an error during insertion to the destination
--Fail Component: will cause the component to fail when the insertion hits any error

===============================================================

There are some packages that were moved to production. 12:34 PM
Later, some changes needed to be made, and production support does not know how to make them. 12:35 PM
How will you make them from the development environment?


2 and 1 more q 12:38 PM
Which is better to use, 12:38 PM
SCD or Lookup? 12:38 PM
Tell me a quick answer


Yeah, but the corresponding event might not be available 5:11 PM
i.e., restart, if a task fails 5:11 PM
Keep the task failure/success status in one temp table 5:12 PM
and identify whether a task-failure status is available 5:12 PM
by using event handlers

You can have a table (TaskName, ExecutionStatus) that has all the task names, with the execution status set to 0. Then, before executing a task, you read the execution status for that particular task into a package variable, and based on the value you can decide whether to execute the task (a precedence constraint with an expression). Once the task executes successfully, you update the table and set the ExecutionStatus to 1.
ExecutionStatus 0 means the task needs to be executed, and 1 means the task has been executed.
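A minimal sketch of this restartability table; the TaskName and ExecutionStatus names come from the answer above, while the table and task names are illustrative:

```sql
CREATE TABLE dbo.TaskStatus
(
    TaskName        VARCHAR(100) NOT NULL PRIMARY KEY,
    ExecutionStatus INT          NOT NULL DEFAULT 0  -- 0 = pending, 1 = done
);

-- Before a task runs: read its status into a package variable, used in the
-- precedence-constraint expression (e.g. @[User::ExecStatus] == 0).
SELECT ExecutionStatus FROM dbo.TaskStatus WHERE TaskName = 'Load Customers';

-- After the task succeeds: mark it done so a restarted run will skip it.
UPDATE dbo.TaskStatus SET ExecutionStatus = 1 WHERE TaskName = 'Load Customers';
```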

Any reason why you are not using checkpoints? (Nitesh Rai: Please mark the post as answered if it answers your question.) 5:21 PM
1 more q is there.. 5:21 PM
In a shared location there are 5 flat files; they need to be processed into staging. Among them, 1 flat file is missing, or there is some corrupted data in a column. 5:23 PM
How will you handle that?


A: How to check whether all the data is loaded into staging from flat files that are on a shared drive


Have a data flow task with the flat file source and a Row Count transformation to get the count of records in the file. Now connect this data flow to the existing data flow. In the existing data flow task, have another Row Count transformation before the OLE DB destination, and one for error records.
Now, in the control flow, check whether the initial row count matches the count after the load, and whether there are any records in the error output.

If they match, the load was successful; otherwise, if any rows do not match, we will send a mail 6:04 PM
ok
===================================================

http://stackoverflow.com/questions/32097020/adding-new-column-to-excel-source-in-ssis


1. Adding a new column to an Excel Source in SSIS
2. How to handle the double-precision float data type [DT_R8] in an SSIS data flow task

Source: float (double-precision float [DT_R8])
Destination: varchar(50)
Result:
    +------------+---------------------------------+
    | SourceValue| DestinationValue after execution|
    +------------+---------------------------------+
    | 0.579      | 0.578999999999999965            |
    +------------+---------------------------------+
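One common fix for the result above is to round the float to a fixed precision before converting it to a string (in the data flow, cast to DT_NUMERIC before DT_STR). A T-SQL sketch of the same idea:

```sql
-- Converting a float straight to varchar exposes its binary representation;
-- casting through DECIMAL(18,3) first keeps the expected '0.579'.
DECLARE @v FLOAT = 0.579;
SELECT CONVERT(VARCHAR(50), CAST(@v AS DECIMAL(18,3)));  -- '0.579'
```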


3. Connecting to an FTP server that has a certificate using SSIS












I am trying to download files from an FTP site inside my production environment. The FTP site has been assigned a certificate (pfx). The FTP port is 21.
Due to this certificate I am unable to download the content using the FTP task in SSIS.
I tried using a Script task; however, I got the same error, that FTP cannot accomplish the connection due to the certificate.

I also tried with an SFTP component, but that gave me the error shown below.
I googled the error and found out that SFTP does not use TLS/SSL (neither implicit nor explicit); explicit/implicit TLS/SSL is used with FTP, according to the post.


Below is the workaround for this...
There are three SSL options when we create an FTP site in IIS:
1. Allow SSL
2. Require SSL
3. No SSL
The FTP site that I had has the Require SSL option selected. I changed that option to Allow SSL, which still enables the SSL feature for the FTP site but allows the FTP task to connect.

=======================================================

Report Layout
--What is Tablix?
--The best of Table merged with the best of Matrix
--Allows for fixed and dynamic columns and rows
--Introduces optional omission of row/column headers

--Innovative layout &Grouping capabilities
--Support for richly formatted text


___________________________________________________________________

--Chart Data: Sum value, Category Group, Series Group
--Tablix Properties: Border, Data Only, Fill, Filter, General, Localization, No Rows, Others, Position, Location, Size, PublishedReportPart, Visibility
--General Tab: DatasetName, FixedColumnHeaders (F), FixedRowHeaders (F), GroupsBeforeRowHeaders (0), KeepTogether (F), Name, PageBreak, PageName, RepeatRowHeaders, ToolTip
--No Rows Tab: NoRowsMessage: (Expression)
--Others: Custom Properties, Sort Expression



____________________________________________________________

RS dev Demo:

--Tablix – adjacent groups?
--Sparkline
--Bullet
--Sparkline Drillthrough
--Hide Columns or Rows Conditionally
Create tablix, sparkline, bullet, column chart for quarters


================================================

--1. SSIS is an in-memory pipeline, so ensure that all transformations occur in memory.
--2. The following network perfmon counters can help you tune your topology:
--•Network Interface / Current Bandwidth: This counter provides an estimate of current bandwidth.
--•Network Interface / Bytes Total/sec: The rate at which bytes are sent and received over each network adapter.
--•Network Interface / Transfers/sec
--3. Baseline the source system extract speed.
--Execute the package from the command line (DTEXEC) and measure the time it took for it to complete its task. Use the Integration Services log output to get an accurate calculation of the time. You want to calculate rows per second:
--Rows/sec = Row Count / Time of Data Flow
--To increase this Rows/sec figure, you can do the following:
--Improve drivers and driver configurations
--Start multiple connections
--Use multiple NIC cards
--Optimize the SQL data source, lookup transformations, and destinations.

   

--Data Access Mode: When you execute SQL statements within Integration Services (as noted in the above Data access mode dialog box), whether to read a source, to perform a lookup transformation, or to change tables, some standard optimizations significantly help performance:
--•Use the NOLOCK or TABLOCK hints to remove locking overhead.
--•To optimize memory usage, SELECT only the columns you actually need. If you SELECT all columns from a table (e.g., SELECT * FROM), you will needlessly use memory and bandwidth to store and retrieve columns that do not get used.
--•If possible, perform your datetime conversions at your source or target databases, as it is more expensive to perform them within Integration Services.

--•In SQL Server 2008 Integration Services, there is a new feature, the shared lookup cache. When using parallel pipelines (see points #8 and #10 below), it provides a high-speed, shared cache.
--•If Integration Services and SQL Server run on the same server, use the SQL Server destination instead of the OLE DB destination to improve performance.

--•Commit size 0 is fastest on heap bulk targets, because only one transaction is committed.
--If you cannot use 0, use the highest possible value of commit size to reduce the overhead of multiple-batch writing.
--Commit size = 0 is a bad idea if you are inserting into a B-tree, because all incoming rows must be sorted at once into the target B-tree, and if your memory is limited, you are likely to spill.
--Batch size = 0 is ideal for inserting into a heap. For an indexed destination, I recommend testing between 100,000 and 1,000,000 as the batch size.
--•Use a commit size of <5,000 to avoid lock escalation when inserting; note that in SQL Server 2008 you can now enable/disable lock escalation at the object level, but use this wisely.

--•Heap inserts are typically faster than inserting into a clustered index.
--This means that you may want to drop indexes and rebuild them if you are changing a large part of the destination table;
--you will want to test your inserts both by keeping the indexes in place and by dropping all indexes and rebuilding, to validate.
--•Use partitions and the partition SWITCH command; i.e., load a work table that contains a single partition and SWITCH it into the main table after you build the indexes and put the constraints on.
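The SWITCH step described above might look like this; the table names and partition number are hypothetical, and the work table must already match the target's schema, indexes, and constraints:

```sql
-- Metadata-only operation: the populated work table becomes partition 5 of
-- the main fact table, with no data movement.
ALTER TABLE dbo.FactSales_Work
SWITCH TO dbo.FactSales PARTITION 5;
```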

--Use data types (yes, back to data types!) wisely.
--Of all the points on this top-10 list, this is perhaps the most obvious. Yet it is such an important point that it needs to be made separately. Follow these guidelines:
--•Make data types as narrow as possible so you will allocate less memory for your transformations.
--•Do not perform excessive casting of data types; it will only degrade performance. Match your data types to the source or destination and explicitly specify any necessary data type casting.
--•Watch precision issues when using the money, float, and decimal types. Also, be aware that money is faster than decimal, and money has fewer precision considerations than float.

-- declare a cursor and loop to update stats and recompile.



---OLE DB destination (insert into table): There are quite a few strategies that can be adopted while loading data into a table.
In the OLE DB Destination component properties there are different access modes; one of them is called OpenRowset Using FastLoad. When this option is set, one gets to work with the other fast-load properties, such as:
 FastLoadKeepIdentity
 FastLoadKeepNulls
 FastLoadOptions
 FastLoadMaxInsertCommitSize
---In summary: with the OLE DB destination, while using the fast-load option one needs to be aware of what kind of load is being performed in terms of the number of records, and then tune the fast-load options to work effectively.

To prevent the package from stopping, set the Maximum insert commit size option to 2147483647.

----------------

OLE DB destination component, Table or View – Profiler results: looks like we have a cursor on the go (bad news for performance in most cases!).
OLE DB destination component, Table or View – Fast Load results: a bulk insert. The real question is why not use fast load. I can't think of many reasons; fast load will cause a table lock because it does a BULK load, so if you're running this during the day you may find that it's not suitable, but it's definitely worth testing!



----------

    --Query Options toolbar --> General --> SET ROWCOUNT (0)
                            --> Advanced --> SET NOCOUNT, ARITHABORT (checked), NOEXEC, SHOWPLAN_TEXT, PARSE ONLY, STATISTICS TIME, CONCAT_NULL_YIELDS_NULL, STATISTICS IO
                            --> Transaction ISOLATION LEVEL

===================================================================


Handling errors at row level: A. Ignore failure  B. Fail component  C. Redirect row  D. Delete row
Fuzzy components are limited to character data comparisons, whereas DQS matching can compare numbers and dates as well.
Copying data from tables/views has no restriction on multiple data flows, whereas by using a query to copy we can restrict it, but only a single data flow is allowed.
The Dev and Prod environments differ in the "configuration of data store"; Dev: same server (same db); platform: the SSIS version differs (Prod/Dev) and the O/S also differs (Dev/Prod).
A success constraint will only allow the process to continue to the following task if the preceding task completed successfully, whereas a completion constraint will allow the process to continue as soon as the preceding task has completed, regardless of the outcome.

The Merge transformation is similar to Union All, but with Merge, the sources have to be sorted and the sort position is preserved.
The Lookup transformation does not need a sorted input; it is a non-blocking transformation, and in cases when more matching rows exist in the referenced dataset, only the first one will be retrieved. This means that the transformation will never retrieve more rows than exist in the input rows. With the Merge Join transformation, more rows can be retrieved, because all matching data is retrieved.
Parameters are exposed to the caller, but variables are not. If a particular property needs to be set dynamically but should not be set by the administrator, use a variable; otherwise, use a parameter.
Some log types provide better integration with the operating system, whereas others provide better integration with the data in a database. Not every log type is best suited for every intended use.


Table object, Indexes: New Index, Rebuild All, Reorganize All, Disable All.
For an individual existing index object: Create/Disable Index, New Index, Rebuild, Reorganize, Disable.
Index Properties, General tab: table name, index name, index type (clustered/non-clustered/primary/spatial), plus a Unique checkbox and the list of index key columns in a grid.
Index Properties, Options tab: Rebuild index (default: unchecked), Ignore duplicate values (disabled: unchecked), Automatically recompute statistics (checked), Use row locks when accessing the index (checked), Use page locks when accessing the index (checked).
Included columns: in addition to the index key, you can include other table columns in the index; these are non-key columns to include in the index.
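As a sketch of the "included columns" option above; the table and column names are made up for illustration:

```sql
-- The non-key columns OrderDate and TotalDue are stored only at the leaf
-- level, so the index can cover queries without widening the index key.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate, TotalDue);
```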


Control flows contain the definitions of data management operations.
■■ Control flows determine the order and the conditions of execution.
Control flow is defined by precedence constraints that determine the order and conditions of execution.
■■ Tasks representing logical units of work can be grouped in containers.
■■ Loop containers allow a unit of work to be executed repeatedly.

Creating a Robust and Restartable Package
--■■ You can enlist a container or a task to be part of a transaction.




The data flow task provides all three essential elements of complex data movements: data extraction, data transformations, and data loading. In addition, all three types of operations can be performed in memory and, depending on the data provider used by the destination connection manager, fast loading can also be used.

--bigint: a 40-digit literal gives: The number '1234567891234567891234567891234567891234' is out of the range for numeric representation (maximum precision 38).
--int: if you pass 11 or more digits, you get an error like: Arithmetic overflow error converting expression to data type int. The statement has been terminated.
--int: 10 digits: allowed
 
  --Precedence Constraint editor
  --constraint options-->evaluation operation:constraint/expression/ expression and constraint/ expression or constraint
                                    --value:success/failure/completion
                                    --expression
  --multiple constraints:choose only one(radiobutton)
  --logical AND:all constraints must evaluate TRUE
  --logical OR:one constraint must evaluate TRUE
 
--Select either Expression and Constraint or Expression or Constraint to use an expression in the evaluation operation of the precedence constraint.
-- The expression that you use must evaluate to a Boolean.
--For example, a package could use an expression that evaluates whether the constrained executable runs based on the amount of disk space available, the number of rows a data flow inserted into a table, or the day of the month.

--Let's look at some sample expressions. In the following expression, the variable varRowCount, set by a Row Count transformation in a data flow, is compared to the value 1000. You could use this expression to determine whether the package should run a task that backs up a database, depending on the number of rows inserted into the destination database. The prefix @ is required when variables are used in expressions, to differentiate the variables from column names. In most user interfaces, the @ is added for you automatically.

--@varRowCount > 1000

--Likewise, in the following expression, the variable @varDisk, set by a Script task, is compared to the value 20. You could use this expression to determine whether to continue package execution.

--@varDisk >= 20

--In the following expression, the GETDATE function determines the day. If the day is the first day in the month, the constrained executable runs.

--DATEPART("day",GETDATE()) == 1

In your ETL process, a few values need to be retrieved from a database at run time,
based on another value available at run time, and they cannot be retrieved as part of
any data flow task. Which task can you use in this case?

Execute T-SQL Statement task
Execute SQL task
Expression task
Execute Process task
--The Execute T-SQL Statement task cannot retrieve values from the database, nor does it support parameters, both of which are needed in this case.
--The Execute SQL task can be used in this case, because it allows values to be passed as input parameters to the SQL statement and retrieved as output parameters from the statement.
--The Expression task cannot be used to retrieve data from a data source.
--Although arguments can be passed to the external process via the Execute Process task, values cannot be retrieved this way.
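A sketch of the kind of statement the Execute SQL task would run here, using OLE DB's ? placeholder syntax; the table and column names are invented for illustration:

```sql
-- The ? placeholder is an input parameter mapped to a package variable in
-- the task's Parameter Mapping page; with ResultSet = "Single row", the
-- returned BatchId is mapped to another package variable for later use.
SELECT BatchId
FROM   dbo.BatchControl
WHERE  RunDate = ?;
```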

Which task in the control flow supports transactions?
The File System and XML tasks do not provide a rollback mechanism, so they cannot support transactions.

Which transaction isolation level does not lock the records being read?
Serializable
Snapshot
Chaos
ReadUncommitted
Serializable is the strictest isolation level.
Snapshot isolation does not lock the data being read.
Chaos is similar to ReadUncommitted, but it checks the isolation level of other pending transactions during a write operation.
ReadUncommitted is the least restrictive isolation level because it ignores locks placed by other transactions.
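A sketch of reading under Snapshot isolation; the database and table names are placeholders, and the database must first be configured to allow it:

```sql
ALTER DATABASE SalesDb SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
    -- Reads row versions from the version store instead of taking shared
    -- locks, so the records being read are not locked.
    SELECT * FROM dbo.Orders;
COMMIT;
```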




Non-blocking streaming transformations: these operate on the same buffer from start to finish.
Partially blocking transformations: these might wait on other inputs, and always produce a new buffer.
Non-blocking row-based transformations: these operate row by row, but can't work on the same buffer simultaneously.
Blocking transformations: these stop the pipeline until all rows are read, and always produce a new buffer.


Both of these transformations accept one input and give us multiple outputs:
Multicast: directs each row of the source to every output.
Conditional Split: directs each row to a single output.

Both of these transformations can add new columns:
Derived Column: can add new columns without any help from existing columns; can assign different transformations and data types to the new columns; supports error output.
Copy Column: can add new columns only from existing columns; has no such options; does not support error output.


IN vs EXISTS vs JOIN (we want to get all the data in ta1 that doesn't have a child row in ta2):
IN: returns the expected results.
EXISTS: returns the expected results.
JOIN: an inner join gives us more rows than we need, because it joins to every child row.
Null condition:
NOT IN: will not return the records, because NULL is unknown (three-valued logic: true/false/unknown).
NOT EXISTS: returns the null records as well.
LEFT JOIN: also returns the null records, with an IS NULL condition included in the WHERE clause.
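The three-valued-logic point above can be sketched with the ta1/ta2 names from the notes; the column names are made up:

```sql
CREATE TABLE ta1 (id INT);
CREATE TABLE ta2 (parent_id INT);
INSERT INTO ta1 VALUES (1), (2), (3);
INSERT INTO ta2 VALUES (1), (NULL);

-- NOT IN: returns no rows at all, because the NULL in ta2 makes every
-- comparison evaluate to UNKNOWN.
SELECT id FROM ta1 WHERE id NOT IN (SELECT parent_id FROM ta2);

-- NOT EXISTS: returns 2 and 3 as expected; NULLs simply never match.
SELECT id FROM ta1
WHERE NOT EXISTS (SELECT 1 FROM ta2 WHERE ta2.parent_id = ta1.id);

-- LEFT JOIN with IS NULL in the WHERE clause: also returns 2 and 3.
SELECT ta1.id
FROM   ta1
LEFT JOIN ta2 ON ta2.parent_id = ta1.id
WHERE  ta2.parent_id IS NULL;
```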


4. What is the difference between the Union All and the Merge transformation?

The Merge transformation is similar to Union All, but with Merge, the sources have to be sorted and the sort position is preserved.

5. What is the difference between the Lookup and Merge Join transformations?

The Lookup transformation does not need a sorted input; it is a non-blocking transformation, and in cases when more matching rows exist from the referenced dataset, only the first one will be retrieved. This means that the transformation will never retrieve more rows than exist in the input rows. With the Merge Join transformation more rows can be retrieved, because all matching data is retrieved.

6. There is an important difference between lineage and other columns: lineage columns are never exposed to end users and are never shown on end users' reports.

To summarize, a dimension may contain the following types of columns:

■■ Keys Used to identify entities

■■ Name columns Used for human names of entities

■■ Attributes Used for pivoting in analyses

■■ Member properties Used for labels in a report


■■ Lineage columns Used for auditing, and never exposed to end users


Troubleshooting Tools for Package Execution
Integration Services includes features and tools that you can use to troubleshoot packages when you execute them after they have been completed and deployed.
At design time, SQL Server Data Tools (SSDT) provides breakpoints to pause package execution, the Progress window, and data viewers to watch your data as it passes through the data flow. However, these features are not available when you are running packages that have been deployed. The main techniques for troubleshooting deployed packages are as follows:
  • Catch and handle package errors by using event handlers.
  • Capture bad data by using error outputs.
  • Track the steps of package execution by using logging.
You can also use the following tips and techniques to avoid problems with running packages
·         Help to ensure data integrity by using transactions
·         Restart packages from the point of failure by using checkpoints
1.       Catch and Handle Package Errors by Using Event Handlers
2.       Troubleshoot Bad Data by Using Error Outputs
3.       Troubleshoot Package Execution by Using Operations Reports
4.       Troubleshoot Package Execution by Using SSISDB Views
5.       Troubleshoot Package Execution by Using Logging
6.       Troubleshoot Package Execution by Using Debug Dump Files
7.       Troubleshoot Run-time Validation Issues
8.       Troubleshoot Run-time Permissions Issues
9.       Troubleshoot 64-bit Issues
10.   Troubleshoot Errors without a Description

1.Catch and Handle Package Errors by Using Event Handlers:Create an event handler for the OnError event
2. Troubleshoot Bad Data by Using Error Outputs:
You can use the error output available on many data flow components to direct rows that contain errors to a separate destination for later analysis.
1.      Capture bad data by using error outputs.
2.      Add friendly information to the error outputs.
3.      Add the description of the error.
4.      Add the name of the error column
a.       Create a lookup table of column names
b.      Add the column name to the output.

2. Troubleshoot Bad Data by Using Error Outputs àCapture bad data by using error outputs: Send rows that contain errors to a separate destination such as an error table or a text file.
The error output automatically adds two numeric columns that contain the number of the error that caused the row to be rejected, and the ID of the column in which the error occurred.
For more info: Error Handling in Data:
Errors typically fall into one the following categories
1.      Data conversion errors
2.      Expression evaluation errors
3.      Lookup errors, which occur if a lookup operation fails to locate a match in the lookup table.
Configure Error output (red-marked error flow)
a.       Error and Truncation Options:Default ID & column
b.      Adding the Error Description: You can use the Script component to include the error description in an additional column by using a single line of script to call the GetErrorDescription method

public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Look up the friendly error description for this row's ErrorCode.
        Row.ErrorDescription = this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);
    }
}
3.Troubleshoot Package Execution by Using Operations Reports
Standard operations reports are available in SQL Server Management Studio to help you monitor Integration Services packages that have been deployed to the Integration Services catalog. These package reports help you to view package status and history, and, if necessary, identify the cause of failures.

Troubleshoot Package Execution by Using Logging
You can track much of what occurs in your running packages by enabling logging. Log providers capture information about the specified events for later analysis, and save that information in a database table, a flat file, an XML file, or another supported output format.
a.       Enable logging.
b.      Select the package's Diagnostic event to troubleshoot provider issues

c.       Enhance the default logging output.
Logging typically appends rows to the logging destination each time that a package is run. Although each row of logging output identifies the package by its name and unique identifier, and also identifies the execution of the package by a unique ExecutionID, the large quantity of logging output in a single list can become difficult to analyze.
The following approach is one suggestion for enhancing the default logging output and making it easier to generate reports:
1.      Create a parent table that logs each execution of a package.
2.      Add auditing information to the data flow.
3.      Consider capturing row count data.


 Questions:


Instead Of Triggers vs After Triggers
1. Each table or view can have one INSTEAD OF trigger for each triggering action (UPDATE, DELETE, and INSERT), whereas a table can have several AFTER triggers for each triggering action.
2. INSTEAD OF triggers fire in place of the triggering action and before constraints are processed, whereas AFTER triggers fire after the triggering action (INSERT, UPDATE, or DELETE) and after any constraints are processed.


Clustered Indexes vs Non-Clustered Indexes
1. A clustered index reorders the physical storage of the records in the table; a non-clustered index sorts and maintains a separate storage.
2. There can be only one clustered index per table; we can have 249 non-clustered indexes on a table.
3. The leaf nodes of a clustered index contain the data; the leaf nodes of a non-clustered index contain pointers to the data.
4. To create a clustered index, SQL Server requires more memory, because the leaf pages in the tree structure maintain the actual data; to create a non-clustered index, SQL Server requires less memory, because the leaf pages contain pointers to the actual data.
5. Retrieving data by using the clustered index is faster than by using a non-clustered index.



create synonym dbo.tsample for testtable
create synonym dbo.vsample for v1ci  --a synonym can point to a view, function, or stored procedure

How do you search for synonyms of a word with the CONTAINS predicate?
1. You have to use the CONTAINS(FTcolumn, 'FORMSOF(THESAURUS, SearchWord1)') syntax.

1. Does a synonym store T-SQL or any data?
2. Can synonyms be altered?
Answer
1. No, a synonym is just a name. All that is stored with a synonym is the object it refers to.
2. No; to change a synonym, you must drop and recreate it.

Synonyms cannot refer to other synonyms. They can only refer to database objects such as
tables, views, stored procedures, and functions. In other words, synonym chaining is not allowed.
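So "changing" the dbo.tsample synonym from above means dropping and recreating it; the new target table name here is hypothetical:

```sql
-- Synonyms cannot be altered in place: drop, then recreate with the new target.
DROP SYNONYM dbo.tsample;
CREATE SYNONYM dbo.tsample FOR dbo.testtable_v2;  -- repoint to the new table
```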

FAST LOAD:

The data flow task provides all three essential elements of complex data movements—data extraction, data transformations, and data loading.
In addition,all three types of operations can be performed in memory and—depending on the data provider used by the destination connection manager—fast loading can also be used.


■■ Use Fast Load or Batch mode when inserting data by using an ODBC or OLE DB destination adapter.

A quick comparison between loading a SQL Server Ole DB destination with “Table and View” vs the “Table and View Fast Load”
I’ve set up a simple data flow that loads data from a csv and pushes it into SQL Server.
Table or View – Profiler Results

Looks like we have a cursor on the go.. (bad news for performance in most cases!)
And Table or View – Fast Load results:

A bulk insert. The real question is why not use fast load. I can't think of many reasons; fast load will cause a table lock because it does a BULK load, so if you're running this during the day you may find that it's not suitable, but it's definitely worth testing!


Note:

OLE DB Destination
The OLE DB destination supports all of the bulk load options for SQL Server. However, to support ordered bulk load, some additional configuration is required. For more information, see “Sorted Input Data”. To use the bulk API, you have to configure this destination for “fast load”.
The OLE DB destination can use both TCP/IP and named pipes connections to SQL Server. This means that the OLE DB destination, unlike the SQL Server destination, can be run on a computer other than the bulk load target. Because Integration Services packages that use the OLE DB destination do not need to run on the SQL Server computer itself, you can scale out the ETL flow with workhorse servers.

Configuring the Integration Services OLE DB Destination for Sorted Input
While the GUI does not directly expose it, it is possible to configure the OLE DB Destination in Integration Services to support a sorted input stream. This is done from the advanced properties of the component.
Figure 6:Configuring OLE DB Destination for sorted input
Add the ORDER hint, as illustrated above, to the FastLoadOptions property.
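With the ORDER hint added, the FastLoadOptions property value would look something like the following; SortColumn is a placeholder for the actual sort key of the input stream:

```text
TABLOCK,CHECK_CONSTRAINTS,ORDER(SortColumn ASC)
```

The options are comma-separated, and the ORDER column list must match the sort order of the rows arriving at the destination.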














