Showing posts with label etl. Show all posts
Showing posts with label etl. Show all posts

Tuesday, March 27, 2012

Advice needed: DW and ETL tools

Hi!
I'm involved in building a medium sized ware house with MS SQLServer 2000=20
as the backend. I've been searching for some good tools for SQLServer to=20
help me with the demanding tasks of planning and building the database and =
(preferrably the same tool) making the mappings and data transformations=20
when populating the database, but still I've not come across any that has=20
satisfied my needs. Oracle has a great tool for this called Oracle=20
Warehouse Builder, but haven't seen anything like it for MS databases.=20
Therefore I'm asking: Does anyone know about a tool that can be used in=20
planning, deploying and maintaining a data warehouse built on a MS=20
SQLServer 2000? Doesn't matter if it has a price tag on it or not
S=F6ren Hakola
FinlandI think that anyone that uses SQL Server to deploy a Data Warehouse... uses
SQL Server 2000 capabilities only. And it's cheeper )
- Diagram: create logic models with direct link to the tables
- Tables: Info repository
- Data Transformation Services (DTS): ETL, data transformation (populate the
tables)
- Jobs: schedule DTS for automatic ETL
- Analysis Services: If you desire to work with a MOLAP tool
Other tools:
Data Stage (Unix and Windows, all databases) - Top tool, expensive
Informatica (Unix and Windows, all databases) - Top tool, expensive|||Thanks for the response. I'll check those out!
Greetings
S.sql

Sunday, March 25, 2012

AdventureWorksDW ETL packages

Where can i get the AdventureWorksDW ETL packages. I attched the AdventureWorks,AdventureWorksDW databases. Now i need ETL packages to transfer the data froam AdventureWorks to AdventureWorksDW.

If anybody knows about this please help me..............

Regards,

Hanu

If you installed the samples and tools when you installed SQL Server, look here: C:\Program Files\Microsoft SQL Server\90\Samples\Integration Services\Package Samples\AWDataWarehouseRefresh

If you did not, then look here: http://www.codeplex.com/MSFTISProdSamples/Release/ProjectReleases.aspx?ReleaseId=4039

|||

Thank you very much for your reply.

Regards

Hanu

sql

Friday, February 24, 2012

ADO.NET or OLEDB connection/recordset?

My package needs to be a High Performance (target: 150,000 rows in 30 minutes) ETL solution. We are using all MS technologies - SSIS, SQL 2005, BIDS, etc. I need to loop the recordset executed by a Stored Proc in a Execute SQL Task in a Source Script Component.

If I use an ADO.NET Connection Manager, here is the code in the Source Script Component Public Overrides Sub CreateNewOutputRows()

Code 1

Dim sqlAdapter As New SqlDataAdapter

Dim dataRow As Data.DataRow

Dim ds As DataSet = CType(Me.Variables.rsSomeResultset, DataSet)

sqlAdapter.Fill(ds)

Iget: Error: System.InvalidCastException: Unable to cast object of type 'System.Object' to type 'System.Data.DataSet'.

Code 2

Dim oledbAdapter As New OleDb.OleDbDataAdapter

Dim dataTable As DataTable

oledbAdapter.Fill(dataTable, Me.Variables.rsSomeResultset)

Error: System.ArgumentException: Object is not an ADODB.RecordSet or an ADODB.Record. Parameter name: adodb

It works all right when I use an OLEDB Connection Manager with the second code sample.

Question: In order to extract the maximum performance, wouldn't it be preferred to use ADO.NET with SqlClient Provider in an all SQL Server 2005 environment? Or will an OLEDB Connection provide comparable or equal performance?

If so, what code can I use? Since the recordset returned by the Stored Proc (in the Execute SQL Task) can only be captured in a System.Object variable and you can only use the overload of the Fill() method of the OleDbDataAdapter to accept an ADO Recordset or Record object.

There was a post recently that compared some of the connection types, but thanks to the lovely search functionality, I can't find it. If anyone else has it, please post it to this thread.

In general, I don't think you are going to see a significant performance difference between ADO.NET and OLEDB against SQL Server.

That being said, if you want to do some further research into the problem, try adding a message box to your script to display the type of the variable.

Code Snippet

System.Windows.Forms.MessageBox.Show(Me.Variables.rsSomeResultset.ToString())

|||You Code 1 snippet doesn't look right for ADO.NET. You shouldn't be trying to call sqlAdapter.Fill(ds). Try Dim dataTable as DataTable = ds.Tables(0) instead. I'd guess that OLE DB would be faster. Please post your findings.
|||

Do you mean this discussion?

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1985034&SiteID=1

ADO.NET or OLEDB connection/recordset?

My package needs to be a High Performance (target: 150,000 rows in 30 minutes) ETL solution. We are using all MS technologies - SSIS, SQL 2005, BIDS, etc. I need to loop the recordset executed by a Stored Proc in a Execute SQL Task in a Source Script Component.

If I use an ADO.NET Connection Manager, here is the code in the Source Script Component Public Overrides Sub CreateNewOutputRows()

Code 1

Dim sqlAdapter As New SqlDataAdapter

Dim dataRow As Data.DataRow

Dim ds As DataSet = CType(Me.Variables.rsSomeResultset, DataSet)

sqlAdapter.Fill(ds)

Iget: Error: System.InvalidCastException: Unable to cast object of type 'System.Object' to type 'System.Data.DataSet'.

Code 2

Dim oledbAdapter As New OleDb.OleDbDataAdapter

Dim dataTable As DataTable

oledbAdapter.Fill(dataTable, Me.Variables.rsSomeResultset)

Error: System.ArgumentException: Object is not an ADODB.RecordSet or an ADODB.Record. Parameter name: adodb

It works all right when I use an OLEDB Connection Manager with the second code sample.

Question: In order to extract the maximum performance, wouldn't it be preferred to use ADO.NET with SqlClient Provider in an all SQL Server 2005 environment? Or will an OLEDB Connection provide comparable or equal performance?

If so, what code can I use? Since the recordset returned by the Stored Proc (in the Execute SQL Task) can only be captured in a System.Object variable and you can only use the overload of the Fill() method of the OleDbDataAdapter to accept an ADO Recordset or Record object.

There was a post recently that compared some of the connection types, but thanks to the lovely search functionality, I can't find it. If anyone else has it, please post it to this thread.

In general, I don't think you are going to see a significant performance difference between ADO.NET and OLEDB against SQL Server.

That being said, if you want to do some further research into the problem, try adding a message box to your script to display the type of the variable.

Code Snippet

System.Windows.Forms.MessageBox.Show(Me.Variables.rsSomeResultset.ToString())

|||You Code 1 snippet doesn't look right for ADO.NET. You shouldn't be trying to call sqlAdapter.Fill(ds). Try Dim dataTable as DataTable = ds.Tables(0) instead. I'd guess that OLE DB would be faster. Please post your findings.
|||

Do you mean this discussion?

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1985034&SiteID=1