Ashwani 的个人资料Ashwani Roy's BI Blog照片日志列表更多 工具 帮助

日志


10月26日

F#- What , Why and BI – My First Look

I am preparing a demo for User Group in UK of how F# can used with Business Intelligence. Frankly speaking I don’t know the end result but as I go along I am going to keep blogging about what I am learning in F# with simple examples. The aim is to have a series of blog post on this new language (something like I have done for ADO.NET entity Framework on http://csentities.wordpress.com ) which will serve as a quick start guide and possibly will open up areas of discussion where I will be using F# with BI to add the power of functional programming into Business Intelligence.

I will be using VS 2010 (I have a Beta 2 Ultimate as of now but the code should work with VS 2008 and fsi.exe).

So Lets go ahead and start a new Blank Solution

image

Now lets go ahead and add a new F# console application.

image

Now Open the program.fs and paste the code given below in it.

open System
let a = 2
Console.WriteLine(a)
Console.Read()

 

Now if you hover over the Console.Read() you will see an error saying “The expression should have type ‘uint’, but has type ‘int’. Use ‘ignore’ to discard the result of the expression , or let to bind to a name.”

image

If you write Console.Read() in C# you will not get any diagnostic warning like this. This reminds you that you want to either pipe the result to ignore (e.g. if just calling a function for its side-effects) or else use the result.  There are a handful of minor diagnostic improvements like this (though there is still plenty of room for us to continue to improve).

If you want to get rid of this warning message then use the code below.

open System
let a = 2
Console.WriteLine(a)
ignore(Console.Read())

 

Run the application and your output will be look like this

image

Console.Read is just to hold the screen for us to see it and the screen exits when you hit any key after this.

So by now we know how to create a simple F# console application writes something to the console. Watch out for the coming posts.

10月2日

Microsoft MVP Award to Me !! Thank you all

I got a email today at 4:00 PM from Microsoft MVP award which read

Dear Ashwani Roy,
Congratulations! We are pleased to present you with the 2009 Microsoft® MVP Award! This award is given to exceptional technical community leaders who actively share their high quality, real world expertise with others. We appreciate your outstanding contributions in SQL Server technical communities during the past year

I would like to thank UK SQL Server Community , my blog readers my peers and friends and I hope I will keep adding value to community.

How can solve order effect your MDX calculations

[Old post reposted as I was not very happy with the formatting]

Have a look at this query

   1:  WITH
   2:  MEMBER [Measures].[REFUNDS] AS [Measures].[Internet Sales Amount] – [Measures].[Internet Sales Amount] * .4
   3:  MEMBER [Measures].[PROFIT] AS [Measures].[Internet Sales Amount] – [Measures].[REFUNDS]
   4:  MEMBER [Product].[Product].[Top10] AS
   5:  sum(
   6:  TOPCOUNT([Product].[Product].[Product].MEMBERS
   7:  , 10
   8:  , [Measures].[Internet Sales Amount]
   9:  )
  10:  ,
  11:  ([Measures].CurrentMember)
  12:  )
  13:  MEMBER [Product].[Product].[Top10PercOfTotal] AS [Product].[Product].[Top10]/[Product].[Product].[All Products]
  14:  SELECT 
  15:  {
  16:  [Measures].[Internet Sales Amount],
  17:  [Measures].[REFUNDS],
  18:  [Measures].[PROFIT]
  19:  } ON COLUMNS,
  20:  {
  21:  [Product].[Product].[Top10]
  22:  ,[Product].[Product].[Top10PercOfTotal]
  23:  ,[Product].[Product].[All Products]
  24:  } 
  25:  ON ROWS
  26:  FROM [Adventure Works]

 

This yields same result as yours i.e. not a flat cell by cell division

  Internet Sales Amount REFUNDS PROFIT
Top10 $10,355,525.92 $6,213,315.55 $4,142,210.37
Top10PercOfTotal 0.35272454 0.211634724 0.141089816
All Products $29,358,677.22 $17,615,206.33 $11,743,470.89

 

I don’t expect it to produce this result . If the computations was going as planned the output should be 0.35272454  (35 %) for each.

I have altered the SOLVE ORDER (this is the property which determines the order of cell computation) .

Here is modified query

   1:  WITH
   2:  MEMBER [Measures].[REFUNDS] AS [Measures].[Internet Sales Amount] - [Measures].[Internet Sales Amount] * .4,SOLVE_ORDER = 1
   3:  MEMBER [Measures].[PROFIT] AS   [Measures].[Internet Sales Amount] - [Measures].[REFUNDS],SOLVE_ORDER = 2
   4:   
   5:  MEMBER [Product].[Product].[Top10] AS
   6:   sum(
   7:    TOPCOUNT([Product].[Product].[Product].MEMBERS
   8:    , 10
   9:    , [Measures].[Internet Sales Amount]
  10:    )
  11:    ,
  12:    ([Measures].CurrentMember)
  13:    )
  14:  ,SOLVE_ORDER = 3
  15:   
  16:  MEMBER [Product].[Product].[Top10PercOfTotal] AS 
  17:  [Product].[Product].[Top10]/[Product].[Product].[All Products]
  18:  , FORMAT_STRING = "Percent"
  19:   , SOLVE_ORDER = 4
  20:   
  21:   
  22:   
  23:  SELECT 
  24:   {
  25:    [Measures].[Internet Sales Amount],
  26:    [Measures].[REFUNDS],
  27:    [Measures].[PROFIT]
  28:    } ON COLUMNS,
  29:   {
  30:  [Product].[Product].[Top10]
  31:  ,[Product].[Product].[Top10PercOfTotal]
  32:  ,[Product].[Product].[All Products]
  33:   
  34:    } 
  35:    ON ROWS
  36:  FROM [Adventure Works]
  37:  

 

Here is the output

  Internet Sales Amount REFUNDS PROFIT
Top10 $10,355,525.92 6213315.552 4142210.368
Top10PercOfTotal 35.27% 35.27% 35.27%
All Products $29,358,677.22 $17,615,206.33 $11,743,470.89

 

There you go. SOLVE_ORDER determines the series of how cell computation  will be formed. One with smallest SOLVE_ORDER will be evaluated first.

Hope this explains SOLVE_ORDER and how it works. For more have a look at MDSN @ http://msdn.microsoft.com/en-us/library/ms145539.aspx

9月26日

10th SEP – PASS Meeting- What's new in SQL Server 2008 for BI

SSAS 2008 has improved Dimension and Aggregation Designer, new Attribute Relationship Designer, Optimize performance with block computations mode and dynamic management views for enhanced resource monitoring.SSRS 2008 reports has the unique data format of Tablix which allows writing reports with combined advantage of Table and Matrix formats. It can integrate with Microsoft Office SharePoint Server 2007 for central delivery and management of business insight. It also enables users to quickly gain insight into complex sets of data by displaying data graphically with enhanced visualization capabilities. Performance has improved drastically for situations where you are generating large reports.

SSIS Pipeline is optimized to enable more parallel loading of data. You can write script components in C# now. It comes with improved scalability with thread pooling and enhanced lookup transformations. It also performs more functional and scalable data transfers with the improved SQL Server Import and Export Wizard.

Other engine features that you need to be aware of are, improvements to partitioning, change data capture to enable easy extraction of changed data from a production system, backup and table compression which enables better performance and less storage space utilisation. It also comes with optimized Star-Join, improved lock escalation handling, Merge statements and other advanced T SQL enhancements.

 

The slides and demos are available here for download.

8月25日

Thursday, 10th September – SQL Server User Group Meeting

The evening's agenda includes the following presentations (see Events section for more details):

What's new in SQL Server 2008 for BI

Presented by Ashwani Roy, MCTS MCITP MCAD MCP

We're also pleased to let you know that SQL Server MVP, Simon Sabin will be presenting. Details about this presentation will be made available shortly.

 

Location:
Bank Of America
5 Canada Square
London E14 5AQ
United Kingdom

 

Register @ http://www.sqlpass.org.uk/

7月27日

OLAP PivotTable Extensions

Excel 2007 provides many APIs which are quite powerful from analytics point of view.Not all of them are exposed via the UI though ( for some reason … ). SQL Server MVP Greg Galloway has developed this very cool tool which is EXCEL 2007 add-in and lets you do very cool stuff.

You know that calculated members are evaluated on the fly.But the only way you can have this calculated member available from browsing is by doing this physically in the cube. There is a way that you can do it in the excel itself .These are called Private Calculated Members

 

Simple ratios or differences and stuff that are very specific to a (or small group) of cube users can be put in here , rather than cluttering the cube.

Limitation: If you run "OLAP Tools... Convert to Formulas" on a PivotTable with private calculated members, the private calculated members will show N/A. There is no known workaround at this point other than having your OLAP administrator define these calculated members in the cube itself.

There is much more to this too and the download and more information is available here http://www.codeplex.com/OlapPivotTableExtend.

7月23日

SQL Server 2008 – Merge , Grouping Sets , table Valued Parameters and some other stuff !!

Some time back I took a small session for a group of SQL Server users about some cool stuff in SQL Server 2008 from the T SQL point of view. Some of the students wanted to have access to the scripts , so here they are

MERGE STATEMENT and OUTPUT CLAUSE

drop table Source
drop table Destination

go

CREATE TABLE Source (id int,val varchar(56));

CREATE TABLE Destination (id int,val varchar(56));

GO

INSERT Source VALUES (1,'A');
INSERT Source VALUES (2,'B');-- This will be inserted

INSERT Destination VALUES (1,'q');--this will be updated to A
INSERT Destination VALUES (6,'w');
INSERT Destination VALUES (8,'r');

GO

MERGE Destination D -- target table

        USING Source S -- source table

        ON S.ID = D.ID

        WHEN MATCHED  THEN

                UPDATE set D.val = S.val

          WHEN NOT MATCHED THEN

                -- insert a row if the stock is newly acquired

                INSERT VALUES (S.ID, S.VAL)

        -- output details of INSERT/UPDATE/DELETE operations

        -- made on the target table

        OUTPUT $action, inserted.*, deleted.*; ---this is not required in Merge statement I have kept it here just to demostrate that from SQL 2005 onwards
                                                       --- OUTOUT clause can be used along with INSERTED , DELELED to get the what has been done as a result to DML action
                                                       --- Might help in auditing the DML actions
/*output of the OUTPUT CLAUSE IS

$action    id    val        id        val--- I can see what is Inserted and what is Deleted
INSERT    2    B        NULL    NULL
UPDATE    1    A        1        q

*/                                                      

SELECT * FROM Destination;--See the result

GO

 

GROUPING SETS

GROUPING SETS are a new feature of SQL Server 2008. Using them will allow multiple groupings to be returned in one record set.

We will grouping on City and StateProvice in the same query

use AdventureWorksLT
go
---old way ----------------
SELECT NULL, City, COUNT(StateProvince) AS Nbr
FROM SalesLT.Address
GROUP BY  City
UNION ALL
SELECT StateProvince, NULL, COUNT(StateProvince) AS Nbr
FROM SalesLT.Address
GROUP BY StateProvince

--useing GROUPING SETS in SQL 2008
SELECT StateProvince, City, COUNT(StateProvince) AS Nbr
FROM SalesLT.Address
GROUP BY
GROUPING SETS
(
  (City),
(StateProvince)
--()
  );

TABLE VALUED PARAMTERS

In this example I will pass a table valued parameter in the stored procedure

--USING Table valued input Parameters
CREATE TYPE myTableType AS TABLE (id INT, name NVARCHAR(100),qty INT);

ALTER PROCEDURE myProc (@tvp myTableType READONLY) AS

declare @table TABLE (id int)

    UPDATE Inventory SET
    qty += s.qty
    FROM Inventory AS i INNER JOIN     @tvp AS tvp
    ON i.id = tvp.id
GO

 

NEW DATE TIME DATA TYPES

-- IN SQL 2008 INSTANCE
declare @datetime datetime,@DATETIME2 DATETIME2
select @datetime = '1500-03-04 15:43:26.857'--this will give out of range exception asyou cannot have anything before 1900
select @datetime2 = '1500-03-04 15:43:26.857'-- there you can do it
SELECT @DATETIME2

DECLARE @DATE DATE , @TIME TIME
SELECT @DATE ='1900-01-01',@TIME = '15:43:26.857'
SELECT @DATE ,@TIME

DECLARE @DATETIMEOFFSET datetimeoffset
SELECT @DATETIMEOFFSET = '2005-09-08 12:20:19.345 -08:00'    -- stored datetime with OFFSET Lets say you want datetime datetime captured in LONDON server transfrom to US and lets assume the offset is 8 hrs
SELECT @DATETIMEOFFSET

--------------------------------------------------------------------------------------------------------------------------

Thats all for now!! I though it will be easier to get the scripts from here rather than download from my sky drive. It is not a very extensive list , it just covers the 1 hours session which I took for beginner sql users.

A quick Look at CDC –(Change data Capture)

 

What is CDC :-

Change data capture (CDC) is a set of software design pattern used to determine (and track) the data that has changed so that action can be taken using the changed data. Also, Change data capture (CDC) is an approach to data integration that is based on the identification, capture and delivery of the changes made to enterprise data sources.

It is a very common requirement for a data warehouse load package is to determine what has changed in the source systems and load this data into the warehouse.

All I wanted to give was a script that you can use along with adventure works Database to see what are basic stuff in CDC and how it works. off course you can use it in your Data warehouse SSIS packages.

---------------------------------------------------------------

USE AdventureWorks
GO
--Enable Change Data Capture on the Database--
EXEC sp_cdc_enable_db
GO

--Enable Change Tracking on a table--
EXEC sp_cdc_enable_table
    @source_schema = 'HumanResources' ,@source_name = 'Employee',@supports_net_changes = 1,@role_name = null
go
--Simulate a Data Change --
UPDATE HumanResources.Employee
SET   ManagerID = 10 WHERE Title = 'Production Technician - WC60'
go

select * from HumanResources.Employee where Title = 'Production Technician - WC60'

--Get the Data --
select sys.fn_cdc_get_min_lsn('HumanResources_Employee'),sys.fn_cdc_get_max_lsn() 
go

select *  from cdc.fn_cdc_get_all_changes_HumanResources_Employee
(0x00000036000017400078,0x0000003600001EC1006C,'all')

declare @minLSN varbinary(max),@maxLsn varbinary(max)

select @minLSN = sys.fn_cdc_get_min_lsn('HumanResources_Employee')
,@maxlsn=sys.fn_cdc_get_max_lsn() 

select *  from cdc.fn_cdc_get_all_changes_HumanResources_Employee
(@minLSN ,@maxLsn ,'all')

--Disable CDC as I dont want this running on my Laptop--
EXEC sp_cdc_disable_db
GO
/*
    WHEN 1 THEN 'delete'
    WHEN 2 THEN 'insert'
    WHEN 4 THEN 'update'

*/

--exec sp_configure

-----------------------------------------------------------------------------------------------------

7月15日

Dynamic Management Views for SSAS

With SQL Server 2005 onwards we are provided with a set of DMVs. These DMVs allow us to monitor facts like which index is being used and which are not , where are IO bottlenecks , what is cache hit ratio and other stuff. But what about analysis service.

What if I want to know which of my hierarchies are being used , which are the aggregations that are not being used etc etc.

Well from SQL 2008 onwards you can have this capability.

AS 2008 has

-->  4 DBSCHEMA (Database Schema DMVs)

1. $SYSTEM.DBSCHEMA_CATALOGS

2. $SYSTEM.DBSCHEMA_COLUMNS

3. $SYSTEM.DBSCHEMA_PROVIDER_TYPES

4. $SYSTEM.DBSCHEMA_TABLES

--> 10 DMSCHEMA DMVs

1. $SYSTEM.DMSCHEMA_MINING_COLUMNS

2. $SYSTEM.DMSCHEMA_MINING_FUNCTIONS

3. $SYSTEM.DMSCHEMA_MINING_MODEL_CONTENT

4. $SYSTEM.DMSCHEMA_MINING_MODEL_CONTENT_PMML

5. $SYSTEM.DMSCHEMA_MINING_MODEL_XML

6. $SYSTEM.DMSCHEMA_MINING_MODELS

7. $SYSTEM.DMSCHEMA_MINING_SERVICE_PARAMETERS

8. $SYSTEM.DMSCHEMA_MINING_SERVICES

9. $SYSTEM.DMSCHEMA_MINING_STRUCTURE_COLUMNS

10. $SYSTEM.DMSCHEMA_MINING_STRUCTURES

These DMVs describe data mining models in the Analysis Services database.

--> There are 13 DMVs which describe the Meta Data of Analysis service data base (cube,partitions,hierarchies etc)

1. $SYSTEM.MDSCHEMA_CUBES

2. $SYSTEM.MDSCHEMA_DIMENSIONS

3. $SYSTEM.MDSCHEMA_FUNCTIONS

4. $SYSTEM.MDSCHEMA_HIERARCHIES

5. $SYSTEM.MDSCHEMA_INPUT_DATASOURCES

6. $SYSTEM.MDSCHEMA_KPIS

7. $SYSTEM.MDSCHEMA_LEVELS

8. $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS

9. $SYSTEM.MDSCHEMA_MEASUREGROUPS

10. $SYSTEM.MDSCHEMA_MEASURES

11. $SYSTEM.MDSCHEMA_MEMBERS

12. $SYSTEM.MDSCHEMA_PROPERTIES

13. $SYSTEM.MDSCHEMA_SETS

 

These DMVs expose a Whole New gold mine for Analysis and monitoring of SSAS. Have a look at some of the DMX queries below which use these DMVs to expose some very useful information.

1. Open SQL Server Management Studio

2. Open a DMX query Window and connect to Adventure Works Cube .(if you dont have adventure works cube you can download it from codeplex.com)

3. paste this query

SELECT *
FROM $SYSTEM.MDSCHEMA_MEASURES
WHERE CUBE_NAME = 'Adventure Works'
   AND MEASUREGROUP_NAME = 'Internet Sales'
ORDER BY [MEASUREGROUP_NAME]

You will see that you have information about all the measures in the Measure Group

here is sample output (Only selected few column due to space constraint)

 

CATALOG_NAME SCHEMA_NAME CUBE_NAME MEASURE_NAME MEASURE_UNIQUE_NAME DEFAULT_FORMAT_STRING
Adventure Works DW Adventure Works Internet Order Quantity [Measures].[Internet Order Quantity] #,#
Adventure Works DW Adventure Works Internet Extended Amount [Measures].[Internet Extended Amount] Currency
Adventure Works DW Adventure Works Internet Tax Amount [Measures].[Internet Tax Amount] Currency
Adventure Works DW Adventure Works Internet Freight Cost [Measures].[Internet Freight Cost] Currency
Adventure Works DW Adventure Works Internet Unit Price [Measures].[Internet Unit Price] Currency
Adventure Works DW Adventure Works Internet Total Product Cost [Measures].[Internet Total Product Cost] Currency
Adventure Works DW Adventure Works Internet Standard Product Cost [Measures].[Internet Standard Product Cost] Currency

 

Now that we know how useful can this DMV be here are some scripts that can help you play around with

select * from $system.dbschema_tables
select * from $system.dbschema_columns
select * from $system.dbschema_catalogs

select * from $system.mdschema_cubes
select * from $system.mdschema_hierarchies

select * from [Adventure Works].[$Product] --database dimension

select * from $system.discover_commands
select * from $system.discover_connections
select * from $system.discover_memoryusage
select * from $system.discover_object_memory_usage
select * from $system.discover_object_activity where object_reads > 0
select * from $system.discover_partition_stat

SQL Server MVP Darren Gosbell has blogged about it and I suggest you look at his blog http://geekswithblogs.net/darrengosbell/Default.aspx 

(even otherwise it is a very good blog)

7月9日

- SQL BITS – V [19th - 21st November 2009 Celtic Manor, Newport] www.sqlbits.com

 

We are pleased to announce SQLBits goes West, the 5th instalment of SQLBits conferences. We are making it even bigger and better than last time, now spanning 3 days, and still keeping everything that has worked so well at previous events.

The event will be held at the Celtic Manor Resort in Newport South Wales, just off the M4 motorway.

This is the biggest event yet, with 3 days of top quality SQL Server content. It starts with the pre-conference training day on Thursday 19th, more details coming soon. We have had a lot of feedback about weekday versus weekend during previous events so this time we have added a paid conference day on the Friday 20th, with a SQL 2008 and R2 theme. Finally we have the free Saturday community day, with speakers from around the world covering all manner of SQL Server topics.

For more go to www.sqlbits.com

6月24日

F# – Microsoft’s answer to functional Programming

“F# supports functional programming, object-oriented programming, and imperative programming.” MSDN

Object-oriented programming (OOP) is a programming paradigm that uses "objects" — data structures consisting of datafields and methods — and their interactions to design applications and computer programs. Programming techniques may include features such as information hiding, data abstraction, encapsulation, modularity, polymorphism, and inheritance. It was not commonly used in mainstream software application development until the early 1990s.

Wikipedia :- http://en.wikipedia.org/wiki/Object_oriented 

Functional Programming :- It emphasizes on functions rather than change of state .

Wikipedia : - http://en.wikipedia.org/wiki/Functional_programming

More information on F# :- http://msdn.microsoft.com/en-gb/library/dd553242(VS.100).aspx . Don Syme from Microsoft Research lab @ cambridge is the principle architect . Though this langauge it looks like it is OCaml it is much more than just OCaml. It also brings to the table , the .net framwork and a way to use F# classes into any other langauge as C#.

So with this language I would say Microsoft has finally married off mathematially modellers and programmers and made a bigger happier family.

I will be blogging more on it.

6月13日

Download Available (BI EVENING 10 JUNE Reading (UK):- Attribute Relationships, Aggregations and using MDX Studio to its best)

I was a speaker this BI evening at Microsoft Reading on Attribute Relationships, Aggregations and using MDX Studio to its best.

Setting proper relationships for the attributes of a dimension is essential from a query performance point of view. It is one of the most important things while dimensional modelling.
Aggregations are pre calculated summaries of cube data that help enable Microsoft SQL Server 2005 Analysis Services (SSAS) to provide rapid query responses. Aggregations are like fact table data grouped and aggregated to a higher level in the Hierarchy. Building the correct aggregations can greatly enhance query performance.
MDX Studio can be used to analyze complex MDX expressions, monitor performance characteristics of MDX queries and get insight into how MDX interacts with other UDM features such as attribute relationships. MDX Studio provides unique visualization representation of MDX expression and allows the user to interact with different stages of MDX execution. It is built by Mosha Pasumansky, who is inventor of MDX and one of the Architects of Microsoft Analysis Services.

The Slides and Demo is available for download here.

6月1日

June 10th BI Event in Reading

June 10th BI Event in Reading; SQL 2008 R2 & Gemini; Data Modelling to Info Architecture; Attribute Relationships, Aggregations & using MDX studio to its best

For more information and to register: http://www.sqlserverfaq.com/events/168/SQL-2008-R2-and-Gemini-From-Data-Modelling-to-Information-Architecture-and-Attribute-Relationships-Aggregations-and-using-MDX-Studio-to-its-best.aspx

Join us for another UK SQL Server User Group meeting.
5.30pm – 6:00 Registration and networking + Tea/Coffee with biscuits.
Meet and greet.
6:00pm – 6:15pm Round Table discussion and Nuggets - ALL
Take stock and get the latest news in the SQL Server field. This is also a great opportunity to ask any burning questions you have, may be a problem at work.
It's also a great opportunity to share your knowledge and gain exposure in the industry by giving a short 1 - 6 minute "nugget" demo/tip.
6.15pm – 7:00pm Andrew Fryer, Microsoft

SQL Server 2008 R2 and Gemini
SQL Server 2008 R2 will be the release with Gemini in it. Gemini is about trying to marry self service BI for end users with proper control of that data for the IT Professional. The beta program has only been recently announced and so it’s still early days but if you want to know what to expect then this is your first real opportunity in the UK. So join Andrew Fryer for an hour of peering into what 2010 BI looks like
7:00pm - 7:45pm Alex Pratt and Andrew de Rozairo Sybase

From Data Modeling to Information Architecture
It used to be about doing the thing right: a tactical approach for data modelling was necessary, because all the data was contained within the same domain. Now it’s about doing the right thing. Management demands information that is strategic and which helps them make business decisions. This information is more complex, housed in multiple domains, created in different languages, and used by different areas of the organisation for opposing purposes. Adding to the complexity is the critical need for it to be timely, accurate and comprehensive. 
In this 45-minute session, Andrew de Rozairo and Alex Pratt of Sybase will impart the secrets of successfully building a foundation for a comprehensive and successful information architecture. You’ll learn how to take a new approach to modelling: combining the business view, information view and the technology view to achieve a strategic vision.
7:45pm – 8pm BREAK: Light refreshments

More time to network and ask questions...
8pm - 8:15pm Vincent Rainardi
Nugget (demo) on SSAS
8:15pm – 9.00pm Ashwani Roy
Attribute Relationships, Aggregations and using MDX Studio to its best
Setting proper relationships for the attributes of a dimension is essential from a query performance point of view. It is one of the most important things while dimensional modelling.
Aggregations are pre calculated summaries of cube data that help enable Microsoft SQL Server 2005 Analysis Services (SSAS) to provide rapid query responses. Aggregations are like fact table data grouped and aggregated to a higher level in the Hierarchy. Building the correct aggregations can greatly enhance query performance.
MDX Studio can be used to analyze complex MDX expressions, monitor performance characteristics of MDX queries and get insight into how MDX interacts with other UDM features such as attribute relationships. MDX Studio provides unique visualization representation of MDX expression and allows the user to interact with different stages of MDX execution. It is built by Mosha Pasumansky, who is inventor of MDX and one of the Architects of Microsoft Analysis Services.
Speaker Bio's
Andrew Fryer
Microsoft Evangelist specialising in Business Intelligence.
Ashwani Roy
Ashwani Roy is a Business Intelligence Consultant with IMGROUP with 5+ Years of experience in RDBMS and OLAP. He specialises in Database engine, SSIS and SSAS, C#, Design Patterns.
Ashwani is MCTS & MCITP in SQL Server 2005 and also has 2 patents in Business Intelligence domain.
Ashwani is also frequent blogger on www.sqlkit.com  and http://ashwaniroy.spaces.live.com  and one of the authors of Server 2008 MCP exams 451 (database solutions). He is also regular responder on Microsoft Forums
Andrew de Rozairo
Andrew de Rozairo has focused on delivering business value through technology for the last 23 years. An Electrical Engineering and Computer Science degree from the Massachusetts Institute of Technology with an MBA from INSEAD, Europe's leading Business School, provides him with a unique mix of technical and business perspective. Andrew has over 16 years in management roles in the Data Management Industry, including as CEO of a VC-backed encryption start-up, and European Managing Director of an American data monitoring software company. In his current role as Business Development Manager EMEA for Sybase, Andrew works with strategic clients and partners to develop and deliver business value propositions based on leading-edge Sybase modeling, metadata management and analytics technology solutions.

5月28日

WolframAlpha

It is quite amazing , how powerful data can be and for someone who works in BI it is even fascinating to see a software that combines the search capabilities to BI.

Today I stumbled accorss this post from Mosha @ http://sqlblog.com/blogs/mosha/archive/2009/05/14/wolframalpha.aspx and went to website http://www.wolframalpha.com/ for a quick test drive.

With the hardware and computational power that is available to us , and the kind of computational ability that Wolfram talks about , it seems like the goal of “make all systematic knowledge immediately computable by anyone” is reachable .

5月17日

Parent-Child Dimensions – Introduction , drawback and alternative approach

First of all lets understand what Parent Child dimensions are and where and When they are modelled.

From MSDN

A parent-child dimension is based on two dimension table columns that together define the lineage relationships among the members of the dimension. One column, called the member key column, identifies each member; the other column, called the parent key column, identifies the parent of each member. This information is used to create parent-child links, which are then combined into a single member hierarchy that represents a single meta data level.

 

For example, in the following Employee table, the column that identifies each member is Employee_Number. The column that identifies the parent of each member is Manager_Employee_Number. (This column stores the employee number of each employee's manager.)

 

Here is an example of how the data underlining a parent-child dimension might look like.

 

 

image 

Important  If a parent-child dimension is included in a cube with a fact table that has rows associated with the dimension's nonleaf members, you must set the dimension's Members With Data property to Nonleaf data visible or Nonleaf data hidden. Otherwise, processing the cube fails.

The Members With Data property indicates whether nonleaf members of a parent-child dimension are allowed to have associated fact table data. By default, nonleaf members are not allowed to have associated fact table data, so the property is initially set to Leaf members only.

Limitations of Parent Child Dimensions

Parent child dimension do provide flexibility when modelling dimensions like employee organization structure and other self referencing dimensions but beware that this flexibility is not free.

The primary issue is that since there is no consistent levelling, you cannot have pre-calculated aggregates for intermediate levels.

If you go to Mosha Pasumansky’s Blog link which is given below , you will find the codeplex link to the Jon Burchel (a Senior Support Escalation Engineer in Microsoft) PCDimNaturalizer project.

http://sqlblog.com/blogs/mosha/archive/2008/08/25/parent-child-dimension-table-naturalizer.aspx

NOTE :- It is also added to BIDS Helper's new release http://bidshelper.codeplex.com/Wiki/View.aspx?title=Parent-Child%20Dimension%20Naturalizer  but if you want to call it from SSIS or external application you will need binaries which can be downloaded from http://pcdimnaturalize.codeplex.com/

 

 

3月30日

SQL Bits IV Presentations

SQL Bits IV is over.By far this was the largest SQL Server community event that I have spoken at. The power point decks are uploaded and can be donloaded from the website http://www.sqlbits.com/information/PublicSessions.aspx --> right click on the session and Open it in new window. You will find most of the session's PPT Deck.
 
 
Alternatively you can download it from here
 
3月20日

Creating Sub Cubes - Visual and Non Visual Mode

With AS 2008 , you can create subcube in Visual and Non Visual Modes. First of all What is Visual and Non Visual Mode. I will try to write some simple MDX to demonstrate it.
 
Fire this query without any subcubeing
 

select {[Measures].[Reseller Sales Amount] } on 0,

[Business Type].members on 1

from

[Adventure Works]

where

[Category].members

You Get

  Reseller Sales Amount
All Resellers $80,450,596.98
Specialty Bike Shop $6,756,166.18
Value Added Reseller $34,967,517.33
Warehouse $38,726,913.48

Alright so this is the All Resellers total Value  ($80,450,596.98)

Now lets fire the next MDX but this time with simple SUB CUBE

CREATE SUBCUBE [Adventure Works] AS

Select

{[Category].Accessories, [Category].Clothing} on 0,

{[Business Type].[Value Added Reseller], [Business Type].[Warehouse]}

on 1

from

[Adventure Works]

 

with

member

Measures.VisualSum As

([Category].Accessories,[Measures].[Reseller Sales Amount])+([Category].Clothing,[Measures].[Reseller Sales Amount])

select

{[Measures].[Reseller Sales Amount] ,Measures.VisualSum} on 0,

[Business Type].

members on 1

from

[Adventure Works]

where

[Category].members

  Reseller Sales Amount VisualSum
All Resellers $2,031,079.39 $2,031,079.39
Value Added Reseller $767,388.52 $767,388.52
Warehouse $1,263,690.86 $1,263,690.86

You would see that Visual total is eaqul to Reseller total which is much less than the actual total. Why does this happen?

This is because when you try to do a total on SUB CUBE with considers only VISUAL TOTALS , the other values ( of Reseller Sales here) will not be aggregated. Sometimes this is what is required in a business scenarion but many times you will find your customers want to see total figure . In this case the query below will show the SUB CUBE total and the NON VISUAL total . For this we will create a sub cube with NON VISUAL mode.(this is available from AS 2008)

CREATE

SUBCUBE [Adventure Works] AS

NON

VISUAL (Select {[Category].Accessories, [Category].Clothing} on 0,

{[Business Type].[Value Added Reseller], [Business Type].[Warehouse]}

on 1

from [Adventure Works])

 

 

select

[Category].members on 0,

[Business Type].

members on 1

from

[Adventure Works]

where

[Measures].[Reseller Sales Amount]

 

 

with

member

Measures.VisualSum As

([Category].Accessories,[Measures].[Reseller Sales Amount])+([Category].Clothing,[Measures].[Reseller Sales Amount])

select

{[Measures].[Reseller Sales Amount] ,Measures.VisualSum} on 0,

[Business Type].

members on 1

from

[Adventure Works]

where

[Category].members

 

this is what you get

  Reseller Sales Amount VisualSum
All Resellers $80,450,596.98 $2,349,138.77
Value Added Reseller $34,967,517.33 $767,388.52
Warehouse $38,726,913.48 $1,263,690.86

 

See that the NON VISUAL total $80,450,596.98   is without considering the sub cube and this is eaqul to the first total that we had , which was on the complete cube , but the Visual total is same as the Total we had for the SUB CUBE in Visual Mode.

 

I have tried my best to explain that Visual and Nonb Visual totals are how can Create SUB CUBE be used in both modes. For more  do go to

http://msdn.microsoft.com/en-us/library/ms144916.aspx

http://sqlblog.com/blogs/mosha/archive/2008/11/04/as2008-mdx-subselects-and-create-subcube-in-non-visual-mode.aspx

 

1月26日

Dynamically creating Excel Sheets and populating data from SQL server tables

I came across this very interesting question on Microsoft Forums where in the developer wanted to Create an SSIS package which Loops thought a set of tables and create Excel files on the fly and Load data into it.

There can be many ways to solve this including building a dot net custom component which creates excel sheet on the fly and then dynamically loads the data into it. Though I came across this cool SP that does this job really well and in very quick steps.

 

Here is how you can do this --

 

--        Enable SP_MAKEWEBTASK 
3 exec sp_configure 'Web Assistant Procedures', 1 -- Configuration option 'Web Assistant Procedures' changed from 0 to 1. Run the RECONFIGURE statement to install. RECONFIGURE 
4 RECONFIGURE
5  
6 --RUN THIS TO CREATE A FILE DUNAMICALLY AND LOAD DATA INTO IT 
7  
8 EXEC sp_makewebtask 
9     @outputfile = 'd:\testing.xls'
10     @query = 'Select * from sys.tables'
11     @colheaders =1, 
12     @FixedFont=0,@lastupdated=0,@resultstitle='Testing details' 
13     
14  
15 --NOW WE KNOW HOW TO DYNAMICALLY CREATE A XLS FILE AND LOAD DATA FROM TABLES LETS TRY PUTTING THIS IN A SP AND RUNNING IT BY PARAMETERS 
16  
17 ALTER PROCEDURE CREATE_AND_LOAD_EXCEL 
18  
19      @ioutputfile       VARCHAR(2000)
20     ,@itable            VARCHAR(MAX)
21     ,@iresultstitle     VARCHAR(2000)
22 AS 
23 declare @sql varchar(max),@excelfile varchar(max)
24 set @sql ='Select * from '+CHAR(9)+@itable
25 set @excelfile = 'd:\'+@ioutputfile
26     EXEC sp_makewebtask 
27     @outputfile = @excelfile, 
28     @query = @sql, 
29     @colheaders =1, 
30     @FixedFont=0,@lastupdated=0,@resultstitle=@iresultstitle
31     
32 --EXECUTE THIS SP 
33     
34     EXEC CREATE_AND_LOAD_EXCEL 
35         @ioutputfile = 'NewFile.xls'
36         @itable = 'sys.tables'
37         @iresultstitle='Testing details' 


Now that we have seen that you can do this by passing table name and excel file name dynamically
this proc can be run with a execute sql task in a FOR EACH LOOP container where in you pass the
table where you want to load data from and name of excelfile you want to create dynamically.

1月25日

Conditional Split over Datetime Column in SSIS

There are times when we need to branch the flow of our ETL package based on the datetime column of the source file(table).

I have seen some questions begin asked on Microsoft forums with developers facing this error

the output evaluated to NULL, but the 'component conditional split' requires a boolean.

This happens when there are records inthe datetime column which are NULL and this the condition split expression fails on trying to split NULLS.

The solution is

--CHECK IF ISNULL --->do something

ELSE --> do the split as per logic

here is what your expression should look like to handle this

ISNULL([MY_DATE_COLUMN])]) ? False : [MY_DATE_COLUMN]] >@[User:MinDate]

12月11日

Dynamic SQL- A quick tip

Many times I am asked both personally and on Microsoft forums about how to build a dynamic SQL query. Even though there are many reasons why not to use a dynamic SQL like
-- it causes SP recompiles every time it runs.This you loose the benifit of cached execution Plan
-- There are security Issues like SQL Injection etc
 
Lets leave that behind . Dynamic is like "Can't live with it can't live without it" stuff.
 

Lets say we want a integer variable

declare
@var int
declare @query varchar(max) --use Max when you have it...
select
@var = 10
Set
@query = 'select * from Mytable where MyVariable = '+CAST(@var AS VARCHAR) -- So just put four quotes both sides
print
@query
--here it is output
--select * from Mytable where MyVariable = 10
 
I am sure this is not the most trivial stuff but hope it helps someone sometimes.
 
Lets see how to build a dynamic query with Varchar variable
 
declare
@var varchar(10)
declare
@query varchar(max) --use Max when you have it...
select
@var = 'Test'
Set
@query = 'select * from Mytable where MyVariable = '+''''+@var+'''' -- So just put four quotes both sides
print
@query
--Here is the output
--select * from Mytable where MyVariable = 'Test'

 

Lets say we want a

datetime variable
declare
@var datetime
declare
@query varchar(max) --use Max when you have it...
select
@var = '2008-10-31'
Set
@query = 'select * from Mytable where MyVariable = '+''''+CONVERT(Varchar,@var,105)+'''' -- So just put four quotes both sides and cast the date to varchar
print
@query
--here it is
--select * from Mytable where MyVariable = '31-10-2008'