Ashwani Roy's BI Blog
October 26

F# - What, Why and BI: My First Look

I am preparing a demo for a user group in the UK on how F# can be used with Business Intelligence. Frankly speaking, I don't know the end result yet, but as I go along I am going to keep blogging about what I am learning in F#, with simple examples. The aim is to have a series of blog posts on this new language (something like what I have done for the ADO.NET Entity Framework at http://csentities.wordpress.com) which will serve as a quick-start guide and possibly open up areas of discussion where I will be using F# with BI to bring the power of functional programming into Business Intelligence.

I will be using VS 2010 (I have Beta 2 Ultimate as of now, but the code should work with VS 2008 and fsi.exe). So let's go ahead and start a new blank solution. Now let's add a new F# console application. Open Program.fs and paste in the code given below.

open System
let a = 2
Console.WriteLine(a)
Console.Read()
Now if you hover over Console.Read() you will see a warning saying "The expression should have type 'unit', but has type 'int'. Use 'ignore' to discard the result of the expression, or let to bind to a name." If you write Console.Read() in C# you will not get any diagnostic warning like this. It reminds you that you should either pipe the result to ignore (e.g. if you are just calling a function for its side effects) or else use the result. There are a handful of minor diagnostic improvements like this (though there is still plenty of room for us to continue to improve). If you want to get rid of this warning message, use the code below.

open System
let a = 2
Console.WriteLine(a)
// Console.Read() returns an int; ignore discards it so the expression has type unit
ignore(Console.Read())
Run the application and your output will look like this. Console.Read is there just to hold the screen so that we can see the output; the window closes when you hit any key after this. So by now we know how to create a simple F# console application that writes something to the console. Watch out for the coming posts.

October 2

Microsoft MVP Award to Me!! Thank you all

I got an email today at 4:00 PM from the Microsoft MVP Award, which read
I would like to thank the UK SQL Server community, my blog readers, my peers and friends, and I hope I will keep adding value to the community.

How can solve order affect your MDX calculations

[Old post reposted as I was not very happy with the formatting]

Have a look at this query:

WITH
-- REFUNDS and PROFIT are calculated measures
MEMBER [Measures].[REFUNDS] AS [Measures].[Internet Sales Amount] - [Measures].[Internet Sales Amount] * .4
MEMBER [Measures].[PROFIT] AS [Measures].[Internet Sales Amount] - [Measures].[REFUNDS]
-- Top10 sums the current measure over the ten best-selling products
MEMBER [Product].[Product].[Top10] AS
    sum(
        TOPCOUNT([Product].[Product].[Product].MEMBERS
            , 10
            , [Measures].[Internet Sales Amount]
        )
        ,
        ([Measures].CurrentMember)
    )
-- Top10PercOfTotal divides the top ten by all products
MEMBER [Product].[Product].[Top10PercOfTotal] AS [Product].[Product].[Top10]/[Product].[Product].[All Products]
SELECT
{
    [Measures].[Internet Sales Amount],
    [Measures].[REFUNDS],
    [Measures].[PROFIT]
} ON COLUMNS,
{
    [Product].[Product].[Top10]
    ,[Product].[Product].[Top10PercOfTotal]
    ,[Product].[Product].[All Products]
}
ON ROWS
FROM [Adventure Works]
This yields the same result as yours, i.e. not a flat cell-by-cell division.
This is not the result I expect. If the computation were going as planned, the output should be 0.35272454 (35%) for each measure. So I have altered the SOLVE_ORDER (this is the property which determines the order of cell computation). Here is the modified query:

WITH
MEMBER [Measures].[REFUNDS] AS [Measures].[Internet Sales Amount] - [Measures].[Internet Sales Amount] * .4, SOLVE_ORDER = 1
MEMBER [Measures].[PROFIT] AS [Measures].[Internet Sales Amount] - [Measures].[REFUNDS], SOLVE_ORDER = 2

MEMBER [Product].[Product].[Top10] AS
    sum(
        TOPCOUNT([Product].[Product].[Product].MEMBERS
            , 10
            , [Measures].[Internet Sales Amount]
        )
        ,
        ([Measures].CurrentMember)
    )
    , SOLVE_ORDER = 3

-- the ratio gets the highest SOLVE_ORDER, so it is evaluated last
MEMBER [Product].[Product].[Top10PercOfTotal] AS
    [Product].[Product].[Top10]/[Product].[Product].[All Products]
    , FORMAT_STRING = "Percent"
    , SOLVE_ORDER = 4

SELECT
{
    [Measures].[Internet Sales Amount],
    [Measures].[REFUNDS],
    [Measures].[PROFIT]
} ON COLUMNS,
{
    [Product].[Product].[Top10]
    ,[Product].[Product].[Top10PercOfTotal]
    ,[Product].[Product].[All Products]
}
ON ROWS
FROM [Adventure Works]
Here is the output
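The screenshot is not reproduced here, but the effect of evaluation order can be sketched outside MDX. The Python snippet below is only an illustration with made-up figures (top-ten sales of 35 against a total of 100, not Adventure Works data), and the function names are hypothetical. It compares the two possible orders for the REFUNDS cell on the Top10PercOfTotal row: applying the REFUNDS formula on top of the sales ratio, versus dividing one REFUNDS value by the other.

```python
# Hypothetical figures, not taken from the Adventure Works cube.
TOP10_SALES = 35.0
ALL_SALES = 100.0


def refunds(sales):
    """REFUNDS as defined in the query: sales - sales * 0.4."""
    return sales - sales * 0.4


def ratio_then_formula(top10, total):
    """The measure formula is evaluated last, on top of the sales ratio."""
    return refunds(top10 / total)


def formula_then_ratio(top10, total):
    """The ratio is evaluated last, dividing REFUNDS by REFUNDS."""
    return refunds(top10) / refunds(total)


if __name__ == "__main__":
    print("measure formula applied last:",
          round(ratio_then_formula(TOP10_SALES, ALL_SALES), 4))
    print("ratio applied last:",
          round(formula_then_ratio(TOP10_SALES, ALL_SALES), 4))
```

Only the second order gives the plain share of the total, which is what giving the ratio the highest SOLVE_ORDER asks for.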
There you go. SOLVE_ORDER determines the sequence in which cell calculations are evaluated. The member with the smallest SOLVE_ORDER is evaluated first. Hope this explains SOLVE_ORDER and how it works. For more, have a look at MSDN: http://msdn.microsoft.com/en-us/library/ms145539.aspx

September 26

10th Sep PASS Meeting: What's new in SQL Server 2008 for BI

SSAS 2008 has an improved Dimension and Aggregation Designer, a new Attribute Relationship Designer, optimized performance with block computation mode, and dynamic management views for enhanced resource monitoring.

SSRS 2008 reports have the new Tablix data region, which allows writing reports with the combined advantages of the Table and Matrix formats. It can integrate with Microsoft Office SharePoint Server 2007 for central delivery and management of business insight. It also enables users to quickly gain insight into complex sets of data by displaying data graphically with enhanced visualization capabilities. Performance has improved drastically for situations where you are generating large reports.

The SSIS pipeline is optimized to enable more parallel loading of data. You can write script components in C# now. It comes with improved scalability through thread pooling and enhanced lookup transformations. It also performs more functional and scalable data transfers with the improved SQL Server Import and Export Wizard.

Other engine features that you need to be aware of are improvements to partitioning, change data capture to enable easy extraction of changed data from a production system, and backup and table compression, which enable better performance and less storage space utilisation. It also comes with optimized star joins, improved lock escalation handling, MERGE statements and other advanced T-SQL enhancements.
The slides and demos are available here for download.

August 25

Thursday, 10th September: SQL Server User Group Meeting

The evening's agenda includes the following presentations (see the Events section for more details):

What's new in SQL Server 2008 for BI
Presented by Ashwani Roy, MCTS MCITP MCAD MCP

We're also pleased to let you know that SQL Server MVP Simon Sabin will be presenting. Details about this presentation will be made available shortly.
Location:
Register @ http://www.sqlpass.org.uk/

July 27

OLAP PivotTable Extensions

Excel 2007 provides many APIs which are quite powerful from an analytics point of view. Not all of them are exposed via the UI though (for some reason...). SQL Server MVP Greg Galloway has developed this very cool tool, an Excel 2007 add-in that lets you do very cool stuff. You know that calculated members are evaluated on the fly. But normally the only way to make a calculated member available when browsing is to define it physically in the cube. There is a way to do it in Excel itself. These are called private calculated members.
Simple ratios, differences and other things that are very specific to one user (or a small group) of cube users can be put in here, rather than cluttering the cube.

Limitation: If you run "OLAP Tools... Convert to Formulas" on a PivotTable with private calculated members, the private calculated members will show N/A. There is no known workaround at this point other than having your OLAP administrator define these calculated members in the cube itself.

There is much more to this too, and the download and more information are available here: http://www.codeplex.com/OlapPivotTableExtend.

July 23

SQL Server 2008: Merge, Grouping Sets, Table Valued Parameters and some other stuff!!

Some time back I took a small session for a group of SQL Server users about some cool stuff in SQL Server 2008 from the T-SQL point of view. Some of the students wanted to have access to the scripts, so here they are.

MERGE STATEMENT and OUTPUT CLAUSE

drop table Source
go
CREATE TABLE Source (id int, val varchar(56));
CREATE TABLE Destination (id int, val varchar(56));
GO
INSERT Source VALUES (1, 'A');
INSERT Destination VALUES (1, 'q'); -- this will be updated to A
GO
MERGE Destination D -- target table
USING Source S -- source table
ON S.ID = D.ID
WHEN MATCHED THEN
    UPDATE set D.val = S.val
WHEN NOT MATCHED THEN -- insert a row if the stock is newly acquired
    INSERT VALUES (S.ID, S.VAL)
-- output details of INSERT/UPDATE/DELETE operations
-- made on the target table
OUTPUT $action, inserted.*, deleted.*;
-- The OUTPUT clause is not required in a MERGE statement. I have kept it here
-- just to demonstrate that from SQL 2005 onwards I can see what is inserted
-- and what is deleted. Output columns: $action id val id val
SELECT * FROM Destination; -- see the result
GO
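For readers who want to reason about what the MERGE above does without a SQL Server instance at hand, here is a minimal Python sketch of the same matched / not matched logic. It is not how SQL Server implements MERGE; it assumes ids are unique, uses plain dictionaries in place of tables, and the function name and the extra row with id 2 are hypothetical. The returned tuples loosely mirror OUTPUT $action, inserted.*, deleted.*.

```python
def merge(target, source):
    """Upsert the rows of `source` into `target` (both map id -> val).

    Returns a list of (action, inserted_row, deleted_row) tuples.
    """
    output = []
    for row_id, val in source.items():
        if row_id in target:                      # WHEN MATCHED
            old = target[row_id]
            target[row_id] = val
            output.append(("UPDATE", (row_id, val), (row_id, old)))
        else:                                     # WHEN NOT MATCHED
            target[row_id] = val
            output.append(("INSERT", (row_id, val), None))
    return output


if __name__ == "__main__":
    destination = {1: "q"}
    source = {1: "A", 2: "B"}   # row 2 is an extra, hypothetical row
    for line in merge(destination, source):
        print(line)
    print(destination)
```

An update reports both the new and the old row, while an insert has no deleted row, which is the information the OUTPUT clause exposes.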
GROUPING SETS

GROUPING SETS are a new feature of SQL Server 2008. Using them allows multiple groupings to be returned in one result set. We will group on City and StateProvince in the same query.

use AdventureWorksLT
-- using GROUPING SETS in SQL 2008

TABLE VALUED PARAMETERS

In this example I will pass a table-valued parameter to the stored procedure.

-- using table-valued input parameters
ALTER PROCEDURE myProc (@tvp myTableType READONLY) AS
declare @table TABLE (id int)
UPDATE Inventory SET
NEW DATE TIME DATA TYPES

-- in a SQL 2008 instance
DECLARE @DATE DATE, @TIME TIME
DECLARE @DATETIMEOFFSET datetimeoffset

That's all for now!! I thought it would be easier to get the scripts from here rather than download them from my SkyDrive. It is not a very extensive list; it just covers the one-hour session which I took for beginner SQL users.

A quick look at CDC (Change Data Capture)
What is CDC: Change data capture (CDC) is a set of software design patterns used to determine (and track) the data that has changed so that action can be taken using the changed data. Also, CDC is an approach to data integration that is based on the identification, capture and delivery of the changes made to enterprise data sources. A very common requirement for a data warehouse load package is to determine what has changed in the source systems and load this data into the warehouse.

All I wanted to give was a script that you can use along with the AdventureWorks database to see the basics of CDC and how it works. Of course you can use it in your data warehouse SSIS packages.

USE AdventureWorks
--Enable Change Tracking on a table--
select * from HumanResources.Employee where Title = 'Production Technician - WC60'
--Get the Data --
select * from cdc.fn_cdc_get_all_changes_HumanResources_Employee
declare @minLSN varbinary(max), @maxLsn varbinary(max)
select @minLSN = sys.fn_cdc_get_min_lsn('HumanResources_Employee')
select * from cdc.fn_cdc_get_all_changes_HumanResources_Employee
--Disable CDC as I don't want this running on my Laptop-- */
--exec sp_configure

July 15

Dynamic Management Views for SSAS

From SQL Server 2005 onwards we are provided with a set of DMVs. These DMVs allow us to monitor things like which indexes are being used and which are not, where the IO bottlenecks are, what the cache hit ratio is, and other stuff. But what about Analysis Services? What if I want to know which of my hierarchies are being used, which aggregations are not being used, and so on? Well, from SQL 2008 onwards you have this capability. AS 2008 has:

--> 4 DBSCHEMA DMVs (database schema DMVs)
1. $SYSTEM.DBSCHEMA_CATALOGS
2. $SYSTEM.DBSCHEMA_COLUMNS
3. $SYSTEM.DBSCHEMA_PROVIDER_TYPES
4. $SYSTEM.DBSCHEMA_TABLES

--> 10 DMSCHEMA DMVs
1. $SYSTEM.DMSCHEMA_MINING_COLUMNS
2. $SYSTEM.DMSCHEMA_MINING_FUNCTIONS
3. $SYSTEM.DMSCHEMA_MINING_MODEL_CONTENT
4. $SYSTEM.DMSCHEMA_MINING_MODEL_CONTENT_PMML
5. $SYSTEM.DMSCHEMA_MINING_MODEL_XML
6. $SYSTEM.DMSCHEMA_MINING_MODELS
7. $SYSTEM.DMSCHEMA_MINING_SERVICE_PARAMETERS
8. $SYSTEM.DMSCHEMA_MINING_SERVICES
9. $SYSTEM.DMSCHEMA_MINING_STRUCTURE_COLUMNS
10. $SYSTEM.DMSCHEMA_MINING_STRUCTURES

These DMVs describe data mining models in the Analysis Services database.

--> 13 DMVs which describe the metadata of the Analysis Services database (cubes, partitions, hierarchies etc.)
1. $SYSTEM.MDSCHEMA_CUBES
2. $SYSTEM.MDSCHEMA_DIMENSIONS
3. $SYSTEM.MDSCHEMA_FUNCTIONS
4. $SYSTEM.MDSCHEMA_HIERARCHIES
5. $SYSTEM.MDSCHEMA_INPUT_DATASOURCES
6. $SYSTEM.MDSCHEMA_KPIS
7. $SYSTEM.MDSCHEMA_LEVELS
8. $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
9. $SYSTEM.MDSCHEMA_MEASUREGROUPS
10. $SYSTEM.MDSCHEMA_MEASURES
11. $SYSTEM.MDSCHEMA_MEMBERS
12. $SYSTEM.MDSCHEMA_PROPERTIES
13. $SYSTEM.MDSCHEMA_SETS
These DMVs expose a whole new gold mine for analysis and monitoring of SSAS. Have a look at some of the DMX queries below, which use these DMVs to expose some very useful information.

1. Open SQL Server Management Studio.
2. Open a DMX query window and connect to the Adventure Works cube (if you don't have the Adventure Works cube you can download it from codeplex.com).
3. Paste this query:

SELECT *

You will see that you have information about all the measures in the measure group. Here is sample output (only a few selected columns due to space constraints).
Now that we know how useful these DMVs can be, here are some scripts that can help you play around with them:

select * from $system.dbschema_tables
select * from $system.mdschema_cubes
select * from [Adventure Works].[$Product] --database dimension
select * from $system.discover_commands

SQL Server MVP Darren Gosbell has blogged about this and I suggest you look at his blog http://geekswithblogs.net/darrengosbell/Default.aspx (even otherwise it is a very good blog).

July 9

SQL BITS V [19th - 21st November 2009, Celtic Manor, Newport] www.sqlbits.com

We are pleased to announce SQLBits goes West, the 5th instalment of the SQLBits conferences. We are making it even bigger and better than last time, now spanning 3 days, and still keeping everything that has worked so well at previous events. The event will be held at the Celtic Manor Resort in Newport, South Wales, just off the M4 motorway. This is the biggest event yet, with 3 days of top quality SQL Server content. It starts with the pre-conference training day on Thursday 19th, more details coming soon. We have had a lot of feedback about weekday versus weekend during previous events, so this time we have added a paid conference day on Friday 20th, with a SQL 2008 and R2 theme. Finally we have the free Saturday community day, with speakers from around the world covering all manner of SQL Server topics. For more go to www.sqlbits.com

June 24

F#: Microsoft's answer to functional programming

"F# supports functional programming, object-oriented programming, and imperative programming." (MSDN)

Object-oriented programming (OOP) is a programming paradigm that uses "objects" (data structures consisting of data fields and methods) and their interactions to design applications and computer programs. Programming techniques may include features such as information hiding, data abstraction, encapsulation, modularity, polymorphism, and inheritance. It was not commonly used in mainstream software application development until the early 1990s.
Wikipedia: http://en.wikipedia.org/wiki/Object_oriented

Functional programming: It emphasizes the application of functions rather than changes of state. Wikipedia: http://en.wikipedia.org/wiki/Functional_programming

More information on F#: http://msdn.microsoft.com/en-gb/library/dd553242(VS.100).aspx. Don Syme from the Microsoft Research lab in Cambridge is the principal architect. Though this language looks like OCaml, it is much more than just OCaml. It also brings to the table the .NET Framework and a way to use F# classes from any other .NET language such as C#. So with this language I would say Microsoft has finally married off mathematical modellers and programmers and made a bigger, happier family. I will be blogging more on it.

June 13

Download Available (BI Evening, 10 June, Reading (UK): Attribute Relationships, Aggregations and using MDX Studio to its best)

I was a speaker at this BI evening at Microsoft Reading on Attribute Relationships, Aggregations and using MDX Studio to its best. Setting proper relationships for the attributes of a dimension is essential from a query performance point of view. It is one of the most important things in dimensional modelling. The slides and demo are available for download here.

June 1

June 10th BI Event in Reading

June 10th BI Event in Reading; SQL 2008 R2 & Gemini; Data Modelling to Info Architecture; Attribute Relationships, Aggregations & using MDX Studio to its best. For more information and to register: http://www.sqlserverfaq.com/events/168/SQL-2008-R2-and-Gemini-From-Data-Modelling-to-Information-Architecture-and-Attribute-Relationships-Aggregations-and-using-MDX-Studio-to-its-best.aspx

Join us for another UK SQL Server User Group meeting.

May 28

WolframAlpha

It is quite amazing how powerful data can be, and for someone who works in BI it is even more fascinating to see software that combines search capabilities with BI.
Today I stumbled across this post from Mosha at http://sqlblog.com/blogs/mosha/archive/2009/05/14/wolframalpha.aspx and went to the website http://www.wolframalpha.com/ for a quick test drive. With the hardware and computational power that is available to us, and the kind of computational ability that Wolfram talks about, it seems like the goal to "make all systematic knowledge immediately computable by anyone" is reachable.

May 17

Parent-Child Dimensions: Introduction, drawback and alternative approach

First of all, let's understand what parent-child dimensions are, and where and when they are modelled.

From MSDN:

A parent-child dimension is based on two dimension table columns that together define the lineage relationships among the members of the dimension. One column, called the member key column, identifies each member; the other column, called the parent key column, identifies the parent of each member. This information is used to create parent-child links, which are then combined into a single member hierarchy that represents a single metadata level.
For example, in the following Employee table, the column that identifies each member is Employee_Number. The column that identifies the parent of each member is Manager_Employee_Number. (This column stores the employee number of each employee's manager.)
Here is an example of how the data underlying a parent-child dimension might look.
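Since the original illustration is not reproduced here, the following Python sketch uses a few hypothetical rows to show the idea. The two key columns follow the Employee_Number / Manager_Employee_Number naming from the text above; the names, the values and the helper functions are assumptions made purely for illustration.

```python
# Hypothetical rows: (Employee_Number, Manager_Employee_Number, name).
# A manager of None marks the top of the hierarchy.
EMPLOYEES = [
    (1, None, "Ann"),
    (2, 1, "Bob"),
    (3, 1, "Cara"),
    (4, 2, "Dev"),
]


def build_children(rows):
    """Map each parent key to the list of its child member keys."""
    children = {}
    for member_key, parent_key, _name in rows:
        children.setdefault(parent_key, []).append(member_key)
    return children


def depth_of(rows, member_key):
    """Count the parent links between a member and the top of the hierarchy."""
    parent = {m: p for m, p, _name in rows}
    depth = 0
    while parent[member_key] is not None:
        member_key = parent[member_key]
        depth += 1
    return depth


if __name__ == "__main__":
    print(build_children(EMPLOYEES))
    for key, _parent, name in EMPLOYEES:
        print("  " * depth_of(EMPLOYEES, key) + name)
```

Members sit at different depths depending only on the data, which is why such a hierarchy has no fixed set of levels.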
Important If a parent-child dimension is included in a cube with a fact table that has rows associated with the dimension's nonleaf members, you must set the dimension's Members With Data property to Nonleaf data visible or Nonleaf data hidden. Otherwise, processing the cube fails. The Members With Data property indicates whether nonleaf members of a parent-child dimension are allowed to have associated fact table data. By default, nonleaf members are not allowed to have associated fact table data, so the property is initially set to Leaf members only.
Limitations of Parent-Child Dimensions

Parent-child dimensions do provide flexibility when modelling dimensions like an employee organization structure and other self-referencing dimensions, but beware that this flexibility is not free. The primary issue is that since there is no consistent levelling, you cannot have pre-calculated aggregates for intermediate levels. If you go to Mosha Pasumansky's blog at the link given below, you will find the CodePlex link to the PCDimNaturalizer project by Jon Burchel (a Senior Support Escalation Engineer at Microsoft). http://sqlblog.com/blogs/mosha/archive/2008/08/25/parent-child-dimension-table-naturalizer.aspx

NOTE: It has also been added to BIDS Helper's new release http://bidshelper.codeplex.com/Wiki/View.aspx?title=Parent-Child%20Dimension%20Naturalizer but if you want to call it from SSIS or an external application you will need the binaries, which can be downloaded from http://pcdimnaturalize.codeplex.com/
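A rough idea of what "naturalizing" a parent-child hierarchy means can be sketched as below. This is not the PCDimNaturalizer code, just an illustration under stated assumptions: each member's chain of ancestors is written out into a fixed number of level columns, and shorter branches are padded by repeating the member itself (one possible convention; the actual tool may handle ragged branches differently). The function name and sample keys are hypothetical.

```python
def naturalize(parent_of, levels):
    """Flatten a parent-child mapping into one row of `levels` columns per member.

    `parent_of` maps member key -> parent key (None at the top).
    """
    rows = {}
    for member in parent_of:
        path = []
        node = member
        while node is not None:          # walk up to the root
            path.append(node)
            node = parent_of[node]
        path.reverse()                   # root first, member last
        if len(path) > levels:
            raise ValueError("hierarchy is deeper than the requested levels")
        # Pad shorter branches by repeating the member itself.
        rows[member] = path + [member] * (levels - len(path))
    return rows


if __name__ == "__main__":
    parent_of = {1: None, 2: 1, 3: 1, 4: 2}   # hypothetical keys
    for member, row in naturalize(parent_of, 3).items():
        print(member, row)
```

Once every member has the same number of level columns, the dimension can be modelled with regular levels.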
March 30

SQL Bits IV Presentations

SQL Bits IV is over. By far this was the largest SQL Server community event that I have spoken at. The PowerPoint decks are uploaded and can be downloaded from the website http://www.sqlbits.com/information/PublicSessions.aspx --> right-click on a session and open it in a new window. You will find most of the sessions' PPT decks.
Alternatively you can download it from here
March 20

Creating Sub Cubes - Visual and Non Visual Mode

With AS 2008, you can create a subcube in visual and non-visual modes. First of all, what are visual and non-visual modes? I will try to write some simple MDX to demonstrate them.
Run this query without any subcube:
select {[Measures].[Reseller Sales Amount] } on 0,[Business Type].members on 1
from where

You get
Alright so this is the All Resellers total Value ($80,450,596.98)
Now let's run the next MDX, but this time with a simple subcube
CREATE Select {[Business Type].[Value Added Reseller], [Business Type].[Warehouse]} from with member ([Category].Accessories,[Measures].[Reseller Sales Amount])+([Category].Clothing,[Measures].[Reseller Sales Amount])
select [Business Type]. from where
You will see that the visual total is equal to the reseller total, which is much less than the actual total. Why does this happen? Because a total on a subcube that considers only visual totals does not aggregate the other values (of Reseller Sales here). Sometimes this is what a business scenario requires, but many times you will find your customers want to see the full figure. In this case the query below will show both the subcube total and the non-visual total. For this we will create a subcube in NON VISUAL mode (this is available from AS 2008).
CREATE NON {[Business Type].[Value Added Reseller], [Business Type].[Warehouse]} from
select [Category].members on 0, [Business Type].members on 1
from [Adventure Works]
where [Measures].[Reseller Sales Amount]

with member Measures.VisualSum as
    ([Category].Accessories, [Measures].[Reseller Sales Amount]) + ([Category].Clothing, [Measures].[Reseller Sales Amount])
select {[Measures].[Reseller Sales Amount], Measures.VisualSum} on 0, [Business Type].members on 1
from [Adventure Works]
where [Category].members
this is what you get
Note that the non-visual total, $80,450,596.98, is computed without considering the subcube and is equal to the first total we had, which was on the complete cube, while the visual total is the same as the total we had for the subcube in visual mode.
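The difference between the two totals can be shown with a few lines of Python. This is only a sketch of the arithmetic, not of how Analysis Services evaluates a subcube; the sales figures are made up and the function names are hypothetical.

```python
# Hypothetical sales by business type; not the Adventure Works figures.
SALES = {
    "Specialty Bike Shop": 10.0,
    "Value Added Reseller": 30.0,
    "Warehouse": 60.0,
}
# Members kept by the subcube, as in the CREATE SUBCUBE statements above.
SUBCUBE = ["Value Added Reseller", "Warehouse"]


def visual_total(sales, visible):
    """Total over the members that remain visible in the subcube."""
    return sum(sales[member] for member in visible)


def non_visual_total(sales):
    """Total over every member, ignoring the subcube restriction."""
    return sum(sales.values())


if __name__ == "__main__":
    print("visual total:", visual_total(SALES, SUBCUBE))
    print("non-visual total:", non_visual_total(SALES))
```

In visual mode the All member reflects only what the user can see; in non-visual mode it keeps the value it has on the complete cube.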
I have tried my best to explain what visual and non-visual totals are and how CREATE SUBCUBE can be used in both modes. For more, do go to http://msdn.microsoft.com/en-us/library/ms144916.aspx
January 26

Dynamically creating Excel Sheets and populating data from SQL Server tables

I came across this very interesting question on the Microsoft forums, where the developer wanted to create an SSIS package which loops through a set of tables, creates Excel files on the fly and loads data into them. There can be many ways to solve this, including building a custom .NET component which creates the Excel sheet on the fly and then dynamically loads the data into it. However, I came across this cool SP that does the job really well and in very quick steps.
Here is how you can do this --
January 25

Conditional Split over Datetime Column in SSIS

There are times when we need to branch the flow of our ETL package based on a datetime column of the source file (table). I have seen some questions being asked on the Microsoft forums by developers facing this error:

the output evaluated to NULL, but the 'component conditional split' requires a boolean.

This happens when there are records in the datetime column which are NULL, and thus the conditional split expression fails when trying to split NULLs. The solution is:

-- CHECK IF ISNULL ---> do something
-- ELSE --> do the split as per logic

Here is what your expression should look like to handle this:

ISNULL([MY_DATE_COLUMN]) ? FALSE : [MY_DATE_COLUMN] > @[User::MinDate]

December 11

Dynamic SQL - A quick tip

Many times I am asked, both personally and on the Microsoft forums, how to build a dynamic SQL query. Even though there are many reasons not to use dynamic SQL, like

-- It can cause SP recompiles every time it runs, so you lose the benefit of a cached execution plan

-- There are security issues like SQL injection etc.

let's leave that behind. Dynamic SQL is "can't live with it, can't live without it" stuff.

Let's say we want an integer variable

declare @var int
declare @query varchar(max) --use Max when you have it...
select @var = 10
Set @query = 'select * from Mytable where MyVariable = '+CAST(@var AS VARCHAR) -- cast the integer to varchar before concatenating
print @query
--here it is output
--select * from Mytable where MyVariable = 10
I am sure this is fairly trivial stuff, but I hope it helps someone sometime.

Let's see how to build a dynamic query with a varchar variable
declare @var varchar(10)
declare @query varchar(max) --use Max when you have it...
select @var = 'Test'
Set @query = 'select * from Mytable where MyVariable = '+''''+@var+'''' -- So just put four quotes both sides
print @query
--Here is the output
--select * from Mytable where MyVariable = 'Test'
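To make the SQL injection concern mentioned earlier concrete, here is a small Python sketch. It deliberately swaps SQL Server for the sqlite3 module from the Python standard library so that it can run anywhere, and the helper names are hypothetical. It builds the same kind of statement once by concatenating quotes around the value and once with a parameter placeholder. In T-SQL the closest counterpart to the second form is passing parameters to sp_executesql instead of concatenating them into the string.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for Mytable with two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Mytable (MyVariable text)")
    conn.executemany("insert into Mytable values (?)", [("Test",), ("Other",)])
    return conn


def find_concatenated(conn, value):
    """Build the statement by string concatenation, as in the T-SQL above."""
    query = "select * from Mytable where MyVariable = '" + value + "'"
    return conn.execute(query).fetchall()


def find_parameterized(conn, value):
    """Pass the value separately so it is never parsed as SQL."""
    return conn.execute(
        "select * from Mytable where MyVariable = ?", (value,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    hostile = "x' or '1'='1"
    print(find_concatenated(conn, hostile))    # every row comes back
    print(find_parameterized(conn, hostile))   # no row matches
```

Both forms return the same rows for an ordinary value such as 'Test'; they differ only when the value itself contains quotes.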
Lets say we want a