Published by Lóránd Mészáros. Modified more than 10 years ago.
1
Meeting business needs with SQL Server Analysis (OLAP) Services
Attila Kővári, OLAP group lead, NOVOSYS Kft.
This session will cover the features of Analysis Services, formerly known as OLAP Services, and their relationships to real-world business scenarios. It will show how Analysis Services provides a flexible end-user tool that offers excellent query performance, powerful analytics, and calculations without sacrificing data consistency or the ability to handle large data volumes. In the process, it will demonstrate many of the new dimension features, such as parent-child dimensions, custom rollups and formulas in a dimension, and changing dimensions. It will also demonstrate the security enhancements, advanced analytics features, and scalability enhancements.
2
Contents: Fundamentals; Architecture of the Analysis (OLAP) Server
What is OLAP?; Requirements placed on OLAP; OLAP and DW, database building (theory); Architecture of the Analysis (OLAP) Server; Storage; Aggregation; Database building; Sales analysis; Financial modeling; Handling large data volumes; Analysis over the Internet
3
… 10 years ago: something has to be figured out
Too much data, too little information. Do while I still have patience: Question; Report; End. Management wants the answers by yesterday.
4
The solution. The goal: a technology or data model that meets the following requirements: speed; user friendliness; a focus on analysis and decision making. A new term was born: OLAP.
OLAP stands for on-line analytical processing. The term was coined in the early nineties by E.F. Codd, who had been hired by Arbor Software (a new company with a multi-dimensional database called Essbase) to write a report on multi-dimensional reporting; he used the phrase to differentiate this type of system from OLTP (on-line transaction processing) systems. Multi-dimensional reporting was not a new concept in the early nineties; people had been doing it for more than 20 years, and it simply got a name. (This category is generally synonymous with EIS, Executive Information Systems; DSS, Decision Support Systems; and BI, Business Intelligence.) The key characteristics of OLAP systems are that they provide a multi-dimensional view of the data in an intuitive model designed around specific business requirements. Ultimately, OLAP is best known for offering excellent query performance and flexible navigation.
5
What is OLAP? 2. The meaning of the term
Acronym: On-Line Analytical Processing. What “On-Line” means in OLAP: OLAP ≠ on-line data access to the transactional systems. OLAP = retrieving the information needed to make any given decision must take no more than a few seconds.
6
Achieving speed
To achieve speed, we have to break with the existing data models: the information needed for analysis must be stored in a new place or in a new model.
7
Achieving speed: three options
I. New data models built on relational foundations (star and snowflake schemas). II. A new storage structure: the multidimensional database emerged. III. A new technology combining the advantages of both. ROLAP, MOLAP (OLAP), HOLAP.
8
Achieving speed: separated databases
Operational systems: elementary transactions. Transfer of the information needed for analysis: a one-way, periodically repeated, controlled process. Decision support (OLAP) systems: elementary and aggregated data.
9
Achieving a focus on analysis 1. Technical requirements
Store only the information needed for decision making and analysis. Aggregation. Time series.
10
Achieving a focus on analysis 1. Functional requirements
Multidimensional view: slicing, drill-down, rotation, nesting.
[Figure: cube with a customer axis (Foreign, Domestic, All customers), a product axis (All products, Apple juice, Orange juice, Peach juice), and a time axis (Q1, Q2, Q3, Q4)]
With OLAP you can quickly explore data by drilling up and drilling down to change the level of granularity that you are looking at for a particular dimension. You can twist and pivot to change the dimensions by which you view the data. Ultimately, OLAP cubes are designed to answer a series of business questions, allowing users to quickly explore the data and perform analysis without going back to a DBA to write a new query, and with data returned from each pivot within seconds.
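The slicing, drill-down, and rotation operations named on this slide can be sketched on a tiny in-memory cube. This is only an illustration under stated assumptions: the cell values, the dimension names in `DIMS`, and the helper functions `slice_cube` and `roll_up` are hypothetical and are not part of Analysis Services.

```python
from collections import defaultdict

# Hypothetical cells: (customer group, product, quarter) -> liters sold.
cells = {
    ("Domestic", "Apple juice", "Q1"): 120,
    ("Domestic", "Orange juice", "Q1"): 80,
    ("Foreign", "Apple juice", "Q1"): 40,
    ("Domestic", "Apple juice", "Q2"): 150,
    ("Foreign", "Peach juice", "Q2"): 30,
}
DIMS = ("customer", "product", "quarter")


def slice_cube(cells, dim, member):
    """Slicing: keep only the cells where one dimension equals one member."""
    i = DIMS.index(dim)
    return {key: value for key, value in cells.items() if key[i] == member}


def roll_up(cells, keep):
    """Sum away every dimension not listed in `keep`, in the order given."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cells.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


q1 = slice_cube(cells, "quarter", "Q1")                      # slice on time
by_product = roll_up(q1, ["product"])                        # coarse view
by_product_customer = roll_up(q1, ["product", "customer"])   # drill down one level
by_customer_product = roll_up(q1, ["customer", "product"])   # rotated view
grand_total = roll_up(cells, [])                             # everything rolled up
```

Drilling down adds a dimension to the grouping, and rotating only changes the order of the grouping; a real OLAP server would normally answer these requests from stored aggregates rather than rescanning the cells.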
11
Achieving a focus on analysis 2. Database objects
Database; cube; dimensions; hierarchies; measures.
[Figure: cube with a customer axis (Foreign, Domestic, All customers), a product axis (All products, Apple juice, Orange juice, Peach juice), and a time axis (Q1, Q2, Q3, Q4)]
Cubes are the logical storage medium in OLAP. (A cube is to OLAP what a table is to an RDBMS.) Don’t be fooled by the name, however: “cube” is a misnomer. Cubes are not limited to three dimensions. In fact, Analysis Services can have up to 128 dimensions in a cube. [Note: in 7.0 the limit was 64.] In Analysis Services, cubes are grouped together in a “database”. Each cube is made up of dimensions and measures. (Measures are the numeric data elements of principal interest, and dimensions are the categories by which you view those numeric data elements.)
12
Summary 1. OLAP is a technology: it can be relational, multidimensional, or both. The key is speed (query speed): it depends on the storage type and on the number of dimension members; it does NOT depend on the database size or on the number of dimensions. Analytical capability depends on the data content and on the client.
13
Summary 2.
OLAP: multidimensional systems supporting strategic decision making; slow data loading; complex queries that are easy and quick to build. Operational systems (OLTP): transactional systems supporting daily operations (mission critical); fast data loading and efficient storage; complex queries that are hard and slow to build. Spreadsheets: efficient analysis of small amounts of data; inconsistency (no common coding system).
14
Summary 3. The following systems can use OLAP technology: Management Information Systems (VIR, MIS); Decision Support Systems (DSS); Business Intelligence (BI); Executive Information Systems (EIS).
15
Contents: Fundamentals; Architecture of the Analysis (OLAP) Server
What is OLAP?; Requirements placed on OLAP; OLAP and DW, database building (theory); Architecture of the Analysis (OLAP) Server; Storage; Aggregation; Database building; Sales analysis; Financial modeling; Handling large data volumes; Analysis over the Internet
16
OLAP and Data Warehousing: the process
[Diagram labels: OLTP data; external data; transformation; data warehouse, data store; OLAP; users. OLAP is the users' tool.]
So far we have seen how flexibly OLAP can meet analysts' needs. Let us look at how we can create cubes and how OLAP fits into data warehousing terminology. How do we design and create a cube? Design is the hardest task (starting from today's problems, we have to build something that can answer the questions that arise in the future). Specification: requirements gathering; rough specification (its purpose is to identify the required data); locating the required data; final specification; implementation.
Now that you have seen what this technology is capable of delivering to the end-user experience, how do you create a cube, and where does Analysis Services fit into the overall data warehouse picture? Let's answer the second question first. OLAP is just another data mart in your data warehouse. It is really the icing on the data warehousing cake because it offers such an intuitive user experience. Now let's answer the first question: how do you design and build an OLAP cube? Designing an OLAP data mart is really the most difficult part of the job. Understanding end-user requirements for today's analysis and reporting needs while trying to anticipate tomorrow's requirements is pivotal to the success of the cube. Ask the users which categories they need to view the measures by, and then understand to what level of detail they will typically need to explore this data. Once the user requirements are well understood, you must identify where you can get the detailed data elements for the cube. If a data warehouse is already in place, this may be a simple task of writing a view to create the star schema from which the cube will be built, or perhaps an existing star or snowflake schema will work. If there is no warehouse, you must identify the source or transaction systems in which this data resides, integrate that data, transform it, cleanse it, and store the results in a star or snowflake schema. Cubes are then built from this staged data.
17
Database building (theory): relational elements
Building a multidimensional database requires the following elements*: dimension tables; fact tables; a star schema or snowflake schema describing how they relate. * SQL Server is not required, but it is advisable. Assume the final logical specification is ready.
We talked about building star and snowflake schemas in our relational database, but what is a star or a snowflake schema? In this section we will define the following terms: fact table, dimension table, star schema, and snowflake schema, the relational elements required to build a cube.
18
Database building (theory): dimension table
VevoCode | VevoDesc | RegioCode | OrszCode
key1met | Metro | Regio1 | H
key1int | Interfruct | |
key2pro | Profi | Regio2 |
Nemzbul | Shifer | NULL | D
Dimension tables contain the identifiers, names, properties, and parent-child relationships of the individual dimension members. Dimension tables are denormalized.
A dimension table provides unique identifiers, hierarchies, descriptions, and member properties. For example, Region_id 101 and 102 are aggregated to Territory North, and 103 and 104 to South. Territory and Region_id represent the two levels of detail in this dimension. Region_name is a description, or alias, that can be used to describe the Region_id. Finally, Manager might be a particular attribute of that Region_id. All of this information can be used to help build the actual dimension.
19
Database building (theory): dimension table
The information describing a dimension can be placed in one or more tables. If a dimension carries several hierarchies with different numbers of levels, it is advisable to store them in separate tables.
20
Database building (theory): fact tables
Columns: Date, Product, Customer (dimensions); Liters, Forints (measures). The fact table is the relational counterpart of the cube.
The fact table is simply the relational version of a cube at its lowest level of detail (the cube without any aggregations). The fact table consists of dimensions and measures. (Remember, cubes are simply dimensions and measures.) Every dimension and every measure that is in the cube must be in the fact table. In this simple example our cube might have a Time dimension, a Product dimension, and a Customer dimension, and two measures: units and dollars.
21
Database building (theory): fact table
Datum | Cikk | Vevo | Liter | Forint
99/1/1 | ALMA10 | Key1int | 250 | 3 295
 | | Key1met | 92 | 1 422
 | Kajszi10 | | 105 | 1 750
 | | | 81 | 1 090
99/1/2 | | | 125 | 2 105
 | | | 302 | 3 988
 | | | 144 | 2 675
 | | | 171 | 3 009
If we were to browse this table, it might look like the following. In the date column we see each day, which is the lowest level of granularity; for product we see each SKU, and for customer we see each individual customer. The numeric values are associated with each dimension and are categorized by each measure: units and dollars. We see 250 units and $3,295 for January 2 for SKU 101 for Customer Jones.
22
Database building (theory): the star schema
A star schema consists of one central fact table and dimension tables arranged around it in the shape of a star.
A star schema is simply a central fact table joined to related dimension tables, forming the shape of a star. Each dimension is defined by a single dimension table. (This represents a denormalized structure because typically multiple hierarchies are included in the single dimension table.)
23
Database building (theory): star schema
[Diagram labels: fact table; dimension table]
Here is a typical example of a star schema. The fact table is joined to each dimension table in a one-to-many relationship between the foreign keys in the fact table and their corresponding dimension primary keys. Again, notice the shape of the star.
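The one-to-many joins between a fact table and its dimension tables can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not the schema used in the talk: the table names `dim_customer`, `dim_product`, and `fact_sales`, the product descriptions, and the sample rows are assumptions loosely modelled on the column names shown on the earlier slides.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Dimension tables: one row per member, keyed by its code (hypothetical data).
CREATE TABLE dim_customer (VevoCode TEXT PRIMARY KEY, VevoDesc TEXT, RegioCode TEXT);
CREATE TABLE dim_product  (CikkCode TEXT PRIMARY KEY, CikkDesc TEXT);

-- Fact table: foreign keys to the dimensions plus the two measures.
CREATE TABLE fact_sales (
    Datum  TEXT,
    Cikk   TEXT REFERENCES dim_product(CikkCode),
    Vevo   TEXT REFERENCES dim_customer(VevoCode),
    Liter  INTEGER,
    Forint INTEGER
);

INSERT INTO dim_customer VALUES
    ('key1met', 'Metro', 'Regio1'),
    ('key1int', 'Interfruct', 'Regio1');
INSERT INTO dim_product VALUES
    ('ALMA10', 'Apple juice'),
    ('Kajszi10', 'Apricot juice');
INSERT INTO fact_sales VALUES
    ('1999-01-01', 'ALMA10',   'key1int', 250, 3295),
    ('1999-01-01', 'ALMA10',   'key1met',  92, 1422),
    ('1999-01-01', 'Kajszi10', 'key1int', 105, 1750);
""")

# Join the fact table to each dimension and aggregate the measures
# at the region / product level.
rows = con.execute("""
    SELECT c.RegioCode, p.CikkDesc, SUM(f.Liter), SUM(f.Forint)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.VevoCode = f.Vevo
    JOIN dim_product  AS p ON p.CikkCode = f.Cikk
    GROUP BY c.RegioCode, p.CikkDesc
    ORDER BY p.CikkDesc
""").fetchall()
```

Each fact row matches exactly one row in every dimension table, so the joins never multiply the measures; this is the property a cube build relies on when it reads a star schema.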
24
Database building (theory): snowflake schema
The hierarchy of a dimension is described by several dimension tables. More normalized than the star schema. Weaker performance. Harder to take in at a glance.
A snowflake schema is one in which the dimensional hierarchy is defined by multiple dimension tables. This usually means a greater degree of normalization compared to a star, but it often results in a slight performance penalty when building the dimension because of the extra join(s). [Note: which schema is better really depends. A fully normalized schema reduces the risk of data integrity issues but increases the processing time to build a dimension.]
25
Database building (theory): snowflake schema
[Diagram labels: dimension tables; fact table]
Notice the two dimension tables cascading off the central fact table, resembling the shape of a snowflake.
26
OLAP and Data Warehousing: data loading
[Diagram labels: OLTP data; external data; DTS; DW store; OLAP Server; OLE DB for OLAP, ADO-MD; client application]
Since data is typically staged in a relational star or snowflake schema, you will need some assistance moving and transforming transaction data into the staging area. SQL Server includes Data Transformation Services (DTS) to assist in this area. DTS can be used to import to or extract from relational and non-relational data sources. It can be used to perform scheduled automated loads, to develop common and reusable data transformations and tasks that can be shared throughout an organization, and to track metadata about data loads. Even though DTS is part of SQL Server, it can be used to move data between any OLE DB or ODBC data sources. We will create a simple package to demonstrate how DTS could be used to help stage the data.
27
OLAP and Data Warehousing: loading the Costs cube
At the end of the presentation we will calculate profit as a proportion of revenue. Costs cube; Sales cube. Structure of the Costs cube: dimensions: Products, Periods, Indicators (fixed and variable costs); measures: HUF. Demo…
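The revenue-proportional profit mentioned on this slide combines the two cubes: revenue comes from the Sales cube, fixed and variable costs from the Costs cube. The sketch below assumes the usual definition, (revenue - total cost) / revenue; the dictionaries `sales` and `costs`, their keys, and the figures are hypothetical stand-ins for the cubes.

```python
# Hypothetical Sales cube cells: (product, period) -> revenue in HUF.
sales = {("Apple juice", "2000-01"): 1000.0}

# Hypothetical Costs cube cells: (product, period, indicator) -> cost in HUF.
costs = {
    ("Apple juice", "2000-01", "fixed"): 200.0,
    ("Apple juice", "2000-01", "variable"): 550.0,
}


def return_on_sales(product, period):
    """Profit as a share of revenue for one product and period."""
    revenue = sales[(product, period)]
    total_cost = sum(
        value
        for (p, t, _indicator), value in costs.items()
        if (p, t) == (product, period)
    )
    return (revenue - total_cost) / revenue


margin = return_on_sales("Apple juice", "2000-01")
```

The two cubes share the Products and Periods dimensions, which is what makes a cell-by-cell combination like this possible.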
28
Summary: the place of OLAP in DW terminology; star and snowflake schemas
How we can load the relational database and the cubes
29
Contents: Fundamentals; Architecture of the Analysis (OLAP) Server
What is OLAP?; Requirements placed on OLAP; OLAP and DW, database building (theory); Architecture of the Analysis (OLAP) Server; Storage; Aggregation; Database building; Sales analysis; Financial modeling; Handling large data volumes; Analysis over the Internet
30
Analysis Services architecture
[Diagram labels: Application; ADO MD; OLE DB for OLAP; OLE DB for DM; PivotTable Service; TCP/IP, HTTP; Analysis Server (OLAP engine, Data Mining engine, OLAP store); DSO; Analysis Manager; SQL Server data warehouse; other OLE DB data source; Processing; Querying]
Analysis Server is an OLE DB and OLE DB for OLAP provider that runs as a service on your NT or Windows 2000 server machine. The server name is always the name of your NT/Windows 2000 machine, and therefore you can only have one OLAP server per NT box. The server is really made up of two discrete processing engines: the OLAP engine, for building and processing cubes, and the data mining engine, for building data mining models. DSO (Decision Support Objects) is the proprietary API (application programming interface) that is used to administer the Analysis server. Analysis Manager is the server administration user interface that comes with the product and is written in DSO. Data can be loaded from a SQL Server data warehouse or any OLE DB provider. Once processed, the data can be stored in an OLAP store, back in the relational database, or in some combination of the two (which we will talk about in a little more detail shortly).
The client component of Analysis Services is called PivotTable Service. PivotTable Service is an in-process desktop OLAP server. It is required to query the Analysis server. It ships with Office 2000, SQL Server, Dev Studio, and third-party client tools, just to name a few. It can talk to the server using either the TCP/IP or the HTTP protocol. PivotTable Service contains a subset of the OLAP server functionality, such as in-memory data and query caching, a multi-dimensional formula engine (MDX), and local cube persistence. A smaller version of PTS is also available for a thin-client deployment of Analysis Services. PTS is an OLE DB, OLE DB for OLAP, and OLE DB for DM provider. [Note: OLE DB is a COM-based application programming interface (API) for accessing data.] OLE DB for OLAP provides multi-dimensional extensions to the OLE DB object model, and OLE DB for DM provides data-mining-specific extensions to the model. This object model is typically used by professional application developers and C/C++ programs. ADO MD provides multi-dimensional extensions to the ActiveX Data Objects (ADO) API. This API is an easy-to-use wrapper for OLE DB, to be used in languages such as VB, VBA, Active Server Pages, etc.
Various applications have been created to work with Analysis Services. ProClarity, Work, Max, Wired for OLAP, PowerPlay, and NovaView are just some of the applications that sit on the server. For a complete list of applications that are part of the DW Framework, check out the Microsoft web site. Everything in the white-dotted box is Analysis Services, with the green items representing the processing portion of the product and everything in blue representing the querying side.
31
Analysis Services architecture: flexible OLAP storage
Users and applications see only the cube as a structure.
Microsoft Analysis Services has a flexible data storage architecture. There are three storage options to choose from: MOLAP (multi-dimensional OLAP), ROLAP (relational OLAP), and HOLAP (hybrid OLAP).
MOLAP copies the detailed data from the relational database and stores both detailed data and aggregates in the multi-dimensional store. After a cube is processed, the relational database could be disconnected and users could still access all of the data in the cube. MOLAP typically offers the fastest query performance and often consumes less disk space than ROLAP because of the compression algorithm that is used.
ROLAP keeps the detailed data in the RDBMS and writes the aggregate tables to the relational store as well. The only things residing on the OLAP server are the dimensional structures for the MOLAP dimensions. ROLAP offers the slowest query performance of the three modes, since relational queries must be issued each time, although the cache, which we will describe shortly, can help minimize this impact. ROLAP offers the benefit of aggregate tables being available to standard relational query tools, and the ability to have real-time OLAP (ROLAP with 0 aggregations).
HOLAP is a compromise between the other two options. Detailed data stays in the RDBMS and aggregates are stored in the multi-dimensional store. This typically offers the quickest processing time and very good performance for viewing aggregates. If users will access detailed data only infrequently, this is a good compromise.
Regardless of storage option, users and applications ONLY see cubes. The storage options merely affect query performance, particularly when the cache is cold.
32
Analysis Services architecture: client/server cache
The client computes too.
Query 1: Jan98, Feb98, and Mar98 sales. Query 2: Q1 98 sales. Query 3: Q1 98 and Q1 97 sales (only Q1 97 is needed).
Client cache: 1) Jan98, Feb98, and Mar98 sales; 2) Q1 98 sales; 3) Q1 97 sales.
Server cache: 1) Jan98, Feb98, and Mar98 sales; 3) Q1 97 sales.
Analysis Services, unlike some of the other OLAP tools, offers client and server caching of queries. This offers enhanced performance for queries that are frequently requested. The cache works as follows. Client A comes along and asks to see Jan98, Feb98, and Mar98 sales. The client machine checks its cache to see whether this is stored locally. If not, it sends the query across the network to the server. The server checks its own cache, and if the result is not there, the query is issued. The results are then stored in the server cache, sent back to the client, and stored in the client's cache.
Now the user asks to see Q1 98 sales. The client can calculate! It knows that it has enough information in the client cache to calculate the answer. The result is returned without having to issue a request across the network or query the server. This feature not only reduces network traffic but also enables the server to handle more users, since not all requests are answered directly by the server.
Now a third query is requested from the client, asking for Q1 98 and Q1 97 sales. The intelligent cache is smart enough to know that it only needs Q1 97 from the server, so it issues that request to the server. (Had another client requested this item, the answer might have been available in the server cache, but in this case there is no entry.) The server issues the query, stores the result in the server cache, and sends that information back to the client, which stores it in the client cache.
This feature narrows the differences in query performance between the storage options for frequently requested items. Also, an intelligent algorithm is used for managing the cache. It doesn’t simply do first in, first out, but also looks at the number of times queries are requested.
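The three-query walkthrough above can be simulated with two small classes. This is a simplified sketch and not how PivotTable Service is implemented: the `Server` and `Client` classes, the `QUARTERS` mapping, and the sales figures are hypothetical, everything is cached at month level, and the cache replacement policy mentioned in the notes is not modelled.

```python
QUARTERS = {
    "Q1 98": ["Jan98", "Feb98", "Mar98"],
    "Q1 97": ["Jan97", "Feb97", "Mar97"],
}


class Server:
    """Stand-in for the Analysis server: monthly sales behind a server cache."""

    def __init__(self, facts):
        self.facts = facts      # month -> sales, plays the role of the cube
        self.cache = {}
        self.cube_reads = 0     # how often the cube itself had to be read

    def get(self, month):
        if month not in self.cache:
            self.cube_reads += 1
            self.cache[month] = self.facts[month]
        return self.cache[month]


class Client:
    """Stand-in for the client: asks the server only for what it lacks."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.round_trips = 0    # requests sent across the network

    def months(self, wanted):
        missing = [m for m in wanted if m not in self.cache]
        if missing:
            self.round_trips += 1
            for m in missing:
                self.cache[m] = self.server.get(m)
        return {m: self.cache[m] for m in wanted}

    def quarter(self, name):
        # The client computes the quarter from cached months where it can.
        return sum(self.months(QUARTERS[name]).values())


facts = {"Jan98": 10, "Feb98": 20, "Mar98": 30, "Jan97": 1, "Feb97": 2, "Mar97": 3}
server = Server(facts)
client = Client(server)

client.months(["Jan98", "Feb98", "Mar98"])      # query 1: goes to the server
trips_after_q1 = client.round_trips
q1_98 = client.quarter("Q1 98")                 # query 2: answered locally
trips_after_q2 = client.round_trips
q1_98_again = client.quarter("Q1 98")           # query 3: Q1 98 is already local,
q1_97 = client.quarter("Q1 97")                 # only the 1997 months are fetched
trips_after_q3 = client.round_trips
```

The round-trip counter stays at one after the second query and rises to two only when the 1997 data is requested, which mirrors the behaviour described in the notes.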
33
Analysis Services architecture: data storage, no data explosion
A long-standing problem of other OLAP systems: storing empty cells; storing aggregations. 100% dense storage: empty cells are not stored. Intelligent aggregations: only a (small) part of the possible aggregations is calculated in advance. Data compression algorithms.
Not only does Analysis Services improve query performance by caching results, but it also manages data explosion with its unique aggregation architecture. A historical weakness of OLAP tools is the data explosion problem: 100 MB of relational data, when aggregated in a multi-dimensional store, could result in an OLAP cube of several gigabytes. Analysis Services does a very good job of managing data explosion. First, the cube is 100% dense. If no heaters were sold in Phoenix in July, no storage is allocated. Second, only a subset of the potential aggregates is pre-calculated and stored. As a result, the cubes take less time to process and consume less disk space. It is not unheard of in Analysis Services to have a MOLAP cube with aggregations that actually consumes LESS disk space than the original un-aggregated relational data. This is because Analysis Services uses a compression algorithm when storing the data to disk. The most unique architectural feature is the intelligent aggregation design.
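The idea of not allocating storage for empty cells can be illustrated with a dictionary keyed by cell coordinates. This is only an analogy for the principle, not the Analysis Services storage format; the cities, products, months, and sales figures are hypothetical.

```python
from itertools import product

cities = ["Phoenix", "Seattle", "Boston"]
products = ["Heater", "Fan"]
months = ["Jun", "Jul", "Aug"]

# Only combinations that actually had sales get an entry.
sales = {
    ("Seattle", "Heater", "Jun"): 12,
    ("Phoenix", "Fan", "Jul"): 40,
    ("Boston", "Fan", "Aug"): 25,
}

# A fully allocated array would need one slot per possible combination.
possible_cells = len(list(product(cities, products, months)))
stored_cells = len(sales)


def lookup(city, item, month):
    """An absent key reads as an empty cell; nothing was stored for it."""
    return sales.get((city, item, month))


phoenix_heaters_july = lookup("Phoenix", "Heater", "Jul")
```

With three dimensions of this size there are 18 possible cells but only 3 stored ones, and the gap widens quickly as dimensions are added.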
34
Analysis Services architecture: partial aggregation 1.
35
Analysis Services architecture: partial aggregation 2.
Query: give me all sales, for all products, for all . . .
[Diagram labels, top to bottom: highest level of aggregation; most detailed aggregations; fact table]
The aggregation design wizard in Analysis Services determines the aggregations that offer the most bang for the buck. The aggregation designer looks at the number of members at each level of every dimension and, based on a complex algorithm, identifies a series of aggregates that will help to answer many of the different requests. The goal is to have most questions answered by these aggregations, so that the detailed data from the fact table is only infrequently needed to answer a query. In this example the majority of the queries can be answered from the aggregates.
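A rough way to picture partial aggregation: store only a few of the possible aggregates, answer each query from the smallest stored aggregate that contains the requested columns, and fall back to the fact table otherwise. The sketch below is an assumption-laden illustration; it does not reproduce the wizard's selection algorithm, and the fact rows, column names, and the choice of which two aggregates to store are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, product, customer, liters).
fact = [
    ("1999", "Apple", "Metro", 250),
    ("1999", "Apple", "Profi", 92),
    ("1999", "Peach", "Metro", 105),
    ("2000", "Apple", "Metro", 125),
]
COLS = ("year", "product", "customer")


def aggregate(rows, cols, by):
    """Group rows (described by `cols`, last field = measure) by the columns in `by`."""
    idx = [cols.index(c) for c in by]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return [key + (total,) for key, total in out.items()]


# Precompute only two of the eight possible group-by combinations.
stored = {
    ("year", "product"): aggregate(fact, COLS, ("year", "product")),
    ("year",): aggregate(fact, COLS, ("year",)),
}


def answer(by):
    """Answer from the smallest stored aggregate covering `by`, else the fact table."""
    usable = [key for key in stored if set(by) <= set(key)]
    if usable:
        source = min(usable, key=lambda key: len(stored[key]))
        return source, aggregate(stored[source], source, by)
    return "fact", aggregate(fact, COLS, by)


grand_total = answer(())            # served from the 2-row "year" aggregate
per_product = answer(("product",))  # served from the "year, product" aggregate
per_customer = answer(("customer",))  # no stored aggregate covers it
```

The grand total at the top of the slide's diagram is answered from two stored rows instead of four fact rows; only the per-customer query has to go all the way down to the fact table.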
36
Contents: Fundamentals; Architecture of the Analysis (OLAP) Server
What is OLAP?; Requirements placed on OLAP; OLAP and DW, database building (theory); Architecture of the Analysis (OLAP) Server; Storage; Aggregation; Database building; Sales analysis; Financial modeling; Handling large data volumes; Analysis over the Internet
37
Sales analysis: a company selling soft drinks
Dimensions: Product (balanced); Period (balanced); Customer (NOT balanced). Measures: HUF, liters.
Let's use the OLAP Manager and build a simple sales reporting cube. For the rest of the presentation we will be developing a variety of business solutions for the Graphic Design Institute (GDI) company. This is a fictitious e-commerce company that sells everything graphics-related on the web. Although the business doesn’t have stores, many of its products are shipped to addresses in the US and Canada. Distributors are used to help fulfill shipments in this business. In this demo you will create the GDI database and the Sales cube, which contains the Geography, Customer, Distributor, Time, and Products dimensions.
38
Sales analysis: Balanced hierarchies
Figure: Country level: Mo (Hungary). Region level: Reg1, Reg2, Reg3. City level: Győr, Zeg, Miskolc, Pécs, Komló, Paks.
Uniform dimension levels. Same depth. No gaps. Every member at a given level has the same number of ancestors.

As with sales applications, there are many challenges that you will face when you begin creating financial applications. One of the biggest is the lack of symmetry, that is, the ragged nature of many of the critical dimensions in financial applications. Take the geography dimension, for example: we have three levels, Country, State and City. In the USA this works well. All states have cities and all members at a given level have the same number of ancestors. There are simply no gaps.
39
Sales analysis: Ragged hierarchies
Figure: All. Country level: Magyaro. (Hungary), Lengyelo. (Poland). Region level: Régió1, Régió2 for Hungary; missing or not needed for Poland. City level: Győr, Zeg, Miskolc, Krakkó (Kraków), Varsó (Warsaw).
Varying depth. Demo…

[New in 2000] If we are an international corporation, however, three levels for the geography dimension may not work across all countries. In Israel, for example, as in many other countries, there is no concept of "States". Ideally, this dimension would have only two levels of drill-down there: Country and City. Fortunately, Analysis Services can show just two levels of drill-down while keeping the levels symmetrical, which is useful when creating MDX expressions and for performance reasons.
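One way to picture this is a regular three-level table in which the unneeded level holds a placeholder that is hidden at drill-down time. The Python sketch below assumes the placeholder repeats the parent's name; that convention, the city-to-region assignment and the function name are assumptions for the illustration, not a description of the product's internals.

```python
# Regular three-level table: (country, region, city).
# For Poland the region level is not needed, so the region column simply
# repeats the country name (assumed placeholder convention).
ROWS = [
    ("Hungary", "Region1", "Győr"),
    ("Hungary", "Region1", "Zeg"),
    ("Hungary", "Region2", "Miskolc"),
    ("Poland", "Poland", "Kraków"),
    ("Poland", "Poland", "Warsaw"),
]


def visible_children(country):
    """Members shown when drilling down from a country.

    A region named like its parent is treated as a hidden placeholder,
    and its cities are shown in its place.
    """
    result = []
    for row_country, region, city in ROWS:
        if row_country != country:
            continue
        child = city if region == country else region
        if child not in result:
            result.append(child)
    return result


print(visible_children("Hungary"))  # regions
print(visible_children("Poland"))   # cities, the region level is skipped
```

The table itself keeps three columns for every row, so expressions that refer to a level by position still work for all countries.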
40
Sales analysis: Drill-through
Figure, cube view (Region 1, January 2000), columns Item, Liters, HUF: Alma (apple) 66 300; Ananász (pineapple) 10; Birsalma (quince) 100.
Figure, underlying rows, columns Date, Invoice no., HUF: 1-Jan 1234 6.5; 5-Jan 1235 20; 9-Jan 1236 7.25; 10-Jan 1237 6.75; 17-Jan 1238 5.75; 24-Jan 1239.

[New in 2000] One of the many challenges we discussed earlier today is keeping OLAP data limited to summary information. Users will always try to put every detailed data element into the OLAP cube, "just in case". However, there are times when end users NEED to see the detailed data. Analysis Services offers a solution to this challenge as well. Rather than building the details into the cube, which degrades performance for the sake of an occasional requirement, it can drill through from a cell in the cube to the underlying relational data. When users want to see every invoice that makes up a number, you can use drill-through to provide that information. Demo…
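The relationship between a cell and its underlying rows can be sketched in Python as follows. The invoice rows are hypothetical (they are not the figures from the slide), and the two functions are illustrative names only.

```python
# Hypothetical invoice rows: (region, month, item, invoice_no, liters).
INVOICES = [
    ("Region 1", "2000-01", "Apple", 1234, 6.5),
    ("Region 1", "2000-01", "Apple", 1235, 20.0),
    ("Region 1", "2000-01", "Quince", 1236, 7.25),
    ("Region 2", "2000-01", "Apple", 1237, 6.75),
]


def cell_value(region, month, item):
    """Summary value of one cube cell."""
    return sum(r[4] for r in INVOICES if r[:3] == (region, month, item))


def drill_through(region, month, item):
    """Detail rows behind the same cell; they are not stored in the cube."""
    return [r for r in INVOICES if r[:3] == (region, month, item)]


address = ("Region 1", "2000-01", "Apple")
print(cell_value(*address))
for row in drill_through(*address):
    print(row)
```

The cube only needs to hold the summary; the detail stays in the relational source and is fetched on demand with the cell's coordinates as the filter.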
41
Sales analysis: Actions
Links to: Office documents, Internet/intranet pages, executables, etc. Can be attached to: cells, dimension members, dimensions, cubes. Example: contracts.

Finally, one of the biggest challenges across all applications, not just Internet applications, is closing the loop on the analysis. Analysis alone doesn't offer a competitive advantage. Users need to be able to take action on the results of their analysis. [New in 2000] Actions is a feature in Analysis Services that helps close the loop. Actions can provide links to executables, URLs, HTML documents, Office documents, etc. from any part of the cube. When some important information is discovered, you can quickly act on it and launch another system. In this demonstration we will simply show how to launch an HTML page for further exploration of an issue. Demo…
42
Sales analysis: Multi-Dimensional Expressions (MDX)
Query language. Calculated members. Security rules can be defined with MDX. Actions. Part of the OLE DB for OLAP spec. TM1, SAS, Analysis Services, Whitelight, etc.

Financial applications require very complex calculations. When working in a multidimensional space, calculations must work across all dimensions. MDX (Multi-Dimensional Expressions) was created specifically to leverage the dimensional structure when writing calculations. It is used as a query language, to create calculated members and custom member formulas, for cell-level security, and for Actions. It is really the KEY to advanced analysis. You have already seen a couple of MDX expressions: the first, ASP, in the Sales cube, and the second, the Custom Rollup Expression, when we built the Expense reporting cube. We will now create a couple of additional common MDX expressions to demonstrate the power of this language. Since MDX is part of the OLE DB for OLAP spec, front-end queries using MDX expressions will work not only against Analysis Services but against a number of other OLAP engines as well. [New in 2000: there are some new MDX functions, like LookupCube, which is demonstrated in the custom rollup example.]
43
Sales analysis: The parts of a cube
A cube consists of cells. Every cell has an address: one coordinate from each dimension. A cell is reached by specifying all of its coordinates.

Before we start writing some simple MDX expressions we should review multidimensional concepts. A cube consists of cells. Each cell has an address, made up of one coordinate from each dimension, and a complete address is required to get a cell. Don't forget: every member of every dimension intersects with every member of every other dimension.
44
Sales analysis
(Cikk.barackital, Time.Q2, Vevo.[Vevők összesen])
(Cikk.Narancsital, Time.2000, Vevo.[Vevők összesen])
(Cikk.[Cikkek összesen], Time.Q1, Vevo.külföld)
Figure, cube axes: Vevo (customer): Külföld (foreign), Belföld (domestic), Vevők összesen (all customers). Cikk (item): Cikkek összesen (all items), Almaital, Narancsital, Barackital, Banánital. Time: Q1, Q2, Q3, Q4, 2000.

Each cell has a name. In MDX this is written as a comma-separated list of members from the different dimensions, enclosed in parentheses.
45
Sales analysis: Tuples (ordered n-tuples)
(Cikk.Almaital, Time.Q2, Vevo.[Vevők összesen])
A comma-separated list of members from DIFFERENT dimensions.

The comma-separated list of members from different dimensions is called a tuple. When you are pinpointing cells you are using a tuple. Since a complete address is required to access a cell, which cell is accessed if NOT all dimensions are included in a tuple? The answer: the CurrentMember of each missing dimension is used.
(Cikk.Almaital, Time.Q2) = (Cikk.Almaital, Time.Q2, Vevo.CurrentMember)
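The rule for incomplete tuples can be mimicked in Python. This is only a model of the idea, not of how an MDX engine is implemented; the cell values, the `context` dictionary standing in for the current members, and the function names are assumptions.

```python
DIMENSIONS = ("Cikk", "Time", "Vevo")

# Hypothetical cell values keyed by a complete address.
CELLS = {
    ("Almaital", "Q2", "Vevők összesen"): 300,
    ("Almaital", "Q2", "külföld"): 120,
    ("Almaital", "Q2", "belföld"): 180,
}


def resolve(partial, context):
    """Complete a partial tuple with the current member of every
    dimension that the tuple does not mention."""
    return tuple(partial.get(dim, context[dim]) for dim in DIMENSIONS)


def value(partial, context):
    """Value of the cell addressed by a (possibly partial) tuple."""
    return CELLS[resolve(partial, context)]


# The current members depend on where the expression is being evaluated.
context = {"Cikk": "Cikkek összesen", "Time": "2000", "Vevo": "külföld"}

print(resolve({"Cikk": "Almaital", "Time": "Q2"}, context))
print(value({"Cikk": "Almaital", "Time": "Q2"}, context))
```

Evaluating the same partial tuple with a different current customer member gives a different cell, which is why the same calculated member can be reused on every row of a report.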
46
Sales analysis: Sets
A comma-separated list of members from the SAME dimension.
{Almaital, Barackital}
[1999].Children
Sum(Time.members)

The other key definition you need before creating MDX expressions is that of a set. Put simply, a set is a comma-separated list of members from the same dimension, enclosed in curly braces. There are also set functions that understand family relationships, such as Children and Descendants. Functions that return sets do not require curly braces. Sets, however, are not limited to one dimension: in general a set is a comma-separated list of symmetrical TUPLES in curly braces. Now that you have the basics, let's create a few simple MDX expressions. Demo…
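As a rough Python analogy, a set is an ordered list of members, a set function such as Children returns such a list, and a set of tuples is the cross product of two sets. The hierarchy, the sales figures and the helper names below are hypothetical.

```python
from itertools import product

# Hypothetical Time hierarchy (child -> parent) and quarterly sales.
PARENT = {"Q1": "1999", "Q2": "1999", "Q3": "1999", "Q4": "1999"}
SALES = {"Q1": 10, "Q2": 20, "Q3": 30, "Q4": 40}


def children(member):
    """Analogue of [1999].Children: the members directly below `member`."""
    return [m for m, parent in PARENT.items() if parent == member]


def set_sum(members):
    """Analogue of Sum(<set>): add up the measure over a set of members."""
    return sum(SALES[m] for m in members)


quarters = children("1999")
print(quarters, set_sum(quarters))

# A set spanning two dimensions is a list of tuples with the same shape.
items = ["Almaital", "Barackital"]
tuple_set = list(product(items, quarters))
print(len(tuple_set), tuple_set[0])
```

Every tuple in `tuple_set` has the same dimensions in the same order, which is what "symmetrical" means in the note above.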
47
Contents: Basic concepts; The architecture of the Analysis (OLAP) Server
What is OLAP? Requirements for OLAP. OLAP and DW, database design (theory). The architecture of the Analysis (OLAP) Server: storage, aggregation. Building a database: sales analysis, financial modeling, handling large data volumes, analysis over the Internet.
48
Financial modeling: Parent-child dimensions
Figure: return on sales (Árbevételarányos nyereség) as a parent-child dimension. Table columns: dimension member (Dim érték), parent (Szülő); members: Arbev, Fedezet, ValtKTG, Nyereseg, FixKtg, ArbevNyer, Arbev2. Tree: return on sales = profit (Nyereség) : revenue (Árbev); profit = margin (Fedezet) - fixed cost (Fix költség); margin = revenue (Árbev) - variable cost (Vált. költség). Demo…

[New in 2000] In financial applications there are also several dimensions that don't really have "levels" as such, but simply a hierarchical relationship between two members of the same type. Take an organization chart, for example. All members of an organization chart are employees. Some employees report to others, who in turn report to others, and some don't have managers at all. A parent-child dimension in Analysis Services allows a dimension to be built from these relationships. One column represents the parent (the manager in this case) and the other represents the members themselves (the employees in this example). Every employee, manager or not, is included in the member key column, and anyone who manages another employee also appears in the parent column. These dimensions can be as unbalanced as the data you are working with. In this example, Smith is President, Jones is a VP, and White is Smith's secretary. This illustrates why the concept of named levels doesn't apply to parent-child dimensions. Common examples of parent-child dimensions are organization charts, charts of accounts, bills of material, etc. Another characteristic of parent-child dimensions is that these relationships evolve over time: it is not uncommon for employees to move frequently within an organization. Because of the changing nature of these relationships, parent-child dimensions are always considered changing dimensions.
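The two-column (member, parent) layout and its rollup can be sketched in Python with the organization chart from the note. The expense amounts and the function names are hypothetical; the sketch shows a plain additive rollup only.

```python
# (member, parent) rows; None marks the top of the hierarchy.
EMPLOYEES = [("Smith", None), ("Jones", "Smith"), ("White", "Smith")]

# Hypothetical expenses booked directly against each employee.
OWN_EXPENSES = {"Smith": 100, "Jones": 60, "White": 40}


def children(member):
    """Members whose parent column points at `member`."""
    return [m for m, parent in EMPLOYEES if parent == member]


def rollup(member):
    """The member's own value plus the rolled-up values of its children."""
    return OWN_EXPENSES[member] + sum(rollup(c) for c in children(member))


def depth(member):
    """Distance from the top of the hierarchy."""
    parent = dict(EMPLOYEES)[member]
    return 0 if parent is None else 1 + depth(parent)


print(rollup("Smith"))
print(depth("Jones"), depth("White"))
```

A VP and a secretary end up at the same depth here, which is the reason a depth cannot be given a meaningful level name such as "Vice Presidents".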
49
Financial modeling: Write-back
Write-back must be enabled on the cube. Any cell of the cube can be written to. We do not write directly into the cube or the fact table: changes go to a change table in the relational database. Office 2000 has no user interface for write-back.

We now have an expense reporting cube that can be used to budget next year's expenses. It is often challenging to manage the budgeting process with these tools, since the users are writing data back to OLAP and not to a transaction system. Historically, it has been difficult to maintain proper audit controls over the write-back process. Write-back of data is a different feature from dimension write-back, which we demonstrated earlier today. When we wrote back to the department dimension in the last demo, the new parent-child hierarchies were automatically written back to the dimension table when the changes were saved. In budgeting applications, users typically want to write data back to the cubes as well. This allows them to do some what-if analysis and eventually finalize the budget. For users to be able to write back to a cube, write-back must be enabled for that cube and read/write security granted to the user. Write-backs can be performed at leaf or non-leaf levels, as long as the dimension has been set up to receive data at non-leaf levels. What distinguishes Analysis Services is that write-back data is not stored directly in the fact table. Instead it is written to an incremental change table that is stored relationally (regardless of storage mode). Write-backs are written at all levels of summarization, so there is virtually no recalculation time to see the impact of changes. Unfortunately there is no native user interface in Office 2000 today to perform write-back. Either a third-party tool or a custom application is required.
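The change-table idea can be illustrated with a short Python sketch for leaf-level cells. It assumes that each write-back is recorded as the difference between the new value and the value currently seen; the data, the `CHANGES` list and the function names are hypothetical, and non-leaf write-back is not modeled.

```python
# Hypothetical fact data: (account, year) -> amount. Never modified below.
FACTS = {("Travel", "2001"): 1000, ("Training", "2001"): 500}

# Change table: one (cell, delta) row per write-back, kept separately.
CHANGES = []


def read(cell):
    """Cell value as seen by the user: fact data plus recorded changes."""
    base = FACTS.get(cell, 0)
    return base + sum(delta for changed, delta in CHANGES if changed == cell)


def write_back(cell, new_value):
    """Record the difference needed to make the cell show `new_value`."""
    CHANGES.append((cell, new_value - read(cell)))


travel = ("Travel", "2001")
write_back(travel, 1200)   # what-if: raise the travel budget
write_back(travel, 1100)   # second thought: lower it again

print(read(travel))        # value seen after both write-backs
print(FACTS[travel])       # the fact data is untouched
print(CHANGES)             # every change remains visible for auditing
```

Keeping the deltas in their own table means a what-if session can be discarded by deleting its rows, and the history of changes is available for audit.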
50
Financial modeling: Security
Permissions on the data in a cell. Permissions on a dimension member.
Figure, two panels ("Cell level" and "Dimension member"): a table of Region (Kelet/East, Közép/Central, Nyugat/West), Employees (50, 75, 35) and Total salary (10 500, 20 625, 16 500); in the cell-level panel a protected cell is shown as N/A.

When it is time to deploy a financial application, or any application for that matter, security is of the utmost importance. Users should not be allowed to write back to Actual data cells, many users should not have access to confidential information like salaries, and sometimes users shouldn't even be able to see some of the dimension structures. Two powerful features in Analysis Services keep your application secure: cell-level security and dimension member security. [New in 2000] Dimension member security is used to hide selected dimension members, actually shrinking the multidimensional view for some users. Cell-level security allows you to protect or unprotect cells or ranges of cells. Three different levels of access can be granted on a cell: read, contingent read, and read/write. We will now secure the Sales and Expense Reporting cubes.
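The difference between the two mechanisms can be shown with a small Python sketch. The roles, their rules and the figures are hypothetical, and the sketch covers read access only (contingent read and read/write are not modeled).

```python
# Hypothetical cube data: (region, measure) -> value.
DATA = {
    ("East", "Employees"): 50,
    ("East", "Total salary"): 10500,
    ("West", "Employees"): 35,
    ("West", "Total salary"): 16500,
}

# Hypothetical roles. hidden_members: dimension member security.
# denied_cells: cell-level security.
ROLES = {
    "east_manager": {"hidden_members": {"West"}, "denied_cells": set()},
    "analyst": {"hidden_members": set(),
                "denied_cells": {("West", "Total salary")}},
}


def view(role):
    """Return the cube as one role sees it."""
    rules = ROLES[role]
    visible = {}
    for (region, measure), value in DATA.items():
        if region in rules["hidden_members"]:
            continue  # the member does not exist for this role
        if (region, measure) in rules["denied_cells"]:
            visible[(region, measure)] = "N/A"  # cell exists, value withheld
        else:
            visible[(region, measure)] = value
    return visible


print(view("east_manager"))
print(view("analyst"))
```

With dimension member security the West rows disappear entirely; with cell-level security the West salary cell is still there but shows no value.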
51
Contents: Basic concepts; The architecture of the Analysis (OLAP) Server
What is OLAP? Requirements for OLAP. OLAP and DW, database design (theory). The architecture of the Analysis (OLAP) Server: storage, aggregation. Building a database: sales analysis, financial modeling, handling large data volumes, analysis over the Internet.
52
Large data volumes: Live example (still on SQL Server 7.0)
Characteristics: number of dimensions; number of hierarchies; number of cells; aggregation times; … response times.
53
Large data volumes: Partitioning
Figure: MOLAP, 35% aggregation: current year. HOLAP, 20% aggregation: previous year. ROLAP, 0% aggregation: old data.
Different storage modes and aggregation levels. Support for multiple servers.

To address the data volume issue, which is really a scalability issue, Analysis Services has a feature called partitioning. This feature is only available in SQL Server Enterprise Edition. Partitioning allows you to physically store the data of one cube in multiple partitions. These partitions can reside on more than one server, and each partition can have a different aggregation design. Ideally, you would segment the data in such a way that only the new data needs to be processed, without touching historical data. Time, for example, is an excellent way to partition your data. Perhaps you store current-year data in a MOLAP partition on the local server, prior-year data as HOLAP on a second server, and history on a third server using ROLAP. Both processing and query performance are enhanced by doing this. When a query requests only current-year data, it has a smaller set of data to look at. If it needs the current and the prior year, the request is performed in parallel on the two servers. Processing is also improved, since new data only requires the current partition to be re-aggregated.
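The routing of a query to the partitions it needs can be sketched in Python. The partition definitions, server names, yearly totals and the `query` function are hypothetical, and the sketch runs sequentially rather than in parallel.

```python
# Hypothetical partitions of one cube, split by year.
PARTITIONS = [
    {"name": "current", "storage": "MOLAP", "server": "local",
     "totals": {2000: 500}},
    {"name": "prior", "storage": "HOLAP", "server": "server2",
     "totals": {1999: 450}},
    {"name": "history", "storage": "ROLAP", "server": "server3",
     "totals": {1997: 300, 1998: 350}},
]


def query(years):
    """Total sales for `years`, touching only the partitions that hold them."""
    wanted = set(years)
    touched = [p for p in PARTITIONS if wanted & set(p["totals"])]
    total = sum(p["totals"][y] for p in touched for y in wanted
                if y in p["totals"])
    return total, [p["name"] for p in touched]


print(query([2000]))        # only the current-year partition is read
print(query([1999, 2000]))  # two partitions, potentially on two servers
```

Loading new 2000 data would likewise require reprocessing only the "current" partition, leaving the other two untouched.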
54
Large data volumes: Support for very large dimensions
"Large" MOLAP dimensions: not memory resident (up to 10 million members).
"Huge" ROLAP dimensions: several hundred million members.
New kind of virtual dimension support: no impact on storage; no problem with the number of dimension members; support for multiple hierarchy levels.
A dimension member can have more than 64K children: a grouping level is inserted automatically and is optionally visible.

[Note: this section makes comparisons to 7.0 features; disregard if the audience is new to SQL Server.] [New in 2000] SQL Server 2000 has an enhanced engine for managing dimensions. MOLAP dimensions no longer need to be completely memory resident, as they did in 7.0, and can therefore handle up to ten million members. [New in 2000] For the Amazon.coms and eBays of the world, there are also ROLAP dimensions, which are stored relationally and allow dimensions to scale up to hundreds of millions of members. [Improved in 2000] Virtual dimensions are dimensions built from attributes of another dimension. They make it possible to enhance the dimensionality without exploding the size or complexity of the cube. Most applications have no more than 8 to 10 distinct categories by which users want to view the data. There could, however, be dozens more dimensions that are simply characteristics of an existing dimension. Product, for example, is a very typical category by which users view their data, but users may also be interested in viewing the data by product color, weight, or size. Color, weight, and size may simply be attributes of the product SKU. Virtual dimensions can add these additional viewpoints to the cube without increasing its overall complexity and size. [New in 2000] Finally, there is the usability issue with very large flat dimensions. Analysis Services can automate the process of bucketing a large flat dimension, so you do not have to worry about large dimensions being unusable.
55
Analysis on the Internet: Virtual and linked servers
Linking local and remote cubes. Internal and external cubes.
Figure: firewall; Sales cubes (East, West); third-party external cube.

The end game here is to integrate the Hits data with your operational data and possibly with external data sources as well. This is where you really gain a competitive advantage. Using a feature called linked cubes, you can link to an external cube over the Internet and use a virtual cube to combine the external data with your internal OLAP cubes. You may also want to make your cubes available over the Internet to vendors, customers, visitors, etc. As more and more companies offer these types of cubes on the Internet, application design becomes even more powerful and challenging. You will want the ability to incorporate data from these external cubes into your cube design. Although we don't have any external data to display, we will quickly show you the benefit of using a virtual cube to combine data from the Hits cube and the Sales cube.
56
Questions?
57
Books: Microsoft OLAP Solutions by Erik Thomsen
OLAP Solutions: Building Multidimensional Information Systems by Erik Thomsen. Microsoft OLAP Unleashed by Tim Peterson.
58
Further information: WWW.OLAPINFO.HU
msdn.microsoft.com