关键词不能为空

当前您在: 主页 > 英语 >

数据仓库(中英文翻译)

作者:高考题库网
来源:https://www.bjmy2z.cn/gaokao
2021-01-29 14:31
tags:

-flexible

2021年1月29日发(作者:standout)


DATA WAREHOUSE


Data warehousing provides architectures and tools for business executives to


systematically organize, understand, and use their data to make strategic decisions. A


large number of organizations have found that data warehouse systems are valuable tools


in today's competitive, fast evolving world. In the last several years, many firms have


spent millions of dollars in building enterprise-wide data warehouses. Many people feel


that with competition mounting in every industry, data warehousing is the latest


must-have marketing weapon


——


a way to keep customers by learning more about their


needs.


“So



Data warehouses have been defined in many ways, making it difficult to formulate a


rigorous definition. Loosely speaking, a data warehouse refers to a database that is


maintained separately from an organization's operational databases. Data warehouse


systems allow for the integration of a variety of application systems. They support


information processing by providing a solid platform of consolidated, historical data for


analysis.



According to W. H. Inmon, a leading architect in the construction of data warehouse


systems, “a data warehouse is a subject


-oriented, integrated, time- variant, and nonvolatile


collection of data in support of management's decision making process.


comprehensive definition presents the major features of a data warehouse. The four


keywords, subject-oriented, integrated, time-variant, and nonvolatile, distinguish data


warehouses from other data repository systems, such as relational database systems,


transaction processing systems, and file systems. Let's take a closer look at each of these


key features.



(1).Subject- oriented: A data warehouse is organized around major subjects, such as


customer, vendor, product, and sales. Rather than concentrating on the day-to-day


operations and transaction processing of an organization, a data warehouse focuses on the


modeling and analysis of data for decision makers. Hence, data warehouses typically


provide a simple and concise view around particular subject issues by excluding data that


are not useful in the decision support process.



(2) Integrated: A data warehouse is usually constructed by integrating multiple


heterogeneous sources, such as relational databases, flat files, and on-line transaction


records. Data cleaning and data integration techniques are applied to ensure consistency


in naming conventions, encoding structures, attribute measures, and so on.





(3).Time-variant: Data are stored to provide information from a historical perspective


(e.g., the past 5-10 years). Every key structure in the data warehouse contains, either


implicitly or explicitly, an element of time.



(4)Nonvolatile: A data warehouse is always a physically separate store of data


transformed from the application data found in the operational environment. Due to this


separation, a data warehouse does not require transaction processing, recovery, and


concurrency control mechanisms. It usually requires only two operations in data


accessing: initial loading of data and access of data.



In sum, a data warehouse is a semantically consistent data store that serves as a


physical implementation of a decision support data model and stores the information on


which an enterprise needs to make strategic decisions. A data warehouse is also often


viewed as an architecture, constructed by integrating data from multiple heterogeneous


sources to support structured and/or ad hoc queries, analytical reporting, and decision


making.



“OK



Based on the above, we view data warehousing as the process of constructing and


using data warehouses. The construction of a data warehouse requires data integration,


data cleaning, and data consolidation. The utilization of a data warehouse often


necessitates a collection of decision support technologies. This allows “knowledge


workers


conveniently obtain an overview of the data, and to make sound decisions based on


information in the warehouse. Some authors use the term “data warehousing


only to the process of data warehouse construction, while the term warehouse DBMS is


used to refer to the management and utilization of data warehouses. We will not make


this distinction here.



“How are organizations using the information from data warehouses?



Many


organizations are using this information to support business decision making activities,


including:



(1) increasing customer focus, which includes the analysis of customer buying


patterns (such as buying preference, buying time, budget cycles, and appetites for


spending),



(2) repositioning products and managing product portfolios by comparing the


performance of sales by quarter, by year, and by geographic regions, in order to fine- tune


production strategies,



(3) analyzing operations and looking for sources of profit,



(4) managing the customer relationships, making environmental corrections, and


managing the cost of corporate assets.



Data warehousing is also very useful from the point of view of heterogeneous


database integration. Many organizations typically collect diverse kinds of data and


maintain large databases from multiple, heterogeneous, autonomous, and distributed


information sources. To integrate such data, and provide easy and efficient access to it is


highly desirable, yet challenging. Much effort has been spent in the database industry and


research community towards achieving this goal.



The traditional database approach to heterogeneous database integration is to build


wrappers and integrators (or mediators) on top of multiple, heterogeneous databases. A


variety of data joiner and data blade products belong to this category. When a query is


posed to a client site, a metadata dictionary is used to translate the query into queries


appropriate for the individual heterogeneous sites involved. These queries are then


mapped and sent to local query processors. The results returned from the different sites


are integrated into a global answer set. This query-driven approach requires complex


information filtering and integration processes, and competes for resources with


processing at local sources. It is inefficient and potentially expensive for frequent queries,


especially for queries requiring aggregations.



Data warehousing provides an interesting alternative to the traditional approach of


heterogeneous database integration described above. Rather than using a query-driven


approach, data warehousing employs an update-driven approach in which information


from multiple, heterogeneous sources is integrated in advance and stored in a warehouse


for direct querying and analysis. Unlike on-line transaction processing databases, data


warehouses do not contain the most current information. However, a data warehouse


brings high performance to the integrated heterogeneous database system since data are


copied, preprocessed, integrated, annotated, summarized, and restructured into one


semantic data store. Furthermore, query processing in data warehouses does not interfere


with the processing at local sources. Moreover, data warehouses can store and integrate


historical information and support complex multidimensional queries. As a result, data


warehousing has become very popular in industry.



1.



Differences between operational database systems and data warehouses



Since most people are familiar with commercial relational database systems, it is


easy to understand what a data warehouse is by comparing these two kinds of systems.



The major task of on-line operational database systems is to perform on-line


transaction and query processing. These systems are called on-line transaction processing


(OLTP) systems. They cover most of the day-to-day operations of an organization, such


as, purchasing, inventory, manufacturing, banking, payroll, registration, and accounting.


Data warehouse systems, on the other hand, serve users or “knowledge workers


role of data analysis and decision making. Such systems can organize and present data in


various formats in order to accommodate the diverse needs of the different users. These


systems are known as on-line analytical processing (OLAP) systems.



The major distinguishing features between OLTP and OLAP are summarized as


follows.



(1). Users and system orientation: An OLTP system is customer-oriented and is used


for transaction and query processing by clerks, clients, and information technology


professionals. An OLAP system is market-oriented and is used for data analysis by


knowledge workers, including managers, executives, and analysts.



(2). Data contents: An OLTP system manages current data that, typically, are too


detailed to be easily used for decision making. An OLAP system manages large amounts


of historical data, provides facilities for summarization and aggregation, and stores and


manages information at different levels of granularity. These features make the data


easier for use in informed decision making.



(3). Database design: An OLTP system usually adopts an entity- relationship (ER)


data model and an application -oriented database design. An OLAP system typically


adopts either a star or snowflake model, and a subject-oriented database design.



(4). View: An OLTP system focuses mainly on the current data within an enterprise


or department, without referring to historical data or data in different organizations. In


contrast, an OLAP system often spans multiple versions of a database schema, due to the


evolutionary process of an organization. OLAP systems also deal with information that


originates from different organizations, integrating information from many data stores.


Because of their huge volume, OLAP data are stored on multiple storage media.



(5). Access patterns: The access patterns of an OLTP system consist mainly of short,


atomic transactions. Such a system requires concurrency control and recovery


mechanisms. However, accesses to OLAP systems are mostly read- only operations (since


most data warehouses store historical rather than up-to-date information), although many


could be complex queries.



Other features which distinguish between OLTP and OLAP systems include database


size, frequency of operations, and performance metrics and so on.



2.



But, why have a separate data warehouse?


-flexible


-flexible


-flexible


-flexible


-flexible


-flexible


-flexible


-flexible



本文更新与2021-01-29 14:31,由作者提供,不代表本网站立场,转载请注明出处:https://www.bjmy2z.cn/gaokao/585956.html

数据仓库(中英文翻译)的相关文章