Data Warehouses
MGT 327
April 13th, 2004

In the past decade, we have witnessed a computer revolution that few could have imagined. Ten to fifteen years ago, no one could have foreseen what computers would do for business. Furthermore, the Internet and the ability to conduct electronic commerce have changed the way we behave as consumers. One of the emerging concepts of the computer revolution in the past ten years has been that of data warehousing.
In the following pages, we will examine this concept in the broadest sense, first looking at a brief history of how databases and data warehouses have developed. Then we will look at data warehousing itself: what it is and how it is defined. Next, we will focus on how it ties in with the Internet and intranets and how this is affecting business today. Finally, the discussion closes with what the future might hold for how information is stored and the role the Internet will play.
Before examining the development of data warehousing and how databases are emerging in business, let’s first review what was done with data before data warehouses, to better understand the issue. In the 1970s, virtually all business system development was done on IBM mainframes using tools like COBOL and IMS. The 1980s brought minicomputer platforms such as the AS/400 and VAX/VMS. The late 1980s and early 1990s made UNIX a popular platform with the introduction of client/server architecture. During the past decade, the sharply increasing popularity of the personal computer on business desktops has introduced many new options and opportunities for business analysis. The gap between the programmer and the end user has started to close, as analysts now have at their fingertips many of the tools required to gain proficiency with spreadsheets and databases.
The most important factor in the evolution of data warehousing has been the sharply increasing power of computer hardware. As this power has increased, prices have fallen just as sharply. This has played a key role in business today. High costs and huge mainframes are no longer dominant factors in our ability to do business. The wide array of choices available on the PC has allowed databases to evolve quickly, both commercially and on the information superhighway.

So what is a data warehouse? A data warehouse is much like a storage or distribution warehouse.
It is simply a place where data can be gathered, organized in an orderly fashion, and made available for easy access when required. A goods warehouse is laid out so that the required items are easy to locate when an order needs to be picked for delivery to the customer; a data warehouse does the same when data is required by the end user. Data warehousing assembles and organizes data from enterprise operations such as transaction systems (registers, online order systems, etc.) and stores the data in a format that business or technical people can analyze. The data warehouse is then made accessible through different means to those individuals in need of detailed information. The following diagram depicts the role a data warehouse plays in an order process system.
There are many benefits to using a data warehouse in this way. First, the information is non-volatile. This means that once the data is in the data warehouse, it is not modified. For example, a recorded order status doesn’t change, an inventory snapshot doesn’t change, and recorded marketing data doesn’t change.
It is important to realize that once data is brought into the data warehouse, it should be modified only on rare occasions. It is very difficult, if not impossible, to maintain dynamic data in the data warehouse. Many data warehousing projects have failed miserably when they attempted to synchronize volatile data between the operational and data warehousing systems. Second, data put into a warehouse can be combined from several applications, making a vast amount of information available to the end user.
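To make the idea of non-volatile data concrete, here is a minimal Python sketch using the standard sqlite3 module. The table name `inventory_snapshot`, the helper `load_snapshot`, and the sample items are hypothetical and chosen only for illustration; a real warehouse would sit on a much larger database. The point is that each load appends a new dated snapshot and never updates an earlier one.

```python
import sqlite3

def load_snapshot(conn, snapshot_date, inventory):
    """Append one dated inventory snapshot; earlier snapshots are never updated."""
    conn.executemany(
        "INSERT INTO inventory_snapshot (snapshot_date, item, quantity) "
        "VALUES (?, ?, ?)",
        [(snapshot_date, item, qty) for item, qty in inventory.items()],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
# The primary key makes a second load for the same date and item fail,
# so an existing snapshot cannot be silently overwritten.
conn.execute(
    "CREATE TABLE inventory_snapshot ("
    "snapshot_date TEXT, item TEXT, quantity INTEGER, "
    "PRIMARY KEY (snapshot_date, item))"
)

# Two weekly loads (made-up data): the second adds rows, it does not edit the first.
load_snapshot(conn, "2004-04-01", {"widget": 120, "gasket": 40})
load_snapshot(conn, "2004-04-08", {"widget": 95, "gasket": 55})

history = conn.execute(
    "SELECT snapshot_date, quantity FROM inventory_snapshot "
    "WHERE item = 'widget' ORDER BY snapshot_date"
).fetchall()
print(history)
```

Because every snapshot is kept as it was loaded, the history of an item can be read back later without consulting the operational system.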
Data warehousing systems are most successful when data can be combined from more than one operational system. When data needs to be brought together from more than one source application, it is only natural to integrate the information in a place totally separate from the source applications. The data warehouse may very effectively combine data from multiple source applications such as sales, marketing, finance, and production. Many large data warehouses allow the source applications to be integrated incrementally. The primary reason for combining data from multiple sources is the ability to cross-reference data from these applications. Nearly all data in a typical warehouse is built around time.
Time is the primary criterion for filtering the information going into and out of the data warehouse. For example, an analyst may generate queries for a given week, month, quarter, or year. If designed properly, the data warehouse can allow for a year-to-year analysis even though a base operational application has changed. Finally, data put into a warehouse can be stored over a long period of time. Data from most operational systems is archived after it becomes inactive.
For example, an order may become inactive a set period after it is fulfilled, or a bank account may become inactive after an extended period without activity. The primary reason for archiving inactive data has been the performance of the operational system. Large amounts of inactive data mixed with live operational data can significantly degrade the performance of a transaction that is only processing live data. Keeping the archived data in a warehouse helps when a business has information that has already been processed but might need it at a later date. As one can see, this process is much like that of a product warehouse.
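As a rough illustration of cross-referencing data from more than one source application, the Python sketch below loads made-up rows from two hypothetical sources, `sales` and `marketing`, into one SQLite database and joins them by month and product. The table names, columns, and figures are assumptions for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two tables standing in for data extracted from separate source applications.
conn.executescript("""
    CREATE TABLE sales (month TEXT, product TEXT, revenue REAL);
    CREATE TABLE marketing (month TEXT, product TEXT, ad_spend REAL);
""")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("2004-01", "widget", 1000.0),
    ("2004-01", "widget", 500.0),
    ("2004-02", "widget", 2000.0),
])
conn.executemany("INSERT INTO marketing VALUES (?, ?, ?)", [
    ("2004-01", "widget", 300.0),
    ("2004-02", "widget", 400.0),
])

# Total the sales per month and product first, then cross-reference
# each total with the marketing spend for the same month and product.
rows = conn.execute("""
    SELECT s.month, s.product, s.revenue, m.ad_spend
    FROM (SELECT month, product, SUM(revenue) AS revenue
          FROM sales GROUP BY month, product) AS s
    JOIN marketing AS m
      ON m.month = s.month AND m.product = s.product
    ORDER BY s.month
""").fetchall()

for month, product, revenue, ad_spend in rows:
    print(month, product, revenue, ad_spend)
```

Neither source application could answer "how much revenue came in during the months we advertised" on its own; the question only becomes a single query once both sets of data sit side by side, keyed by time.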
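The archiving of inactive data described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 90-day cutoff, the table names `live_orders` and `warehouse_orders`, and the function `archive_inactive_orders` are all hypothetical, and both tables live in one SQLite database purely to keep the example short, whereas in practice the operational system and the warehouse would be separate.

```python
import sqlite3
from datetime import date, timedelta

INACTIVE_AFTER_DAYS = 90  # assumed cutoff; the right period depends on the business

def archive_inactive_orders(conn, today):
    """Copy orders fulfilled before the cutoff to the warehouse, then drop them
    from the live table. Returns the number of orders moved."""
    cutoff = (today - timedelta(days=INACTIVE_AFTER_DAYS)).isoformat()
    with conn:  # copy and delete commit together, or not at all
        conn.execute(
            "INSERT INTO warehouse_orders "
            "SELECT * FROM live_orders "
            "WHERE fulfilled_on IS NOT NULL AND fulfilled_on < ?",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM live_orders "
            "WHERE fulfilled_on IS NOT NULL AND fulfilled_on < ?",
            (cutoff,),
        )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE live_orders (order_id INTEGER PRIMARY KEY, fulfilled_on TEXT);
    CREATE TABLE warehouse_orders (order_id INTEGER PRIMARY KEY, fulfilled_on TEXT);
""")
conn.executemany("INSERT INTO live_orders VALUES (?, ?)", [
    (1, "2003-11-01"),  # fulfilled long ago: inactive
    (2, "2004-04-01"),  # fulfilled recently: still live
    (3, None),          # not yet fulfilled: still live
])

moved = archive_inactive_orders(conn, date(2004, 4, 13))
live_ids = [r[0] for r in conn.execute(
    "SELECT order_id FROM live_orders ORDER BY order_id")]
archived_ids = [r[0] for r in conn.execute(
    "SELECT order_id FROM warehouse_orders ORDER BY order_id")]
print(moved, live_ids, archived_ids)
```

After the move, transactions against the live table touch only active orders, while the older order remains available in the warehouse for later analysis.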
Instead of goods, the commodity being stored is information. However, just as with product storage warehouses, there are well-designed data warehouses and poorly designed, inflexible ones. Therefore, certain requirements must be met for a data warehouse installation to be successful and usable. There are two requirements of computer data in any business.
The first is an operational requirement: to facilitate the processing of business transactions. The second is the need to analyze the results that these business transactions deliver, or could deliver if they were better understood and utilized. In other words, there is an operational use and an informational use of data. It is important to recognize that data warehousing is still an evolving science.
As with any evolving technology, particular care must be taken to discount some marketing claims from vendors attempting to differentiate themselves from competitors. For example, size alone should not determine whether a data warehouse is really a data warehouse. A 50-gigabyte warehouse might not meet the needs of a large company such as IBM, whereas for a company like Mathis Brothers it might provide more than enough space for the next ten years.

Three components

A data warehouse is made up of three very different functional areas, each of which must be customized for the needs of the business. The first component handles acquisition of data from legacy systems and outside sources.
There, the data is identified, copied, formatted, and prepared for loading into the warehouse. There are many products that assist with both data extraction and preparation. The second component of the warehouse is the storage area, which is managed by relational databases like those from Oracle Corp. or Informix Software. The storage component holds data so that many different data mining, executive information, and decision support systems can make use of it effectively. The third component of the warehouse is the access area.
There, different end-user PCs and workstations draw data from the warehouse. The majority of data warehousing software products available today focus on acquisition, storage, or access. Most companies providing these products have a proven track record of performing one of these jobs well.

Data warehouses affect business through the Internet/intranet
The most important development in computing since the advent of the personal computer is the explosion of the Internet and Web-based applications. One of the most exciting fields in the computing industry today is the development of intranet applications. Intranets are private business networks that are based on Internet standards but designed to be used internally. The Internet/intranet trend has very important implications for data warehousing applications. First, data warehouses can be made available worldwide on public or private networks at much lower cost.
This availability minimizes the need to replicate data across diverse geographical locations. Second, this standard has allowed web servers to provide a middle tier where the analysis takes place before results are presented to the web browser in use. Before the web existed, only a few dozen users accessed a typical data warehouse. In extreme cases, companies rolled out client/server-based query tools to a few hundred employees. In the past three years, however, the web has significantly changed the scope of data warehousing implementations. Companies can now routinely plan for hundreds, if not thousands, of data warehousing users.
Those users include sales representatives, customer service personnel, and even customers and suppliers. In short, the web has effectively democratized information access, making it cost-effective to deploy business tools to workers across the organization instead of just a select pool of individuals. Another very significant influence on data warehousing is the fundamental change in business organizational structure. The emergence of a vibrant global economy has profoundly changed the information demands of corporations in the United States and worldwide. Corporations have found markets for their products globally while competing with other companies in vastly different cultures and economic environments.
Mergers and acquisitions have crossed national boundaries. Data warehouses continue to grow in popularity as more companies leverage them to get the most value from their business information. Increasingly, companies are extending the reach of their warehouses to customers, suppliers, and business partners. One company that has adopted this philosophy is Western Digital Corporation. It planned to let three of its suppliers access data within the data warehouse so they could view performance data on their parts. In the coming years, Western Digital expects to offer access to thirty more suppliers.
With timely performance data, the suppliers can improve their parts, which in turn makes Western Digital hard drives better and boosts business. Sharing data warehouses with partners is the evolution of a trend that started when companies realized that harnessing the information scattered throughout their organizations could mean improved operations, better products, happier customers, and increased revenue. One of the original reasons for data warehousing was to optimize a company’s own operations. Sharing data with suppliers is just an extension of that. The auto giant GM is starting to use Internet technologies and data analysis tools so it can share its data warehouse with suppliers and, in effect, treat them like other divisions. GM is rolling out its supply chain data warehouse, which will ultimately be available via the web to more than 5,000 suppliers and organizations worldwide.
GM suppliers can log on to a secure web site via a browser and query data that resides in a warehouse containing information on the quantities of supplies shipped, delivery times, and prices. This helps them optimize their own product planning, their ability to source materials, and their shipping fulfillment process. Late last year, all North American suppliers became able to check on warranty claims received by GM for the components they supply. In addition, the automaker is giving suppliers access to quality metrics stored in the data warehouse. This will help everyone involved understand all aspects of the production process. Other companies are creating industry-specific, community-shared data warehouses.
Last year, manufacturers and distributors participated in IDXchange, an industry-wide extranet. Built and operated by MCI WorldCom in a deal with the Industry Data Exchange Association, the extranet’s core will be a data warehouse that serves as a place for manufacturers to distribute product information using electronic data interchange. Distributors authorized by the manufacturer pay a monthly fee to access the data warehouse and pull information into their operational systems. As one can see, in order for any major business to compete, it must do two things.
First, it must have the ability to use the Internet and intranets for business. A company that doesn’t reach out to consumers through electronic commerce is only setting itself up to fail in the future. Second, the company must let its customers have access to information. Whether it is through extranets, through the data warehouse, or both, customers and suppliers need to understand what is going on with the business process. When we look at the concept of data warehousing, it must be noted that the largest warehouse of all is the Internet itself.
As an information resource, the web is the mother of all data warehouses. Its information content is rapidly exceeding the total capacity of traditional libraries and commercial databases.

What the future holds

So what does the future hold for data warehousing? After over ten years of building momentum, data warehousing has finally crossed into the mainstream. Analysts have estimated that 90 percent of Global 2000 companies had either built data warehouses or were planning to build one in the next two years. The size of the data warehouse market is expected to keep rising.
Yet, in spite of its market acceptance, the definition of a data warehouse, and of what it takes to build one successfully, remains uncertain. Although vendors and industry experts have tried to impose structure on data warehousing, their customers have found their own definitions and best practices. The boundary between data warehouses, data marts, and reporting systems differs from company to company. Two years ago, Discovery Solutions conducted a survey of how companies use their data warehouses.
The study targeted data warehouse practitioners seeking an in-depth guide to building and evolving data warehouses. The study included participants from a broad range of industries. All had data warehouses that had been deployed in production for at least six months, some for as long as ten years. Below is a partial list of the participants:
o Anthem Blue Cross
o Bank of America
o BellSouth
o General Accident Insurance
o Nike
o Nine West
o Sara Lee
o Union Pacific Railroad
o State of Washington
o Whirlpool Corp
The data warehouses ranged in size from 10 GB to 1.5 TB. They held from 30 to more than 2,000 tables and served between 10 and 2,500 users.
The study provided an overview of how data warehousing is used by customers who have been successful. Building the data warehouse is a complex process with roadblocks along the way. The following are some of the conclusions that the study came up with:
o Successful data warehouse projects are ongoing and seek to solve a succession of specific problems.
o Building a data warehouse requires many products and organizations to work together. Focus, sponsorship, and cooperation are the key. Without these, warehouses either fail to meet expectations or take on a life of their own.
o Successful data warehouses ultimately become part of the operations of the business. Over several years, warehouses can eventually cover the entire enterprise, supporting multiple subject areas, hundreds of tables, thousands of users, and millions of queries per month.
o The majority of the time spent in building a data warehouse is consumed in requirements definition, product selection, understanding the data, and ensuring quality. Data quality is critical to a user’s willingness to use the data warehouse. Attention to data quality often produces a side benefit of improving the data quality of the operational source systems.
o Simplification is a key to success over the long term. Companies must set up the proper infrastructure from the beginning to ensure the warehouse remains simple to maintain.
o Finally, the most significant trends for data warehousing involve growth in complexity and usage, the arrival of more casual users, data mining, web access, the desire for more active warehouses, and the integration of data marts into the warehouse environment.

In researching the up-and-coming concept of data warehousing, I looked into the many fundamental concepts that surround it. Drawing information from a vast number of sources and using it along with extranets, intranets, and the Internet will be a requirement to compete in today’s global economy. No longer can large companies such as IBM and General Motors rely on simple database programs to do business.
Business now operates at a much larger scale, requiring customers, suppliers, and everyone within the company to have access to the information. Data warehousing is, by all means, a wave of the future. However, with the wave come growing pains and technological challenges that will take time to overcome. Judging from all the information presented, along with past surveys, companies must choose to make information available, accurate, and above all fast.