DATA MART


After the industrial revolution, the knowledge recorded in books doubled every decade. After 1970, it doubled every three years. Today, the total volume of global information doubles every two years, and the volume of Internet data produced in 2010 exceeded the total of all previous years.

Today, people produce petabytes of data every day. Large amounts of data are created daily in the Internet, e-commerce, manufacturing, transportation and logistics, finance and insurance, medical and health care, and geographic information industries, as well as in government institutions. Big data has become an important feature of the transition from an industrial economy to a knowledge economy, and the most significant production factor and product form of the new era.

Google, Yahoo and Facebook have become the driving force of this revolution. Meanwhile, new enterprises emerge one after another. In the Business Intelligence (BI) field, once newcomers such as AsterData, Greenplum and Vertica rose to prominence, established IT giants such as EMC, IBM and HP moved to acquire companies of this kind. By absorbing and integrating the big data technology of the new companies, the established vendors soon launched their own big data products and services.

After the database era, as available data continued to accumulate, the leading enterprises in each industry set out to discover the value of their data. The business intelligence systems of this stage generally centre on a data warehouse plus OLAP. In general, a traditional data warehouse can store big data but does not provide analysis and statistical functions aimed at big data. Therefore, when developing a data application such as OLAP, the user's analysis and statistics requirements must first be specified, and the results of those analyses must then be computed in advance; only then can the OLAP system respond interactively in real time. This combination of data warehouse and OLAP has an inherent defect: a change that looks minor from the end user's perspective may take a long time to deliver. Meanwhile, the overall level of operation and management within each industry keeps rising and competition keeps intensifying, which poses great challenges to every enterprise, the leading ones in particular.

To cope with these challenges and keep their dominant position in the industry, enterprises place higher demands on the business intelligence system. Yonghong believes that a data modeling approach that imports detailed data directly changes the relationship between data and application from tight coupling to loose coupling, so that most analysis applications require no change at the data level. A business intelligence system based on an MPP architecture can run high-performance analysis directly on the detailed data. In this way, users can develop data applications quickly and analyze data in real time, building a discovery-oriented, self-service business intelligence system on demand.

Yonghong Z-Data Mart is a data mart product that integrates data storage and processing and is built on Yonghong's own technology. To suit the scale of data a user needs to process, as well as different IT architectures and storage systems, it offers two modes: local mode and MPP mode. We suggest local mode when the data to be processed is below the terabyte scale, when an ordinary storage structure is used, or when a personal computer already meets the performance requirement. MPP mode, which is parallel processing on a distributed architecture, is the better fit when the storage system consists of heterogeneous databases, when the data to be processed is at terabyte or petabyte scale and beyond, when the IT and storage systems are distributed, or when only MPP mode can meet the performance requirement.

Z-Data Mart abandons scale-up entirely in favor of scale-out.

MPP

In-Database Computing

Z-Suite supports all common aggregations and almost all professional statistical functions. Using its Cross Granular Computing technology, the Z-Suite data analysis engine searches for the best computing plan and then moves the expensive computations to the place where the data is stored, which is what we call In-Database Computing. This technology significantly reduces data movement, lowers communication overhead and ensures high-performance data analysis.
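The idea of moving a computation to the data can be illustrated with a small sketch. It does not use any Z-Suite API: Python's built-in sqlite3 module stands in for the place of data storage, and the `sales` table, its columns and both function names are hypothetical. The sketch compares pulling every detail row to the client with pushing the aggregation into the store.

```python
import sqlite3

# sqlite3 is only a stand-in for the data store; the table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 1.5), ("north", 4.0)],
)


def totals_client_side(conn):
    """Pull every detail row to the client, then aggregate there."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals, len(rows)  # second value: rows moved to the client


def totals_in_database(conn):
    """Push the aggregation down; only one row per group leaves the store."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


client_totals, client_rows_moved = totals_client_side(conn)
db_totals, db_rows_moved = totals_in_database(conn)
```

Both functions return the same totals, but the pushed-down version moves one row per region instead of one row per sale. On real detail data the gap between the two row counts is what determines the saving in communication.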

MPP Computing

Z-Suite is a business intelligence platform based on an MPP architecture. It distributes a computation across multiple computing nodes and then has a designated node aggregate and output the results. Z-Suite can make full use of all computing and storage resources, whether servers or ordinary PCs, and places no strict requirements on network conditions. As a scale-out big data platform, Z-Suite brings the computing power of every node into full play, achieving response times of seconds for the analysis of TB/PB-scale data.
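The distribute-then-aggregate pattern can be sketched in a few lines. This is a single-machine simulation under stated assumptions, not Z-Suite's implementation: threads from Python's `concurrent.futures` play the role of computing nodes, and the function names `partition`, `node_partial` and `mpp_mean` are hypothetical. Each node computes a partial result over its own share of the data, and a designated step merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, n_nodes):
    """Split the detail data round-robin into n_nodes partitions."""
    return [values[i::n_nodes] for i in range(n_nodes)]


def node_partial(chunk):
    """Work done on one node: a partial (sum, count) over its local data."""
    return sum(chunk), len(chunk)


def mpp_mean(values, n_nodes=4):
    """Scatter the work to the nodes, then merge the partial results."""
    chunks = partition(values, n_nodes)
    # Threads simulate nodes here; a real cluster would use separate machines.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(node_partial, chunks))
    # The merge step runs in one place, like the designated node in the text.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count  # assumes values is not empty


data = list(range(1, 101))
result = mpp_mean(data)
```

The mean is computed from partial sums and counts rather than from partial means, because partitions may differ in size and averaging the averages would then give a wrong answer.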

Column-based storage

Z-Suite uses column-based storage. A data mart built on column-based storage does not read irrelevant columns, which lowers read and write costs and improves I/O efficiency, so query performance improves significantly. In addition, column-based storage compresses data better; the typical compression ratio is 5 to 10 times, so the space occupied by the data shrinks to between 1/5 and 1/10 of that needed by traditional storage. Good data compression saves storage device and memory costs and also significantly improves computing performance.
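A toy example shows why a column layout helps with both points. It is an illustration only: the sample rows and the names `to_columns`, `rle_encode` and `rle_decode` are hypothetical, and run-length encoding is used simply as one well-known column compression scheme, not necessarily the one Z-Suite applies.

```python
from itertools import groupby

# Hypothetical detail rows: (region, day, amount).
rows = [
    ("east", "2024-01-01", 10),
    ("east", "2024-01-01", 12),
    ("east", "2024-01-02", 7),
    ("west", "2024-01-02", 3),
    ("west", "2024-01-02", 9),
]


def to_columns(rows, names):
    """Turn row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def rle_encode(column):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, n in pairs for _ in range(n)]


columns = to_columns(rows, ["region", "day", "amount"])

# A query on "amount" touches only that column, not region or day.
total_amount = sum(columns["amount"])

# Repeated neighbouring values in a column collapse into short runs.
encoded_region = rle_encode(columns["region"])
```

Here the five-value `region` column is stored as two pairs. How well real data compresses depends on how repetitive each column is, so this toy result says nothing about the ratios to expect in practice.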

In-memory computing

Thanks to column-based storage and parallel computing, Z-Suite can compress data heavily and use the computing power and memory capacity of multiple nodes. In general, memory access is hundreds of times faster than disk access. With in-memory computing, the CPU reads data directly from memory rather than from disk and then computes on it. In-memory computing accelerates the traditional data processing model and is a key technology for big data analysis.