What is Data Mining?
Data mining is the process of sifting through large quantities of information
to gain insight into the underlying processes. A classic example comes from
law enforcement, where officers may comb through reams of information (phone
records, credit card receipts, records of meetings, and so forth) to identify
the relationships in a crime syndicate.
Another form of data mining runs large volumes of transactional data through a
process that finds patterns in the transactions. An example is crunching
through years of sales receipts for a grocery store to identify customers'
buying patterns. This type of data mining is a good fit for OLAP technologies
because it depends on aggregating data. An interesting aspect of this use of
the OLAP engine is that you most likely won't be operating on a cube. Instead,
you will create a data-mining model, train it on transactional data, and then
use it to process further transactional data. Data-mining engines often ship in
the same box as multidimensional cubes, but the two are only tangentially
related.
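
The grocery-receipt example can be made concrete with a small sketch. It is
written in plain Python rather than against any OLAP engine, and the receipts,
the function names `count_pairs` and `confidence`, and the choice of pair
counting as the "model" are all assumptions made for illustration. "Training"
here simply means aggregating counts over past baskets; "using" the model means
asking how often one item appears alongside another.

```python
from collections import Counter
from itertools import combinations

# Hypothetical receipts: each inner list is one customer's basket.
receipts = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
]

def count_pairs(baskets):
    """Count how many baskets contain each item and each unordered item pair."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # drop duplicates and fix a stable order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    return item_counts, pair_counts

def confidence(item_counts, pair_counts, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

# "Train": aggregate the historical receipts into counts.
item_counts, pair_counts = count_pairs(receipts)

# "Use": find the most common pair and how strongly bread implies milk.
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
print(confidence(item_counts, pair_counts, "bread", "milk"))
```

Real data-mining engines use far more capable models, but the sketch shows why
the technique depends on aggregation: every answer comes from counts rolled up
across many individual transactions.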
Data mining is a powerful technology with great potential to help companies
focus on the most important information in their data warehouses. Data mining
tools estimate likely future trends and behaviors from historical data,
allowing businesses to make proactive, knowledge-driven decisions.
Data mining is primarily used today by companies with a strong consumer focus -
retail, financial, communication, and marketing organizations. It enables these
companies to determine relationships among "internal" factors such as price,
product positioning, or staff skills, and "external" factors such as economic
indicators, competition, and customer demographics. It also enables them to
determine the impact of these factors on sales, customer satisfaction, and
corporate profits. Finally, it enables them to "drill down" into summary
information to view detailed transactional data.
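
As a rough illustration of "drilling down", the sketch below rolls a handful of
transactions up to a per-region summary and then retrieves the detail rows
behind one summary figure. The data, the region and product names, and the
functions `summarize` and `drill_down` are hypothetical.

```python
from collections import defaultdict

# Hypothetical transactions: (region, product, amount).
transactions = [
    ("North", "coffee", 120.0),
    ("North", "tea", 80.0),
    ("South", "coffee", 200.0),
    ("South", "coffee", 50.0),
]

def summarize(rows):
    """Roll transactions up to one total per region (the summary view)."""
    totals = defaultdict(float)
    for region, _product, amount in rows:
        totals[region] += amount
    return dict(totals)

def drill_down(rows, region):
    """Return the detail rows behind one region's summary figure."""
    return [row for row in rows if row[0] == region]

summary = summarize(transactions)
detail = drill_down(transactions, "South")
print(summary)
print(detail)
```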
Data mining consists of five major elements:
- Extract, transform, and load transaction data into the data warehouse system.
- Store and manage the data in a multidimensional database system.
- Provide data access to business analysts and information technology
  professionals.
- Analyze the data with application software.
- Present the data in a useful format, such as a graph or table.
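
The five elements above can be sketched end to end in a few lines of Python.
This is a toy illustration under stated assumptions, not a real warehouse: the
CSV text, the dates and amounts, and every function name are hypothetical, and
a dictionary keyed by (month, product) stands in for a multidimensional
database.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export; in practice this would come from operational systems.
RAW = """date,product,amount
2024-01-03,coffee,4.50
2024-01-15,tea,3.00
2024-02-02,coffee,5.00
"""

def extract_transform(raw):
    """Step 1: parse raw rows and convert fields to usable types."""
    for row in csv.DictReader(io.StringIO(raw)):
        yield row["date"][:7], row["product"], float(row["amount"])

def load(rows):
    """Step 2: store totals keyed by (month, product), a toy stand-in
    for a multidimensional store."""
    cube = defaultdict(float)
    for month, product, amount in rows:
        cube[(month, product)] += amount
    return dict(cube)

def query(cube, product):
    """Step 3: give an analyst access to one product's monthly totals."""
    return {month: total for (month, p), total in cube.items() if p == product}

def analyze(cube):
    """Step 4: compute the total per product across all months."""
    totals = defaultdict(float)
    for (_month, product), total in cube.items():
        totals[product] += total
    return dict(totals)

def present(totals):
    """Step 5: format the result as a plain-text table."""
    return "\n".join(
        f"{product:<10}{total:>8.2f}" for product, total in sorted(totals.items())
    )

cube = load(extract_transform(RAW))
totals = analyze(cube)
report = present(totals)
print(report)
```

Each function maps to one element of the list, which keeps the stages
independent: the storage step could be swapped for a real database without
touching the analysis or presentation code.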