What Is Data and Analytics?
Data and analytics (D&A) refers to the ways data is managed to support all uses of data, and the analysis of data to drive improved decisions, business processes and outcomes, such as discovering new business risks, challenges and opportunities.
What is the role of data and analytics in business?
Data and analytics is especially important to modern businesses as it can improve decision outcomes for all types of decisions (macro, micro, real-time, cyclical, strategic, tactical and operational). At the same time, D&A can unearth new questions and innovative solutions to questions — and opportunities — that business leaders had not even considered.
Progressive organizations use data in many ways and must often rely on data from outside their boundary of control for making smarter business decisions.
Data and analytics is also a catalyst for digital strategy and transformation as it enables faster, more accurate and more relevant decisions in complex and fast-changing business contexts.
Decisions are made by individuals (e.g., when a sales prospect is considering whether to buy a product or service) and by organizational teams (e.g., when determining how best to serve a client or citizen). Digital strategy is, therefore, largely about asking smarter questions via data to improve the outcome and impact of those decisions.
Data-driven decision making means using data to work out how to improve decision making processes. This leads to the idea of a decision model, which can include prescriptive analytical techniques that generate outputs that are able to specify which actions to take. Other analytical models are descriptive, diagnostic or predictive (also see “What are core analytics techniques?”) and these can help with other kinds of decisions.
Notably, decisions drive action but may equally determine when not to act.
Progressive organizations are infusing data and analytics into business strategy and digital transformation by creating a vision of a data-driven enterprise, quantifying and communicating business outcomes and fostering data-fueled business changes.
What are examples of data and analytics use cases in business?
Organizations increasingly use advanced analytics to tackle business problems, but the nature and complexity of the problem determine whether and how to use prediction, forecasting or simulation for the predictive analysis component. (Also see “What is advanced analytics?” and “What are core analytics techniques?”)
Scaling digital business especially complicates decision making and requires a mix of data science and more advanced techniques. The combination of predictive and prescriptive capabilities enables organizations to respond rapidly to changing requirements and constraints.
The following are examples of combining the predictive capabilities of forecasting and simulation with prescriptive capabilities:
-Forecasting the risk of infection during a surgical procedure combined with defined rules to drive actions that mitigate the risk
-Forecasting incoming orders for products combined with optimization to proactively respond to changing demand across the supply chain, but not relying on historical data that might be incomplete or “dirty”
-Simulating the division of customers into microsegments based on risk combined with optimization to quickly assess multiple scenarios and determine the optimal response strategy for each
Data and analytics is also used in different ways for different types of decisions. Making more effective business decisions requires executive leaders to know when and why to complement the best of human decision making with the power of data and analytics and AI.
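As a minimal sketch of the pattern in the examples above, the following Python pairs a simple forecast with a rule-based prescriptive step. Exponential smoothing is used here only as a stand-in forecasting method, and the function names, order history, smoothing factor and safety stock are all hypothetical values chosen for illustration.

```python
def forecast_next(history, alpha=0.5):
    """Simple exponential smoothing: returns a one-step-ahead forecast."""
    if not history:
        raise ValueError("history must not be empty")
    level = history[0]
    for value in history[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level


def recommend_action(forecast, stock_on_hand, safety_stock=20):
    """Rule-based prescriptive step: turn a forecast into an action."""
    shortfall = forecast + safety_stock - stock_on_hand
    if shortfall > 0:
        return ("reorder", round(shortfall))
    return ("hold", 0)


weekly_orders = [100, 120, 110, 130]  # hypothetical order history
predicted = forecast_next(weekly_orders)
action = recommend_action(predicted, stock_on_hand=90)
print(predicted, action)  # 120.0 ('reorder', 50)
```

Note that the rule can also return "hold", which reflects the point that a decision may equally determine when not to act.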
What are the key elements of data and analytics strategy?
It’s important for each organization to define what data and analytics means for them and what initiatives (projects) and budgets are necessary to capture the opportunities.
The key steps in data and analytics strategic planning are to:
-start with the mission and goals of the organization
-determine the strategic impact of data and analytics on those goals
-prioritize action steps to realize business goals using data and analytics objectives
-build a data and analytics strategic roadmap
-implement that roadmap (i.e., projects, programs and products) with a consistent and modern operating model
-communicate data and analytics strategy and its impact and results to win support for execution
The enterprise operating model for data and analytics must also work to overcome gaps in the data ecosystem, architectures and organizational delivery approaches needed to execute the D&A strategy.
What is data literacy?
Data literacy is the ability to read, write and communicate data in context. It requires an understanding of data sources and constructs, analytical methods and techniques applied and the ability to describe the use-case application and resulting value. This might sound like an argument for training every employee as a data scientist, but that’s not the case. From a business perspective, you might simply summarize data literacy as a program to help business leaders learn how to ask smarter questions of the data around them.
Building data literacy within an organization is a culture and change management challenge, not a technology one. D&A is ever-more pervasive in all aspects of all business, in communities and even in our personal lives. The ability to communicate in the associated language — to be data-literate — is increasingly important to organizations’ success. However, this kind of lasting, meaningful change requires people to learn new skills and behavior.
Best practices for organizations include putting much more emphasis, energy and effort into the change management piece of D&A strategy, leveraging leadership and change agents, addressing both data literacy (“skills,” also expressed as “aptitude”) and culture (“will,” alternatively expressed as “attitude”). Data literacy must start with a leader taking a stance. For example, the CIO or chief data officer, along with the finance (usually business intelligence (BI)) leaders and HR organizations (development and training), can introduce data literacy programs to provide their peers with the tools to adapt and adopt D&A in their respective departments.
As part of an overall data literacy program, data storytelling can create positive and impactful stakeholder engagement. It applies deliberate techniques to frame data and insights in data-driven stories that make it easy for stakeholders to interpret, understand and act on the data being shared.
What is data and analytics governance?
Data and analytics governance (or what many organizations call “information governance”) specifies decision rights and accountability to ensure appropriate behavior as organizations seek to value, create, store, access, analyze, consume, retain and dispose of their information assets. It’s critical to link data and analytics governance to overall business strategy and anchor it to those data and analytics assets that organizational stakeholders consider critical.
Data and analytics governance encompasses the people (such as executive policymakers, decision makers and business D&A stewards), processes (such as the D&A architecture and engineering process and decision-making processes) and technologies (such as master data management hubs) that provision trusted and reliable mission-critical data throughout an enterprise.
Notably, while governance originally focused only on regulatory compliance, it is now evolving and expanding to govern the least amount of data for the largest business impact. In other words, D&A governance has grown to accommodate offensive capabilities that add business value, as well as defensive capabilities that protect the organization.
Effective data and analytics governance must also balance enterprisewide and business-area governance, but it requires a standardized enterprise approach that has proven to sufficiently engage business leaders. D&A governance does not exist in a vacuum; it must take its cues from the D&A strategy. Make sure to reference specific business outcomes by integrating concrete, measurable metrics (e.g., percentage of customer retention in a specific market segment and percentage of revenue via ecosystem partners) that link data and analytics assets and initiatives with business and stakeholder value.
What is the future of data and analytics technologies?
The data group was once separate from the analytics team, and each entity was managed accordingly, but the formerly distinct markets for these technologies are colliding in many different ways. For example, data management platforms increasingly incorporate analytics, especially ML, to speed up their capabilities.
Analytics and BI platforms are developing data science capabilities, and new platforms are emerging in cases such as D&A governance. Cloud service providers are creating yet another form of complexity as they increasingly dominate the infrastructure platform on which all these services are used.
Traditional platforms across the data, analytics and AI markets struggle to accommodate the growing number of data and analytics use cases, so organizations must balance the high total cost of ownership of existing, on-premises solutions against the need for increased resources and emerging capabilities, such as natural language query, text mining, and analysis of semistructured and unstructured data.
The future of data and analytics therefore requires organizations to invest in composable, augmented data management and analytics architectures to support advanced analytics. Modern D&A systems and technologies are likely to include the following.
Data management systems
Master data management (MDM) is a technology-enabled business discipline in which business functions and IT work together to ensure the uniformity, accuracy, stewardship, governance, semantic consistency and accountability of the enterprise’s official shared master data assets.
Data hubs are focused on enabling data sharing and governance. Producers and consumers of data connect with one another through the data hub, with governance controls and common models applied to enable effective data sharing. MDM is a data hub focused only on master data. Data catalogs are increasingly moving into the governance space, and so they too are starting to become data (and analytics) hubs.
Data centers physically house servers (as opposed to warehouses, which are data structures housed on servers or in the cloud), and their future depends on the degree to which workloads can be moved to the cloud. Those migration decisions must be based on the business benefits of doing so.
Data warehouses provide an endpoint for collecting transactional, detailed (and sometimes other types of) data. They support predictable analyses for data whose value is well-established — that is, well-known, predefined and repeatable analytics that are scalable across many users in the enterprise.
Data lakes collect unrefined data (in its native form, with limited transformation and quality assurance and intrinsic governance) and allow users to explore and analyze it in a highly interactive way. Data lakes don’t replace data warehouses or other systems of record; rather, they complement them by storing unrefined data that may hold great value. The sweet spot for data lakes is the world of pure discovery, data science and iterative innovation.
Data fabric
Data fabric is an emerging data management design that enables augmented data integration and sharing across heterogeneous data sources. Data fabrics have emerged as an increasingly popular design choice to simplify an organization’s data integration infrastructure and create a scalable architecture.
Once widely implemented, data fabrics could significantly reduce manual data integration tasks and augment (and, in some cases, completely automate) data integration design and delivery. However, data fabrics are still an emergent design concept, and no single vendor currently delivers, in an integrated manner, all the mature components that are needed to stitch together the data fabric. Ultimately, organizations must decide whether to develop their own data fabric using modernized capabilities spanning the above technologies and more, such as active metadata management.
Data fabric also consists of a mix of mature and less mature technology components, so organizations must carefully mix and match composable technology components as their use cases evolve.
D&A in the cloud
Traditional D&A platforms are challenged to handle increasingly complicated analytics, and the total cost of ownership of on-premises solutions continues to grow because of the complexity, increased resources and maintenance of the environment. In contrast, cloud data and analytics offers more value and capabilities through new services, simplicity and agility to handle data modernization — and demands new types of analytics, such as streaming analytics, specialized data stores and more self-service-friendly tools to support end-to-end deployment.
Cloud deployment — whether hybrid, multicloud or intercloud — must account for many D&A components, including data ingestion, data integration, data modeling, data optimization, data security, data quality, data governance, management reporting, data science and ML.
What is advanced analytics?
Advanced analytics uses sophisticated quantitative methods to produce insights unlikely to be discovered through traditional approaches to business intelligence (BI). It spans predictive, prescriptive and artificial intelligence techniques, such as ML. In short:
-Analytics and BI represent the foundational or traditional way to develop insights, reports and dashboards.
-Advanced analytics represents the use of data science and machine learning technologies to support predictive and prescriptive models.
While both are valuable to every organization for different reasons, the market as a whole is changing. Instead of treating traditional and advanced analytics as separate categories, the technologies are becoming composable and organizing around roles and personas, from business roles who want self-service capabilities to advanced analytics roles looking to program and engineer.
Augmented analytics refers to the use of ML/AI techniques to transform how insights from analytics are developed, consumed and shared. Augmented analytics includes natural language processing and conversational interfaces, which allow users without advanced skills to interact with data and insights.
Advanced analytics enables executive leaders to ask and answer more complex and challenging questions in a timely and innovative way. This creates a foundation for better decisions by leveraging sophisticated and clever mechanisms to solve problems (interpret events, support and automate decisions and take actions).
Advanced analytics can leverage different types and sources of data inputs than traditional analytics does and, in some cases, create net new data, so it requires a rigorous data governance strategy and a plan for required infrastructure and technologies. For example, data lakes can be used to manage unstructured data in its raw form. (Also see “What is the future of data and analytics technologies?”)
Advanced analytics provides a growing opportunity for data and analytics leaders to accelerate the maturation and use of data and analytics to drive smarter business decisions and improved outcomes in their organizations. Gauging the current and desired future state of the D&A strategy and operating models is critical to capturing the opportunity.
What are core analytics techniques?
Data is widely used in every organization, and while not all data is used for analytics, analytics cannot be performed without data. The technologies needed across data, all its use cases, and the analysis of that data exist across a wide range, and this helps explain the varied use — by organizations and vendors — of the term “data and analytics” (or “data analytics”).
References to “data” imply or should imply operational uses of that data in, say, business applications and systems, such as core banking, enterprise resource planning and customer service. “Analytics” (or what some call “data analytics”) refers to the analytical use cases of data that often take place downstream, as in after the transaction has occurred.
Analytics, as described, comprises four techniques:
Descriptive analytics
This uses business intelligence (BI) tools, data visualization and dashboards to answer “What happened?” or “What is happening?” Procurement, for example, can answer questions like “What did we spend on commodity X in the last quarter?” and “Who are our biggest suppliers for commodity Y?”
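The procurement questions above amount to grouping and summing historical records. The sketch below shows one way to do that in plain Python; the purchase records, supplier names and function names are hypothetical and only illustrate the idea, whereas a real BI tool would run equivalent aggregations against a governed data source.

```python
from collections import defaultdict

# Hypothetical purchase records: (supplier, commodity, amount spent)
purchases = [
    ("Acme", "steel", 1200.0),
    ("Bolt Co", "steel", 800.0),
    ("Acme", "copper", 500.0),
    ("Cable Inc", "copper", 1500.0),
]


def spend_by_commodity(records):
    """Total spend per commodity (what did we spend on commodity X?)."""
    totals = defaultdict(float)
    for _supplier, commodity, amount in records:
        totals[commodity] += amount
    return dict(totals)


def top_suppliers(records, commodity):
    """Suppliers for one commodity, largest spend first."""
    totals = defaultdict(float)
    for supplier, item, amount in records:
        if item == commodity:
            totals[supplier] += amount
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


print(spend_by_commodity(purchases))
print(top_suppliers(purchases, "copper"))
```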
Diagnostic analytics
This requires drill-down and data mining capabilities to answer “Why did X happen?” For example, sales leaders can use diagnostics to identify the behaviors of sellers who are on track to meet their quotas.
Predictive analytics
Predictive analytics typically deals with probabilities and can be used to predict a series of outcomes over time (that is, forecasting) or to highlight uncertainties related to multiple possible outcomes (that is, simulation). It tells us what to expect, addressing the question “What is likely to happen?” It does not, however, answer other questions, such as “What should be done about it?”
Predictive analytics relies on techniques such as predictive modeling, regression analysis, forecasting, multivariate statistics, pattern matching and machine learning (ML).
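To make the distinction between forecasting and simulation concrete, here is a small sketch under stated assumptions: a least-squares trend line stands in for forecasting, and a Monte Carlo draw from a normal distribution stands in for simulation. The sales figures, distribution parameters and function names are hypothetical, and real predictive models are usually far richer than either technique shown here.

```python
import random


def linear_forecast(series, steps):
    """Fit a least-squares trend line and extend it `steps` periods ahead.

    Requires at least two observations.
    """
    n = len(series)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(series) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, series)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n + k) for k in range(steps)]


def simulate_totals(mean, spread, periods, runs, seed=0):
    """Monte Carlo simulation: sample many possible totals over `periods`."""
    rng = random.Random(seed)  # seeded so repeated runs give the same samples
    return [
        sum(rng.gauss(mean, spread) for _ in range(periods)) for _ in range(runs)
    ]


sales = [10.0, 12.0, 14.0, 16.0]  # hypothetical, perfectly linear history
print(linear_forecast(sales, 2))  # [18.0, 20.0]

totals = sorted(simulate_totals(mean=15.0, spread=3.0, periods=4, runs=1000))
print(totals[50], totals[949])  # rough 5th and 95th percentiles of the total
```

The forecast returns a single expected path, while the simulation returns a spread of possible outcomes, which is what lets it highlight uncertainty.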
Prescriptive analytics
Prescriptive analytics aims to calculate the best way to achieve or influence an outcome; in other words, it aims to drive action. When combined with predictive analytics, prescriptive analytics naturally draws on and extends predictive insights, addressing the questions “What should be done?” or “What can we do to make a given outcome happen?”
Prescriptive analytics includes both rule-based approaches (incorporating existing knowledge in a structured manner) and optimization techniques (traditionally used by operations research groups) that look for optimal outcomes within constraints to generate executable plans of action. Prescriptive analytics relies on techniques such as graph analysis, simulation, complex-event processing and recommendation engines. (Also see “What is advanced analytics?”)
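Optimization within constraints can be illustrated with a toy production-planning problem. The sketch below uses exhaustive search, which is a deliberate simplification that only works for tiny problems; operations research teams would normally use a dedicated solver. The profit figures, machine hours, capacity and demand caps are hypothetical, and the demand caps could in practice come from a predictive model.

```python
from itertools import product


def best_plan(profit, hours, capacity, max_units):
    """Exhaustively search integer plans; return (best_profit, best_plan)."""
    best = (0, tuple(0 for _ in profit))
    ranges = [range(limit + 1) for limit in max_units]
    for plan in product(*ranges):
        used = sum(h * q for h, q in zip(hours, plan))
        if used > capacity:
            continue  # violates the capacity constraint
        value = sum(p * q for p, q in zip(profit, plan))
        if value > best[0]:
            best = (value, plan)
    return best


# Two hypothetical products: profit per unit, machine hours per unit,
# 8 machine hours available, and demand caps of 4 and 3 units.
print(best_plan(profit=[30, 50], hours=[1, 2], capacity=8, max_units=[4, 3]))
# (220, (4, 2))
```

The output is an executable plan (how many units of each product to make), not just an expectation of what will happen.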
Combining predictive and prescriptive capabilities is often a key first step in solving business problems and driving smarter decisions. Understanding the potential use cases for different types of analytics is critical to identifying the roles and competencies, infrastructure and technologies that your organization will need to be truly data-driven, especially as the four core types of analytics converge with artificial intelligence (AI) augmentation.
What is “big data”?
The term “big data” has been used for decades to describe data characterized by high volume, high velocity and high variety, and other extreme conditions. However, the big data era is epitomized for businesses by the risks and opportunities — specifically that the explosion in data traffic (especially with the evolution of Internet use and computing power) offers a rich source of insights to improve decisions but creates challenges for organizations in how they store, manage and analyze big data.
Most organizations have found ways to derive business intelligence from big data, but many struggle to manage and analyze a diverse and broad set of content (including audio, video and image assets) at scale — particularly as the universe of data sources grows and changes and the need for insights is increasingly driven by advanced analytics.
Progressive organizations no longer distinguish between efforts to manage, govern and derive insight from non-big and big data; today, it's all just data. Instead, they are aggressively looking to leverage new kinds of data and analysis — and to find relationships in combinations of diverse data to improve their business decisions, processes and outcomes.
Synthetic data, for example, is generated by applying a sampling technique to real-world data or by creating simulation scenarios where models and processes interact to create completely new data not directly taken from the real world. This is most helpful when ML models are built on data sets that lack exceptional conditions that business users know are possible, even if only remotely; data covering those conditions is still needed to train the models.
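Both routes to synthetic data can be sketched in a few lines. In this illustration, resampling real values with small random noise stands in for "a sampling technique," and drawing values from an extreme range stands in for a simulation scenario that covers exceptional conditions. The transaction amounts, noise level, value ranges and function names are all hypothetical assumptions, not a recommended generation method.

```python
import random


def resample_with_noise(real_values, count, noise=0.05, seed=0):
    """Bootstrap-style sampling: draw real values and perturb each slightly."""
    rng = random.Random(seed)
    return [
        value * (1 + rng.uniform(-noise, noise))
        for value in rng.choices(real_values, k=count)
    ]


def simulate_rare_events(count, low, high, seed=0):
    """Simulation: generate extreme values absent from the real data."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(count)]


real_amounts = [20.0, 35.0, 50.0, 42.0]  # hypothetical transaction amounts
synthetic = resample_with_noise(real_amounts, count=100)
rare = simulate_rare_events(count=10, low=5000.0, high=10000.0)

# The combined set now contains exceptional cases the real data lacked.
training_set = real_amounts + synthetic + rare
print(len(training_set))  # 114
```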
The global pandemic and other business disruptions have also accelerated the need to use more types of data across a broad range of use cases (especially as historical big data has proved less relevant as a basis for future decisions). Concerns over data sourcing, data quality, bias and privacy protection have also affected big data gathering and, as a result, new approaches known as “small data” and “wide data” are emerging.
The wide data approach enables the analysis and synergy of a variety of small and large data sources, including both highly organized, largely quantitative (structured) data and qualitative (unstructured) data. The small data approach uses a range of analytical techniques to generate useful insights, but it does so with less data.
We now use the term X-analytics to collectively describe small, wide and big data — in fact, all kinds of data — but we expect that by 2025, 70% of organizations will be compelled to shift their focus from big data to small and wide data to leverage available data more effectively, either by reducing the required volume or by extracting more value from unstructured, diverse data sources. (Also see “What is advanced analytics?”)
This and other predictions for the evolution of data analytics offer important strategic planning assumptions to enhance D&A vision and delivery.