Exploring Definitions and Roles
Data is everywhere, literally. From the moment you awaken until the time you sleep, some system somewhere collects data on your behalf. Even as you sleep, data is being generated that correlates to some aspect of your life. What is done with this data is often the proverbial 64-million-dollar question. Does the data make sense? Does it have any sort of structure? Is the dataset so voluminous that finding what you’re looking for is like finding a needle in a haystack? Or is it more like you can’t even find what you need unless you have a special tool to help you navigate?
The answer to that last question is an emphatic yes, and that’s where data analytics and business intelligence join the party. And let’s be honest: The party can be overwhelming when data is constantly being generated on your behalf.
This chapter discusses the different types of data you may encounter when you begin working with data. It introduces the key terminology you should become familiar with up front. You learn a few key concepts to give you a head start working with business intelligence, and you get the “what’s what” of business intelligence tools and techniques.
What Is Data, Really?
Ask a hundred people in a room what the definition of data is and you may receive one hundred different answers. Why is that? Because, in the world of business, data means a lot of different things to a lot of different people. So, let’s try to get a streamlined response. Data contains facts. Sometimes, the facts make sense; sometimes, they’re meaningless unless you add a bit of context.
The facts can be quantities, characters, symbols, or some combination of these that comes together when information is collected. The information allows people (and, more importantly, businesses) to make sense of facts that, unless brought together, make no sense whatsoever.
When you have an information system full of business data, you also must have a set of unique data identifiers you can use so that, when searched, it’s easy to make sense of the data in the form of a transaction. Examples of transactions might include the number of jobs completed, inquiries processed, income received, and expenses incurred.
The list can go on and on. To gain insight into business interactions and conduct analyses, your information system must have relevant and timely data that is of the highest quality.
Data isn’t the same as information. Data is the raw facts. That means you should think of data in terms of the individual fields or columns of data you may find in a relational database or perhaps the loose document (tagged with some descriptors called metadata) stored in a document repository. On their own, these items are unlikely to make much sense to you or a business. And that’s perfectly okay — sometimes. Information is the collective body of all those data parts that result in the factoids making logical sense.
Working with structured data
Have you ever opened a database or spreadsheet and noticed that data is bound to specific columns or rows? For example, would you ever find a United States zip code containing letters of the alphabet? Or, perhaps when you think of a first name, middle initial, and last name, you notice that you always find letters in those specific fields. Another example is a field that limits the number of characters you can enter, such as a flag where Y means Yes and N means No. Anything else is invalid.
This type of data is called structured data. When you evaluate structured data, you notice that it conforms to a tabular format, meaning that each column and row must maintain an interrelationship. Because each column has a representative name that adheres to a predefined data model, your ability to analyze the data should be straightforward.
If you’re using Power BI (covered in Book 2) or Tableau (covered in Book 3), you notice that structured data conforms to a formal specification of tables with rows and columns, commonly referred to as a data schema. In Figure 1-1, you find an example of structured data as it appears in a Microsoft Excel spreadsheet.
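To make the idea of a schema concrete, here is a minimal sketch in Python. The column names, the rules, and the sample rows are all hypothetical and chosen only to mirror the examples above (letters in name fields, digits in a zip code, Y or N in a flag). Real names can contain hyphens or spaces, so treat the rules as a simplification rather than a recommended validation policy.

```python
import re

# Hypothetical schema: each column name maps to a rule its value must satisfy.
SCHEMA = {
    "first_name": lambda v: v.isalpha(),
    "middle_initial": lambda v: len(v) == 1 and v.isalpha(),
    "last_name": lambda v: v.isalpha(),
    "zip_code": lambda v: re.fullmatch(r"\d{5}", v) is not None,  # five digits only
    "active": lambda v: v in ("Y", "N"),                          # Y is Yes, N is No
}


def validate_row(row):
    """Return the names of the columns whose values break the schema."""
    return [col for col, rule in SCHEMA.items() if not rule(row.get(col, ""))]


# Illustrative rows; the second one has a letter O in the zip code
# and a flag value that is neither Y nor N.
good = {"first_name": "Ada", "middle_initial": "K", "last_name": "Lovelace",
        "zip_code": "90210", "active": "Y"}
bad = {"first_name": "Ada", "middle_initial": "K", "last_name": "Lovelace",
       "zip_code": "9O210", "active": "Maybe"}

print(validate_row(good))
print(validate_row(bad))
```

Because every column has a name and a rule, a tool can check each value mechanically, which is what makes structured data straightforward to analyze.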
Looking at unstructured data
Unstructured data is ambiguous, having no rhyme, reason, or consistency whatsoever. Pretend that you’re looking at a batch of photos or videos. Are there explicit data points that one can associate with a video or photo? Perhaps, because the file itself may have a structure and carry some metadata. However, the byproduct itself (the represented depiction) is unique. The data isn’t replicable; therefore, it’s unstructured. That’s why any video, audio, photo, or text file is considered unstructured data. Products such as Power BI and Tableau offer limited support for unstructured data.
Adding semi-structured data to the mix
Semi-structured data does have some formality, but it isn’t stored in a relational system and it has no set format. Fields containing the data are by no means neatly organized into strategically placed tables, rows, or columns. Instead, semi-structured data contains tags that make the data easier to organize in some form of hierarchy. Nonrelational data systems or NoSQL databases are best associated with semi-structured data, where the programmatic code, often serialized, is driven by the technical requirements. There is no hard-and-fast coding practice.
For the business intelligence developer working with semi-structured formats, serialization practices can assist in writing sophisticated code. Whether the goal is to write data to a file, send a data snippet to another system, or parse the data into a form suitable for structured consumption, semi-structured data does have potential for business intelligence systems. That potential is greatest when the system producing the data and the system consuming it agree on the same serialization format.
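As an illustration of parsing semi-structured data for structured consumption, the sketch below uses JSON as the serialization format and Python’s standard json module. The records, field names, and the flatten helper are hypothetical; the point is only that tagged, nested fields can be turned into a consistent set of columns, with gaps where a record lacks a field.

```python
import json

# Hypothetical JSON payload: the two records do not share the same fields.
raw = """
[
  {"id": 1, "name": "Acme",
   "contact": {"email": "ops@acme.example", "phone": "555-0100"}},
  {"id": 2, "name": "Globex", "tags": ["priority"]}
]
"""


def flatten(record, parent=""):
    """Turn nested tags into dotted column names, e.g. contact.email."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, list):
            flat[name] = ", ".join(str(item) for item in value)
        else:
            flat[name] = value
    return flat


records = [flatten(record) for record in json.loads(raw)]

# Build one shared column list, then fill missing fields with None.
columns = sorted({column for record in records for column in record})
rows = [[record.get(column) for column in columns] for record in records]

print(columns)
for row in rows:
    print(row)
```

The resulting columns and rows could then be loaded into a table, which is the kind of structured shape that BI tools generally expect.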
Discovering Business Intelligence
Many IT vendors define business intelligence differently. They put their spin on the term by injecting their tool lingo into the definition. For example, if you were to go to a Microsoft website, you’d be sure to find a page or two that would have a pure definition of business intelligence, but you’d also find a gazillion pages detailing how you can apply Power BI or Excel-based solutions to every conceivable business problem.
So, let’s avoid the vendor websites and stick with a no-frills definition of business intelligence: Simply put, business intelligence (BI) is what businesses use to analyze current as well as historical data. Throughout the process of data analysis, the hope is that an organization will be able to uncover the insights needed to make the right decisions for the business’s future. By using a combination of available tools, an organization can process large datasets across multiple data sources in order to come up with findings that can then be presented to upper management. Using an enterprise BI tool, for example, interested parties can produce visualizations via reports, dashboards, and KPIs as a way to ground their growth strategies in the world of facts.
Not so very long ago, businesses had to do many tasks manually. BI tools now save the day by reducing the effort needed to complete mundane tasks. You can take four actions right now to transform raw data into readily accessible data:
Collect and transform your data: When using multiple data sources, BI tools allow you to extract, transform, and load (ETL) data from structured and unstructured sources. When that process is complete, you can then store the data in a central repository so that an application can analyze and query the data.
Analyze data to discover trends: The term data analysis can mean many things, from data discovery to data mining. The business objective, however, is always the same; what varies is the size of the dataset, the degree of automation, and the goal of the pattern analysis. BI often provides users with a variety of modeling and analytics tools. Some come equipped with visualization options, and others have data modeling and analytics solutions for exploratory, descriptive, predictive, statistical, and even cognitive evaluation analysis. All these tools help users explore data: past, present, and future.
Use visualization options to provide data clarity: You may have lots of data stored in one or more repositories. Querying that data so it can be understood and shared among users and groups is the real value of business intelligence tools. Visualization options often include reporting, dashboards, charts, graphics, mapping, key performance indicators, and, yes, datasets.
Take action and make decisions: The process culminates with all the data at your fingertips, ready to support actionable decisions. Companies act by drawing insights from across a dataset. They parse through data in chunks, reviewing small subsets of data and potentially making significant decisions. That’s why companies embrace business intelligence: With its help, they can quickly reduce inefficiency, correct problems, and adapt the business to market conditions.
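The first two actions above, collecting and transforming data and then querying it from a central repository, can be sketched with Python’s standard library. Everything specific here is an assumption made for illustration: the two sources, the region and amount fields, and the use of an in-memory SQLite database as the repository. A real BI tool would handle these steps through its own connectors.

```python
import csv
import io
import sqlite3

# Extract: two hypothetical sources that describe sales in different shapes.
csv_source = io.StringIO("region,amount\nEast,100.50\nWest,200\n")
dict_source = [{"Region": "east", "Amount": "49.50"},
               {"Region": "north", "Amount": "75"}]


def transform(record):
    """Normalize field names, casing, and types to one common shape."""
    lowered = {key.lower(): value for key, value in record.items()}
    return (lowered["region"].strip().title(), float(lowered["amount"]))


rows = [transform(record) for record in csv.DictReader(csv_source)]
rows += [transform(record) for record in dict_source]

# Load: store the cleaned rows in a central repository (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Analyze: query the repository for the total per region.
query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
totals = dict(conn.execute(query))

print(totals)
```

Once the data sits in one place with one shape, the later actions (visualizing it and acting on it) work from the same query results.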
Understanding Data Analytics
Raw data is largely useless on its own. If you’ve ever briefly glanced at a large data set that has columns and rows of numbers, you quickly realized that not much can be gleaned from it.
In order to make sense of data, you have to apply specific tools and techniques. The process of examining data to produce answers or find conclusions is called data analytics. Data analysts take a formal and disciplined approach to data analytics. This step is necessary for any individual or organization seeking to make good decisions.
The process of data analytics varies depending on resources and context, but generally follows the steps outlined in Figure 1-2. These steps commence after the problem and questions have been identified.
Descriptive: Existing data sets of historical data are accessed, and analysis is performed to determine what the data tells stakeholders about the performance of a key performance indicator (KPI) or other business objective. It provides insight into past performance.
Diagnostic: As the term suggests, this analysis tries to glean the answer from the data as to why something happened. It uses descriptive analysis to look at the cause.
Predictive: In this approach, the analyst uses techniques to determine what may occur in the future. It applies tools and techniques to historical data and trends to predict the likelihood of certain outcomes.
Prescriptive: This analysis focuses on what action should be taken. In combination with predictive analytics, prescriptive techniques provide estimates on the probabilities of a variety of future outcomes.
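The four approaches above can be sketched on a tiny data set. The monthly figures, the promotion flag, and the decision rule below are all hypothetical, and the forecast is a plain least-squares trend line rather than any particular tool’s method. The prescriptive step in particular is a toy rule, far simpler than the probability estimates real prescriptive techniques produce.

```python
from statistics import mean

# Hypothetical monthly history: units sold and whether a promotion ran.
units = [120, 130, 90, 150, 160, 170]
promo = [True, True, False, True, True, True]
months = list(range(1, len(units) + 1))

# Descriptive: what happened? Summarize past performance.
average_units = mean(units)

# Diagnostic: why did it happen? Compare months with and without a promotion.
with_promo = mean(u for u, p in zip(units, promo) if p)
without_promo = mean(u for u, p in zip(units, promo) if not p)

# Predictive: what may happen? Fit a least-squares trend line to the history
# and extend it one month ahead.
x_bar, y_bar = mean(months), mean(units)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(months, units))
         / sum((x - x_bar) ** 2 for x in months))
intercept = y_bar - slope * x_bar
forecast_next = intercept + slope * (len(months) + 1)

# Prescriptive: what should be done? A toy rule based on the diagnostic result.
if with_promo > without_promo:
    recommendation = "run a promotion"
else:
    recommendation = "skip the promotion"

print(round(average_units, 1), with_promo, without_promo)
print(round(forecast_next, 1), recommendation)
```

Each step builds on the one before it: the diagnostic comparison uses the descriptive summary, and the recommendation leans on both the comparison and the forecast.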
Data analytics involves the use of a variety of software tools depending on the needs, complexities, and skills of the analyst. Beyond your favorite spreadsheet program, which can deliver a lot of capabilities, data analysts use products such as R, Python, Tableau, Power BI, QlikView, and others.
If your organization is big enough and has the budget, having one or more data analysts is certainly a minimum requirement for serious analytics. With that said, every organization should now consider some basic data analytics skills for most staff. In a data-centric, digital world, having data science as a growing business competency may be as important as basic word processing and email skills.
Exploring Data Management
No, data management is not the same as data governance. But they work closely together to deliver results in the use of enterprise data.
Data governance concerns itself with, for example, defining the roles, policies, controls, and processes for increasing the quality and value of organizational data.
Data management is the implementation of data governance. Without data management, data governance is just wishful thinking. To get value from data, there must be execution.
At some level, all organizations implement data management. If you collect and store data, technically you’re managing that data. What matters in data management is the degree of sophistication that is applied to managing the value and quality of data sets. If it’s on the low side, data may be a bottleneck rather than an advantage. Poor data management often results in data silos across an organization, security and compliance issues, errors in data sets, and an overall low confidence in the quality of data.
Who would choose to make decisions based on bad data?
On the other hand, good data management can result in more success in the marketplace. When data is handled and treated as a valuable enterprise asset, insights are richer and timelier, operations run smoother, and team members have what they need to make more informed decisions. Well-executed data management can translate to fewer data security breaches and fewer compliance, regulatory, and privacy issues.
Data management processes involve the collection, storage, organization, maintenance, and analytics of an organization’s data. They include the architecture of technology systems such that data can flow across the enterprise and be accessed whenever it is needed by whoever is approved to use it. Additionally, responsibilities will likely include such areas as data standardization, encryption, and archiving.
Technology team members have elevated roles in all these activities, but all business stakeholders have some level of data responsibilities, such as compliance with data policies and realizing data value.
Diving into Data Analysis
Data analysis is the application of tools and techniques to organize, study, reach conclusions, and sometimes make predictions about a specific collection of information.
For example, a sales manager might use data analysis to study the sales history of a product, determine the overall trend, and produce a forecast of future sales. A scientist might use data analysis to study experimental findings and determine the statistical significance of the results. A family might use data analysis to find the maximum mortgage it can afford or how much it must put aside each month to finance retirement or the kids’ education.
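The family example can be worked through with two standard time-value-of-money formulas. This is a minimal sketch under stated assumptions: a fixed annual rate, monthly compounding, and payments made at the end of each month. The function names and the sample figures are hypothetical, and the result ignores taxes, fees, and insurance.

```python
def max_loan(monthly_payment, annual_rate, years):
    """Largest loan a fixed monthly payment can repay (present value of an annuity)."""
    rate = annual_rate / 12   # monthly interest rate
    periods = years * 12      # number of monthly payments
    if rate == 0:
        return monthly_payment * periods
    return monthly_payment * (1 - (1 + rate) ** -periods) / rate


def monthly_saving(goal, annual_rate, years):
    """Monthly deposit needed to reach a savings goal (future value of an annuity)."""
    rate = annual_rate / 12
    periods = years * 12
    if rate == 0:
        return goal / periods
    return goal * rate / ((1 + rate) ** periods - 1)


# Illustrative figures only: a 1,500 payment at 6% over 30 years,
# and a 100,000 education fund at 4% over 18 years.
loan = max_loan(1500, 0.06, 30)
deposit = monthly_saving(100_000, 0.04, 18)

print(f"Maximum mortgage: {loan:,.0f}")
print(f"Monthly deposit: {deposit:,.2f}")
```

Changing the rate or the term and rerunning the functions shows how sensitive both answers are to those assumptions, which is the kind of what-if question data analysis is meant to answer.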