Your company is about to begin a Big Data project and it seems overwhelming. That’s understandable — Big Data is, after all, big. There’s a lot of it, often generated by many different sources: Internet of Things (IoT) sensors, spreadsheet data or customer behavior online. And those are just a few examples of possible sources.
As intimidating as Big Data can be, data management shouldn’t be painful, as long as you plan ahead. To make the most of the data you have coming in, your organization will need to build a Big Data management strategy — a plan for how you will manage, use, share and store data within your company.
We recommend a 5-step approach to planning for your data. Here’s a quick look at the steps:
- Select the right data
- Choose your data storage
- Plan for how data will be shared
- Combine your data
- Plan for data governance
Let’s delve further into that list.
1. Select the right data
You have dozens of data streams of all different kinds coming into your organization. For example, your business might be pulling in the following data: weather, social comments and shares, email, sales, overhead costs, healthcare incidents and information from your Internet of Things devices. It’s a lot of data and you’re not going to be able to use every piece of it. To find meaningful patterns in all that information, you need to use the right data. That’s where your data management strategy comes in.
Your data strategy needs to lay out the parameters for the kind of data that will be selected and included in your overall program. Your plan should help you label the data in a consistent way, identify its source and tie in any available metadata. As part of this step, you may want to create a data glossary that helps your internal staff quickly find and understand the data stream, the context of how it comes to your business and why it's important to your organization.
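To make the glossary idea concrete, here is a minimal sketch of what one entry and a lookup might look like. The field names and the `DataGlossary` class are our own illustration, not a standard schema, so adapt them to whatever your plan defines.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class GlossaryEntry:
    """One data stream described in the glossary (hypothetical schema)."""
    name: str        # consistent label for the stream
    source: str      # where the data originates
    context: str     # how the data arrives in the business
    importance: str  # why the organization cares about it
    metadata: dict = field(default_factory=dict)  # any available metadata


class DataGlossary:
    """In-memory glossary keyed by a normalized stream label."""

    def __init__(self) -> None:
        self._entries: dict[str, GlossaryEntry] = {}

    @staticmethod
    def _key(name: str) -> str:
        # Normalize labels so "Web Sales" and "web_sales" resolve to one entry.
        return name.strip().lower().replace(" ", "_")

    def add(self, entry: GlossaryEntry) -> None:
        key = self._key(entry.name)
        if key in self._entries:
            raise ValueError(f"duplicate glossary label: {entry.name!r}")
        self._entries[key] = entry

    def find(self, name: str) -> GlossaryEntry | None:
        return self._entries.get(self._key(name))
```

Normalizing the label on both `add` and `find` is one simple way to enforce consistent labeling, so staff can find a stream however they happen to type its name.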
2. Choose your data storage
Once you’ve chosen your data, it’s time to decide how you’ll store it. You have a lot of options. You may choose a traditional relational database or, if you need to easily search different kinds of information, you might want a NoSQL database. If you need to house a vast amount of unstructured data, you might look into a product like Azure Data Lake. If you’re sharing data with other organizations, you may be interested in blockchain.
The data you’ve selected will guide your choice, so look at the data you outlined in step one and start asking yourself some questions:
- Is your data structured or unstructured?
- How much data do you expect to store?
- Who will need access to it?
- Will you need to store the source data or can it be summarized?
- Can the data be centrally stored?
- Will it need to be replicated to other locations for ease of access?
Once you’ve got your answers, you’re probably well on your way to knowing what data storage solution will work best for your organization.
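If it helps to see the questions as a checklist, the sketch below turns a few of them into a rough shortlist of the storage options mentioned above. The rules are deliberately simplistic and are our own assumptions drawn from this section. They are no substitute for a proper evaluation, and `StorageAnswers` and `suggest_storage` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StorageAnswers:
    structured: bool              # Is the data structured?
    vast_volume: bool             # Do you expect a vast amount of data?
    mixed_kinds_searched: bool    # Must different kinds of information be searched together?
    shared_with_other_orgs: bool  # Will the data be shared with other organizations?


def suggest_storage(answers: StorageAnswers) -> list[str]:
    """Return storage categories worth evaluating, not a final decision."""
    candidates = []
    if not answers.structured and answers.vast_volume:
        candidates.append("data lake")
    if answers.mixed_kinds_searched:
        candidates.append("NoSQL database")
    if answers.structured and not answers.mixed_kinds_searched:
        candidates.append("relational database")
    if answers.shared_with_other_orgs:
        candidates.append("blockchain")
    return candidates  # an empty list means none of these rules applied
```

The function returns a list rather than a single answer because more than one option can be worth a look for the same data.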
3. Plan for how data will be shared
Technology has never been good at sharing. Applications typically have their own way of storing data, and sharing data between systems can be a nightmare made of import/export profiles, comma-delimited text files, insufficient validation rules and lost records from import errors.
Need an example? Look across your own business application landscape. If you wanted to analyze your sales data in Excel, could you do it? Could you pull the resumes of your last 100 applicants into Microsoft Access? If you wanted to add website traffic to a PowerPoint presentation, how many people would you have to speak with?
You can see the problem. Your data is a corporate asset and it all needs to be shareable. A Big Data management strategy needs to address and define this issue so data sharing becomes easier moving forward.
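As one small illustration of the import problems described above, the sketch below reads comma-delimited text, validates each row, and keeps the rejected rows with a reason instead of silently dropping them. The column names `order_id` and `amount` are hypothetical, so substitute the fields your own exports use.

```python
import csv
import io


def import_sales(csv_text, required=("order_id", "amount")):
    """Parse comma-delimited text and return (accepted, rejected) rows."""
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        line_no = reader.line_num  # line in the source text where this row ended
        missing = [name for name in required if not (row.get(name) or "").strip()]
        if missing:
            rejected.append((line_no, "missing " + ", ".join(missing), row))
            continue
        try:
            row["amount"] = float(row["amount"])
        except ValueError:
            rejected.append((line_no, "amount is not a number", row))
            continue
        accepted.append(row)
    return accepted, rejected


sample = "order_id,amount\nA1,19.99\n,5.00\nA3,abc\n"
accepted, rejected = import_sales(sample)
for line_no, reason, _row in rejected:
    print(f"line {line_no}: {reason}")
```

Returning the rejected rows alongside the accepted ones means a failed import produces a list someone can review and fix, which is the kind of rule a sharing plan can spell out.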
4. Combine your data
The value of Big Data is, at its core, about data analytics and business intelligence (BI): integrating multiple datasets and analyzing them for business insights. It's about using data to find connections you haven't seen before. Ideally, your data analytics should give you historical, current and future views of your organization’s business operations.
Because your data needs to be distilled and merged into common datasets that multiple systems can use, your data management strategy needs to lay out a new approach to collaboration between different development teams. Data development must be approached as a central business priority.
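Merging into a common dataset can be as simple as joining two sources on a field they share. The sketch below assumes, purely for illustration, that sales and weather records both carry a `date` field. The `combine` function and the sample records are hypothetical.

```python
def combine(sales, weather, key="date"):
    """Left-join sales rows with weather rows that share the same key."""
    weather_by_key = {row[key]: row for row in weather}
    combined = []
    for sale in sales:
        match = weather_by_key.get(sale[key], {})
        # Sales fields win on name clashes; unmatched rows keep only sales fields.
        combined.append({**match, **sale})
    return combined


sales = [
    {"date": "2024-01-01", "revenue": 100},
    {"date": "2024-01-02", "revenue": 80},
]
weather = [{"date": "2024-01-01", "temp_c": 3}]

for row in combine(sales, weather):
    print(row)
```

Agreeing on the shared key and its format across teams is the hard part, and that is why the strategy has to address collaboration and not only tooling.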
5. Plan for data governance
The final thing your Big Data management strategy needs is a plan for governance, or data supervision. A governance plan establishes rules, policies and methods for data retrieval, management, archival and deletion. It maps out the lifecycle of all your data and ensures each piece of data has an owner, someone within the company who can respond if something goes wrong with the data.
Governance plans are essentially insurance policies for data: they prevent disrepair and neglect, and they are a crucial part of a Big Data strategy.
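A governance rule can also be written down in a form that software can check. The sketch below is one possible shape, assuming each dataset has a named owner plus archival and deletion ages measured in days. The `GovernancePolicy` class, its fields and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class GovernancePolicy:
    dataset: str
    owner: str               # person who responds if something goes wrong
    archive_after_days: int  # age at which data moves to archival storage
    delete_after_days: int   # age at which data should be deleted

    def lifecycle_stage(self, created: date, today: date) -> str:
        """Return 'active', 'archive' or 'delete' based on the data's age."""
        age_days = (today - created).days
        if age_days >= self.delete_after_days:
            return "delete"
        if age_days >= self.archive_after_days:
            return "archive"
        return "active"


policy = GovernancePolicy("sales", "jane.doe", archive_after_days=365, delete_after_days=2555)
print(policy.lifecycle_stage(date(2024, 1, 1), date(2024, 6, 1)))
```

Making `owner` a required field means a dataset cannot be given a policy without someone being named as accountable for it.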
Data management for Big (and small) Data
It’s important to plan for Big Data because there’s so much of it, but even if your company isn’t working with Big Data, you still need a plan in place. Data of all sizes needs to be planned for properly so that your organization can easily use, store and integrate it into your business.
Need help building a strategy? Omni is a technology consulting company with deep experience in a variety of data management platforms. We can look at your current systems, listen to your business needs and offer a data management strategy tailored to your organization.
Ready to get started with data management?