

How to build an agile data pipeline

Agility and data are two of the most overused buzzwords of the business community – and for good reason.

Every business wants to be agile, to be responsive to the changing environment, to survive and thrive. Likewise, forward-thinking businesses are majorly focused on data as a route to greater insights, creativity and efficiency. It seems buzzword squared to put these two concepts together, but rather than being a technology to hype, it refers to a smarter way of managing with what enterprises already have, or with readily acquired skills.

An agile data pipeline is what data-centric organisations are putting in place in order to make the best use out of their data investments and ensure that the business can incorporate data-led analytical decision-making in a healthy and sustainable way.

As with any business process, building an agile pipeline involves several stages and should properly encompass a range of appropriate stakeholders within the business. As it is, that’s not always the case as many organisations tend to develop their analytics functions in a higgledy-piggledy manner.

It’s no surprise that the data estate of a business can quickly grow out of control. The four Vs of big data, as defined by IBM, are variety, velocity, volume and veracity, and they show that data is no monolithic thing. It’s a living, changing entity. So fluid, in fact, that in 2017 Experian built on this format and added two more Vs: vulnerability and value.

So how do you corral and harness the bucking bronco of data and put it behind the corporate plough, to turn up the nuggets of true insight?

A data catalogue makes storing, finding and using data a much more seamless experience. It’s an organised solution that allows business users to explore data sources and understand them. It saves the user time and can stop them recreating data that already exists simply because they failed to find it in an uncatalogued estate. It’s a great resource to keep the analytical process ticking over at speed, without slowing down the work of data scientists or ‘line of business’ analysts.

A faultless data catalogue doesn’t arrive fully formed, and the history of data governance integrations is littered with solutions that failed to achieve critical adoption in an organisation. To truly deliver on a data catalogue the business must also focus on the people and the process, not just the technology. Analytic leaders must build a culture that enables users to succeed with data.

Discover together

Data discovery can be fun, but it’s a hygiene factor that the analyst needs to get through before they can do the job they want to: Analytics, insights, and adding value to the business. Really, the organisation wants to unite all of the data workers with the data and analytic assets they could possibly (but legitimately!) need in a controlled and secure way. It’s important to take steps to make data both searchable and trackable. A platform will offer this and even data lineage, giving more visibility for better governance. When data discovery and data security are breathtakingly easy, there’s little room for data governance missteps. It’s a great first step before an enterprise can create a culture of collaboration, sharing, and innovation by extending formerly tribal knowledge across the organisation.
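To make ‘searchable and trackable’ concrete, here is a minimal sketch of a catalogue that supports keyword search and lineage tracing. It is an illustration only, not how any particular platform works: the class names, fields and sample assets below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """One catalogued data asset (all field names are illustrative)."""
    name: str
    description: str
    owner: str
    upstream: list[str] = field(default_factory=list)  # names of source assets


class DataCatalogue:
    """A toy in-memory catalogue: register assets, search them, trace lineage."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list[str]:
        """Return names of assets whose name or description mentions the term."""
        needle = term.lower()
        return sorted(
            e.name for e in self._entries.values()
            if needle in e.name.lower() or needle in e.description.lower()
        )

    def lineage(self, name: str) -> list[str]:
        """Return every upstream asset of `name`, nearest first."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


# Hypothetical sample assets, invented for this sketch.
catalogue = DataCatalogue()
catalogue.register(CatalogueEntry("crm_export", "Raw customer records from the CRM", "IT"))
catalogue.register(CatalogueEntry("clean_customers", "Deduplicated customer table", "Data team", ["crm_export"]))
catalogue.register(CatalogueEntry("churn_dashboard", "Monthly customer churn report", "Marketing", ["clean_customers"]))

print(catalogue.search("customer"))
print(catalogue.lineage("churn_dashboard"))
```

Even at this scale the point carries: an analyst who searches for ‘customer’ finds the existing assets instead of rebuilding them, and anyone questioning the dashboard can trace it back to its source.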

Culture the data culture

The data catalogue is the starting point for most analytical activities. Searching and finding content, understanding context and gaining trust in the results through community feedback and interaction – it’s a great resource when it’s used correctly, saving time and energy, and greatly aiding productivity.

The success of the catalogue is tied into the success of the organisation. Track and reward the most active contributors who add value to the analytic process, understand the assets that are creating the most impactful results, and promote those users to ensure that information assets are well curated and maintained.

The right data culture is socially engaging. It empowers users to impart and share knowledge, and is backed by technology that supports the different ways users bring their experience together to solve problems. This includes creating and annotating definitions, and discussing quality and purpose in conversation threads. Even simple social gestures like sharing a link or giving a 'thumbs up' reinforce the value of the underlying asset and make it richer and easier to find for future users.

Collaborate or die!

It might be that during the course of the pre-data-focused days others in the organisation have already collected the same information or performed a similar analysis, but different analysts have no good way of finding it. Data assets and resulting information proliferate, thus compounding the problem and creating inefficiencies and delays in answering critical business questions.

Taking a cue from social media and wiki techniques, social interactions can help users share and utilise organisational tribal knowledge easily. And everything in the analytic process should be shareable: Data, analytic apps, workflows, macros, visualisations, and dashboards. When everything is seamlessly shareable, and it is quick to identify trusted information assets along with their lineage and how they are used, it becomes much simpler to make more impactful business decisions.

One of the most important pieces to this is closing the gap not only around finding the right data but around the roles within an organisation: Between IT, business analysts, data scientists, everyday ‘citizen data scientists’, and onwards to all who use data. Sharing across an organisation is the grease to the wheels of innovation.

Define the best working practices

From the moment you embark on an analytics project you stand at a base camp with the peak of expectations staring at you from across the chasm of ignorance. Building a social repository of all the organisation's data sources, reports, workflows, terminology, and more (potentially thousands of lifetimes of accumulated knowledge) is as daunting as climbing Mount Everest. So, don't.

Start small, but think big. Tackle smaller challenges to get some early victories and build momentum from there.

• Pick a single department or project. Perhaps start with a handful of critical datasets.

• Document expertise while reports and data sources are being created, before the skills and the knowledge leave the project (or the company!). Ensure that new people can understand the function of dashboards, reports and other datasets.

• Follow your business strategy: Document and socialise the assets associated with key strategic projects, and use the catalogue as a means to change the culture towards greater collaboration.

• To ensure adoption, it’s vital that users always find the information up to date. Without timeliness, the catalogue immediately loses trust and credibility and the pipeline starts to leak.

• A business glossary is a critical component of your data strategy. A glossary can take many forms: definitions, concepts, subject areas, etc. It captures the unique language of your organisation in a central location, and then connects that meaning with the contents of the catalogue.

• A proper analytics pipeline lives or dies on whether users find value in the information within. No one in the organisation, not even the BI and IT teams, has a 100 per cent understanding of all those data sources, data sets, reports and other types of assets. This expertise and 'know-how' is in the heads of staff: Business teams, analysts, knowledge workers, analytics groups, and more. It's pervasive and waiting to be harnessed.

Trusting data

It’s one thing to have data, it’s another to trust it and use it properly. Famously, executives relied on their experience, their ‘gut’, when making decisions, and sometimes that’s not necessarily a bad idea. Where data is not cleaned, rated and trusted, it might not be worth the time to review. But where the right steps are in place the data can tell a very honest and trustworthy story. It is a better resource than the thoughts and opinions of an executive who may not have access to all the facts, the long-term trends, or the analytical power to correlate them appropriately.

So to stock the data pipeline, put in place some simple best practices, encourage your people with good processes and give them the technology that makes this all easy. We’re no longer in the days of needing to know how to code to operate analytical tools, and end-to-end platforms take the sting out of finding, moving, prepping and using data. In fact, stocking the analytics pipeline should be a breezy, exhilarating process, the opening stages in a virtuoso performance by a data maestro.

Nick Jewell, Director of Product Strategy, Alteryx

Image source: Shutterstock/alexskopje
