The image above represents the turning point for me; I realized my beloved Excel couldn’t handle the amount of data required to make decisions for my client.
There’s no doubt that we’ve all heard of Big Data, yet there seems to be a lack of awareness and competency in the digital marketing industry. Read: Big Data Skills Scarce Among Marketing Pros.
Big data is going to become a lot bigger. To give you an idea, Richard Currier, Senior VP at SSL says: “Just as in other regions of the world, the explosion of digital content is driving demand for all kinds of telecommunications infrastructure including both terrestrial and satellite networks. It has been estimated that in 2013 the amount of data generated worldwide will reach four zettabytes.” – Source.
FYI: A zettabyte = 1,000,000,000,000,000,000,000 bytes and a music CD is around 681,600,000 bytes, so a single zettabyte is roughly 1.47 trillion CDs. Think about that for a second.
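For anyone who wants to check that arithmetic, here is a minimal Python sketch. It only uses the figures quoted above; the CD size is this post's approximate figure, not an exact standard.

```python
# Figures taken from the paragraph above.
ZETTABYTE_BYTES = 10**21      # one zettabyte, in bytes
CD_BYTES = 681_600_000        # approximate size of one music CD, in bytes

# How many CDs fit in one zettabyte, and in the four zettabytes estimated for 2013.
cds_per_zettabyte = ZETTABYTE_BYTES / CD_BYTES
cds_for_four_zettabytes = 4 * cds_per_zettabyte

print(f"CDs per zettabyte: {cds_per_zettabyte / 1e12:.2f} trillion")
print(f"CDs for 4 zettabytes: {cds_for_four_zettabytes / 1e12:.2f} trillion")
```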
Everything we currently do in our daily activities, including social media trend analysis, backlink analysis, and content creation, will depend on our ability to process and manipulate large amounts of data. Why do we need to do this? Surely someone will just build tools for us? Ask yourself: when was the last time you didn’t need to use the CSV export button on a tool? How often was the interface enough to answer all of your questions without mucking around knee deep in the data? This is exactly why it’s important to dig into the data yourself to come up with ideas, questions, and answers.
The key points in this post:
- Getting on board with big data now means a brighter future for you tomorrow.
- The learning curve is steep; prepare to adapt and learn new skills.
- Requesting sensitive data is going to be your biggest challenge.
Point 1: Getting on board with big data now means a brighter future for you tomorrow.
Wikipedia defines Big data as the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. Also see IBM’s definition.
To put this in even simpler terms, “more data can lead to more accurate analysis. And more accurate analysis can lead to more confident decision making. And better decisions can mean greater operational efficiencies, cost reductions and reduced risk.” – Vicky Boulton
Regular desktop applications like Excel just aren’t powerful enough to process the amount of data that you’ll need for your analysis. Just last week, I had to download a backlink file from Majestic that unzipped to the tune of 765 MB, and that was only ¼ of the data I needed to conduct my analysis.
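To give a feel for the alternative, here is a rough sketch of summarising a large export without ever opening it in a spreadsheet: the file is read one row at a time, so memory use stays small however big it is. The column names (`SourceDomain`, `TargetURL`) and the file name in the comment are made up for illustration and are not Majestic’s actual export format.

```python
import csv
import io
from collections import Counter


def count_links_by_domain(rows, domain_column="SourceDomain"):
    """Count rows per referring domain, consuming one row at a time."""
    counts = Counter()
    for row in rows:
        counts[row[domain_column]] += 1
    return counts


# Tiny in-memory stand-in for a large export. With a real file you would
# pass csv.DictReader(open("backlinks.csv", newline="", encoding="utf-8"))
# instead (hypothetical file name).
sample_export = io.StringIO(
    "SourceDomain,TargetURL\n"
    "example.com,/pricing\n"
    "example.org,/blog\n"
    "example.com,/blog\n"
)

top_domains = count_links_by_domain(csv.DictReader(sample_export))
print(top_domains.most_common(2))  # [('example.com', 2), ('example.org', 1)]
```

The same loop works unchanged on a file of several gigabytes, because `csv.DictReader` never loads more than the current row.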
Point 2: The learning curve is steep; prepare to adapt and learn new skills.
Below are resources you’ll need to start your big data adventure, classified into the following levels: noob, curious, data ass-whipper.
For the absolute beginner:
- Get rid of your n00b status by speaking the lingo: Big Data Terminology – From Hadoop to Hive and confabulation, you’ll need to at least know what these words mean to be taken seriously.
- Find out who’s who, and get breaking news from the professionals who create the tech: Start following the Top 50 #BigData Twitter Influencers according to SAP. You might also want to check out a maintained list by Onalytica, and another list from Forbes.
- Stay informed and keep up to date by expanding your reading list: Top blogs and sites to start reading: Techrepublic, Planetbigdata (aggregator), Forrester Big Data Blog, Data Round Table, ZDNet, Dataversity (my favourite part of this site is the articles section)
For the curious:
- Mathematics and statistics don’t come naturally to many, and if you’re like me, you’ll want to brush up on your statistics skills. My personal recommendation is Udacity’s free intro to statistics course.
- Get your hands dirty with some online data science courses: Big Data University’s free courses, Coursera – introduction to data science (free, but no classes available right now).
For those who are committed to data-ass-whipping and becoming part of the next generation:
I believe the best digital marketers of the next generation will be data driven.
Over the years, I’ve been battling with which programming language to learn, what database to work with, and which big data platform to use. After much deliberation, expert advice, and many, many failures, I’ve committed myself to the following:
On learning a programming language:
- Python programming language
- Best places to learn: Learn Python the hard way, and Codecademy’s Python general course
On data gathering:
- Scraping (programming required – BeautifulSoup), scraping (no programming required – Scrapinghub)
- Codecademy Python API course
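To show what scraping with a little programming looks like, here is a small sketch that pulls every link out of a page. BeautifulSoup is the library recommended above; this example swaps in Python’s built-in `html.parser` so it runs without installing anything. The HTML snippet is invented for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page source; in practice this would be the body of an HTTP response.
html = """
<html><body>
  <a href="https://example.com/a">First</a>
  <a href="/relative/b">Second</a>
  <a name="no-href">Anchor without a link</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(html)
print(collector.links)  # ['https://example.com/a', '/relative/b']
```

BeautifulSoup gets you to the same result in fewer lines and copes better with messy real-world markup, which is why it is the usual choice once you move past toy examples.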
On big data manipulation:
- MySQL – Widely used, open source, and just plain awesome. You’re going to have to trade your =Vlookup() for SELECT * FROM Products WHERE ProductGroup but the sky is the limit here (Okay, so you can cheat by using their handy MySQL Excel connector). If you don’t want to host your own MySQL server, you might want to look at Google BigQuery instead.
- Where to learn MySQL: I personally think that Udemy’s MySQL for beginners is a fantastic starting point. I also love this from the Udemy team on MySQL – “We don’t just learn faster. We learn cooler, more interesting shit” – from Why every startup marketer needs to learn SQL.
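To make the VLOOKUP-to-SQL trade a bit more concrete, here is a small sketch. It uses Python’s built-in `sqlite3` module as a stand-in for a MySQL server so it runs anywhere; the query itself is plain SQL that should look much the same on MySQL. The `Sales` table, the columns, and the rows are invented for illustration.

```python
import sqlite3

# In-memory database standing in for a real MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, Name TEXT, ProductGroup TEXT);
    CREATE TABLE Sales (ProductId INTEGER, Units INTEGER);
    INSERT INTO Products VALUES
        (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Ebook', 'Media');
    INSERT INTO Sales VALUES (1, 10), (2, 5), (1, 7), (3, 2);
    """
)

# The JOIN plays the role of VLOOKUP: it pulls in each sale's product details
# by matching on ProductId. WHERE then filters to a single product group.
query = """
    SELECT p.Name, SUM(s.Units) AS TotalUnits
    FROM Sales AS s
    JOIN Products AS p ON p.ProductId = s.ProductId
    WHERE p.ProductGroup = ?
    GROUP BY p.Name
    ORDER BY TotalUnits DESC
"""
rows = conn.execute(query, ("Hardware",)).fetchall()
print(rows)  # [('Widget', 17), ('Gadget', 5)]
conn.close()
```

In Excel this would be a VLOOKUP column plus a pivot table; here it is one query, and it keeps working when the tables hold millions of rows.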
On data visualization, analytics, business intelligence and more:
- Splunk – I absolutely love everything these guys do, and I would bet on them in a heartbeat as the favourite in the big data race. I’ve been using the software for a while now primarily for SEO server log analysis and I’ve recently moved onto machine learning analytics provided by Prelert (this is pure genius by the way).
- Tableau – I’ve seen what’s possible with Tableau with former clients, and I was impressed. You’ve probably already heard of them or used them before, but it’s time to take a deeper look.
Point 3: Requesting sensitive data is going to be your biggest challenge.
Recently at Distilled, I encountered something entirely unexpected when requesting data from my client – a security addendum. I had never seen one before, and it made me wonder why more of my clients aren’t this careful when handing over their intimate details. Long story short, we pretty much failed the requirements, and it led to a sub-par consultation in my opinion.
Never again. I will not compromise my consulting because I can’t earn my client’s confidence, though I’m still working on it.
When you start requesting data from your IT department, or your clients, you should be able to answer the following questions to inspire confidence that you are capable and responsible enough to receive this data in the first place.
- How will you ensure that the information shared will not be disclosed to 3rd parties? Usually a non-disclosure agreement does the trick here.
- How will you ensure that only authorized personnel will have access to this data? Who are the authorized personnel?
- Where will you be storing the data? Who will have access? How is it protected?
- What happens if a 3rd party gains access to the data whilst in your possession?
- How will you dispose of the data once finished?
- How will you ensure that the data is not used for your own commercial gain?
- How will you monitor the data, and where it is stored in case of a breach?
- What is the process if the data storage has been breached?
Being able to answer these questions can be difficult, so work with your client or IT department to find a solution that works best for you.
Hopefully this post inspires a few marketers to get out of their comfort zone and start learning / experimenting. The faster we learn to embrace big data, the faster we can avoid failure and put more trust into algorithms to make better decisions.
“Here’s a simple rule for the second machine age we’re in now: as the amount of data goes up, the importance of human judgment should go down.” – Andrew McAfee
Thanks for reading, good luck 🙂