Big Data Requires Advanced Analytics

Complex carrier network performance data on HP Vertica yields performance and customer metrics boon for Empirix

The next edition of the HP Discover Performance Podcast Series explores how network testing, monitoring, and analytics provider Empirix developed unique and powerful data processing capabilities.

Empirix uses an advanced analytics engine to continuously and proactively evaluate carrier network performance and customer experience metrics -- amid massive data flows -- to automatically identify issues as they emerge.

To learn more about how a combination of large-scale, real-time performance and pervasive data access made the HP Vertica analytics platform stand out to support such demands for Empirix, join Navdeep Alam, Director of Engineering, Analytics and Prediction at Empirix, based in Billerica, Mass.

The discussion, which took place at the recent HP Vertica Big Data Conference in Boston, is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:

Gardner: Why do you have such demanding requirements for data processing and analysis?

Alam: What we do is actively and passively monitor networks. When you're in a network as a service provider, you have the opportunity to see the packets within that network, both on the control plane and on the user plane. That just means you're looking at signaling data and also user plane data -- what's going on with the behavior; what's going on at the data layer. That’s a vast amount of data, especially with mobile, where most people are doing things on their devices with data.

When you're in that network and you're tapping that data, there is a tremendous amount of data -- and there's a tremendous amount of insights about not only what's going on in the network, but what's going on with the subscribers and users of that network.

Empirix is able to collect this data from our probes in the network, as well as being able to look at other data points that might help augment the analysis. Through our analytics platform we're able to analyze that data, correlate it, mediate it, and drive metrics out of that data.

That’s a service for our customers, increasing value from that data, so that they can turn around a return on investment (ROI) and understand how they can leverage their networks better to increase operations and so forth. They can understand their customers better and begin to analyze, slice and dice, and visualize data of this complex network.

They can also use our platform to do proactive and predictive analysis, so that we can create even better ROI for our customers by telling them what potentially might go wrong and what might be the solution to get around that to avoid a catastrophe.

New opportunities

Gardner: It’s interesting that not only is this data being used for understanding the performance on the network itself, but it's giving people business development and marketing information about how people are using it and where the new opportunities might be.

Is that something fairly new? Were you able to do that with data before, or is it the scale and ability to get in there and create analysis in near-real-time that’s allowed for such a broad-based multilevel approach to data and analysis?

Alam: This is something we've gotten into. We definitely tried to do it before with success, but we knew that in order to really tackle mobile and the increasing demands of data, we really had to up the ante.

Our investment with HP Vertica and how we've introduced that in our new analytics platform, Empirix IntelliSight 1.0, that recently came out, is about leveraging that platform -- not only for scalability and our ability to ingest and process data, but to look at data in its more natural format, both as discrete data, and also as aggregate data. We allow our customers to view that data ad hoc and analyze that data.

It positioned us very well. Now that we have a central point from which all this data is being processed and analyzed, we run analytics directly on this data, increasing our data locality and decreasing the data latency. This definitely ups our ante to do things much faster, in near real time.

Gardner: Obviously, the sensors, probes, agents, and the ability to pull in the information from the network need to reside in or be in close proximity to the network, but how are you actually deployed? Where does the infrastructure for doing the data analysis reside? Is it in the networks themselves, or is there a remote site? Maybe you could just lay out the architecture of how this is set up.

Alam: We get installed on site. Obviously, the future could change, but right now we're an on-premise solution. We're right where the data is being generated, where it’s flowing, and because of that we're able to gain access to the data in real-time.

One of the things we learned is that this is a tremendous amount of data. It doesn't make sense for us to just hold it and assume that we will do something interesting with it afterward.

The way we've approached our customers is to say, "What kind of value do you see in this data? What kind of metrics or key performance indicators (KPIs), or what do you think is valuable in this data?" We then build a framework that defines the value that they can gain from data -- what are the metrics and what kind of structure they want to apply to this data. We're not just calculating metrics, but we're also applying some sort of model that gives this data some structure.

As they go through what we call the Empirix Intelligent Data Mediation and Correlation (IDMC) system, it's really an analytics calculator. It's putting our data into the Vertica system, so that at that point we have meaningful, actionable data that can be used to trigger alarms, to showcase thresholds, to give customers great insight to what's going on in their network.

Growing the business

From that, they can do various things, such as solve problems proactively, reach out to the customers to deal with those issues, or to make better investments with their technology in order to grow their business.
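
The flow Alam describes (define a KPI, roll discrete records up into aggregates, raise an alarm when a threshold is crossed) can be illustrated with a minimal Python sketch. This is not Empirix's IDMC system or anything Vertica-specific. The `Kpi` class, the `cell` and `setup_ms` field names, the threshold, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Kpi:
    """A hypothetical KPI definition: a measured field plus an alarm threshold."""
    name: str          # field to read from each discrete record
    threshold: float   # alarm when the aggregate exceeds this value


def aggregate(records, kpi, dimension):
    """Roll discrete records up to one mean value per dimension value."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record[dimension]].append(record[kpi.name])
    return {key: mean(values) for key, values in buckets.items()}


def alarms(records, kpi, dimension):
    """Return the dimension values whose aggregate breaches the threshold."""
    return sorted(
        key
        for key, value in aggregate(records, kpi, dimension).items()
        if value > kpi.threshold
    )


# Made-up call records, not real network data.
records = [
    {"cell": "A", "setup_ms": 120.0},
    {"cell": "A", "setup_ms": 180.0},
    {"cell": "B", "setup_ms": 400.0},
    {"cell": "B", "setup_ms": 600.0},
]
setup_time = Kpi(name="setup_ms", threshold=300.0)

print(aggregate(records, setup_time, "cell"))  # per-cell mean setup time
print(alarms(records, setup_time, "cell"))     # cells over the threshold
```

Keeping both the discrete records and the per-dimension aggregate available mirrors the point made earlier in the interview about viewing data both as discrete and as aggregate data. A production system would do this roll-up inside the database rather than in application code.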

Gardner: How long have you been using Vertica and how did that come to be the choice that you made? Perhaps you could also tell us a little bit about where you see things going in terms of other capabilities that you might need or a roadmap for you?

Alam: We've been using Vertica for a few years, at least three or four, even before I came on-board. And we're using Vertica primarily for its ability to input and read data very quickly. We knew that, given our solutions, we needed to load a lot of data into the system and then read a lot of data out of it fast and to do it at the same time.

At that time, the database systems we used just couldn't meet the demands for the ever-growing data. So we leveraged Vertica there, and it was used more as an operational data store. When I came on board about a year-and-a-half ago, we wanted to evolve our use of Vertica to be not just for data warehousing, but a hybrid, because we knew that in supporting a lot of different types of data, it was very hard for us to structure all of those types of data.

We wanted to create a framework from which we can define measures and metrics and KPIs and store it in a more flat system from which we can apply various models to make sense of that data.

That really presented us a lot of challenges, not only in scalability, but our ability to work and play with data in various ways. Ultimately, we wanted to allow customers to play with this data at will and to get response in seconds, not hours or minutes.

It required us to look at how we could leverage Vertica as an intelligent data-storage system from which we could process data, store it, and then get answers out of that data very, very quickly. Again, we were looking for responses in a second or so.

Now that we've put all of our data in the data basket, so to speak, with Vertica, we wanted to take it to the next level. We have all this data, both looking at the whole data value chain from discrete data to aggregate data all in one place, with conforming dimensions, where the one truth of that data exists in one system.

We want to take it to the next step. Can we increase our analytical capabilities with the data? Can we find that signal from the noise now that we have all this data? Can we proactively find the patterns in the data, what's contributing to that problem, surface that to our customers, and reduce the noise that they are presented with?

Solving problems

Instead of showing them that 50 things are wrong, can I show them that 50 things are wrong, but that these one or two issues are actually impacting your network or your subscribers the most? Can we proactively tell them what might be the cause or the reason toward that and how to solve it?

The faster we can load this data, the faster we can retrieve the value out of this data and find that needle in the haystack. That’s where the future resides for us.
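
The "50 things are wrong, but these one or two matter most" idea can be sketched as a simple ranking by impact. This is only an illustration under stated assumptions: the source does not say how Empirix scores impact, so the `subscribers_affected` measure, the issue names, and the counts below are hypothetical.

```python
def top_issues(issues, limit=2):
    """Sort open issues by affected subscribers, largest first, and keep the top few."""
    ranked = sorted(
        issues, key=lambda issue: issue["subscribers_affected"], reverse=True
    )
    return ranked[:limit]


def impact_share(issues, selected):
    """Fraction of all affected subscribers accounted for by the selected issues."""
    total = sum(issue["subscribers_affected"] for issue in issues)
    if total == 0:
        return 0.0
    return sum(issue["subscribers_affected"] for issue in selected) / total


# 50 made-up open issues: 48 minor ones and two that dominate the impact.
issues = [{"name": f"minor-{n}", "subscribers_affected": 10} for n in range(48)]
issues.append({"name": "dns-timeout", "subscribers_affected": 3000})
issues.append({"name": "core-router-loss", "subscribers_affected": 5000})

worst = top_issues(issues)
print([issue["name"] for issue in worst])
print(f"{impact_share(issues, worst):.0%} of affected subscribers")
```

All 50 issues stay in the list, so nothing is hidden from the operator. The ranking only decides what is surfaced first.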

Gardner: Clearly, you're creating value and selling insight into the network to your customers, but I know other organizations have also looked at data as a source of revenue in itself. The analysis could be something that you could market. Is there an opportunity with the insight you have in various networks -- maybe in some aggregate fashion -- to create analysis of behavior, network use, or patterns that would then become a revenue source for you, something that people would subscribe to perhaps?

Alam: That's a possibility. Right now, our business has been all about empowering our customers and giving them the ability to leverage that data for their end use. You can imagine, as a service provider, having great insight into their customers and the over-the-top applications that are being leveraged on their network.

Could they then use our analytics and the metadata that we're generating about their network to empower their business systems and their operations to make smarter decisions? Can they change their marketing strategy or even their APIs about how they service customers on their network to take advantage of the data that we are providing them?

The opportunity to grow other business opportunities from this data is tremendous, and it's going to be exciting to see what our customers end up doing with their data.

Gardner: Are there any metrics of success that are particularly important for you? You've mentioned, of course, scale and volume, but things like concurrency, the ability to do queries from different places by different people at the same time, are important. Help me understand what some of the other important elements of a good, strong data-analysis platform would be for you.

Alam: Concurrency is definitely important. For us it's about predictability or linear scalability. We know that when we do reach those types of scenarios to support, let’s say, 10 concurrent users or 100 concurrent users, or to support a greater segmentation of data, because we have gone from 10 terabytes to 30 terabytes, we don't have to change a line of code. We don't have to change how or what we are doing with our data. Linear scalability, especially on commodity hardware, gives us the ability to take our solution and expand it at will, in order to deal with any type of bottlenecks.

Obviously, over time, we'll tune it so that we get better performance out of the hardware or virtual hardware that we use. But we know that when we do hit these bottlenecks, and we will, there is a way around that and it doesn't require us to recompile or rebuild something. We just have to add more nodes, whether it’s virtual or hardware.
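
As a rough worked example of what linear scalability implies for capacity planning, the arithmetic can be sketched in a few lines of Python. The 10-to-30-terabyte growth comes from the interview. The starting cluster size of three nodes is an assumption made purely for illustration, and real clusters rarely scale perfectly linearly.

```python
import math


def nodes_needed(base_nodes, base_terabytes, target_terabytes):
    """Estimate node count under an idealized linear-scaling assumption.

    If base_nodes handle base_terabytes, the same per-node load is kept by
    growing the cluster in proportion to the data volume, rounded up.
    """
    if base_nodes <= 0 or base_terabytes <= 0:
        raise ValueError("base_nodes and base_terabytes must be positive")
    return math.ceil(base_nodes * target_terabytes / base_terabytes)


# Hypothetical starting point: 3 nodes serving 10 TB, growing to 30 TB.
print(nodes_needed(3, 10, 30))
```

The point of the sketch is that the only input that changes is the node count. The application code and queries stay as they are, which is the property Alam highlights.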

More Stories By Dana Gardner

At Interarbor Solutions, we create the analysis and in-depth podcasts on enterprise software and cloud trends that help fuel the social media revolution. As a veteran IT analyst, Dana Gardner moderates discussions and interviews that get to the meat of the hottest technology topics. We define and forecast the business productivity effects of enterprise infrastructure, SOA and cloud advances. Our social media vehicles become conversational platforms, powerfully distributed via the BriefingsDirect Network of online media partners like ZDNet and IT-Director.com. As founder and principal analyst at Interarbor Solutions, Dana Gardner created BriefingsDirect to give online readers and listeners in-depth and direct access to the brightest thought leaders on IT. Our twice-monthly BriefingsDirect Analyst Insights Edition podcasts examine the latest IT news with a panel of analysts and guests. Our sponsored discussions provide a unique, deep-dive focus on specific industry problems and the latest solutions. This podcast equivalent of an analyst briefing session -- made available as a podcast/transcript/blog to any interested viewer and search engine seeker -- breaks the mold on closed knowledge. These informational podcasts jump-start conversational evangelism, drive traffic to lead generation campaigns, and produce strong SEO returns. Interarbor Solutions provides fresh and creative thinking on IT, SOA, cloud and social media strategies based on the power of thoughtful content, made freely and easily available to proactive seekers of insights and information. As a result, marketers and branding professionals can communicate inexpensively with self-qualifying readers/listeners in discrete market segments. BriefingsDirect podcasts hosted by Dana Gardner: Full turnkey planning, moderating, producing, hosting, and distribution via blogs and IT media partners of essential IT knowledge and understanding.
