Chartio Presenting at Under The Radar (Vote for us!)

We're excited to announce that, following in the footsteps of some great companies, we'll be presenting at Under the Radar! The event is next week, April 25th and 26th, in Mountain View, California, and registration is still available.

We'll be pitching an audience and judging panel along with some amazing companies. Voting has already begun for the User's Choice Award. Be sure to take a second to


If you're at the event be sure to come by and say hi!

After the event we'll be sure to post our presentation. For now take a look at some past presentations by these friends of ours:

CloudKick (2009)

DotCloud (2011)

GinzaMetrics (2011)

Major Chartio Updates!

New Release

Last week we rolled out a major new version of Chartio! We'll be writing more about each of these features on our blog soon, but for now here's a list of a few new things to notice:

  • New charts built with our new charting library
  • 3 new chart types (Area, Percent Area, Stacked Bar)
  • Improved tables
  • Improved drag & drop interface
    • Moved from x, y & split data structure to columns and grouped columns
    • Date filters
    • Multidimensional query possibilities
  • Automatic foreign key schema detection for database engines that support them
  • Automatic joins in the drag & drop query builder
  • Dashboards auto refresh with adjustable intervals
  • Completely new design
  • Overhauled and extended documentation
  • Schema Editor
  • Payment Invoicing
  • New blog and documentation CR'S and design

As mentioned in our previous note, there were quite a few significant migrations in this release. The deploy went smoothly, however, and all outstanding issues are currently resolved! Please be sure to send us a note at support@chart.io if you run into anything we've missed.

We hope you like the changes! Go check out your new dashboards!

Release Coming: Here's what's changing

We're excited to let you know that this upcoming week, we'll be rolling out some major improvements to Chartio. The improvements, based largely on your feedback, include a more flexible UI concept, automatic joins, invoicing, a new design, more chart types, and much more.

With all of the improvements in this release we'd like to let you know about a few things to be aware of during the migration:

  1. The cache will be cleared in your browser, meaning the very first time you view your dashboard on the new release, you will have to wait until your queries finish fetching the results before the data is rendered.

  2. All charts created with the current version of the drag and drop UI will be switched to SQL mode. This won't affect your dashboards or charts with the exception that if you'd like to edit your chart in the drag and drop mode, you'll have to rebuild it with the new (very similar) version of the user interface.

  3. Going forward, the ordering of the SQL syntax has changed slightly for charts that include the 3rd split field. Previously the results of the SQL were used as

    SELECT x-axis, y-axis, split FROM ....
    

    but now, the y-axis and split have been switched, meaning your queries should look like

    SELECT x-axis, split, y-axis FROM ...
    

    The change will be automatically applied to any of your current queries that require it and we'll be contacting anyone in the case of an issue with migrating the query.
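
To make the new column order concrete, here is a minimal sketch in Python using the standard `sqlite3` module. The `sales` table, its columns and the `to_series` helper are hypothetical and only for illustration; they are not part of Chartio. The query returns rows as x-axis, split, y-axis, and the helper groups them into one series per split value.

```python
import sqlite3
from collections import defaultdict

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2012-01", "east", 10.0),
        ("2012-01", "west", 5.0),
        ("2012-02", "east", 7.0),
        ("2012-02", "west", 8.0),
        ("2012-02", "west", 2.0),
    ],
)

# New column order: x-axis (month), split (region), y-axis (total amount).
rows = conn.execute(
    "SELECT month, region, SUM(amount) FROM sales "
    "GROUP BY month, region ORDER BY month, region"
).fetchall()


def to_series(rows):
    """Group (x, split, y) rows into a list of (x, y) points per split value."""
    series = defaultdict(list)
    for x, split, y in rows:
        series[split].append((x, y))
    return dict(series)


series = to_series(rows)
```

A query written in the old order (x-axis, y-axis, split) would need its last two columns swapped to produce the same result.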

We've been simulating the migrations and testing the new features thoroughly and expect it to go smoothly. If, however, you find any issues or have any questions/concerns please don't hesitate to get ahold of us at support@chart.io.

Thank you for your continued cooperation and support!

Big Data Meets the Hype Cycle

Big Data is all the rage these days, judging from the press, analyst reports and the endless webinars in my spam folder. Just as with every other mega-trend in technology, the hype is partly justified and partly not.

What is Big Data?

Sprawling amounts of data are being generated across quite a range of sources and media: financial transactions, weather data, health data, social data, sensor data, mobile data, geo-location data, search data, and the list goes on.

Some of this data, particularly the unstructured kind, is not well suited to traditional tools, mainly the relational database. But as certain companies explore tools designed for this kind of data structure and volume, something of an echo chamber intermingled with hype has taken over the conversation, much in the same way that NoSQL was all the rage a few years back. The hype even made way for parody.

In the case of NoSQL, the ecosystem eventually realized that databases that don't conform to strict relational models have their place in certain contexts. But time revealed the hoopla to be a good deal of hype.

The Hype

The fact that data growth is immense is undeniable. But most businesses, practically speaking, do not contend with what is truly Big Data with a capital B & D. As it happens, in 2011, Gartner added Big Data to its notorious “Hype Cycle,” alongside other flavors of the year: Internet of Things, Consumerization and everyone’s favorite, Gamification.

The chart below describes the various phases of the hype cycle, which are segmented into 5 major sections.

gartner big data cycle

  1. Technology Trigger: A tech breakthrough gets the ball rolling. Commercial viability is uncertain.

  2. Peak of Inflated Expectations: Early publicity and a few success stories create buzz. Most implementations do not succeed, however.

  3. Trough of Disillusionment: Interest wanes as experiments and implementations fail to deliver. Lots of companies fail.

  4. Slope of Enlightenment: 2nd and 3rd generation versions start to emerge and bigger companies start to fund pilots.

  5. Plateau of Productivity: Mainstream adoption starts to take off. The technology’s broad market applicability and relevance are clearly paying off.

In this chart, Gartner situates “Big Data” on the steep climb up to peak hysteria, as the early adopters begin uptake and the mass media (or Big Media, as it were) starts to froth.

gartner big data cycle image 2

The truth is, most businesses don’t have big data. They have mostly average data. And what might be mistaken for “Big Data” is often really just datasets that are larger than yesteryear’s, but are still workable with today’s tools.

According to an interview Forbes magazine conducted with Rasmus Wegener, a partner in Bain’s IT practice:

"Instances of Big Data are relatively rare and most Big Data is simply Large Data. And Large Data can be handled with traditional tools."

To separate Big Ol’ Data from its more modest kin, large data, Wegener suggests four questions:

  1. Do you have a significant amount of data that needs to be analyzed? “Typically they say yes, and then we ask if it all has to be analyzed at the same time.”

  2. How complex are the analyses you have to run, the number of computing operations required to transform the data into actionable insights?

  3. Critical issue — what’s the speed at which the data must be captured and the solution generated? “This is almost always a knockout criterion. When you walk through the airport and they take pictures of everybody in the security line to match every face through facial recognition, they have to do that almost in real-time. That becomes a big data problem. If I am a bank and looking at a vast number of credit scores and histories, and I don’t need to provide an answer in five seconds but can do it next day, then that is not a big data problem.”

  4. Degree of structure of the data. Does it contain a significant amount of unstructured data from video or audio, or can it be put into a relational database easily?

All this isn’t to say that massive amounts of data don’t exist and aren’t being generated, processed and analyzed by business. It’s just that, at the moment, truly big data is mostly manifest in certain industries, including telecommunications firms, internet giants like Google and Facebook, some research firms, certain government organs and utilities running smart grids.

What will likely emerge from all this noise is a recognition that, yes, certain data volume, speed and structure requirements are better suited to new data technologies. That said, these tools fit certain and limited use cases for the time being and act mostly as a complement to existing data-warehousing deployments. For example, eBay has retained its Teradata data warehouse for traditional transactional and customer data, but has also adopted Hadoop for clickstream, user behavior and other semi-structured data investigations.

But we should do what we can to clear the fog of hype from our discussions of data. The whole point of storing data is to make sense of it, by leveraging structured data. So relational databases are more than here to stay. And platforms for processing unstructured or semi-structured datasets certainly have a long way to go. That said, we're hopeful the hype will give way to quality products that make everyone's data more understandable, using a variety of complementary technologies.

Dave Fowler, Chartio Co-founder and CTO, Makes Forbes 30 Under 30 List!

We're incredibly pleased to announce that Dave Fowler, Chartio's Co-founder and Chief Technology Officer, has been selected as part of Forbes Magazine's 30 Under 30. The full list is available here.

Chartio's dave fowler makes forbes 30 under 30

The various top 30 lists were broken down into categories, including Technology, Science, Entertainment and others. In technology, the finalists ranged from startup co-founders to quantum computing experts.

It is really an honor to be selected as part of such a talented and accomplished group, which includes Jeff Hammerbacher, Chief Science Officer of Cloudera (and Chartio investor), as well as fellow Y Combinator alumni Eric Frenkiel (MemSQL), Drew Houston (Dropbox), Solomon Hykes (DotCloud), Jessica Mah and Andy Su (Indinero), Adam Goldstein (Hipmunk) and Daniel Gross (Greplin).

New on Chartio: Permissioning

Today we're announcing permissioning, a new feature that will give our customers more granular control over what their teammates can and can't do with Chartio.

With permissioning, Chartio project administrators can now assign teammates to one of three roles:

Admin

Admin users can do, well, everything. That includes creating dashboards, adding datasources, determining access for other users -- basically anything that's possible with Chartio.

Creator

Creators have a medium level of access. They can add new charts, modify existing ones, edit schema and add new dashboards. But unlike admins, they can't add new datasources, and can't edit information for other users.

Viewer

As the name implies, viewers can view dashboards, but they can't edit them or create new visualizations.
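
For readers who think in code, the three roles can be pictured as nested sets of allowed actions. This is only an illustrative sketch: the action names and the `can` helper are hypothetical labels for the capabilities described above, not Chartio's actual implementation, and it assumes each role also includes everything the role below it can do.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    CREATOR = "creator"
    VIEWER = "viewer"


# Hypothetical action names standing in for the capabilities in this post.
VIEWER_ACTIONS = frozenset({"view_dashboard"})
CREATOR_ACTIONS = VIEWER_ACTIONS | {
    "add_chart",
    "edit_chart",
    "edit_schema",
    "add_dashboard",
}
ADMIN_ACTIONS = CREATOR_ACTIONS | {"add_datasource", "edit_user"}

PERMISSIONS = {
    Role.VIEWER: VIEWER_ACTIONS,
    Role.CREATOR: CREATOR_ACTIONS,
    Role.ADMIN: ADMIN_ACTIONS,
}


def can(role, action):
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]
```

With this model, `can(Role.CREATOR, "add_chart")` is true, while `can(Role.CREATOR, "add_datasource")` and `can(Role.VIEWER, "edit_chart")` are false.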

If you're an administrator and want to adjust your teammates' permission levels, just log into your Chartio account and click on Settings.

You'll notice that each of your teammates now has a permission level, listed in green.

Chartio permissioning

To change the level, just click "edit". You'll be taken to a page where you can edit that teammate's access level.

changing chartio permissioning

Permissioning is just the latest way we're making Chartio more efficient and useful for diverse teams. Give it a try, and as always, let us know what you think.

Data Science According to LinkedIn's Monica Rogati

This past week, we sat down with Monica Rogati, one of the founding members of the LinkedIn Data Science team. Monica obtained her PhD in Computer Science from Carnegie Mellon, where she focused on text mining and applied machine learning. At LinkedIn, she is pioneering data driven products with multi-million dollar business impact, and is currently building mathematical models that power LinkedIn’s personalized recommendations.

Please note: because we weren't able to record the conversation, this interview represents my recollection and paraphrasing of what Monica said. None of these quotes are direct.

Chartio: So how’d you get into data science in the first place?

Rogati: Well, my background is in computer science. Specifically, it's in applied machine learning and text mining. I joined LinkedIn to work on their various data projects and totally love it!

What are some of the cooler data projects you’ve worked on so far?

I helped build the recommendation engine that lets you know what jobs you might be interested in.

How hard was that to build?

It was definitely hard in the beginning because we had lower density of data. As more people joined, however, we got increasing insight into how the graph connects. We were then able to look at several things people have in common. More and more signals began to appear.

You gave a talk recently about the evolution of data science. Can you share some of your thoughts about how the profession has changed over the years?

In my talk, I show a slide with a 2008 wanted ad for a data scientist. The ad referred to them as “analytic scientists” and even specified that no technical skills were required for the job. The implication was that the important skills (machine learning, using R for statistical analysis, etc.) could be learned along the way. Broadly, they were looking for analytical and creative types with intellectual curiosity. Today, the job description hasn’t changed much except we now require much more by way of technical skills.

What kinds of technical skills?

It depends. For some roles, we put a lot of emphasis on data visualization, whereas in others we encourage fluency in text mining. Overall, the types of skills we're looking for when hiring are analytical skills, technical skills and communication skills. It’s a tough combination to find.

Where do you find your candidates?

I think academia is starting to catch on to the data science trend. But our data scientists come from a wide variety of backgrounds, including neurosurgery and particle physics.

What do you think the data scientist will look like in 5 years?

Given that data scientists didn't exist 5 years ago, it’s hard to deduce a rate of growth in the job or field overall. But I can definitely say the rate of change is very high, and the job will be quite different in some ways but the same in others. Roughly, it will still require intellectual curiosity and a fluency with data: the ability to manipulate it and extract insights from it. But the lives of data scientists will become easier as the tooling and infrastructure improve and are democratized.

Where will the tooling improve?

Well, data software is in its infancy. There is a ton of work happening on the bleeding edge. It still takes a really long time to manipulate data. Take Pig and Hadoop, for example. All of these will become smoother and easier and enable machines to do really well what machines do well and humans to excel and focus on their strengths.

Speaking of machines versus humans, what part of the process--the data analysis process--can be automated?

That's the holy grail. A lot of what data analysts are doing is still at the “pen and paper stage,” figuratively speaking. The tools are slow and clunky. Over time, the friction in the tooling will be eliminated. For example, looking at the top of a distribution in Pig with grouping requires 5 Pig commands. That's not friendly to really exploring the data. Maybe there's another tool you can use for exploratory data analysis. Maybe it’s Hive or something that hasn’t yet been invented.

I suppose an even more ambitious goal goes beyond just improving the tooling to really democratizing data science and expanding it to a whole new segment of users. Is that a reasonable goal or more of a pipe dream?

I don't think it's a pipe dream. There must be a framework where people can collaborate on data analysis. Obviously, tools can be dangerous if used inappropriately. This is even more true in the case of data. You can easily draw the wrong conclusions. In many cases, it's quite tempting to draw the wrong conclusion. Collaboration in this regard might reduce the number of errors by putting more eyes on the same piece of data. But you will need the human element for a really long time to even just know what questions to ask.

Can you elaborate on that last point?

Take for example the question of what industries are “hot” given an analysis of the data LinkedIn possesses. First, you have to define what “hot” means. Let’s walk through the way a data scientist might reason (this process is adapted from a presentation I gave at Strata, available at http://slidesha.re/o7JKfX).

Take 1: What industries have the highest YOY growth? This is an attractive first attempt, but it can be misleading in that it might only represent LinkedIn’s penetration in any given industry.

Take 2: People list start dates on their profiles. So why don’t we look at what industry people are flowing into in a given year? This can also be misleading because it ignores churn.

Take 3: A better question would be, what is the net inflow (which takes into account churn)? This is better, but it ignores seasonality (more interns in the summer and fewer teachers, for example).

Take 4: So let’s take seasonality into account now. Again, better, but if we run the actual data, it looks like metals, dairy and mining are the “hottest industries.” This somehow seems wrong. We can adjust this to exclude industries below a certain size.

Take 5: And finally, what about bad data? We have to remove spammer accounts, the long tail of individuals who list 200 positions on their profile, etc.

So let’s see what this filtered YOY growth looks like.

linkedin data shows year over year growth

This is not too informative, as it seems to track with broader economic conditions. To make this more useful, let’s normalize the data and look at differences between industries, and select just a few to focus on.

year over year growth filtered for certain industries

This is much better and certainly reveals an interesting view of the internet, real estate and financial services industries.
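
The reasoning in Takes 1 through 5 can be sketched in a few lines of Python. To be clear, this is a rough illustration and not LinkedIn's actual method: the event tuples, the two cutoff constants and both function names are assumptions made up for the example, and the year-over-year figure is a simple difference rather than a normalized growth rate.

```python
from collections import Counter

MAX_POSITIONS = 50      # assumed cutoff for implausible profiles (Take 5)
MIN_INDUSTRY_SIZE = 3   # assumed minimum industry size (Take 4)


def net_inflow(events, positions_per_profile):
    """Joins minus departures per (industry, year, month), as in Take 3."""
    net = Counter()
    for profile, industry, year, month, kind in events:
        if positions_per_profile.get(profile, 0) > MAX_POSITIONS:
            continue  # skip likely bad data (Take 5)
        net[(industry, year, month)] += 1 if kind == "join" else -1
    return net


def yoy_change(net, industry_size, year, month):
    """Net inflow versus the same month one year earlier.

    Comparing the same calendar month sidesteps seasonality, and small
    industries are excluded (Take 4).
    """
    result = {}
    for industry, size in industry_size.items():
        if size < MIN_INDUSTRY_SIZE:
            continue
        current = net[(industry, year, month)]
        previous = net[(industry, year - 1, month)]
        result[industry] = current - previous
    return result


# Hypothetical sample data: (profile, industry, year, month, kind).
events = [
    ("a", "internet", 2011, 6, "join"),
    ("b", "internet", 2012, 6, "join"),
    ("c", "internet", 2012, 6, "join"),
    ("d", "internet", 2012, 6, "leave"),
    ("f", "internet", 2012, 6, "join"),
    ("spam", "internet", 2012, 6, "join"),
    ("e", "dairy", 2012, 6, "join"),
]
positions = {"spam": 200}
sizes = {"internet": 100, "dairy": 2}

net = net_inflow(events, positions)
change = yoy_change(net, sizes, 2012, 6)
```

In this toy data the spam profile is dropped, the tiny dairy industry is excluded, and the internet industry's net inflow for June rises by one compared with the previous June.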

Interesting indeed.

Thank you so much, Dr. Rogati, for sharing your time and observations with Chartio. We appreciate your thoughtful insights into the world of data science!

New Chartio Improvements: Speed and Stability

Yesterday may have been a holiday, but we sure haven't been resting here at Chartio. In fact, we spent a good chunk of the summer squashing bugs and streamlining our code to make Chartio faster and more stable than ever.

Some improvements you might notice right away, while others reside under the hood. Highlights include:

  • Design unification. We've made interface tweaks throughout the site to ensure a consistent look and feel.

  • Better error handling. We've greatly improved our inline error reporting, showing you details about snafus like SQL errors as they happen, so you can better address issues in real-time.

  • Speed, speed, speed. We've simplified the DOM tree our JavaScript code works with, making your pages render significantly more quickly. Plus, now your charts load along with the rest of the page -- no more waiting for visualizations to display after the fact. The result? Loading your dashboard is now about 3 times faster than before.

As always, we're committed to making your Chartio experience quick, smooth, and reliable. We'll continue to optimize our code and polish our UI in the weeks ahead. In the meantime, enjoy the new and improved Chartio!

The Decline of One-Size-Fits-All Business Intelligence

Introducing the New and Improved Chartio Website

Here at Chartio, we know that good design is crucial.

That's why we're particularly excited to launch our fully-redesigned, dare-we-say-beautiful homepage, which elegantly and effectively conveys what we're all about.

new chartio website

Some highlights of the new site include:

One-page design

No need to navigate from page to page -- all the information you want is right here. Just click any of the top menu items, and the homepage quickly and fluidly scrolls to your chosen topic.

one page design

Chartio is committed to making the content you want as accessible as possible, and now our homepage is no exception.

Clean and simple

No clutter or confusion -- the page elements are big and bold, with no unnecessary distractions.

Chartio product tour

Again, it’s just like our product -- getting information from our website is as easy and enjoyable as getting insight into your data with Chartio.

The same great content

Our homepage is still packed with plenty of product and company details, like side-by-side feature highlights, pricing plans, founder bios, and documentation.

Chartio about us

It's everything you want to know about chart.io, all in one place.

We’re pretty proud of our new design, but we could be biased. So, by all means, check out the new chart.io, and let us know what you think!