Sharon Machlis

Beauty and brains: Plotly combines dataviz and serious statistical analysis

November 06, 2013 5:12 AM EST

There are a number of cloud services that can help you visualize data, but few that I know of that try to excel in both visualization and statistical analysis -- without any sort of desktop download.

But that's the goal of newcomer Plotly, a service for creating and sharing data visualizations that also offers statistical analysis tools -- plus a robust API, the ability to graph custom functions and a built-in Python shell.

"We want you to be able to import from everywhere," says co-founder and COO Matt Sundquist -- and then have all necessary tools for data formatting, wrangling and visualizing.

One of my first reactions when seeing Plotly: I sure wish this was around back when I was studying statistics. And in fact teachers are among the service's intended markets. But Plotly also has its eye on practicing scientists, engineers, analysts and pretty much "anyone who has an interest in data ... [or] a data problem," Sundquist says. The company also hopes analysts and quants will start using Plotly the way developers use GitHub: as a way to showcase and share their best work.

 

A graph created with Plotly -- see the interactive version on Plotly

For now, Plotly users can share data and visualizations either privately or publicly with a free public beta account. Soon there is likely to be a GitHub-like freemium model in which free accounts must share all data publicly and paid accounts are needed to keep information private or to share it only with a private group. The Plotly team is also working on enterprise account capabilities for group administration, Sundquist said.

How it works. Data can be uploaded in several popular formats -- Excel, CSV, TSV, Matlab and Access as well as spreadsheets from Google Drive. Once your data is uploaded, there are a few data-tweaking options for your table, such as find-and-replace, converting Unix timestamps to human-readable date/time formats and swapping columns and rows.
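For readers curious what two of those table tweaks amount to, here is a minimal Python sketch of converting Unix timestamps and swapping rows and columns. This is not Plotly's code: the function names and the sample table are made up for illustration, and it assumes timestamps are in seconds and should be shown in UTC.

```python
from datetime import datetime, timezone


def unix_to_readable(ts):
    """Convert a Unix timestamp (seconds since 1970-01-01 UTC) to a readable UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def swap_rows_and_columns(table):
    """Transpose a rectangular table given as a list of rows."""
    return [list(column) for column in zip(*table)]


# Hypothetical sample data: a header row followed by two data rows.
table = [
    ["timestamp", "reading"],
    [0, 1.5],
    [86400, 2.5],
]

# Replace the raw timestamps with readable dates, keeping the header row as-is.
readable = [table[0]] + [[unix_to_readable(ts), value] for ts, value in table[1:]]

# Swap rows and columns so each original column becomes a row.
transposed = swap_rows_and_columns(readable)
```

In Plotly itself these are menu options rather than code; the sketch only shows the kind of transformation involved.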

With a few clicks you can visualize data as a line graph, scatter plot, area chart, bar chart, histogram, box plot or heat map (but not geography-based maps; Plotly doesn't do geocoding or plotting points by address). The resulting visualizations are interactive, allowing you not only to see data points while mousing over a chart or graph but also to zoom in and pan out. You can also click to see the full numerical data behind all Plotly visualizations.

Users can customize design options like fonts and colors, include text annotation and add a number of statistical graphing options. Built-in stats range from basics like mean, median, standard deviation, variance and standard error to integrals, ANOVAs, t-tests and chi-squared tests.
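If the basic terms in that list are unfamiliar, the short Python sketch below computes them for a small made-up sample using only the standard library. It illustrates the statistics themselves, not how Plotly calculates them; the `describe` function name and the sample values are assumptions, and it uses the sample (n - 1) versions of variance and standard deviation.

```python
import math
import statistics


def describe(values):
    """Return basic summary statistics for a list of numbers."""
    n = len(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (n - 1 denominator)
        "std_dev": std_dev,
        # Standard error of the mean: standard deviation divided by sqrt(n).
        "std_error": std_dev / math.sqrt(n),
    }


# Hypothetical sample data, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
```

The standard error shrinks as the sample grows, which is why it is the usual choice for error bars on a mean.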

Adding statistical analysis to a Plotly graphic

But even if you don't need any of these stats for a specific project -- or these terms don't mean anything to you -- you can still use the service to make nice interactive graphics. Sundquist said the Plotly team is working on additional themes so users can get different looks without having to select their own colors, fonts and axis designs.

For those who'd like to use Plotly's visualization service but not the Web interface, the API supports six languages and platforms: Python, Matlab, R, Arduino (a tool for managing physical-world sensors and controllers), Julia and Perl as well as basic REST. There's also a wrapper for Ruby written by Harvard computer science teaching fellow Louis Mullie.

Ease of use. It took me a bit of clicking around to figure out how to do things like add a new visualization to existing data or group data by a specific factor, but many tasks were quite doable without having to read instructions or watch tutorial videos.

Some tasks are less intuitive, such as embedding a visualization into an external Web site -- even Sundquist admitted the feature is "well hidden" for now, as a plot's share button offers options for a private or public link or sending an email. (To generate iframe embed code, you need to select a thumbnail view of all your files and plots and then click the share link under the one you want to embed.)

In general, though, it didn't take too much work to figure out a good portion of Plotly's capabilities.

Technology. Plotly was built with Python and the Django framework, with a front end using JavaScript (primarily the visualization library D3), HTML and CSS. Files are hosted on Amazon S3. Visualizations can't be viewed unless they're shared, Sundquist said, even if someone guesses the URL.

Bottom line. For people who are serious about data and statistics but don't want to spend a ton of time designing and coding visualizations, Plotly is a potentially appealing platform. Ditto for those who are comfortable using languages like Python or R but would rather spend their time doing modeling and analysis than creating attractive graphics to share.

Want more free data tools? See my chart of 30+ free tools for data visualization and analysis.

 

High school science teacher explains how to use Plotly