A colleague recently pointed me at Datameer, an analytics front-end for Hadoop. As their website and datasheet mention, they put a familiar spreadsheet interface on top of large datasets. I recently saw a demo of the product, and I thought they had done a nice job of implementing joins through a graphical user interface aimed at people who are not ETL experts. Based on the demo at least, I thought anyone with decent Excel experience would be able to use it effectively. Note that it is not a “BI for the masses” tool; it is definitely more of an analyst’s or an IT expert’s tool.
An added bonus is that it easily integrates traditional data sources as well; for example, it’s easy to join a MySQL table against a “table” in Hadoop / Hive. It would be cool if it could automatically discover the schema of your Hive tables and your traditional DB tables. It may already be able to; I didn’t see anything specific on their website indicating that, though.
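To make the idea of a cross-source join concrete, here is a minimal Python sketch of what such a join amounts to logically. This is not Datameer's API or implementation, which I haven't seen; the table names, columns, and the `inner_join` helper are all hypothetical, and the two lists simply stand in for rows fetched from MySQL and from Hive.

```python
# Hypothetical sample rows: in practice one side would be fetched from
# MySQL and the other from Hive. Names and columns are made up.
mysql_customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
    {"customer_id": 3, "name": "Edsger"},
]
hive_page_views = [
    {"customer_id": 1, "url": "/home"},
    {"customer_id": 1, "url": "/pricing"},
    {"customer_id": 3, "url": "/docs"},
    {"customer_id": 4, "url": "/home"},
]


def inner_join(left, right, key):
    """Inner hash join: index the left rows by key, then scan the right rows."""
    index = {}
    for row in left:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in right:
        # Rows whose key has no match on the left side are dropped.
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined


joined_rows = inner_join(mysql_customers, hive_page_views, "customer_id")
for row in joined_rows:
    print(row)
```

The smaller table is indexed in memory and the larger one is scanned once, which is the usual shape when a small relational table is joined against a much larger Hadoop-side dataset.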
Tools like Datameer are a great addition to any BI practitioner’s toolset. As datasets get larger, Hadoop provides ready access to large amounts of data at a reasonable price. Datameer now lets us make that data accessible to a broader group of analysts.