Apache Hadoop is a software framework that supports data intensive distributed applications under a free-to-use license. Hadoop has been the driving force behind the Big Data industry. It enables applications to work over massive quantities of data with built-in features enabling enhanced reliability, scalability, and functionality over old style data tools.
If your legacy data storage methods are costly, or if you need to access your data fast, Hadoop's Distributed File System (HDFS) may be for you. Hadoop’s ability to implement new MapReduce methods may be exactly what you need if you have to make sense of large quantities of data. Storing and interacting with data using familiar SQL-type commands and traditional tools are also made easier by Hadoop framework’s hBase and Hive.
Most programs will want these and other tools provided in a way where they are all configured and proven to work together. The free-to-use Cloudera Distribution featuring Apache Hadoop (CDH) is the most widely used method to achieve this goal.
Please join CTOvision.com publisher Bob Gourley as he provides context on the emerging Big Data discipline and discusses the genesis of the MapReduce concept. Omer Trajman, VP of Technology Solutions will follow with an update on CDH and Cloudera Enterprise, two of the most popular capabilities for fielding Hadoop in production environments.
Contact Name: Edward Walinsky
Sign up for our newsletter.
Senior Staff Writer Ross Wilkers talks with Hitachi Vantara CTO Gary HIx discusses how many of the changes brought by the COVID-19 pandemic might become permanent fixtures in the market.
In this episode of the Project 38 podcast, Editor Nick Wakeman hosts a roundtable with several reporters to explore some of the most pressing issues in the market.