Here is a dictionary of the big data technologies I most commonly run into. It is by no means a complete list of all big data terms.

| Name | Type | Definition |
| --- | --- | --- |
| Accumulo | Database | A NoSQL database, developed and open sourced by the National Security Agency, that provides cell-level access labels. |
| Azure | Server | A cloud platform, owned by Microsoft, for running large-scale applications. |
| BigTable | Database | A proprietary datastore developed by Google, accessible through the Google App Engine. |
| Cassandra | Database | A NoSQL database, originally developed at Facebook, that works as a distributed key/value store. |
| Cloudera | Company | A company that provides a commercial distribution of Hadoop along with support, services, and training to customers. |
| CouchDB | Database | A NoSQL, document-oriented database with a JavaScript interface that uses JSON to store data. |
| DynamoDB | Database | A proprietary NoSQL database developed at Amazon. It is offered as part of the Amazon Web Services portfolio. |
| EC2 | Server | A cloud platform, owned by Amazon, that allows users to rent virtual computers for running large-scale applications. Stands for Elastic Compute Cloud. |
| ElasticSearch | Search | A search engine platform, built on Lucene, that focuses on web applications. |
| Flume | Service | A service for collecting and aggregating log data. |
| Google App Engine | Server | A cloud platform, owned by Google, for hosting web applications. |
| Hadoop | Framework | A framework, written in Java, for distributed storage and processing of data across computer clusters. It consists of the Hadoop Distributed File System (HDFS) for storage and MapReduce for data processing. |
| Hadoop Distributed File System | File System | A distributed file system, written in Java, for the Hadoop framework. Commonly called HDFS. |
| HBase | Database | A NoSQL database, written in Java and modeled after BigTable, that uses a data structure of keys, column families, and column names. |
| Heroku | Server | A cloud platform for hosting web applications that focuses on scalability and ease of deployment. |
| Hive | Service | A data warehouse interface, developed at Facebook, that supports a SQL-like query language. |
| Hortonworks | Company | A public company that provides a commercial distribution of Hadoop. |
| Hypertable | Database | A NoSQL database, based on BigTable and written in C++, that focuses on performance. |
| Impala | Service | A SQL query engine, developed at Cloudera, that runs on Hadoop. |
| Java | Language | A computer programming language often used for building and connecting big data technology. |
| JSON | Definition | A notation for storing data objects. Stands for JavaScript Object Notation. |
| Kafka | Service | A service for handling a large number of events related to real-time data feeds. |
| Kudu | Service | A columnar storage engine that runs on Hadoop. |
| Lucene | Search | A Java library for indexing and searching documents. |
| MapR | Company | A company that provides a commercial distribution of Hadoop for enterprises. |
| MapReduce | Programming Model | A programming model for processing large data sets in parallel across a distributed computing cluster. |
| MongoDB | Database | A NoSQL, document-oriented database with JSON-like objects. |
| NoSQL | Definition | A term meaning “non SQL” or “not only SQL” for databases that have models differing from relational databases. |
| Oozie | Service | A job control system for workflow scheduling. |
| Pig | Service | A high-level platform, designed for Hadoop, that supports SQL-like queries with a language called Pig Latin. |
| Python | Language | A programming language used for analytical computation with big data technologies. |
| Redis | Database | A NoSQL, in-memory database that uses a key/value store. |
| Riak | Database | A NoSQL database, written in Erlang, that focuses on high availability and fault tolerance. |
| Scala | Language | A computer programming language used with big data applications. |
| Solr | Search | An enterprise search platform, built on the Lucene Java library. |
| Spark | Framework | A framework, written in Scala, for in-memory computing across computer clusters. |
| Voldemort | Database | A NoSQL database, developed at LinkedIn and modeled on Amazon's Dynamo design. It focuses on fast lookups for large distributed clusters. |
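
Since MapReduce shows up in several of the entries above, here is a rough sketch of the idea in plain Python. It does not use Hadoop at all; it only mimics the map, shuffle, and reduce steps on a single machine, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own illustrative choices, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a real cluster, each document (or block) would be mapped on a
    # different node; here everything runs sequentially in one process.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data big clusters", "data in clusters"]
    print(word_count(docs))
```

The point of splitting the work this way is that the map and reduce functions never share state, so a framework like Hadoop can run many copies of each in parallel across a cluster.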

Something missing? Let me know.