NoSQL databases offer a huge variety of data models and query APIs. (The query APIs seen across these systems include Thrift, Map/Reduce, cursors, graph traversal, collections, nested hashes, and plain get/put.)

The column family (columnfamily) model is used by Cassandra and HBase, and both took the idea from the papers describing Google Bigtable (although Cassandra departs somewhat from Bigtable's ideas and introduces supercolumns). In both systems you have rows and columns like you are used to seeing, but the rows are sparse: each row can have as many or as few columns as it needs, and the columns do not have to be defined in advance.

The key/value model is the simplest and the easiest to implement, but it is inefficient when you are only interested in querying or updating part of a value. It is also difficult to implement more complex structures on top of a distributed key/value system.

Document-oriented databases are essentially the next level up from key/value systems, allowing nested data to be associated with each key. They can query that nested data more efficiently than by simply returning the entire BLOB each time.

Neo4J has a unique data model, storing objects and relationships as the nodes and edges of a graph. For queries that fit that model (e.g., hierarchical data), it can be a thousand times faster than the alternatives.

Scalaris is unique in offering distributed transactions across multiple keys. A discussion of the tradeoffs between consistency and availability is beyond the scope of this post, but it is another aspect that must be considered when evaluating distributed systems.
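To make the contrast between the key/value, document, and column family models described above more concrete, here is a minimal in-memory sketch in Python. It does not use the API of any real database: the stores are plain dictionaries, and every name in it (`kv_update_field`, `doc_get`, `doc_set`, `cf_insert`, the `user:1` keys) is hypothetical and chosen only for illustration. It shows the shape of each model, not its performance.

```python
import json

# Key/value model: the store sees only an opaque blob per key.
kv_store = {}

def kv_update_field(key, field, value):
    """Change one field by reading and rewriting the entire blob."""
    record = json.loads(kv_store[key])  # the whole value is fetched and decoded
    record[field] = value
    kv_store[key] = json.dumps(record).encode("utf-8")  # the whole value is written back

# Document model: the store understands the nested structure of each value.
doc_store = {}

def doc_get(key, path):
    """Return only the nested value reached by following `path`."""
    node = doc_store[key]
    for part in path:
        node = node[part]
    return node

def doc_set(key, path, value):
    """Update one nested value in place, leaving the rest of the document alone."""
    parent = doc_get(key, path[:-1])
    parent[path[-1]] = value

# Column family model: sparse rows, columns are not declared in advance.
cf_rows = {}

def cf_insert(row_key, column, value):
    """Add a column to a row; each row keeps only the columns it was given."""
    cf_rows.setdefault(row_key, {})[column] = value

# Hypothetical sample data, used only to exercise the three sketches.
profile = {"name": "Ann", "address": {"city": "Oslo"}}

kv_store["user:1"] = json.dumps(profile).encode("utf-8")
kv_update_field("user:1", "name", "Anna")

doc_store["user:1"] = {"name": "Ann", "address": {"city": "Oslo"}}
doc_set("user:1", ["address", "city"], "Bergen")
city = doc_get("user:1", ["address", "city"])

cf_insert("user:1", "name", "Ann")
cf_insert("user:2", "name", "Bob")
cf_insert("user:2", "email", "bob@example.com")
columns_per_row = {row: sorted(cols) for row, cols in cf_rows.items()}
```

In the key/value sketch the caller has to decode and re-encode the whole blob to change a single field, which is the inefficiency noted above. The document sketch addresses a nested field by path, and the column family sketch lets `user:2` carry a column that `user:1` never declared. In a real system the savings come from doing this work on the server side, which a single-process example cannot show.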
