Description
It's possible I've got this all wrong, but I've thought about it for a while and I'm reasonably certain I understand Cassandra. In "WTF is a SuperColumn", the data model is described as Keyspace.SuperColumnFamily[Row][Super][Column] = value. Pandra does not have a Row type; instead each SCF has a keyID, which is the Row index (the first key in brackets). This means that, in order to add rows to a SuperColumnFamily, we must create a NEW SuperColumnFamily object for every entry, with the same name (since Pandra uses that to mean the second member of the dotted pair) but a different keyID. This is backward: there should be a Row class or some such, which does what SCF does now (hold SuperColumns), and the SuperColumnFamily class should be repurposed to be solely a Map<String, Row>.
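To make the proposed shape concrete, here is a minimal sketch of what I mean. The class and method names (Row, row(), set(), get()) are hypothetical and are not Pandra's current API; the point is only that one SuperColumnFamily object holds many rows, and each Row holds its SuperColumns.

```php
<?php
// Hypothetical classes illustrating the proposed layout; not Pandra's current API.
class Row {
    // super column name => (column name => value)
    private $superColumns = array();

    public function set($super, $column, $value) {
        $this->superColumns[$super][$column] = $value;
    }

    public function get($super, $column) {
        return isset($this->superColumns[$super][$column])
            ? $this->superColumns[$super][$column]
            : null;
    }

    public function getSuperColumns() {
        return $this->superColumns;
    }
}

class SuperColumnFamily {
    public $name;
    // row key => Row, i.e. the Map<String, Row> described above
    private $rows = array();

    public function __construct($name) {
        $this->name = $name;
    }

    // Return the Row for $key, creating it on first access.
    public function row($key) {
        if (!isset($this->rows[$key])) {
            $this->rows[$key] = new Row();
        }
        return $this->rows[$key];
    }

    public function getRows() {
        return $this->rows;
    }
}

// Keyspace.Addresses['alice']['home']['city'] = 'Paris'
$scf = new SuperColumnFamily('Addresses');
$scf->row('alice')->set('home', 'city', 'Paris');
$scf->row('bob')->set('work', 'city', 'Oslo');

echo $scf->row('alice')->get('home', 'city'), "\n"; // Paris
echo count($scf->getRows()), "\n";                  // 2
```

With this layout, adding a row is just another row() call on the same object, rather than a second SuperColumnFamily instance.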
More than just a naming issue, this implementation has technical implications. Specifically, in PandraSuperColumnFamily::save(), there is a comment /* @todo there must be a better way */, followed by a loop over all of the SuperColumn children. There is a better way! The Thrift method batch_mutate takes a keyspace and a map<string, map<string, list<Mutation>>>. A Mutation, meanwhile, can describe a SuperColumn insertion, which is itself a list of Column insertions. Pandra is not making use of all of these levels of hierarchy: every save() call in Pandra's API could be implemented as a single Thrift call, with no need for multiple requests.
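To show the nesting, here is a sketch of that map built from plain PHP arrays. It assumes, as I read the Cassandra 0.6 Thrift interface, that the outer key is the row key and the inner key is the column family name. RecordingClient and buildMutationMap are made-up stand-ins, the arrays only approximate Thrift's Mutation, SuperColumn and Column structs, and the consistency level argument that the real call also takes is left out.

```php
<?php
// Stand-in for the Thrift client: it only records calls so the shape can be inspected.
class RecordingClient {
    public $calls = array();

    public function batch_mutate($keyspace, $mutationMap) {
        $this->calls[] = array($keyspace, $mutationMap);
    }
}

// Hypothetical helper: turn row key => super column => column => value
// into row key => column family name => list of mutations.
function buildMutationMap($columnFamily, $rows) {
    $map = array();
    foreach ($rows as $key => $superColumns) {
        $mutations = array();
        foreach ($superColumns as $superName => $columns) {
            // Plain array standing in for a Mutation wrapping a SuperColumn.
            $mutations[] = array('super_column' => $superName, 'columns' => $columns);
        }
        $map[$key] = array($columnFamily => $mutations);
    }
    return $map;
}

// Example data, invented for illustration.
$rows = array(
    'alice' => array('home' => array('city' => 'Paris', 'zip' => '75001')),
    'bob'   => array('work' => array('city' => 'Oslo')),
);

$client = new RecordingClient();
$map = buildMutationMap('Addresses', $rows);
$client->batch_mutate('Keyspace1', $map);

echo count($client->calls), "\n"; // 1: two rows saved with a single request
```

However many rows and SuperColumns are involved, everything travels in that one map, which is why a single request is enough.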
My rough sketch of an implementation would be:
    class SCF {
        function save() {
            // batch_mutate expects: row key => column family name => list of mutations
            $mutationMap = array();
            foreach ($this->getRows() as $key => $superCol) {
                $mutationMap[$key] = array(
                    $this->name => array($superCol->getMutation()), // see below
                );
            }
            // one request for the whole SCF (consistency level argument omitted)
            $client->batch_mutate($this->keyspace, $mutationMap);
        }
    }

    class SuperColumn {
        function getMutation() {
            $cols = array();
            foreach ($this->getColumns() as $name => $value) {
                $cols[] = new ThriftColumn($name, $value);
            }
            return new ThriftMutation(INSERT, new ThriftSuperColumn($this->name, $cols));
        }
    }
Obviously this glosses over quite a few details, like deletions, but I think the structure is right. I definitely sympathize with your erroneous (but see disclaimer at top!) implementation: even when you know exactly what to do it's hard to think about SuperColumnFamilies!