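I know Cassandra merges SSTables and row keys, removes tombstones, and so on.
But I would really like to know how it actually performs compaction.
Since SSTables are immutable, does it copy all the relevant data to a new file? And while writing this new file, does it discard the data marked with tombstones?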
I know what compaction does, but I want to know how it makes this happen.
I hope this thread helps, provided you follow all of its posts and comments:
http://comments.gmane.org/gmane.comp.db.cassandra.user/10577
AFAIK:
Whenever a memtable is flushed from memory to disk, its contents are written to a newly created SSTable, sorted by row key. Existing SSTables are not updated.
Merging of SSTables [i.e. applying updates] takes place only during compaction.
Until then, the read path reads from every SSTable that contains the key you look up, and the results are merged before replying.
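To make the read path concrete, here is a minimal sketch of merging one key across several SSTables. It is not Cassandra's actual code: the dict-based `SSTable` model, the `(timestamp, value)` cell layout and the `read_key` helper are all assumptions made for illustration. The only idea taken from the text above is that every SSTable holding the key is consulted and the results are merged, here by keeping the newest write.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: an immutable SSTable is a dict of
# row key -> (write timestamp, value).
SSTable = Dict[str, Tuple[int, str]]


def read_key(sstables: List[SSTable], key: str) -> Optional[str]:
    """Look the key up in every SSTable and return the newest value."""
    newest: Optional[Tuple[int, str]] = None
    for table in sstables:
        cell = table.get(key)
        # Keep whichever version carries the highest write timestamp.
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    return None if newest is None else newest[1]


older = {"user1": (100, "alice@old.example")}
newer = {"user1": (200, "alice@new.example"), "user2": (150, "bob@example.com")}

print(read_key([older, newer], "user1"))  # alice@new.example
```

The more SSTables hold a key, the more files a read has to touch, which is one reason compaction matters.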
Two types: Minor and Major.
Minor compaction is triggered automatically whenever a new SSTable is created.
May remove tombstones once they are older than gc_grace.
Compacts SSTables of similar size into one [initially the memtable flush size] once the minor compaction threshold is reached [4 by default].
Major compaction is triggered manually using nodetool.
Can be applied to one column family at a time.
Compacts all the SSTables of a CF into one.
Compacts the SSTables and marks the SSTables that are no longer needed for deletion. GC takes care of freeing up that space.
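As a rough illustration of how a minor compaction might pick its inputs, the sketch below groups SSTables of similar size into buckets and reports the buckets that have reached the threshold of 4 mentioned above. The function names and the 0.5/1.5 similarity bounds are assumptions chosen for the example, not values read from Cassandra's source.

```python
from typing import List

MIN_THRESHOLD = 4  # minor compaction threshold from the answer above


def bucket_by_size(sizes: List[int], low: float = 0.5, high: float = 1.5) -> List[List[int]]:
    """Group SSTable sizes so each bucket holds sizes close to its average.

    `low` and `high` are illustrative bounds (an assumption for this sketch).
    """
    buckets: List[List[int]] = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough, so start a new one.
            buckets.append([size])
    return buckets


def minor_compaction_candidates(sizes: List[int]) -> List[List[int]]:
    """Return the buckets that contain enough SSTables to be compacted."""
    return [b for b in bucket_by_size(sizes) if len(b) >= MIN_THRESHOLD]


# Sizes in MB: four small SSTables, two medium ones, one large one.
print(minor_compaction_candidates([10, 11, 9, 10, 40, 42, 160]))  # [[9, 10, 10, 11]]
```

In this model a major compaction would simply take every SSTable of the column family, regardless of bucket.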
Regards, Tamil
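There are two ways to run compaction:
A - Minor compaction. Runs automatically.
B - Major compaction. Run manually.
In both cases it takes x files (per CF) and processes them. During this process it marks rows whose TTL has expired as tombstones and deletes existing tombstones. From the result it generates a new file. The tombstones generated in this compaction will be deleted in the next compaction (if the grace period, gc_grace, has elapsed).
The difference between A and B is the number of files taken and the number of files produced. A takes a few similar files (similar in size) and generates one new file. B takes ALL the files and generates just one big file.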
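The process described in this answer can be sketched as a merge over sorted inputs. This is a simplified model, not Cassandra's implementation: the `Cell` class, the `compact` function, the single `timestamp` field used for both write time and deletion time, and the assumption that every SSTable holding a given key takes part in the compaction are all hypothetical. It does show the central point of the question: the input files are never modified, and a new file is written that leaves out whatever should be dropped.

```python
import heapq
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    key: str
    timestamp: int                    # write (or deletion) time in seconds
    value: Optional[str]              # None marks a tombstone
    expires_at: Optional[int] = None  # TTL expiry time, if any


def compact(sstables: List[List[Cell]], now: int, gc_grace: int) -> List[Cell]:
    """Merge key-sorted SSTables into one new key-sorted SSTable."""
    merged = heapq.merge(*sstables, key=lambda c: c.key)
    output: List[Cell] = []
    for _, cells in groupby(merged, key=lambda c: c.key):
        newest = max(cells, key=lambda c: c.timestamp)
        if newest.value is None:
            # Existing tombstone: drop it once gc_grace has elapsed.
            if newest.timestamp + gc_grace <= now:
                continue
        elif newest.expires_at is not None and newest.expires_at <= now:
            # Expired TTL: write a tombstone, to be purged by a later compaction.
            newest = Cell(newest.key, newest.expires_at, None)
        output.append(newest)
    return output


t1 = [Cell("a", 100, "v1"), Cell("b", 100, "v1"), Cell("c", 100, "x", expires_at=150)]
t2 = [Cell("a", 200, "v2"), Cell("b", 120, None), Cell("d", 900, None)]

first = compact([t1, t2], now=1000, gc_grace=500)
print([c.key for c in first])   # ['a', 'c', 'd']
second = compact([first], now=1000, gc_grace=500)
print([c.key for c in second])  # ['a', 'd']
```

In the first pass "b" disappears together with its old tombstone, and "c" becomes a tombstone because its TTL expired. The second pass removes that new tombstone, while "d" survives because its tombstone is still within gc_grace.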