I. HLog location on HDFS and its correspondence with RegionServers
HLog is persisted on HDFS. Its storage location can be listed with:
hadoop fs -ls /hbase/.logs
Found 5 items
drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS02,61020,1365661380729
drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS03,61020,1365661378638
drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS04,61020,1365661379200
drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:22 /hbase/.logs/HADOOPCLUS05,61020,1365661378053
drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS06,61020,1365661378832
HADOOPCLUS02 ~ HADOOPCLUS06 are the RegionServers. As the HBase architecture diagram shows, HLog and HRegionServer correspond one-to-one.
The directories listed above are where the HLogs are stored. Once an HLog becomes obsolete (everything previously written to the MemStore has been persisted to HDFS), its file on HDFS is moved from /hbase/.logs to /hbase/.oldlogs; when the file in .oldlogs is deleted, the HLog's lifecycle ends.
II. The HBase write path and where the HLog is written
When putting data into HBase, the request flows through HBaseClient --> ZooKeeper --> -ROOT- --> .META. --> RegionServer --> Region:
Before writing, the Region first checks the MemStore.
1. If the Region's MemStore already holds the data being written, it returns directly;
2. Otherwise the edit is written to the HLog (WAL) first, then to the MemStore, and only after both succeed does the call return.
When the MemStore reaches a certain size, flush is called and its contents are written out as a StoreFile on HDFS.
Because inserts into HBase land in the in-memory MemStore, they are very fast; applications that do not need strong durability can disable the HLog to gain even higher write performance.
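As a client-side sketch of this write path (hedged: it assumes the 0.90-era client API with HTable, Put.add and Put.setWriteToWAL; the table and column names are made up for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Hypothetical table and column names, used only for illustration.
    HTable table = new HTable(conf, "test_table");
    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q1"), Bytes.toBytes("value1"));
    // By default the edit is written to the HLog (WAL) before the MemStore.
    // Skipping the WAL trades durability for write throughput.
    put.setWriteToWAL(false);
    table.put(put);
    table.close();
  }
}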
III. HLog-related source code
1. Overview
Writing to the HLog is driven by the HLog object's doWrite(HRegionInfo info, HLogKey logKey, WALEdit logEdit)
or completeCacheFlush(final byte [] encodedRegionName, final byte [] tableName, final long logSeqId, final boolean isMetaRegion).
Both methods call this.writer.append(new HLog.Entry(logKey, logEdit)) to perform the actual write:
an HLog.Entry is constructed inside the method and handed to the currently constructed writer (the writer reference shown in the figure above).
The concrete implementation class is org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter;
the HLog method createWriterInstance(fs, newPath, conf) creates that Writer object.
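A rough sketch of that flow (simplified and hedged, not the verbatim HBase source; the reflective details of createWriterInstance are omitted):

HLog.Writer createWriterInstance(FileSystem fs, Path path, Configuration conf)
    throws IOException {
  // The concrete class is SequenceFileLogWriter; init() opens the underlying
  // Hadoop SequenceFile.Writer on the given HDFS path.
  HLog.Writer writer = new SequenceFileLogWriter();
  writer.init(fs, path, conf);
  return writer;
}

void doWrite(HRegionInfo info, HLogKey logKey, WALEdit logEdit) throws IOException {
  // Every edit is wrapped into an HLog.Entry and appended through the writer.
  this.writer.append(new HLog.Entry(logKey, logEdit));
}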
2. SequenceFileLogWriter and SequenceFileLogReader
As the SequenceFileLogWriter class shows, it delegates to Hadoop's SequenceFile.Writer to write to the file system; SequenceFile is the file format HLog uses for storage on Hadoop.
HLog.Entry is the smallest unit stored in an HLog.
public class SequenceFileLogWriter implements HLog.Writer {
  private final Log LOG = LogFactory.getLog(this.getClass());
  // The hadoop sequence file we delegate to.
  private SequenceFile.Writer writer;
  // The dfsclient out stream gotten made accessible or null if not available.
  private OutputStream dfsClient_out;
  // The syncFs method from hdfs-200 or null if not available.
  private Method syncFs;
  // The key class needed to initialize the writer.
  private Class<? extends HLogKey> keyClass;

  @Override
  public void init(FileSystem fs, Path path, Configuration conf) throws IOException {
    // 1. Create the Hadoop SequenceFile.Writer that this writer delegates to.
    // 2. Get at the private FSDataOutputStream inside the SequenceFile so we
    //    can call sync on it (initializes dfsClient_out).
  }

  @Override
  public void append(HLog.Entry entry) throws IOException {
    this.writer.append(entry.getKey(), entry.getEdit());
  }

  @Override
  public void sync() throws IOException {
    if (this.syncFs != null) {
      try {
        this.syncFs.invoke(this.writer, HLog.NO_ARGS);
      } catch (Exception e) {
        throw new IOException("Reflection", e);
      }
    }
  }
}
SequenceFileLogReader is used to read HLog.Entry objects back from such files.
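As a usage sketch (hedged: it assumes the 0.90-era HLog.getReader(fs, path, conf) factory and that HLog.Reader.next() returns null at end of file; the log path is only an example), an HLog file can be dumped entry by entry:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.regionserver.wal.HLog;

public class HLogDump {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);
    // In practice pick a file under /hbase/.logs/<regionserver>.
    Path logPath = new Path(args[0]);
    HLog.Reader reader = HLog.getReader(fs, logPath, conf);
    try {
      HLog.Entry entry;
      while ((entry = reader.next()) != null) {
        // Each entry pairs an HLogKey (region, table, logSeqNum, writeTime)
        // with the WALEdit that was appended.
        System.out.println(entry.getKey() + ": " + entry.getEdit());
      }
    } finally {
      reader.close();
    }
  }
}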
3. HLog.Entry and the logSeqNum field
Each Entry consists of an HLogKey and a WALEdit.
The HLogKey carries the basic information:
private byte [] encodedRegionName;
private byte [] tablename;
private long logSeqNum;
// Time at which this edit was written.
private long writeTime;
private byte clusterId;
logSeqNum is an important field: the sequence number is also stored as a metadata field in each StoreFile, so the maximum logSeqNum can be obtained directly from a StoreFile;
public class StoreFile {
  static final String HFILE_BLOCK_CACHE_SIZE_KEY = "hfile.block.cache.size";

  private static BlockCache hfileBlockCache = null;

  // Is this from an in-memory store
  private boolean inMemory;

  // Keys for metadata stored in backing HFile.
  // Set when we obtain a Reader. (StoreFile, around line 140)
  private long sequenceid = -1;

  /**
   * @return This files maximum edit sequence id.
   */
  public long getMaxSequenceId() {
    return this.sequenceid;
  }

  /**
   * Return the highest sequence ID found across all storefiles in
   * the given list. Store files that were created by a mapreduce
   * bulk load are ignored, as they do not correspond to any edit
   * log items.
   * @return 0 if no non-bulk-load files are provided or, this is Store that
   * does not yet have any store files.
   */
  public static long getMaxSequenceIdInList(List<StoreFile> sfs) {
    long max = 0;
    for (StoreFile sf : sfs) {
      if (!sf.isBulkLoadResult()) {
        max = Math.max(max, sf.getMaxSequenceId());
      }
    }
    return max;
  }

  /**
   * Writes meta data. Important for the maxSequenceId WRITE!!
   * Call before {@link #close()} since its written as meta data to this file.
   * @param maxSequenceId Maximum sequence id.
   * @param majorCompaction True if this file is product of a major compaction
   * @throws IOException problem writing to FS
   */
  public void appendMetadata(final long maxSequenceId, final boolean majorCompaction)
      throws IOException {
    writer.appendFileInfo(MAX_SEQ_ID_KEY, Bytes.toBytes(maxSequenceId));
    writer.appendFileInfo(MAJOR_COMPACTION_KEY, Bytes.toBytes(majorCompaction));
    appendTimeRangeMetadata();
  }

  /**
   * Opens reader on this store file. Called by Constructor.
   * @return Reader for the store file.
   * @throws IOException
   * @see #closeReader()
   */
  private Reader open() throws IOException {
    // ........
    this.sequenceid = Bytes.toLong(b);
    if (isReference()) {
      if (Reference.isTopFileRegion(this.reference.getFileRegion())) {
        this.sequenceid += 1;
      }
    }
    this.reader.setSequenceID(this.sequenceid);
    return this.reader;
  }
}
The Store class manages StoreFiles, for example during compaction. When many StoreFiles are merged, the largest logSeqNum among them is taken:
public class Store implements HeapSize {
  /**
   * Compact the StoreFiles. This method may take some time, so the calling
   * thread must be able to block for long periods.
   *
   * <p>During this time, the Store can work as usual, getting values from
   * StoreFiles and writing new StoreFiles from the memstore.
   *
   * Existing StoreFiles are not destroyed until the new compacted StoreFile is
   * completely written-out to disk.
   *
   * <p>The compactLock prevents multiple simultaneous compactions.
   * The structureLock prevents us from interfering with other write operations.
   *
   * <p>We don't want to hold the structureLock for the whole time, as a compact()
   * can be lengthy and we want to allow cache-flushes during this period.
   *
   * @param forceMajor True to force a major compaction regardless of thresholds
   * @return row to split around if a split is needed, null otherwise
   * @throws IOException
   */
  StoreSize compact(final boolean forceMajor) throws IOException {
    boolean forceSplit = this.region.shouldForceSplit();
    boolean majorcompaction = forceMajor;
    synchronized (compactLock) {
      // Get store file sizes for incremental compacting selection
      // (normal skew: older files are larger, newer files are smaller,
      // with minCompactSize as the threshold).
      // .............
      this.lastCompactSize = totalSize;

      // Max-sequenceID is the last key in the files we're compacting
      long maxId = StoreFile.getMaxSequenceIdInList(filesToCompact);

      // Ready to go. Have list of files to compact.
      LOG.info("Started compaction of " + filesToCompact.size() + " file(s) in cf=" +
          this.storeNameStr + (references ? ", hasReferences=true," : " ") +
          " into " + region.getTmpDir() + ", seqid=" + maxId +
          ", totalSize=" + StringUtils.humanReadableInt(totalSize));
      StoreFile.Writer writer = compact(filesToCompact, majorcompaction, maxId);
      // Move the compaction into place.
      StoreFile sf = completeCompaction(filesToCompact, writer);
    }
    return checkSplit(forceSplit);
  }

  /**
   * Do a minor/major compaction. Uses the scan infrastructure to make it easy.
   *
   * @param filesToCompact which files to compact
   * @param majorCompaction true to major compact (prune all deletes, max versions, etc)
   * @param maxId Readers maximum sequence id.
   * @return Product of compaction or null if all cells expired or deleted and
   * nothing made it through the compaction.
   * @throws IOException
   */
  private StoreFile.Writer compact(final List<StoreFile> filesToCompact,
      final boolean majorCompaction, final long maxId) throws IOException {
    // Make the instantiation lazy in case compaction produces no product; i.e.
    // where all source cells are expired or deleted.
    StoreFile.Writer writer = null;
    try {
      // ......
    } finally {
      if (writer != null) {
        // !!!! StoreFile.Writer writes the metadata for maxId.
        writer.appendMetadata(maxId, majorCompaction);
        writer.close();
      }
    }
    return writer;
  }
}
During a compaction, the first method, compact(final boolean forceMajor), calls
compact(final List<StoreFile> filesToCompact, final boolean majorCompaction, final long maxId).
At the end, that method calls writer.appendMetadata(maxId, majorCompaction), i.e. StoreFile's appendMetadata method.
So the largest logSeqNum is written in the finally block; every StoreFile thus carries its logSeqNum, which open() can read back later.
clusterId holds the ID of the Hadoop cluster.
4. The HLog lifecycle
This is where the HLog lifecycle comes in. If the HFiles corresponding to an HLog's logSeqNum have already been stored on HDFS (decided mainly by checking whether the HLog's logSeqNum is smaller than the maximum sequence id of the corresponding table's StoreFiles on HDFS), the HLog no longer needs to exist: it is moved to the .oldlogs directory and eventually deleted.
Conversely, if the system goes down before that point, the HLog can be read back from HDFS and the data that was originally Put can be replayed into HBase.
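The retirement decision can be summarized with a small illustrative sketch (a hypothetical helper, not the actual log-cleaning code; it reuses StoreFile.getMaxSequenceIdInList shown above):

// Illustrative only: an HLog file is obsolete once every edit it contains
// (up to its highest logSeqNum) is already covered by persisted StoreFiles.
boolean canArchiveLog(long logMaxSeqNum, List<StoreFile> storeFiles) {
  long persistedMaxSeqNum = StoreFile.getMaxSequenceIdInList(storeFiles);
  // If true, the file can be moved from /hbase/.logs to /hbase/.oldlogs and
  // later deleted; otherwise it is still needed for replay after a crash.
  return logMaxSeqNum <= persistedMaxSeqNum;
}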
Supplementary material:
HBase Architecture 101 – Write-ahead Log (WAL)
http://cloudera.iteye.com/blog/911700
The structure and lifecycle of HLog