A series of tools for ZooKeeper
Scripts
zkServer.sh
Commands for operating the ZooKeeper server.
Usage: ./zkServer.sh {start|start-foreground|stop|version|restart|status|upgrade|print-cmd}
# start the server
./zkServer.sh start
# start the server in the foreground for debugging
./zkServer.sh start-foreground
# stop the server
./zkServer.sh stop
# restart the server
./zkServer.sh restart
# show the status, mode and role of the server
./zkServer.sh status
JMX enabled by default
Using config: /data/software/zookeeper/conf/zoo.cfg
Mode: standalone
# Deprecated
./zkServer.sh upgrade
# print the parameters of the start-up
./zkServer.sh print-cmd
# show the version of the ZooKeeper server
./zkServer.sh version
Apache ZooKeeper, version 3.6.0-SNAPSHOT 06/11/2019 05:39 GMT
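The status command establishes a client connection to the server to execute diagnostic commands. When the ZooKeeper cluster is started in client SSL-only mode (by omitting clientPort from zoo.cfg), additional SSL-related configuration must be provided before ./zkServer.sh status can be used to find out whether the ZooKeeper server is running. For example: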
CLIENT_JVMFLAGS="-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty -Dzookeeper.ssl.trustStore.location=/tmp/clienttrust.jks -Dzookeeper.ssl.trustStore.password=password -Dzookeeper.ssl.keyStore.location=/tmp/client.jks -Dzookeeper.ssl.keyStore.password=password -Dzookeeper.client.secure=true" ./zkServer.sh status
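The long flag string above is easier to maintain when it is assembled from its parts. The following is a minimal sketch and not part of the ZooKeeper distribution: build_client_jvmflags and zk_mode are hypothetical helper names, and the store paths and passwords are the placeholder values from the example above. zk_mode assumes the status output contains a "Mode:" line like the sample shown earlier.

```shell
#!/bin/sh
# Hypothetical helpers; they are not shipped with ZooKeeper.

# Assemble the SSL client flags used by the status example above.
# Arguments: trust store path, trust store password, key store path, key store password.
build_client_jvmflags() {
    printf '%s %s %s %s %s %s\n' \
        "-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty" \
        "-Dzookeeper.ssl.trustStore.location=$1" \
        "-Dzookeeper.ssl.trustStore.password=$2" \
        "-Dzookeeper.ssl.keyStore.location=$3" \
        "-Dzookeeper.ssl.keyStore.password=$4" \
        "-Dzookeeper.client.secure=true"
}

# Print only the value of the "Mode:" line from status output read on stdin.
zk_mode() {
    sed -n 's/^Mode: //p'
}

# Demo on the sample status output shown earlier (no server required).
printf '%s\n' 'JMX enabled by default' \
    'Using config: /data/software/zookeeper/conf/zoo.cfg' \
    'Mode: standalone' | zk_mode    # prints: standalone
```

Against a live SSL-only server the two helpers could be combined as CLIENT_JVMFLAGS="$(build_client_jvmflags /tmp/clienttrust.jks password /tmp/client.jks password)" ./zkServer.sh status | zk_mode.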
zkCli.sh
See the ZooKeeperCLI documentation.
zkEnv.sh
Environment settings for the ZooKeeper server.
# the setting of log property
ZOO_LOG_DIR: the directory to store the logs
zkCleanup.sh
Cleans up old snapshots and transaction logs.
Usage:
* args dataLogDir [snapDir] -n count
* dataLogDir -- path to the txn log directory
* snapDir -- path to the snapshot directory
* count -- the number of old snaps/logs you want to keep, value should be greater than or equal to 3
# Keep the latest 5 logs and snapshots
./zkCleanup.sh -n 5
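The usage text above says count should be greater than or equal to 3, so a wrapper can check the value before any files are touched. This is a minimal sketch: check_retain_count is a hypothetical helper, not something zkCleanup.sh provides, and the demo only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical guard: succeed only for an integer count of at least 3.
check_retain_count() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric values
    esac
    [ "$1" -ge 3 ]
}

count=5
if check_retain_count "$count"; then
    echo "would run: ./zkCleanup.sh -n $count"
else
    echo "refusing: count must be an integer >= 3" >&2
fi
```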
zkTxnLogToolkit.sh
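TxnLogToolkit is a command-line tool shipped with ZooKeeper that can recover transaction log entries with a broken CRC.
Running it without any command-line parameters, or with the -h,--help argument, prints the following help page: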
$ bin/zkTxnLogToolkit.sh
usage: TxnLogToolkit [-dhrv] txn_log_file_name
-d,--dump Dump mode. Dump all entries of the log file. (this is the default)
-h,--help Print help message
-r,--recover Recovery mode. Re-calculate CRC for broken entries.
-v,--verbose Be verbose in recovery mode: print all entries, not just fixed ones.
-y,--yes Non-interactive mode: repair all CRC errors without asking
The default behavior is safe: it dumps the entries of the given transaction log file to the screen (the same as using the -d,--dump parameter):
$ bin/zkTxnLogToolkit.sh log.100000001
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
4/5/18 2:15:58 PM CEST session 0x16295bafcc40000 cxid 0x0 zxid 0x100000001 createSession 30000
CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
4/5/18 2:16:12 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x100000003 createSession 30000
4/5/18 2:17:34 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x200000001 closeSession null
4/5/18 2:17:34 PM CEST session 0x16295bd23720000 cxid 0x0 zxid 0x200000002 createSession 30000
4/5/18 2:18:02 PM CEST session 0x16295bd23720000 cxid 0x2 zxid 0x200000003 create '/andor,#626262,v{s{31,s{'world,'anyone}}},F,1
EOF reached after 6 txns.
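There is a CRC error in the second entry of the transaction log file above. In dump mode, the toolkit only prints this information to the screen and does not touch the original file. In recovery mode (the -r,--recover flag) the original file also remains untouched, and all transactions are copied to a new txn log file with a ".fixed" suffix. The tool recalculates each CRC value and writes the calculated value whenever it does not match the one in the original txn entry. By default, the tool works interactively: it asks for confirmation whenever it encounters a CRC error.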
$ bin/zkTxnLogToolkit.sh -r log.100000001
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
Would you like to fix it (Yes/No/Abort) ?
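Answering Yes means the newly calculated CRC value is written to the new file. No means the original CRC value is copied over. Abort aborts the entire operation and exits. (In this case the ".fixed" file is not deleted and is left in a half-complete state: it contains only the entries that have already been processed, or only the header if the operation was aborted at the first entry.)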
$ bin/zkTxnLogToolkit.sh -r log.100000001
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
Would you like to fix it (Yes/No/Abort) ? y
EOF reached after 6 txns.
Recovery file log.100000001.fixed has been written with 1 fixed CRC error(s)
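The default behavior of recovery is to be silent: only entries with a CRC error are printed to the screen. Verbose mode can be turned on with the -v,--verbose parameter to see all records. Interactive mode can be turned off with the -y,--yes parameter; in that case all CRC errors are fixed in the new transaction file.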
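Because dump mode never modifies the log, its output can be scanned first to decide whether a recovery run is needed at all. The sketch below counts the entries flagged with "CRC ERROR", as they appear in the sample dump above. count_crc_errors is a hypothetical helper, and it assumes the dump is written to standard output in the format shown.

```shell
#!/bin/sh
# Hypothetical helper: count "CRC ERROR" entries in TxnLogToolkit dump output on stdin.
count_crc_errors() {
    # grep -c prints the count; "|| true" keeps the exit status 0 when the count is 0.
    grep -c '^CRC ERROR' || true
}

# Demo on three lines taken from the sample dump above.
count_crc_errors <<'EOF'
4/5/18 2:15:58 PM CEST session 0x16295bafcc40000 cxid 0x0 zxid 0x100000001 createSession 30000
CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
EOF
# prints: 1
```

With a real log this could be used as bin/zkTxnLogToolkit.sh log.100000001 | count_crc_errors, running the tool with -r only when the result is non-zero.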
zkSnapShotToolkit.sh
Dumps a snapshot file to stdout, showing the detailed information of each zk node.
# help
./zkSnapShotToolkit.sh
/usr/bin/java
USAGE: SnapshotFormatter [-d|-json] snapshot_file
-d dump the data for each znode
-json dump znode info in json format
# show each zk-node's info without the data content
./zkSnapShotToolkit.sh /data/zkdata/version-2/snapshot.fa01000186d
/zk-latencies_4/session_946
cZxid = 0x00000f0003110b
ctime = Wed Sep 19 21:58:22 CST 2018
mZxid = 0x00000f0003110b
mtime = Wed Sep 19 21:58:22 CST 2018
pZxid = 0x00000f0003110b
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x00000000000000
dataLength = 100
# [-d] show each zk-node's info with the data content
./zkSnapShotToolkit.sh -d /data/zkdata/version-2/snapshot.fa01000186d
/zk-latencies2/session_26229
cZxid = 0x00000900007ba0
ctime = Wed Aug 15 20:13:52 CST 2018
mZxid = 0x00000900007ba0
mtime = Wed Aug 15 20:13:52 CST 2018
pZxid = 0x00000900007ba0
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x00000000000000
data = eHh4eHh4eHh4eHh4eA==
# [-json] show each zk-node's info in json format
./zkSnapShotToolkit.sh -json /data/zkdata/version-2/snapshot.fa01000186d
[[1,0,{"progname":"SnapshotFormatter.java","progver":"0.01","timestamp":1559788148637},[{"name":"\/","asize":0,"dsize":0,"dev":0,"ino":1001},[{"name":"zookeeper","asize":0,"dsize":0,"dev":0,"ino":1002},{"name":"config","asize":0,"dsize":0,"dev":0,"ino":1003},[{"name":"quota","asize":0,"dsize":0,"dev":0,"ino":1004},[{"name":"test","asize":0,"dsize":0,"dev":0,"ino":1005},{"name":"zookeeper_limits","asize":52,"dsize":52,"dev":0,"ino":1006},{"name":"zookeeper_stats","asize":15,"dsize":15,"dev":0,"ino":1007}]]],{"name":"test","asize":0,"dsize":0,"dev":0,"ino":1008}]]
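The default (non -d, non -json) output lends itself to simple filtering, for example to find znodes holding large payloads. The sketch below is an illustration only: large_znodes is a hypothetical helper, and it assumes the layout shown in the first example above, where each znode path sits on its own line starting with "/" and is followed by "key = value" lines including dataLength.

```shell
#!/bin/sh
# Hypothetical helper: print "<path> <dataLength>" for znodes whose dataLength
# is at least the given number of bytes. Reads the toolkit output on stdin.
large_znodes() {
    awk -v min="$1" '
        substr($0, 1, 1) == "/" { path = $0 }                    # remember the current znode path
        $1 == "dataLength" && $3 + 0 >= min { print path, $3 }   # report it when large enough
    '
}

# Demo on an excerpt of the sample output above.
large_znodes 100 <<'EOF'
/zk-latencies_4/session_946
  cZxid = 0x00000f0003110b
  dataVersion = 0
  dataLength = 100
EOF
# prints: /zk-latencies_4/session_946 100
```

Against a real snapshot this could be run as ./zkSnapShotToolkit.sh /data/zkdata/version-2/snapshot.fa01000186d | large_znodes 1024.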
zkSnapshotRecursiveSummaryToolkit.sh
Recursively collects and displays the child count and data size for a selected node.
$./zkSnapshotRecursiveSummaryToolkit.sh
USAGE:
SnapshotRecursiveSummary <snapshot_file> <starting_node> <max_depth>
snapshot_file: path to the zookeeper snapshot
starting_node: the path in the zookeeper tree where the traversal should begin
max_depth: defines the depth where the tool still writes to the output. 0 means there is no depth limit, every non-leaf node's stats will be displayed, 1 means it will only contain the starting node's and it's children's stats, 2 ads another level and so on. This ONLY affects the level of details displayed, NOT the calculation.
# recursively collect and display child count and data for the root node and 2 levels below it
./zkSnapshotRecursiveSummaryToolkit.sh /data/zkdata/version-2/snapshot.fa01000186d / 2
/
children: 1250511
data: 1952186580
-- /zookeeper
-- children: 1
-- data: 0
-- /solr
-- children: 1773
-- data: 8419162
---- /solr/configs
---- children: 1640
---- data: 8407643
---- /solr/overseer
---- children: 6
---- data: 0
---- /solr/live_nodes
---- children: 3
---- data: 0
zkSnapshotComparer.sh
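SnapshotComparer is a tool that loads and compares two snapshots, with configurable thresholds and various filters, and outputs information about the delta.
The delta includes the specific znode paths added, updated, or deleted when one snapshot is compared to the other.
It is useful in use cases that involve snapshot analysis, such as offline data consistency checking and data trend analysis (e.g. what is growing under which zNode path, and when).
This tool only outputs information about permanent nodes, ignoring both sessions and ephemeral nodes.
It provides two tuning parameters to help filter out noise:
1. --nodes: threshold number of children added/removed;
2. --bytes: threshold number of bytes added/removed.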
Locate Snapshots
Snapshots can be found in the ZooKeeper data directory, which is configured in conf/zoo.cfg when setting up the ZooKeeper server.
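Supported Snapshot Formats
This tool supports the uncompressed snapshot format and the compressed snapshot file formats snappy and gz. Snapshots in different formats can be compared directly with this tool, without decompressing them first.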
Running the Tool
Running the tool with no command-line argument, or with an unrecognized argument, prints the following help page:
usage: java -cp <classPath> org.apache.zookeeper.server.SnapshotComparer
-b,--bytes <BYTETHRESHOLD> (Required) The node data delta size threshold, in bytes, for printing the node.
-d,--debug Use debug output.
-i,--interactive Enter interactive mode.
-l,--left <LEFT> (Required) The left snapshot file.
-n,--nodes <NODETHRESHOLD> (Required) The descendant node delta size threshold, in nodes, for printing the node.
-r,--right <RIGHT> (Required) The right snapshot file.
Example command:
./bin/zkSnapshotComparer.sh -l /zookeeper-data/backup/snapshot.d.snappy -r /zookeeper-data/backup/snapshot.44 -b 2 -n 1
Example output:
...
Deserialized snapshot in snapshot.44 in 0.002741 seconds
Processed data tree in 0.000361 seconds
Node count: 10
Total size: 0
Max depth: 4
Count of nodes at depth 0: 1
Count of nodes at depth 1: 2
Count of nodes at depth 2: 4
Count of nodes at depth 3: 3
Node count: 22
Total size: 2903
Max depth: 5
Count of nodes at depth 0: 1
Count of nodes at depth 1: 2
Count of nodes at depth 2: 4
Count of nodes at depth 3: 7
Count of nodes at depth 4: 8
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Analysis for depth 0
Node found in both trees. Delta: 2903 bytes, 12 descendants
Analysis for depth 1
Node /zk_test found in both trees. Delta: 2903 bytes, 12 descendants
Analysis for depth 2
Node /zk_test/gz found in both trees. Delta: 730 bytes, 3 descendants
Node /zk_test/snappy found in both trees. Delta: 2173 bytes, 9 descendants
Analysis for depth 3
Node /zk_test/gz/12345 found in both trees. Delta: 9 bytes, 1 descendants
Node /zk_test/gz/a found only in right tree. Descendant size: 721. Descendant count: 0
Node /zk_test/snappy/anotherTest found in both trees. Delta: 1738 bytes, 2 descendants
Node /zk_test/snappy/test_1 found only in right tree. Descendant size: 344. Descendant count: 3
Node /zk_test/snappy/test_2 found only in right tree. Descendant size: 91. Descendant count: 2
Analysis for depth 4
Node /zk_test/gz/12345/abcdef found only in right tree. Descendant size: 9. Descendant count: 0
Node /zk_test/snappy/anotherTest/abc found only in right tree. Descendant size: 1738. Descendant count: 0
Node /zk_test/snappy/test_1/a found only in right tree. Descendant size: 93. Descendant count: 0
Node /zk_test/snappy/test_1/b found only in right tree. Descendant size: 251. Descendant count: 0
Node /zk_test/snappy/test_2/xyz found only in right tree. Descendant size: 33. Descendant count: 0
Node /zk_test/snappy/test_2/y found only in right tree. Descendant size: 58. Descendant count: 0
All layers compared.
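For trend analysis it is often enough to list the znodes that exist only in the newer snapshot. The sketch below extracts those paths from the comparer output. nodes_only_in_right is a hypothetical helper, and it assumes the line format of the sample output above ("Node <path> found only in right tree. ...").

```shell
#!/bin/sh
# Hypothetical helper: print the paths of nodes reported as present only in the right snapshot.
nodes_only_in_right() {
    awk '/ found only in right tree\./ { print $2 }'
}

# Demo on three lines from the sample output above.
nodes_only_in_right <<'EOF'
Node /zk_test/gz/12345 found in both trees. Delta: 9 bytes, 1 descendants
Node /zk_test/gz/a found only in right tree. Descendant size: 721. Descendant count: 0
Node /zk_test/snappy/test_1 found only in right tree. Descendant size: 344. Descendant count: 3
EOF
# prints:
# /zk_test/gz/a
# /zk_test/snappy/test_1
```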
Interactive Mode
Use "-i" or "--interactive" to enter interactive mode:
./bin/zkSnapshotComparer.sh -l /zookeeper-data/backup/snapshot.d.snappy -r /zookeeper-data/backup/snapshot.44 -b 2 -n 1 -i
There are three options to proceed:
- Press enter to move to print current depth layer;
- Type a number to jump to and print all nodes at a given depth;
- Enter an ABSOLUTE path to print the immediate subtree of a node. Path must start with '/'.
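Note: as the interactive messages indicate, the tool only shows analysis results that pass the byte-threshold and node-threshold tuning parameters.
Press enter to print the current depth layer: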
Current depth is 0
Press enter to move to print current depth layer;
...
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Analysis for depth 0
Node found in both trees. Delta: 2903 bytes, 12 descendants
Type a number to jump to and print all nodes at a given depth:
(Jump forward)
Current depth is 1
...
Type a number to jump to and print all nodes at a given depth;
...
3
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Analysis for depth 3
Node /zk_test/gz/12345 found in both trees. Delta: 9 bytes, 1 descendants
Node /zk_test/gz/a found only in right tree. Descendant size: 721. Descendant count: 0
Filtered node /zk_test/gz/anotherOne of left size 0, right size 0
Filtered right node /zk_test/gz/b of size 0
Node /zk_test/snappy/anotherTest found in both trees. Delta: 1738 bytes, 2 descendants
Node /zk_test/snappy/test_1 found only in right tree. Descendant size: 344. Descendant count: 3
Node /zk_test/snappy/test_2 found only in right tree. Descendant size: 91. Descendant count: 2
(Jump backward)
Current depth is 3
...
Type a number to jump to and print all nodes at a given depth;
...
0
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Analysis for depth 0
Node found in both trees. Delta: 2903 bytes, 12 descendants
An out-of-range depth is handled:
Current depth is 1
...
Type a number to jump to and print all nodes at a given depth;
...
10
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Depth must be in range [0, 4]
Enter an absolute path to print the immediate subtree of a node:
Current depth is 3
...
Enter an ABSOLUTE path to print the immediate subtree of a node.
/zk_test
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Analysis for node /zk_test
Node /zk_test/gz found in both trees. Delta: 730 bytes, 3 descendants
Node /zk_test/snappy found in both trees. Delta: 2173 bytes, 9 descendants
An invalid path is handled:
Current depth is 3
...
Enter an ABSOLUTE path to print the immediate subtree of a node.
/non-exist-path
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Analysis for node /non-exist-path
Path /non-exist-path is neither found in left tree nor right tree.
Invalid input is handled:
Current depth is 1
- Press enter to move to print current depth layer;
- Type a number to jump to and print all nodes at a given depth;
- Enter an ABSOLUTE path to print the immediate subtree of a node. Path must start with '/'.
12223999999999999999999999999999999999999
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Input 12223999999999999999999999999999999999999 is not valid. Depth must be in range [0, 4]. Path must be an absolute path which starts with '/'.
Interactive mode exits automatically once all layers have been compared:
Printing analysis for nodes difference larger than 2 bytes or node count difference larger than 1.
Analysis for depth 4
Node /zk_test/gz/12345/abcdef found only in right tree. Descendant size: 9. Descendant count: 0
Node /zk_test/snappy/anotherTest/abc found only in right tree. Descendant size: 1738. Descendant count: 0
Filtered right node /zk_test/snappy/anotherTest/abcd of size 0
Node /zk_test/snappy/test_1/a found only in right tree. Descendant size: 93. Descendant count: 0
Node /zk_test/snappy/test_1/b found only in right tree. Descendant size: 251. Descendant count: 0
Filtered right node /zk_test/snappy/test_1/c of size 0
Node /zk_test/snappy/test_2/xyz found only in right tree. Descendant size: 33. Descendant count: 0
Node /zk_test/snappy/test_2/y found only in right tree. Descendant size: 58. Descendant count: 0
All layers compared.
Or use ^c to exit interactive mode at any time.
Benchmark
YCSB
Quick Start
This section describes how to run YCSB on ZooKeeper.
1. Start the ZooKeeper server(s)
2. Install Java and Maven
3. Set up YCSB
Git clone YCSB and compile:
git clone http://github.com/brianfrankcooper/YCSB.git
# see the YCSB landing page for more details on downloading YCSB (https://github.com/brianfrankcooper/YCSB#getting-started)
cd YCSB
mvn -pl site.ycsb:zookeeper-binding -am clean package -DskipTests
4. Provide ZooKeeper connection parameters
Set connectString, sessionTimeout and watchFlag in the workload you plan to run.
zookeeper.connectString
zookeeper.sessionTimeout
zookeeper.watchFlag
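- A parameter that enables ZooKeeper watches. Optional values: true or false; the default is false.
- This parameter cannot measure the performance of watches themselves, but it can measure the effect that enabled watches have on read/write requests.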
./bin/ycsb run zookeeper -s -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark -p zookeeper.watchFlag=true
Alternatively, the configs can be set on the shell command line, e.g.:
# create a /benchmark namespace so the workspace is easy to clean up after the test
# e.g. in the CLI: create /benchmark
./bin/ycsb run zookeeper -s -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark -p zookeeper.sessionTimeout=30000
5. Load data and run tests
Load the data:
# -p recordcount,the count of records/paths you want to insert
./bin/ycsb load zookeeper -s -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark -p recordcount=10000 > outputLoad.txt
Run the workload test:
# YCSB workloadb is the workload closest to ZooKeeper's read-heavy usage in the real world.
# -p fieldlength: test how the length of the value/data content affects performance
./bin/ycsb run zookeeper -s -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark -p fieldlength=1000
# -p fieldcount
./bin/ycsb run zookeeper -s -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark -p fieldcount=20
# -p hdrhistogram.percentiles,show the hdrhistogram benchmark result
./bin/ycsb run zookeeper -threads 1 -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark -p hdrhistogram.percentiles=10,25,50,75,90,95,99,99.9 -p histogram.buckets=500
# -threads: multi-clients test, increase the **maxClientCnxns** in the zoo.cfg to handle more connections.
./bin/ycsb run zookeeper -threads 10 -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark
# show the timeseries benchmark result
./bin/ycsb run zookeeper -threads 1 -P workloads/workloadb -p zookeeper.connectString=127.0.0.1:2181/benchmark -p measurementtype=timeseries -p timeseries.granularity=50
# cluster test
./bin/ycsb run zookeeper -P workloads/workloadb -p zookeeper.connectString=192.168.10.43:2181,192.168.10.45:2181,192.168.10.27:2181/benchmark
# test leader's read/write performance by setting zookeeper.connectString to leader's(192.168.10.43:2181)
./bin/ycsb run zookeeper -P workloads/workloadb -p zookeeper.connectString=192.168.10.43:2181/benchmark
# test for large znodes (by default jute.maxbuffer is 1048575 bytes, just under 1 MB). Note: jute.maxbuffer must also be set to the same value on all the zk servers.
./bin/ycsb run zookeeper -jvm-args="-Djute.maxbuffer=4194304" -s -P workloads/workloadc -p zookeeper.connectString=127.0.0.1:2181/benchmark
# Cleaning up the workspace after finishing the benchmark.
# e.g the CLI:deleteall /benchmark
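The commands above all pass zookeeper.connectString as a comma-separated host:port list followed by the /benchmark chroot path. When the same benchmark is repeated against different ensembles, that string can be assembled by a small helper. This is a minimal sketch: build_connect_string is a hypothetical name, and the hosts are the example addresses from the cluster test above.

```shell
#!/bin/sh
# Hypothetical helper: join host:port pairs with commas and append the chroot path.
# Usage: build_connect_string <chroot> <host:port> [<host:port> ...]
build_connect_string() {
    zk_chroot=$1
    shift
    zk_hosts=""
    for zk_host in "$@"; do
        # Add a comma only when zk_hosts already holds at least one entry.
        zk_hosts="${zk_hosts:+$zk_hosts,}$zk_host"
    done
    printf '%s%s\n' "$zk_hosts" "$zk_chroot"
}

build_connect_string /benchmark 192.168.10.43:2181 192.168.10.45:2181 192.168.10.27:2181
# prints: 192.168.10.43:2181,192.168.10.45:2181,192.168.10.27:2181/benchmark
```

The result can then be passed on as -p zookeeper.connectString="$(build_connect_string /benchmark host1:2181 host2:2181)".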
zk-smoketest
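zk-smoketest provides a simple smoketest client for a ZooKeeper ensemble. It is useful for verifying new, updated, and existing installations. See the zk-smoketest project page for more details.
Testing
Fault Injection Framework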
Byteman
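- Byteman is a tool that makes it easy to trace, monitor and test the behavior of Java application and JDK runtime code. It injects Java code into your application methods or into Java runtime methods without the need for you to recompile, repackage or even redeploy your application. Injection can be performed at JVM startup or after startup while the application is still running.
- Visit the official Byteman website to download the latest release.
- A brief Byteman tutorial is also available online.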
Preparations:
# attach byteman to the 3 zk servers at runtime
# 55001,55002,55003 are the byteman binding ports; 714,740,758 are the zk server pids
./bminstall.sh -b -Dorg.jboss.byteman.transform.all -Dorg.jboss.byteman.verbose -p 55001 714
./bminstall.sh -b -Dorg.jboss.byteman.transform.all -Dorg.jboss.byteman.verbose -p 55002 740
./bminstall.sh -b -Dorg.jboss.byteman.transform.all -Dorg.jboss.byteman.verbose -p 55003 758
# load the fault injection script
./bmsubmit.sh -p 55002 -l my_zk_fault_injection.btm
# unload the fault injection script
./bmsubmit.sh -p 55002 -u my_zk_fault_injection.btm
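The three bminstall.sh invocations differ only in the binding port and the server pid, so they can be generated from a list of port:pid pairs. The sketch below only prints the commands, so it can be reviewed before anything is attached. print_bminstall_cmds is a hypothetical helper, and the ports and pids are the example values from the preparation step.

```shell
#!/bin/sh
# Hypothetical helper: print one bminstall.sh command per "port:pid" argument.
print_bminstall_cmds() {
    for pair in "$@"; do
        port=${pair%%:*}   # text before the first colon
        pid=${pair##*:}    # text after the last colon
        echo "./bminstall.sh -b -Dorg.jboss.byteman.transform.all -Dorg.jboss.byteman.verbose -p $port $pid"
    done
}

print_bminstall_cmds 55001:714 55002:740 55003:758
```

Piping the output to sh would execute the commands once they look right.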
Look at the examples below to customize your own byteman fault injection script.
Example 1: This script makes the leader's zxid roll over, to force a re-election.
cat zk_leader_zxid_roll_over.btm
RULE trace zk_leader_zxid_roll_over
CLASS org.apache.zookeeper.server.quorum.Leader
METHOD propose
IF true
DO
traceln("*** Leader zxid has rolled over, forcing re-election ***");
$1.zxid = 4294967295L
ENDRULE
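The constant 4294967295 is 0xffffffff, the largest value that fits in 32 bits. A zxid is a 64-bit number whose high 32 bits hold the epoch and whose low 32 bits hold a counter, so this value represents an exhausted counter, which is the condition the rule uses to provoke a re-election. The split can be illustrated with shell arithmetic. zxid_epoch and zxid_counter are hypothetical helper names, and the first sample zxid is taken from the transaction log dump earlier in this document.

```shell
#!/bin/sh
# Hypothetical helpers: split a zxid into its epoch (high 32 bits)
# and its counter (low 32 bits).
zxid_epoch()   { echo $(( $1 >> 32 )); }
zxid_counter() { echo $(( $1 & 0xffffffff )); }

zxid_epoch 0x200000003      # prints: 2
zxid_counter 0x200000003    # prints: 3
zxid_counter 4294967295     # prints: 4294967295 (0xffffffff, the counter is at its maximum)
```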
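Example 2: This script makes the leader drop the ping packet to a specific follower. The leader will close the LearnerHandler for that follower, and the follower will enter the LOOKING state and then re-enter the quorum in the FOLLOWING state.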
cat zk_leader_drop_ping_packet.btm
RULE trace zk_leader_drop_ping_packet
CLASS org.apache.zookeeper.server.quorum.LearnerHandler
METHOD ping
AT ENTRY
IF $0.sid == 2
DO
traceln("*** Leader drops ping packet to sid: 2 ***");
return;
ENDRULE
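Example 3: This script makes one follower drop the ACK packet. This has little effect in the broadcast phase, because the leader can commit the proposal once it has received ACKs from a majority.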
cat zk_follower_drop_ack_packet.btm
RULE trace zk.follower_drop_ack_packet
CLASS org.apache.zookeeper.server.quorum.SendAckRequestProcessor
METHOD processRequest
AT ENTRY
IF true
DO
traceln("*** Follower drops ACK packet ***");
return;
ENDRULE
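The reason a single silent follower barely matters comes down to majority arithmetic: with n voting servers, floor(n/2) + 1 ACKs are enough to commit. The sketch below computes that number. quorum_size is a hypothetical helper name, and the calculation assumes the default majority quorum rather than hierarchical or weighted quorums.

```shell
#!/bin/sh
# Hypothetical helper: number of ACKs that form a majority among n voting servers.
quorum_size() { echo $(( $1 / 2 + 1 )); }

quorum_size 3    # prints: 2
quorum_size 5    # prints: 3
```

In a 5-server ensemble, for example, the proposal still commits with 3 ACKs, so one follower that drops its ACK does not block progress.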
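Jepsen Test
A framework for distributed systems verification, with fault injection. Jepsen has been used to verify everything from eventually-consistent commutative databases to linearizable coordination systems to distributed task schedulers. More details can be found at jepsen-io.
Running the Dockerized Jepsen is the simplest way to use Jepsen.
Installation: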
git clone git@github.com:jepsen-io/jepsen.git
cd jepsen/docker
# maybe a long time for the first init.
./up.sh
# docker ps to check one control node and five db nodes are up
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8265f1d3f89c docker_control "/bin/sh -c /init.sh" 9 hours ago Up 4 hours 0.0.0.0:32769->8080/tcp jepsen-control
8a646102da44 docker_n5 "/run.sh" 9 hours ago Up 3 hours 22/tcp jepsen-n5
385454d7e520 docker_n1 "/run.sh" 9 hours ago Up 9 hours 22/tcp jepsen-n1
a62d6a9d5f8e docker_n2 "/run.sh" 9 hours ago Up 9 hours 22/tcp jepsen-n2
1485e89d0d9a docker_n3 "/run.sh" 9 hours ago Up 9 hours 22/tcp jepsen-n3
27ae01e1a0c5 docker_node "/run.sh" 9 hours ago Up 9 hours 22/tcp jepsen-node
53c444b00ebd docker_n4 "/run.sh" 9 hours ago Up 9 hours 22/tcp jepsen-n4
Run and test:
# Enter into the container:jepsen-control
docker exec -it jepsen-control bash
# Test
cd zookeeper && lein run test --concurrency 10
# See something like the following to assert that ZooKeeper has passed the Jepsen test
INFO [2019-04-01 11:25:23,719] jepsen worker 8 - jepsen.util 8 :ok :read 2
INFO [2019-04-01 11:25:23,722] jepsen worker 3 - jepsen.util 3 :invoke :cas [0 4]
INFO [2019-04-01 11:25:23,760] jepsen worker 3 - jepsen.util 3 :fail :cas [0 4]
INFO [2019-04-01 11:25:23,791] jepsen worker 1 - jepsen.util 1 :invoke :read nil
INFO [2019-04-01 11:25:23,794] jepsen worker 1 - jepsen.util 1 :ok :read 2
INFO [2019-04-01 11:25:24,038] jepsen worker 0 - jepsen.util 0 :invoke :write 4
INFO [2019-04-01 11:25:24,073] jepsen worker 0 - jepsen.util 0 :ok :write 4
...............................................................................
Everything looks good! ヽ(‘ー`)ノ
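When the test is run unattended, the verdict can be checked by searching the captured output for the success line shown above. This is a minimal sketch: jepsen_passed is a hypothetical helper, jepsen.log is a placeholder file name, and it assumes the "Everything looks good!" line is printed only on success, as in the sample run.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the Jepsen output on stdin contains the success line.
jepsen_passed() {
    grep -q 'Everything looks good!'
}

# Demo on a shortened sample of the output above.
if printf '%s\n' 'INFO jepsen worker 0 - jepsen.util 0 :ok :write 4' \
                 'Everything looks good!' | jepsen_passed; then
    echo "PASS"
else
    echo "FAIL"
fi
# prints: PASS
```

Inside the control container the output could be captured with lein run test --concurrency 10 2>&1 | tee jepsen.log and then checked with jepsen_passed < jepsen.log.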
Reference: the Jepsen blog post on ZooKeeper describes this test in more detail.