首页 技术 正文
技术 2022年11月16日
0 收藏 465 点赞 3,270 浏览 15443 个字

关键词:blktrace、blk tracer、blkparse、block traceevents、BIO。

本章只做一个记录,关于优化Block层IO性能方法工具。

对Block层没有详细分析,对工作的使用和结果分析也没有展开。

如果有合适的机会补充。

1. blktrace介绍

如下图可知整个Block I/O框架可以分为三层:VFS、Block和I/O设备驱动。

Linux IO性能分析blktrace/blk跟踪器

Linux内核中提供了跟踪Block层操作的手段,可以通过blk跟踪器、或者使用blktrace/blkparse/btt工具抓取分析、或者使用block相关trace events记录。

2. blk跟踪器分析

blk跟踪器作为Linux ftrace的一个跟踪器,主要跟踪Linux Block层相关操作。

2.1 使用blk跟踪器

执行脚本sh blk_tracer.sh 4k 512 fsync,抓取相关case。

#!/bin/bash

echo $1 $2 $3

#Prepare for the test
echo 1 > /sys/block/mmcblk0/trace/enable
echo 0 > /sys/kernel/debug/tracing/tracing_on
echo 0 > /sys/kernel/debug/tracing/events/enable

echo 10000 > /sys/kernel/debug/tracing/buffer_size_kb
echo blk > /sys/kernel/debug/tracing/current_tracer

echo 1 > /sys/kernel/debug/tracing/tracing_on

#Run the test case
if [ $3 = ‘fsync’ ]; then
dd bs=$1 count=$2 if=/dev/zero of=out_empty conv=fsync
else
dd bs=$1 count=$2 if=/dev/zero of=out_empty
fi

#Capture the test log
echo 0 > /sys/kernel/debug/tracing/tracing_on
echo 0 > /sys/block/mmcblk0/trace/enable
cat /sys/kernel/debug/tracing/trace > test_$1_$2_$3.txt
echo nop > /sys/kernel/debug/tracing/current_tracer
echo > /sys/kernel/debug/tracing/trace

2.2 注册blk跟踪器

首先register_tracer向ftrace注册blk跟踪器,然后打印核心函数是blk_tracer_print_line。

static int __init init_blk_tracer(void)
{
...
if (register_tracer(&blk_tracer) != ) {
pr_warning("Warning: could not register the block tracer\n");
unregister_ftrace_event(&trace_blk_event);
return ;
} return ;
}static struct tracer blk_tracer __read_mostly = {
.name = "blk",
.init = blk_tracer_init,
.reset = blk_tracer_reset,
.start = blk_tracer_start,
.stop = blk_tracer_stop,
.print_header = blk_tracer_print_header,
.print_line =blk_tracer_print_line,
.flags = &blk_tracer_flags,
.set_flag = blk_tracer_set_flag,
};static enum print_line_t blk_tracer_print_line(struct trace_iterator *iter)
{
if (!(blk_tracer_flags.val & TRACE_BLK_OPT_CLASSIC))
return TRACE_TYPE_UNHANDLED; return print_one_line(iter, true);
}

print_one_line是打印的核心,这里最主要是根据当前struct blk_io_trace->action选择合适的打印函数。

static enum print_line_t print_one_line(struct trace_iterator *iter,
bool classic)
{
struct trace_seq *s = &iter->seq;
const struct blk_io_trace *t;
u16 what;
int ret;
bool long_act;
blk_log_action_t *log_action; t = te_blk_io_trace(iter->ent);
what = t->action & (( << BLK_TC_SHIFT) - );-------------------------------what表示是什么操作类型?
long_act = !!(trace_flags & TRACE_ITER_VERBOSE);
log_action = classic ? &blk_log_action_classic : &blk_log_action; if (t->action == BLK_TN_MESSAGE) {
ret = log_action(iter, long_act ? "message" : "m");
if (ret)
ret = blk_log_msg(s, iter->ent);
goto out;
} if (unlikely(what == || what >= ARRAY_SIZE(what2act)))---------------------------what为0表示未知操作类型
ret = trace_seq_printf(s, "Unknown action %x\n", what);
else {
ret = log_action(iter, what2act[what].act[long_act]);--------------------------首先调用blk_long_action_classic或者blk_long_action打印
if (ret)
ret = what2act[what].print(s, iter->ent);----------------------------------然后在调用具体action对应的但因函数打印细节。
}
out:
return ret ? TRACE_TYPE_HANDLED : TRACE_TYPE_PARTIAL_LINE;
}static const struct {
const char *act[];
int (*print)(struct trace_seq *s, const struct trace_entry *ent);
} what2act[] = {
[__BLK_TA_QUEUE] = {{ "Q", "queue" }, blk_log_generic },
[__BLK_TA_BACKMERGE] = {{ "M", "backmerge" }, blk_log_generic },
[__BLK_TA_FRONTMERGE] = {{ "F", "frontmerge" }, blk_log_generic },
[__BLK_TA_GETRQ] = {{ "G", "getrq" }, blk_log_generic },
[__BLK_TA_SLEEPRQ] = {{ "S", "sleeprq" }, blk_log_generic },
[__BLK_TA_REQUEUE] = {{ "R", "requeue" }, blk_log_with_error },
[__BLK_TA_ISSUE] = {{ "D", "issue" }, blk_log_generic },
[__BLK_TA_COMPLETE] = {{ "C", "complete" }, blk_log_with_error },
[__BLK_TA_PLUG] = {{ "P", "plug" }, blk_log_plug },
[__BLK_TA_UNPLUG_IO] = {{ "U", "unplug_io" }, blk_log_unplug },
[__BLK_TA_UNPLUG_TIMER] = {{ "UT", "unplug_timer" }, blk_log_unplug },
[__BLK_TA_INSERT] = {{ "I", "insert" }, blk_log_generic },
[__BLK_TA_SPLIT] = {{ "X", "split" }, blk_log_split },
[__BLK_TA_BOUNCE] = {{ "B", "bounce" }, blk_log_generic },
[__BLK_TA_REMAP] = {{ "A", "remap" }, blk_log_remap },
}

从fill_rwbs()可知,RWBS之类的字符表示读写类型,W-BLK_TC_WRITE、S-BLK_TC_SYNC、R-BLK_TC_READ、N-BLK_TN_MESSAGE等等。

结合what2act可知当前当前打印的确切含义,结合打印函数,可知其大概含义。

              dd-   [] ....  1540.751312: ,    Q  WS  +  [dd]---------------Q-queue操作,WS-写同步,blk_log_generic打印-磁盘偏移量+写sector数目 [进程名]
dd- [] .... 1540.751312: , G WS + [dd]---------------G-getrq操作
dd- [] .... 1540.751312: , I WS + [dd]
dd- [] .... 1540.753937: , P N [dd]-------------------------P-plug操作,N-BLK_TN_MESSAGE
dd- [] .... 1540.753937: , A WS + <- (,)
dd- [] .... 1540.753937: , Q WS + [dd]
dd- [] .... 1540.753967: , G WS + [dd]
dd- [] .... 1540.753967: , I WS + [dd]
dd- [] .... 1540.753967: , I WS + [dd]
dd- [] .... 1540.753967: , U N [dd] 2------------------------U-unplug_io操作,
mmcqd/- [] .... 1540.753967: , D WS + [mmcqd/]
mmcqd/- [] .... 1540.753998: , D WS + [mmcqd/]
mmcqd/- [] .... 1540.757690: , C WS + []
mmcqd/- [] .... 1540.760681: , C WS + []
dd- [] .... 1540.760742: , A R + <- (,) 2832------A-remap操作,读取偏移量+读取sect数 <-源设备主从设备号 源设备偏移量

3.blktrace使用及原理

blktrace是针对Linux内核中Block I/O的跟踪工具,属于内核Block Layer。

通过这个工具可以获得I/O请求队列的详细情况,包括读写进程名、进程号、执行时间、读写物理块号、块大小等等。

 PS:代码是LInux 3.4.110,但是试验在Ubuntu Kernel 4.4.0-31进行。

3.1 blktrace的使用

安装blktrace

sudo apt-get install blktrace

blktrace用于抓取Block I/O相关的log,blkparse用于分析blktrace抓取的二进制log。通过btt也可以用blktrace抓取的二进制log。

sudo blktrace -d /dev/sda6 -o – | blkparse -i –

或者分开使用

sudo blktrace -d /dev/sda6 -o sda6 #生成sda6.blktrace.x文件

blkparse -i sda6.blktrace.* -o sda6.txt

btt -i sda6.blktrace.*

3.2 blktrace原理

1. blktrace在运行的时候会在/sys/kernel/debug/block下面设备名称对应的目录,并生成四个文件droppeed/msg/trace0/trace2。

2. blktrace通过ioctl对内核进行设置,从而抓取log。

3. blktrace针对系统每个CPU绑定一个线程来收集相应数据。

3.2.1 Block设备blktrace相关节点

使用strace跟中blktrace执行流程可以看出:打开/dev/sda6设备,然后通过ioctl发送BLKTRACESETUP和BLKTRACESTART。

open("/dev/sda6", O_RDONLY|O_NONBLOCK)  =
statfs("/sys/kernel/debug", {f_type=0x64626720, f_bsize=, f_blocks=, f_bfree=, f_bavail=, f_files=, f_ffree=, f_fsid={, }, f_namelen=, f_frsize=}) = ...
ioctl(, BLKTRACESETUP, {act_mask=, buf_size=, buf_nr=, start_lba=, end_lba=, pid=}, {name="sda6"}) = ...
ioctl(, BLKTRACESTART, 0x608c20) =

下面通过走查代码简单分析流程。

add_disk()
->blk_register_queue()
->blk_trace_init_sysfs()
->blk_trace_attr_group
->blk_trace_attrs

如果定义了CONFIG_BLK_DEV_IO_TRACE,会在/sys/block/sda/sda6下创建一个trace目录。执行blktrace可以打开enable。

3.2.2 Block设备ioctl相关命令

ioctl通过/dev/sda6设置进去,可以看一下ioctl代码流程。

def_blk_fops
->unlocked_ioctl = block_ioctl
->blkdev_ioctl
->BLKTRACESTART/BLKTRACESTOP/BLKTRACESETUPBLKTRACETEARDOWN
->blk_trace_ioctl

blk_trace_ioctl处理BLKTRACESETUP/BLKTRACESTART/BLKTRACESTOP/BLKTRACETEARDOWN四种情况,分别表示配置、开始、停止、释放资源。

int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg)
{
struct request_queue *q;
int ret, start = ;
char b[BDEVNAME_SIZE]; q = bdev_get_queue(bdev);
if (!q)
return -ENXIO; mutex_lock(&bdev->bd_mutex); switch (cmd) {
case BLKTRACESETUP:
bdevname(bdev, b);
ret = blk_trace_setup(q, b, bdev->bd_dev, bdev, arg);
break;
#if defined(CONFIG_COMPAT) && defined(CONFIG_X86_64)
case BLKTRACESETUP32:
bdevname(bdev, b);
ret = compat_blk_trace_setup(q, b, bdev->bd_dev, bdev, arg);
break;
#endif
case BLKTRACESTART:
start = ;
case BLKTRACESTOP:
ret =blk_trace_startstop(q, start);
break;
case BLKTRACETEARDOWN:
ret =blk_trace_remove(q);
break;
default:
ret = -ENOTTY;
break;
} mutex_unlock(&bdev->bd_mutex);
return ret;
}
int blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
struct block_device *bdev,
char __user *arg)
{
struct blk_user_trace_setup buts;
int ret; ret = copy_from_user(&buts, arg, sizeof(buts));
if (ret)
return -EFAULT; ret = do_blk_trace_setup(q, name, dev, bdev, &buts);
if (ret)
return ret; if (copy_to_user(arg, &buts, sizeof(buts))) {
blk_trace_remove(q);
return -EFAULT;
}
return ;
}int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
struct block_device *bdev,
struct blk_user_trace_setup *buts)
{
struct blk_trace *old_bt, *bt = NULL;
struct dentry *dir = NULL;
int ret, i;
...
bt = kzalloc(sizeof(*bt), GFP_KERNEL);
if (!bt)
return -ENOMEM; ret = -ENOMEM;
bt->sequence = alloc_percpu(unsigned long);
if (!bt->sequence)
goto err; bt->msg_data = __alloc_percpu(BLK_TN_MAX_MSG, __alignof__(char));
if (!bt->msg_data)
goto err; ret = -ENOENT; mutex_lock(&blk_tree_mutex);
if (!blk_tree_root) {
blk_tree_root = debugfs_create_dir("block", NULL);-------------------创建/sys/kernel/debug/block目录
if (!blk_tree_root) {
mutex_unlock(&blk_tree_mutex);
goto err;
}
}
mutex_unlock(&blk_tree_mutex); dir = debugfs_create_dir(buts->name, blk_tree_root);---------------------在/sys/kernel/debug/block目录下创建设备名称对应目录 if (!dir)
goto err; bt->dir = dir;
bt->dev = dev;
atomic_set(&bt->dropped, ); ret = -EIO;
bt->dropped_file = debugfs_create_file("dropped", , dir, bt,
&blk_dropped_fops);--------------------------------在/sys/kernel/debug/block/sda6下创建dropped节点
if (!bt->dropped_file)
goto err; bt->msg_file = debugfs_create_file("msg", , dir, bt, &blk_msg_fops);
if (!bt->msg_file)
goto err; bt->rchan = relay_open("trace", dir, buts->buf_size,
buts->buf_nr, &blk_relay_callbacks, bt);----------------------为每个CPU创建/sys/kernel/debug/block/sda6/tracex对应额relay channel
...
if (atomic_inc_return(&blk_probes_ref) == )
blk_register_tracepoints();-------------------------------------------注册和/sys/kernel/debug/tracing/events/block下同样的trace events。 return ;
err:
blk_trace_free(bt);
return ret;
}
blk_trace_startstop执行blktrace的开关操作,停止过后将per cpu的relay chanel强制flush出来。
int blk_trace_startstop(struct request_queue *q, int start)
{
int ret;
struct blk_trace *bt = q->blk_trace;
...
ret = -EINVAL;
if (start) {
if (bt->trace_state == Blktrace_setup ||
bt->trace_state == Blktrace_stopped) {
blktrace_seq++;
smp_mb();
bt->trace_state = Blktrace_running; trace_note_time(bt);
ret = ;
}
} else {
if (bt->trace_state == Blktrace_running) {
bt->trace_state = Blktrace_stopped;
relay_flush(bt->rchan);
ret = ;
}
} return ret;
}

释放blktrace设置创建的buffer、删除相关文件节点,并去注册trace events。

int blk_trace_remove(struct request_queue *q)
{
struct blk_trace *bt; bt = xchg(&q->blk_trace, NULL);
if (!bt)
return -EINVAL; if (bt->trace_state != Blktrace_running)
blk_trace_cleanup(bt); return ;
}static void blk_trace_cleanup(struct blk_trace *bt)
{
blk_trace_free(bt);
if (atomic_dec_and_test(&blk_probes_ref))
blk_unregister_tracepoints();
}

这里的print_one_line和blk跟踪器的打印稍有不同,此处调用blk_log_action打印公共部分信息。

static int __init init_blk_tracer(void)
{
if (!register_ftrace_event(&trace_blk_event)) {
pr_warning("Warning: could not register block events\n");
return ;
}...
}static enum print_line_t blk_trace_event_print(struct trace_iterator *iter,
int flags, struct trace_event *event)
{
return print_one_line(iter, false);
}

blk跟踪器使用blk_log_action_classic()打印,两者区别在于blk跟踪器

static int blk_log_action_classic(struct trace_iterator *iter, const char *act)
{
char rwbs[RWBS_LEN];
unsigned long long ts = iter->ts;
unsigned long nsec_rem = do_div(ts, NSEC_PER_SEC);
unsigned secs = (unsigned long)ts;
const struct blk_io_trace *t = te_blk_io_trace(iter->ent); fill_rwbs(rwbs, t); return trace_seq_printf(&iter->seq,
"%3d,%-3d %2d %5d.%09lu %5u %2s %3s ",
MAJOR(t->device), MINOR(t->device), iter->cpu,
secs, nsec_rem, iter->ent->pid, act, rwbs);
}static int blk_log_action(struct trace_iterator *iter, const char *act)
{
char rwbs[RWBS_LEN];
const struct blk_io_trace *t = te_blk_io_trace(iter->ent); fill_rwbs(rwbs, t);
return trace_seq_printf(&iter->seq, "%3d,%-3d %2s %3s ",
MAJOR(t->device), MINOR(t->device), act, rwbs);
}

3.3 blkparse结果分析

使用blkparse分析blktrace后,得出的结果和blk跟踪器大概类似。

可见实验结果多了第2列CPU序号,和第4列执行的时间戳。

  ,                 0.000000000     P   N [jbd2/sda1-]
, 0.000011134 U N [jbd2/sda1-]
, 4.000017080 A WS + <- (,)
, 4.000018315 Q WS + [jbd2/sda6-]
, 4.000024476 G WS + [jbd2/sda6-]
, 4.000025282 P N [jbd2/sda6-]
, 4.000026610 A WS + <- (,)
, 4.000026935 Q WS + [jbd2/sda6-]
, 4.000028253 M WS + [jbd2/sda6-]
, 4.000028895 A WS + <- (,)
, 4.000029180 Q WS + [jbd2/sda6-]
, 4.000029585 M WS + [jbd2/sda6-]
, 4.000030182 A WS + <- (,)
, 4.000030463 Q WS + [jbd2/sda6-]
, 4.000030845 M WS + [jbd2/sda6-]
, 4.000031406 A WS + <- (,)
, 4.000031687 Q WS + [jbd2/sda6-]
, 4.000032066 M WS + [jbd2/sda6-]
...
CPU0 (sda6):
Reads Queued: , 16816KiB Writes Queued: , 4648KiB
Read Dispatches: , 16632KiB Write Dispatches: , 3876KiB
Reads Requeued: Writes Requeued:
Reads Completed: , 132KiB Writes Completed: , 0KiB
Read Merges: , 108KiB Write Merges: , 1960KiB
Read depth: Write depth:
IO unplugs: Timer unplugs:
CPU1 (sda6):
Reads Queued: , 12724KiB Writes Queued: , 5080KiB
Read Dispatches: , 12908KiB Write Dispatches: , 5852KiB
Reads Requeued: Writes Requeued:
Reads Completed: , 29408KiB Writes Completed: , 9728KiB
Read Merges: , 0KiB Write Merges: , 3044KiB
Read depth: Write depth:
IO unplugs: Timer unplugs: Total (sda6):
Reads Queued: , 29540KiB Writes Queued: , 9728KiB
Read Dispatches: , 29540KiB Write Dispatches: , 9728KiB
Reads Requeued: Writes Requeued:
Reads Completed: , 29540KiB Writes Completed: , 9728KiB
Read Merges: , 108KiB Write Merges: , 5004KiB
IO unplugs: Timer unplugs: Throughput (R/W): 2KiB/s / 0KiB/s
Events (sda6): entries
Skips: forward ( - 0.0%)

4. block的Traceevents

关于block Traceevents和blk跟踪器的关联,可以通过blk_register_tracepoints()和include/trace/events/block.h中定义的时间关联起来,两者是一一对应的。

block事件打印的信息多了一个字符串,可读性更强一点。比blk跟踪器和blktrace更容易了解含义。

在/sys/kernel/debug/tracing/events/block下,可以分别打开相关事件:

block_bio_backmerge
block_bio_bounce
block_bio_complete
block_bio_frontmerge
block_bio_queue
block_bio_remap
block_getrq
block_plug
block_rq_abort
block_rq_complete
block_rq_insert
block_rq_issue
block_rq_remap
block_rq_requeue
block_sleeprq
block_split
block_unplug

同时根据这些字符串,也容易找出内核中代码位置。

也只有明白了block层流程,以及这些关键事件的log,才能知道问题点在哪里?然后再去进行优化。

# tracer: nop
#
# entries-in-buffer/entries-written: / #P:
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
dd- [] .... 3287.902618: block_bio_remap: , W + <- (,)
dd- [] .... 3287.902618: block_bio_queue: , W + [dd]
dd- [] .... 3287.902649: block_getrq: , W + [dd]
dd- [] .... 3287.902649: block_plug: [dd]
dd- [] d... 3287.902649: block_rq_insert: , W () + [dd]
dd- [] d... 3287.902679: block_unplug: [dd]
mmcqd/- [] d... 3287.902679: block_rq_issue: , W () + [mmcqd/]
mmcqd/- [] d... 3287.907440: block_rq_complete: , W () + []
dd- [] .... 3287.907501: block_bio_remap: , WS + <- (,)
dd- [] .... 3287.907501: block_bio_queue: , WS + [dd]
dd- [] .... 3287.907501: block_getrq: , WS + [dd]
dd- [] d... 3287.907501: block_rq_insert: , WS () + [dd]
mmcqd/- [] d... 3287.907532: block_rq_issue: , WS () + [mmcqd/]
mmcqd/- [] d... 3287.921631: block_rq_complete: , WS () + []
dd- [] .... 3287.921631: block_bio_remap: , WS + <- (,)
dd- [] .... 3287.921631: block_bio_queue: , WS + [dd]
dd- [] .... 3287.921631: block_getrq: , WS + [dd]
dd- [] .... 3287.921661: block_plug: [dd]
dd- [] .... 3287.921661: block_bio_remap: , WS + <- (,)
dd- [] .... 3287.921661: block_bio_queue: , WS + [dd]
dd- [] .... 3287.921661: block_getrq: , WS + [dd]
dd- [] .... 3287.921661: block_bio_remap: , WS + <- (,)
dd- [] .... 3287.921661: block_bio_queue: , WS + [dd]
dd- [] .... 3287.921661: block_getrq: , WS + [dd]
dd- [] .... 3287.921661: block_bio_remap: , WS + <- (,)
dd- [] .... 3287.921661: block_bio_queue: , WS + [dd]
dd- [] .... 3287.921692: block_getrq: , WS + [dd]
dd- [] .... 3287.921692: block_bio_remap: , WS + <- (,)
dd- [] .... 3287.921692: block_bio_queue: , WS + [dd]
dd- [] .... 3287.921692: block_getrq: , WS + [dd]
dd- [] .... 3287.921692: block_bio_remap: , WS + <- (,)
dd- [] .... 3287.921692: block_bio_queue: , WS + [dd]

5. 优化尝试

对文件系统性能的调优,主要通过两个目录下节点:/proc/sys/vm和/sys/block/sda/queue。

5.1 更改IO调度算法

echo cfq > /sys/block/sda/queue/scheduler

5.2 修改磁盘相关内核参数

/sys/block/sda/queue/scheduler

/sys/block/sda/queue/nr_requests 磁盘队列长度。默认只有 128 个队列,可以提高到 512 个.会更加占用内存,但能更加多的合并读写操作,速度变慢,但能读写更加多的量

/sys/block/sda/queue/iosched/antic_expire 等待时间 。读取附近产生的新请时等待多长时间

/sys/block/sda/queue/read_ahead_kb这个参数对顺序读非常有用,意思是,一次提前读多少内容,无论实际需要多少.默认一次读 128kb 远小于要读的,设置大些对读大文件非常有用,可以有效的减少读 seek 的次数,这个参数可以使用 blockdev –setra 来设置,setra 设置的是多少个扇区,所以实际的字节是除以2,比如设置 512 ,实际是读 256 个字节. /proc/sys/vm/dirty_ratio

这个参数控制文件系统的文件系统写缓冲区的大小,单位是百分比,表示系统内存的百分比,表示当写缓冲使用到系统内存多少的时候,开始向磁盘写出数 据.增大之会使用更多系统内存用于磁盘写缓冲,也可以极大提高系统的写性能.但是,当你需要持续、恒定的写入场合时,应该降低其数值,一般启动上缺省是 10.下面是增大的方法: echo ’40’>

/proc/sys/vm/dirty_background_ratio

这个参数控制文件系统的pdflush进程,在何时刷新磁盘.单位是百分比,表示系统内存的百分比,意思是当写缓冲使用到系统内存多少的时候, pdflush开始向磁盘写出数据.增大之会使用更多系统内存用于磁盘写缓冲,也可以极大提高系统的写性能.但是,当你需要持续、恒定的写入场合时,应该降低其数值,一般启动上缺省是 5.下面是增大的方法: echo ’20’ >

/proc/sys/vm/dirty_writeback_centisecs

这个参数控制内核的脏数据刷新进程pdflush的运行间隔.单位是 1/100 秒.缺省数值是500,也就是 5 秒.如果你的系统是持续地写入动作,那么实际上还是降低这个数值比较好,这样可以把尖峰的写操作削平成多次写操作.设置方法如下: echo ‘200’ > /proc/sys/vm/dirty_writeback_centisecs 如果你的系统是短期地尖峰式的写操作,并且写入数据不大(几十M/次)且内存有比较多富裕,那么应该增大此数值: echo ‘1000’ > /proc/sys/vm/dirty_writeback_centisecs

/proc/sys/vm/dirty_expire_centisecs

这个参数声明Linux内核写缓冲区里面的数据多“旧”了之后,pdflush进程就开始考虑写到磁盘中去.单位是 1/100秒.缺省是 30000,也就是 30 秒的数据就算旧了,将会刷新磁盘.对于特别重载的写操作来说,这个值适当缩小也是好的,但也不能缩小太多,因为缩小太多也会导致IO提高太快.建议设置为 1500,也就是15秒算旧. echo ‘1500’ > /proc/sys/vm/dirty_expire_centisecs 当然,如果你的系统内存比较大,并且写入模式是间歇式的,并且每次写入的数据不大(比如几十M),那么这个值还是大些的好.

参考文档

1. Linux IO Scheduler(Linux IO 调度器)

2. Linux文件系统性能优化

3. Linux 性能优化之 IO 子系统

 

相关推荐
python开发_常用的python模块及安装方法
adodb:我们领导推荐的数据库连接组件bsddb3:BerkeleyDB的连接组件Cheetah-1.0:我比较喜欢这个版本的cheeta…
日期:2022-11-24 点赞:878 阅读:9,105
Educational Codeforces Round 11 C. Hard Process 二分
C. Hard Process题目连接:http://www.codeforces.com/contest/660/problem/CDes…
日期:2022-11-24 点赞:807 阅读:5,582
下载Ubuntn 17.04 内核源代码
zengkefu@server1:/usr/src$ uname -aLinux server1 4.10.0-19-generic #21…
日期:2022-11-24 点赞:569 阅读:6,429
可用Active Desktop Calendar V7.86 注册码序列号
可用Active Desktop Calendar V7.86 注册码序列号Name: www.greendown.cn Code: &nb…
日期:2022-11-24 点赞:733 阅读:6,200
Android调用系统相机、自定义相机、处理大图片
Android调用系统相机和自定义相机实例本博文主要是介绍了android上使用相机进行拍照并显示的两种方式,并且由于涉及到要把拍到的照片显…
日期:2022-11-24 点赞:512 阅读:7,836
Struts的使用
一、Struts2的获取  Struts的官方网站为:http://struts.apache.org/  下载完Struts2的jar包,…
日期:2022-11-24 点赞:671 阅读:4,919