事务数高了会如何?

    数据库的事务数高了, 再好的机器, 也顶不了多少, Log Write这块会成关键, 在一个每秒有几百上千个事务的库上, 发现了这样的负载现象, Rq是Run Que, Act是Active Session.

www.AnySQL.net  Load SY/WT/US  Rq  Act
07/23-22:18:02  6.09 11/27/31  64  16 
07/23-22:18:12  6.43 10/29/29  48  41 
07/23-22:18:22  5.82  7/42/21  39 1173
07/23-22:18:32  6.07 12/41/34  72  196 
07/23-22:18:42  5.62  8/52/24  52  329 
07/23-22:18:52  5.41 16/27/35  60  50 
07/23-22:19:02  5.03 13/27/33  44  26

    看一下V$SYSTEM_EVENT中的内容, 就会明白了, 206是单块读事件, 203是log file sync事件.

07/23-22:18:02 206-32714:206455:63, 203-12736:14704:11
07/23-22:18:12 206-30236:227577:75, 203-11799:12400:10
07/23-22:18:22 203-8999:1922849:2136, 206-19601:358029:182
07/23-22:18:32 206-38475:3520870:915, 203-16434:1425936:867
07/23-22:18:42 203-12308:3448652:2801, 206-27495:3031272:1102
07/23-22:18:52 206-34469:322209:93, 203-13581:107794:79
07/23-22:19:02 206-32916:234741:71, 203-13072:16812:12

    两周前Oracle的人来交流时, 我反映过这个问题, 他们说有几个和log file sync超长时间等待有关的Bug. 我也向他们要这些Bug的patch了, 可还没有发送给我们, 有些bug, 发现的晚了, 在metalink是找不到老版本的patch的, 需要特殊对待.

Bug 5065930 - "log file sync" timeouts can occur
Bug 3748799 - Average "log file sync" wait per transaction greater than 1 (no timeouts)
Bug 5087592 - "log file sync" waits from read only commits
Bug 6319685 - LGWR posts do not scale on some platforms

    上面是我自已搜索到的相关Bug, 和交流时Oracle工程师给我看的有些不一样.

留言 (3)

What's your version? If I were you, I would talk to the Oracle support and apply one patch at a time (if needed), to see which one of them can solve the problem, and stop applying more if the problem is gone.

I don't have down time to test the patch, and I have create a SR for this.

Work with Oracle support to narrow down to the exact problem so you only apply the needed patches. It's possible you can do that by taking a few error stacks and compare the stack trace with those in the bug reports. You can use oradebug errorstack or an OS level debugger to do it.

发表留言: