Title | SQL Server Automated Operations Series: Performance Counter Monitoring Script (PowerShell) |
Article | Requirements

In a production environment you sometimes need to check performance counter values automatically and raise an early warning, for example by email, when a value is abnormal. This article shows how to implement that kind of monitoring with PowerShell.

Counters to monitor

From experience, a DBA generally needs to watch the following system performance counters.

CPU:
\Processor(_Total)\% Processor Time
\Processor(_Total)\% Privileged Time
\SQLServer:SQL Statistics\Batch Requests/sec
\SQLServer:SQL Statistics\SQL Compilations/sec
\SQLServer:SQL Statistics\SQL Re-Compilations/sec
\System\Processor Queue Length
\System\Context Switches/sec

Memory:
\Memory\Available Bytes
\Memory\Pages/sec
\Memory\Page Faults/sec
\Memory\Pages Input/sec
\Memory\Pages Output/sec
\Process(sqlservr)\Private Bytes
\SQLServer:Buffer Manager\Buffer cache hit ratio
\SQLServer:Buffer Manager\Page life expectancy
\SQLServer:Buffer Manager\Lazy writes/sec
\SQLServer:Memory Manager\Memory Grants Pending
\SQLServer:Memory Manager\Target Server Memory (KB)
\SQLServer:Memory Manager\Total Server Memory (KB)

Disk:
\PhysicalDisk(_Total)\% Disk Time
\PhysicalDisk(_Total)\Current Disk Queue Length
\PhysicalDisk(_Total)\Avg. Disk Queue Length
\PhysicalDisk(_Total)\Disk Transfers/sec
\PhysicalDisk(_Total)\Disk Bytes/sec
\PhysicalDisk(_Total)\Avg. Disk sec/Read
\PhysicalDisk(_Total)\Avg. Disk sec/Write

SQL Server:
\SQLServer:Access Methods\FreeSpace Scans/sec
\SQLServer:Access Methods\Full Scans/sec
\SQLServer:Access Methods\Table Lock Escalations/sec
\SQLServer:Access Methods\Worktables Created/sec
\SQLServer:General Statistics\Processes blocked
\SQLServer:General Statistics\User Connections
\SQLServer:Latches\Total Latch Wait Time (ms)
\SQLServer:Locks(_Total)\Lock Timeouts (timeout > 0)/sec
\SQLServer:Locks(_Total)\Lock Wait Time (ms)
\SQLServer:Locks(_Total)\Number of Deadlocks/sec
\SQLServer:SQL Statistics\Batch Requests/sec
\SQLServer:SQL Statistics\SQL Re-Compilations/sec

For the meaning of these counters, see my previous article on which counters to monitor for SQL Server.

Monitoring script

    # Connection and Database Mail settings
    $server = "(local)"
    $uid = "sa"
    $db = "master"
    $password = "password"   # renamed from $pwd, which shadows PowerShell's automatic $PWD variable
    $mailprfname = "SendEmail"
    $recipients = ""
    $subject = "Database counter alert!"

    # Configuration files: servers to monitor and counter thresholds
    $computernamexml = "f:\computername.xml"
    $alter_cpuxml = "f:\alter_cpu.xml"

    # Returns the server names listed in the XML file
    function GetServerName($xmlpath)
    {
        $xml = [xml](Get-Content $xmlpath)
        $return = New-Object Collections.Generic.List[string]
        # @() handles both a single <computername> and several
        foreach ($cp in @($xml.computernames.computername))
        {
            $return.Add(([string]$cp).Trim())
        }
        $return
    }

    # Returns the <Counter> elements (counter path plus alter/operator attributes)
    function GetAlterCounter($xmlpath)
    {
        $xml = [xml](Get-Content $xmlpath)
        $xml.counters.Counter
    }

    # Sends the alert text through Database Mail
    function CreateAlter($message)
    {
        $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
        $SqlConnection.ConnectionString = "Server = $server; Database = $db;User Id = $uid; Password = $password"
        $cc = $SqlConnection.CreateCommand()
        if (-not ($SqlConnection.State -like "Open"))
        {
            $SqlConnection.Open()
        }
        # Escape single quotes so the report cannot break the T-SQL string literal
        $body = $message.Replace("'", "''")
        $cc.CommandText = " EXEC msdb..sp_send_dbmail @profile_name = '$mailprfname' ,@recipients = '$recipients' ,@body = '$body' ,@subject = '$subject' "
        $cc.ExecuteNonQuery() | Out-Null
        $SqlConnection.Close()
    }

    $names = GetServerName $computernamexml
    $pfcounters = @(GetAlterCounter $alter_cpuxml)   # @() keeps indexing valid with a single rule

    foreach ($cp in $names)
    {
        # Build the full counter paths for this server, e.g. \\server\System\Processor Queue Length
        $p = New-Object Collections.Generic.List[string]
        $report = ""
        foreach ($pfc in $pfcounters)
        {
            $counter = "\\" + $cp + $pfc.get_InnerText().Trim()
            $p.Add($counter)
        }

        # The samples are assumed to come back in the same order as the requested paths
        $count = Get-Counter $p
        for ($i = 0; $i -lt $count.CounterSamples.Count; $i++)
        {
            $v = $count.CounterSamples[$i].CookedValue
            $pfc = $pfcounters[$i]
            $b = ""
            $lg = ""
            if ($pfc.operator -eq "lt")
            {
                # Expected: value < threshold; alert when it is not
                if ($v -ge [double]$pfc.alter) { $b = "alter"; $lg = "Greater Than" }
            }
            elseif ($pfc.operator -eq "gt")
            {
                # Expected: value > threshold; alert when it is not
                if ($v -le [double]$pfc.alter) { $b = "alter"; $lg = "Less Than" }
            }
            if ($b -eq "alter")
            {
                $path = "\\" + $cp + $pfc.get_InnerText().Trim()
                $item = "{0}:{1};{2} Threshold:{3}" -f $path, $v.ToString(), $lg, $pfc.alter.Trim()
                $report += $item + "`n"
            }
        }
        if ($report -ne "")
        {
            # Raise the alert: counter, threshold, current value
            CreateAlter $report
        }
    }

The script relies on two configuration files, computernamexml and alter_cpuxml, shown below:

    <computernames>
        <computername>
            wuxuelei-pc
        </computername>
    </computernames>

    <Counters>
        <Counter alter = "10" operator = "gt" >\Processor(_Total)\% Processor Time</Counter>
        <Counter alter = "10" operator = "gt" >\Processor(_Total)\% Privileged Time</Counter>
        <Counter alter = "10" operator = "gt" >\SQLServer:SQL Statistics\Batch Requests/sec</Counter>
        <Counter alter = "10" operator = "gt" >\SQLServer:SQL Statistics\SQL Compilations/sec</Counter>
        <Counter alter = "10" operator = "gt" >\SQLServer:SQL Statistics\SQL Re-Compilations/sec</Counter>
        <Counter alter = "10" operator= "lt" >\System\Processor Queue Length</Counter>
        <Counter alter = "10" operator= "lt" >\System\Context Switches/sec</Counter>
    </Counters>

The alter attribute is the threshold, and operator states the relation the counter value is expected to satisfy. Take the first entry: with operator gt, an alert is raised when the threshold is greater than or equal to the counter value. With operator lt the check is reversed, and the alert is raised when the counter value reaches or exceeds the threshold.

This configuration-driven approach keeps the monitoring rules flexible:
1. It can check free disk space, for example.
2. It can detect peak-load conditions.
3. Thresholds in the production system can be adjusted periodically from historical values, which is the so-called performance baseline.

How to run the alert check

1. As a SQL Server Agent job
2. As a Windows scheduled task

Either option works and both are simple to set up; if you have not done it before, a quick search will show you how. If you would rather not touch the production system itself, you can add a separate server dedicated to automated operations checks and monitor remotely from there.

Later articles will look at PowerShell remoting, including capturing a screenshot automatically at the moment of an incident and sending it by email, so the DBA gets first-hand evidence for diagnosing the problem.

The result looks like this: [screenshot not preserved]

The above only shows the approach; adapt the details to your own needs. |
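The threshold rule described in the article (the alter and operator attributes) can be tried out in isolation, without Get-Counter or a SQL Server connection. The following is a minimal sketch under stated assumptions: Test-CounterRule is a hypothetical helper that does not appear in the original script, demo-host is a placeholder server name, and the sample values are made up to stand in for real counter readings.

```powershell
# Hypothetical helper (not part of the original script): returns an alert line
# when the sampled value violates the rule, and nothing otherwise.
function Test-CounterRule {
    param(
        [string]$Path,       # full counter path
        [double]$Value,      # sampled value (CookedValue in the script)
        [string]$Operator,   # "lt" or "gt", as in the <Counter> element
        [double]$Threshold   # the "alter" attribute
    )
    $relation = ''
    if ($Operator -eq 'lt' -and $Value -ge $Threshold) { $relation = 'Greater Than' }
    elseif ($Operator -eq 'gt' -and $Value -le $Threshold) { $relation = 'Less Than' }
    if ($relation -eq '') { return }
    '{0}:{1};{2} Threshold:{3}' -f $Path, $Value, $relation, $Threshold
}

# Inline copy of two rules from the sample configuration, so no file is needed.
$config = [xml]@'
<Counters>
  <Counter alter="10" operator="gt">\Processor(_Total)\% Processor Time</Counter>
  <Counter alter="10" operator="lt">\System\Processor Queue Length</Counter>
</Counters>
'@
$rules = @($config.Counters.Counter)

# Made-up sample values standing in for Get-Counter output, one per rule.
$samples = 4.5, 12

$report = for ($i = 0; $i -lt $rules.Count; $i++) {
    $rule = $rules[$i]
    $path = '\\demo-host' + $rule.InnerText.Trim()
    Test-CounterRule -Path $path -Value $samples[$i] -Operator $rule.operator -Threshold ([double]$rule.alter)
}
$report
```

With these sample values both rules are violated, so $report holds two lines: one marked "Less Than" for the gt rule and one marked "Greater Than" for the lt rule. Keeping the comparison in its own function makes it easy to check new threshold settings before they go into the XML file.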