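Debugging programs under .NET is far simpler than under C/C++, since the notorious pointer and out-of-bounds problems are gone. A few kinds of failure are still hard to pin down, however:
1. The process terminates abnormally. For a solution, see "Net 下未捕获异常的处理" (handling unhandled exceptions under .NET).
2. Memory leaks, or memory that is allocated and never released. For a solution, see "用 NET Memory Profiler 跟 应用内存使用情况基本应用篇" (basic use of .NET Memory Profiler to track an application's memory usage). Monitoring this from your own code is something I will cover in a later article.
3. A thread hangs for an unknown reason, for example a deadlock.
4. The program enters an infinite loop.
This article explains how to write code that tracks and reports the last two kinds of failure in real time.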
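First, we need a dedicated monitor thread that watches the threads to be monitored.
I wrote a monitor class, ThreadMonitor. Before monitoring starts, the monitor thread's priority is set to the highest level: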
public ThreadMonitor()
{
    _MonitorThread = new Thread(new ThreadStart(MonitorTask));
    _MonitorThread.Priority = ThreadPriority.Highest;
    _MonitorThread.IsBackground = true;
}
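Next, we give the monitor a few public methods:
The Start method lets the caller start monitoring.
The Register method adds a thread that needs monitoring to the watch list.
The Heartbeat method is explained later.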
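/// <summary>
/// Start monitor
/// </summary>
public void Start()
{
    _MonitorThread.Start();
}

/// <summary>
/// Monitor register
/// </summary>
/// <param name="monitorPara">Monitor parameter</param>
public void Register(MonitorParameter monitorPara)
{
    Debug.Assert(monitorPara != null);
    Debug.Assert(monitorPara.Thread != null);

    lock (_RegisterLock)
    {
        // Check and insert under the same lock, so that two concurrent
        // registrations of one thread cannot both pass the check.
        if (GetTCB(monitorPara.Thread) != null)
        {
            throw new System.ArgumentException("Register repeatedly!");
        }

        _TCBTable.Add(monitorPara.Thread.ManagedThreadId, new TCB(monitorPara));
    }
}

public void Heartbeat(Thread t)
{
    TCB tcb = GetTCB(t);
    if (tcb == null)
    {
        throw new System.ArgumentException("This thread was not registered!");
    }

    // A heartbeat proves the thread is alive: record the time,
    // reset the running-state hit counter and clear the Hang flag.
    tcb.LastHeartbeat = DateTime.Now;
    tcb.HitTimes = 0;
    tcb.Status &= ~ThreadStatus.Hang;
}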
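The methods above rely on a TCB class, a _TCBTable and a GetTCB helper that the article does not show. The following is only a minimal sketch of what such a table might look like, not the author's implementation. The names TcbTable, TryAdd and Get are hypothetical, and the TCB here is built from a Thread rather than from a MonitorParameter to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

[Flags]
public enum ThreadStatus
{
    Normal = 0,
    Hang = 1,
    DeadCycle = 2
}

// Hypothetical per-thread control block. The field names follow the
// Heartbeat method in the article; everything else is an assumption.
public class TCB
{
    public readonly Thread Thread;
    public DateTime LastHeartbeat = DateTime.Now;
    public int HitTimes;
    public ThreadStatus Status = ThreadStatus.Normal;

    public TCB(Thread thread)
    {
        Thread = thread;
    }
}

// Hypothetical table keyed by ManagedThreadId, guarded by one lock.
public class TcbTable
{
    private readonly object _lock = new object();
    private readonly Dictionary<int, TCB> _table = new Dictionary<int, TCB>();

    // Returns false when the thread is already registered.
    public bool TryAdd(Thread t)
    {
        lock (_lock)
        {
            if (_table.ContainsKey(t.ManagedThreadId))
            {
                return false;
            }
            _table.Add(t.ManagedThreadId, new TCB(t));
            return true;
        }
    }

    // Returns the control block for the thread, or null if it is unknown.
    public TCB Get(Thread t)
    {
        lock (_lock)
        {
            TCB tcb;
            return _table.TryGetValue(t.ManagedThreadId, out tcb) ? tcb : null;
        }
    }
}
```

In this sketch the duplicate check and the insertion happen under a single lock, so two concurrent registrations of the same thread cannot both succeed.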
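Now let me explain how a hung thread is detected.
The monitor exposes a heartbeat call, Heartbeat. Each monitored thread must set up a timer and send heartbeats to the monitor at regular intervals. If the monitor receives no heartbeat within a given period, it considers the monitored thread to be abnormally hung. The period is specified by the HangTimeout property of the MonitorParameter argument.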
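The timeout rule can be sketched roughly as follows. This is an illustration under assumptions, not the article's MonitorTask code (which is not shown). HeartbeatWatch and its members are hypothetical names, and the current time is passed in explicitly so that the logic can be exercised without waiting.

```csharp
using System;

// Hypothetical helper: tracks the last heartbeat of one monitored thread
// and decides whether the thread should be reported as hung.
public class HeartbeatWatch
{
    private readonly object _lock = new object();
    private readonly TimeSpan _hangTimeout;   // plays the role of HangTimeout
    private DateTime _lastHeartbeatUtc;

    public HeartbeatWatch(TimeSpan hangTimeout, DateTime nowUtc)
    {
        _hangTimeout = hangTimeout;
        _lastHeartbeatUtc = nowUtc;
    }

    // Called by the monitored thread (for example from a timer).
    public void Heartbeat(DateTime nowUtc)
    {
        lock (_lock)
        {
            _lastHeartbeatUtc = nowUtc;
        }
    }

    // Called by the monitor thread on each polling pass.
    public bool IsHung(DateTime nowUtc)
    {
        lock (_lock)
        {
            return nowUtc - _lastHeartbeatUtc > _hangTimeout;
        }
    }
}
```

A monitor loop would call IsHung(DateTime.UtcNow) for every registered thread on each pass. The heartbeat interval should be comfortably shorter than the timeout, otherwise a healthy thread can be reported as hung.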
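Detecting a hang is not enough; the report is only useful if it says where the thread is stuck. So how do we obtain a thread's current call position? The .NET Framework provides a way to capture a thread's current stack trace, as the following code shows: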
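private string GetThreadStackTrace(Thread t)
{
    bool needFileInfo = NeedFileInfo;
    StackTrace stack;

    t.Suspend();
    try
    {
        // The stack can only be captured while the thread is suspended.
        stack = new StackTrace(t, needFileInfo);
    }
    finally
    {
        // Resume immediately, even if capturing the stack trace throws.
        t.Resume();
    }

    return stack.ToString();
}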
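Note that StackTrace(t, needFileInfo) may only be called after thread t has been suspended; otherwise an exception is thrown. Thread.Suspend is a dangerous call, though, because the caller cannot know what t was doing at the moment it was suspended. It may be waiting for some resource, and forcibly suspending it at that point can easily deadlock the program. Fortunately, the StackTrace(t, needFileInfo) call itself does not contend for resources with other threads, the calling thread in particular. Even so, t.Resume must be called as soon as that statement finishes, to end the suspension of thread t.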
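Having covered abnormal hangs, let us turn to detecting infinite loops.
Before settling on my current approach, I tried to use the GetThreadTimes API function to obtain the actual CPU time of each monitored thread and compute its CPU usage from that. Unfortunately the attempt failed: calling GetThreadTimes from a thread other than the target did not return the target thread's CPU time. (It seems to work for unmanaged threads, but in my tests it did not for .NET managed threads, and I have not yet worked out why.) GetThreadTimes is also not very accurate; see "对老赵写的简单性能计数器的修改续 关于", a follow-up post about modifications to Lao Zhao's simple performance counter.
So, for lack of anything better, I adopted a less than ideal approach.
The monitor periodically samples the TotalProcessorTime of the current process and uses it to compute the process's CPU usage. If that usage stays above a threshold proportional to 100 / (number of CPUs) %, that is, a fixed fraction of what one fully busy CPU contributes, for a certain period, the process is considered to contain an infinite loop. The length of this test period is specified by the DeadCycleTimeout property of the MonitorParameter argument.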
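The CPU usage calculation might be sketched like this. The class and method names are hypothetical, and the fraction passed to DeadCycleThreshold (0.9 in the usage note) is purely an assumption, since the article's exact constant is not available.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for process-wide CPU usage, as a percentage of the
// capacity of all CPUs together.
public static class CpuUsage
{
    // cpuBefore/cpuAfter are two TotalProcessorTime samples taken
    // wallElapsed apart.
    public static double Percent(TimeSpan cpuBefore, TimeSpan cpuAfter,
                                 TimeSpan wallElapsed, int processorCount)
    {
        if (wallElapsed <= TimeSpan.Zero || processorCount <= 0)
        {
            throw new ArgumentOutOfRangeException();
        }
        double cpuMs = (cpuAfter - cpuBefore).TotalMilliseconds;
        return cpuMs / (wallElapsed.TotalMilliseconds * processorCount) * 100.0;
    }

    // One spinning thread can use at most 100 / processorCount percent of
    // the machine, so the threshold is a fraction of that value.
    public static double DeadCycleThreshold(int processorCount, double fraction)
    {
        return 100.0 / processorCount * fraction;
    }

    // Samples the current process over the given interval (blocks the caller).
    public static double SampleCurrentProcess(TimeSpan interval)
    {
        Process process = Process.GetCurrentProcess();
        TimeSpan before = process.TotalProcessorTime;
        Stopwatch stopwatch = Stopwatch.StartNew();
        Thread.Sleep(interval);
        process.Refresh();
        TimeSpan after = process.TotalProcessorTime;
        return Percent(before, after, stopwatch.Elapsed, Environment.ProcessorCount);
    }
}
```

A monitor could then treat the process as looping when SampleCurrentProcess keeps returning values above DeadCycleThreshold(Environment.ProcessorCount, 0.9) for the whole DeadCycleTimeout period.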
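This raises a problem: we know the program is in an infinite loop, but not which thread is looping. How do we find the real culprit?
My approach is to check each thread's state once per second; if the thread is in the running state, that counts as one hit. Once an infinite loop has been confirmed, we look at the number of hits within one checking period, and if it is high enough, that thread is considered to be the one looping. This still has a flaw: while the main thread is waiting for Windows messages, or a console program's thread is waiting for console input, its state is reported as Running the whole time even though it is actually blocked, and I have not found a good way to tell that a thread is blocked. What to do? My crude workaround is, when both conditions above hold, to also check whether any heartbeat arrived during the period. No heartbeat means the thread is in an infinite loop. A heartbeat does not prove the absence of one, however, so in that case every suspicious thread is reported and the judgment is left to a human.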
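The decision rule described above could be sketched as follows. It assumes the process-wide CPU check has already flagged an infinite loop. Verdict, ThreadSample, DeadCycleClassifier and the minHitRatio parameter are hypothetical names introduced only for this illustration.

```csharp
using System;

public enum Verdict
{
    NotSuspect,   // too few running-state hits
    Suspect,      // many hits, but heartbeats still arrive: report for review
    DeadCycle     // many hits and no heartbeat: treated as looping
}

// Hypothetical per-thread statistics collected over one checking period.
public class ThreadSample
{
    public readonly string Name;
    public readonly int HitTimes;        // one-second checks that saw the thread running
    public readonly bool HeartbeatSeen;  // whether any heartbeat arrived in the period

    public ThreadSample(string name, int hitTimes, bool heartbeatSeen)
    {
        Name = name;
        HitTimes = hitTimes;
        HeartbeatSeen = heartbeatSeen;
    }
}

public static class DeadCycleClassifier
{
    // checksInWindow: number of state checks made in the period.
    // minHitRatio: share of checks that must be hits (an assumed tuning knob).
    public static Verdict Classify(ThreadSample sample, int checksInWindow, double minHitRatio)
    {
        if (checksInWindow <= 0)
        {
            throw new ArgumentOutOfRangeException("checksInWindow");
        }
        double ratio = (double)sample.HitTimes / checksInWindow;
        if (ratio < minHitRatio)
        {
            return Verdict.NotSuspect;
        }
        return sample.HeartbeatSeen ? Verdict.Suspect : Verdict.DeadCycle;
    }
}
```

Both Suspect and DeadCycle results would end up in the report; the distinction only tells the reader how much weight to give each entry.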
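I wrote a sample program with a WinForms main thread and a counter thread. The counter thread counts once per second and updates the UI. When the monitor thread detects an abnormal hang or an infinite loop, it writes a monitoring report to Report.log in the current directory.
After Hang is clicked, the main thread sleeps for a number of seconds. Because the counter thread has to update the UI, it is blocked as well.
Once the monitor thread detects that both threads are hung, it reports the following: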
ThreadMonitorEvent
Thread Name:Main thread
Thread Status:Hang
Thread Stack:   at System.Threading.Thread.SleepInternal(Int32 millisecondsTimeout)
   at System.Threading.Thread.Sleep(Int32 millisecondsTimeout)
   at DotNetDebug.Form1.buttonHang_Click(Object sender, EventArgs e)
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
   at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
   at System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
   at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
   at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
   at System.Windows.Forms.Application.Run(Form mainForm)
   at DotNetDebug.Program.Main()
   at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
:: PM
ThreadMonitorEvent
Thread Name:Counter thread
Thread Status:Hang
Thread Stack:   at System.Threading.WaitHandle.WaitOneNative(SafeWaitHandle waitHandle, UInt32 millisecondsTimeout, Boolean hasThreadAffinity, Boolean exitContext)
   at System.Threading.WaitHandle.WaitOne(Int64 timeout, Boolean exitContext)
   at System.Threading.WaitHandle.WaitOne(Int32 millisecondsTimeout, Boolean exitContext)
   at System.Windows.Forms.Control.WaitForWaitHandle(WaitHandle waitHandle)
   at System.Windows.Forms.Control.MarshaledInvoke(Control caller, Delegate method, Object[] args, Boolean synchronous)
   at System.Windows.Forms.Control.Invoke(Delegate method, Object[] args)
   at System.Windows.Forms.Control.Invoke(Delegate method)
   at DotNetDebug.Form1.Counter()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
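Clicking the DeadCycle button puts the counter thread into an infinite loop, while the main thread keeps running normally.
Once the monitor thread detects that the counter thread is looping, it reports the following: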
:: PM
ThreadMonitorEvent
Thread Name:Counter thread
Thread Status:Hang
Thread Stack:   at DotNetDebug.Form1.DoDeadCycle()
   at DotNetDebug.Form1.Counter()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
:: PM
ThreadMonitorEvent
Thread Name:Counter thread
Thread Status:Hang, DeadCycle
Thread Stack:   at DotNetDebug.Form1.DoDeadCycle()
   at DotNetDebug.Form1.Counter()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
The test code is listed below; the complete source code is provided as a separate download.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.Threading;
using Sys.Diagnostics;   // as given in the source; presumably the namespace that contains ThreadMonitor

namespace DotNetDebug
{
    public partial class Form1 : Form
    {
        Thread _CounterThread;
        ThreadMonitor _ThreadMonitor = new ThreadMonitor();

        // Written by the UI thread, read by the counter thread.
        volatile bool _DeadCycle = false;

        delegate void CounterDelegate();

        // Spins forever once _DeadCycle has been set.
        private void DoDeadCycle()
        {
            while (_DeadCycle)
            {
            }
        }

        // Counter thread: counts once per second, updates the UI
        // and sends a heartbeat to the monitor.
        private void Counter()
        {
            int count = 0;
            while (true)
            {
                DoDeadCycle();
                labelCounter.Invoke(new CounterDelegate(delegate() { labelCounter.Text = (count++).ToString(); }));
                _ThreadMonitor.Heartbeat(Thread.CurrentThread);
                Thread.Sleep(1000);
            }
        }

        public Form1()
        {
            InitializeComponent();
        }

        // Appends each monitor event to report.log in the current directory.
        void OnThreadMonitorEvent(object sender, ThreadMonitor.ThreadMonitorEvent args)
        {
            StringBuilder sb = new StringBuilder();
            sb.AppendLine(DateTime.Now.ToLongTimeString());
            sb.AppendLine("ThreadMonitorEvent");
            sb.AppendLine("Thread Name:" + args.Name);
            sb.AppendLine("Thread Status:" + args.Status.ToString());
            sb.AppendLine("Thread Stack:" + args.StackTrace);

            using (System.IO.FileStream fs =
                new System.IO.FileStream("report.log", System.IO.FileMode.Append,
                    System.IO.FileAccess.Write))
            {
                using (System.IO.StreamWriter sw = new System.IO.StreamWriter(fs))
                {
                    sw.WriteLine(sb.ToString());
                }
            }
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            // The event name is spelled as in the source.
            _ThreadMonitor.ThradMonitorEventHandler +=
                new EventHandler<ThreadMonitor.ThreadMonitorEvent>(OnThreadMonitorEvent);

            _CounterThread = new Thread(new ThreadStart(Counter));
            _CounterThread.IsBackground = true;

            // Register the main (UI) thread and the counter thread for
            // both hang and dead-cycle monitoring.
            _ThreadMonitor.Register(new ThreadMonitor.MonitorParameter(
                Thread.CurrentThread, "Main thread",
                ThreadMonitor.MonitorFlag.MonitorHang |
                ThreadMonitor.MonitorFlag.MonitorDeadCycle));
            _ThreadMonitor.Register(new ThreadMonitor.MonitorParameter(
                _CounterThread, "Counter thread",
                ThreadMonitor.MonitorFlag.MonitorHang |
                ThreadMonitor.MonitorFlag.MonitorDeadCycle));

            _CounterThread.Start();

            // The interval value is missing from the source; 1000 ms is an assumption.
            timerHeartbeat.Interval = 1000;
            timerHeartbeat.Enabled = true;

            _ThreadMonitor.Start();
        }

        // Heartbeat of the main thread, driven by a WinForms timer.
        private void timerHeartBeat_Tick(object sender, EventArgs e)
        {
            _ThreadMonitor.Heartbeat(Thread.CurrentThread);
        }

        private void ButtonDeadCycle_Click(object sender, EventArgs e)
        {
            _DeadCycle = true;
        }

        private void buttonHang_Click(object sender, EventArgs e)
        {
            // The duration is missing from the source; 20000 ms is an assumption.
            // It only needs to be longer than the configured HangTimeout.
            Thread.Sleep(20000);
        }
    }
}