GHC 9.6 finally provides a function to list the current threads: listThreads, exported from the GHC.Conc.Sync module. listThreads is a killer debugging tool for thread leaks.
If you have Haskell programs that run for a long time, it is worth providing a way to monitor their threads with the following functions:
import Data.List (sort)
import Data.Maybe (fromMaybe)
import GHC.Conc.Sync (ThreadStatus, listThreads, threadLabel, threadStatus)

printThreads :: IO ()
printThreads = threadSummary >>= mapM_ (putStrLn . showT)
  where
    showT (i, l, s) = i ++ " " ++ l ++ ": " ++ show s

threadSummary :: IO [(String, String, ThreadStatus)]
threadSummary = (sort <$> listThreads) >>= mapM summary
  where
    summary t = do
        let idstr = drop 9 $ show t
        l <- fromMaybe "(no name)" <$> threadLabel t
        s <- threadStatus t
        return (idstr, l, s)
The following is an example of the thread statuses that printThreads displays:
1 (no name): ThreadFinished
2 IOManager on cap 0: ThreadRunning
3 TimerManager: ThreadBlocked BlockedOnForeignCall
4 main: ThreadRunning
5 accepting: ThreadBlocked BlockedOnMVar
6 server:recv: ThreadBlocked BlockedOnForeignCall
7 server:gracefulClose: ThreadRunning
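For a long-running program, one way to use this is to dump the thread list periodically from a dedicated thread. The following is only a minimal sketch, assuming GHC 9.6 or later; forkThreadMonitor and dumpThreads are hypothetical names that do not come from any library, and the interval is given in microseconds as with threadDelay.

```haskell
import Control.Concurrent (ThreadId, forkIO, threadDelay)
import Control.Monad (forever)
import Data.Maybe (fromMaybe)
import GHC.Conc.Sync (labelThread, listThreads, threadLabel, threadStatus)

-- | Hypothetical helper: fork a labeled thread that prints every
-- thread's label and status once per interval (in microseconds).
forkThreadMonitor :: Int -> IO ThreadId
forkThreadMonitor intervalMicros = do
    tid <- forkIO $ forever $ do
        threadDelay intervalMicros
        dumpThreads
    -- Label the monitor itself so that it is recognizable in its own output.
    labelThread tid "thread monitor"
    return tid

-- | Print one line per thread: the shown ThreadId, the label, the status.
dumpThreads :: IO ()
dumpThreads = listThreads >>= mapM_ line
  where
    line t = do
        l <- fromMaybe "(no name)" <$> threadLabel t
        s <- threadStatus t
        putStrLn $ show t ++ " " ++ l ++ ": " ++ show s
```

Calling forkThreadMonitor 60000000 early in main would print the list roughly once a minute. A real server would probably send these lines to its logger rather than to stdout.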
Let's label threads
Threads spawned via forkIO and its relatives have no label by default. Threads without a label are displayed as "(no name)" in the example above. If there are a lot of unlabeled threads, debugging is hard, so I have already asked the GHC developers to label the threads created in the libraries shipped with GHC.
I would also like to ask all library maintainers to label the threads they fork. You can use the following code to label your threads:
import Control.Concurrent (myThreadId)
import GHC.Conc.Sync (labelThread)

labelMe :: String -> IO ()
labelMe lbl = do
    tid <- myThreadId
    labelThread tid lbl
labelThread is a very old function, so you can use it without worrying about GHC versions.
labelThread overrides the current label if one exists. If you don't want to override it, use the following function:
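If a library forks threads in many places, the labeling can be folded into the fork itself. The following is a minimal sketch; forkLabeled is a hypothetical name, not an existing library function. It labels the thread from inside the child, so the label is set before the action starts running.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import GHC.Conc.Sync (labelThread)

-- | Hypothetical wrapper around forkIO: the new thread labels itself
-- first and then runs the given action.
forkLabeled :: String -> IO () -> IO ThreadId
forkLabeled lbl action = forkIO $ do
    tid <- myThreadId
    labelThread tid lbl
    action
```

For example, forkLabeled "server:recv" recvLoop, where recvLoop stands for whatever action the library would otherwise pass to forkIO.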
{-# LANGUAGE CPP #-}

import Control.Concurrent (myThreadId)
-- threadLabel exists only in base >= 4.18, so import it conditionally.
#if MIN_VERSION_base(4,18,0)
import GHC.Conc.Sync (labelThread, threadLabel)
#else
import GHC.Conc.Sync (labelThread)
#endif

labelMe :: String -> IO ()
#if MIN_VERSION_base(4,18,0)
labelMe name = do
    tid <- myThreadId
    mlabel <- threadLabel tid
    case mlabel of
        Nothing -> labelThread tid name
        Just _ -> return ()
#else
labelMe name = do
    tid <- myThreadId
    labelThread tid name
#endif
Unfortunately, threadLabel first appeared in GHC 9.6, so #if is necessary.
ThreadFinished
Threads in the ThreadFinished status should be GCed quickly. If you see long-lived threads in this status, their ThreadIds are being held somewhere. Surprisingly, a ThreadId is not an integer but a reference! The following is an example in which a WAI timeout manager holds ThreadIds, resulting in thread leaks.
10150 WAI timeout manager (Reaper): ThreadBlocked BlockedOnMVar
10190 Warp HTTP/1.1 192.0.2.1:43390: ThreadFinished
10191 Warp HTTP/1.1 192.0.2.1:43392: ThreadFinished
10193 Warp HTTP/1.1 192.0.2.1:43404: ThreadFinished
10202 Warp HTTP/1.1 192.0.2.1:43406: ThreadFinished
10204 Warp HTTP/1.1 192.0.2.1:41256: ThreadFinished
To prevent these thread leaks, hold a Weak ThreadId instead. It can be created via mkWeakThreadId, provided by Control.Concurrent. To convert a Weak ThreadId back to a ThreadId, use deRefWeak, exported from GHC.Weak.
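As an illustration, a manager could keep a handle like the one below instead of a raw ThreadId. This is a minimal sketch and not how the WAI timeout manager is actually implemented; Worker, forkWorker and killWorker are hypothetical names.

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, mkWeakThreadId)
import GHC.Weak (Weak, deRefWeak)

-- | Hypothetical handle that a manager stores instead of a ThreadId.
newtype Worker = Worker (Weak ThreadId)

-- | Fork an action and keep only a weak reference to its thread.
forkWorker :: IO () -> IO Worker
forkWorker action = do
    tid <- forkIO action
    Worker <$> mkWeakThreadId tid

-- | Kill the worker if its thread object is still alive.
-- Returns whether a kill was actually sent.
killWorker :: Worker -> IO Bool
killWorker (Worker wtid) = do
    mtid <- deRefWeak wtid
    case mtid of
        Nothing -> return False -- thread object already garbage collected
        Just tid -> killThread tid >> return True
```

Because the handle does not keep the thread object alive, a finished thread can be collected even while the manager still holds its Worker.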
More labels
The ThreadBlocked constructor of the ThreadStatus type contains BlockReason. It has the following constructors:
BlockedOnMVar
BlockedOnBlackHole
BlockedOnException
BlockedOnSTM
BlockedOnForeignCall
BlockedOnOther
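These constructors can be matched on to narrow a thread dump down to one kind of blocking. The following is a minimal sketch, assuming GHC 9.6 or later for listThreads; threadsBlockedOn is a hypothetical helper name.

```haskell
import GHC.Conc.Sync
    ( BlockReason (..)
    , ThreadId
    , ThreadStatus (..)
    , listThreads
    , threadStatus
    )

-- | Hypothetical helper: the threads currently blocked for the given reason.
threadsBlockedOn :: BlockReason -> IO [ThreadId]
threadsBlockedOn reason = do
    tids <- listThreads
    statuses <- mapM threadStatus tids
    -- Keep only threads whose status is ThreadBlocked with a matching reason.
    return [t | (t, ThreadBlocked r) <- zip tids statuses, r == reason]
```

For example, threadsBlockedOn BlockedOnMVar returns the threads that are waiting on some MVar at the moment of the call. The result is only a snapshot, since statuses can change right after they are read.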
It would be nice if we could label an MVar via labelMVar and if BlockedOnMVar contained that label. STM data types should follow the same approach.