I have a couple of scheduled workflows, mostly maintenance workflow actions, that go into an error state with the status "Failed on Start (Retrying)" or "Error Occurred" because of some problem. As a result, the scheduled workflow never runs again until I manually terminate it, so the maintenance actions are never performed. I do not want to monitor this every day.
How can I detect (and perhaps automatically cancel and restart) these kinds of workflows at the farm level (or at least at the site collection level)?