I manage the new-product verification team for a small manufacturer of industrial automation equipment. We sell most of our products through partners, who sell them as their own products. About six months ago, one of our partners came to us with an intermittent customer issue. The customer had one of our analog output (4-20 mA) modules installed next to a relay module made by our partner.
The analog module controlled the speed of a conveyor through an oven, and the relay module switched a contactor that controlled the heaters in the oven. The customer had installed a number of these systems at various locations with both AC and DC power to the heaters.
After installation, the systems worked very well. After about a month, however, and only on the AC-powered systems, one channel of the analog output would drop to 0 mA, stopping the conveyor and burning a lot of product. A power cycle would get the system working again, but the interval between failures kept shrinking. Our module had recently been redesigned, and the older version did not show the problem at all.
Our partner asked us to try to duplicate the problem with our own equipment. They had managed to duplicate it on one system, but could not on another. One of my test engineers worked with the design engineer for two months, trying various loads and accelerated switching rates, but they could not recreate the failure. It appeared that either our module was not the source of the problem, or no one understood the conditions of the failure very well.
Our partner came back to us with more information on the system. They told us that they were able to demonstrate the failure regularly on two systems: one with a large contactor as a load, and another with a resistive load. They also had a third system with a resistive load that would not show the failure. While reading the new data, I noticed that the resistive-load system that did show the failure had a step-down transformer between the relay and the load (local power was 220 V and the load was designed for 110 V). That meant all of the systems that showed the failure were switching a large inductive load with the relay, either a contactor coil or a transformer primary, while those that worked properly were switching a resistive load directly.
I set up a similar system in which the relay switched AC to the largest coil I could find (a 10-pound reel of 18-gauge magnet wire, 210 mH, 14 Ω), cycling on and off as fast as the system could reliably manage. I figured that if I could grossly exaggerate what I thought was going on, I could duplicate the failure quickly. Within 20 minutes I had demonstrated the reported failure three times and seen two other failure modes as well, one of which was a complete module reset. I called the design engineer, and we started looking for the cause.
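For a rough sense of how exaggerated a load that coil is, a few figures can be worked out from the two values given (210 mH, 14 Ω). This is only a sketch under stated assumptions: the bench mains voltage and frequency are not given in the account, so `v_rms` and `freq_hz` are passed in as parameters, and the function name `coil_figures` is made up for illustration. It treats the coil as an ideal series R-L circuit and ignores everything else in the system.

```python
import math

# Coil values quoted in the text.
L_HENRY = 0.210   # 210 mH
R_OHM = 14.0      # 14 ohms


def coil_figures(v_rms, freq_hz):
    """Basic steady-state figures for an ideal series R-L coil on AC mains.

    v_rms and freq_hz are assumptions; the account does not state them.
    """
    reactance = 2 * math.pi * freq_hz * L_HENRY   # inductive reactance, ohms
    impedance = math.hypot(R_OHM, reactance)      # |Z| of series R-L, ohms
    i_rms = v_rms / impedance                     # steady-state RMS current, A
    i_peak = math.sqrt(2) * i_rms                 # peak of the sine current, A
    energy = 0.5 * L_HENRY * i_peak ** 2          # energy stored at peak current, J
    tau = L_HENRY / R_OHM                         # L/R time constant, s
    return {
        "reactance_ohm": reactance,
        "impedance_ohm": impedance,
        "i_rms_a": i_rms,
        "i_peak_a": i_peak,
        "energy_j": energy,
        "tau_s": tau,
    }


if __name__ == "__main__":
    # Hypothetical supply: 120 V RMS at 60 Hz.
    for name, value in coil_figures(120.0, 60.0).items():
        print(f"{name}: {value:.4g}")
```

The L/R time constant comes out at 15 ms whatever the supply. The stored-energy figure is roughly the worst case in steady state, when the relay contacts happen to open at peak current; that energy has to go somewhere when the circuit is interrupted.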