CPU power problem [SOLVED]

I may have mentioned on a Sunday night or two that my CPU had been struggling for a while. I didn’t think much of it; I managed my processes to reduce overhead and put up with any shortfalls.

By chance, Corsair recently updated their software. The latest update came with a control panel which I decided to play with, and it turned out to have the option to turn on system monitors, so I thought why not. One of the monitors happened to be for the CPU, and when it popped up I thought “that is more information than I was expecting”. It was a few moments before I spotted an obvious problem … all 4 cores had temperature readouts in the range of 90-97 degrees … while the system was mostly idle. This clearly needed to be investigated.

Step 1, I downloaded a copy of the Intel diagnostic tool in the hope it might point out the obvious, which in a way it did: it ran for a few moments until the CPU temp hit 100 degrees, at which point the background temp test stopped all the other tests and stamped FAILED on the screen.

Step 2, check the obvious. I hadn’t cleaned out my case in a while and the vents had obvious dust build-ups, so I cracked open the case to have a look.

The healthy layer of fluff seemed like a good place to start ... but ultimately cleaning it out made no difference :(

Step 3, check the motherboard isn’t doing any crazy overclocking. This was a long shot, but it had been an issue before (under vastly different circumstances).

While in the BIOS I noticed the idle temp was a steady 90 degrees. Everything else checked out, but I turned off the OC options as well as the Intel speed boost options just in case. One reboot and one Intel diagnostic run later, no improvement.

Step 4, check the heat-sink position. By this point annoyance was starting to set in, and given the temperature couldn’t get any higher, I decided to apply pressure to the heat-sink while the system was running and watch the temperature readouts for any change. A bit of poking and prodding made no noticeable difference, but the heat-sink did seem to wiggle a bit too much, prompting a closer look, i.e. actually digging the case out from under the desk for a proper inspection.

It’s fair to say this was the point that broke the case: 3 of the 4 pins holding the heat-sink down were in varying states of release.

With the pins pushed back in, testing showed immediate results: a 53 degree idle, and the diagnostic test was able to finish, pushing the temp to a max of 75 degrees. I expected this to be the end of it, but then I decided it was time to turn the speed boost back on. This upped the idle to 70, not great but manageable, and the diagnostic at first seemed OK with a spike to 90. Unfortunately it crept slowly up until it ultimately hit 100 degrees again :(

Even without the speed boost the system was performing soooo … much better than before (unsurprisingly). Load times were down and frame rates were up, life was good … but 75 degrees is still not ideal, and if the CPU has a speed boost … I want it!

Two days later Amazon delivered, and I installed a £25 monstrously large heat-sink. Life is now good again: with turbo on and under full load it doesn’t go over 60 degrees, which isn’t bad given that, with the CPU bottleneck gone, the GPU is under more load (dumping more heat into the case).

I had a think about what might have caused the pins to come undone. Most likely it happened when I was house sitting around August last year; the vibrations in the car must have shaken them out … which means the CPU has been running at its temp limit (100 degrees) for almost a year. That makes the Intel CPU a modern miracle, given it took that beating and not only lasted as long as it did but still put out enough processing power to keep up (mostly) with my gaming needs.

Time will tell if this little issue will cause the silicon to fail before my next upgrade, but for the time being life is good again.
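For anyone who wants the readings above side by side, here is a rough Python sketch that works out how much headroom each stage left below the 100 degree limit. The names (`TEMP_LIMIT_C`, `READINGS`, `headroom`, `summarise`) are made up for illustration, the as-found idle uses the low end of the 90-97 range, and the final 60 is an upper bound ("doesn’t go over 60"), so treat it as a summary of the post rather than a measurement tool.

```python
# Temperature at which the diagnostic failed in the post.
TEMP_LIMIT_C = 100

# (stage, idle temp, peak load temp) as reported in the post.
# None marks a reading the post does not give.
READINGS = [
    ("loose heat-sink, as found", 90, 100),   # idle was 90-97; low end used
    ("pins reseated, boost off", 53, 75),
    ("pins reseated, boost on", 70, 100),
    ("new heat-sink, boost on", None, 60),    # "doesn't go over 60"
]


def headroom(load_temp_c, limit_c=TEMP_LIMIT_C):
    """Degrees left between the peak load temperature and the limit."""
    return limit_c - load_temp_c


def summarise(readings):
    """Return (stage, idle, load, headroom) for each stage."""
    return [(stage, idle, load, headroom(load)) for stage, idle, load in readings]


if __name__ == "__main__":
    for stage, idle, load, spare in summarise(READINGS):
        idle_text = "n/a" if idle is None else f"{idle} C"
        print(f"{stage:28} idle {idle_text:>5}  load {load} C  headroom {spare} C")
```

Zero headroom is the case where the diagnostic gave up; anything in double digits is where the system felt healthy again.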

Comments

Brilliant write-up, Big R. Thoroughly enjoyed reading that. It's amazing what a difference a well-seated heatsink can make. I had a similar problem years ago (my Pentium 2, 400 IIRC) but the PC wouldn't start. The CPU hit the cut-off immediately. The BIOS had only the basic telemetry back then, but it would power on even without a CPU, so that's how I discovered the issue.

brainwipe

I had something similar happen on a previous machine, although in that case it was a fan that failed on the main heatsink. Amazing how long a modern CPU can limp along by adjusting its clock speed to reduce temperature. An old one would have melted its way to the center of the earth, Cities of Gold style.

Evilmatt