LinHES Forums :: View topic

Author:	pugfantus [ Thu Nov 08, 2007 12:54 pm ]
Post subject:	Hyperthreading
Hi, I was just wondering if there was a reason that HyperThreading wasn't enabled in the kernel on R5F27? SMP is there, but SMT isn't according to /usr/src/linux/.config:

# CONFIG_SCHED_SMT is not set

Code:
pug@mythtv:/usr/src/linux$ dmesg | grep -i cpu
ACPI: SSDT (v001 PmRef Cpu0Ist 0x00003000 INTL 0x20040311) @ 0x3eff6ab0
ACPI: SSDT (v001 PmRef CpuPm 0x00003000 INTL 0x20040311) @ 0x3eff6f40
Initializing CPU#0
CPU: After generic identify, caps: bfebfbff 20000000 00000000 00000000 0000e59d 00000000 00000001
CPU: After vendor identify, caps: bfebfbff 20000000 00000000 00000000 0000e59d 00000000 00000001
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 20000000 00000000 00000180 0000e59d 00000000 00000001
CPU0: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 05
Brought up 1 CPUs
ACPI: Getting cpuindex for acpiid 0x1
ACPI: Getting cpuindex for acpiid 0x2
ACPI: Getting cpuindex for acpiid 0x3
pug@mythtv:/usr/src/linux$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 6
model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping : 5
cpu MHz : 3207.443
cache size : 2048 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 6
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss [color=red]ht[/color] tm pbe lm constant_tsc up pni monitor ds_cpl est tm2 cid cx16 xtpr lahf_lm
bogomips : 6436.28
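The output above shows the "ht" flag alongside "siblings : 1" and "cpu cores : 1", i.e. the CPU advertises hyperthreading but the kernel is only running one logical CPU. A minimal sketch of that check in Python follows. The function names parse_cpuinfo and ht_status are made up for illustration, the sample text is an abbreviated copy of the listing above, and the rule "siblings greater than cpu cores means HT is in use" is the usual reading of /proc/cpuinfo rather than anything specific to R5F27.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into a list of dicts, one per logical CPU."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates one logical CPU's block from the next.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def ht_status(cpu):
    """Return (capable, active) for one parsed CPU entry.

    capable: the "ht" flag is present in the flags line.
    active:  more logical siblings than physical cores in the package.
    """
    capable = "ht" in cpu.get("flags", "").split()
    siblings = int(cpu.get("siblings", "1"))
    cores = int(cpu.get("cpu cores", "1"))
    return capable, siblings > cores


# Abbreviated sample based on the listing in the post (illustration only).
SAMPLE = """\
processor : 0
model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
siblings : 1
cpu cores : 1
flags : fpu vme de pse tsc msr ht tm pbe lm
"""

if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE):
        capable, active = ht_status(cpu)
        print(f"CPU {cpu['processor']}: ht flag={capable}, HT in use={active}")
```

On a live system the same functions could be fed the contents of /proc/cpuinfo instead of SAMPLE.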

Author:	tjc [ Thu Nov 08, 2007 7:32 pm ]
Post subject:
Disabled because it's implicated in DMA issues with certain cards/drivers.

Author:	Liv2Cod [ Thu Nov 08, 2007 7:46 pm ]
Post subject:
My Prescott never ran as well when I loaded R5F27. I wonder if this hyperthreading flag is the difference. My Prescott ran uber-hot and had issues playing HD sources after I upgraded. I finally punted the whole mainboard over it, thinking something was amiss with the hardware (which there still may be).

Author:	larrybpsu [ Thu Nov 08, 2007 9:55 pm ]
Post subject:
Just to add some wisdom to the board:

"Hyperthreading" as Intel processors implemented it is/was a quasi-solution ONLY for the poor performance caused by the Microsoft brand of OS'es. It simulates an additional CPU core to boost the perceived performance, NOT the actual computing horsepower.

There were articles a LONG time ago stating that using multi-processor systems was ONLY to enhance the user feedback, by giving the GUI top priority on the primary core. All the work threads would be dispatched to the secondary cores. That's why some programs have 'threads,' separate processes that they can delegate to other cores, leaving the primary core for the GUI.

A true OS like Linux benefits from SMP (multi-processor - including dual core) platforms. This means that if you're running Linux on a hyperthread capable processor, turn the hyperthreading OFF in the BIOS. The Linux kernel will multitask better without Intel hyperthreading.

As far as I know, AMD CPU's do NOT hyperthread. I could be wrong.

In terms of SMP systems, the added value of additional CPU's peaks out around 4 cores for mainstream applications. Parallel computing (provided the programming is proper) extends that threshold.

A lot of technical details: It has to do with programming applications with 'threads.' SMP allocates processes to the most idle core, but threads can further divide the work if needed. Most applications do neither.

For the average user:
If you run MS Windows - leave Hyperthreading ON.
If you run Linux - turn Hyperthreading OFF.

Author:	brfransen [ Sat Nov 10, 2007 12:08 am ]
Post subject:
larrybpsu, thanks for the input. I have a P4 3GHz HT, and with HT enabled I couldn't watch an HD program and commflag at the same time without the playback having some serious artifacts. To avoid the artifacts I was using the pause_commflag script. After I read this post earlier today I decided to try turning off HT and disabling the pause_commflag script. I was very pleased to find that I could play back an HD program, commflag and record 2 HD programs all at the same time without any artifacts disrupting the playback. You are exactly right, the Linux kernel does seem to multitask better with hyperthreading turned off.

Thanks,
Britney

LinHES Forums http://forum.linhes.org/

Hyperthreading http://forum.linhes.org/viewtopic.php?f=6&t=17212	Page 1 of 2

Author:	cecil [ Sat Nov 10, 2007 12:35 pm ]
Post subject:
Wow! Great tip!

Author:	syphr42 [ Tue Nov 13, 2007 8:19 pm ]
Post subject:
I don't know if this is a universal rule. I get a hiccup every now and then if I am flagging commercials while watching HD, so I disabled hyperthreading in the BIOS. I then booted up and tried watching HD while commercial flagging was running and it was impossible to watch. Much, much worse, with the video freezing every second or two. So I turned hyperthreading back on and now things are as they were, a hiccup every now and then, but still watchable.

Author:	jzigmyth [ Wed Nov 14, 2007 7:52 am ]
Post subject:
syphr42, Have you tried this tip with hyperthreading off? http://mysettopbox.tv/phpBB2/viewtopic.php?t=16990

Author:	soundoff [ Thu Nov 15, 2007 7:28 pm ]
Post subject:
Very interesting. I am running a 2.6GHz HT at the moment and it doesn't perform as well as I hoped it would. I will give it a go with HT turned off.

Author:	larrybpsu [ Fri Nov 16, 2007 8:04 pm ]
Post subject:
syphr42: Looking at the hardware that you have listed, I might guess that there's an I/O bottleneck with data moving on both the IDE and SATA interfaces. Are you recording/watching material from one drive, and commflagging data on the other drive at the same time? Each data access generates background interrupts that the core has to change gears (context transfers) on. These could work better on a HT core, but I'm still leaning towards an I/O bottleneck. If you were to perform some testing using only IDE or SATA data streams, that may provide some better insight. Another issue may be: Is the DVD burner the Master or Slave? I would always set the hard drive as the master, since it would have better I/O performance than the DVD burner. Of course, I'm assuming that you have them on the same IDE interface to the mobo. I could be wrong!

Author:	syphr42 [ Fri Nov 16, 2007 8:41 pm ]
Post subject:
Thanks for both replies. jzigmyth -- I haven't gotten around to trying that thread yet, but I will report back when I do. larrybpsu -- The SATA drive is storage only, the only part of mythtv that has any access to it is mythvideo for playback of some stored video. All of the recordings are on the IDE (so all commflagging happens there as well). Also, you are correct about the IDE chain. The hard drive is master and the NEC burner is the slave.

Author:	marc.aronson [ Thu Nov 22, 2007 12:23 pm ]
Post subject:
larrybpsu wrote:
A true OS like Linux benefits from SMP (multi-processor - including dual core) platforms. This means that if you're running Linux on a hyperthread capable processor, turn the hyperthreading OFF in the BIOS. The Linux kernel will multitask better without Intel hyperthreading.
...
For the average user:
If you run MS Windows - leave Hyperthreading ON.
If you run Linux - turn Hyperthreading OFF.

I agree that Linux takes advantage of SMP architectures. Having said this, your advice is contrary to the experience I had when I used a P4 2.8GHz HT CPU under R5D1 about a year ago. I tried setting HT on and off in the BIOS and performance was significantly better with HT enabled.

I would also point out that HT is not SMP. In an SMP system all cores have the same capabilities. The "extra CPU" that comes with HT processors is only able to execute a subset of the instructions that the CPU supports. It turns out that this subset of instructions is used extensively by software that decodes mpeg2 video for playback, which is why mythtv benefits from HT CPUs.

Marc

Author:	larrybpsu [ Thu Nov 22, 2007 2:58 pm ]
Post subject:
marc,

Please point me to the discussion about HT using a subset of CPU instructions, so I and others can read up on the topic. I got out of the hardware details years ago, and what I remember was that HT was a kludge to improve MS Windows' performance. Anyone that wants more power from a system needs multiple cores at the fastest speed that they can get, afford, etc.

My quick searching shows that the Linux kernel does indeed support HT, so I'll just go with the flow for now. The link is: http://www.ibm.com/developerworks/linux/library/l-htl/

I'll refine my initial statement to say: If you have a single core HT processor, try keeping the HT turned on. If you have problems, try it again with HT turned off.

Author:	marc.aronson [ Thu Nov 22, 2007 4:34 pm ]
Post subject:
I did my reading on hyperthreading several years ago and did not retain copies of the articles I read, but I found one at http://arstechnica.com/articles/paedia/ ... eading.ars that does a good job of describing it. Here is an extract from that article that highlights one of the key differences between HT and SMP:

Quote:
Hyper-threading's greatest strength--shared resources--also turns out to be its greatest weakness, as well. Problems arise when one thread monopolizes a crucial resource, like the floating-point unit, and in doing so starves the other thread and causes it to stall. The problem here is the exact same problem that we discussed with cooperative multi-tasking: one resource hog can ruin things for everyone else. Like a cooperative multitasking OS, the Xeon for the most part depends on each thread to play nicely and to refrain from monopolizing any of its shared resources.

For example, if two floating-point intensive threads are trying to execute a long series of complex, multi-cycle floating-point instructions on the same physical processor, then depending on the activity of the scheduler and the composition of the scheduling queue one of the threads could potentially tie up the floating-point unit while the other thread stalls until one of its instructions can make it out of the scheduling queue. On a non-SMT processor, each thread would get only its fair share of execution time because at the end of its time-slice it would be swapped off the CPU and the other thread would be swapped onto it. Similarly, with a time-slice multithreaded CPU no one thread can tie up an execution unit for multiple consecutive pipeline stages. The SMT processor, on the other hand, would see a significant decline in performance as each thread contends for valuable but limited execution resources. In such cases, an SMP solution would be far superior, and in the worst of such cases a non-SMT solution would even give better performance.

As I reflect on what I have read in this article, I did not characterize the limitations of HT 100% accurately, although it was directionally correct. It would probably be more accurate to state the following:

1. If the two threads need the same processor resources on an HT-capable processor, one thread will have to stall until the other thread releases that resource.
2. Some resources are duplicated on the processor so that both threads can be serviced simultaneously. My recollection is that an HT processor has 2 integer arithmetic units, and hence has the ability to keep two threads running simultaneously even if both need access to the processor's integer arithmetic capability. The same is not true for the floating point processor and many other parts of the processor.

I would sum it up as follows: HT is an attempt to provide a level of parallelism without fully replicating the entire processor logic. It works well for some applications -- especially those that make extensive use of integer arithmetic, if memory serves me correctly. It differs from SMP in that a Symmetric Multi Processor architecture provides symmetric (equivalent) capabilities in both of its "processors". HT does not.

Hope this helps.

Marc
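The two points above (a shared unit forces one thread to stall, a duplicated unit lets both proceed) can be illustrated with a toy model. This is only a sketch under loud assumptions: it takes Marc's recollection of 2 integer units and 1 floating-point unit at face value, treats every operation as a single cycle, and ignores everything else a real Pentium 4 does (scheduling queues, replay, caches). The function name cycles_to_finish is made up for illustration.

```python
def cycles_to_finish(thread_a, thread_b, int_units=2, fp_units=1):
    """Count cycles for two instruction streams sharing execution units.

    Each stream is a string of 'I' (integer op) and 'F' (floating-point op).
    Every cycle, each thread tries to issue its next op and stalls if no unit
    of that kind is still free. Issue priority alternates between the threads
    so neither one is starved in this toy model.
    """
    streams = [thread_a, thread_b]
    pos = [0, 0]
    cycles = 0
    while pos[0] < len(streams[0]) or pos[1] < len(streams[1]):
        free = {"I": int_units, "F": fp_units}  # units available this cycle
        order = (0, 1) if cycles % 2 == 0 else (1, 0)
        for t in order:
            if pos[t] < len(streams[t]):
                op = streams[t][pos[t]]
                if free[op] > 0:
                    free[op] -= 1
                    pos[t] += 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    print("integer vs integer:", cycles_to_finish("IIII", "IIII"))
    print("float vs float:    ", cycles_to_finish("FFFF", "FFFF"))
    print("integer vs float:  ", cycles_to_finish("IIII", "FFFF"))
```

In this model two integer-heavy streams finish in the time of one, while two floating-point streams take twice as long because they queue for the single shared unit. That matches the direction of the argument, not the magnitude of any real measurement.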

All times are UTC - 6 hours

Author:	marc.aronson [ Fri Nov 23, 2007 10:31 am ]
Post subject:
There is one more thing about hyperthreading that I would like to comment on. I understand why one might think of "HT" as a "kludge", but I'd like to suggest another way of thinking about it. When I first started to read about HT I saw it as a creative way to drive incremental performance out of an existing processor architecture, and a natural progression in thinking. Consider:

1. Very early CPUs could only execute 1 instruction at a time. A good example of this was the CPU used in the early Univac 1108.
2. As time went by, processors were designed to enable the execution of multiple instructions in a single cycle. The restrictions were: all instructions had to be from the same thread; none of the instructions in the group could be a "conditional jump" instruction; none of the instructions could need access to the same processor resources. This parallel execution was referred to as an execution "stack" or "pipeline", and violating any of the aforementioned rules was sometimes referred to as "breaking the stack". When hand-coding assembler routines the mantra became "don't break the stack", as getting it right significantly improved performance. This type of architecture was available on the Univac 1110.
3. Somewhere along the way Intel processors leveraged the concept of "predictive branching". In this scheme, instead of a "conditional jump" always breaking the stack, or pipeline, the processor would attempt to predict which way a conditional jump would go. If it was right, the stack would continue to execute. If it predicted incorrectly, part of the results computed would be discarded and instructions re-executed. Please don't ask me which machine / processor introduced this -- that would stretch my memory way too hard.
4. HT seemed like a natural progression of this type of work. In essence they allowed parallelism to now span multiple threads and replicated a very small subset of the processor's resources to enable parallelism under certain circumstances.
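To make the "predictive branching" idea in point 3 concrete, here is a minimal sketch of one textbook scheme, a single 2-bit saturating counter. This is a swapped-in illustration: the post does not say which prediction scheme any particular processor used, and the function name predict_accuracy and its parameters are assumptions made for the example.

```python
def predict_accuracy(outcomes, state=2):
    """Simulate one 2-bit saturating counter over a sequence of branch outcomes.

    outcomes: iterable of booleans, True meaning the branch was taken.
    state:    counter value 0..3; values 2 and 3 predict "taken".
    Returns the fraction of branches predicted correctly.
    """
    outcomes = list(outcomes)
    correct = 0
    for taken in outcomes:
        prediction = state >= 2
        if prediction == taken:
            correct += 1  # pipeline keeps going, nothing discarded
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct / len(outcomes) if outcomes else 0.0


if __name__ == "__main__":
    # A loop branch: taken 9 times, then falls through once, repeated 10 times.
    loop_branch = ([True] * 9 + [False]) * 10
    print("loop-style branch:", predict_accuracy(loop_branch))

    # A branch that alternates every time is much harder for this scheme.
    alternating = [True, False] * 50
    print("alternating branch:", predict_accuracy(alternating))
```

Each wrong guess corresponds to the case described in point 3, where partially computed results are discarded and instructions are re-executed.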
As the Wikipedia article below indicates, in exchange for a 5% increase in die area, certain application mixes could achieve a 15% - 30% increase in performance. In my case, I saw a 40% increase in performance when it came to decoding and displaying HD video.

An interesting question is: What is the future of HT? Design and manufacturing advances have made it possible to produce true SMP processors at very low prices, so my guess is that HT becomes less interesting, perhaps to the point where you don't bother to support it anymore. I know my dual-core Pentium-D processor does not support HT. On the other hand, I read somewhere that Intel will re-introduce HT into its multi-core architecture.

Marc

http://en.wikipedia.org/wiki/Hyper-threading

Quote:
The advantages of Hyper-Threading are listed as: improved support for multi-threaded code, allowing multiple threads to run simultaneously, improved reaction and response time.

According to Intel, the first implementation only used an additional 5% of the die area over the comparable non-hyperthreaded processor, yet yielded performance improvements of 15–30%.

Intel claims up to a 30% speed improvement compared against an otherwise identical, non-simultaneous multithreading Pentium 4. The performance improvement seen is very application-dependent, however, and some programs actually slow down slightly when Hyper Threading Technology is turned on. This is due to the replay system of the Pentium 4 tying up valuable execution resources, thereby starving the other thread. (The Pentium 4 Prescott core gained a replay queue, which reduces execution time needed for the replay system, but this is not enough to completely overcome the performance hit.) However, any performance degradation is unique to the Pentium 4 (due to various architectural nuances), and is not characteristic of simultaneous multithreading in general.