|
| |
| | CPU cache -- Facts, Info, and Encyclopedia article |
 | | The first hardware cache used in a computer system was not actually a data or instruction cache, but rather a TLB. |  | | These predictors are caches in the sense that they store information that is costly to compute. |  | | Trace lines are stored in the trace cache based on the program counter of the first instruction in the trace and a set of branch predictions. |
|
http://www.absoluteastronomy.com/encyclopedia/c/cp/cpu_cache2.htm
(6850 words)
|
|
| |
| | [No title] |
 | | A trace cache is a hardware structure, each line of which stores a snapshot, or trace, of the dynamic instruction stream. |  | | A line of the trace cache is filled as instructions are fetched from the instruction cache. |  | | A word about the simulator software: we integrated our trace cache into a modified version of the SimpleScalar simulator, a popular toolset used frequently in testing and analysis of mainstream microarchitectural modeling research. |
|
http://longwood.cs.ucf.edu/~mali/TraceCache.doc
(1654 words)
|
|
| |
| | The Trace Cache |
 | | Trace cache is a special I-cache that captures dynamic instruction sequences in contrast to the I-cache that contains static instruction sequences. |  | | Like the I-cache, the trace cache is accessed using the starting address of the next block of instructions. |  | | A trace cache line stores a segment of the dynamic instruction trace across multiple, potentially taken branches. |
|
http://www-csd.ijs.si/courses/trends/sld067.htm
(88 words)
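
The lookup described in the snippets above (a line found by the starting address, holding a trace that spans several possibly taken branches) can be sketched as a small software model. This is only a rough illustration under stated assumptions: the class name `TraceCache`, the methods `lookup` and `fill`, the LRU replacement, and the exact-match rule on the whole tuple of branch outcomes are all hypothetical simplifications, not the design from any of the linked pages.

```python
from dataclasses import dataclass, field


@dataclass
class TraceCache:
    """Toy model: each line is keyed by (start PC, branch outcomes)."""
    max_lines: int = 4
    # A dict keeps insertion order, which doubles as LRU order here.
    lines: dict = field(default_factory=dict)

    def lookup(self, start_pc, predictions):
        """Return the stored trace for this start PC and predicted path, or None."""
        key = (start_pc, tuple(predictions))
        trace = self.lines.get(key)
        if trace is not None:
            # Move the hit line to the most-recently-used position.
            self.lines.pop(key)
            self.lines[key] = trace
        return trace

    def fill(self, start_pc, outcomes, instructions):
        """Store a trace built from the instructions that were actually fetched."""
        key = (start_pc, tuple(outcomes))
        self.lines.pop(key, None)
        if len(self.lines) >= self.max_lines:
            # Evict the least recently used line (first key in insertion order).
            self.lines.pop(next(iter(self.lines)))
        self.lines[key] = list(instructions)


tc = TraceCache(max_lines=2)
# Hypothetical trace: starts at 0x100, one taken branch to 0x200, one not-taken branch.
tc.fill(0x100, [True, False], [0x100, 0x104, 0x200, 0x204, 0x208])
hit = tc.lookup(0x100, [True, False])    # same start PC, same predicted path
miss = tc.lookup(0x100, [False, False])  # same start PC, different predicted path
```

Keying on both the start address and the branch outcomes is what lets the same static code appear in more than one trace, which is also why a trace cache can hold duplicate copies of an instruction.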
|
|
| |
| | [No title] |
 | | Since any cache design should be essentially transparent to the programmer (except insofar as varying the cache design may affect the relative timings of memory-accessing instructions), almost all of the actual cache design would be implementation specific given the traditional definition. |  | | A unified cache was simulated since information about whether particular transactions occurred as a result of an instruction or data miss is lost in the trace translation process, but these figures should provide a rough estimate of the performance of various secondary cache configurations. |  | | If the primary to secondary cache communication consists of separate input and output buses, then it would be possible for the secondary cache to send the requested data as soon as it is ready, in parallel with receiving the dirty data from the primary cache. |
|
http://www.ecse.rpi.edu/frisc/theses/MaierThesis/Chapter2.html
(8904 words)
|
|
| |
| | A Trace Cache Microarchitecture and Evaluation - Rotenberg, Bennett, Smith (ResearchIndex) |
 | | Trace caches overcome this limitation by caching traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear... |  | | Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. |  | | 15 Alternative fetch and issue policies for the trace cache fet.. |
|
http://citeseer.lcs.mit.edu/rotenberg99trace.html
(668 words)
|
|
| |
| | Computer Science Department Calendar: Seminar - Delivering Instruction Bandwidth using a Trace Cache |
 | | In the trace cache, logically contiguous instructions are placed in physically contiguous storage. |  | | The trace cache is a mechanism that can deliver many instructions per cycle, more than a single basic block, by caching segments of the dynamic instruction stream. |  | | In this talk, I will describe our work on developing the trace cache into an effective means of delivering instructions to a 16-wide superscalar processor. |
|
http://klamath.stanford.edu/~molinero/calendus/CS/read/event_3792_CS_read.html
(367 words)
|
|
| |
| | Dinero IV Trace-Driven Uniprocessor Cache Simulator |
 | | Dinero IV is a cache simulator for memory reference traces. |  | | The basic idea is to simulate a memory hierarchy consisting of various caches connected as one or more trees, with reference sources (the processors) at the leaves and a memory at each root. |  | | After initialization, each reference is fed to the appropriate top-level cache by a single simple function call. |
|
http://www.cs.wisc.edu/~markhill/DineroIV
(318 words)
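
The idea of feeding a memory reference trace to a cache one reference at a time can be illustrated with a very small sketch. It does not use or imitate the Dinero IV interface: the function name `simulate_direct_mapped`, its parameters, and the choice of a single direct-mapped cache (rather than a tree of caches) are assumptions made purely for illustration.

```python
def simulate_direct_mapped(addresses, num_lines=4, block_size=16):
    """Count hits and misses for one direct-mapped cache fed one address at a time."""
    tags = [None] * num_lines  # one stored tag per cache line, None means empty
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size   # which memory block the address falls in
        index = block % num_lines    # which cache line that block maps to
        tag = block // num_lines     # identifies the block within that line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag        # replace whatever the line held before
    return hits, misses


# Hypothetical reference trace (byte addresses), not taken from any real workload.
trace = [0x00, 0x04, 0x40, 0x00, 0x10]
result = simulate_direct_mapped(trace)
```

With the default geometry, addresses 0x00 and 0x40 map to the same line, so the second access to 0x00 misses again after 0x40 has replaced it.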
|
|
| |
| | Chip Architect: Looking at Intel's Prescott Die |
 | | The Trace Cache values were already published with the introduction of the Willamette and are still the same in the latest Prescott PNI document. |  | | A logical consequence of a Trace Cache which can provide 4 micro operations per cycle is the processor's ability to retire micro operations at the same rate at the very end of the processing pipeline. |  | | The Trace Cache keeps the same 4096 entries but now with each containing 4 instructions. |
|
http://www.chip-architect.com/news/2003_03_06_Looking_at_Intels_Prescott.html
(1370 words)
|
|
| |
| | Traces |
 | | The Oracle buffer cache replacement algorithm is similar to LRU [5]. |  | | All traces contained only misses from one or multiple client buffer caches that use LRU or its variations as their replacement algorithms. |  | | Auspex Server Trace was an NFS file system activity trace on an Auspex file server in 1993 at UC Berkeley [16]. |
|
http://www.usenix.org/event/usenix01/full_papers/zhou/zhou_html/node4.html
(523 words)
|
|
| |
| | Trace Cache |
 | | Instruction cache that holds recent sequences of instructions |  | | May store same instructions in multiple traces -- less efficient |
|
http://www.cs.umass.edu/~weems/CmpSci635/Lecture8/L6.50C.html
(28 words)
|
|
| |
| | Software Trace Cache |
 | | The Software Trace Cache (STC) is a code layout algorithm with a broader target than previous layout optimizations. |  | | We evaluate and analyze in detail the impact of the STC, and code layout optimizations in general, on the three main aspects of fetch performance: the instruction cache hit rate, the effective fetch width, and the branch prediction accuracy. |  | | We target not only an improvement in the instruction cache hit rate, but also an increase in the effective fetch width of the fetch engine. |
|
http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/trans/tc/&toc=comp/trans/tc/2005/01/t1toc.xml&DOI=10.1109/TC.2005.13
(246 words)
|
|
| |
| | sandpile.org -- discussion forum |
 | | the trace cache is flushed, the above solution can't be the what |  | | between trace cache lines and the TLB entry (or entries) used to |
|
http://www.sandpile.org/post/msgs/20001975.htm
(820 words)
|
|
| |
| | LAUTERBACH - TRACE32 Data Trace in Cache on PPC4XX (PPC403, PPC405, PPC440) |
 | | The trace data can also be used for complex statistics, task aware performance analysis or graphical displays. |  | | The program flow trace of the PPC4xx processors usually allows only tracing of program flow information. |
|
http://www.lauterbach.com/news_191.html
(111 words)
|
|
| |
| | What is trace cache? - A Word Definition From the Webopedia Computer Dictionary |
 | | (trās kash) (n.) An instruction cache in a microprocessor that stores dynamic instruction sequences after they have been fetched and executed in order to follow the instructions at subsequent times without needing to return to the regular cache or the memory for the same instruction sequence. |
|
http://www.pcwebopaedia.com/TERM/T/trace_cache.html
(173 words)
|
|
| |
| | Trace Cache Performance Parameters |
 | | The fetch performance of the processor can be improved with the aid of an instruction memory structure known as a trace cache. |  | | Tulip is also used to understand trace cache performance tradeoffs. |  | | The instruction fetch mechanism is a performance bottleneck of a superscalar processor. |
|
http://csdl.computer.org/comp/proceedings/iccd/2002/1700/00/17000348abs.htm
(183 words)
|
|
| |
| | (GCJ4BY) The Stagecoach Trace Cache by maggie potts |
 | | An easy cache to do with kids, with a bonus bushwhacking micro which provides a better view of the Trace. |
|
http://www.geocaching.com/seek/cache_details.asp?ID=129332
(778 words)
|
|
| |
| | CPU Technology Overview from ExtremeTech |
 | | One big advantage of the trace cache is that it lets the CPU contiguously store a sequence of micro-ops prior to a branch, the branch instruction itself, and the branch target instructions in the same trace line. |  | | The CPU reads instructions and data from the unified on-board L2 cache and feeds them to the separate and smaller L1 instruction and data caches. |  | | As noted earlier, the P4 decodes x86 instructions into micro-ops, which it writes into the trace cache as sequences of micro-ops, called traces, in program order. |
|
http://www.extremetech.com/article2/0,1697,1829618,00.asp
(1017 words)
|
|
| |
| | Tom's Hardware Guide Processors: Intel's New Pentium 4 Processor - Hardware Prefetch |
 | | Once you have understood it, the idea of the trace cache is actually rather simple, but it takes quite a bit more silicon resources and design skill to replace the good old L1 instruction cache with something like Pentium 4's trace cache. |  | | Basically, the 'Execution Trace Cache' is nothing but an L1 instruction cache that lies BEHIND the decoders. |  | | With Pentium III or Athlon, which both have an L1 instruction cache, code is fetched by this cache and stored until it's about time to enter the execution path. |
|
http://www.tomshardware.com/cpu/20001120/p4-06.html
(1230 words)
|
|
| |
| | Tom's Hardware Guide Processors: Intel's New Pentium 4 Processor - The Trace Cache Branch Prediction Unit |
 | | Read register file (to ensure that the correct ones of the 128 all-purpose register files are used as the register(s) for the actual instruction) |  | | Its branch target buffer is 8 times as large as the one found in Pentium III and its new algorithm is supposed to be way better than AMD's latest G-share algorithm used in Thunderbird and Spitfire. |
|
http://www20.tomshardware.com/cpu/20001120/p4-09.html
(1013 words)
|
|
| |
| | Palati_Computing_Offers3 |
 | | Featuring a 2.53GHz Intel® Pentium® 4 processor, 17" F70 LCD flat panel monitor, 512MB RAM and the Microsoft® Windows® XP Home Edition operating system, the HP refurbished 724c-b is a powerful desktop computer perfect for your business and personal needs. |  | | 8KB L1 cache plus 12k micro-op trace cache |
|
http://www.palati.com/palati_computing_offers3.htm
(293 words)
|
|
| |
| | Ace's Hardware - General Message Board |
 | | I still tend to think that some kind of trace-cache design which stores macro-ops or x86 instructions repackaged in easy-to-decode slots would be a good compromise for x86 front stages. |  | | It would cut down a few cycles from decoding, and avoid bubbles on predicted paths, while making better use of I$ bandwidth and capacity (relative to P4-style uop trace cache). |
|
http://www.aceshardware.com/forums/read_post.jsp?id=115143747&forumid=1
(815 words)
|
|
|