Thu, 03 Nov 2016 08:00:00 EDT A data processing method for a data processing system, comprising: initializing a value of a counter associated with a first entry to indicate a number of destinations of other entries on which the first entry depends; changing the value of the counter in a first direction in response to selecting a first one of the other entries; and changing the value of the counter in a second direction opposite the first direction in response to cancelling a second one of the other entries.
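The counter scheme in the abstract above can be illustrated with a small sketch. It assumes the "first direction" is a decrement and that an entry is ready when its counter reaches zero; the class and method names (`SchedulerEntry`, `on_producer_selected`, `on_producer_cancelled`) are hypothetical and do not come from the abstract.

```python
class SchedulerEntry:
    """Hypothetical scheduler entry that tracks outstanding producers with a counter."""

    def __init__(self, name, producers):
        self.name = name
        # Initialize the counter to the number of producer destinations
        # on which this entry depends.
        self.pending = len(producers)

    def on_producer_selected(self):
        # First direction (assumed here to be "down"): one dependency
        # is on its way to being satisfied.
        self.pending -= 1

    def on_producer_cancelled(self):
        # Opposite direction: a producer was cancelled, so its result
        # is outstanding again.
        self.pending += 1

    def is_ready(self):
        # Assumption: zero outstanding producers means the entry may issue.
        return self.pending == 0


entry = SchedulerEntry("add r3, r1, r2", producers=["r1", "r2"])
entry.on_producer_selected()    # producer of r1 selected
entry.on_producer_selected()    # producer of r2 selected
ready_before_cancel = entry.is_ready()
entry.on_producer_cancelled()   # producer of r2 cancelled
ready_after_cancel = entry.is_ready()
```

Using a single counter instead of a per-producer bit vector keeps the per-entry state small, at the cost of not recording which producer is still outstanding.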
Thu, 03 Nov 2016 08:00:00 EDT A fault-tolerant multi-threaded processor uses the temporal and/or spatial separation of instructions running in two or more different threads. An instruction is fetched, decoded and executed by each of two or more threads to generate a result for each thread. These results are then compared using comparison hardware logic, and if there is a mismatch between the results obtained, an error or event is raised. The comparison is performed on an instruction-by-instruction basis so that errors are identified (and hence can be resolved) quickly.
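A software analogy of the per-instruction comparison described above is sketched below. It runs the redundant copies one after another rather than in real hardware threads, and the `fault_at` parameter is a hypothetical hook used only to inject a single-bit error for demonstration.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


class ResultMismatch(Exception):
    """Raised when redundant copies of an instruction disagree."""


def run_redundant(program, copies=2, fault_at=None):
    """Execute each (op, a, b) instruction `copies` times and compare the results.

    fault_at is a hypothetical (instruction_index, copy_index) pair that
    flips one bit of that copy's result to simulate a transient fault.
    """
    results = []
    for index, (op, a, b) in enumerate(program):
        outcomes = []
        for copy in range(copies):
            value = OPS[op](a, b)
            if fault_at == (index, copy):
                value ^= 1  # simulated single-bit error
            outcomes.append(value)
        # Compare instruction by instruction so a fault is caught immediately.
        if len(set(outcomes)) != 1:
            raise ResultMismatch(f"instruction {index} ({op}): {outcomes}")
        results.append(outcomes[0])
    return results


program = [("add", 1, 2), ("mul", 3, 4)]
clean_results = run_redundant(program)
try:
    run_redundant(program, fault_at=(1, 0))
    fault_detected = False
except ResultMismatch:
    fault_detected = True
```

Checking after every instruction, rather than at the end of the program, is what localizes the fault to a single instruction.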
Thu, 03 Nov 2016 08:00:00 EDT A method for managing mappings of storage on a code cache for a processor includes storing a plurality of guest address to native address mappings as entries in a conversion look aside buffer, wherein the entries indicate guest addresses that have corresponding converted native addresses stored within a code cache memory, and receiving a subsequent request for a guest address at the conversion look aside buffer. The conversion look aside buffer is indexed to determine whether there exists an entry that corresponds to the index, wherein the index comprises a tag and an offset that is used to identify the entry that corresponds to the index. Upon a hit on the tag, the corresponding entry is accessed to retrieve a pointer to the corresponding block of converted native instructions in the code cache memory. That block of converted native instructions is then fetched from the code cache memory for execution.
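The lookup path described above can be sketched as a direct-mapped table. This is a simplified model under stated assumptions: the low bits of the guest address act as the offset that selects a slot, the remaining high bits form the tag, and the code cache is a plain dictionary. The class name, the 4-bit slot width, and the `fetch` helper are all hypothetical.

```python
class ConversionLookasideBuffer:
    """Toy direct-mapped buffer mapping guest addresses to code cache pointers."""

    def __init__(self, slot_bits=4):
        self.slot_bits = slot_bits
        # Each slot holds None or a (tag, code_cache_pointer) pair.
        self.slots = [None] * (1 << slot_bits)

    def _split(self, guest_addr):
        # Assumed layout: low bits select the slot, high bits are the tag.
        offset = guest_addr & ((1 << self.slot_bits) - 1)
        tag = guest_addr >> self.slot_bits
        return tag, offset

    def insert(self, guest_addr, pointer):
        tag, offset = self._split(guest_addr)
        self.slots[offset] = (tag, pointer)

    def lookup(self, guest_addr):
        tag, offset = self._split(guest_addr)
        entry = self.slots[offset]
        if entry is not None and entry[0] == tag:
            return entry[1]  # hit: pointer into the code cache
        return None          # miss: no converted block is recorded


# Code cache modeled as pointer -> block of converted native instructions.
code_cache = {0: ["native_op_a", "native_op_b"], 1: ["native_op_c"]}
clb = ConversionLookasideBuffer()
clb.insert(0x1234, 0)
clb.insert(0x2238, 1)


def fetch(guest_addr):
    """Return the converted native block for guest_addr, or None on a miss."""
    pointer = clb.lookup(guest_addr)
    return None if pointer is None else code_cache[pointer]
```

In this model `fetch(0x5234)` selects the same slot as `0x1234` but carries a different tag, so it is treated as a miss.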
Thu, 03 Nov 2016 08:00:00 EDT An apparatus and method for performing parallel decoding of prefix codes such as Huffman codes. For example, one embodiment of an apparatus comprises: a first decompression module to perform a non-speculative decompression of a first portion of a prefix code payload comprising a first plurality of symbols; and a second decompression module to perform speculative decompression of a second portion of the prefix code payload comprising a second plurality of symbols concurrently with the non-speculative decompression performed by the first decompression module.
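The abstract above does not say how a speculative result is validated, so the following is only one plausible sketch. It assumes the speculative decoder guesses that a chosen split position is a symbol boundary, and that its output is kept only if the non-speculative decoder actually ends on that position. The code table, the function names, and the fallback re-decode are all assumptions, and the two decoders run sequentially here rather than concurrently.

```python
# Hypothetical prefix code: no codeword is a prefix of another.
CODE = {"0": "a", "10": "b", "110": "c", "111": "d"}
ENCODE = {symbol: bits for bits, symbol in CODE.items()}


def encode(text):
    return "".join(ENCODE[symbol] for symbol in text)


def decode(bits, start, stop):
    """Decode from bit `start`, finishing the codeword that crosses `stop`.

    Returns (symbols, end_position), where end_position is the first
    symbol boundary at or after `stop`.
    """
    symbols, pos, current = [], start, ""
    while pos < len(bits) and (pos < stop or current):
        current += bits[pos]
        pos += 1
        if current in CODE:
            symbols.append(CODE[current])
            current = ""
    return symbols, pos


def parallel_decode(bits, split):
    # Non-speculative decode of the first portion.
    first, first_end = decode(bits, 0, split)
    # Speculative decode of the second portion, guessing that `split`
    # is a symbol boundary (hardware would run this concurrently).
    guess, _ = decode(bits, split, len(bits))
    if first_end == split:
        return first + guess            # speculation confirmed
    # Mis-speculation: discard the guess and decode from the true boundary.
    rest, _ = decode(bits, first_end, len(bits))
    return first + rest


payload = encode("abcdab")
good_split = "".join(parallel_decode(payload, 6))  # 6 is a real boundary
bad_split = "".join(parallel_decode(payload, 5))   # 5 falls inside a codeword
```

Both calls recover the original text; the second simply pays for a re-decode because its guess was wrong.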
Thu, 03 Nov 2016 08:00:00 EDT An integrated circuit device has a first central processing unit including a digital signal processing (DSP) engine, and a plurality of contexts, each context having a CPU context with a plurality of registers and a DSP context, wherein the DSP context has control bits and a plurality of DSP registers. After a reset of the integrated circuit device, the control bits of all DSP contexts are linked together such that data written to the control bits of a DSP context is written to the respective control bits of all other DSP contexts. Only after a context switch to another context and a modification of at least one of the control bits of the other DSP context are the control bits of that context severed from the link to form independent control bits of the DSP context.
Thu, 03 Nov 2016 08:00:00 EDT In one embodiment of the present invention, a programmable vision accelerator enables applications to collapse multi-dimensional loops into one-dimensional loops. In general, configurable components included in the programmable vision accelerator work together to facilitate such loop collapsing. The configurable components include multi-dimensional address generators, vector units, and load/store units. Each multi-dimensional address generator generates a different address pattern. Each address pattern represents an overall addressing sequence associated with an object accessed within the collapsed loop. The vector units and the load/store units provide execution functionality typically associated with multi-dimensional loops based on the address pattern. Advantageously, collapsing multi-dimensional loops in a flexible manner dramatically reduces the overhead associated with implementing a wide range of computer vision algorithms. Consequently, the overall performance of many computer vision applications may be optimized.
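Loop collapsing as described above can be illustrated in software with a generator that walks a single flat loop and reconstructs the multi-dimensional address for each iteration. This is only a sketch of the idea: the `(count, stride)` description of each dimension and the function name `address_pattern` are assumptions, not the accelerator's actual configuration format.

```python
def address_pattern(base, dims):
    """Yield addresses for a nested loop using one flat loop.

    dims is a list of (count, stride) pairs, outermost dimension first.
    """
    total = 1
    for count, _ in dims:
        total *= count
    for flat in range(total):  # the single collapsed loop
        remainder = flat
        address = base
        # Peel off the innermost index first, then move outward.
        for count, stride in reversed(dims):
            remainder, index = divmod(remainder, count)
            address += index * stride
        yield address


# A 2 x 3 tile inside a buffer whose rows are 8 elements apart.
collapsed = list(address_pattern(base=100, dims=[(2, 8), (3, 1)]))

# The equivalent explicit two-level loop, for comparison.
nested = [100 + row * 8 + col for row in range(2) for col in range(3)]
```

The consumer of `address_pattern` sees only one loop, which is the property the abstract attributes to the hardware address generators.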
Thu, 03 Nov 2016 08:00:00 EDT Provided are a method and an apparatus for controlling a register of a reconfigurable processor. The power of a register may be efficiently used by obtaining a command for each of a plurality of read ports of the register from a memory, obtaining activation information for each of the plurality of read ports from the obtained command, and determining an address value of each of the plurality of read ports on the basis of the obtained activation information.
Thu, 03 Nov 2016 08:00:00 EDT A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
Thu, 03 Nov 2016 08:00:00 EDT An arithmetic processing device includes a plurality of arithmetic processing units each including an internal circuit that, in an instruction processing state in which an instruction is processed, processes the instruction and that, in an instruction processing stopped state in which instruction processing is stopped, transitions to a state of power save operation, and a power control circuit that disables the power save operation; and a monitoring circuit that monitors the instruction processing stopped state of the plurality of arithmetic processing units and counts the number of the arithmetic processing units in the instruction processing stopped state. The power control circuit of each of the plurality of arithmetic processing units disables the power save operation of the arithmetic processing unit in the instruction processing stopped state, in a case where the number of the arithmetic processing units in the instruction processing stopped state exceeds a threshold.
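The threshold rule in the abstract above reduces to a short decision function. The sketch below models the monitoring circuit as a count over a list of booleans; the function name and the list-based representation are assumptions made for illustration.

```python
def power_save_states(stopped, threshold):
    """Return, per unit, whether it is in power save operation.

    stopped[i] is True when unit i is in the instruction processing
    stopped state.
    """
    # Monitoring circuit: count the units that have stopped processing.
    stopped_count = sum(stopped)
    # Power control: power save is disabled once the count exceeds the threshold.
    disable_power_save = stopped_count > threshold
    return [is_stopped and not disable_power_save for is_stopped in stopped]


few_stopped = power_save_states([True, False, False, False], threshold=2)
many_stopped = power_save_states([True, True, True, False], threshold=2)
```

With one stopped unit the count stays at or below the threshold and that unit enters power save; with three stopped units the count exceeds the threshold and none of them does.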