Institut für Technische Informatik und Mikroelektronik

95 Items

Recent Submissions
A reconfigurable architecture for real-time image compression on-board satellites

Manthey, Kristian (2017)

Data products of optical remote sensing systems are increasingly used in many areas of our everyday life. The spatial as well as the spectral resolution of satellite image data increases steadily with new missions resulting in a higher precision of known procedures and new application scenarios. While the memory capacity requirements can still be fulfilled, the transmission capacity becomes inc...

libWater: heterogeneous distributed computing made easy

Grasso, Ivan ; Pellegrini, Simone ; Cosenza, Biagio ; Fahringer, Thomas (2013)

Clusters of heterogeneous nodes composed of multi-core CPUs and GPUs are increasingly being used for High Performance Computing (HPC) due to the benefits in peak performance and energy efficiency. In order to fully harvest the computational capabilities of such architectures, application developers often employ a combination of different parallel programming paradigms (e.g. OpenCL, CUDA, MPI an...

Low-power high-efficiency video decoding using general purpose processors

Chi, Chi Ching ; Álvarez-Mesa, Mauricio ; Juurlink, Ben (2015)

In this article, we investigate how code optimization techniques and low-power states of general-purpose processors improve the power efficiency of HEVC decoding. The power and performance efficiency of the use of SIMD instructions, multicore architectures, and low-power active and idle states are analyzed in detail for offline video decoding. In addition, the power efficiency of techniques suc...

A generic implementation of a quantified predictor on FPGAs

Thomas, Gervin ; Elhossini, Ahmed ; Juurlink, Ben (2014)

Predictors are used in many fields of computer architectures to enhance performance. With good estimations of future system behaviour, policies can be developed to improve system performance or reduce power consumption. These policies become more effective if the predictors are implemented in hardware and can provide quantified forecasts and not only binary ones. In this paper, we present and e...

An automatic input-sensitive approach for heterogeneous task partitioning

Kofler, Klaus ; Grasso, Ivan ; Cosenza, Biagio ; Fahringer, Thomas (2013)

Unleashing the full potential of heterogeneous systems, consisting of multi-core CPUs and GPUs, is a challenging task due to the difference in processing capabilities, memory availability, and communication latencies of different computational resources. In this paper we propose a novel approach that automatically optimizes task partitioning for different (input) problem sizes and different h...

A QHD-capable parallel H.264 decoder

Chi, Chi Ching ; Juurlink, Ben (2011)

Video coding follows the trend of demanding higher performance every new generation, and therefore could utilize many-cores. A complete parallelization of H.264, which is the most advanced video coding standard, was found to be difficult due to the complexity of the standard. In this paper a parallel implementation of a complete H.264 decoder is presented. Our parallelization strategy exploits ...

Composable local memory organisation for streaming applications on embedded MPSoCs

Ambrose, Jude ; Molnos, Anca ; Nelson, Andrew ; Cotofana, Sorin ; Goossens, Kees ; Juurlink, Ben (2011)

Multi-Processor Systems on a Chip (MPSoCs) are suitable platforms for the implementation of complex embedded applications. An MPSoC is composable if the functional and temporal behaviour of each application is independent of the absence or presence of other applications. Composability is required for application design and analysis in isolation, and integration with linear effort. In this paper...

Poster: implications of merging phases on scalability of multi-core architectures

Manivannan, Madhavan ; Juurlink, Ben ; Stenstrom, Per (2011)

Amdahl's Law estimates parallel applications with negligible serial sections to potentially scale to many cores. However, due to merging phases in data mining applications, the serial sections do not remain constant. We extend Amdahl's model to accommodate this and establish that Amdahl's Law can overestimate the scalability offered by symmetric and asymmetric architectures for such application...

Automatic problem size sensitive task partitioning on heterogeneous parallel systems

Grasso, Ivan ; Kofler, Klaus ; Cosenza, Biagio ; Fahringer, Thomas (2013)

In this paper we propose a novel approach which automatizes task partitioning in heterogeneous systems. Our framework is based on the Insieme Compiler and Runtime infrastructure. The compiler translates a single-device OpenCL program into a multi-device OpenCL program. The runtime system then performs dynamic task partitioning based on an offline-generated prediction model. In order to derive t...

Programming parallel embedded and consumer applications in OpenMP superscalar

Andersch, Michael ; Chi, Chi Ching ; Juurlink, Ben (2012)

In this paper, we evaluate the performance and usability of the parallel programming model OpenMP Superscalar (OmpSs), apply it to 10 different benchmarks and compare its performance with corresponding POSIX threads implementations.

Amdahl's law for predicting the future of multicores considered harmful

Juurlink, Ben ; Meenderinck, Cor (2012)

Several recent works predict the future of multicore systems or identify scalability bottlenecks based on Amdahl's law. Amdahl's law implicitly assumes, however, that the problem size stays constant, but in most cases more cores are used to solve larger and more complex problems. There is a related law known as Gustafson's law which assumes that runtime, not the problem size, is constant. In ot...
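The contrast drawn in this abstract can be made concrete with the two textbook formulas. The sketch below is illustrative only and is not taken from the paper. The function names and the 95% parallel fraction are assumptions chosen for the example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: the problem size is fixed, so the serial part
    of the runtime stays constant as cores are added."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def gustafson_speedup(parallel_fraction: float, cores: int) -> float:
    """Gustafson's law: the runtime is fixed, so the parallel part
    of the work grows in proportion to the core count."""
    serial = 1.0 - parallel_fraction
    return serial + parallel_fraction * cores


if __name__ == "__main__":
    # Hypothetical workload in which 95% of the runtime is parallelizable.
    for n in (1, 16, 256):
        print(n, amdahl_speedup(0.95, n), gustafson_speedup(0.95, n))
```

Under the fixed-size assumption the speedup can never exceed 1 / (1 - parallel_fraction), whereas under the fixed-runtime assumption it keeps growing roughly linearly with the number of cores.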

Evaluation of parallel H.264 decoding strategies for the Cell Broadband Engine

Chi, Chi Ching ; Juurlink, Ben ; Meenderinck, Cor (2010)

How to develop efficient and scalable parallel applications is the key challenge for emerging many-core architectures. We investigate this question by implementing and comparing two parallel H.264 decoders on the Cell architecture. It is expected that future many-cores will use a Cell-like local store memory hierarchy, rather than a non-scalable shared memory. The two implemented parallel algor...

Spatio-temporal SIMT and scalarization for improving GPU efficiency

Lucas, Jan ; Andersch, Michael ; Álvarez-Mesa, Mauricio ; Juurlink, Ben (2015)

Temporal SIMT (TSIMT) has been suggested as an alternative to conventional (spatial) SIMT for improving GPU performance on branch-intensive code. Although TSIMT has been briefly mentioned before, it was not evaluated. We present a complete design and evaluation of TSIMT GPUs, along with the inclusion of scalarization and a combination of temporal and spatial SIMT, named Spatiotemporal SIMT (STS...

Exploitation of environmental constraints in human and robotic grasping

Eppner, Clemens ; Deimel, Raphael ; Álvarez-Ruiz, José ; Maertens, Marianne ; Brock, Oliver (2015)

We investigate the premise that robust grasping performance is enabled by exploiting constraints present in the environment. These constraints, leveraged through motion in contact, counteract uncertainty in state variables relevant to grasp success. Given this premise, grasping becomes a process of successive exploitation of environmental constraints, until a successful grasp has been establish...

Special Issue on the Sixteenth International Symposium on Robotics Research, 2013

Barfoot, Tim ; Brock, Oliver (2015)


A novel type of compliant and underactuated robotic hand for dexterous grasping

Deimel, Raphael ; Brock, Oliver (2016)

The usefulness and versatility of a robotic end-effector depends on the diversity of grasps it can accomplish and also on the complexity of the control methods required to achieve them. We believe that soft hands are able to provide diverse and robust grasping with low control complexity. They possess many mechanical degrees of freedom and are able to implement complex deformations. At the same...

Soft robotic hands for compliant grasping

Deimel, Raphael (2017)

The thesis considers the problem of grasping for autonomous robots, with a focus on the design and construction of robotic hands and grippers. The approach we take is to fundamentally reconsider the basic motivation and goals for grasping that steer hand design. We consider grasping as the result of reliable and robust patterns of interaction between hand, object and environment which are mecha...

On decomposability in robot reinforcement learning

Höfer, Sebastian (2017)

Reinforcement learning is a computational framework that enables machines to learn from trial-and-error interaction with the environment. In recent years, reinforcement learning has been successfully applied to a wide variety of problem domains, including robotics. However, the success of the reinforcement learning applications in robotics relies on a variety of assumptions, such as the availab...

Medical image analysis of gastric cancer in digital histopathology: methods, applications and challenges

Sharma, Harshita (2017)

Medical image analysis in digital histopathology is a currently expanding and exciting field of scientific research. In this work, histopathological image analysis is extensively studied and a systematic framework for computer-based analysis in H&E stained whole slide images of gastric carcinoma is proposed. The exhaustive experimental study comprises three fundamental stages, namely, prepar...

Ultrasonic flow metering with highly accurate jitter and offset compensation

Hamouda, Assia (2017)

This thesis proposes a new method for measuring water flow with a transit time ultrasonic flow meter device. The developed method allows the ultrasonic flow meter to reach a better performance than currently available commercial flow meters by accurately detecting very low flow rates of less than two liters per hour (2 l/h) in a typical household water meter. In principle, the flow velocity of ...