Browse Arquitectura de Computadores - (AC) by title
Showing items 167-186 of 198
-
A scheduling theory framework for GPU tasks efficient execution
(2018-07-16)Concurrent execution of tasks in GPUs can reduce the computation time of a workload by overlapping data transfer and execution commands. However, it is difficult to implement an efficient runtime scheduler ... -
Simplified Floating-Point Units for High Dynamic Range Image and Video Systems
(2015-06-29)The upcoming arrival of high dynamic range image and video applications to consumer electronics will force the utilization of floating-point numbers on them. This paper shows that introducing a slight modification on ... -
Un sistema de evaluación de innumerables variantes para tiempos de confinamiento. [An assessment system with countless variants for times of lockdown.]
(Universidad de Las Palmas de Gran Canaria, 2020)In just a few months, due to the global COVID-19 pandemic, the educational context has changed dramatically, forcing teachers around the world to solve academic problems that immediately ... -
Siting Multiple Observers for Maximum Coverage: An Accurate Approach
(2015-06-05)The selection of the minimal number of observers that ensures the maximum visual coverage over an area represented by a digital elevation model (DEM) is of great interest in many fields, e.g., telecommunications, environment ... -
SkewEngine: enhancing performance of intensive calculations on regular meshes
(2023)In various applications such as hyperspectral data manipulation, MRI data exploration, or visual basin identification in digital elevation models, performing arithmetic operations on each point of a data mesh that involve ... -
SkewEngine: enhancing performance of intensive calculations on regular meshes
(Springer Nature, 2024-02-24)In various applications such as hyperspectral data manipulation, MRI data exploration, or viewshed identification in digital elevation models, performing arithmetic operations on each point of a data mesh that involves ... -
SkewEngine: reorganización de mallas regulares para cálculo intensivo [SkewEngine: reorganizing regular meshes for intensive computation]
(2023)In many applications, such as hyperspectral data manipulation, magnetic resonance data exploration, or viewshed identification in digital elevation models, it is necessary to perform ... -
Smith-Waterman Acceleration in Multi-GPUs: A Performance per Watt Analysis
(Springer, 2017)We present a performance per watt analysis of CUDAlign 4.0, a parallel strategy to obtain the optimal alignment of huge DNA sequences in multi-GPU platforms using the exact Smith-Waterman method. Speed-up factors and ... -
Solución de múltiples sistemas lineales en GPUs [Solving multiple linear systems on GPUs]
(2013-11-05)This work focuses on concurrently solving multiple linear systems defined by dense matrices of medium size. It considers a solution based on the Cholesky factorization and its ... -
Solving Large-Scale Markov Decision Processes on Low-Power Heterogeneous Platforms
(2019-07-11)Markov Decision Processes (MDPs) provide a framework for a machine to act autonomously and intelligently in environments where the effects of its actions are not deterministic. MDPs have numerous applications. We focus ... -
Soporte de Orden y Reducciones en Sistemas de Memoria Transaccional [Support for Ordering and Reductions in Transactional Memory Systems]
(UMA Editorial, 2018-11-12)This thesis proposes solutions based on transactional memory (TM) to optimize conflict detection and resolution in codes that contain reduction operations. Reductions appear frequently ... -
Speculative Barriers with Transactional Memory
(IEEE, 2020-12-14)Transactional Memory (TM) is a synchronization model for parallel programming which provides optimistic concurrency control. Transactions can run in parallel and are only serialized in case of conflict. In this work we use ... -
System-Level Design of Energy-Proportional Many-Core Servers for Exascale Computing
(2017-10-13)Continuous advances in manufacturing technologies are enabling the development of more powerful and compact high-performance computing (HPC) servers made of many-core processing architectures. However, this soaring demand ... -
Tasks Fairness Scheduler for GPU
(2019-09-24)Nowadays GPU clusters are available in almost every data processing center. Their GPUs are typically shared by different applications that might have different processing needs and/or different levels of priority. As current ... -
Teaching the Cache Memory System Using a Reconfigurable Approach
(IEEE, 2008)This paper presents a tool that simulates a reconfigurable cache whose parameters can be changed at runtime through a special instruction at the instruction set architecture (ISA) level. The proposed tool simulates a cache ... -
Three is not a crowd: A CPU-GPU-FPGA K-means implementation
(2017-06-15)Clustering is the task of assigning a set of objects into groups (clusters) so that objects in the same group are more similar to each other than to those in other groups. In particular, K-means is a clustering algorithm ... -
Time series analysis acceleration with advanced vectorization extensions
(Springer, 2023)Time series analysis is an important research topic and a key step in monitoring and predicting events in many fields. Recently, the Matrix Profile method, and particularly two of its Euclidean-distance-based implementations, SCRIMP ... -
Time Series Analysis Using Transprecision Computing
(2019-09-11)This work presents results using transprecision techniques for reducing the precision of the computation of time series analysis. The developed benchmark makes it possible to explore how the accuracy of the results is affected by ... -
Time Series Heterogeneous Co-execution on CPU+GPU
(2019-07-10)Time series motif (similarities) and discords discovery is one of the most important and challenging problems nowadays for time series analytics. We use an algorithm called “scrimp” that excels in collecting the relevant ... -
TMbarrier: speculative barriers using hardware transactional memory
(2018-11-15)Barrier is a very common synchronization method used in parallel programming. Barriers are used typically to enforce a partial thread execution order, since there may be dependences between code sections before and after ...