Session

Posters, Research Posters: Research Posters Display
Event Type: Posters, Research Posters
Time: Wednesday, 15 November 2023, 10am - 5pm MST
Registration Categories
TP
XO/EX
Presentations
PanSim: A Performance-Portable Agent Based Model
Ares – Simulating Type Ia Supernovae on Heterogeneous HPC Architectures
Balancing Latency and Throughput of Distributed Inference by Interleaved Parallelism
Scalable Algorithms for Analyzing Large Dynamic Networks Using CANDY
Parallel Optimization Methods for Direct Numerical Simulation of High Reynolds Number Wall Turbulence with a Grid Size of 100 Billion
Performant Low-Order Matrix-Free Finite Element Kernels on GPU Architectures
Introducing Prefetching and Data Compression to Accelerate Checkpointing for Inverse Seismic Problems
GPU-Accelerated Dense Covariance Matrix Generation for Spatial Statistics Applications
ParLeiden: Boosting Parallelism of Distributed Leiden Algorithm on Large-Scale Graphs
Scalable Reduced-Order Modeling for Three-Dimensional Turbulent Flow
Unstructured Finite Element Models of Cardiac Electrophysiology Using a Deal.II-Based Library
A Methodology for Accelerating Variant Calling on GPU
Developing an Inverse Reinforcement Learning Methodology to Predict the Progression of Colorectal Cancer
Accelerating Actor-Based Distributed Triangle Counting
Scaling K-Path Centrality Using Optimized Distributed Data Structure
Simulating Quantum Systems with NWQ-Sim on HPC
A Hybrid Factorization Solver with Mixed Precision Arithmetic for Sparse Matrices
Towards Enabling Digital Twins Capabilities for a Cloud Chamber
High-Performance PMEM-Aware Collective I/Os
Architecture and Networks
I/O and File Systems
An Early Case Study with Multi-Tenancy Support in SPDK’s NVMe-over-Fabric Designs
Optimizing Workflow Performance by Elucidating Semantic Data Flow
The Many Facets of a Dynamic Graph Processing System
sys-sage: A Fresh View on Dynamic Topologies and Attributes of HPC Systems
Simulating Application Agnostic Process Assignment for Graph Workloads on Dragonfly and Fat Tree Topologies
Geospatial Filter and Refine Computations on NVIDIA Bluefield Data Processing Units (DPU)
NeoRodinia: Evaluation of High-Level Parallel Programming Models and Compiler Transformation for GPU Offloading
Integrating TEZIP into LibPressio: A Case Study of Integrating a Dynamic Application into a Static C Environment
Characterizing GPU Effectiveness on NRP for IceCube fp32 Compute
Exploring Userspace Memory Mapping for RDMA-Enabled Network-Attached Memory
Minimizing Data Movement Using Distant Futures
Why Wait!? Hades: An Active, Content-Aware System for Precalculating Derived Quantities
Exploring Green Cryptographic Hashing Algorithms for Eco-Friendly Blockchains
Automating HPC Model Selection on Edge Devices
Graph Based Anomaly Detection in Chimbuko: Feasible or Fallible?
Investigating Anomalies in Compute Clusters: An Unsupervised Learning Approach
Temporal Classification of Allocations for Reduced Memory Usage
Toward Inductive Synthesis of Compiler Heuristics: A Case Study with Register Allocation
Neural Domain Decomposition for Variable Coefficient Poisson Solvers
Software Development Case Study: The Acceleration of a Distributed Application Using GPUs
Delivering Digital Skills Across the Digital Divide: Creating an Accessible On-Demand Self-Paced HPC Virtual Training Lab
EE-HPC – A Framework for Energy Efficient HPC System Operation
Real-Time Change Point Detection in Molecular Dynamics Streaming Data
A High-Performance I/O Framework for Accelerating DNN Model Updates Within Deep Learning Workflow
HPC Accelerated Generative Deep Learning Approach for Creating Digital Twins of Climate Models
A Portable Software Environment for Ultrahigh-Resolution ELM Development on GPUs
Optimizing Uncertainty Quantification of Vision Transformers in Deep Learning on Novel AI Architectures
Two-Phase IO Enabling Large-Scale Performance Introspection
Performance Measurement, Modeling, and Tools
Characterizing One-/Two-Sided Designs in OpenSHMEM Collectives
Modeling Parallel Programs Using Large Language Models
MPI Performance Analysis in Vlasiator: Unraveling Communication Bottlenecks
Exploring Julia as a Unifying End-to-End Workflow Language for HPC on Frontier
Exploring the Impacts of Multiple I/O Metrics in Identifying I/O Bottlenecks
Pipit: Simplifying Analysis of Parallel Execution Traces
Characterizing the Performance of the Implicit Massively Parallel Particle-in-Cell iPIC3D Code
Early Experience in Characterizing Training Large Language Models on Modern HPC Clusters
Transfer Learning Workflow for High-Quality I/O Bandwidth Prediction with Limited Data
DFToy: A New Proxy App for DFT Calculations
Hybrid CPU-GPU Implementation of Edge-Connected Jaccard Similarity in Graph Datasets
Preserving Data Locality in Multidimensional Variational Quantum Classification
Artificial Intelligence/Machine Learning
Post-Moore Computing
Quantum Computing
SCALABLE – Scalable Lattice Boltzmann Leaps to Exascale
Improving Memory Interfacing in HLS-Generated Accelerators with Custom Caches
Programming Frameworks and System Software
Evaluating Performance Portability of GPU Programming Models
Heterogeneous Computing
Performance Measurement, Modeling, and Tools
The Impact of Process Topology on RMA Programming Models: A Study on NERSC Perlmutter
Scalable Fine-Grained Gang Scheduling for HPC Systems with Unreliable Broadcast Synchronization Mechanisms
Sophisticated Tools for Performance Analysis and Auto-Tuning of Performance Portable Parallel Programming
That's Right – The Same C++ STL Asynchronous Parallel Code Runs on CPUs and GPUs
Simulating Larger Quantum Circuits with Circuit Cutting and Quantum Serverless
Quantum Task Offloading with the OpenMP API
Unleashing CGRA Potential for HPC
Quantum Computing Case Study in Aerospace Field
Radium: Transparent Distributed Execution via Process Virtualization
QASM-to-HLS: A Framework for Accelerating Quantum Circuit Emulation on High-Performance Reconfigurable Computers