Publications

2026

Laso Rodriguez, R., Salimi Beni, M., Vardas, I., Benkner, S., & Hunold, S. (2026). To ncclsee, or Not to ncclsee: That is the Profiling Question. In Austrian-Slovenian HPC Meeting 2026 – ASHPC26 (pp. 13–13).
Träff, J. L. (2026). Lectures on Parallel Computing (Vol. 14600). Springer. https://doi.org/10.1007/978-3-031-86578-7

2025

Salimi Beni, M., Laso, R., Cosenza, B., Benkner, S., & Hunold, S. (2025). Exploring NCCL Tuning Strategies for Distributed Deep Learning. In 2025 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (pp. 59–62). IEEE. https://doi.org/10.1109/IPDPSW66978.2025.00015
Vardas, I., Träff, J. L., Laso, R., & Hunold, S. (2025). Mpisee: communicator-centric profiling of MPI applications. Concurrency and Computation: Practice and Experience, 37(15–17), Article e70158. https://doi.org/10.1002/cpe.70158
Träff, J. L. (2025). Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan). arXiv. https://doi.org/10.34726/10821
Salimi Beni, M., Laso, R., Cosenza, B., Benkner, S., & Hunold, S. (2025). Optimizing Distributed Deep Learning Training by Tuning NCCL. In ASHPC25 : Austrian-Slovenian HPC Meeting 2025 : Rimske Terme, Slovenia : 19-22 May 2025 (pp. 38–38). https://doi.org/10.34726/10424
Vardas, I., Laso Rodriguez, R., & Salimi Beni, M. (2025). ncclsee: A Lightweight Profiling Tool for NCCL. In ASHPC25 : Austrian-Slovenian HPC Meeting 2025 : Rimske Terme, Slovenia : 19-22 May 2025 (pp. 39–39). https://doi.org/10.34726/10426
Träff, J. L. (2025). Optimal, Non-pipelined Reduce-scatter and Allreduce Algorithms. arXiv. https://doi.org/10.34726/10760
Carpentieri, L., De Caro, A., Salimibeni, M., Fan, K., & Cosenza, B. (2025). Phase-Based Frequency Scaling for Energy-Efficient Heterogeneous Computing. In 2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (pp. 824–836). IEEE. https://doi.org/10.1109/IPDPS64566.2025.00078

2024

Salimibeni, M., Cosenza, B., & Hunold, S. (2024). MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns. In Proceedings : 2024 IEEE International Conference on Cluster Computing : 24 – 27 September 2024 Kobe, Japan (pp. 108–119). https://doi.org/10.1109/CLUSTER59578.2024.00017
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2024). Exploring Mapping Strategies for Co-allocated HPC Applications. In Demetris Zeinalipour, D. Blanco Heras, G. Pallis, H. Herodotou, D. Trihinas, D. Balouek, P. Diehl, T. Cojean, K. Fürlinger, M. H. Kirkeby, M. Nardelli, & P. Di Sanzo (Eds.), Euro-Par 2023: Parallel Processing Workshops : Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28 – September 1, 2023, Revised Selected Papers, Part II (pp. 271–276). Springer Nature. https://doi.org/10.1007/978-3-031-48803-0_41
Träff, J. L. (2024). Lectures on Parallel Computing. arXiv. https://doi.org/10.34726/10819
Träff, J. L. (2024). Optimal Broadcast Schedules in Logarithmic Time with Applications to Broadcast, All-Broadcast, Reduction and All-Reduction. arXiv. https://doi.org/10.34726/10820
Laso Rodriguez, R., Krupitza, D., & Hunold, S. (2024). pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations. arXiv. https://doi.org/10.48550/arXiv.2402.06384
Hunold, S., Xie, B., & Shu, K. (Eds.). (2024). Benchmarking, Measuring, and Optimizing : 15th BenchCouncil International Symposium, Bench 2023, Revised Selected Papers (Vol. 14521). Springer Singapore. https://doi.org/10.1007/978-981-97-0316-6
Laso Rodriguez, R., Krupitza, D., & Hunold, S. (2024). Exploring Scalability in C++ Parallel STL Implementations. In ICPP ’24: Proceedings of the 53rd International Conference on Parallel Processing (pp. 284–293). ACM. https://doi.org/10.1145/3673038.3673065
Salimi Beni, M., Hunold, S., & Cosenza, B. (2024). Analysis and prediction of performance variability in large-scale computing systems. Journal of Supercomputing, 80(10), 14978–15005. https://doi.org/10.1007/s11227-024-06040-w
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2024). Improved Parallel Application Performance and Makespan by Colocation and Topology-aware Process Mapping. In 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (pp. 119–124). IEEE. https://doi.org/10.1109/CCGrid59990.2024.00023

2023

Träff, J. L. (2023). Round-optimal 𝑛-Block Broadcast Schedules in Logarithmic Time. arXiv. https://doi.org/10.34726/7320
Hunold, S. (2023, December 8). Unveiling the Complexities of Performance Analysis and Optimization in HPC Systems. Universität Münster, Münster, Germany.
Hunold, S. (2023). Verifying Performance Guidelines for MPI Collectives at Scale. In Proceedings of 2023 SC23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC23 Workshops) (pp. 1264–1268). ACM. https://doi.org/10.1145/3624062.3625532
Swartvagher, P., Hunold, S., Träff, J. L., & Vardas, I. (2023). Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures. In Proceedings of 2023 SC23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (SC 2023 Workshops) (pp. 405–415). ACM. https://doi.org/10.1145/3624062.3624109
Laso Rodriguez, R., & Casado, F. E. (2023, November 3). The research career after the PhD. CiTIUS (USC), Santiago de Compostela, Spain.
Träff, J. L., & Vardas, I. (2023). Library Development with MPI: Attributes, Request Objects, Group Communicator Creation, Local Reductions, and Datatypes. In Proceedings of the 30th European MPI Users’ Group Meeting (EUROMPI 23). 30th European MPI Users’ Group Meeting (EuroMPI 2023), Bristol, United Kingdom. ACM. https://doi.org/10.1145/3615318.3615323
Schuchart, J., Hunold, S., & Bosilca, G. (2023). Synchronizing MPI Processes in Space and Time. In EuroMPI ’23: Proceedings of the 30th European MPI Users’ Group Meeting (pp. 1–11). ACM. https://doi.org/10.1145/3615318.3615325
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Realizing multioperations and multiprefixes in Thick Control Flow processors. Microprocessors and Microsystems, 98, Article 104807. https://doi.org/10.1016/j.micpro.2023.104807
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Preliminary Performance and Memory Access Scalability Study of Thick Control Flow Processors. In J. Nurmi, M. Shen, P. Ellervee, P. Koch, & F. Moradi (Eds.), Proceedings 2023 IEEE Nordic Circuits and Systems Conference (NorCAS) (pp. 1–7). IEEE. https://doi.org/10.1109/NorCAS58970.2023.10305463
Hunold, S., & Hagn, M. (2023). MPI is Good, Control is Better: Checking Performance Guidelines of Collectives. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 60–60). EuroCC Austria. https://doi.org/10.34726/5367
Hunold, S., & Kraßnitzer, K. D. V. (2023). A Quantitative Analysis of OpenMP Task Runtime Systems. In A. Gainaru, C. Zhang, & C. Luo (Eds.), Benchmarking, Measuring, and Optimizing : 14th BenchCouncil International Symposium, Bench 2022, Virtual Event, November 7-9, 2022, Revised Selected Papers (pp. 3–18). Springer. https://doi.org/10.1007/978-3-031-31180-2_1
Hunold, S., & Steiner, S. (2023). OMPICollTune: Autotuning MPI Collectives by Incremental Online Learning. In Proceedings of PMBS 2022: performance modeling, benchmarking and simulation of high performance computer systems (pp. 123–128). IEEE. https://doi.org/10.1109/PMBS56514.2022.00016
Hunold, S., Vardas, I., Ibis, G., & Langer, T. (2023). Massively Scaling Molecular Screening Workloads on EuroHPC Supercomputers. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 51–51). EuroCC Austria. https://doi.org/10.34726/5366
Swartvagher, P., Vardas, I., Hunold, S., & Träff, J. L. (2023). Rank Reordering within MPI Communicators to Exploit Deep Hierarchal Architectures of Supercomputers. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 61–61). EuroCC Austria. https://doi.org/10.34726/5368
Träff, J. L., Hunold, S., Vardas, I., & Funk, N. M. (2023). Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI. In 2023 IEEE International Conference on Cluster Computing (CLUSTER) (pp. 284–294). IEEE. https://doi.org/10.1109/CLUSTER52292.2023.00031
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2023). Effects of Mapping Strategies on Average Duration and Throughput of Colocated HPC Applications. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 10–10). EuroCC Austria. https://doi.org/10.34726/5330

2022

Träff, J. L. (2022). Brief Announcement: Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In K. Agrawal & I.-T. A. Lee (Eds.), Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2022) (pp. 143–146). ACM. https://doi.org/10.1145/3490148.3538560
Hunold, S., Ajanohoun, J. I., Vardas, I., & Träff, J. L. (2022). An Overhead Analysis of MPI Profiling and Tracing Tools. In C. Scully-Allison, R. Liem, & A. V. Solorzano (Eds.), PERMAVOST 2022: Proceedings of the 2nd Workshop on Performance Engineering, Modelling, Analysis, and Visualization Strategy (pp. 5–13). Association for Computing Machinery (ACM). https://doi.org/10.1145/3526063.3535353
Hunold, S., & Przybylski, B. (2022, May 18). Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia. New Challenges in Scheduling Theory (Centre CNRS “Paul-Langevin”, Aussois, France), Aussois, France.
Ajanohoun, J. I., Vardas, I., Träff, J. L., & Hunold, S. (2022). MPI Performance Tools under the Microscope: A Thorough Overhead Analysis. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 16). EuroCC Austria.
Forsell, M., Nikula, S., Roivainen, J., Leppänen, V., & Träff, J. L. (2022). Performance and programmability comparison of the thick control flow architecture and current multicore processors. The Journal of Supercomputing, 78(3), 3152–3183. https://doi.org/10.1007/s11227-021-03985-0
Hunold, S. (2022). Performance Tuning of MPI Collectives - Status Quo and Open Problems. CaSToRC HPC National Competence Center Fall Seminar Series 2022, Unknown.
Träff, J. L. (2022). (Poly)Logarithmic Time Construction of Round-optimal n-Block Broadcast Schedules for Broadcast and irregular Allgather in MPI. arXiv. https://doi.org/10.48550/arXiv.2205.10072
Träff, J. L. (2022). Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In Proceedings IEEE International Conference on Cluster Computing (CLUSTER 2022) (pp. 142–151). IEEE. https://doi.org/10.1109/CLUSTER51413.2022.00028
Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022) (pp. 520–529). IEEE. https://doi.org/10.1109/IPDPSW55747.2022.00092
Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 15). EuroCC Austria.

2021

Hunold, S., & Przybylski, B. (2021). Teaching Complex Scheduling Algorithms. In 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 11th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar 2021) in conjunction with 35th IEEE IPDPS 2021 - Online Conference, Portland, Oregon, United States of America. IEEE. https://doi.org/10.1109/ipdpsw52791.2021.00058
Hunold, S., Ajanohoun, J. I., & Carpen-Amarie, A. (2021). MicroBench Maker: Reproduce, Reuse, Improve. In 2021 International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS). 12th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS 2021) in conjunction with SC 2021, St. Louis, Missouri, United States of America. IEEE. https://doi.org/10.1109/pmbs54543.2021.00013
Träff, J. L. (2021). A Doubly-pipelined, Dual-root Reduction-to-all Algorithm and Implementation. arXiv. https://doi.org/10.48550/arXiv.2109.12626
Träff, J. L., & Pöter, M. (2021). A more pragmatic implementation of the lock-free, ordered, linked list. In J. Lee & E. Petrank (Eds.), Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM. https://doi.org/10.1145/3437801.3441579
Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2021). MPI collective communication through a single set of interfaces: A case for orthogonality. Parallel Computing, 107, Article 102826. https://doi.org/10.1016/j.parco.2021.102826

2020

Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. arXiv. https://doi.org/10.48550/arXiv.2001.07134
Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. In S. Faro & D. Cantone (Eds.), 18th International Symposium on Experimental Algorithms, SEA 2020 (pp. 4:1-4:15). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.SEA.2020.4
Forsell, M., Roivainen, J., & Träff, J. L. (2020). Optimizing Memory Access in TCF Processors with Compute-Update Operations. In 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020) in conjunction with IPDPS 2020 - Online Conference, New Orleans, United States of America. IEEE. https://doi.org/10.1109/ipdpsw50202.2020.00100
Hunold, S., & Przybylski, B. (2020). Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia. arXiv. https://doi.org/10.48550/arXiv.2003.05217
Hunold, S., Bhatele, A., Bosilca, G., & Knees, P. (2020). Predicting MPI Collective Communication Performance Using Machine Learning. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00036
Hunold, S., von Kirchbach, K., Lehr, M., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. arXiv. https://doi.org/10.48550/arXiv.2005.09521
von Kirchbach, K., Schulz, C., & Träff, J. L. (2020). Better Process Mapping and Sparse Quadratic Assignment. ACM Journal on Experimental Algorithmics, 25, 1–19. https://doi.org/10.1145/3409667
Lehr, M., & von Kirchbach, K. (2020). Improved Cartesian Topology Mapping in MPI. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 27). IST Austria.
Pachajoa, C., Levonyak, M., Pacher, C., Träff, J. L., & Gansterer, W. (2020). Classical and pipelined preconditioned conjugate gradient methods with node-failure resilience. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Booklet : Austrian High-Performance-Computing Meeting (AHPC 2020) (pp. 13–13). IST Austria.
Träff, J. L. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. SPCL_Bcast, ETH Zürich, Zürich, Switzerland.
Träff, J. L. (2020). k-ported vs. k-lane Broadcast, Scatter, and Alltoall Algorithms. arXiv. https://doi.org/10.48550/arXiv.2008.12144
Träff, J. L. (2020). Exploiting Multi-lane Communication in MPI Collectives. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (pp. 30–30). IST Austria.
Träff, J. L. (2020). Signature Datatypes for Type Correct Collective Operations, Revisited. In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, United States of America. IEEE. https://doi.org/10.1145/3416315.3416324
Träff, J. L., & Hoefler, T. (2020). Special issue: Selected papers from EuroMPI 2019. Parallel Computing, 99, Article 102695. https://doi.org/10.1016/j.parco.2020.102695
Träff, J. L., & Hunold, S. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00037
Träff, J. L., & Pöter, M. (2020). A more Pragmatic Implementation of the Lock-free, Ordered, Linked List. arXiv. https://doi.org/10.48550/arXiv.2010.15755
Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2020). Collectives and Communicators: A Case for Orthogonality: (Or: How to get rid of MPI neighbor and enhance Cartesian collectives). In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, United States of America. IEEE. https://doi.org/10.1145/3416315.3416319
von Kirchbach, K., Lehr, M., Hunold, S., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00011
