Publications: Parallel Computing

2026

Laso Rodriguez, R., Salimi Beni, M., Vardas, I., Benkner, S., & Hunold, S. (2026). To ncclsee, or Not to ncclsee: That is the Profiling Question. In Austrian-Slovenian HPC Meeting 2026 – ASHPC26 (p. 13).
Träff, J. L. (2026). Lectures on Parallel Computing (Vol. 14600). Springer. https://doi.org/10.1007/978-3-031-86578-7

2025

Salimi Beni, M., Laso, R., Cosenza, B., Benkner, S., & Hunold, S. (2025). Exploring NCCL Tuning Strategies for Distributed Deep Learning. In 2025 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (pp. 59–62). IEEE. https://doi.org/10.1109/IPDPSW66978.2025.00015
Vardas, I., Träff, J. L., Laso, R., & Hunold, S. (2025). mpisee: communicator-centric profiling of MPI applications. Concurrency and Computation: Practice and Experience, 37(15–17), Article e70158. https://doi.org/10.1002/cpe.70158
Träff, J. L. (2025). Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan). arXiv. https://doi.org/10.34726/10821
Salimi Beni, M., Laso, R., Cosenza, B., Benkner, S., & Hunold, S. (2025). Optimizing Distributed Deep Learning Training by Tuning NCCL. In ASHPC25 : Austrian-Slovenian HPC Meeting 2025 : Rimske Terme, Slovenia : 19-22 May 2025 (p. 38). https://doi.org/10.34726/10424
Vardas, I., Laso Rodriguez, R., & Salimi Beni, M. (2025). ncclsee: A Lightweight Profiling Tool for NCCL. In ASHPC25 : Austrian-Slovenian HPC Meeting 2025 : Rimske Terme, Slovenia : 19-22 May 2025 (p. 39). https://doi.org/10.34726/10426
Träff, J. L. (2025). Optimal, Non-pipelined Reduce-scatter and Allreduce Algorithms. arXiv. https://doi.org/10.34726/10760
Carpentieri, L., De Caro, A., Salimibeni, M., Fan, K., & Cosenza, B. (2025). Phase-Based Frequency Scaling for Energy-Efficient Heterogeneous Computing. In 2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (pp. 824–836). IEEE. https://doi.org/10.1109/IPDPS64566.2025.00078

2024

Salimibeni, M., Cosenza, B., & Hunold, S. (2024). MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns. In Proceedings : 2024 IEEE International Conference on Cluster Computing : 24 – 27 September 2024 Kobe, Japan (pp. 108–119). https://doi.org/10.1109/CLUSTER59578.2024.00017
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2024). Exploring Mapping Strategies for Co-allocated HPC Applications. In D. Zeinalipour, D. Blanco Heras, G. Pallis, H. Herodotou, D. Trihinas, D. Balouek, P. Diehl, T. Cojean, K. Fürlinger, M. H. Kirkeby, M. Nardelli, & P. Di Sanzo (Eds.), Euro-Par 2023: Parallel Processing Workshops : Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28 – September 1, 2023, Revised Selected Papers, Part II (pp. 271–276). Springer Nature. https://doi.org/10.1007/978-3-031-48803-0_41
Träff, J. L. (2024). Lectures on Parallel Computing. arXiv. https://doi.org/10.34726/10819
Träff, J. L. (2024). Optimal Broadcast Schedules in Logarithmic Time with Applications to Broadcast, All-Broadcast, Reduction and All-Reduction. arXiv. https://doi.org/10.34726/10820
Laso Rodriguez, R., Krupitza, D., & Hunold, S. (2024). pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations. arXiv. https://doi.org/10.48550/arXiv.2402.06384
Hunold, S., Xie, B., & Shu, K. (Eds.). (2024). Benchmarking, Measuring, and Optimizing : 15th BenchCouncil International Symposium, Bench 2023, Revised Selected Papers (Vol. 14521). Springer Singapore. https://doi.org/10.1007/978-981-97-0316-6
Laso Rodriguez, R., Krupitza, D., & Hunold, S. (2024). Exploring Scalability in C++ Parallel STL Implementations. In ICPP ’24: Proceedings of the 53rd International Conference on Parallel Processing (pp. 284–293). ACM. https://doi.org/10.1145/3673038.3673065
Salimi Beni, M., Hunold, S., & Cosenza, B. (2024). Analysis and prediction of performance variability in large-scale computing systems. Journal of Supercomputing, 80(10), 14978–15005. https://doi.org/10.1007/s11227-024-06040-w
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2024). Improved Parallel Application Performance and Makespan by Colocation and Topology-aware Process Mapping. In 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (pp. 119–124). IEEE. https://doi.org/10.1109/CCGrid59990.2024.00023

2023

Träff, J. L. (2023). Round-optimal 𝑛-Block Broadcast Schedules in Logarithmic Time. arXiv. https://doi.org/10.34726/7320
Hunold, S. (2023, December 8). Unveiling the Complexities of Performance Analysis and Optimization in HPC Systems. Universität Münster, Münster, Germany.
Hunold, S. (2023). Verifying Performance Guidelines for MPI Collectives at Scale. In Proceedings of 2023 SC23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC23 Workshops) (pp. 1264–1268). ACM. https://doi.org/10.1145/3624062.3625532
Swartvagher, P., Hunold, S., Träff, J. L., & Vardas, I. (2023). Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures. In Proceedings of 2023 SC23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (SC 2023 Workshops) (pp. 405–415). ACM. https://doi.org/10.1145/3624062.3624109
Laso Rodriguez, R., & Casado, F. E. (2023, November 3). The research career after the PhD. CiTIUS (USC), Santiago de Compostela, Spain.
Träff, J. L., & Vardas, I. (2023). Library Development with MPI: Attributes, Request Objects, Group Communicator Creation, Local Reductions, and Datatypes. In Proceedings of the 30th European MPI Users’ Group Meeting (EuroMPI 2023), Bristol, United Kingdom. ACM. https://doi.org/10.1145/3615318.3615323
Schuchart, J., Hunold, S., & Bosilca, G. (2023). Synchronizing MPI Processes in Space and Time. In EuroMPI ’23: Proceedings of the 30th European MPI Users’ Group Meeting (pp. 1–11). ACM. https://doi.org/10.1145/3615318.3615325
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Realizing multioperations and multiprefixes in Thick Control Flow processors. Microprocessors and Microsystems, 98, Article 104807. https://doi.org/10.1016/j.micpro.2023.104807
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Preliminary Performance and Memory Access Scalability Study of Thick Control Flow Processors. In J. Nurmi, M. Shen, P. Ellervee, P. Koch, & F. Moradi (Eds.), Proceedings 2023 IEEE Nordic Circuits and Systems Conference (NorCAS) (pp. 1–7). IEEE. https://doi.org/10.1109/NorCAS58970.2023.10305463
Hunold, S., & Hagn, M. (2023). MPI is Good, Control is Better: Checking Performance Guidelines of Collectives. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (p. 60). EuroCC Austria. https://doi.org/10.34726/5367
Hunold, S., & Kraßnitzer, K. D. V. (2023). A Quantitative Analysis of OpenMP Task Runtime Systems. In A. Gainaru, C. Zhang, & C. Luo (Eds.), Benchmarking, Measuring, and Optimizing : 14th BenchCouncil International Symposium, Bench 2022, Virtual Event, November 7-9, 2022, Revised Selected Papers (pp. 3–18). Springer. https://doi.org/10.1007/978-3-031-31180-2_1
Hunold, S., & Steiner, S. (2023). OMPICollTune: Autotuning MPI Collectives by Incremental Online Learning. In Proceedings of PMBS 2022: performance modeling, benchmarking and simulation of high performance computer systems (pp. 123–128). IEEE. https://doi.org/10.1109/PMBS56514.2022.00016
Hunold, S., Vardas, I., Ibis, G., & Langer, T. (2023). Massively Scaling Molecular Screening Workloads on EuroHPC Supercomputers. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (p. 51). EuroCC Austria. https://doi.org/10.34726/5366
Swartvagher, P., Vardas, I., Hunold, S., & Träff, J. L. (2023). Rank Reordering within MPI Communicators to Exploit Deep Hierarchal Architectures of Supercomputers. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (p. 61). EuroCC Austria. https://doi.org/10.34726/5368
Träff, J. L., Hunold, S., Vardas, I., & Funk, N. M. (2023). Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI. In 2023 IEEE International Conference on Cluster Computing (CLUSTER) (pp. 284–294). IEEE. https://doi.org/10.1109/CLUSTER52292.2023.00031
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2023). Effects of Mapping Strategies on Average Duration and Throughput of Colocated HPC Applications. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (p. 10). EuroCC Austria. https://doi.org/10.34726/5330

2022

Träff, J. L. (2022). Brief Announcement: Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In K. Agrawal & I.-T. A. Lee (Eds.), Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2022) (pp. 143–146). ACM. https://doi.org/10.1145/3490148.3538560
Hunold, S., Ajanohoun, J. I., Vardas, I., & Träff, J. L. (2022). An Overhead Analysis of MPI Profiling and Tracing Tools. In C. Scully-Allison, R. Liem, & A. V. Solorzano (Eds.), PERMAVOST 2022: Proceedings of the 2nd Workshop on Performance Engineering, Modelling, Analysis, and Visualization Strategy (pp. 5–13). Association for Computing Machinery (ACM). https://doi.org/10.1145/3526063.3535353
Hunold, S., & Przybylski, B. (2022, May 18). Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia. New Challenges in Scheduling Theory, Centre CNRS “Paul-Langevin”, Aussois, France.
Ajanohoun, J. I., Vardas, I., Träff, J. L., & Hunold, S. (2022). MPI Performance Tools under the Microscope: A Thorough Overhead Analysis. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 16). EuroCC Austria.
Forsell, M., Nikula, S., Roivainen, J., Leppänen, V., & Träff, J. L. (2022). Performance and programmability comparison of the thick control flow architecture and current multicore processors. The Journal of Supercomputing, 78(3), 3152–3183. https://doi.org/10.1007/s11227-021-03985-0
Hunold, S. (2022). Performance Tuning of MPI Collectives - Status Quo and Open Problems. CaSToRC HPC National Competence Center Fall Seminar Series 2022.
Träff, J. L. (2022). (Poly)Logarithmic Time Construction of Round-optimal n-Block Broadcast Schedules for Broadcast and irregular Allgather in MPI. arXiv. https://doi.org/10.48550/arXiv.2205.10072
Träff, J. L. (2022). Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In Proceedings IEEE International Conference on Cluster Computing (CLUSTER 2022) (pp. 142–151). IEEE. https://doi.org/10.1109/CLUSTER51413.2022.00028
Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022) (pp. 520–529). IEEE. https://doi.org/10.1109/IPDPSW55747.2022.00092
Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 15). EuroCC Austria.

2021

Hunold, S., & Przybylski, B. (2021). Teaching Complex Scheduling Algorithms. In 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 11th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar 2021) in conjunction with 35th IEEE IPDPS 2021 - Online Conference, Portland, Oregon, USA. IEEE. https://doi.org/10.1109/ipdpsw52791.2021.00058
Hunold, S., Ajanohoun, J. I., & Carpen-Amarie, A. (2021). MicroBench Maker: Reproduce, Reuse, Improve. In 2021 International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS). 12th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS 2021) in conjunction with SC 2021, St. Louis, Missouri, USA. IEEE. https://doi.org/10.1109/pmbs54543.2021.00013
Träff, J. L. (2021). A Doubly-pipelined, Dual-root Reduction-to-all Algorithm and Implementation. arXiv. https://doi.org/10.48550/arXiv.2109.12626
Träff, J. L., & Pöter, M. (2021). A more pragmatic implementation of the lock-free, ordered, linked list. In J. Lee & E. Petrank (Eds.), Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM. https://doi.org/10.1145/3437801.3441579
Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2021). MPI collective communication through a single set of interfaces: A case for orthogonality. Parallel Computing, 107, Article 102826. https://doi.org/10.1016/j.parco.2021.102826

2020

Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. arXiv. https://doi.org/10.48550/arXiv.2001.07134
Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. In S. Faro & D. Cantone (Eds.), 18th International Symposium on Experimental Algorithms, SEA 2020 (pp. 4:1–4:15). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.SEA.2020.4
Forsell, M., Roivainen, J., & Träff, J. L. (2020). Optimizing Memory Access in TCF Processors with Compute-Update Operations. In 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020) in conjunction with IPDPS 2020 - Online Conference, New Orleans, USA. IEEE. https://doi.org/10.1109/ipdpsw50202.2020.00100
Hunold, S., & Przybylski, B. (2020). Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia. arXiv. https://doi.org/10.48550/arXiv.2003.05217
Hunold, S., Bhatele, A., Bosilca, G., & Knees, P. (2020). Predicting MPI Collective Communication Performance Using Machine Learning. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00036
Hunold, S., von Kirchbach, K., Lehr, M., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. arXiv. https://doi.org/10.48550/arXiv.2005.09521
von Kirchbach, K., Schulz, C., & Träff, J. L. (2020). Better Process Mapping and Sparse Quadratic Assignment. ACM Journal on Experimental Algorithmics, 25, 1–19. https://doi.org/10.1145/3409667
Lehr, M., & von Kirchbach, K. (2020). Improved Cartesian Topology Mapping in MPI. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 27). IST Austria.
Pachajoa, C., Levonyak, M., Pacher, C., Träff, J. L., & Gansterer, W. (2020). Classical and pipelined preconditioned conjugate gradient methods with node-failure resilience. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Booklet : Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 13). IST Austria.
Träff, J. L. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. SPCL_Bcast, ETH Zürich, Zürich, Switzerland.
Träff, J. L. (2020). k-ported vs. k-lane Broadcast, Scatter, and Alltoall Algorithms. arXiv. https://doi.org/10.48550/arXiv.2008.12144
Träff, J. L. (2020). Exploiting Multi-lane Communication in MPI Collectives. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 30). IST Austria.
Träff, J. L. (2020). Signature Datatypes for Type Correct Collective Operations, Revisited. In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, USA. ACM. https://doi.org/10.1145/3416315.3416324
Träff, J. L., & Hoefler, T. (2020). Special issue: Selected papers from EuroMPI 2019. Parallel Computing, 99, Article 102695. https://doi.org/10.1016/j.parco.2020.102695
Träff, J. L., & Hunold, S. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00037
Träff, J. L., & Pöter, M. (2020). A more Pragmatic Implementation of the Lock-free, Ordered, Linked List. arXiv. https://doi.org/10.48550/arXiv.2010.15755
Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2020). Collectives and Communicators: A Case for Orthogonality: (Or: How to get rid of MPI neighbor and enhance Cartesian collectives). In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, USA. ACM. https://doi.org/10.1145/3416315.3416319
von Kirchbach, K., Lehr, M., Hunold, S., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00011
