Publications
Publications since 2018
- Hunold, S., Xie, B., & Shu, K. (Eds.). (2024). Benchmarking, Measuring, and Optimizing: 15th BenchCouncil International Symposium, Bench 2023, Revised Selected Papers (Vol. 14521). Springer Singapore. https://doi.org/10.1007/978-981-97-0316-6
- Laso Rodriguez, R., Krupitza, D., & Hunold, S. (2024). pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations (arXiv:2402.06384). https://doi.org/10.48550/arXiv.2402.06384
- Salimi Beni, M., Hunold, S., & Cosenza, B. (2024). Analysis and prediction of performance variability in large-scale computing systems. Journal of Supercomputing, 80(10), 14978–15005. https://doi.org/10.1007/s11227-024-06040-w
- Hunold, S. (2023, December 8). Unveiling the Complexities of Performance Analysis and Optimization in HPC Systems. Universität Münster, Münster, Germany.
- Hunold, S. (2023). Verifying Performance Guidelines for MPI Collectives at Scale. In Proceedings of 2023 SC23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC23 Workshops) (pp. 1264–1268). ACM. https://doi.org/10.1145/3624062.3625532
- Swartvagher, P., Hunold, S., Träff, J. L., & Vardas, I. (2023). Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures. In Proceedings of 2023 SC23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (SC 2023 Workshops) (pp. 405–415). ACM. https://doi.org/10.1145/3624062.3624109
- Laso Rodriguez, R., & Casado, F. E. (2023, November 3). The research career after the PhD. CiTIUS (USC), Santiago de Compostela, Spain.
- Träff, J. L., & Vardas, I. (2023). Library Development with MPI: Attributes, Request Objects, Group Communicator Creation, Local Reductions, and Datatypes. In Proceedings of the 30th European MPI Users’ Group Meeting (EUROMPI 23). 30th European MPI Users’ Group Meeting (EuroMPI 2023), Bristol, United Kingdom of Great Britain and Northern Ireland. ACM. https://doi.org/10.1145/3615318.3615323
- Schuchart, J., Hunold, S., & Bosilca, G. (2023). Synchronizing MPI Processes in Space and Time. In EuroMPI ’23: Proceedings of the 30th European MPI Users’ Group Meeting (pp. 1–11). ACM. https://doi.org/10.1145/3615318.3615325
- Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Realizing multioperations and multiprefixes in Thick Control Flow processors. Microprocessors and Microsystems, 98, Article 104807. https://doi.org/10.1016/j.micpro.2023.104807
- Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Preliminary Performance and Memory Access Scalability Study of Thick Control Flow Processors. In J. Nurmi, M. Shen, P. Ellervee, P. Koch, & F. Moradi (Eds.), Proceedings 2023 IEEE Nordic Circuits and Systems Conference (NorCAS) (pp. 1–7). IEEE. https://doi.org/10.1109/NorCAS58970.2023.10305463
- Hunold, S., & Hagn, M. (2023). MPI is Good, Control is Better: Checking Performance Guidelines of Collectives. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 60–60). EuroCC Austria. https://doi.org/10.34726/5367
- Hunold, S., & Kraßnitzer, K. D. V. (2023). A Quantitative Analysis of OpenMP Task Runtime Systems. In A. Gainaru, C. Zhang, & C. Luo (Eds.), Benchmarking, Measuring, and Optimizing: 14th BenchCouncil International Symposium, Bench 2022, Virtual Event, November 7-9, 2022, Revised Selected Papers (pp. 3–18). Springer. https://doi.org/10.1007/978-3-031-31180-2_1
- Hunold, S., & Steiner, S. (2023). OMPICollTune: Autotuning MPI Collectives by Incremental Online Learning. In Proceedings of PMBS 2022: performance modeling, benchmarking and simulation of high performance computer systems (pp. 123–128). IEEE. https://doi.org/10.1109/PMBS56514.2022.00016
- Hunold, S., Vardas, I., Ibis, G., & Langer, T. (2023). Massively Scaling Molecular Screening Workloads on EuroHPC Supercomputers. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 51–51). EuroCC Austria. https://doi.org/10.34726/5366
- Swartvagher, P., Vardas, I., Hunold, S., & Träff, J. L. (2023). Rank Reordering within MPI Communicators to Exploit Deep Hierarchal Architectures of Supercomputers. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 61–61). EuroCC Austria. https://doi.org/10.34726/5368
- Träff, J. L., Hunold, S., Vardas, I., & Funk, N. M. (2023). Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI. In 2023 IEEE International Conference on Cluster Computing (CLUSTER) (pp. 284–294). IEEE. https://doi.org/10.1109/CLUSTER52292.2023.00031
- Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2023). Effects of Mapping Strategies on Average Duration and Throughput of Colocated HPC Applications. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 10–10). EuroCC Austria. https://doi.org/10.34726/5330
- Träff, J. L. (2022). Brief Announcement: Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In K. Agrawal & I.-T. A. Lee (Eds.), Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2022) (pp. 143–146). ACM. https://doi.org/10.1145/3490148.3538560
- Hunold, S., Ajanohoun, J. I., Vardas, I., & Träff, J. L. (2022). An Overhead Analysis of MPI Profiling and Tracing Tools. In C. Scully-Allison, R. Liem, & A. V. Solorzano (Eds.), PERMAVOST 2022: Proceedings of the 2nd Workshop on Performance Engineering, Modelling, Analysis, and Visualization Strategy (pp. 5–13). Association for Computing Machinery (ACM). https://doi.org/10.1145/3526063.3535353
- Hunold, S., & Przybylski, B. (2022, May 18). Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia. New Challenges in Scheduling Theory (Centre CNRS “Paul-Langevin”, Aussois, France), Aussois, France.
- Ajanohoun, J. I., Vardas, I., Träff, J. L., & Hunold, S. (2022). MPI Performance Tools under the Microscope: A Thorough Overhead Analysis. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 16). EuroCC Austria.
- Forsell, M., Nikula, S., Roivainen, J., Leppänen, V., & Träff, J. L. (2022). Performance and programmability comparison of the thick control flow architecture and current multicore processors. The Journal of Supercomputing, 78(3), 3152–3183. https://doi.org/10.1007/s11227-021-03985-0
- Hunold, S. (2022). Performance Tuning of MPI Collectives - Status Quo and Open Problems. CaSToRC HPC National Competence Center Fall Seminar Series 2022, Unknown.
- Träff, J. L. (2022). (Poly)Logarithmic Time Construction of Round-optimal n-Block Broadcast Schedules for Broadcast and irregular Allgather in MPI. arXiv. https://doi.org/10.48550/arXiv.2205.10072
- Träff, J. L. (2022). Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In Proceedings IEEE International Conference on Cluster Computing (CLUSTER 2022) (pp. 142–151). IEEE. https://doi.org/10.1109/CLUSTER51413.2022.00028
- Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022) (pp. 520–529). IEEE. https://doi.org/10.1109/IPDPSW55747.2022.00092
- Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 15). EuroCC Austria.
- Hunold, S., & Przybylski, B. (2021). Teaching Complex Scheduling Algorithms. In 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 11th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar 2021) in conjunction with 35th IEEE IPDPS 2021 - Online Conference, Portland, Oregon, USA. IEEE. https://doi.org/10.1109/ipdpsw52791.2021.00058
- Hunold, S., Ajanohoun, J. I., & Carpen-Amarie, A. (2021). MicroBench Maker: Reproduce, Reuse, Improve. In 2021 International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS). 12th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS 2021) in conjunction with SC 2021, St. Louis, Missouri, United States of America. IEEE. https://doi.org/10.1109/pmbs54543.2021.00013
- Träff, J. L. (2021). A Doubly-pipelined, Dual-root Reduction-to-all Algorithm and Implementation. arXiv. https://doi.org/10.48550/arXiv.2109.12626
- Träff, J. L., & Pöter, M. (2021). A more pragmatic implementation of the lock-free, ordered, linked list. In J. Lee & E. Petrank (Eds.), Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM. https://doi.org/10.1145/3437801.3441579
- Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2021). MPI collective communication through a single set of interfaces: A case for orthogonality. Parallel Computing: Systems & Applications, 107(102826), 102826. https://doi.org/10.1016/j.parco.2021.102826
- Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. arXiv. https://doi.org/10.48550/arXiv.2001.07134
- Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. In S. Faro & D. Cantone (Eds.), 18th International Symposium on Experimental Algorithms, SEA 2020 (pp. 4:1-4:15). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.SEA.2020.4
- Forsell, M., Roivainen, J., & Träff, J. L. (2020). Optimizing Memory Access in TCF Processors with Compute-Update Operations. In 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020) in conjunction with IPDPS 2020 - Online Conference, New Orleans, United States of America. IEEE. https://doi.org/10.1109/ipdpsw50202.2020.00100
- Hunold, S., & Przybylski, B. (2020). Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia. arXiv. https://doi.org/10.48550/arXiv.2003.05217
- Hunold, S., Bhatele, A., Bosilca, G., & Knees, P. (2020). Predicting MPI Collective Communication Performance Using Machine Learning. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00036
- Hunold, S., von Kirchbach, K., Lehr, M., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. arXiv. https://doi.org/10.48550/arXiv.2005.09521
- Kirchbach, K. V., Schulz, C., & Träff, J. L. (2020). Better Process Mapping and Sparse Quadratic Assignment. ACM Journal on Experimental Algorithmics, 25, 1–19. https://doi.org/10.1145/3409667
- Lehr, M., & von Kirchbach, K. (2020). Improved Cartesian Topology Mapping in MPI. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 27). IST Austria. https://doi.org/10.15479/AT:ISTA:7474
- Pachajoa, C., Levonyak, M., Pacher, C., Träff, J. L., & Gansterer, W. (2020). Classical and pipelined preconditioned conjugate gradient methods with node-failure resilience. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 13). IST Austria. https://doi.org/10.15479/AT:ISTA:7474
- Träff, J. L. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. SPCL_Bcast, ETH Zürich, Zürich, Switzerland.
- Träff, J. L. (2020). k-ported vs. k-lane Broadcast, Scatter, and Alltoall Algorithms. arXiv. https://doi.org/10.48550/arXiv.2008.12144
- Träff, J. L. (2020). Exploiting Multi-lane Communication in MPI Collectives. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 30). IST Austria. https://doi.org/10.15479/AT:ISTA:7474
- Träff, J. L. (2020). Signature Datatypes for Type Correct Collective Operations, Revisited. In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, United States of America. IEEE. https://doi.org/10.1145/3416315.3416324
- Träff, J. L., & Hoefler, T. (2020). Special issue: Selected papers from EuroMPI 2019. Parallel Computing, 99, Article 102695. https://doi.org/10.1016/j.parco.2020.102695
- Träff, J. L., & Hunold, S. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00037
- Träff, J. L., & Pöter, M. (2020). A more Pragmatic Implementation of the Lock-free, Ordered, Linked List. arXiv. https://doi.org/10.48550/arXiv.2010.15755
- Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2020). Collectives and Communicators: A Case for Orthogonality. In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, United States of America. IEEE. https://doi.org/10.1145/3416315.3416319
- von Kirchbach, K., Lehr, M., Hunold, S., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00011
- Träff, J. L. (2019). On Optimal Trees for Irregular Gather and Scatter Collectives. IEEE Transactions on Parallel and Distributed Systems, 30(9), 2060–2074. https://doi.org/10.1109/tpds.2019.2899843
- Hoefler, T., & Träff, J. L. (Eds.). (2019). Proceedings of the 26th European MPI Users’ Group Meeting, EuroMPI 2019. ACM.
- Hunold, S., & Carpen-Amarie, A. (2019). On the Importance of Data Quality when Tuning MPI Libraries. In G. Haase (Ed.), Austrian HPC Meeting 2019 - AHPC19 (AHPC19 booklet of abstracts) (p. 15). Institut für Mathematik und wissenschaftliches Rechnen der Universität Graz.
- Kainer, M., & Träff, J. L. (2019). More Parallelism in Dijkstra’s Single-Source Shortest Path Algorithm. arXiv. https://doi.org/10.48550/arXiv.1903.12085
- Kainrad, T., Hunold, S., Seidel, T., & Langer, T. (2019). LigandScout Remote: A New User-Friendly Interface for HPC and Cloud Resources. Journal of Chemical Information and Modeling, 59(1), 31–37. https://doi.org/10.1021/acs.jcim.8b00716
- Kang, Q., Träff, J. L., Al-Bahrani, R., Agrawal, A., Choudhary, A., & Liao, W. (2019). Scalable Algorithms for MPI Intergroup Allgather and Allgatherv. Parallel Computing: Systems & Applications, 85, 220–230. https://doi.org/10.1016/j.parco.2019.04.015
- Pachajoa, C., Levonyak, M., Gansterer, W. N., & Träff, J. L. (2019). How to Make the Preconditioned Conjugate Gradient Method Resilient Against Multiple Node Failures. In Proceedings of the 48th International Conference on Parallel Processing. 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan. ACM. https://doi.org/10.1145/3337821.3337849
- Pachajoa, C., Levonyak, M., Gansterer, W., & Träff, J. L. (2019). How to Make the Preconditioned Conjugate Gradient Method Resilient Against Multiple Node Failures (1907.13077). arXiv. https://doi.org/10.48550/arXiv.1907.13077
- Träff, J. L. (2019). Cartesian Collective Communication: “Advice to users”, “Advice to implementers”, and “Advice to Standardizers.” University of Bordeaux, Bordeaux, France.
- Träff, J. L. (2019). Decomposing Collectives for Exploiting Multi-lane Communication. arXiv. https://doi.org/10.48550/arXiv.1910.13373
- Träff, J. L. (2019). On Optimal Trees for Irregular Gather and Scatter Collectives. Kolloquium Mathematische Informatik, Goethe-Universität Frankfurt am Main, Frankfurt am Main, Germany.
- Träff, J. L. (2019). On optimal Trees for irregular gather and scatter collectives? FernUniversität in Hagen, Prof. Dr. Jörg Keller, Hagen, Germany.
- Träff, J. L. (2019). On optimal Trees for irregular gather and scatter collectives? Humboldt-Universität zu Berlin, Research Group on Modeling and Analysis of Complex Systems, Berlin, Germany.
- Träff, J. L., & Hoefler, T. (2019). Foreword EuroMPI 2019. In T. Hoefler & J. L. Träff (Eds.), Proceedings of the 26th European MPI Users’ Group Meeting on - EuroMPI ’19. ACM. https://doi.org/10.1145/3343211.3343212
- Träff, J. L., & Hunold, S. (2019). Cartesian Collective Communication. In Proceedings of the 48th International Conference on Parallel Processing. 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan. ACM. https://doi.org/10.1145/3337821.3337848
- Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2018). Implementation of Multioperations in Thick Control Flow Processors. In 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 20th Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2018) in conjunction with IPDPS 2018, Vancouver, British Columbia, Canada. IEEE. https://doi.org/10.1109/ipdpsw.2018.00121
- Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2018). Supporting concurrent memory access in TCF processor architectures. Microprocessors and Microsystems, 63, 226–236. https://doi.org/10.1016/j.micpro.2018.09.013
- Hunold, S., & Carpen-Amarie, A. (2018). Algorithm Selection of MPI Collectives Using Machine Learning Techniques. In 2018 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS). 9th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS 2018) in conjunction with SC 2018, Dallas, Texas, USA. IEEE. https://doi.org/10.1109/pmbs.2018.8641622
- Hunold, S., & Carpen-Amarie, A. (2018). Autotuning MPI Collectives using Performance Guidelines. In Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region. International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2018), Tokyo, Japan. ACM. https://doi.org/10.1145/3149457.3149461
- Hunold, S., & Carpen-Amarie, A. (2018). Hierarchical Clock Synchronization in MPI. In 2018 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing, CLUSTER 2018, Belfast, United Kingdom. IEEE. https://doi.org/10.1109/cluster.2018.00050
- Kang, Q., Träff, J. L., Al-Bahrani, R., Agrawal, A., Choudhary, A., & Liao, W. (2018). Full-Duplex Inter-Group All-to-All Broadcast Algorithms with Optimal Bandwidth. In Proceedings of the 25th European MPI Users’ Group Meeting. 25th European MPI Users’ Group Meeting (EuroMPI 2018), Barcelona, Spain. ACM. https://doi.org/10.1145/3236367.3236374
- Pöter, M., & Träff, J. L. (2018). Memory Models for C/C++ Programmers. arXiv. https://doi.org/10.48550/arXiv.1803.04432
- Pöter, M., & Träff, J. L. (2018). Stamp-it: A more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model. arXiv. https://doi.org/10.48550/arXiv.1805.08639
- Pöter, M., & Träff, J. L. (2018). Stamp-it, amortized constant-time memory reclamation in comparison to five other schemes. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 23rd Symposium on Principles and Practice of Parallel Programming (PPoPP 2018), Vienna, Austria. ACM. https://doi.org/10.1145/3178487.3178532
- Pöter, M., & Träff, J. L. (2018). Brief Announcement. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures. 30th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2018), Vienna, Austria. ACM. https://doi.org/10.1145/3210377.3210661
- Träff, J. L. (2018). On Optimal trees for Irregular Gather and Scatter Collectives. Uppsala University, Uppsala, Sweden.
- Träff, J. L. (2018). Parallel Quicksort without Pairwise Element Exchange. arXiv. https://doi.org/10.48550/arXiv.1804.07494
- Träff, J. L. (2018). Practical, distributed, low overhead algorithms for irregular gather and scatter collectives. Parallel Computing: Systems & Applications, 75, 100–117. https://doi.org/10.1016/j.parco.2018.04.003