At SC20, Trish Damkroger, Intel vice president and general manager of HPC, shows how Intel and its partners are building the future of HPC today through hardware and software technologies that accelerate the broad deployment of advanced HPC systems. (Credit: Intel Corporation)
Using Intel Cluster Checker (Part 3)
Framework Definition (FWD) Selection and Definition
If you wish to select a framework definition, you can list the available ones with the following command:
% clck-analyze -X list
Framework definition file path: /usr/local/intel/cc2019/clck/2019.10/etc/fwd/ Framework definition list: avx512_performance_ratios_priv avx512_performance_ratios_user basic_internode_connectivity basic_shells benchmarks bios_checker clock cluster cpu_admin cpu_base cpu_intel64 cpu_user dapl_fabric_providers_present dgemm_cpu_performance environment_variables_uniformity ethernet exclude_hpl file_system_uniformity hardware health health_admin health_base health_extended_user hpcg_cluster hpcg_single hpl_cluster_performance hyper_threading imb_allgather imb_allgatherv imb_allreduce imb_alltoall imb_barrier imb_bcast imb_benchmarks_blocking_collectives imb_benchmarks_non_blocking_collectives imb_gather imb_gatherv imb_iallgather imb_iallgatherv imb_iallreduce imb_ialltoall imb_ialltoallv imb_ibarrier imb_ibcast imb_igather imb_igatherv imb_ireduce imb_ireduce_scatter imb_iscatter imb_iscatterv imb_pingping imb_pingpong_fabric_performance imb_reduce imb_reduce_scatter imb_reduce_scatter_block imb_scatter imb_scatterv infiniband_admin infiniband_base infiniband_user intel_dc_persistent_memory_capabilities_priv intel_dc_persistent_memory_dimm_placement_priv intel_dc_persistent_memory_events_priv intel_dc_persistent_memory_firmware_priv intel_dc_persistent_memory_kernel_support intel_dc_persistent_memory_mode_uniformity_priv intel_dc_persistent_memory_namespaces_priv intel_dc_persistent_memory_priv intel_dc_persistent_memory_tools_priv intel_hpc_platform_base_compat-hpc-2018.0 intel_hpc_platform_base_core-intel-runtime-2018.0 intel_hpc_platform_base_high-performance-fabric-2018.0 intel_hpc_platform_base_hpc-cluster-2018.0 intel_hpc_platform_base_sdvis-cluster-2018.0 intel_hpc_platform_base_sdvis-core-2018.0 intel_hpc_platform_base_sdvis-single-node-2018.0 intel_hpc_platform_compat-hpc-2018.0 intel_hpc_platform_compliance_tcl_version intel_hpc_platform_core-2018.0 intel_hpc_platform_core-intel-runtime-2018.0 intel_hpc_platform_cpu_sdvis-single-node-2018.0 intel_hpc_platform_firmware_high-performance-fabric-2018.0 intel_hpc_platform_high-performance-fabric-2018.0 intel_hpc_platform_hpc-cluster-2018.0 intel_hpc_platform_kernel_version_core-2018.0 intel_hpc_platform_libfabric_high-performance-fabric-2018.0 intel_hpc_platform_libraries_core-intel-runtime-2018.0 intel_hpc_platform_libraries_sdvis-cluster-2018.0 intel_hpc_platform_libraries_sdvis-core-2018.0 intel_hpc_platform_libraries_second-gen-xeon-sp-2019.0 intel_hpc_platform_linux_based_tools_present_core-intel-runtime-2018.0 intel_hpc_platform_memory_sdvis-cluster-2018.0 intel_hpc_platform_memory_sdvis-single-node-2018.0 intel_hpc_platform_minimum_memory_requirements_compat-hpc-2018.0 intel_hpc_platform_minimum_storage intel_hpc_platform_minimum_storage_sdvis-cluster-2018.0 intel_hpc_platform_minimum_storage_sdvis-single-node-2018.0 intel_hpc_platform_mount intel_hpc_platform_perl_core-intel-runtime-2018.0 intel_hpc_platform_rdma_high-performance-fabric-2018.0 intel_hpc_platform_sdvis-cluster-2018.0 intel_hpc_platform_sdvis-core-2018.0 intel_hpc_platform_sdvis-single-node-2018.0 intel_hpc_platform_second-gen-xeon-sp-2019.0 intel_hpc_platform_subnet_management_high-performance-fabric-2018.0 intel_hpc_platform_version_compat-hpc-2018.0 intel_hpc_platform_version_core-2018.0 intel_hpc_platform_version_core-intel-runtime-2018.0 intel_hpc_platform_version_high-performance-fabric-2018.0 intel_hpc_platform_version_hpc-cluster-2018.0 intel_hpc_platform_version_sdvis-cluster-2018.0 intel_hpc_platform_version_sdvis-core-2018.0 
intel_hpc_platform_version_sdvis-single-node-2018.0 intel_hpc_platform_version_second-gen-xeon-sp-2019.0 iozone_disk_bandwidth_performance kernel_parameter_preferred kernel_parameter_uniformity kernel_version_uniformity local_disk_storage lsb_libraries lshw_disks lshw_hardware_uniformity memory_uniformity mpi mpi_bios mpi_environment mpi_ethernet mpi_libfabric mpi_local_functionality mpi_multinode_functionality mpi_prereq_admin mpi_prereq_user network_time_uniformity node_process_status opa_admin opa_base opa_user osu_allgather osu_allgatherv osu_allreduce osu_alltoall osu_alltoallv osu_barrier osu_bcast osu_benchmarks_blocking_collectives osu_benchmarks_non_blocking_collectives osu_benchmarks_point_to_point osu_bibw osu_bw osu_gather osu_gatherv osu_iallgather osu_iallgatherv osu_iallreduce osu_ialltoall osu_ialltoallv osu_ialltoallw osu_ibarrier osu_ibcast osu_igather osu_igatherv osu_ireduce osu_iscatter osu_iscatterv osu_latency osu_mbw_mr osu_reduce osu_reduce_scatter osu_scatter osu_scatterv perl_functionality precision_time_protocol privileged_user python_functionality rpm_snapshot rpm_uniformity second-gen-xeon-sp second-gen-xeon-sp_parallel_studio_xe_runtimes_2019.0 second-gen-xeon-sp_priv second-gen-xeon-sp_user select_solutions_redhat_openshift_base select_solutions_redhat_openshift_plus select_solutions_sim_mod_benchmarks_base_2018.0 select_solutions_sim_mod_benchmarks_plus_2018.0 select_solutions_sim_mod_benchmarks_plus_second_gen_xeon_sp select_solutions_sim_mod_priv_base_2018.0 select_solutions_sim_mod_priv_plus_2018.0 select_solutions_sim_mod_priv_plus_second_gen_xeon_sp select_solutions_sim_mod_user_base_2018.0 select_solutions_sim_mod_user_plus_2018.0 select_solutions_sim_mod_user_plus_second_gen_xeon_sp services_status sgemm_cpu_performance shell_functionality single std_libraries stream_memory_bandwidth_performance syscfg_settings_uniformity tcl_functionality tools
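To read the description of a particular framework definition, or to run an analysis against one or more of them, use the -X and -F options (described in Part 2 below). A minimal sketch; the nodefile and the chosen definitions are only examples, so substitute the ones relevant to your cluster:
% clck-analyze -X cpu_user
% clck -f nodefile -F cpu_user -F memory_uniformity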
Node Roles:
The role annotation keyword is used to assign a node to one or more roles. A role describes the intended functionality of a node.
For example, the following nodefile defines three nodes: node1 is the head node, node2 is a compute node, and node3 is a login node.
node1 # role: head
node2 # role: compute
node3 # role: login
Valid node role values are described below.
- boot – Provides software imaging / provisioning capabilities.
- compute – Is a compute resource (mutually exclusive with enhanced).
- enhanced – Provides enhanced compute resources, for example, contains additional memory (mutually exclusive with compute).
- external – Provides an external network interface.
- head – Alias for the union of boot, external, job_schedule, login, network_address, and storage.
- job_schedule – Provides resource manager / job scheduling capabilities.
- login – Is an interactive login system.
- network_address – Provides network addresses to the cluster, for example, via DHCP.
- storage – Provides network storage to the cluster, for example, NFS.
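As a sketch, a nodefile for a small cluster using several of these roles might look like the following (hypothetical hostnames; head is simply an alias covering the boot, external, job_schedule, login, network_address, and storage roles):
node1 # role: head
node2 # role: compute
node3 # role: compute
node4 # role: enhanced
node5 # role: storage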
More Information:
- Using Intel Cluster Checker (Part 1)
- Using Intel Cluster Checker (Part 2)
- Using Intel Cluster Checker (Part 3)
User Guide:
Using Intel Cluster Checker (Part 2)
This is a continuation of Using Intel Cluster Checker (Part 1).
Setup Environment Variables
In your .bashrc, add the following:
export CLCK_ROOT=/usr/local/intel/cc2019/clck/2019.10
export CLCK_SHARED_TEMP_DIR=/scratch/tmp
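After sourcing your .bashrc again (or opening a new shell), a quick sanity check that the variables are visible (a minimal sketch):
% echo $CLCK_ROOT
% echo $CLCK_SHARED_TEMP_DIR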
Command Line Options
-c / --config=FILE: Specifies a configuration file. The default configuration file is CLCK_ROOT/etc/clck.xml.
-C / --re-collect-data: Attempts to re-collect any missing or old data for use in analysis. This option only applies to data collection.
-D / --db=FILE: Specifies the location of the database file. This option works in clck-analyze and in clck, but not currently in clck-collect.
-f / --nodefile: Specifies a nodefile containing the list of nodes, one per line. If a nodefile is not specified for clck or clck-collect, a Slurm query will be used to determine the available nodes. If no nodefile is specified for clck-analyze, the nodes already present in the database will be used.
-F / --fwd=FILE: Specifies a framework definition. If a framework definition is not specified, the health framework definition is used. This option can be used multiple times to specify multiple framework definitions. To see a list of available framework definitions, use the command line option -X list.
-h / --help: Displays the help message.
-l / --log-level: Specifies the output level. Recognized values are (in increasing order of verbosity): alert, critical, error, warning, notice, info, and debug. The default log level is error.
-M / --mark-snapshot: Takes a snapshot of the data used in an analysis. The string used to mark the data cannot contain the comma character "," or spaces. This option only applies to analysis.
-n / --node-include: Displays only the specified nodes in the analyzer output.
-o / --logfile: Specifies a file where the results from the run are written. By default, results are written to clck_results.log.
-r / --permutations: Number of permutations of nodes to use when running cluster data providers. By default, one permutation will run. This option only applies to data collection.
-S / --ignore-subclusters: Ignores the subcluster annotations in the nodefile. This option only applies to data collection.
-t / --threshold-too-old: Sets the minimum number of days since collection that will trigger a "data too old" error. This option only applies to data analysis.
-v / --version: Prints the version and exits.
-X / --FWD_description: Prints a description of the framework definition if available. If the value passed is "list", it prints a list of found framework definitions.
-z / --fail-level: Specifies the lowest severity level at which found issues fail. Recognized values are (in increasing order of severity): informational, warning, and critical. The default level at which issues cause a failure is warning.
--sort-asc: Organizes the output in ascending order of the specified field. Recognized values are "id", "node", and "severity".
--sort-desc: Organizes the output in descending order of the specified field. Recognized values are "id", "node", and "severity".
For more information about the available command line options and their uses, run Intel® Cluster Checker with the -h option, or see the man pages.
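As a sketch of how several of these options combine in practice (the paths, database file, and framework definition below are only illustrative):
% clck -f nodefile -F memory_uniformity -l info -o /scratch/clck/results.log    # example results path; more verbose logging
% clck-analyze -D /scratch/clck/clck.db -t 7 -z critical                        # example database path; flag data older than 7 days, fail only on critical issues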
More Information:
- Using Intel Cluster Checker (Part 1)
- Using Intel Cluster Checker (Part 2)
- Using Intel Cluster Checker (Part 3)
User Guide:
Using Intel Cluster Checker (Part 1)
What is Intel Cluster Checker?
Intel® Cluster Checker provides tools to collect data from the cluster, analyze the collected data, and produce a clear report of the analysis. Using Intel® Cluster Checker helps you quickly identify issues and improve utilization of resources.
Intel® Cluster Checker verifies the configuration and performance of Linux®-based clusters through analysis of cluster uniformity, performance characteristics, functionality, and compliance with Intel® High Performance Computing (HPC) specifications. Data collection tools and analysis provide actionable remedies to identified issues. Intel® Cluster Checker tools and analysis are ideal for developers, administrators, architects, and users who need to easily identify issues within a cluster.
Installing Intel Cluster Checker Using Yum Repository
If you are installing from the Yum repository, see Intel Cluster Checker 2019 Installation.
Otherwise, if you have the tar.gz package, you can simply untar it.
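A minimal sketch of the tarball route; the archive name below is a placeholder, so use the file you actually downloaded and then run the bundled installer from the extracted directory:
% tar -xzf l_clck_2019.x.y.tar.gz    # placeholder archive name
% cd l_clck_2019.x.y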
Environment Setup
# source /usr/local/intel/2018u3/bin/compilervars.sh intel64
# source /usr/local/intel/2018u3/mkl/bin/mklvars.sh intel64
# source /usr/local/intel/2018u3/impi/2018.3.222/bin64/mpivars.sh intel64
# source /usr/local/intel/2018u3/parallel_studio_xe_2018/bin/psxevars.sh intel64
# export MPI_ROOT=/usr/local/intel/2018u3/impi/2018.3.222/intel64
# source /usr/local/intel/cc2019/clck/2019.10/bin/clckvars.sh
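After sourcing clckvars.sh, the clck tools should be on your PATH. A quick check (sketch):
% which clck
% clck -v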
Create a nodefile and list the hosts in it:
% vim nodefile
node1
node2
node3
Running Intel Cluster Checker
Make sure you can SSH to the nodes without a password. See SSH Login without Password.
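If passwordless SSH is not yet set up, a minimal sketch (hostnames are examples; adjust key type and options to your site's policy):
% ssh-keygen -t rsa -b 4096    # accept the defaults; leave the passphrase empty for unattended access
% ssh-copy-id node1
% ssh-copy-id node2
% ssh-copy-id node3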
% clck -f nodefile
Example output from a run:
Running Collect
................................................................
Running Analyze

SUMMARY
Command-line: clck -f nodefile
Tests Run: health_base
**WARNING**: 3 tests failed to run. Information may be incomplete. See clck_execution_warnings.log for more information.
Overall Result: 8 issues found - HARDWARE UNIFORMITY (2), PERFORMANCE (2), SOFTWARE UNIFORMITY (4)
-----------------------------------------------------------------------------------------------------------------------------------------
8 nodes tested:         node010, node[003-009]
0 nodes with no issues:
8 nodes with issues:    node010, node[003-009]
-----------------------------------------------------------------------------------------------------------------------------------------

FUNCTIONALITY
No issues detected.

HARDWARE UNIFORMITY
The following hardware uniformity issues were detected:
  1. The InfiniBand PCI physical slot for device 'MT27800 Family [ConnectX-5]'

PERFORMANCE
The following performance issues were detected:
  1. Zombie processes detected.
     1 node: node010
  2. Processes using high CPU.
     7 nodes: node010, node[003,005-009]

SOFTWARE UNIFORMITY
The following software uniformity issues were detected:
  1. The OFED version, 'MLNX_OFED_LINUX-4.5-1.0.1.0 (OFED-4.5-1.0.1)', is not uniform.
     .....
     5 nodes: node[003-004,006-007,009]
  2. The OFED version, 'MLNX_OFED_LINUX-4.3-1.0.1.0 (OFED-4.3-1.0.1)', is not uniform.
     .....
     3 nodes: node010, node[005,008]
  3. Environment variables are not uniform across the nodes.
     .....
  4. Inconsistent Ethernet driver version.
     .....

See the following files for more information: clck_results.log, clck_execution_warnings.log
Intel MPI Library Troubleshooting
If you are an administrator and want to verify that the cluster is set up to work with the Intel® MPI Library, run the following:
% clck -f nodefile -F mpi_prereq_admin
If you are a non-privileged user and want to verify that the cluster is set up to work with the Intel® MPI Library, run the following:
% clck -f nodefile -F mpi_prereq_user
More Information:
- Using Intel Cluster Checker (Part 1)
- Using Intel Cluster Checker (Part 2)
- Using Intel Cluster Checker (Part 3)
User Guide:
Intel® Enpirion® Multi-Rail Power Sequencing and Monitoring
The Multi-Rail Power Sequencer and Monitor is a programmable module within the Intel MAX 10 FPGA and MAX V CPLD. The sequencer can monitor up to 144 power rails to meet the full range of power requirements for FPGAs, ASICs, CPUs, and other processors. It can be easily configured and scaled with our user-friendly Platform Designer GUI.
Intel FPGA Technology Day (IFTD 2020)
Event Overview
Intel® FPGA Technology Day (IFTD) is a one-day virtual event that showcases the latest Intel FPGA products and solutions through a series of webinars and demonstrations from Intel, partners, and customers. During this day-long event, you will learn how Intel® FPGAs, Intel® eASIC™ structured ASICs, Intel® SmartNICs, and Intel® Enpirion® power solutions can shorten your design cycle, help to overcome your many design challenges, and accelerate business innovation. In the IFTD 2020 virtual exhibition space, you will see Intel® FPGA technology in action in the edge, in data centers, and in the cloud. IFTD 2020 takes place on November 18, 2020.
To register, see https://plan.seek.intel.com/IFTD-2020_RegPage?trackSrc=Invite_registernow_CTA
Keynote
- Accelerate Your Future with the broad Intel Portfolio including Intel FPGAs
- Accelerate your Cloud and Enterprise Applications with Intel Data Center Solutions including Intel FPGAs
- Conquer Your Most Difficult Embedded Challenges
- How to use Intel FPGAs and eASIC Structured ASICs to develop solutions for a wide array of applications
Find CPU and GPU Performance Headroom using Roofline Analysis
Join Technical Consulting Engineer and HPC programming expert Cedric Andreolli for a session covering:
- How to perform GPU headroom and GPU caches locality analysis using Advisor Roofline extensions for oneAPI and OpenMP
- An introduction to a new memory-level Roofline feature that helps pinpoint which specific memory level (L1, L2, L3, or DRAM) is causing the bottleneck
- A walkthrough of Intel Advisor’s improved user interface
To watch the video, see https://techdecoded.intel.io/essentials/find-cpu-gpu-performance-headroom-using-roofline-analysis/#gs.fpbz93
Intel AI Summit 2020
Intel AI Summit 2020 is here. To register or get more information, visit https://webinar.intel.com/AISummit2020.
Intel Launches 11th Gen Intel Core and Intel Evo (code-named “Tiger Lake”)
Intel released 11th Gen Intel® Core™ mobile processors with Iris® Xe graphics (code-named “Tiger Lake”). The new processors break the boundaries of performance with unmatched capabilities in productivity, collaboration, creation, gaming and entertainment on ultra-thin-and-light laptops. They also power the first class of Intel Evo platforms, made possible by the Project Athena innovation program. (Credit: Intel Corporation)
- Intel launches 11th Gen Intel® Core™ processors with Intel® Iris® Xe graphics, the world’s best processors for thin-and-light laptops1, delivering up to 2.7x faster content creation2, more than 20% faster office productivity3 and more than 2x faster gaming plus streaming4 in real-world workflows over competitive products.
- Intel® Evo™ platform brand introduced for designs based on 11th Gen Intel Core processors with Intel Iris Xe graphics and verified through the Project Athena innovation program’s second-edition specification and key experience indicators (KEIs).
- More than 150 designs based on 11th Gen Intel Core processors are expected from Acer, Asus, Dell, Dynabook, HP, Lenovo, LG, MSI, Razer, Samsung and others.
Accelerate Insights with AI and HPC Combined by Intel
In this presentation, the presenter gives an overview of the relevant technology for Artificial Intelligence, including hardware platforms and software stacks, with a special focus on how to enable successful development of AI solutions. The presenter looks at how to do this on the datacenter technology you know and use today, as well as on technology specific to AI workloads. The talk is also illustrated with practical customer examples.