AMD’s 96 Core Epyc Genoa CPU faster than Intel’s Sapphire Rapid

Taken from the article “AMD’s 96 Core Epyc Genoa CPU is Over 70% Faster than Intel’s Xeon Sapphire Rapids Flagship in 2S Mode”

Linux kernel Compilation
….The dual Epyc 9554 (64 cores per socket) is 25-30% faster than the top Intel combo, while the Dual 9654 (96 cores per socket) is over 70% faster than the Sapphire Rapids-SP flagship….

Kernel-based Virtual Machine (KVM)
…..64-core Epyc 9554 pair is 25% faster than the latter makes it hard to defend the Intel offering. The Xeon Platinum 8490H mostly competes with the last-gen Milan and Milan-X flagships…..

MariaDB and Nginx
…..The Genoa parts are faster, with a lead of 10% to 40%, while the 64-core Milan-X deals with Sapphire Rapids…. 

AMD’s 96 Core Epyc Genoa CPU is Over 70% Faster than Intel’s Xeon Sapphire Rapids Flagship in 2S Mode


Intel OpenVINO 2022.2 is available

Key Updates includes:

Broader Model & Hardware Support

  • Preview support for upcoming Intel® processors, including the Intel® Data Center GPU Flex Series and Intel® Arc™ GPU
  • Support for 4th Gen Intel® Xeon Scalable processor (code named Sapphire Rapids)
  • Reduced memory consumption when using dynamic shapes on CPU to improve efficiency of NLP applications

Portability and Performance

Introducing new performance hint “Cumulative throughput” in AUTO device plug-in, enabling multiple accelerators (e.g. multiple GPUs) to be used at once maximizing inferencing performance.

To download the latest release, do take a look at Intel® Distribution of OpenVINO™ Toolkit

Changes to SSH Server on DevCloud

When connecting to the DevCloud for oneAPI

$ ssh devcloud @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! Someone could be eavesdropping on you right now (man-in-the-middle attack)! It is also possible that a host key has just been changed. The fingerprint for the ED25519 key sent by the remote host is SHA256:/Dlip01tdMyRmhMDc870Z4Uk7AancwwoTnbb0EZajK0. Please contact your system administrator. Add correct host key in /home/<user_name>/.ssh/known_hosts to get rid of this message. Offending ECDSA key in /home/<user_name>/.ssh/known_hosts:# Host key for ssh.devcloud.intel.com has changed and you have requested strict checking. Host key verification failed. kex_exchange_identification: Connection closed by remote host Connection closed by UNKNOWN port 65535

Cause:

DevCloud have just migrated oneAPI DevCloud to a new SSH tunnel server and upgraded the SSH server version to OpenSSH _8.2p1. For this reason the DevCloud are unable to reuse the old SSH fingerprint for the new server.

Remediation:

Step 1: Remove the Offending FingerPrint(s)

Method 1: rename your existing ~/.ssh/known_hosts file to something else, such as ~/.ssh/known_hosts.yymmdd

$ mv ~/.ssh/known_hosts ~/.ssh/known_hosts.220623 

Method 2: remove the offending host SSH fingerprint only:

$ ssh-keygen -R ssh.devcloud.intel.com # Host ssh.devcloud.intel.com found: line 1 # Host ssh.devcloud.intel.com found: line 2 # Host ssh.devcloud.intel.com found: line 3 /home/<user_name>/.ssh/known_hosts updated. Original contents retained as /home/<user_name>/.ssh/known_hosts.old 

Step 2: reconnect to the DevCloud and accept the new key.

$ ssh devcloud The authenticity of host 'ssh.devcloud.intel.com (12.229.61.118)' can't be established. ED25519 key fingerprint is SHA256:/Dlip01tdMyRmhMDc870Z4Uk7AancwwoTnbb0EZajK0. This key is not known by any other names Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added 'ssh.devcloud.intel.com' (ED25519) to the list of known hosts. 

Intel Distribution OpenVINO Toolkit 2022.1 is available!

For more information, do take a look at Intel® Distribution of OpenVINO™ Toolkit

Updated, Cleaner API

  • The new OpenVINO API 2.0 was introduced, which aligns OpenVINO inputs and outputs with frameworks. Input and output tensors use native framework layouts and element types. 
  • The API parameters in Model Optimizer have been reduced to minimize complexity. Performance has been significantly improved for model conversion on Open Neural Network Exchange (ONNX*) models.

Broader Model Support

  • With Dynamic Input Shapes capabilities on CPU, OpenVINO is able to adapt to multiple input dimensions in a single model providing more complete NLP support. Support for Dynamic Shapes on additional XPUs is expected in a future dot release.
  • New models with a focus on NLP and a new category, Anomaly Detection, and support for conversion and inference of select PaddlePaddle* models:
    • Pretrained models for anomaly segmentation focus on industrial inspection making speech denoising trainable, plus updates on speech recognition and speech synthesis
    • Combined demonstration that includes noise reduction, speech recognition, question answering, translation, and text to speech
    • Public models with a focus on NLP ContextNet, Speech-Transformer, HiFi-GAN, Glow-TTS, FastSpeech2, and Wav2Vec

Portability and Performance

  • New AUTO plug-in self-discovers available system inferencing capacity based on model requirements so applications no longer need to know their compute environment in advance.
  • Automatic batching functionality via code hints automatically scale batch size based on XPU and available memory.
  • Built with 12th generation Intel® Core™ processors (formerly code named Alder Lake) in mind. Supports the hybrid architecture necessary to deliver enhancements for high performance inferencing on CPUs and integrated GPUs.

oneAPI DevSummit at ISC 2022

The Event Website can be found here. OneAPI DevSummit ISC 2022

Join us for this year’s all-remote technical conference to see the growing momentum of oneAPI and learn about how the community is using oneAPI on various platforms such as ARM, NVIDIA, Intel and more for HPC and AI workloads.This year offers a full day of hands-on tutorials, tech talks, and workshops spanning all things high-performance computing and AI: hardware, oneAPI software tools, best-practice techniques, and more to advance and deploy next-generation innovations that scale across platforms.

What you’ll get:

  • Fresh new content from industry-leading experts—SiPearl, Argonne, Pázmány Péter Catholic University, Durham University, and more.
  • Topics covering cross-architecture computing, SYCL programming challenges and opportunities, AI analytics, and Exascale.
  • Advance insight about the latest advancements shaping the future of high-performance computing.
  • A deep dive into cross-architecture software development—tech talks, how-to’s, and hands-on training sessions

Date: 27th May 2022

Time: 9am to 6:30pm CET

Register

Learn to Accelerate TensorFlow on Intel® Architecture with Minimal Code Changes

The OpenVINO™ integration with TensorFlow enables you to speed up the TensorFlow workflow by adding just two lines of code. Enhance performance on Intel platforms while using the familiar TensorFlow APIs. Download this whitepaper to get started.

Do sign up and get the white papers Learn to Accelerate TensorFlow on Intel® Architecture with Minimal Code Changes

Efficient Heterogeneous Parallel Programming Using OpenMP

This article is taken from Intel “Efficient Heterogeneous Parallel Programming Using OpenMP”. In this article, we will show you how to do CPU+GPU asynchronous calculations using OpenMP.

In some cases, offloading computations to an accelerator like a GPU means that the host CPU sits idle until the offloaded computations are finished. However, using the CPU and GPU resources simultaneously can improve the performance of an application. In OpenMP® programs that take advantage of heterogenous parallelism, the master clause can be used to exploit simultaneous CPU and GPU execution. In this article, we will show you how to do CPU+GPU asynchronous calculation using OpenMP.
…..
…..
…..

The Intel® oneAPI DPC++/C++ Compiler was used with following command-line options:
‑O3 ‑Ofast ‑xCORE‑AVX512 ‑mprefer‑vector‑width=512 ‑ffast‑math ‑qopt‑multiple‑gather‑scatter‑by‑shuffles ‑fimf‑precision=low
‑fiopenmp ‑fopenmp‑targets=spir64=”‑fp‑model=precise”

…..
…..
…..
OpenMP provides true asynchronous, heterogeneous execution on CPU+GPU systems. It’s clear from our timing results and VTune profiles that keeping the CPU and GPU busy in the OpenMP parallel region gives the best performance. We encourage you to try this approach.

Intel: Efficient Heterogeneous Parallel Programming Using OpenMP (Best Practices to Keep the CPU and GPU Working at the Same Time)

Building a Deployment-Ready TensorFlow Model (Part 1)

This is an interesting 3-part article on OpenVINO Deep Learning Workbench.

Pruning deep learning models, combining network layers, developing for multiple hardware targets—getting from a trained deep learning model to a ready-to-deploy inference model seems like a lot of work, which it can be if you hand code it.

With Intel® tools you can go from trained model to an optimized, packaged inference model entirely online without a single line of code. In this article, we’ll introduce you to the Intel® toolkits for deep learning deployments, including the Intel® Distribution of OpenVINO™ toolkit and Deep Learning Workbench. After that, we’ll get you signed up for a free Intel DevCloud for the Edge account so that you can start optimizing your own inference models.

The No-Code Approach to Deploying Deep Learning Models on Intel® Hardware

For more information, see The No-Code Approach to Deploying Deep Learning Models on Intel® Hardware