Posts filed under ‘Parallel Programming’

Phi and Kepler Run Monte Carlo Race

There have been a number of efforts lately to delineate the differences in performance, portability and functionality between GPUs and the new Xeon Phi coprocessors, with some organizations benchmarking with industry-specific algorithms.

Read more

September 25, 2013 at 8:24 am

NC Researchers Claim Pathway for Processor Scalability

Scalability is one of the biggest challenges I have found in parallel computing, so research of this kind is quite interesting.

RESEARCH TRIANGLE PARK, N.C., Apr 16 — Researchers sponsored by Semiconductor Research Corporation (SRC), the world’s leading university-research consortium for semiconductors and related technologies, today announced that they have identified a path to overcome challenges for scaling multi-core semiconductors by successfully addressing how to scale memory communications among the cores. The results can lead to continued design of ever-smaller integrated circuits (ICs) into computer hardware without expensive writing of all new software from scratch to accommodate the increased capabilities.

Read more

April 17, 2012 at 1:17 am

OpenMP Announces Improvements for Multicore and Accelerators

CHAMPAIGN, Ill., Mar 27 — OpenMP, the de-facto standard for parallel programming on shared memory systems, continues to extend its reach beyond pure HPC to include embedded systems, multicore and real time systems. A new version is being developed that will include support for accelerators, error handling, thread affinity, tasking extensions and Fortran 2003. The OpenMP consortium welcomes feedback from all interested parties and will use this feedback to improve the next version of OpenMP.

Read more…

April 1, 2012 at 10:29 am

AMD Opens Up Heterogeneous Computing


“HSA, which until recently was known as the Fusion architecture, is AMD’s platform design for integrating CPU and GPU cores onto the same chip. But HSA is more than AMD’s attempt to define an architecture for internal use, as was the case for Fusion. Rather HSA is an open specification that AMD wants the industry to adopt as the de facto platform for heterogeneous computing…”


March 14, 2012 at 10:21 pm

DDN solutions now available through Penguin Computing

DataDirect Networks (DDN) has announced that Penguin Computing has signed an agreement to offer DDN’s award-winning suite of HPC and Big Data storage solutions to its global customer base. Effective immediately, customers will be able to source DDN products from Penguin Computing, including the SFA storage platforms, the GRIDScaler and EXAScaler parallel file storage systems, NAS Scaler, DDN’s enterprise scale-out NAS platform, and WOS, the company’s hyperscale object storage system.


March 14, 2012 at 10:11 pm

CPU affinity – Why is it needed, and how to set it?

CPU affinity (or processor affinity) is a feature of operating systems such as Windows and Linux that lets you select the specific CPUs or cores your program runs on. In a multi-core or multi-processor system, the OS scheduler normally decides which CPU each process runs on. However, you can still override this decision by specifying the CPUs that your program is allowed to run on.

Why is CPU affinity needed?

You may ask, “Why do I need it if the OS is handling everything for me?” You are right: in most cases, you won’t need this feature. However, if runtime performance is a concern, it is sometimes worth trying CPU affinity. For demonstration, I wrote a simple parallel program that performs some computations on multiple cores. When I ran this program on a 12-core machine (two processors, six cores per processor) with and without CPU affinity, I got the runtime performance below.

CPU affinity improves the runtime performance because a process that stays on the same core keeps reusing that core’s cache, which reduces cache misses. In a NUMA system, setting CPU affinity and also allocating memory on the RAM that is local to those CPUs (and therefore faster for them to access) can speed up the process as well.

How to set it?

Both Windows and Linux offer two ways to set the CPU affinity.

In Windows:
Method 1: Set the CPU affinity using Task Manager

  • Open Task Manager by pressing Ctrl + Alt + Delete and selecting Task Manager
  • Select the Processes tab
  • Right-click the process whose CPU affinity you want to set
  • Select “Set affinity…” from the context menu
  • Tick the CPUs that you want your program to run on

Method 2: Set the CPU affinity from your source code
You can use the Windows API function SetProcessAffinityMask to set the CPU affinity from your program code.

In Linux:

Method 1: Launch the program from the command line using taskset
The command below launches gedit on the first and fourth CPUs (numbered 0 and 3, because taskset counts CPUs from 0).

taskset -c 0,3 gedit

Method 2: Set the CPU affinity from your source code
You can use the function sched_setaffinity, declared in sched.h, to manage CPU affinity from your code.
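As a rough illustration of Method 2 on Linux, the sketch below pins the calling process to a single CPU with sched_setaffinity. It assumes glibc, where the cpu_set_t type and the CPU_* macros are only visible when _GNU_SOURCE is defined before sched.h is included. The helper name pin_to_first_allowed_cpu is made up for this example and is not part of any library. It picks the lowest-numbered CPU the process is already allowed to use, rather than hard-coding CPU 0, so it should also behave when the process was started under taskset or inside a restricted container.

```c
#define _GNU_SOURCE   /* needed in glibc for cpu_set_t and the CPU_* macros */
#include <sched.h>

/*
 * Hypothetical helper (name invented for this sketch): pin the caller to the
 * lowest-numbered CPU it is currently allowed to run on.
 * Returns that CPU number, or -1 if either system call fails.
 */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);

    /* A pid of 0 means "the caller". Read the current affinity mask first. */
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            /* Build a mask containing only this CPU and apply it. */
            cpu_set_t target;
            CPU_ZERO(&target);
            CPU_SET(cpu, &target);
            if (sched_setaffinity(0, sizeof(target), &target) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;  /* no allowed CPU found in the mask */
}
```

A program would typically call this helper once near the start of main, before creating its worker threads, and check the return value. To allow several CPUs instead of one, call CPU_SET once per CPU on the same mask before passing it to sched_setaffinity.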

August 5, 2011 at 2:40 pm 1 comment

AMD Intros OpenCL University Kit

SUNNYVALE, Calif., Feb. 23 — AMD today announced the introduction of the OpenCL University Kit, a set of materials that can be leveraged by any university to assist them in teaching a semester course in OpenCL programming.

This effort underscores AMD’s commitment to the educational community, which currently includes a number of strategic research initiatives, to enable the next generation of software developers and programmers with the knowledge needed to lead the era of heterogeneous computing. OpenCL, the only non-proprietary industry standard available today for true heterogeneous computing, helps developers to harness the full compute power of both the CPU and GPU to create innovative applications for vivid computing experiences.

Read more

February 24, 2011 at 7:13 am

Command line error D8016 : ‘/openmp’ and ‘/clr:pure’ command-line options are incompatible

Did you get this error when compiling an OpenMP program with Visual C++?

Error: “Command line error D8016 : ‘/openmp’ and ‘/clr:pure’ command-line options are incompatible”

That is because Microsoft’s OpenMP implementation is not compatible with Pure MSIL Common Language Runtime (/clr:pure) or Safe MSIL Common Language Runtime (/clr:safe). To fix this issue, use the /clr command-line option instead of /clr:pure. Follow these steps to change the option:

  1. Open the Project menu, or right-click the project name in Solution Explorer
  2. Choose Properties to open the properties window of the project
  3. Select General under Configuration Properties
  4. In the Common Language Runtime support option, choose /clr
  5. Click OK and recompile your project

January 25, 2011 at 10:42 pm 2 comments

Intel is preparing a 1000-core processor

According to Timothy Mattson, a scientist at Intel who spoke at the Supercomputing Conference 2010 in New Orleans, Intel is working on a concept chip with 1,000 cores, based on the scalable architecture of its 48-core processor. Mattson reportedly said that, at that scale, a 1,000-core chip could match an entire datacenter in use today. Read more

A 1000-core processor? This is not new, because Scottish scientists recently created a 1000-core processor on a single FPGA chip (Read more). This gave me the feeling that the supercomputing era is back. But this hardware must be very expensive, because FPGAs are never cheap. So, for anyone looking for a powerful and cheap multi-core processor, why not consider GPUs? Nvidia now has a 448-core chip, and ATI has GPUs with up to 1000 cores. Although there are limitations on writing programs that run on GPUs, using them is a good bet if we are on a tight budget.

Anyway, I hope that this Intel 1000-core processor will come soon.

January 12, 2011 at 1:39 pm

Facing the Multicore-Challenge

“Facing the Multicore-Challenge: Aspects of New Paradigms and Technologies in Parallel Computing”, edited by Rainer Keller, David Kramer, and Jan-Philipp Weiss is an outcome of the conference titled “Facing the Multicore” held at the Heidelberger Akademie der Wissenschaften, March 17–19, 2010. The conference focused on topics related to the impact of multicore and coprocessor technologies in science and for large-scale applications in an interdisciplinary environment. Read more

January 12, 2011 at 1:37 pm
