Wednesday, August 17, 2016
JCUDA - ITERATIVE SOFT THRESHOLDING (IST) FOR SPARSE NUCLEAR MAGNETIC RESONANCE (NMR)
So I decided to look back at one of the projects I completed at ANU and give it a CUDA update.
Project Link-https://cs.anu.edu.au/courses/csprojects/15S1/Reports/Badruddin_Kamal_Report.pdf
Git-https://github.com/bkraad47/JCUDA_SPASE_NMR_IST
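At the core of IST is an element-wise soft-thresholding step, x <- sign(x) * max(|x| - lambda, 0), applied to the current spectrum estimate on every iteration. Below is a minimal CUDA sketch of just that step; the kernel and variable names are my own illustration and not the actual code or API from the repo above.

// soft_threshold.cu - element-wise soft thresholding, the core update in IST.
// Illustrative sketch only, not the exact code from the repo linked above.
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

__global__ void softThreshold(float *x, float lambda, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = x[i];
        float mag = fabsf(v) - lambda;           // shrink the magnitude by lambda
        x[i] = (mag > 0.0f) ? copysignf(mag, v)  // keep the sign if anything remains
                            : 0.0f;              // otherwise zero the coefficient out
    }
}

int main() {
    const int n = 1 << 20;
    const float lambda = 0.1f;

    // Fill a host buffer with some test values.
    float *h = new float[n];
    for (int i = 0; i < n; ++i) h[i] = (i % 2 ? 1.0f : -1.0f) * (i % 100) / 100.0f;

    // Copy to the device, run the kernel, copy back.
    float *d;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
    softThreshold<<<(n + 255) / 256, 256>>>(d, lambda, n);
    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("x[3] after thresholding: %f\n", h[3]);
    cudaFree(d);
    delete[] h;
    return 0;
}

The full IST loop then alternates this thresholding with a data-consistency step (re-imposing the measured samples in the time domain) until the spectrum converges; see the project report above for the details.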
Wednesday, July 20, 2016
Own the power of a supercomputer for just $200 (as of 5/6/2016)
So, I have always been an NVIDIA fanboy (no secret: I have owned AMD products before and don't knock them, it's just been a tradition, and yes, I was flipping over the GTX 1080), but AMD's RX 480 (still haven't used one, though I hope to get one soon) seems to have hit a home run. It costs $200 and delivers about 5 TFLOPS, which works out to roughly $40 per TFLOP (https://en.wikipedia.org/wiki/FLOPS). I moved away from gaming a few years ago and now mostly use GPUs for calculations, and $40 per TFLOP makes a very persuasive argument to move to AMD and OpenCL, compared with roughly $77.8 per TFLOP on NVIDIA's side.
Tuesday, April 12, 2016
Basic Introduction to HPC and Parallel Computing.
So, I've been pursuing my Masters at ANU for the last two years and have been doing some really neat stuff that might interest you. It involved a fair amount of C/C++ programming, and I got to use the supercomputer there. I also got to do some advanced AI, neural networks, and HCI work, as well as HPC and parallel computing. If you are looking to speed up your code or algorithms, HPC and parallel computing techniques may be quite useful.
So first and foremost, you must understand Amdahl's law - https://en.wikipedia.org/wiki/Amdahl%27s_law
Speedup = 1 / ((1 - P) + (P / N))
where P = parallel portion of the code and (1 - P) = sequential portion of the code,
N = number of processors.
A more detailed description can be found here- http://www.shodor.org/media/content/petascale/materials/UPModules/beginnersGuideHPC/moduleDocument_pdf.pdf
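To make that concrete, here is a tiny worked example (my own illustration, not from the guide above): with 90% of the work parallelisable (P = 0.9) on 8 processors, the speedup is 1 / (0.1 + 0.9/8) ≈ 4.7x, nowhere near 8x. The snippet below just evaluates the formula for a few processor counts; it is plain host-side code and compiles under nvcc or any C/C++ compiler.

// amdahl.cu - evaluate Amdahl's law for a fixed parallel fraction P
// over a range of processor counts N. Purely illustrative numbers.
#include <stdio.h>

static double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);   // Speedup = 1 / ((1 - P) + P/N)
}

int main(void) {
    const double p = 0.9;               // 90% of the code is parallelisable
    const int counts[] = {1, 2, 4, 8, 16, 64, 1024};
    for (int i = 0; i < 7; ++i) {
        printf("N = %4d -> speedup = %.2fx\n", counts[i], amdahl_speedup(p, counts[i]));
    }
    // Note the ceiling: as N grows, the speedup approaches 1 / (1 - P) = 10x.
    return 0;
}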
Vector Processing using SSE (links below)
One of the things I've been looking at is SIMD processor programming, and I've fallen in love with MPI and CUDA. There are some really cool blogs on this out there; I'll post a few I found useful here, and there's a small CUDA sketch after the links below.
Eric Holk's post- https://theincredibleholk.wordpress.com/2012/10/26/are-gpus-just-vector-processors/
If you want to try CUDA, SSE, or MPI, I highly recommend starting with the respective documentation and software, as it goes into great detail.
CUDA - https://developer.nvidia.com/cuda-zone (has amazingly good documentation)
JCUDA- http://www.jcuda.org/
SSE tutorial - http://neilkemp.us/src/sse_tutorial/sse_tutorial.html
MPI Tutorial - http://mpitutorial.com/tutorials/
MPI based fftw - http://www.sandia.gov/~sjplimp/docs/fft/README.html
And to analyze performance PAPI Profiler - https://icl.cs.utk.edu/projects/papi/wiki/Main_Page
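To tie the SIMD/vector idea back to CUDA: on a GPU each thread handles one element and the hardware runs them in lock-step groups, which is what makes the comparison to vector processors in Eric Holk's post so apt. Here is a minimal sketch (my own toy example, not taken from any of the links above) of an element-wise vector addition in CUDA:

// vector_add.cu - each thread adds one element; the GPU schedules thousands of
// such threads at once, which is the SIMD/vector-processing flavour discussed above.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host buffers with known values so the result is easy to check.
    float *ha = new float[n], *hb = new float[n], *hc = new float[n];
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    // Allocate device buffers and copy the inputs over.
    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // One thread per element, 256 threads per block.
    vectorAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("hc[0] = %f (expect 3.0)\n", hc[0]);

    cudaFree(da); cudaFree(db); cudaFree(dc);
    delete[] ha; delete[] hb; delete[] hc;
    return 0;
}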
Hope this helps.
Cheers
Raad