Department of Computer Science, UNC Chapel Hill

LUGPU: Algorithms for Dense Linear Systems on Graphics Hardware


Most present-day computers include powerful GPUs (Graphics Processing Units) that can outperform even high-end CPUs in raw FLOPS when running at their peak. Consequently, there has been a lot of recent work on mapping different problems to GPUs to exploit their processing power. GPUs have been used for many scientific computations, ranging from fluid dynamics to sparse matrix solvers. Here we provide a GPGP solution to LU decomposition of dense matrices.
Please refer to the documentation for details regarding the API and the contents of the distribution. Also, please read through the system requirements below before using the library.

System Requirements

  • OS: Microsoft Windows XP/2000
  • RAM: At least as much system memory as the graphics card's video memory is required.
  • GPU: NVIDIA GeForce/Quadro family card with support for the following OpenGL extensions:
    1. EXT_framebuffer_object
    2. ARB_texture_rectangle
    3. ARB_fragment_program
  • The above requirements are met by NV30-based and later GPUs (GeForce FX and GeForce 6 series).
  • The library has been tested on the following cards:
    • GeForce 7800 Ultra/GTO
    • GeForce 6800 Ultra/GTO
    • Quadro FX 4000
    • Laptop graphics cards: QuadroFX 700 Go, QuadroFX 1000 Go, QuadroFX 1400 Go, GeForce 6800 Go
  • For reasonably high performance, we recommend a PC with an AGP 8X or PCI-Express NVIDIA GeForce 6800 GT or faster GPU.
  • Video RAM: The amount of video RAM determines the maximum matrix size.
  • Drivers: Latest drivers from NVIDIA (version 7772 or higher for Windows)
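
The requirements above can be checked before loading a matrix. The following is a minimal sketch, not part of the LUGPU distribution: the function names are hypothetical, the extension string is assumed to be the space-separated list returned by the OpenGL call `glGetString(GL_EXTENSIONS)`, and the size estimate assumes single-precision (4-byte) elements with a `copies` factor that stands in for however many matrix-sized buffers are actually resident in video memory.

```python
import math

# OpenGL reports extension names with a "GL_" prefix.
REQUIRED_EXTENSIONS = (
    "GL_EXT_framebuffer_object",
    "GL_ARB_texture_rectangle",
    "GL_ARB_fragment_program",
)


def missing_extensions(extension_string):
    """Return the required extensions absent from a space-separated list."""
    available = set(extension_string.split())
    return [name for name in REQUIRED_EXTENSIONS if name not in available]


def max_matrix_dimension(video_ram_bytes, bytes_per_element=4, copies=1):
    """Upper bound on n for an n x n matrix that fits in video memory.

    Assumption: `copies` matrix-sized buffers of `bytes_per_element`-byte
    values are held at once. Driver overhead and the maximum texture size
    are ignored, so the usable dimension may be smaller.
    """
    elements = video_ram_bytes // (bytes_per_element * copies)
    return math.isqrt(elements)


if __name__ == "__main__":
    reported = "GL_ARB_fragment_program GL_ARB_texture_rectangle"
    print(missing_extensions(reported))
    print(max_matrix_dimension(256 * 1024 * 1024))
```

Under these assumptions a 256 MB card bounds the matrix dimension at 8192; treat this as an optimistic estimate rather than a guarantee.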


  • ATI cards: ATI cards are not supported in the present release of LUGPU, mainly due to the lack of support for framebuffer objects and ARB_texture_rectangle in current ATI drivers. These cards may be supported in future releases.
  • LAPACK Consistency: This implementation emulates the LAPACK API but does not yet meet the exact specification. This may be fixed in future releases.
    • The info parameter may not produce exactly the same code as LAPACK.
    • The parameter lda is ignored and assumed to be equal to the corresponding matrix dimension.
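
For readers unfamiliar with the LAPACK conventions mentioned above, here is a small CPU reference sketch of an LU factorization with partial pivoting in the style of LAPACK's `getrf` routines. It is illustrative only and is not the LUGPU implementation: the function name is hypothetical, the matrix is stored column-major in a flat list, and the leading dimension is fixed to `n`, mirroring the note on lda. In LAPACK, a positive info value is the 1-based index of the first exactly zero pivot; LUGPU may report a different code.

```python
def getrf_reference(n, a):
    """Factor an n x n column-major matrix `a` (flat list) in place.

    Returns (ipiv, info). ipiv holds 1-based row interchanges; info is 0 on
    success, or the 1-based index of the first exactly zero pivot.
    """
    lda = n  # assumption from the text: lda equals the matrix dimension
    if n < 0 or len(a) < lda * n:
        raise ValueError("matrix storage is smaller than n * n")
    ipiv = [0] * n
    info = 0
    for k in range(n):
        # Partial pivoting: largest magnitude entry in column k, rows k..n-1.
        p = max(range(k, n), key=lambda i: abs(a[i + k * lda]))
        ipiv[k] = p + 1
        if a[p + k * lda] == 0.0:
            if info == 0:
                info = k + 1  # remember the first zero pivot and continue
            continue
        if p != k:
            for j in range(n):  # swap rows k and p across every column
                a[k + j * lda], a[p + j * lda] = a[p + j * lda], a[k + j * lda]
        pivot = a[k + k * lda]
        for i in range(k + 1, n):
            a[i + k * lda] /= pivot  # multipliers form the strict lower part (L)
        for j in range(k + 1, n):  # update the trailing submatrix
            for i in range(k + 1, n):
                a[i + j * lda] -= a[i + k * lda] * a[k + j * lda]
    return ipiv, info


if __name__ == "__main__":
    # Column-major storage of [[2, 1], [4, 4]].
    a = [2.0, 4.0, 1.0, 4.0]
    print(getrf_reference(2, a), a)
```

After the call, the upper triangle of `a` holds U and the strict lower triangle holds the multipliers of L, with an implied unit diagonal.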


©2003 Department of Computer Science, UNC Chapel Hill