I am thrilled to share that GPU-accelerated TensorFlow plugins in PixInsight now work with AMD 7000 series GPUs (7600, 7800, 7900 XTX). It took a little while to figure out the build process and produce a working library. I hope this simple guide helps you get up and running with GPU acceleration in StarNet++ and other apps.
This guide has been updated to reflect that WSL2 GPU passthrough works on Windows with Ubuntu and the latest ROCm release. This allows Windows users to run PixInsight from WSL2 with GPU passthrough acceleration. I hope we see continued development of native Windows TensorFlow support, but for now this lets you run the Linux version almost as if it were native.
Requirements
- Ubuntu 24.04
- PixInsight (Latest version)
- AMD GPU (AMD 7000 series)
- GPU-enabled scripts to verify (StarNet++)
Optional: WSL2 ROCm Requirements
These steps are only needed if you run WSL2 Ubuntu inside Windows. Only tested on 7000 series GPUs.
- Install latest Radeon Driver available for Windows (as of 11/1/2024)
- Reboot
- Jump to the Install ROCM Windows WSL2 section.
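If you are unsure which install path applies to you, the sketch below guesses it from the kernel release string. It assumes that WSL2 kernels carry "microsoft" in their release name (for example 5.15.153.1-microsoft-standard-WSL2); the function name is_wsl is just an illustrative helper, not part of any tool.

```shell
# Report whether a kernel release string looks like a WSL2 kernel.
# Assumption: WSL2 kernels contain "microsoft" in `uname -r`.
is_wsl() {
    # $1: kernel release string (defaults to the running kernel)
    release="${1:-$(uname -r)}"
    case "$release" in
        *[Mm]icrosoft*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_wsl; then
    echo "WSL2 detected: use the Windows WSL2 install steps."
else
    echo "Native Linux detected: use the Ubuntu (no WSL2) install steps."
fi
```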
Install ROCm – Ubuntu (no WSL2)
These steps are for bare metal / VM running Ubuntu 24.04. Please be sure that you have completed all apt updates prior to installing and make sure you have plenty of free disk space available.
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
sudo usermod -a -G render,video $LOGNAME # Adding current user to Video, Render groups. See prerequisites.
sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.2.3/ubuntu/noble/amdgpu-install_6.2.60203-1_all.deb
sudo apt install ./amdgpu-install_6.2.60203-1_all.deb
amdgpu-install --usecase=rocm,dkms
echo "Please reboot system for all settings to take effect."
Be sure to reboot, as the script output above says. This process builds the kernel drivers and sets up the render/video group permissions so your user can access the GPU. GPU acceleration in PixInsight with TensorFlow-based apps will NOT work until you restart.
Note: If PixInsight fails with Qt crashes on Ubuntu, please be sure to install the following packages and their dependencies.
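After the reboot, a quick way to confirm the usermod step took effect is to check your group list. This is a minimal sketch; missing_groups is a hypothetical helper name, and it only checks the two groups named in the command above.

```shell
# Print which of the required groups (render, video) are absent
# from a space-separated group list such as the output of `id -nG`.
missing_groups() {
    missing=""
    for want in render video; do
        case " $1 " in
            *" $want "*) ;;                  # group present, nothing to do
            *) missing="$missing $want" ;;   # remember the missing group
        esac
    done
    # Drop the leading space before printing.
    echo "${missing# }"
}

result=$(missing_groups "$(id -nG)")
if [ -z "$result" ]; then
    echo "render and video group membership looks fine."
else
    echo "Not yet in group(s): $result (re-run usermod, then reboot)."
fi
```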
sudo apt-get install libqt5core5a libqt5svg5 libqt5webenginecore5 libqt5webengine5 libqt5webenginewidgets5 libqt5x11extras5
Install ROCm – Windows WSL2
These steps are for WSL2 on Windows 10/11 running Ubuntu 24.04
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.2.3/ubuntu/noble/amdgpu-install_6.2.60203-1_all.deb
sudo apt install ./amdgpu-install_6.2.60203-1_all.deb
wget https://repo.radeon.com/amdgpu/6.2.3/ubuntu/pool/main/h/hsa-runtime-rocr4wsl-amdgpu/hsa-runtime-rocr4wsl-amdgpu_1.14.0-2057403.24.04_amd64.deb
sudo apt install ./hsa-runtime-rocr4wsl-amdgpu_1.14.0-2057403.24.04_amd64.deb
amdgpu-install -y --usecase=wsl,rocm --no-dkms
echo "Please restart WSL2 session for these to take effect."
Verify with rocminfo
You should be able to run rocminfo to see that your GPU is detected and operating within ROCm on Ubuntu/WSL.
byron@3900X:~$ rocminfo
WSL environment detected.
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
Runtime Ext Version: 1.6
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: NO
==========
HSA Agents
==========
*******
Agent 1
*******
Name: CPU
Uuid: CPU-XX
Marketing Name: CPU
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Internal Node ID: 0
Compute Unit: 24
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 49331708(0x2f0bdfc) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 49331708(0x2f0bdfc) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1100
Marketing Name: AMD Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 16(0x10)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2371
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 2280
SDMA engine uCode:: 21
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 25080996(0x17eb4a4) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
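The full rocminfo listing is long, so it can help to reduce it to one line per GPU. The sketch below assumes the output layout shown above, where a GPU agent has a "Name:" line starting with gfx followed by a "Marketing Name:" line; gpu_agents is an illustrative helper name.

```shell
# Summarize GPU agents from rocminfo output read on stdin.
# Assumption: GPU agents have "Name: gfx..." followed by "Marketing Name: ...".
gpu_agents() {
    awk '
        /^[ \t]*Name:[ \t]*gfx/ { isa = $2 }      # remember the gfx target
        /Marketing Name:/ && isa != "" {
            sub(/^.*Marketing Name:[ \t]*/, "")    # keep only the product name
            sub(/[ \t]+$/, "")                     # trim trailing whitespace
            print isa ": " $0
            isa = ""
        }
    '
}

# Only call rocminfo if it is actually installed.
if command -v rocminfo >/dev/null 2>&1; then
    rocminfo | gpu_agents
fi
```

With the output above, this should print a single line naming gfx1100 and the RX 7900 XTX. No output means ROCm did not find a GPU agent.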
Download libtensorflow.so for AMD
I compiled this on Ubuntu 22.04 against ROCm 6.x using kernel 6.6. Please make sure you are on kernel 6.6 or newer (if 6.8 is available when you read this, use 6.8 or higher).
Please download the latest release.
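To check the kernel requirement without reading version strings by eye, something like the following could be used. It is a sketch that compares only the major and minor numbers; kernel_at_least is a hypothetical helper. Note that inside WSL2 `uname -r` reports the Microsoft kernel, so this check is mainly meaningful on bare metal or a VM.

```shell
# Succeed if a kernel release is at least MAJOR.MINOR.
# $1: release string (e.g. "6.8.0-45-generic"), $2: major, $3: minor
kernel_at_least() {
    major=${1%%.*}               # text before the first dot
    rest=${1#*.}                 # text after the first dot
    minor=${rest%%[!0-9]*}       # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 6 6; then
    echo "Kernel $(uname -r) meets the 6.6 baseline."
else
    echo "Kernel $(uname -r) is older than 6.6."
fi
```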
Updated on 07/02/2024 – Libtensorflow217 – compiled against ROCm 6.x
Install libtensorflow into PixInsight
Backup your original libtensorflow in PixInsight:
cd /opt/PixInsight/bin/lib
mkdir bak
mv libtensorflow* bak
Extract the downloaded file to /usr/local from the path where you downloaded it to.
sudo tar -C /usr/local -xzf /path/to/Downloads/libtensorflow217.tar.gz
Run ldconfig to refresh the shared library cache
sudo ldconfig /usr/local/lib
Exit/Restart PixInsight and try StarNet or run StarNet CLI. PixInsight won’t load this library if you don’t restart after running ldconfig.
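Before restarting PixInsight, you may want to confirm that the dynamic linker cache actually picked up the new library. The sketch below filters the output of `ldconfig -p` for libtensorflow entries and prints the resolved paths. The exact file names inside the archive are an assumption here, and tf_cache_paths is an illustrative helper name.

```shell
# Print the resolved paths of libtensorflow entries found in
# `ldconfig -p` output read on stdin.
tf_cache_paths() {
    # Cache lines look like: "libfoo.so.1 (libc6,x86-64) => /path/libfoo.so.1"
    awk '/libtensorflow/ && / => / { print $NF }'
}

# ldconfig may not be on PATH for non-root users, so guard the call.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | tf_cache_paths
fi
```

If the library was installed as described, the printed paths should point under /usr/local/lib. If nothing is printed, re-run the ldconfig step.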
Update Mesa & Drivers
Optional: PixInsight may need updated Mesa drivers on WSL2 and standard Ubuntu installations. The issue was more pronounced on Ubuntu 22.04, which prompted us to revise this guide to focus solely on 24.04. If you encounter graphics or performance issues, consider installing these updated components; otherwise you can skip this section on Ubuntu 24.04.
sudo add-apt-repository ppa:kisak/kisak-mesa
sudo apt update
sudo apt upgrade
Once this is complete, reboot, since the upgrade may also have updated the kernel blobs for your graphics driver.
GPU accelerated StarNet++
Now let’s verify that StarNet will use the AMD GPU. StarNet includes a libtensorflow.so in its directory that you will need to replace with my AMD build if you run the CLI version. If you run the tool in PixInsight, it should use the one we installed on the system library path.
bymiller@byron-X570:~/Downloads/StarNetv2CLI_linux$ sh run_starnet.sh
2024-05-12 21:01:54.012227: E external/local_xla/xla/stream_executor/plugin_registry.cc:90] Invalid plugin kind specified: DNN
2024-05-12 21:01:54.033232: E external/local_xla/xla/stream_executor/plugin_registry.cc:90] Invalid plugin kind specified: BLAS
Reading input image... Done!
Bits per sample: 16
Samples per pixel: 3
Height: 712
Width: 1048
Restoring neural network checkpoint... Done!
2024-05-12 21:01:54.384899: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-12 21:01:54.436196: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:812] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-05-12 21:01:54.458272: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:812] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-05-12 21:01:54.458309: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:812] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-05-12 21:01:54.458427: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:812] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-05-12 21:01:54.458460: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:812] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-05-12 21:01:54.458493: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:812] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-05-12 21:01:54.458511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 23512 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:0a:00.0
Total number of tiles: 15
2024-05-12 21:01:54.798149: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled
100% finished
Done!
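The line that matters in this log is the "Created device ... GPU:0" entry. As a rough sketch, you could capture the log and extract the device name as shown below. The helper name gpu_device_from_log and the log file name starnet.log are hypothetical, and the pattern assumes the log format shown above.

```shell
# Print the GPU name from a TensorFlow "Created device" log line (stdin).
# Prints nothing if no GPU device was created, which suggests a CPU fallback.
gpu_device_from_log() {
    sed -n 's/.*Created device .*device:GPU:[0-9]* .*name: \([^,]*\),.*/\1/p'
}

# Example usage (not run here); TensorFlow logs usually go to stderr,
# hence the 2>&1:
#   sh run_starnet.sh 2>&1 | tee starnet.log
#   gpu_device_from_log < starnet.log
```

For the run above this should print "Radeon RX 7900 XTX".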
Hooray! AMD users rejoice at beautiful GPU-accelerated StarNet++ (and other GPU-enabled PixInsight tools).
Radeontop (Bare metal/VM only)
You can install radeontop and run it in a shell while you run StarNet or other TensorFlow jobs to see the GPU load spike.
Note: This only works on a VM or bare metal. With GPU passthrough on WSL2, the GPU does not show up as a Linux device the way radeontop and nvtop expect. Use Windows Task Manager to view your GPU resources instead.
sudo apt install radeontop
View GPU load on Windows
On Windows, right-click the Start icon, select Task Manager, then click Performance and select GPU. When you run a PixInsight process that uses the GPU, you can see the load spike here.
Here you can see the memory and compute spike when running NoiseXTerminator. This process would usually take 20 minutes or longer on full-frame masters but now finishes in seconds.
AMD GPU Accelerated RC-Astro
I’m happy to report that the amazing RC-Astro tools all work with this setup as well.
StarXTerminator – Remove stars or create star mask.
NoiseXTerminator – AI Noise Removal.
BlurXTerminator – Deconvolution.
You can download the trial version of these from: https://www.rc-astro.com/pixinsight-installation-instructions/
Comments
Please leave a comment below if this works or doesn’t work for you. I have another tensorflow library built on 2.13 that may work with older AMD GPUs if this one doesn’t work for you.
May your skies be clear.
Hi Byron – thanks for sharing – this is a breakthrough! Managed to get it working on Arch with ubuntu 22.04 going in distrobox using your guide (starnet++ cli). Fingers are crossed for PixInsight.
All the best and thanks once again.
Cheers
Awesome news! It should work on Debian based systems. I’ll be trying to get it to work on Fedora as well next. Which video card do you have? I’d like to track ones it works on.
yep very awesome. good luck with Fedora – maybe Arch after that 😉 – the way pixinsight installs appears to not play ball w/ the distrobox sandboxing. i have a 7900xtx in the main rig – I do also have a 6900xt but that is in the home server for a windoze vm (isn’t getting much use – if i get some time i could spin up a ubuntu or fedora vm and see if it works on the older libraries).
Hi Byron –
I’m not a linux expert by any means but I stopped at the end of the first step (Install ROCM) because after the instal amdgpu-dkms and install rocm steps it said it couldn’t find either package.
Also it looks like linux mint is behind the times a bit, I can’t upgrade to any kernel past 6.5 at the moment.
I don’t have linux mint to test. Ubuntu has vendor supported ROCm and Fedora has upstreamed ROCm. I’ll be trying out the latest ROCm build to confirm on Ubuntu 22.04. Can’t wait for official support of 24.04
The AMD Adrenalin drivers for WSL2 do not support 6600 series as of yet and installing the linux headers in WSL2 only works by compiling them yourself (maybe update the tutorial on this or am I missing something?).
I’ll make it clearer. This is only tested on 7000 series as far as I know.
If someone has trouble with installing on UBUNTU 24: The ROCm drivers are still being built for UBUNTU v22.04. Solution can be found here: https://askubuntu.com/questions/1517236/rocm-not-working-on-ubuntu-24-04-desktop
We’re keeping a close eye on this and as 24.04 gets supported we’ll update our post! ty!
Works perfectly on Ubuntu 24.04 LTS with a 7900XT – StarXterminator went from 3m16s to 13s (roughly 15x speed!), thank you very much Byron!
Hi! How did you get kernel 6.6 on Ubuntu 22.04? The mainline kernel doesn’t work due to wrong libc version. I managed to get it manually compiled and installed, but when I install rocm, dkms complains about unsupported kernel. Should I try Ubuntu 24.04 instead? Thanks!
Thanks for sharing this guide! I am on WSL2 Windows 11 and have 24.04 LTS installed. PixInsight is already installed. When I try to start following your guide with the line below:
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
I get
Reading package lists… Done
Building dependency tree… Done
Reading state information… Done
E: Unable to locate package linux-headers-5.15.153.1-microsoft-standard-WSL2
E: Couldn’t find any package by glob ‘linux-headers-5.15.153.1-microsoft-standard-WSL2’
E: Unable to locate package linux-modules-extra-5.15.153.1-microsoft-standard-WSL2
Any ideas on what to do in this case?
The 24.04 stuff was just recently added to support, i’ll update the guide over the next few days to reference 24.04 support.