
Experiencing Ubuntu and a Yocto based distribution for the Nvidia Jetson AGX Orin

  • Khem 

Are you working on products using Nvidia Jetsons? You might be wondering which software distribution options are available to you. Having options is a good thing, but making a good choice is hard, and understanding the trade-offs is key. For Nvidia Tegra devices, L4T (Ubuntu) and OE4T (Yocto BSP) are realistically the two choices. Wondering which one fits your requirements? Read on!

Ubuntu (L4T)

Nvidia’s Tegra Developer Kits are a powerhouse when it comes to edge AI workloads. Officially, L4T (Linux for Tegra) supports BSPs based on Ubuntu (22.04 as of this writing). To get a devkit going, one needs the SDK Manager, which runs on another Ubuntu-based host or on Windows. I have a Jetson AGX Orin Developer Kit, which comes pre-installed with L4T, but I had erased it for Yocto images, so I wanted to find out what it takes to reflash it with L4T. I tried the Quick Start first, but soon realized I would need an Ubuntu distribution running on my host machine, as it expects some Debian tools (e.g. dpkg, start-stop-daemon), and I did not have native Ubuntu installed on any of my machines. The second approach was to use SDK Manager on my desktop running ArchLinux. Thankfully there is a Docker option available, which saved my day. I installed the Ubuntu 20.04 based Docker image for SDK Manager. Downloading SDK Manager required an Nvidia Developer account. Launching SDK Manager went well; it even printed a barcode on the terminal that could be used to register my session with my account, but I used the URL method.

docker load -i ./sdkmanager-2.3.0.12626-Ubuntu_20.04_docker.tar.gz
docker tag sdkmanager:2.3.0.12626-Ubuntu_20.04 sdkmanager:latest
docker run -it --privileged -v /dev/bus/usb:/dev/bus/usb/ -v /dev:/dev -v /media/$USER:/media/nvidia:slave --name JetPack_AGX_Orin_Devkit --network host sdkmanager --cli --action install --login-type devzone --product Jetson --target-os Linux --version 6.2.1 --target JETSON_AGX_ORIN_TARGETS --flash --license accept --stay-logged-in true --collect-usage-data enable --exit-on-finish

I put the AGX devkit into recovery mode and connected it to the Docker host. SDK Manager detected the AGX and offered a set of SDK elements to install. I chose all of them; it took close to 2 hours, but in the end it flashed everything to the eMMC (taking 12G of space). It also goes through the motions of creating a new user etc., which is good.

Building CUDA Samples

CUDA Samples is a nice collection of programs that showcase the different processing units on the Tegra, so I thought of compiling them. I first had to install a bunch of build dependencies, e.g. cuda-toolkit, cuda-toolkit-12-6-meta and cuda-nvcc. I tried the latest cuda-samples, but it did not compile: it kept asking for dependencies and got stuck when it needed nvJPEG, which I could not teach cmake to find anywhere. Using the v12.5 release was the better option.

git clone https://github.com/NVIDIA/cuda-samples.git
cd cuda-samples
git checkout v12.5
make -j10
find . -name "matrixMul"
./bin/aarch64/linux/release/matrixMul

After a few hours of playing, and with 18G of eMMC consumed, I ran the matrixMul sample successfully.

[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Ampere" with compute capability 8.7

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 142.86 GFlop/s, Time= 0.917 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

/dev/mmcblk0p1    57G   18G   37G  33% /

I applied all the updates; it also offered to run apt autoremove, which saved 200M of space. In the end, the following updates remained, which may need an Ubuntu Pro subscription.

kraj@ubuntu:~$ sudo apt dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
Get more security updates through Ubuntu Pro with 'esm-apps' enabled:
  python2.7-minimal libzvbi-common libzbar0 libiperf0 libjs-jquery-ui
  libopenexr25 python3-scipy libpostproc55 libopencv-core4.5d libavcodec58
  libgstreamer-plugins-bad1.0-0 iperf3 libpython2.7 libavutil56 libswscale5
  gir1.2-gst-plugins-bad-1.0 libswresample3 libavformat58 libzvbi0
  gstreamer1.0-plugins-bad libgstreamer-opencv1.0-0 libpmix-dev python2.7
  libde265-0 libpython2.7-minimal libpmix2 libpython2.7-stdlib
  libgstreamer-plugins-bad1.0-dev libavfilter7
Learn more about Ubuntu Pro at https://ubuntu.com/pro
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Finally, I took a snapshot of htop (look at the memory used).

L4T based on Ubuntu 22.04 (idling at 2.53G)

Using the Yocto based Yoe Distribution

I wanted to try OE4T, which is an OpenEmbedded layer for Tegra platforms. OE4T is a vibrant community that works hard on reacting to changes coming down from L4T releases; as a result, the Tegra BSP Yocto layers are well maintained and work equally well with Yocto master and LTS releases. This unlocks a lot of useful Yocto features: reproducible builds, SBOMs, image-based updates, read-only rootfs etc. Yoe is a rolling distribution, so naturally the master branch is used. On a Linux build machine, here are the steps:

git clone --recurse-submodules -j8 -b master https://github.com/YoeDistro/yoe-distro.git yoe
cd yoe
. ./envsetup.sh jetson-agx-orin-devkit
bitbake yoe-swupdate-image-tegra

It produced a bunch of artifacts. First the image needs to be flashed, so put the AGX devkit into recovery mode and ensure it appears as a USB device. Flashing takes around 10 minutes because it’s a big image.

mkdir agx && cd agx
tar xf /mnt/b/yoe/master/build/tmp/deploy/images/p3737-0000-p3701-0005/yoe-simple-image-p3737-0000-p3701-0005.rootfs.tegraflash.tar.zst
sudo ./doflash.sh
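
Before running doflash.sh, it can help to confirm that the devkit really shows up on USB. The following is a minimal sketch, not part of the Yoe or OE4T tooling: the function name has_nvidia_usb is made up for illustration, and it only assumes that the Jetson enumerates with Nvidia's USB vendor ID 0955. It does not distinguish recovery mode from a normally booted Jetson that is also attached over USB.

```shell
#!/bin/sh
# Sketch: look for an Nvidia USB device in `lsusb`-style output.
# Assumption: the devkit enumerates with Nvidia's USB vendor ID 0955.
# has_nvidia_usb is a hypothetical helper name, not an existing tool.

# Reads lsusb output on stdin; succeeds if any device has vendor ID 0955.
has_nvidia_usb() {
    grep -Eiq 'ID 0955:[0-9a-f]{4}'
}

# Typical use on the build host (needs usbutils for lsusb):
#   lsusb | has_nvidia_usb && sudo ./doflash.sh
```

Reading from stdin keeps the check easy to try out with saved lsusb output, without a board attached.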

It boots after flashing, and on first boot it allocates the /data partition and expands it to the remaining size of the eMMC.

root@p3737-0000-p3701-0005:~# df -h | grep mmc
/dev/mmcblk0p1            9.3G      4.2G      4.5G  48% /
/dev/mmcblk0p11          63.0M    129.5K     62.9M   0% /boot/efi
/dev/mmcblk0p16          37.3G      2.0M     35.4G   0% /data

Let’s run the same CUDA sample (matrixMul).

root@p3737-0000-p3701-0005:~# /usr/bin/cuda-samples/matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Ampere" with compute capability 8.7

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 305.50 GFlop/s, Time= 0.429 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

The GFlop/s figure with Yocto is higher because the Yoe distro is configured with NVPMODEL_CONFIG_DEFAULT = "0" out of the box. The same can be achieved with the L4T BSP by setting the power mode with nvpmodel -m 0 and rebooting the system. Yoe’s change is a distro policy, conveniently repeatable at scale automatically.
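
To put a number on the difference between the two runs, the Performance line can be pulled out of the matrixMul output and the two figures divided. This is a rough sketch with made-up helper names (extract_gflops, speedup). Keep in mind the sample’s own note that it is not meant for performance measurements.

```shell
#!/bin/sh
# Sketch: compare the GFlop/s figures reported by two matrixMul runs.
# extract_gflops and speedup are hypothetical helper names.

# Reads matrixMul output on stdin and prints the GFlop/s figure.
extract_gflops() {
    sed -n 's/^Performance= *\([0-9.]*\) GFlop\/s.*/\1/p'
}

# Prints the ratio of two GFlop/s figures, rounded to two decimals.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Typical use on the target:
#   /usr/bin/cuda-samples/matrixMul | extract_gflops
#   speedup 305.50 142.86
```

With the two figures from the runs above (305.50 and 142.86), the ratio comes out at roughly 2.14.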

System Updates

Yoe uses SWUpdate for managing updates on Tegra devices. It’s a myth that Yocto does not have an OTA system; in reality it has more than one, and they are production grade and used to manage large fleets of devices.

To check which bank is active

root@p3737-0000-p3701-0005:~# nvbootctrl dump-slots-info
Current version: 36.4.4
Capsule update status: 0
Current bootloader slot: A
Active bootloader slot: A
num_slots: 2
slot: 0,             status: normal
slot: 1,             status: normal

Make a change, build the image, then copy the .swu file to the target. Here it is applied manually for illustration, but swupdate can be configured to monitor a location and apply updates automatically, or to use a hawkBit backend to stage updates in a central location for managing a fleet.

bitbake yoe-swupdate-image-tegra
scp build/tmp/deploy/images/p3737-0000-p3701-0005/yoe-swupdate-image-tegra-p3737-0000-p3701-0005.rootfs.swu root@p3737-0000-p3701-0005.local:
ssh root@p3737-0000-p3701-0005.local
swupdate -i yoe-swupdate-image-tegra-p3737-0000-p3701-0005.rootfs.swu
reboot

Check the active slot again after the reboot:

root@p3737-0000-p3701-0005:~# nvbootctrl dump-slots-info
Current version: 36.4.4
Capsule update status: 1
Current bootloader slot: B
Active bootloader slot: B
num_slots: 2
slot: 0,             status: normal
slot: 1,             status: normal
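
The before and after comparison can also be scripted, for example as a post-update sanity check. This is a small sketch that relies only on the "Current bootloader slot:" line shown in the output above; the helper name current_slot is made up for illustration.

```shell
#!/bin/sh
# Sketch: extract the current boot slot from `nvbootctrl dump-slots-info`.
# current_slot is a hypothetical helper name; it relies only on the
# "Current bootloader slot:" line shown in the output above.

# Reads the dump-slots-info output on stdin, prints the slot letter.
current_slot() {
    sed -n 's/^Current bootloader slot: *\([A-Z]\).*/\1/p'
}

# Typical use on the target:
#   before=$(nvbootctrl dump-slots-info | current_slot)
#   ... apply the update and reboot ...
#   after=$(nvbootctrl dump-slots-info | current_slot)
#   [ "$before" != "$after" ] && echo "booted from the other slot"
```

In the run above this would print A before the update and B after it.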

Snapshot of htop on the Yocto image (idling at 661M)
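
For a comparison that does not depend on screenshots, the idle memory figure can be approximated from /proc/meminfo on both systems. This sketch computes MemTotal minus MemAvailable; the helper name used_mib is made up, and htop calculates its "used" value differently, so the numbers may not match the snapshots exactly.

```shell
#!/bin/sh
# Sketch: approximate used memory from /proc/meminfo-style input.
# used_mib is a hypothetical helper name. It reports
# MemTotal - MemAvailable, which is not the same formula htop uses.

# Reads meminfo on stdin (values in kB), prints used memory in MiB.
used_mib() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END              { printf "%d\n", (total - avail) / 1024 }'
}

# Typical use on either system:
#   used_mib < /proc/meminfo
```

Running the same command on the L4T and Yoe images shortly after boot gives two figures measured the same way.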

Finally, Nvidia has this thing called “Bring your own kernel”, which is an awesome approach. I followed it, because the Tegra layer supports it with linux-yocto (Yocto’s reference kernel).

root@p3737-0000-p3701-0005:~# uname -a
Linux p3737-0000-p3701-0005 6.12.55-yocto-standard #1 SMP PREEMPT Tue Oct 28 02:29:28 UTC 2025 aarch64 GNU/Linux

Summary

Both system builds have distinct properties. It is prudent to analyze your use case thoroughly and then see which one fits your product needs. Distribution infrastructure is the backbone of your software delivery, so time spent vetting it is well spent; it must be chosen with care. Yocto is like a race-track car: it comes with some default settings, but it is meant to be customized to squeeze out maximum performance. General-purpose binary distributions may not offer much customization, but they offer a larger set of precompiled software packages, tuned for the least common denominator to maintain generality. That is like driving a sedan: it’s robust, safe and takes you from point A to point B.
