NVIDIA/Troubleshooting
Corrupted screen: "Six screens" Problem
For some users with GeForce GT 100M cards, the screen becomes corrupted after X starts: it is divided into six sections and the resolution is limited to 640x480. The same problem has recently been reported with the Quadro 2000 and high-resolution displays.
To solve this problem, enable the validation mode NoTotalSizeCheck in section Device:
Section "Device"
    ...
    Option "ModeValidation" "NoTotalSizeCheck"
    ...
EndSection
'/dev/nvidia0' input/output error
This error can occur for several different reasons. The most common advice is to check group/file permissions, which in almost every case is not the problem. The NVIDIA documentation does not explain in detail how to correct this problem, but a few things have worked for some people. The problem can be an IRQ conflict with another device or bad routing by either the kernel or your BIOS.
The first thing to try is to remove other video devices such as video capture cards and see if the problem goes away. If there are too many video processors on the same system, the kernel may be unable to start them because of memory allocation problems with the video controller. On systems with low video memory in particular, this can occur even if there is only one video processor. In that case, find out the amount of your system's video memory (e.g. with lspci -v) and pass allocation parameters to the kernel, e.g. for a 32-bit kernel:
vmalloc=384M
If running a 64-bit kernel, a driver defect can cause the NVIDIA module to fail initializing when IOMMU is on. Turning it off in the BIOS has been confirmed to work for some users. [1] User:Clickthem#nvidia module
Another thing to try is to change your BIOS IRQ routing from Operating system controlled to BIOS controlled or the other way around. Routing through the BIOS can also be requested with a kernel parameter:
pci=biosirq
The noacpi kernel parameter has also been suggested as a solution, but since it disables ACPI completely it should be used with caution: some hardware is easily damaged by overheating.
Crashing in general
- Try disabling RenderAccel in xorg.conf.
- If Xorg outputs an error about "conflicting memory type" or "failed to allocate primary buffer: out of memory", or crashes with a "Signal 11" while using nvidia-96xx drivers, add nopat to your kernel parameters.
- If the NVIDIA installer complains about different versions of GCC between the current one and the one used for compiling the kernel, add in /etc/profile:
export IGNORE_CC_MISMATCH=1
- If Xorg is crashing, try disabling PAT. Pass the argument nopat to kernel parameters.
More information about troubleshooting the driver can be found in the NVIDIA forums.
Bad performance after installing a new driver version
If FPS have dropped in comparison with older drivers, check whether direct rendering is enabled (glxinfo is included in mesa-utils):
$ glxinfo | grep direct
If the command prints:
direct rendering: No
then direct rendering is disabled. A possible solution is to roll back to the previously installed driver version and reboot afterwards.
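The check above can be wrapped in a small helper for use in scripts. This is only a sketch: the function name direct_rendering_enabled is made up for illustration, and it assumes glxinfo prints a line beginning with "direct rendering:" as shown above.

```shell
# Hypothetical helper: reads `glxinfo` output on stdin and succeeds (exit 0)
# only if direct rendering is reported as enabled.
direct_rendering_enabled() {
    grep -qi '^direct rendering: *yes'
}

# Usage (requires glxinfo and a running X session):
#   glxinfo | direct_rendering_enabled || echo "direct rendering is disabled"
```

Because the function only reads standard input, it can be tried against saved glxinfo output without an X session.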
Avoid screen tearing
Tearing can be avoided by forcing a full composition pipeline, regardless of the compositor you are using. To test whether this option will work, run:
$ nvidia-settings --assign CurrentMetaMode="nvidia-auto-select +0+0 { ForceFullCompositionPipeline = On }"
Or click on the Advanced button that is available on the X Server Display Configuration menu option. Select either Force Composition Pipeline or Force Full Composition Pipeline and click on Apply.
In order to make the change permanent, it must be added to the "Screen" section of the Xorg configuration file. When making this change, the TripleBuffer option should be enabled and AllowIndirectGLXProtocol should be disabled in the driver configuration as well. See the example configuration below:
/etc/X11/xorg.conf.d/20-nvidia.conf
Section "Device"
    Identifier "Nvidia Card"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    BoardName "GeForce GTX 1050 Ti"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "Nvidia Card"
    Monitor "Monitor0"
    Option "MetaModes" "nvidia-auto-select +0+0 {ForceFullCompositionPipeline=On}"
    Option "AllowIndirectGLXProtocol" "off"
    Option "TripleBuffer" "on"
EndSection
If you do not have an Xorg configuration file, you can create one for your present hardware using nvidia-xconfig
(see NVIDIA#Automatic configuration) and move it from /etc/X11/xorg.conf
to the preferred location /etc/X11/xorg.conf.d/20-nvidia.conf
.
Many of the configuration options written to 20-nvidia.conf by nvidia-xconfig are set automatically by the driver and are not needed. To use this file only for enabling the composition pipeline, only the "Screen" section containing lines with values for Identifier and Option is necessary. Other sections may be removed from this file.
Multi-monitor
For a multi-monitor setup you will need to specify ForceCompositionPipeline=On for each display. For example:
$ nvidia-settings --assign CurrentMetaMode="DP-2: nvidia-auto-select +0+0 {ForceCompositionPipeline=On}, DP-4: nvidia-auto-select +3840+0 {ForceCompositionPipeline=On}"
Without doing this, the nvidia-settings
command will disable your secondary display.
You can get the current screen names and offsets using --query
:
$ nvidia-settings --query CurrentMetaMode
The --assign example above is for two 3840x2160 monitors connected to DP-2 and DP-4. You will need to read the correct CurrentMetaMode for your setup (for example by exporting xorg.conf) and append ForceCompositionPipeline to each of your displays. Setting ForceCompositionPipeline only affects the targeted display.
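Writing the MetaMode string by hand gets error-prone with several displays. The following sketch assembles it from output names and offsets. The function name build_metamode and its OUTPUT:OFFSET argument format are assumptions made for this example, not part of nvidia-settings; the output names and offsets still have to be taken from the --query output.

```shell
# Hypothetical helper: build a CurrentMetaMode string that enables
# ForceCompositionPipeline on every listed display.
# Each argument has the form OUTPUT:OFFSET, e.g. "DP-2:+0+0".
build_metamode() {
    metamode=""
    for spec in "$@"; do
        output=${spec%%:*}   # text before the first colon, e.g. DP-2
        offset=${spec#*:}    # text after the first colon, e.g. +0+0
        entry="$output: nvidia-auto-select $offset {ForceCompositionPipeline=On}"
        if [ -z "$metamode" ]; then
            metamode=$entry
        else
            metamode="$metamode, $entry"
        fi
    done
    printf '%s\n' "$metamode"
}

# Usage:
#   nvidia-settings --assign CurrentMetaMode="$(build_metamode DP-2:+0+0 DP-4:+3840+0)"
```

With the two arguments from the usage line, the function reproduces the string used in the --assign example above.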
The display device to sync to can be configured in ~/.nvidia-settings-rc as 0/XVideoSyncToDisplayID= or by installing nvidia-settings and using the graphical configuration options.
Avoid screen tearing in KDE (KWin)
The problem is caused by an incorrect assumption by the KDE developers about the behaviour of glXSwapBuffers and should be fixed in Plasma 5.12, Plasma 5.15, Plasma 5.16 [2]. Additionally, NVIDIA#DRM kernel mode setting may be required.
Legacy solutions
For posterity, these are the legacy workarounds. Do not apply both workarounds because this may lead to high CPU load [3].
1. GL threads
Set GL threads to sleep by exporting __GL_YIELD="USLEEP" to just kwin_x11. Unlike setting a global environment variable, this affects only KWin. It should also have the advantage over other workarounds, such as forcing triple buffering or forcing the composition pipeline in the driver, that it does not introduce additional stuttering when scrolling in Firefox or moving windows.
This can be done automatically at login with an autostart script:
~/.config/autostart-scripts/kwin.sh
#!/bin/bash
(sleep 2s && __GL_YIELD="USLEEP" kwin_x11 --replace)
Flag the script as executable.
The sleep delay helps to prevent issues with KWin restarting or hanging after login; you might need to increase this time.
2. Use TripleBuffering
Make sure the TripleBuffer option has been enabled for the driver, see #Avoid screen tearing.
Create the /etc/profile.d/kwin.sh
file:
/etc/profile.d/kwin.sh
export KWIN_TRIPLE_BUFFER=1
Use OpenGL 2.0 or later as rendering backend under System Settings > Display and Monitor > Compositor.
Modprobe Error: "Could not insert 'nvidia': No such device" on linux >=4.8
With linux 4.8, one can get the following errors when trying to use the discrete card:
$ modprobe nvidia -vv
modprobe: INFO: custom logging function 0x409c10 registered
modprobe: INFO: Failed to insert module '/lib/modules/4.8.6-1-ARCH/extramodules/nvidia.ko.gz': No such device
modprobe: ERROR: could not insert 'nvidia': No such device
modprobe: INFO: context 0x24481e0 released
insmod /lib/modules/4.8.6-1-ARCH/extramodules/nvidia.ko.gz
# dmesg
...
NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:139b)
NVRM: installed in this system is not supported by the 370.28
NVRM: NVIDIA Linux driver release. Please see 'Appendix
NVRM: A - Supported NVIDIA GPU Products' in this release's
NVRM: README, available on the Linux driver download page
NVRM: at www.nvidia.com.
...
This problem is caused by bad commits pertaining to PCIe power management in the Linux Kernel (as documented in this NVIDIA DevTalk thread).
The workaround is to add pcie_port_pm=off
to your kernel parameters. Note that this disables PCIe power management for all devices.
Screen corruption after resuming from suspend
The nvidia driver normally only saves essential video allocations on suspend and hibernate. The resulting loss of video memory contents can lead to failures such as rendering corruption and application crashes upon exit from power management cycles. To save and restore all video memory contents, load the nvidia kernel module with the NVreg_PreserveVideoMemoryAllocations=1 option and enable nvidia-suspend.service. [4]
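As a rough sketch, the module option can be generated as a modprobe.d line like the one below. The function name nvidia_pm_options and the file name used in the usage comment are hypothetical. The optional argument sets NVreg_TemporaryFilePath and defaults to /var/tmp here purely as an example value.

```shell
# Hypothetical helper: print a modprobe.d options line that preserves all
# video memory allocations across suspend. $1 optionally overrides the
# temporary file path (assumed default for this sketch: /var/tmp).
nvidia_pm_options() {
    tmp_path=${1:-/var/tmp}
    printf 'options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=%s\n' "$tmp_path"
}

# Usage (as root; the file name is an assumption):
#   nvidia_pm_options > /etc/modprobe.d/nvidia-power-management.conf
#   systemctl enable nvidia-suspend.service
```

The option takes effect the next time the nvidia module is loaded, typically after a reboot.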
By default, the video memory contents are saved to /tmp, which is of type tmpfs. If its size is not sufficient for the amount of video memory, point to a different location with the NVreg_TemporaryFilePath option, e.g. NVreg_TemporaryFilePath=/var/tmp.
CPU spikes with 400 series cards
If you are experiencing intermittent CPU spikes with a 400 series card, it may be caused by PowerMizer constantly changing the GPU's clock frequency. To switch PowerMizer's setting from Adaptive to Performance, add the following to the Device section of your Xorg configuration:
Option "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x3322; PowerMizerDefaultAC=0x1"
Full system freeze or crashes when using Flash
If you experience occasional full system freezes using Flash, a possible workaround is to disable Hardware Acceleration:
/etc/adobe/mms.cfg
EnableLinuxHWVideoDecode=0
Or, if you want to keep hardware acceleration enabled at the cost of a higher chance of screen tearing, you may try running the following before starting a browser:
export VDPAU_NVIDIA_NO_OVERLAY=1
Laptops: X hangs on login/out, worked around with Ctrl+Alt+Backspace
If, while using the legacy NVIDIA drivers, Xorg hangs on login and logout (particularly with an odd screen split into two black and white/gray pieces), but logging in is still possible via Ctrl+Alt+Backspace
(or whatever the new "kill X" key binding is), try adding this in /etc/modprobe.d/modprobe.conf
:
options nvidia NVreg_Mobile=1
One user had luck with this instead, but it makes performance drop significantly for others:
options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=33 NVreg_DeviceFileMode=0660 NVreg_SoftEDIDs=0 NVreg_Mobile=1
Note that NVreg_Mobile
needs to be changed according to the laptop:
- 1 for Dell laptops.
- 2 for non-Compal Toshiba laptops.
- 3 for other laptops.
- 4 for Compal Toshiba laptops.
- 5 for Gateway laptops.
See NVIDIA Driver's README: Appendix K for more information.
Screen(s) found, but none have a usable configuration
Sometimes NVIDIA and X have trouble finding the active screen. If your graphics card has multiple outputs try plugging your monitor into the other ones. On a laptop it may be because your graphics card has VGA/TV out. Xorg.0.log will provide more info.
Another thing to try is adding an invalid "ConnectedMonitor" option to Section "Device", which forces Xorg to throw an error that shows how to correct it. The NVIDIA driver README (Appendix B. X Config Options) has more about the ConnectedMonitor setting.
After restarting X, check Xorg.0.log for the valid CRT-x, DFP-x, TV-x values.
nvidia-xconfig --query-gpu-info could also be helpful.
Blackscreen at X startup / Machine poweroff at X shutdown
If you have installed an NVIDIA driver update and your screen stays black after launching Xorg, or if shutting down Xorg causes a machine poweroff, try the workarounds below:
- Prepend "xrandr --auto" to your xinitrc.
- Use the rcutree.rcu_idle_gp_delay=1 kernel parameter.
- You can also try to add the nvidia module directly to your mkinitcpio config file.
- If the screen still stays black with both the rcutree.rcu_idle_gp_delay=1 kernel parameter and the nvidia module directly in the mkinitcpio config file, try re-installing nvidia and nvidia-utils in that order, and finally reload the driver:
# modprobe nvidia
Backlight is not turning off in some occasions
By default, DPMS should turn off the backlight with the timeouts set or by running xset. However, probably due to a bug in the proprietary NVIDIA drivers, the result is a blank screen with no power saving whatsoever. To work around it until the bug has been fixed, you can use vbetool as root.
Install the vbetool package.
Turn off your screen on demand; pressing any key turns the backlight on again:
vbetool dpms off && read -n1; vbetool dpms on
Alternatively, xrandr is able to disable and re-enable monitor outputs without requiring root.
xrandr --output DP-1 --off; read -n1; xrandr --output DP-1 --auto
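The output name DP-1 above is only an example and differs between systems. A minimal sketch for listing the connected outputs follows. The function name connected_outputs is made up, and it assumes the usual `xrandr --query` layout in which each output line starts with the output name followed by the word "connected".

```shell
# Hypothetical helper: reads `xrandr --query` output on stdin and prints
# the name of every connected output, one per line.
connected_outputs() {
    awk '$2 == "connected" { print $1 }'
}

# Usage (requires a running X session):
#   xrandr --query | connected_outputs
```

Matching the second field exactly avoids picking up lines for outputs reported as "disconnected".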
Driver 415: HardDPMS
Proprietary driver 415 includes a new feature called HardDPMS. This is reported by some users to solve the issues with suspending monitors connected over DisplayPort.
It is reported to become the default in a future driver version, but for now, the HardDPMS
option can be set in the Device
or Screen
sections. For example:
/etc/X11/xorg.conf.d/20-nvidia.conf
Section "Device"
    ...
    Option "HardDPMS" "true"
    ...
EndSection

Section "Screen"
    ...
    Option "HardDPMS" "true"
    ...
EndSection
HardDPMS
will trigger on screensaver settings like BlankTime
. The following ServerFlags
will set your monitor(s) to suspend after 10 minutes of inactivity:
/etc/X11/xorg.conf.d/20-nvidia.conf
Section "ServerFlags"
    Option "BlankTime" "10"
EndSection
Xorg fails to load or Red Screen of Death
If you get a red screen and use GRUB, disable the GRUB framebuffer by editing /etc/default/grub and uncommenting GRUB_TERMINAL_OUTPUT=console. For more information see GRUB/Tips and tricks#Disable framebuffer.
Black screen on systems with Intel integrated GPU
If you have an Intel CPU with an integrated GPU (e.g. Intel HD 4000) and have installed the nvidia package, you may experience a black screen on boot, when changing virtual terminal, or when exiting an X session. This may be caused by a conflict between the graphics modules. This is solved by blacklisting the Intel GPU modules. Create the file /etc/modprobe.d/blacklist.conf
and prevent the i915 and intel_agp modules from loading on boot:
/etc/modprobe.d/blacklist.conf
install i915 /usr/bin/false
install intel_agp /usr/bin/false
No audio over HDMI
Sometimes nvidia HDMI audio devices are not shown when you run:
aplay -l
For whatever reason, on some new machines the audio chip on the nvidia GPU is disabled at boot. Read more here and here.
You need to reload the nvidia device with audio enabled. To do that, make sure that your GPU is on (in case of laptops/Bumblebee) and that you are not running X on it, because it is going to reset:
# setpci -s 01:00.0 0x488.l=0x2000000:0x2000000
# rmmod nvidia-drm nvidia-modeset nvidia
# echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
# echo 1 > /sys/bus/pci/devices/0000:00:01.0/rescan
# modprobe nvidia-drm
# xinit -- -retro
If you are running your TTY on nvidia, put the lines in a script so you do not end up with no screen.
Black screen on systems with VIA integrated GPU
As above, blacklisting the viafb module may resolve conflicts with NVIDIA drivers:
/etc/modprobe.d/blacklist.conf
install viafb /usr/bin/false
X fails with "no screens found" when using Multiple GPUs
In situations where you might have multiple GPUs on a system and X fails to start with:
[ 76.633] (EE) No devices detected.
[ 76.633] Fatal server error:
[ 76.633] no screens found
then you need to add your discrete card's BusID to your X configuration. This can happen on systems with an Intel CPU and an integrated GPU or if you have more than one Nvidia card connected. Find your BusID:
# lspci | grep -E "VGA|3D controller"
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GTX 650] (rev a1)
08:00.0 3D controller: NVIDIA Corporation GM108GLM [Quadro K620M / Quadro M500M] (rev a2)
Then fix it by adding the BusID to the card's Device section in your X configuration. For example:
/etc/X11/xorg.conf.d/10-nvidia.conf
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    BusID "PCI:1:0:0"
EndSection
In the example above, 01:00.0 is stripped of leading zeros and written as 1:0:0; however, some conversions are more complicated. lspci output is in hexadecimal, but in configuration files the BusIDs are in decimal. This means that whenever a component of the BusID is greater than 9, you need to convert it to decimal. For example, 5e:00.0 from lspci becomes PCI:94:0:0.
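The hexadecimal-to-decimal conversion can be scripted. The sketch below uses a made-up function name, pci_to_busid, and assumes the address is in the bus:device.function form printed by lspci, optionally prefixed with PCI domain 0000.

```shell
# Hypothetical helper: convert an lspci address (hexadecimal
# bus:device.function, e.g. 5e:00.0) into Xorg's decimal BusID notation.
pci_to_busid() {
    addr=${1#0000:}      # drop an optional "0000:" domain prefix (domain 0 assumed)
    bus=${addr%%:*}      # hex bus number
    rest=${addr#*:}      # "device.function"
    dev=${rest%%.*}      # hex device number
    func=${rest#*.}      # hex function number
    # printf interprets a leading 0x as hexadecimal and prints decimal
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$func"
}

# Usage:
#   pci_to_busid 5e:00.0
```

The printed value can be pasted directly into the BusID line of the Device section.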
Xorg fails during boot, but otherwise starts fine
On very fast booting systems, systemd may attempt to start the display manager before the NVIDIA driver has fully initialized. You will see a message like the following in your logs only when Xorg runs during boot.
/var/log/Xorg.0.log
[ 1.807] (EE) NVIDIA(0): Failed to initialize the NVIDIA kernel module. Please see the
[ 1.807] (EE) NVIDIA(0): system's kernel log for additional error messages and
[ 1.808] (EE) NVIDIA(0): consult the NVIDIA README for details.
[ 1.808] (EE) NVIDIA(0): *** Aborting ***
In this case you will need to establish an ordering dependency from the display manager to the DRI device. First create device units for DRI devices by creating a new udev rules file.
/etc/udev/rules.d/99-systemd-dri-devices.rules
ACTION=="add", KERNEL=="card*", SUBSYSTEM=="drm", TAG+="systemd"
Then create dependencies from the display manager to the device(s).
/etc/systemd/system/display-manager.service.d/10-wait-for-dri-devices.conf
[Unit]
Wants=dev-dri-card0.device
After=dev-dri-card0.device
If you have additional cards needed for the desktop, list them in Wants and After separated by spaces.
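For several cards, the drop-in contents can be generated rather than typed. The following is a sketch with a hypothetical function name, dri_dropin. It assumes the device units follow the dev-dri-cardN.device naming shown above.

```shell
# Hypothetical helper: print a systemd drop-in that makes the display
# manager want, and start after, the device unit of every listed DRI card.
# Arguments are card names such as card0 card1.
dri_dropin() {
    units=""
    for card in "$@"; do
        # append with a separating space once the list is non-empty
        units="${units:+$units }dev-dri-$card.device"
    done
    printf '[Unit]\nWants=%s\nAfter=%s\n' "$units" "$units"
}

# Usage (as root):
#   dri_dropin card0 card1 > /etc/systemd/system/display-manager.service.d/10-wait-for-dri-devices.conf
```

Run systemctl daemon-reload afterwards so that systemd picks up the changed drop-in.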
xrandr BadMatch
If you are trying to configure a WQHD monitor such as DELL U2515H using xrandr and xrandr --addmode
gives you the error X Error of failed request: BadMatch
, it might be because the proprietary NVIDIA driver clips the pixel clock maximum frequency of HDMI output to 225 MHz or lower. To set the monitor to maximum resolution you have to install nouveau drivers. You can force nouveau to use a specific pixel clock frequency by setting nouveau.hdmimhz=297
(or 330
) in your Kernel parameters.
Alternatively, it may be that your monitor's EDID is incorrect. See #Override EDID.
Another reason could be that by default current NVIDIA drivers only allow modes explicitly reported by EDID, but sometimes refresh rates and/or resolutions are desired which the monitor does not report (the EDID information is correct; current NVIDIA drivers are simply too restrictive).
If this happens, you may want to add an option to xorg.conf to allow non-EDID modes:
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    ...
    Option "ModeValidation" "AllowNonEdidModes"
    ...
EndSection
This can be set per-output. See the NVIDIA driver README (Appendix B. X Config Options) for more information.
Override EDID
See Kernel mode setting#Forcing modes and EDID, Xrandr#Troubleshooting and Qnix QX2710#Fixing X11 with Nvidia.
Overclocking with nvidia-settings GUI not working
The workaround is to use the nvidia-settings CLI to query and set certain variables after enabling overclocking (as explained in NVIDIA/Tips and tricks#Enabling overclocking, see nvidia-settings(1) for more information).
Example to query all variables:
nvidia-settings -q all
Example to set PowerMizerMode to prefer performance mode:
nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=1"
Example to set fan speed to fixed 21%:
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=21"
Example to set multiple variables at once (overclock GPU by 50MHz, overclock video memory by 50MHz, increase GPU voltage by 100µV):
nvidia-settings -a GPUGraphicsClockOffsetAllPerformanceLevels=50 -a GPUMemoryTransferRateOffsetAllPerformanceLevels=50 -a GPUOverVoltageOffset=100
Overclocking not working with Unknown Error
If you are running Xorg as a non-root user and trying to overclock your NVIDIA GPU, you will get an error similar to this one:
$ nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset[3]=10"
ERROR: Error assigning value 10 to attribute 'GPUGraphicsClockOffset' (trinity-zero:1[gpu:0]) as specified in assignment '[gpu:0]/GPUGraphicsClockOffset[3]=10' (Unknown Error).
To avoid this issue, Xorg has to be run as the root user. See Xorg#Rootless Xorg for details.
System will not boot after driver was installed
If after installing the NVIDIA driver your system becomes stuck before reaching the display manager, try to disable kernel mode setting.
X fails with "Failing initialization of X screen"
If /var/log/Xorg.0.log reports that the X server fails to initialize the screen:
(EE) NVIDIA(G0): GPU screens are not yet supported by the NVIDIA driver
(EE) NVIDIA(G0): Failing initialization of X screen
and nvidia-smi says No running processes found, first reinstall the latest nvidia-utils. Then copy /usr/share/X11/xorg.conf.d/10-nvidia-drm-outputclass.conf to /etc/X11/xorg.conf.d/10-nvidia-drm-outputclass.conf, edit the copy and add the line Option "PrimaryGPU" "yes". Restart the computer; this should fix the problem.
System does not return from suspend
What you see in the log:
kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
kernel: nvidia-modeset: WARNING: GPU:0: Failure processing EDID for display device DELL U2412M (DP-0).
kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DELL U2412M (DP-0)
kernel: nvidia-modeset: ERROR: GPU:0: Failure reading maximum pixel clock value for display device DELL U2412M (DP-0).
A possible solution based on [5]:
Run this command to get the version
string:
# strings /sys/firmware/acpi/tables/DSDT | grep -i 'windows ' | sort | tail -1
Add the acpi_osi=! "acpi_osi=version"
kernel parameter to your boot loader configuration.
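The two steps above can be combined so that the kernel parameter is printed ready to paste. This is a sketch only: the function name acpi_osi_param is hypothetical, and it applies the same grep, sort and tail selection as the command above to whatever is fed on standard input.

```shell
# Hypothetical helper: reads strings on stdin (e.g. from the DSDT), picks
# the last "Windows ..." entry after sorting, and prints the matching
# acpi_osi kernel parameter. Fails if no such string is found.
acpi_osi_param() {
    version=$(grep -i 'windows ' | sort | tail -n 1)
    [ -n "$version" ] || return 1
    printf 'acpi_osi=! "acpi_osi=%s"\n' "$version"
}

# Usage (as root):
#   strings /sys/firmware/acpi/tables/DSDT | acpi_osi_param
```

The quoting in the printed parameter matters because the version string contains a space.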
Vulkan error on applications start
If an application that requires Vulkan acceleration fails on start with this error:
Vulkan call failed: -4
try deleting the ~/.nv or ~/.cache/nvidia directory.