GDM doesn’t start after Arch Linux update (Feb 2017)

I update my workstation once a week, and I typically do it before I turn the machine off, so that I have an updated system the next time I use it. Being on a bleeding-edge distro like Arch, I occasionally have trouble with an update, as happened this morning. (Actually, it is amazing how infrequent problems with upgrades are. It speaks volumes about the Arch Linux development community; hats off to you guys.)

This morning, when I turned my machine on, the boot sequence ended in a blank screen instead of the usual GDM login prompt. The logs (Xorg logs and system logs) showed nothing wrong, and restarting gdm didn’t help either. I was clueless, so I switched to LightDM, which did give me a login screen. I still wasn’t sure what the problem was; I thought it could be some broken Wayland dependency. But when I tried running the HipChat Linux client, I noticed the following errors on the terminal:

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

And then it dawned on me, that this could be a driver error and I actually might be missing the GL driver for my NVIDIA Quadro 600 Card, so I ran:

$ sudo pacman -S nvidia-libgl

and restarted gdm, and it worked. The HipChat client runs too. This problem most likely occurs when using the NVIDIA binary drivers, like I do. I’m not sure if the Nouveau drivers have the same issue. Or if

Printing bits of a byte in C

Printing the bit representation of a byte is a very simple and interesting problem. There are numerous ways this can be done in many different programming languages out there. In C, one would do something like this for signed integers:

What exactly is this function doing?

Most of the code is pretty obvious. The key portions of the code are in lines 3 and 9.

Line 3 calculates a value with only the most significant bit (MSB) set for the given type. In the example above, the given type is an int; on my machine running GNU/Linux (x86_64), an int has 32 bits, so this value is 2147483648 (2^31), which is one more than the maximum positive value of a signed int (2147483647).

The binary representation (shown in 4 chunks of 8 bits) of 1 as a 32-bit int is:

00000000 00000000 00000000 00000001

which, when left-shifted by 31 bits (line 3 in the above code), becomes (notice how the digit 1 has moved from being the LSB to being the MSB):

10000000 00000000 00000000 00000000

whose decimal value, read as an unsigned number, is 2147483648.

Having got this value, we then bitwise-and (&) it with the given number (represented by num) and check the result (line 8). The bitwise-and yields a non-zero result if there is a 1 in the MSB, and 0 otherwise. Based on the result of this operation, we either print a 1 or a 0.

In each iteration, we left-shift the given number by 1 bit (line 9). We continue to iterate until we’ve traversed all bits.

We can modify the function print_bits() to print the output in a much more readable fashion:

We can also implement a function to count the number of 1’s and 0’s in the binary representation of a given number:

For the input 2, which in binary is 00000000 00000000 00000000 00000010, the count_bits() function above prints:

2 has 31 0’s and 1 1’s in its binary representation.

 

This post is a result of a discussion with my nephew who recently took to programming and I was trying to explain bits and bytes to him. Although C is probably not the right language when introducing a high-schooler to programming, I thought it is the language that expresses bits and bytes well, and also, it is the language that I know the best.

Updates on QNAP Finder for Linux

I have made a few fixes to qnap-finder and pushed those changes to the qnap-finder repository on GitHub. Visually, the only change is the addition of the access URL:

$ ./qnap-finder
1)
Hostname    : cher
IP Address  : 10.0.0.59
Type        : NAS(TS-410)TS-419ITS-410
URL         : https://10.0.0.59/cgi-bin/login.html

If run with ‘-h’ or ‘--help’, qnap-finder will now display a list of all available options.

$ ./qnap-finder -h
qnap-finder v0.1

Usage: qnap-finder [options]

options include:
--help|-h          This help text
--detail|-d        Query for detailed information.(default is brief)
--verbose|-v       Verbose debug
--version|-V       Prints current version

More updates, when I have them.

Arch Linux kernel soft lockup issues!

Earlier this morning, my laptop (a Lenovo Thinkpad X220) refused to wake up from suspend. And after a hard reset, my machine refused to boot, with the following messages on the console:


BUG: soft lockup - CPU#1 stuck for 22s! [which:604]
hda_codec: rates == 0 (ni=0x10, val=0x0, ovrd=1)
hda_codec: cannot attach PCM stream 0 for codec #0
INFO: rcu_preempt detected stalls on CPUs/tasks: { 0}

Rebooting the machine again didn’t help. It was the same error again, but this time around it was systemd and systemd-udev causing the lockup. Pretty clueless and with no access to the interwebs, I was at least hoping to get a stack trace of some sort. There was none.

My first (and the most obvious) suspect was the kernel and drivers. I had a quick look at my grub.cfg, part of which is shown here:


menuentry 'Arch Linux Linux, with Linux core repo kernel' --class arch --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-core repo kernel-true-7cb6bd41-b198-4d1d-8ae4-fc8f169abb00' {
load_video
set gfxpayload=keep
insmod gzio
insmod part_gpt
insmod ext2
set root='hd0,gpt3'
if [ x$feature_platform_search_hint = xy ]; then
search --no-floppy --fs-uuid --set=root --hint-bios=hd0,gpt3 --hint-efi=hd0,gpt3 --hint-baremetal=ahci0,gpt3  7cb6bd41-b198-4d1d-8ae4-fc8f169abb00
else
search --no-floppy --fs-uuid --set=root 7cb6bd41-b198-4d1d-8ae4-fc8f169abb00
fi
echo    'Loading Linux core repo kernel ...'
linux    /boot/vmlinuz-linux root=UUID=7cb6bd41-b198-4d1d-8ae4-fc8f169abb00 ro root=/dev/sda3 ro fastboot splash=silent quiet threadirqs add_efi_memmap pcie_aspm=force i915.i915_enable_rc6=7 i915.i915_enable_fbc=1 i915.lvds_downclock=1 rootfstype=ext4 init=/usr/lib/systemd/systemd
echo    'Loading initial ramdisk ...'
initrd    /boot/initramfs-linux.img
}

I decided to cut out all the cruft and see if I could get the machine up and running. Switching to the grub command-line, I did the following (remember, TAB completion is your friend):


insmod part_gpt
insmod ext2
set root=(hd0,gpt3)
linux /boot/vmlinuz-linux root=/dev/sda3 rootfstype=ext4
initrd /boot/initramfs-linux.img
boot

Thankfully, that worked!

With my machine back in action, the next step was to look through the kernel docs, where I found this about soft lockups:

A ‘softlockup’ is defined as a bug that causes the kernel to loop in
kernel mode for more than 20 seconds (see “Implementation” below for
details), without giving other tasks a chance to run.

More about it here. There is even a command line parameter, softlockup_panic (search for it in kernel parameters doc) that will help generate the much needed stack trace.
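
As a rough illustration, assuming the same minimal boot entry used at the grub command-line above, the parameter could be added to the linux line like this. The value 1 asks the soft-lockup detector to panic instead of just logging; whether that yields a usable trace on this particular machine is an assumption, not something I have tested.

```
linux /boot/vmlinuz-linux root=/dev/sda3 rootfstype=ext4 softlockup_panic=1
initrd /boot/initramfs-linux.img
boot
```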

This problem is clearly system specific. Some combination of the kernel along with drivers and/or firmware messes around with usage of the CPU. In my case, since booting without them worked, the culprit is one of the following parameters:

* threadirqs
* add_efi_memmap
* pcie_aspm=force
* i915 and family

I had a personal commitment to get to, so I didn’t probe this issue any further; I will get to it during the weekend. For now, I thought I should share it, in case someone else finds this information helpful or has some pointers on what caused the problem.

I am currently running 3.13.2-1-ARCH kernel and xf86-video-intel 2.99.910-1 drivers. The machine has an Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz and the following PCI hardware:

00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:16.3 Serial controller: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b4)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b4)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation QM67 Express Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
03:00.0 Network controller: Intel Corporation Centrino Advanced-N 6205 [Taylor Peak] (rev 34)
0d:00.0 System peripheral: Ricoh Co Ltd MMC/SD Host Controller (rev 07)

QNAP finder for Linux

I have a QNAP Turbo NAS TS410. With the DHCP setup that is available where I live, it is difficult to figure out the IP address of my NAS box every time I restart it, since the router isn’t doing a great job with the DHCP leases. The QNAP Finder for Linux that QNAP provides has strict dependencies on a specific GTK+ version, and its source isn’t available, making it impossible to use on my Arch Linux machine.

It got me wondering, though, what the app actually does. I installed the Windows version of it on a borrowed laptop and, using Wireshark, I figured out a bunch of things, all of which I’ve documented here. With that information, I wrote a small command-line utility that essentially does the job for me. I call it qnap-finder and it is available on GitHub.

A quick preview of what the tool shows when run:


$ ./qnap-finder
1)
Hostname    : blah
IP Address  : 10.0.0.30
Type        : NAS(TS-410)TS-419ITS-410

This is just the basic detail. There is a lot more information available, like the number of hard disks and the version of the firmware, which I still haven’t got to parsing and displaying; I will get that done soon. Hopefully someone else will find this useful. And it is quite possible that I haven’t done something right, so please feel free to contribute/fork or report bugs.

* Disclaimer
============
This program is a work in progress and isn't guaranteed to work with all QNAP devices.
Although this is meant to be a replacement for the Windows/Mac versions of QNAP finder,
it is not guaranteed to work that way.

Arch Linux Systemd upgrade issues

After an update of my Arch OS over the weekend, my system refused to boot with this error:

Error: Root device mounted successfully, but /sbin/init does not exist. Bailing out, you are on your own. Good luck.

I should admit that I love the unassuming tone of the message.

For the latest update to happen, a bit of manual intervention is required. More details on that here. Once the update is done, the machine will refuse to boot when restarted. To get your system running again, edit the grub entry in the grub menu, find the parameter init=/bin/systemd, replace it with init=/usr/lib/systemd/systemd, and continue to boot the system with the modified init parameter.

To make the change permanent, modify the /etc/default/grub file, and change the value of GRUB_CMDLINE_LINUX_DEFAULT by appending init=/usr/lib/systemd/systemd to it. On my machine, it looks like this:

GRUB_CMDLINE_LINUX_DEFAULT="init=/usr/lib/systemd/systemd"

Then generate a new grub.cfg by running grub-mkconfig. Make sure that you replace your existing grub.cfg (typically in /boot/grub/) with the new one; running grub-mkconfig -o /boot/grub/grub.cfg writes it there directly.

The reason for this problem is that with the latest version of systemd (version 204-1) on Arch, the symbolic link that previously existed in /bin is gone, since /bin itself doesn’t exist anymore (read this post for the reason behind that).

Suspend issue with my Linux box

A few weeks ago, I completely moved to using systemd on my machine (it runs Arch). Since then I have been facing a really weird issue when I close my laptop lid. My laptop, with Gnome3, suspends itself when I close the lid. When I opened the lid again, I was prompted with the familiar gnome-screensaver password dialog, as expected. But right after the password dialog appeared, the system went back into suspend. Looking through the system logs, I saw that suspend was being called twice in succession. I didn’t have the problem if I ran pm-suspend from the command-line. It was clear that suspend was being done twice, but I wasn’t sure what was actually triggering it.

After a quick web search that turned up nothing useful, I started digging into systemd’s man pages and found the logind.conf(5) man page. Voila, there it was:

HandlePowerKey=, HandleSuspendKey=, HandleHibernateKey=, HandleLidSwitch=

Controls whether logind shall handle the system power and sleep keys and the lid switch to trigger actions such as system power-off or suspend. Can be one of ignore, poweroff, reboot, halt, kexec and hibernate. If ignore, logind will never handle these keys. Otherwise the specified action will be taken in the respective event. Only input devices with the power-switch udev tag will be watched for key/lid switch events. HandlePowerKey= defaults to poweroff. HandleSuspendKey= and HandleLidSwitch= default to suspend. HandleHibernateKey= defaults to hibernate.

The Arch Wiki has a good explanation of the settings in the power management section of the systemd wiki page.

I realised that Gnome does its own power management and that, with systemd running, logind handles suspend as well. So the suspend was happening twice. My first instinct was to disable the suspend settings in Gnome, but that didn’t seem possible, even with gnome-tweak-tool installed. So I just put the following lines in my /etc/systemd/logind.conf:

HandlePowerKey=ignore
HandleSuspendKey=ignore
HandleHibernateKey=ignore
HandleLidSwitch=ignore

This essentially makes sure that systemd doesn’t handle the lid switch or the power, suspend and hibernate keys, leaving them to Gnome.

This is the first issue that I encountered after I moved to using systemd (a non-issue, actually). I am very happy with how systemd works. I should admit I was quite skeptical before I started using it, but now I’m very used to the whole idea. And what’s more, I do notice the faster boot and shutdown times. I haven’t actually measured them, but I’m quite sure that if I did, I would see a difference. But that’s for later!