Friday, 23 March 2018

Kernel Commits with "Fixes" tag

Over the past 5 years there has been a steady increase in the number of kernel bug fix commits that use the "Fixes" tag.  Kernel developers use this annotation on a commit to reference an older commit that originally introduced the bug, which is obviously very useful for bug tracking purposes. What is interesting is that there has been a steady take-up of developers using this annotation:

With the 4.15 release, 1859 of the 16223 commits (11.5%) were tagged as "Fixes", so that's a fair amount of work going into bug fixing.  I suspect there are more commits that are bug fixes, but aren't using the "Fixes" tag, so it's hard to tell for certain how many commits are fixes without doing deeper analysis.  Probably over time this tag will be widely adopted for all bug fixes and the trend line will level out and we will have a better idea of the proportion of commits per release that are just devoted to fixing issues.  Let's see how this looks in another 5 years time,  I'll keep you posted!

Wednesday, 21 February 2018

Linux Kernel Module Growth

The Linux kernel grows at an amazing pace, each kernel release adds more functionality, more drivers and hence more kernel modules.  I recently wondered what the trend was for kernel module growth per release, so I performed module builds on kernels v2.6.24 through to v4.16-rc2 for x86-64 to get a better idea of growth rates: one can see, the rate of growth is relatively linear with about 89 modules being added to each kernel release, which is not surprising as the size of the kernel is growing at a fairly linear rate too.  It is interesting to see that the number of modules has easily more than tripled in the 10 years between v2.6.24 and v4.16-rc2,  with a rate of about 470 new modules per year. At this rate, Linux will see the 10,000th module land in around the year 2025.

Saturday, 3 February 2018

stress-ng V0.09.15

It has been a while since my last post about stress-ng so I thought it would be useful to provide an update on the changes since V0.08.09.

I have been focusing on making stress-ng more portable so it can build with various versions of clang and gcc as well as run against a wide range of kernels.   The portability shims and config detection added to stress-ng allow it to build and run on a wide range of Linux systems, as well as GNU/HURD, Minix, Debian kFreeBSD, various BSD systems, OpenIndiana and OS X.

Enabling stress-ng to work on a wide range of architectures and kernels with a range of compiler versions has helped me to find and fix various corner case bugs.  Also, static analysis with a various set of tools has helped to drive up the code quality. As ever, I thoroughly recommend using static analysis tools on any project to find bugs.

Since V0.08.09 I've added the following stressors:
  • inode-flags  - (using the FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctl, see ioctl_iflags(2) for more details.
  • sockdiag - exercise the Linux sock_diag netlink socket diagnostics
  • branch - exercise branch prediction
  • swap - exercise adding and removing variously sized swap partitions
  • ioport - exercise I/O port read/writes to try and cause CPU I/O bus delays
  • hrtimers - high resolution timer stressor
  • physpage - exercise the lookup of a physical page address and page count of a virtual page
  • mmapaddr - mmap pages to randomly unused VM addresses and exercise mincore and segfault handling
  • funccall - exercise function calling with a range of function arguments types and sizes, for benchmarking stack/CPU/cache and compiler.
  • tree - BSD tree (red/black and splay) stressor, good for exercising memory/cache
  • rawdev - exercise raw block device I/O reads
  • revio - reverse file offset random writes, causes lots of fragmentation and hence many file extents
  • mmap-fixed - stress fixed address mmaps, with a wide range of VM addresses
  • enosys - exercise a wide range of random system call numbers that are not wired up, hence generating ENOSYS errors
  • sigpipe - stress SIGPIPE signal generation and handling
  • vm-addr - exercise a wide range of VM addresses for fixed address mmaps with thorough address bit patterns stressing
Stress-ng has nearly 200 stressors and many of these have various stress methods than can be selected to perform specific stress testing.  These are all documented in the manual.  I've also updated the stress-ng project page with various links to academic papers and presentations that have used stress-ng in various ways to stress computer systems.  It is useful to find out how stress-ng is being used so that I can shape this tool in the future.

As ever, patches for fixes and improvements are always appreciated.  Keep on stressing!

Friday, 1 September 2017

Static analysis on the Linux kernel

There are a wealth of powerful static analysis tools available nowadays for analyzing C source code. These tools help to find bugs in code by just analyzing the source code without actually having to execute the code.   Over that past year or so I have been running the following static analysis tools on linux-next every weekday to find kernel bugs:
Typically each tool can take 10-25+ hours of compute time to analyze the kernel source; fortunately I have a large server at hand to do this.  The automated analysis creates an Ubuntu server VM, installs the required static analysis tools, clones linux-next and then runs the analysis.  The VMs are configured to minimize write activity to the host and run with 48 threads and plenty of memory to try to speed up the analysis process.

At the end of each run, the output from the previous run is diff'd against the new output and generates a list of new and fixed issues.  I then manually wade through these and try to fix some of the low hanging fruit when I can find free time to do so.

I've been gathering statistics from the CoverityScan builds for the past 12 months tracking the number of defects found, outstanding issues and number of defects eliminated:

As one can see, there are a lot of defects getting fixed by the Linux developers and the overall trend of outstanding issues is downwards, which is good to see.  The defect rate in linux-next is currently 0.46 issues per 1000 lines (out of over 13 million lines that are being scanned). A typical defect rate for a project this size is 0.5 issues per 1000 lines.  Some of these issues are false positives or very minor / insignficant issues that will not cause any run time issues at all, so don't be too alarmed by the statistics.

Using a range of static analysis tools is useful because each one has it's own strengths and weaknesses.  For example smatch and sparse are designed for sanity checking the kernel source, so they have some smarts that detect kernel specific semantic issues.  CoverityScan is a commercial product however they allow open source projects the size of the linux-kernel to be built daily and the web based bug tracking tool is very easy to use and CoverityScan does manage to reliably find bugs that other tools can't reach.  Cppcheck is useful as scans all the code paths by forcibly trying all the #ifdef'd variations of code - which is useful on the more obscure CONFIG mixes.

Finally, I use clang's scan-build and the latest verion of gcc to try and find the more typical warnings found by the static analysis built into modern open source compilers.

The more typical issues being found by static analysis are ones that don't generally appear at run time, such as in corner cases like error handling code paths, resource leaks or resource failure conditions, uninitialized variables or dead code paths.

My intention is to continue this process of daily checking and I hope to report back next September to review the CoverityScan trends for another year.