Add new warnings to GCC with Python

January 23, 2014 4 comments

I found myself getting annoyed lately with a certain C++ anti-pattern: functions which take string parameters by value unnecessarily.  The pattern itself isn’t a huge sin, and can have legitimate uses, but often functions would be just as happy with a string const& parameter, so copying it is a waste.  No matter how the optimized string copy is, it won’t be better than a simple reference.  Then throw more complex classes in the mix, and it would be good have a tool or diagnostic to tell when things are copied unnecessarily.

My first thought was to try this with elfutils libdw, or perhaps Dyninst SymtabAPI.  It shouldn’t be that hard to iterate debuginfo for all the functions in a binary, and scan their parameters for direct class or struct types.  I may still try this, actually, just as a case study for relatively simple example code.  However, one of my coworkers suggested this idea would also work as a GCC plugin, so I tried it with gcc-python-plugin and found it embarrassingly easy.  Voila!

import gcc

def on_pass_execution(p, fn):
    # This pass is called fairly early on, per-function, after the
    # CFG has been built.  Skip every other pass.
    if p.name != '*warn_function_return':
        return

    # I don't want notices for system headers.
    # (is data like -isystem available?)
    if fn.decl.location.file.startswith('/usr/'):
        return

    for parm, parmtype in zip(fn.decl.arguments,
                              fn.decl.type.argument_types):
        if type(parmtype) == gcc.RecordType:
            gcc.inform(parm.location,
                       'parameter type is not trivial')

gcc.register_callback(gcc.PLUGIN_PASS_EXECUTION,
                      on_pass_execution)

I started with the hello-world example, which already shows how to hook every function and extract more info than I even need.  You’ll see that my case is even shorter in code, just checking the rough parameter type and issuing a notice.  In practice it looks like this:

$ cat test.cc
#include <string>
size_t len(std::string s) { return s.size(); }

$ gcc-with-python3 script.py -c test.cc
test.cc: In function ‘size_t len(std::string)’:
test.cc:2:24: note: parameter type is not trivial
 size_t len(std::string s) { return s.size(); }
                        ^

I got this working with my whole build just by tweaking make options:

make CXXLD=g++ CXX="gcc-with-python3 /path/to/script.py"

Pull that into Vim quickfix, and I’m now easily browsing through every instance of non-trivial parameter copies in my code!

Part of the reason this was so easy was that I just flag every direct class/struct parameter, regardless of whether an option like const& would fit. This might be a little more intelligent if it scanned the CFG to see if the parameter really could be constant. It could also relax a bit on what’s flagged, perhaps just types for which std::is_trivially_copyable returns false in C++11.  But for just scratching an itch, I’m amazed at how quick this simple plugin was to write.

SystemTap monitoring ptrace activity

November 25, 2013 2 comments

I’d like to present a real and hopefully interesting example: how I recently used SystemTap to help with a complex ptrace issue. I was trying to track down an issue which involved multiple processes, and one of those was ptrace-attached to the other. In this situation, it would be perfectly fine to attach a debugger to the controlling process, but since ptrace is exclusive, I could not have attached again to that child process.  Where traditional tools are left out, SystemTap may still help!

For my problem, in a little more detail, the ptracer process was watching for forks in the child. When a fork occurred, it executed a bit of instrumentation in both the original child and the new fork, then allowed both to continue. My bug was that the fork return value in the caller (as seen after the instrumentation) should have been the new PID, but it was always 56. A bit of corruption, it seemed, but always the same 56. Except, even more strangely, on some systems once auditd started, this changed to -1 with errno set to ENOSYS, despite the fork having truly succeeded.

My “aha!” moment was when I realized that on x86_64, SYS_clone happens to be 56, and further the syscall number on entry and its return value on exit both use the same %rax register. The ptrace controller needs to save and restore register state when running instrumentation, so it seemed likely here was the culprit somehow. Now this is something easily traced with SystemTap, even across processes and kernel activity. I only needed to watch fork, ptrace, and the wait4 that gets ptrace signals. Here’s that script I used:

# NB: TASK_COMM_LEN is 16, including '\0', so truncate to 15 chars
global mutator = "test_driver"
global mutatee = "test4_2.dyn_g++"

function trace(str) {
    printf("%16s[%5d] %s\n", execname(), pid(), str)
}

probe syscall.fork {
  if (execname() == mutatee)
    trace(sprintf("-> %s (%s)", name, argstr))
}

probe syscall.fork.return {
  if (execname() == mutatee)
    trace(sprintf("<- %s = %s", name, retstr))
}

global ptrace_watch
probe begin {
  ptrace_watch[7] = 1 # PTRACE_CONT
  ptrace_watch[12] = 1 # PTRACE_GETREGS
  ptrace_watch[13] = 1 # PTRACE_SETREGS
  ptrace_watch[24] = 1 # PTRACE_SYSCALL
}

probe syscall.ptrace {
  if (execname() == mutator && request in ptrace_watch)
    trace(sprintf("%s (%s)", name, argstr))
}

probe syscall.wait4.return {
  if (execname() == mutator)
    trace(sprintf("%s = %s  %s", name, retstr, status_str))
}

This is a relatively simple script, with just a little logic to filter the events down, and then it prints a trace at each point. Even with that filtering, I got more output than I care to flood in a blog post, but here’s the excerpt of the bug in action:

 test4_2.dyn_g++[18304] -> fork ()
     test_driver[18287] wait4 = 18305  WSTOPSIG=SIGSTOP
     test_driver[18287] wait4 = 18304  WSTOPSIG=SIGTRAP | PTRACE_EVENT_FORK << 8
     test_driver[18287] ptrace (PTRACE_GETREGS, 18304, data=0x7fff291f0990)
     test_driver[18287] ptrace (PTRACE_CONT, 18304, SIG_0)
 test4_2.dyn_g++[18304] <- fork = 18305
     test_driver[18287] wait4 = 18304  WSTOPSIG=SIGTRAP
     test_driver[18287] ptrace (PTRACE_SETREGS, 18304, data=0x7fff291f09c0)

This showed me that my guess was pretty much on the mark!  I have a pet peeve against blind bug fixes, where some programmers will just take a guess at a problem and make changes, without really confirming that they’ve understood what’s happening.  So it’s gratifying to me to see this real data showing how things went down.  There’s the fork entered and returned in one process, including the proper return value (at least from the kernel perspective).  The PTRACE_EVENT_FORK came during the middle of the fork, so when the ptracer grabbed current registers, %rax was still the entry value. The fork returned the right value to the temporary instrumentation path (that didn’t care about it). When the ptracer went to restore registers to get back to where fork should have returned, it would be restoring the syscall’s entry value — 56.

It’s a tricky situation, made trickier by the fact that I could reproduce the bug on RHEL5 and Fedora, but RHEL6 was unaffected.  But with the same script, I could see that RHEL6 waited until after the internal kernel fork returned to deliver the PTRACE_EVENT_FORK stop, so the PTRACE_GETREGS just happened to be from a more convenient place.  (The behavior difference is due to utrace, which is a whole can of worms intertwined with ptrace internals.)

My plan to fix it was to introduce a PTRACE_SYSCALL call after the event, so the kernel fork would complete but still trap before returning to userspace.  As with tracking down the bug, the proof of a fix requires data too:

 test4_2.dyn_g++[19488] -> fork ()
     test_driver[19471] wait4 = 19488  WSTOPSIG=SIGTRAP | PTRACE_EVENT_FORK << 8
     test_driver[19471] ptrace (PTRACE_SYSCALL, 19488, SIG_0)
 test4_2.dyn_g++[19488] <- fork = 19489
     test_driver[19471] wait4 = 19489  WSTOPSIG=SIGSTOP
     test_driver[19471] wait4 = 19488  WSTOPSIG=TRACESYSGOOD
     test_driver[19471] ptrace (PTRACE_GETREGS, 19488, data=0x7fff07d26e90)
     test_driver[19471] ptrace (PTRACE_CONT, 19488, SIG_0)
     test_driver[19471] wait4 = 19488  WSTOPSIG=SIGTRAP
     test_driver[19471] ptrace (PTRACE_SETREGS, 19488, data=0x7fff07d26ec0

That’s a better looking sequence of events, and indeed the fork is finally showing the proper PID even after instrumentation.  It worked on all systems which previously reproduced the bug, and even the unaffected RHEL6 was fine with the added wait.  Fixed and confirmed!

Finally, I should admit, the problem was not so cleanly found and resolved as I might make it sound.  Like many debugging sessions, I went through a lot steps before I’d narrowed it down this far.  I read a lot of both application code and kernel code, and I wrote quite a few one-off SystemTap scripts watching behavior in different areas.  (Perhaps a journal of this process might be interesting some other time…)  So I don’t mean to imply that immediately SystemTap magically shined a light right where I needed to look — but it did make it possible for me to see into an otherwise opaque problem.

Defining your own Git commands

November 17, 2013 Comments off

When you use a command prompt long enough, you’ll find yourself repeating some common command sequences.  If you value your time, you’ll probably start saving shell aliases and even full shell scripts to simplify these things.

The same is true with using Git: you’ll probably find yourself repeating some long commands, and wish for an easier way.  You may also just have some tricky Git recipes that you want to remember.  Well sure enough, you can write Git aliases too, as well as add your own scripts as git subcommands.  The Git wiki has a pretty good page describing aliases, so I’ll let you read that, and there are examples too.  Here are a few more examples that I felt are worth sharing.

up = remote update

This first one is just pure laziness.  A git remote update is the same as git fetch for all configured remotes, and now git up can do the same with less typing.

new = log ..@{upstream}
pend = log @{upstream}..

These show how the current branch differs from the upstream tracking branch (often origin/master, but could be anything).  My version of git new is different than the example on the Git wiki, instead showing what commits you’ve fetched but not yet pulled/merged.  Then git pend is the opposite, showing what commits are pending to go out in a push.

log1 = log --abbrev-commit --pretty=oneline
logf = log --pretty=fuller

These are just shortcuts for formatted logs; git log1 for one line per commit, and git logf for the fuller format that includes committer info (instead of just author).  The log1 options could also be condensed down to --oneline these days, but that wasn’t available when I first wrote this alias.

lol = log --graph --decorate --pretty=oneline --abbrev-commit
lola = log --graph --decorate --pretty=oneline --abbrev-commit --all

More logging tricks, and for these I’ll give credit to the blog where I found them.  These add ASCII-art graph lines to help visualize merges, and they also label tags and branch heads.

show-patch = format-patch --stdout -1

This works like git show, dumping the most recent commit on stdout, but with format-patch it uses email-style headers.  This makes it easier to elsewhere use git am on that patch with preserved commit message, author, and date.  It also includes binary diffs by default.

make = !test -d \"$(git rev-parse --show-toplevel)\" && make -C \"$_\"

This runs make from the top of the git directory, no matter what your current working directory is, so you can just git make anywhere.  The test -d beforehand is just a precaution to make sure you’re really within a git path, otherwise you’ll get both a git error and make complaining of an empty -C argument.

Once you start getting into complicated “!...” sequences, it may make sense to break it out into its own file.  As long as you name it starting with “git-“, like git-unique-abbrev in my last post, and save it somewhere in your PATH, then you can call it as you would any other git command like git unique-abbrev.  Another good example is this git merge-ff script on StackOverflow, which lets you safely fast-forward any branch without affecting your current work tree.

The last one I’ll leave you with is back to shell aliases:

alias gcd='test -d "$(git rev-parse --show-toplevel)" && cd "$_"'

This changes your current directory to the top of the git tree, using the same mechanism as I did for git make.  This has to be a shell alias, not a git command or a script, because it’s changing the working directory of the shell.  Don’t forget cd - to jump back too.

How short can Git abbreviate?

November 10, 2013 3 comments

How short can a Git hash be abbreviated?  A full SHA-1 hash is 160 bits, 40 hex characters, which is a bit long to talk about and retype when you’re not just copy-pasting.  Therefore most people will use just the first few characters to refer to commits, but how safe is this?

I know of a couple posts where Linus Torvalds laments Git’s default hash abbreviation of 7 characters.  This gives you 28 unique bits, almost 270 million possibilities, which seems like it should be plenty.  Even with the Linux kernel, one of the biggest repositories I know, there are fewer than 4 million objects.  However, since hash functions like SHA-1 have approximately random distribution, you easily can and will run into the birthday problem at this level.

As Linus noted in that second link, Git will now make sure that the abbreviated hashes it prints are currently long enough to be unique, but that’s no guarantee against future collisions.  He recommends kernel developers use 12 characters, which got me curious how to evaluate how much headroom this leaves for any given repository.  One way to get a basic idea is to ask Git to abbreviate all commit objects as much as possible.  For example, on linux.git:

$ git rev-list --all --abbrev=0 --abbrev-commit | wc -L
11

To see which commits require 11-character abbreviations:

$ git rev-list --all --abbrev=0 --abbrev-commit | grep -E '.{11,}'
8b82547e33e
3ee5014185b
21949f00a02

And indeed, if you try a command like git show on any of those with just 10 characters, it will complain that this is ambiguous. Notice that none of those 3 are near each other though, which means that the collision must have been with some other object type, like a tree or blob. In fact, git log is happy with 10 characters on these, as it only has to disambiguate among revisions.

How about a histogram of commit abbreviation lengths:

$ git rev-list --all --abbrev=0 --abbrev-commit |
  awk '{ a[length] += 1 } END { for (len in a) print len, a[len] }'
5 1771
6 286066
7 106897
8 7899
9 494
10 27
11 3

In fact, most commits here are just fine with that 7-character abbreviation, even a few as low as 5, but there’s that tail end of commits which require up to 11. So 12 does seem like a reasonable suggestion to leave some headroom, but I don’t think it’s quite the “later generations” headroom that Linus wanted.

This is all fine for commit objects, but I didn’t find a way for Git to print this minimized abbreviation on all object types. So, I wrote a quick script to run through various lengths:

#!/bin/bash
# git-unique-abbrev

OBJECTS="$(mktemp)"
git rev-list --all --objects | cut -c1-40 | sort >"$OBJECTS"
printf "%d objects\n" $(wc -l <"$OBJECTS")
for abbrev in $(seq 4 40); do
    DUPES="$(mktemp)"
    uniq -D -w $abbrev <"$OBJECTS" >"$DUPES"
    count=$(wc -l <"$DUPES")
    acount=$(uniq -w $abbrev <"$DUPES" | wc -l)
    printf "%2d: %d / %d\n" $abbrev $count $acount
    test $count -eq 0 && cat "$OBJECTS"
    mv "$DUPES" "$OBJECTS"
    test $count -eq 0 && break
done
rm -f "$OBJECTS"

On linux.git:

$ git-unique-abbrev 
3253824 objects
 4: 3253824 / 65536
 5: 3107326 / 854621
 6: 573758 / 277590
 7: 39577 / 19748
 8: 2609 / 1304
 9: 160 / 80
10: 12 / 6
11: 2 / 1
12: 0 / 0
d597639e2036f04f0226761e2d818b31f2db7820
d597639e203a100156501df8a0756fd09573e2de

Each line is reporting how many total objects are ambiguous at that abbreviation and how many such abbreviations there are. The minimum Git allows is 4, which is totally saturated in this repository — every object is ambiguous, and all 65536 possible prefixes are included.  The default 7 does disambiguate most objects, but there’s still 1.2% remaining.  At 8 characters, with 2609 objects and 1304 common prefixes, we can infer there’s even still a triplicate around.  We really do require 12 characters now to disambiguate all objects.  The last lines are the final set of duplicates, and git cat-file -t will tell you these are both tree objects.

For a probabilistic view, we can use the square approximation from Wikipedia: p(n) \approx n^2/2m, where n is the number of random values, m is the number of possible values, and here we should use m=2^{4c} for c characters.  For n = 3253824, this gives us approximately 30% chance of colliding at 11 characters, and only 1.9% at 12. So it’s not all that strange to have gotten our d597639e203 result for 11 after all.

What’s the takeaway from all this? If you are a kernel developer, or use a Git repository of similar size, you should definitely set Git to core.abbrev of at least 12, and maybe more to have some real headroom.  For smaller projects, you can try some of the commands above to see where you stand, but it may just be prudent for everyone to get used to using longer hashes everywhere.  Finally, when you reference a Git hash for posterity, e.g. in another commit message, I’d recommend always using the full value.

Boot Time versus LVM Snapshots

January 21, 2013 1 comment

Linux tip: LVM snapshots are a handy feature, but they will destroy your boot time if they diverge very far.

I do a lot of development and testing in virtual machines. Sometimes if I want to try something disruptive, I’ll do it in a clone of some base image. Using LVM snapshots makes it easy and quick to create the new image.

But recently I noticed my workstation was taking abnormally long to boot. The command “systemd-analyze blame” told me that fedora-storage-init.service alone was taking over two minutes. While I imagine this might happen occasionally with fsck, it certainly shouldn’t take so long every time.

I finally found out that the culprit was the LVM vgchange command which initializes the volumes, and apparently this can take quite a while with snapshots to enumerate the extents.  I had a few old snapshotted machines I had used for testing, and the base images those had originated from had since diverged quite a bit due to simple update activity.

I didn’t need those tests anymore, so delete them and reboot — and I booted in only 15 seconds, which even includes a few seconds to enter the encryption passphrase!  This is an order of magnitude faster — or rather, now I know the LVM snapshots were causing an order of magnitude slowdown.

Lesson learned, don’t keep LVM snapshots around for a long time.  It’s too bad, because this capability is otherwise very useful…

Monitoring a CyberPower UPS in Fedora

November 23, 2012 1 comment

I recently bought an HP MicroServer N40L for a small home/media server, and I chose to install pre-release Fedora 18 on it.  Fedora is perhaps not the best choice for a server, but I’m quite familiar with it, and I enjoy the First aspect of that distro.

Anyway, I’m not doing anything terribly complicated with the server, but one of the things I found slightly tricky to set up was UPS monitoring.  I have a CyberPower CP425HG connected via USB, and one of its fun “features” is that it resets the USB interface every 20 seconds if a driver has not connected.  So before I even got any monitoring going, it was flooding the logs with disconnect/reconnect messages.  Then when I tried to get Network UPS Tools to monitor it, I found it quite racy for the driver to connect while the UPS in a usable state.

So, here’s how I now have it configured, for my own reference and in case anyone else has similar issues.  First is to simply install the tools with “yum install nut nut-client“.  Then the NUT configuration files need a few basic additions, just added at the end of the existing files.

Added to /etc/ups/ups.conf:

[cps]
        driver = "usbhid-ups"
        port = "auto"
        vendorid = "0764"

The specific “vendorid” is probably not actually necessary, but I chose to go ahead and restrict my configuration to the CyberPower vendor anyway.

Added to /etc/ups/upsd.users:

[nutmon]
        password = CENSORED
        upsmon master

and to /etc/ups/upsmon.conf:

MONITOR cps@localhost 1 nutmon CENSORED master

The username “nutmon” and its password don’t really matter for local use; it’s just used for the monitor to talk to upsd. You’d pay more attention if you were linking multiple machines’ UPS status over the network.

Now the tricky part is how to get NUT started at the right time as the system boots. A usual “systemctl enable” never seemed to catch the UPS at a sane point of its cycle during boot, and if missed the service just fails. So for that, I added a udev rule to start the service as soon as the UPS is detected.

Create a new file /etc/udev/rules.d/99-nut.rules:

SUBSYSTEM=="usb", ENV{DEVTYPE}=="usb_device", ATTR{idVendor}=="0764",
  TAG+="systemd", ENV{SYSTEMD_WANTS}+="nut-server.service nut-monitor.service"

(Note, I wrapped that line for blogging, but the rule needs to be all on one line.)

When I look at the system boot logs, I am still getting one cycle where the UPS disconnects itself, and thus NUT fails to start too. But when the UPS connects the second time, udev kicks off the service again and I finally get proper UPS monitoring. Whew!

Procrastinating Timelord

June 1, 2012 Comments off

If I could decide now to have done something earlier, then surely I would, for I’d be done with no more effort required.  Maybe I should work on that.

What do we want? Time travel! When do we want it? It's irrelevant!

And that concludes this Inane Friday Thought Experiment.  Get back to whatever you weren’t doing…

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: