Life, Teams, and Software Engineering

AAA vs BDD Structuring in Unit Tests

UPDATE: I’ve updated the BDD example’s test function name. I’m starting to dislike having the called interface in the test name. It’s inflexible and unnecessary, and ultimately doesn’t help the reader all that much.

It’s always good when there are people on your team whom you can both learn from and teach things to. Such is the case with my current team. A couple of team members have never done unit testing as it’s known in the industry today; mostly just “pound at the interface until I’m comfortable and throw it over to testing to deal with” unit testing.

During our first code review there were a lot of issues with the unit tests. I actually prefer to let it pan out this way; people get in there and try it out, then see what worked, what didn’t, and (hopefully) pick up some ideas from the rest of the team about how to improve. One recurring theme was a lot of redundant code throughout the test suites. Setup and teardown were outright missing! That’s good, because now they’ve seen the problem, and part of the solution is to use those. Another thing was how the test cases themselves were structured. I’ve come across two widely accepted ways of structuring unit tests:

Arrange – Act – Assert
Given – When – Then

I’ve personally used both in the same codebase when it makes sense, but I’m wondering if there’s more to it than just semantics and readability. With AAA you are more likely to interact with the class under test directly inside your test function:

  void interface_context_somethingHappens()
  {
      //arrange
      mock1->setCallShouldSucceed(false);
      mock2->addFakeValue("Value");

      //act
      out->interface();

      //assert
      CPPUNIT_ASSERT(somethingHappened);
  }

However, I’ve started to notice that if you read your tests, I mean really read them, then using the Behavior-Driven Development (BDD) Given-When-Then structure will actually nudge you towards factoring the real test preparation and the calls to the class under test out of your test case:

  void SomethingShouldHappenInSomeContext()
  {
      givenSomeContext();

      whenActionPerformed();
      thenSomethingShouldHappen();
  }

  void givenSomeContext()
  {
      //configure context
      mock1->setCallShouldSucceed(false);
      mock2->addFakeValue("Value");
  }

  void whenActionPerformed()
  {
      //execute action
      out->interface();
  }

  void thenSomethingShouldHappen()
  {
      //check that what should happen happened
      CPPUNIT_ASSERT(didSomething);
  }

Yes, it’s more code, and yes, it may just be semantics, but I see something more. The naming alone has suggested that maybe I should remove the details from the test. Not only does this produce well-factored code, but it pulls communication with the class under test to the boundary of my test suite and away from my test cases. Now, if the usage of this class changes for some reason, I only have to update it in one or two places in my test suite, rather than in every single test case. Obviously this approach may not be suited to every single situation, but like I said, I use both where it feels right. Granted, this is considered a good practice when using AAA as well, but you’ve got to name those newly extracted functions something, don’t you?
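
Incidentally, the same idea applies to the missing setup and teardown I mentioned earlier: construction of the mocks and the object under test can live at the fixture boundary as well. Here’s a minimal sketch of how the surrounding CppUnit fixture might look; MockCollaborator, MockDataSource, and Widget are hypothetical stand-ins for whatever sits behind mock1, mock2, and out:

  #include <cppunit/TestFixture.h>
  #include <cppunit/extensions/HelperMacros.h>

  class WidgetTests : public CppUnit::TestFixture
  {
      CPPUNIT_TEST_SUITE(WidgetTests);
      CPPUNIT_TEST(SomethingShouldHappenInSomeContext);
      CPPUNIT_TEST_SUITE_END();

  public:
      //setUp runs before every test case, so construction shared by all
      //tests lives here instead of being repeated in each one
      void setUp()
      {
          mock1 = new MockCollaborator();
          mock2 = new MockDataSource();
          out = new Widget(mock1, mock2);
      }

      //tearDown runs after every test case and cleans the fixture back up
      void tearDown()
      {
          delete out;
          delete mock2;
          delete mock1;
      }

      void SomethingShouldHappenInSomeContext()
      {
          givenSomeContext();

          whenActionPerformed();
          thenSomethingShouldHappen();
      }

  private:
      void givenSomeContext()
      {
          //configure context
          mock1->setCallShouldSucceed(false);
          mock2->addFakeValue("Value");
      }

      void whenActionPerformed()
      {
          //execute action
          out->interface();
      }

      void thenSomethingShouldHappen()
      {
          //check the observable effect; somethingHappened() is a hypothetical
          //query on the hand-rolled mock
          CPPUNIT_ASSERT(mock2->somethingHappened());
      }

      MockCollaborator* mock1;
      MockDataSource* mock2;
      Widget* out;
  };

  CPPUNIT_TEST_SUITE_REGISTRATION(WidgetTests);

With this, the test cases contain only given/when/then calls, the helpers own the interaction with the class under test, and setUp/tearDown own its construction.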

I’d love to get some feedback on this. What convention does your team follow? Are my observations valid or can it be chalked up to something else?

Autotest for Compiled Languages (C#, C++) using Watchr

When I was learning Rails I set up Autotest on Ubuntu with Growl notifications, which I thought was a pretty slick idea. On Ruby this whole technique is super easy and efficient because Ruby is an interpreted language; there’s no compile time to slow you down, and no other files to pollute your directory tree. Compiled languages don’t have that advantage, but I think we deserve some continuous feedback too. Here I’ll describe how to configure Watchr, a generic Autotest utility, to run compiled tests whenever a source file in the path is updated. This tutorial will use a C# example, but it’s trivial to have it trigger on different file types.

Getting Started

First, we’ll need to install Ruby and Watchr.  Because I’m using Windows I just downloaded RubyInstaller.  Make sure you put the Ruby/bin directory in your PATH.

Next, download Watchr from Github, extract the archive and navigate to that directory.  Or you can just download the gem directly, but some people might want to run the tests locally first. The following command will install the gem from the local directory:

C:\mynyml-watchr-17fa9bf\>gem install Watchr

Configuring Watchr

Now that we have all the dependencies installed, we need to configure Watchr. This process is easiest if you already have a single point of entry for your continuous build process, but if you don’t it’s not that bad, and you’ll probably want one anyway. Now, at the same level as the directory(ies) containing your source code, create a text file. I usually call this autotest.watchr, but you could call it autotest.unit or autotest.integration if you’re into that sort of thing. For now, just put the following line in:

  watch('./.*/(.*)\.cs$') { system "cd build && buildAndRunTests.bat && cd ..\\" }


Yes, it’s that easy. This tells Watchr to monitor any files that match the regular expression inside the watch() call (in this case a recursive directory search for .cs files), and to execute the command on the right whenever one changes. I also have it configured to return to the same directory when it’s finished, but I don’t know if that’s actually necessary. The watch() pattern is what you would modify for different environments. For example, you could use watch('./.*/(.*)\.(h|cpp|hpp|c)$') for a mixed C/C++ system, or watch('./.*/(.*)\.(cs|vb|cpp|h)$') for a .NET project with components built in different languages. An important thing to note is the $ at the end of the regex. Because a lot of intermediary files are generated during the build process, we don’t want a file that happens to match the pattern and is generated at build time to trigger an infinite loop of build & test (as happened to me). The heavy lifting is done here, but the stuff specific to your project happens in build/buildAndRunTests.bat. Let’s take a look at that:

  pushd ..\
  echo Building tests
  "C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE\devenv.com" Tests.Unit\Tests.Unit.csproj /rebuild Release
  popd
  pushd ..\Tests.Unit\bin\Release
  echo Running tests through nunit-console
  nunit-console.exe Tests.Unit.dll /run=Tests.Unit
  popd


You’ll obviously want to customize this to the specifics of your project, but right now it’s hard-coded to call Visual Studio 2008’s devenv.com (on a 64-bit OS) and build a project called Tests.Unit. For brevity it also assumes that nunit-console.exe is available on the PATH. Not terribly interesting, but that’s the rest of the work.

Now to have all the magic happen. Run the following command in a new console window from your project directory:

C:\Projects\MyProject>Watchr autotest.watchr

That’s it! Watchr is now monitoring for changes to files that match your pattern. Simply modify any file matching the pattern and watch the whole process set off. Once it finishes, you can hopefully see the results and it will wait for the next change.

Now there’s one less thing you have to do during your heavy refactoring sessions, or just with day-to-day development.

Is tooling only for the youth?

Disclaimer: This post may have no basis in reality outside my team; it’s just a question, based on a case study of one.

I moved to the wide world of ANSI C++ after working in C# with the love of ReSharper for the better part of two years. I managed, but I definitely felt the absence of a great refactoring tool that takes the things that should be easy and trivial and makes them so. C++ is not that way, especially with old IDEs. So this past weekend I started searching for a C++ refactoring tool that actually supported my IDE. I found one that claimed to, Visual Assist X, so I installed it and started playing around with it. It works pretty well for what I need it to do, but I haven’t fully explored it yet.
Either way, this post isn’t about Visual Assist X. Today after I set up my keyboard shortcuts (to match ReSharper, no less) I called my team’s senior developer over and showed him what it could do.
Him: Another tool?!? *shakes head* I don’t trust tools to do much of anything. They’re just more systems with their own bugs.
Me: Yes but what they do they do well. Why not take it for what it is and accept that nothing is perfect?
Note that these are not direct quotes. I remember what he said better than my own response (mostly because I was surprised by his reaction), but that was the general nature of the conversation. He then proceeded to call me “Mr. Tool” and was quick to dismiss it. I was a bit confused by this, but didn’t think much of it at the time. I had work to do, so I went on my way.
Now, sitting in my living room catching up on back episodes of Legend of the Seeker, something creeps up from the back of my mind. Do I really rely on that many tools? Let’s list them out:

Development:

  1. Visual Studio
  2. CppUnit (is that really a “tool”?)
  3. Rational PureCoverage for capturing code coverage
  4. Hudson for Continuous Integration
  5. CppDepend, CCCC, CppCheck, SourceMonitor for various static analysis (some do things better/more simply than others)
  6. And now Visual Assist X for refactoring support.

Process:

  1. Jira + GreenHopper
  2. Confluence
  3. Crucible
  4. Fisheye
I don’t think this list is unreasonable at all. OK, so maybe the static analysis tools are a bit excessive but I like data, especially when it costs me next to nothing to get it through Continuous Integration.

Let’s look back to September 2008, when I joined the team. They were using exactly ONE of these tools: Visual Studio. No unit testing, no continuous integration, no process support tools, certainly no automated testing, and worst of all no feedback mechanisms of any kind until a project ended and you handed it over to the customer to say “I hope this is what you wanted”.

Flash to today. We have 20+ configurations in Hudson, our latest project has 90%+ unit test coverage at all times, our system testing is as automated as it can be so our test team isn’t overwhelmed, all our documentation is maintained in Confluence, and all our issues and tasks are tracked in Jira.

I feel like each of the tools listed above plays an integral part in my day-to-day work as a developer. Obviously as developers we spend less time in the management systems and use very limited features of them, but does that make them any less important? No. If management can just jump out to Jira to check our status, or out to Confluence to answer their own question, that’s one less thing they have to bother me or my team about. It makes me happy and I don’t even know it, and I’m sure they appreciate it too.

Now I finally get to the question. Is the reason for his reaction a generational thing or am I completely off base? Are the youth more likely to find a tool-based solution to a pain point while the more seasoned have just learned to deal with it?

Then there’s the other possibility: am I over-reliant on tools? Maybe I could simplify, but do I understand how they work and what they’re for? Absolutely. I know the scope of their functionality, how they do what they do, what they’re NOT meant to do, and how to bend them to my will within the constraints of the tool. Each one of them adds value, and there are only three that any developer on the team really has to know or use: Visual Studio, CppUnit, and PureCoverage… I take that back, PureCoverage isn’t necessary for them to understand; they just need to have it installed, since every time they run a build it runs the unit tests with coverage. They could completely ignore the results and I wouldn’t know the difference.

What do you think? I’m sure there are exceptions as there are with any rule, but are the youth more likely to blaze new trails?

Revisited: Reducing Code Duplication in RhinoMocks Tests

I’ve been wanting to come back to this for some time now but, quite frankly, I’m a bit ashamed to even look at it again. I’m sure that by now I’m far enough away from it to be able to observe my work as if it were someone else’s, and that’s exactly how I’m going to structure this post.

About a year and a half ago I wrote a post titled Reducing Code Duplication in RhinoMocks Tests that now makes me want to scratch my eyeballs out. Take a look, it will make you feel better about yourself :). I suppose there would be nothing if not for progress, so I’m revisiting it to go over what I (and most of you I’m sure) see as being entirely wrong with the approach described in that post, and talk about what I’ve learned since then about mocking and unit testing.

The approach described in that post has several glaring issues; smells, if you will:

  1. Branching unit tests, which is a general no-no.
  2. The tests are over-specified. The tight coupling between the test code and the code under test greatly reduces the value of the tests and makes them very fragile.
  3. The mocks are over-relied upon and have become a Maslow’s Hammer. I was trying so hard to isolate the presenter class that I missed the point. I was shooting for metrics while ignoring the business logic the MVP implementation is supposed to realize. I should have been focusing on the interactions between the classes, and should have isolated only to directly test specific, complex logic. Only if the presentation logic were sufficiently complex should the model have been mocked away (and in that case it may not even belong in the presenter :)).
  4. The code isn’t really testing the classes in a way that is as close as possible to the way they’ll be integrated in production, and as a result the tests are very likely to miss things.

So what have I been doing about it? I’ve been away from .NET development for the better part of a year, working in C++, which I think has been really good for me. C++ has the bare-essential constructs necessary for object-oriented design, and as a result some of the more modern frameworks are less available. I’ve only been able to find a single mocking framework for C++, googlemock, but I haven’t used it yet. Not because I don’t want to, but because it hasn’t been necessary.

Since reading Steve Freeman and Nat Pryce’s Growing Object-Oriented Software, Guided by Tests I’ve been approaching my unit testing differently. I’ve started treating my unit tests as an executable specification that is defined before the “green” code is ever written, and now I can’t do it any other way. Yes, it’s just TDD, but Freeman and Pryce’s explanation felt more natural to me. Anyway, their approach calls for acceptance-level tests to be written first, followed by smaller, more specific unit tests that verify the details of your specification. So how does this relate to mocking? Simple: following this approach lets my design evolve naturally into something where mocking becomes the final step, not the first, reached only at the point where run-time information is necessary. Just mock out the necessary classes, provide the stimulus, and we’re good. Further, it lets me keep my classes linked together the way they’ll really be used in production and send my stimuli all the way through the system to verify the final effects/output. Obviously you’ll still want tests in the deeper regions of your classes, but I’ve gotten more value from my tests by implementing them this way.
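
To make that concrete, here’s a rough sketch of what such a test might look like in CppUnit. The OrderPresenter, InMemoryOrderModel, and FakeOrderView names are hypothetical; the point is that the presenter is wired to real (or nearly real) collaborators, the stimulus enters at the boundary, and the assertions are on the final observable outcome rather than on every intermediate call:

  #include <string>
  #include <cppunit/TestFixture.h>
  #include <cppunit/extensions/HelperMacros.h>

  class OrderPresenterTests : public CppUnit::TestFixture
  {
      CPPUNIT_TEST_SUITE(OrderPresenterTests);
      CPPUNIT_TEST(SubmittingAValidOrderShowsItInTheView);
      CPPUNIT_TEST_SUITE_END();

  public:
      void setUp()
      {
          view = new FakeOrderView();        //hand-rolled fake that records what it is told
          model = new InMemoryOrderModel();  //the real model logic, backed by in-memory storage
          presenter = new OrderPresenter(view, model);
      }

      void tearDown()
      {
          delete presenter;
          delete model;
          delete view;
      }

      void SubmittingAValidOrderShowsItInTheView()
      {
          //the stimulus enters at the boundary of the collaboration...
          presenter->SubmitOrder("widget", 3);

          //...and we verify the final effects/output, not the calls in between
          CPPUNIT_ASSERT_EQUAL(std::string("widget x3"), view->lastDisplayedOrder());
          CPPUNIT_ASSERT_EQUAL(1, model->orderCount());
      }

  private:
      FakeOrderView* view;
      InMemoryOrderModel* model;
      OrderPresenter* presenter;
  };

  CPPUNIT_TEST_SUITE_REGISTRATION(OrderPresenterTests);

Only if some piece of the presenter’s logic turned out to be complex enough on its own would I reach for isolation (and something like googlemock) to pin that specific behavior down.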

Reducing Code Duplication in RhinoMocks Tests

UPDATE: This post was later revisited, but it’s still worth reading for an example of what mocking shouldn’t be :).

I was recently placed on my first project at my new job (I’ll post about that later) and they’re having me integrate a bunch of great things into their process. These include automated unit testing, continuous integration and refactoring to patterns. Anyway, I’ve been writing some of my unit tests using isolation with RhinoMocks and came up with a way to achieve full path coverage without duplicating my Expects between test cases.

Below is a fairly standard MVP example. A presenter method is invoked and makes decisions based on information obtained from the view. Say you are testing the following method using isolation and assume the dependencies on view and model are injected via the presenter’s constructor:


    public class Presenter
    {
        IView view;
        IModel model;

        //...constructors, properties

        public void Foo()
        {
            if (view.val1)
            {
                //...
                model.F1();
                //...
                if (view.val2)
                {
                    //...
                    model.F2();
                    //...
                }
                else
                {
                    //...
                    model.F3();
                    //...
                }
            }
            else
            {
                //...
                model.F4();
                //...
            }
        }
    }

Now obviously there are 3 possible paths through this method:

val1 = true & val2 = true
val1 = true & val2 = false
val1 = false & val2 = don’t care

Now, my approach to this sort of problem in the past had been to duplicate a lot of test code only to change a single stubbed return value:

    [Test]
    public void CanFooVal1TrueVal2True()
    {
        Expect.Call(view.val1).Return(true);
        //...some expects
        model.F1();
        LastCall.IgnoreArguments();
        //...some more expects
        Expect.Call(view.val2).Return(true);
        //...some expects
        model.F2();
        LastCall.IgnoreArguments();
        //...some more expects

        mockery.ReplayAll();

        presenter.Foo();
        //Assertions
    }

    [Test]
    public void CanFooVal1TrueVal2False()
    {
        Expect.Call(view.val1).Return(true);
        //...some expects
        model.F1();
        LastCall.IgnoreArguments();
        //...some expects
        Expect.Call(view.val2).Return(false);
        //...some expects
        model.F3();
        LastCall.IgnoreArguments();

        mockery.ReplayAll();

        presenter.Foo();
        //Assertions
    }

    [Test]
    public void CanFooVal1False()
    {
        Expect.Call(view.val1).Return(false);
        //...some expects
        model.F4();
        LastCall.IgnoreArguments();

        mockery.ReplayAll();

        presenter.Foo();
        //Assertions
    }

The approach I’ve come up with to solve this problem is to have each test case call a single method which uses its parameter list to take a path through the method being tested (granted this is one of the simplest mocking cases):


    [Test]
    public void CanFooVal1TrueVal2True()
    {
        FooPaths(true, true);

        mockery.ReplayAll();

        presenter.Foo();
        //Assertions
    }

    [Test]
    public void CanFooVal1TrueVal2False()
    {
        FooPaths(true, false);

        mockery.ReplayAll();

        presenter.Foo();
        //Assertions
    }

    [Test]
    public void CanFooVal1False()
    {
        FooPaths(false, true); //keep in mind val2 doesn't matter in this case

        mockery.ReplayAll();

        presenter.Foo();
        //Assertions
    }

    private void FooPaths(bool val1, bool val2)
    {
        Expect.Call(view.val1).Return(val1);
        if (val1)
        {
            //...
            model.F1();
            LastCall.IgnoreArguments();
            Expect.Call(view.val2).Return(val2);
            if (val2)
            {
                //...
                model.F2();
                LastCall.IgnoreArguments();
            }
            else
            {
                //...
                model.F3();
                LastCall.IgnoreArguments();
            }
        }
        else
        {
            //...
            model.F4();
            LastCall.IgnoreArguments();
        }
    }

This makes it so that you only need to change Expects and other information relevant to the code under test in one place (FooPaths) when that code changes.

Does anyone else out there have any other solutions to this problem?
